Hi there 👋, I'm

Yunmo Koo

Founding Engineer at FriendliAI

Inference Optimization / Distributed Training / Multi-Cloud / LLMOps

mpbb03@gmail.com

About me

Hi! My name is Yunmo Koo, Founding Engineer at FriendliAI

I am a founding engineer at FriendliAI with end-to-end experience building LLM inference, runtime, and distributed training platforms from the ground up. I specialize in production ML infrastructure that improves latency, reliability, and cost efficiency. I also led customer-facing engineering, product adoption, and technical sales growth in the US market.

I am passionate about and dedicated to my work, with strong problem-solving, communication, and leadership skills, and I work well in teams. You can get in touch by filling out this 📄form or by emailing me at 📧mpbb03@gmail.com. You are also welcome to follow my work on my GitHub and visit my LinkedIn profile.

Experience

Feb 2021 - Present

Founding Engineer

FriendliAI

- Led R&D of speculative decoding systems for production LLM inference
- Developed high-performance kernels for LLM operations (attention, sampling, decoding)
- Built inference runtime with memory management, scheduling, and KV-cache optimization
- Led development of PeriFlow distributed training platform
- Trained and released FAI-13B before Meta's Llama 2
- Led US solutions architecture, supporting 100+ customer PoCs
- Delivered 30+ talks at industry events; built LangChain/LlamaIndex integrations

Education

2020 - 2022

MS, Computer Science and Engineering

Seoul National University

- @Software Platform Lab
- "Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs", NeurIPS 2021
- Distributed training job orchestration system
- Deep learning computational graph optimization

2014 - 2020

BS, Computer Science and Engineering

Seoul National University

- Double-majored in Korean History
- The period includes two years of military service