Director of Engineering · Senior Staff SWE

I build the control plane underneath the GPUs.

Hands-on architect and engineering leader with 17 years in large-scale distributed systems, now owning fleet-scale GPU and LLM inference infrastructure at Coupang Intelligence Cloud. I run the Kubernetes-native substrate beneath the company's model training and serving: ~5,000 H200/B200 GPUs across 200+ clusters, held at 99.99% availability. I stay close enough to the code to debug an OVS flow pipeline or an NCCL collective on a B200 fabric.

Kirkland, WA · Staff / Principal IC + Eng leadership

Download résumé · See the work

Experience · 17 yrs

Fleet · ~5,000 GPUs

Clusters · 200+

Availability · 99.99%

Cost saved · $8M+

Engineering stories

GPU Virtualization · VFIO · 01 / 07

Closing the bare-metal gap

Led B200 SKU enablement end to end; validated 397 TFLOPS single-GPU under VFIO passthrough, matching the NVIDIA SU1 bare-metal baseline; unlocked multi-tenant GPU VMs.

397 TFLOPS = bare metal

B200 · VFIO · SR-IOV

Custom Scheduler · Go · 02 / 07

A scheduler that refuses to half-allocate

Designed and wrote Coupang Intelligence Cloud's custom Kubernetes GPU scheduler in Go (inspired by NVIDIA KAI-Scheduler). Gang scheduling is transactional and all-or-nothing, so a job never holds a partial allocation. An async Binder controller (BindRequest CRD) decouples placement from API-server latency.

0 partial allocations

Gang · Fractional · Preempt-reclaim

Design inspiration: KAI-Scheduler ↗
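
The all-or-nothing idea can be illustrated with a minimal sketch. This is not the production scheduler: `Node`, `Pod`, and `PlaceGang` are hypothetical names, placement is plain first-fit on whole GPUs, and the "transaction" is simply a tentative copy of free capacity that is committed only when every gang member fits.

```go
package main

import "fmt"

// Node is a hypothetical view of one worker's free GPU count.
type Node struct {
	Name     string
	FreeGPUs int
}

// Pod is a hypothetical gang member requesting whole GPUs.
type Pod struct {
	Name string
	GPUs int
}

// PlaceGang tries to place every pod in the gang using first-fit.
// It works on a tentative copy of free capacity and commits only if all
// pods fit, so a failed attempt leaves the nodes untouched.
func PlaceGang(nodes []Node, gang []Pod) (map[string]string, bool) {
	// Tentative state: discarded if any member cannot be placed.
	free := make([]int, len(nodes))
	for i, n := range nodes {
		free[i] = n.FreeGPUs
	}
	placement := make(map[string]string, len(gang))
	for _, p := range gang {
		placed := false
		for i := range nodes {
			if free[i] >= p.GPUs {
				free[i] -= p.GPUs
				placement[p.Name] = nodes[i].Name
				placed = true
				break
			}
		}
		if !placed {
			return nil, false // abort: nothing was written to nodes
		}
	}
	// Commit: every member fit, so apply the tentative capacity.
	for i := range nodes {
		nodes[i].FreeGPUs = free[i]
	}
	return placement, true
}

func main() {
	nodes := []Node{{"node-a", 8}, {"node-b", 4}}
	gang := []Pod{{"worker-0", 8}, {"worker-1", 8}}
	_, ok := PlaceGang(nodes, gang)
	// The second worker cannot fit, so the whole gang is rejected
	// and both nodes keep their original capacity.
	fmt.Println(ok, nodes[0].FreeGPUs, nodes[1].FreeGPUs) // false 8 4
}
```

First-fit is only a stand-in here; a real scheduler would score nodes and handle fractional requests, but the commit-or-discard shape stays the same.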

Patent-Pending · CRD · 03 / 07

One spec for a whole AI application

CompositeApplication CRD: one declarative spec the control plane composes and reconciles across compute, storage, networking, and identity; patent-pending; the basis for the tenant-facing API.

Patent-pending

Declarative · Reconciled · Multi-resource

Patent filing · link on grant

Multi-Tenant SDN · 04 / 07

Isolating tenants down to the flow

KVM + OVS + VXLAN overlay with BGP EVPN; per-tenant DNS identity, IPAM for InfiniBand/SR-IOV, and egress accounting; served as incident commander for fleet networking.

OVS/OVN · BGP EVPN · VXLAN

Per-tenant isolation

DPU-Assisted Bare-Metal Cloud · 05 / 07

Moving the whole data plane onto the DPU

Offloaded the entire host network and storage data plane to BlueField-3 DPUs (hardware-offloaded OVS), reaching near-line-rate throughput at negligible host CPU cost. Owned the DPU lifecycle end to end: firmware/OS via Redfish and clusterware, network boot via NVIDIA DOCA SNAP, and qemu-nbd → virtio-blk. Delivered tenant IP/OS mobility, dual-path RAID-1 to DPU block devices, and active/standby dual-DPU failover tied to host UEFI boot order. Partnered with NVIDIA engineering on converged networking.

Near-line-rate · negligible host CPU

BlueField-3 · DOCA SNAP · hw-offload OVS · Dual-DPU failover

Built on NVIDIA DOCA ↗

Platform Ownership · 06 / 07

Cutting the vendor cord

Migrated etcd out of NVIDIA Base Command Manager and transferred Day 0 / Day 2 ownership in-house; eliminated BCM licensing for the Kubernetes layer.

Vendor licensing eliminated

etcd migration · Self-owned cadence

Applied ML Platform · 07 / 07

Finding the same product twice — at catalog scale

Built a duplicate-item-matching platform: parallel image and text deep-embedding pipelines with FAISS vector search across 50M+ catalogs at 3,500 RPS. The two pipelines returned complementary match sets, which enabled a union-of-candidates design with a 106% recall lift over Elasticsearch.

106% recall ↑

Embeddings · FAISS · Vector search · Published

Publication · link pending
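
To show why a union of candidates raises recall when retrievers find different matches, here is a small sketch. It is an illustration only, not the platform's code: `Union`, `Recall`, and the `sku-*` IDs are hypothetical, and the numbers are toy values rather than measured results.

```go
package main

import "fmt"

// Union merges candidate ID lists from several retrievers, dropping
// duplicates while keeping first-seen order.
func Union(lists ...[]string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, l := range lists {
		for _, id := range l {
			if !seen[id] {
				seen[id] = true
				out = append(out, id)
			}
		}
	}
	return out
}

// Recall is the fraction of relevant IDs that appear in candidates.
// It assumes relevant contains no duplicate IDs.
func Recall(candidates, relevant []string) float64 {
	if len(relevant) == 0 {
		return 0
	}
	set := make(map[string]bool, len(candidates))
	for _, id := range candidates {
		set[id] = true
	}
	hits := 0
	for _, id := range relevant {
		if set[id] {
			hits++
		}
	}
	return float64(hits) / float64(len(relevant))
}

func main() {
	imageHits := []string{"sku-1", "sku-2"} // toy image-pipeline matches
	textHits := []string{"sku-2", "sku-3"}  // toy text-pipeline matches
	truth := []string{"sku-1", "sku-2", "sku-3", "sku-4"}

	// Each pipeline alone finds 2 of 4 duplicates; together they find 3.
	fmt.Println(Recall(imageHits, truth), Recall(Union(imageHits, textHits), truth)) // 0.5 0.75
}
```

The gain depends entirely on how little the two candidate sets overlap; if both pipelines returned the same items, the union would add nothing.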

The path

2020 — Now

Coupang Intelligence Cloud

Director of Engineering · Senior Staff SWE

Owning fleet-scale GPU and LLM inference infrastructure — the Kubernetes-native control plane beneath model training and serving across ~5,000 H200/B200 GPUs and 200+ clusters.

2018 — 2020

AWS

Senior SWE

WAFV2 + Firewall Manager — building edge security control planes operating at billions of requests per day.

2015 — 2018

Microsoft

Senior SWE

Dynamics CRM Online reliability — hardening a large multi-tenant SaaS platform.

2009 — 2015

Intel

Lead SWE · Foundry Services

Led engineering in Foundry Services, driving $3M in new revenue.

The stack

GPU & LLM Compute

vLLM
H200 / B200
VFIO
Fractional GPU
InfiniBand / SR-IOV
NCCL
RunAI

Kubernetes Control Plane

Custom CRDs / controllers
Custom scheduler
Gang / preempt-reclaim
Admission webhooks
etcd
Self-healing reconciliation

SDN & Overlay

KVM / OVS / OVN
VXLAN
BGP EVPN
OpenFlow
CNI / Calico
IPAM

Languages & Tooling

Go
Java
Python
gRPC / Protobuf
Linux internals
Prometheus / Grafana
Terraform / Helm / ArgoCD

Open to the right room

Let's talk control planes, GPUs, and teams.

Staff / Principal IC · Eng leadership · GPU, ML & distributed-systems infrastructure

Email me · LinkedIn · Download résumé