Untether AI - Toronto, ON
- Led the design and architecture of next-generation inference compiler technologies for large language models (LLMs)
- Delivered industry-leading MLPerf results across BERT Large and ResNet-50 benchmarks
- Drove silicon spin readiness, analyzing ECO cost-risk tradeoffs and guiding compiler-related silicon decisions
- Managed cross-functional teams responsible for deploying Llama3 8B and 70B across multi-chip, rack-scale inference systems