Summary

Software engineer with extensive experience in deep learning performance and HW/SW co-design, including leadership of cross-functional teams and large-scale deployment projects. Expertise in inference compiler technologies, neural network optimization, and high-performance computing architecture. Holds a B.A.Sc. in Systems Design Engineering with Summa Cum Laude honors from the University of Waterloo. Aspiring to apply advanced technical proficiency and leadership skills to roles focused on innovative AI system design and performance optimization at scale.

Work experience

Senior Staff Software Engineer - Tech Lead

2024/07 - Present
Untether AI - Toronto, ON
  • Leading the design of next-generation inference compiler technologies for LLMs
  • Managing a cross-functional team responsible for the deployment of Llama 3 8B and 70B
  • vLLM runtime and compiler integration
  • Multi-chip and rack-scale support
  • MLPerf submission and benchmarking (BERT Large, RN50)
  • Silicon metal spin, risk-cost analysis of ECOs, and decision making

Staff Software Engineer - Deep Learning Performance

2023/01 - 2024/07
Untether AI - Toronto, ON
  • Hardware/Software co-design and validation for Untether's generative inference accelerator
  • Team management of 17 contractors deploying CNNs on Untether's inference accelerator
  • High-performance computing, kernel development, and software stack architecture design
  • Spatial NN placement, routing, and throughput optimization
  • Client engagement and custom application design
  • Hardware bring-up planning and execution

Deep Learning Engineer

2021/09 - 2023/01
Untether AI - Toronto, ON
  • Neural network ingestion, quantization, and optimization
  • INT8 and MF8 layer/algorithm development
  • Quantization-aware training and sequential knowledge distillation
  • Graph pruning, fusion, and post-training layer swapping

Support Researcher

2020/04 - 2021/09
Huawei R&D Laboratory - Waterloo, ON
  • Remote Differential Compression algorithm creation, benchmarking, and publication

Education

B.A.Sc. in Systems Design Engineering - Artificial Intelligence

2016/08 - 2021/05
University of Waterloo
  • Summa Cum Laude / Dean's Honors List.

Publications

  • Bhatt, Ramón et al. "Unsupervised Detection of Lung Nodules in Chest Radiography Using Generative Adversarial Networks", EMBC Annual Meeting 2021
  • Borzov, Ramón et al. "Method and Apparatus for Replicating a Target File between Devices", World Intellectual Property Organization, WO2023000915A1 / US20230087778. Issued Jan 26, 2023
  • Kitamura, Ramón et al. "Mapping Attention Mechanisms (Transformer) Function to Spatial Architecture (SIMD or At-Memory Processing)", US Patent Office, 63/608,539. Filed Dec 22, 2023, Patent Pending

Notable Projects

ChexScan - Capstone Project

2020/08 - 2021/04
University of Waterloo
  • ML model for chest X-ray disease screening. Finished in first place and published results at EMBC 2021.