Senior AI Training Performance Engineer

NVIDIA

Software Engineering, Data Science
Shanghai, China
Posted on Jan 26, 2024

We are now looking for a Senior AI Training Performance Engineer!

NVIDIA is seeking senior engineers who are obsessed with performance analysis and optimization to help us squeeze every last clock cycle out of AI training, one of the most important workloads in the world. If you are unafraid to work across all layers of the hardware/software stack, from GPU architecture to deep learning frameworks, to achieve peak performance, we want to hear from you! This role offers the opportunity to directly impact the hardware and software roadmap at a fast-growing technology company that leads the AI revolution, while helping deep learning users around the globe enjoy ever-higher training speeds.

What you will be doing:

  • Understand, analyze, profile, and optimize AI and deep learning training workloads on state-of-the-art hardware and software platforms.

  • Understand the big picture of training performance on GPUs, prioritizing and then solving problems across many dozens of state-of-the-art neural networks.

  • Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks.

  • Implement key DL training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.

  • Build tools to automate workload analysis, workload optimization, and other critical workflows.

What we want to see:

  • PhD (or equivalent experience) in CS, EE, or CSEE with 5+ years of relevant work experience, or MS with 8+ years of relevant work experience.

  • Strong background in deep learning and neural networks, in particular training.

  • Deep understanding of computer architecture, and familiarity with the fundamentals of GPU architecture.

  • Proven experience analyzing and tuning application performance.

  • Experience with processor and system-level performance modelling.

  • Programming skills in C++, Python, and CUDA.

Intelligent machines powered by AI computers that can learn, reason and interact with people are no longer science fiction. Today, a self-driving car powered by artificial intelligence can meander through a country road at night and find its way. An AI-powered robot can learn motor skills through trial and error. This is truly an extraordinary time. The era of AI has begun, and we are powering it.

NVIDIA is increasingly known as the AI Computing company and is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. Are you passionate about performance? Are you interested in working on industry-leading Deep Learning products? Come, join our Deep Learning Architecture team, where you can help build real-time, cost-effective computing platforms driving our success in this exciting and rapidly growing field.