Architect - GPU Performance Analysis
NVIDIA
NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing. NVIDIA is a “learning machine” that constantly evolves by adapting to new opportunities which are hard to seek, which only we can pursue, and which matter to the world. This is our life’s work: to amplify human inventiveness and intelligence.
NVIDIA is now driving innovation at the intersection of visual processing, high-performance computing, and artificial intelligence. We are looking for passionate, highly motivated, creative engineers to be part of our HW architecture team. As part of this team, you will work on projects that help make our next-generation visual computing, automotive, GPU, and HPC systems better. You will get to work on high-performance CPU and memory subsystems, next-gen GPUs, NoC-based interconnect fabric, and more. Make the choice to join us today.
What you'll be doing:
System-level performance analysis and bottleneck analysis of complex, high-performance GPUs and System-on-Chips (SoCs).
Work on hardware models at different levels of abstraction, including performance models, RTL test benches, emulators, and silicon, to analyze performance and find performance bottlenecks in the system.
Understand key performance use-cases of the product. Develop workloads and test suites targeting graphics, machine learning, automotive, video, and computer vision applications running on these products.
Work closely with the architecture and design teams to explore architecture trade-offs related to system performance, area, and power consumption.
Develop required infrastructure including performance models, testbench components, performance analysis and visualization tools.
Drive methodologies for improving turnaround time, finding representative datasets, and enabling performance analysis early in the product development cycle.
What we need to see:
BE/BTech or MS/MTech in a relevant area, or equivalent experience; a PhD is a plus.
3+ years of experience with exposure to performance analysis and complex System-on-Chip (SoC) and/or GPU architectures.
Strong understanding of System-on-Chip (SoC) architecture, graphics pipeline, memory subsystem architecture and Network-on-Chip (NoC)/Interconnect architecture.
Expert hands-on competence in programming (C/C++) and scripting (Perl/Python). Exposure to Verilog/SystemVerilog and SystemC/TLM is a strong plus.
Strong debugging and analysis skills (including data and statistical analysis), including use of RTL dumps to debug failures.
Hands-on experience developing performance simulators and cycle-accurate/approximate models for pre-silicon performance analysis is a strong plus.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, autonomous and love a challenge, we want to hear from you. Come, join our Deep Learning Automotive team and help build the real-time, cost-effective computing platform driving our success in this exciting and quickly growing field.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.