Senior DataCenter System Engineer

NVIDIA

Multiple locations
Posted on Sep 22, 2023

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with GPU-accelerated servers acting as the brain of modern data centers. The NVIDIA Datacenter System group is looking for a Senior System Diagnostics Engineer with an interest in design automation, tools, integration, verification and validation. This position offers the opportunity to make a broad impact across many projects while developing internal solutions that enable system creation in the most optimized way.

What You'll Be Doing:

  • Lead planning, defining test requirements and optimizing the production line to deliver new NVIDIA GPU datacenter products.

  • Be instrumental in driving the best-in-class quality metrics by tracking and ensuring adequate test coverage of all aspects of the product.

  • Collaborate with Test Engineering, Product Engineering and SW teams to ensure successful production test diagnostics SW package releases through various stages of NPI, product ramp and entire lifecycle.

  • Leverage your in-depth experience with statistical analysis tools and data parsing scripts to define manufacturing test spec limits for various parameters.

  • Analyze, debug and resolve critical firmware and software issues, often under tight time schedules.

  • Use your knowledge of system power-up and handshakes during boot to debug sophisticated interactions between HW, FW and SW on faulty boards.

  • Engage early with HW/FW/SW engineering teams, and other groups, to build end-to-end solutions and optimize datacenter product designs.

  • Parse manufacturing logs and manipulate databases to analyze failures.

  • Collaborate across teams to establish continuous improvements in our manufacturing flows and production test diagnostics.

  • Innovate!

What We Need To See:

  • Bachelor's or Master's degree in Math, Computer Science, or an Engineering field, or equivalent experience.

  • 6+ years of experience with server systems.

  • Strong problem solving and software engineering skills, a passion for applying them to new challenges and a commitment to high quality work.

  • Expertise in Python or a similar language and an understanding of object-oriented programming.

  • Proven experience with server architectures, CPU baseboards and GPU technology in order to productize new GPU boards and GPU-accelerated server architectures.

  • Consistent track record of conceptualizing, designing, and implementing modular and robust software components with well-thought-out APIs and interfaces.

  • Deep knowledge of server systems including SBIOS, BMC, network, power, rack layouts and cabling, and experience with compute, storage and GPU servers in both air-cooled and water-cooled environments.

  • Knowledge of IPMI/SNMP/Redfish.

  • Ability to multitask effectively in a dynamic environment.

  • You love solving hard problems and can work independently or as part of a team.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA leads the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars. NVIDIA is looking for great people in multiple teams to help us accelerate the next wave of artificial intelligence, in Software, Hardware, Research and more. If you are creative and passionate, we want to hear from you.

The base salary range is $152,000 - $287,500. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.