Senior Software Engineer - Data Center Rack and Power Management Engineering

NVIDIA

Software Engineering
Multiple locations
Posted on Oct 23, 2024

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

Join NVIDIA, a pioneer in accelerated computing, at the forefront of technological advancement as a Senior Software Engineer - Data Center Rack and Power Management. This role offers a unique opportunity to be part of our mission to design and implement rack-level solutions for next-generation AI supercomputing platforms. NVIDIA’s GB200 superchip exemplifies our commitment to performance and productivity, and we seek exceptionally versatile engineers to advance our ambitious goals.

What you’ll be doing:

  • Drive next-generation power management solutions for scaling AI infrastructure built on NVIDIA GPU and CPU solutions.

  • Collaborate with customers, product management, and architects to accurately define requirements and ensure high-quality products on accelerated schedules.

  • Develop architecture for power management at the server and rack levels, optimizing power consumption at the data center level.

  • Produce detailed architecture specifications and validate through POCs. Educate partners on product architecture and incorporate their feedback.

  • Coordinate the development of comprehensive architecture specs and design documents. Lead all aspects of product delivery by collaborating across teams.

  • Conduct code reviews, improve unit testing, and ensure a robust test plan is in place.

  • Support QA teams throughout the product life cycle to ensure successful implementation. Use Jira and other tools effectively to articulate requirements and carry out plans.

  • Contribute to all phases of product development, from definition and design to implementation, debugging, testing, and early customer support.

What we need to see:

  • BS, MS, or PhD in EE/CS or a related field (or equivalent experience) and at least 8 years of experience building rack or server management solutions.

  • Experience evaluating power usage at the component level and reducing power consumption in server systems. Understanding of power metrics retrieval from devices.

  • Expertise in firmware architecture and optimizing firmware for low-latency APIs.

  • Strong, proven skills in C/C++ and Python.

  • Proficient programming and debugging skills for server platforms.

  • Experience with SCM tools (e.g., Git, Perforce) and project management tools like Jira.

  • Excellent written and oral communication skills, a strong work ethic, and a strong sense of teamwork.

  • A self-starter who finds creative solutions to complex problems and is hands-on with coding.

Ways to stand out from the crowd:

  • Proven track record of improving perf/watt or TCO/watt for data centers.

  • Experience developing OpenBMC solutions, ideally with commits that have been upstreamed to the open-source repository.

  • Active OCP and DMTF contributor in relevant areas with hands-on experience in x86 or ARM system architecture.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, we want to hear from you!

The base salary range is 180,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.