Product Engineer - Failure Analysis Engineering

NVIDIA

Product
Mexico · Remote
Posted on Monday, June 24, 2024

We are now looking for a Datacenter Product Engineer! NVIDIA Corporation is a world leader in visual computing technology. The GPU, which the company invented, serves as the visual cortex of modern computers and is at the heart of its products and services. NVIDIA has transformed into a specialized platform company that targets four large markets – Gaming, Professional Visualization, Datacenter and Automotive – where visual computing is essential and deeply valued. Its work also uncovers new universes to explore and enables amazing creativity and discovery by powering what were once thought to be science-fiction inventions, such as artificial intelligence and autonomous cars.

Collaborating with your peers across various engineering groups, you will successfully launch new systems for NVIDIA HGX GPU Accelerated Server Platforms to production. These purpose-built systems are optimized for the growing Deep Learning, Artificial Intelligence, and Analytics environments. With world-class technology enabling never-been-seen-before performance levels, NVIDIA’s HGX portfolio is arguably the most complicated systems platform ever developed by humans. This product family represents the company’s fastest growing line of business as well as its largest total available market opportunity. You will bring to bear your knowledge of system architectures and GPU technology in order to productize new GPU boards for datacenter architectures with GPU-accelerated clusters. Your responsibilities will include planning and establishing processes, defining test requirements and optimizing the production line to deliver new GPU boards. You will also be instrumental in helping the team to achieve the desired cost and quality metrics considered best-in-class.

What you will be doing:

  • Work at a partner-managed facility to help analyze and debug complex problems.

  • Use your knowledge of system power-up and handshakes during boot to debug complex interactions between HW, FW and SW on faulty boards.

  • Recommend, drive and ensure compliance with DFx requirements for robust signal integrity performance as related to layout, mechanical components, assembly procedures, etc.

  • Develop and deliver test specs for system-level manufacturing screens for all new products to meet the required HW coverage, quality and product requirements for various business units.

  • Collaborate with the contract manufacturer (CM) to define the product assembly line, number of test stations and number of assembly fixtures, optimized for cost and throughput.

  • Craft creative solutions and workarounds (WARs) through volume data analysis and lab experimentation to solve challenging yield and test problems seen on the production floor.

  • Lead optimization and continuous improvement efforts on the production screen spec definition processes to minimize waste and meet test time, yield and DPPM requirements.

  • Support customer-facing and quality teams during customer escalations to understand the issue and fix gaps identified in coverage.

What we need to see:

  • BS or MS degree in EE/CE or equivalent experience.

  • 10+ years of meaningful industry experience.

  • Strong EE fundamentals, with knowledge of digital design, signal integrity, statistics, timing analysis, fault analysis, sampling and computer architecture.

Ways to stand out from the crowd:

  • Prior board- or system-level electrical design experience.

  • Experience in managing Failure Analysis activities.

  • Experience with Perl, C/C++, Windows, and Linux.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers; we have some of the most forward-thinking and hardworking people in the world working for us and, due to unparalleled growth, best-in-class teams are rapidly growing. If you’re creative and autonomous with a real passion for your work, we want to hear from you!