Senior Networking AI Platform Engineer
NVIDIA
We are seeking a highly skilled Senior Networking AI Platform Engineer to join our Applied Networking AI group. In this role, you will help design and develop cutting-edge AI solutions, integrating them seamlessly into a variety of products. You’ll collaborate closely with multi-functional teams of data scientists, software engineers, and DevOps professionals to ensure the efficient deployment, monitoring, and optimization of machine learning (ML) models.
As a key contributor, you will drive the entire software development lifecycle—from conceptualization and architecture to implementation and production—while working closely with engineering teams to solve complex problems and help build a successful NVIDIA practice.
What you'll be doing:
Lead the design, development, and deployment of robust software systems across different platforms and environments.
Architect, design, and implement scalable and high-performance software solutions, handling complex requirements and integrating various subsystems.
Ensure systems are maintainable, flexible, and well-documented, with an emphasis on performance and security.
Adapt to new tools, technologies, and frameworks, and be capable of taking ownership of the development process from conception to deployment.
Supply innovative ideas and solutions, driving continuous improvement in both code quality and system efficiency.
Develop and maintain scalable infrastructure for handling and deploying security and networking ML models in production, ensuring high availability, scalability, and performance.
Design and implement data pipelines to efficiently process and transform large volumes of data for training and inference purposes.
Optimize and fine-tune ML models for performance, scalability, and resource utilization, considering factors such as latency, efficiency, and cost.
Collaborate with data scientists and software engineers to operationalize and deploy ML models, including model versioning, packaging, and integration with existing systems.
What we need to see:
Bachelor’s or master’s degree in Computer Science, Data Science, or a closely related discipline.
Over 5 years of experience in software development and/or MLOps.
Strong proficiency in programming languages such as Python, Java, and C++.
Deep understanding of cloud services architecture and the ability to create real-world applications that include telemetry, authentication, authorization, and standard security methodologies.
Proven track record of leading complex software projects from concept to delivery.
A "can do" attitude with exceptional problem-solving skills and the ability to thrive in fast-paced environments.
Ability to resolve sophisticated issues in a timely manner.
Excellent communication and collaboration skills, with the ability to work effectively in multi-functional teams.
Attention to detail and a focus on quality, ensuring robustness and reliability in production ML systems.
Experience with Kubernetes architecture and management is a plus.
Ways to stand out from the crowd:
Exude high energy and a positive attitude.
Stellar verbal and written communication skills.
Passionate about data science and implementation.
Have data science and GPU performance experience.
Want to make what was impossible possible!
We are an equal opportunity employer and value diversity at our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.