Senior Software Engineer, Data Science--LLM MLOps Platform
NVIDIA is hiring a Senior Data Science Engineer to onboard ML teams onto our unified LLM MLOps platform by developing ML pipelines and building the platform in a way that supports a wide range of use cases. NVIDIA is in an outstanding position: we are developing AI-based products across multiple domains. Our vision is to take the MLOps platform we developed for Autonomous Vehicles and apply it to other use cases: NLP, speech, recommendation systems, computer vision, and healthcare. To achieve this, we will embed ourselves in ML teams to help them move their MLOps pipelines onto a single methodology built on that unified platform, recognize what is common to all use cases and what is domain-specific, and develop MLOps tools and features. The members of this team will realize the MLOps strategy for NVIDIA and understand both MLOps production pipelines and MLOps infrastructure in order to help us build what could be the best and most comprehensive MLOps platform in the world.
What you'll be doing:
We expect you to analyze the datasets, raise and validate hypotheses, extract meaningful features and build models on top of them.
Data science projects involve various types of tasks, ranging from LLM prompt, response, and chat analysis to document retrieval data and analysis, metadata collection and augmentation using machine learning techniques, LLM data sample categorization using prompt engineering techniques, product intelligence from production data, outlier/noise detection, and much more.
Optimize your algorithms and data structures to make them applicable to large datasets and cluster-based processing. Perfect the models and algorithms until they reach the desired accuracy.
Prepare documentation for the proposed approaches, policies, data formats, test cases and the expected results within the scope of your projects.
What we need to see:
Bachelor's degree or equivalent experience
5+ years of industry experience
Ability to write readable and maintainable code (primarily in Python/PySpark), knowledge of scientific libraries (NumPy, SciPy, Pandas, scikit-learn)
Experience with extracting data from storage systems (e.g. Hive, Cassandra, S3, SwiftStack) and understanding of how big data processing systems (e.g. Spark or MapReduce) work and help scale the processes.
Experience in solving problems using machine learning techniques (statistics, clustering, classification, outlier analysis, etc.). Experience with Deep Learning and LLM prompt engineering or prompt tuning is a plus. Strong background in working with time series data and hypothesis validation is desired.
Stress resistance and strong collaboration, interpersonal, presentation, and reporting skills should be among your strengths. You are skilled and eager, a self-taught person with strong self-management skills. Be proactive in proposing innovative approaches and algorithms. Don't be scared of diving into scientific articles to come back with a fresh solution!
Strong communication skills: upper-intermediate oral and written technical English is required
Ways to stand out from the crowd:
Championing a privacy-first approach when handling user data for product analysis and machine learning model development
Experience utilizing differential privacy techniques for processing user data and developing machine learning models
With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology industry's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working with us and our engineering teams are growing fast in some of the hottest innovative fields: Deep Learning, Artificial Intelligence, and Large Language Models. If you're a creative engineer with a real passion for robust and enjoyable user experiences, we want to hear from you!
The base salary range is $144,000 - $270,250. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.