Job Overview
About Voltai
Voltai’s mission is to rebuild the physical world by developing superintelligence that accelerates the pace of hardware innovation. Our focus is on building frontier models that can understand some of the world’s most complex technologies: semiconductors and electronics.
About the Role
We’re looking for a Data Engineer who thrives on tackling complex challenges and is passionate about building innovative data systems. As a Data Engineer at Voltai, your primary responsibility will be to build the world’s largest semiconductor dataset, leveraging your expertise in data pipelines and scalable infrastructure.
You will play a critical role in preparing data for our Machine Learning team and developing systems that manage vast amounts of information across various modalities such as text, images, circuits, and more.
Key Responsibilities:
- Build and manage the world’s largest semiconductor dataset.
- Develop crawlers to scrape data at internet scale.
- Extract and clean information from diverse modalities, including text, images, circuits, simulations, and signals.
- Prepare and preprocess data for the Machine Learning team.
- Build systems to handle the transfer of customer data and feedback.
- Parse documents across various formats and structures.
- Develop data pipelines for data labelers and manage workloads across large cloud compute clusters.
- Implement and maintain systems for preprocessing datasets for AI training.
Required Skillsets:
- Proven experience in building scalable data pipelines.
- Expertise in PDF parsing and data extraction.
- Strong engineering skills with a passion for improving data and model performance.
- Experience working with modalities beyond text, with demonstrated exceptional work in those areas.
- Ability to build custom data processing libraries from scratch.
- Commitment to keeping up with state-of-the-art techniques for preparing AI training data.
- Proficiency in organizing and meticulously managing data across multiple clouds, modalities, and sources.
Bonus Points:
- Background in Electrical Engineering.
- Experience in connecting machine learning model behavior to data distribution and data quality.
- Experience in fine-tuning large language models.
- Experience at a hyper-growth startup.
- Experience building data pipelines for training foundation models.
Compensation Philosophy
At Voltai, we believe that exceptional work deserves exceptional rewards. Our compensation structure reflects the value each team member brings to our pioneering efforts in the semiconductor and AI industries. For this role, we anticipate the starting annual base salary to be within the range of $150,000 to $350,000, adjusted according to the candidate’s experience, expertise, and impact potential. The final offer may vary to ensure alignment with individual contributions and the long-term success of our mission.
Our Benefits
At Voltai, we believe in taking care of our team so they can focus on pushing the boundaries of innovation. Our benefits package is designed to support your well-being and fuel your professional growth.
- Unlimited PTO: We trust you to manage your time and know when you need a break. Recharge when you need it, no questions asked.
- Comprehensive Health Coverage: Your health matters. We offer top-tier medical and dental insurance to keep you and your loved ones covered.
- Commitment to Your Growth: At Voltai, we’re dedicated to your continuous learning and development. Whether it’s through challenging projects or opportunities for professional advancement, we invest in your journey to becoming a leader in your field.