Job Description:
• Architect, build, and operate end-to-end ML pipelines for training, validation, and deployment on Google Cloud.
• Define, instrument, and maintain logging, monitoring, and alerting for model performance and data drift.
• Automate CI/CD for ML artifacts and infrastructure using GitHub Actions or equivalent.
• Collaborate with cross-functional teams, including frontend engineers, backend engineers, research engineers, and infrastructure engineers.
• Write clean, well-documented, fast, and maintainable code.
• Help ensure our systems have high availability and performance.
Requirements:
• BS in Computer Science or a related field.
• 5+ years of experience as an AI/ML Ops, DevOps, or Infrastructure Engineer, or in an equivalent role.
• Expert-level Python and TypeScript skills.
• Experience with Docker, Kubernetes, Terraform, and Google Cloud.
• Deep understanding of large language models (LLMs) and prompt-engineering best practices.
• Experience designing and maintaining CI/CD pipelines that fine-tune or train LLMs.
• Excellent written and verbal communication skills.
Benefits: