About the position
As the leader in animal health, Zoetis is looking to recruit a Senior DevOps/MLOps Engineer into its world-class Veterinary Medicine Research and Development (VMRD) organization to operationalize AI/ML, scientific modeling, and digital twin workloads. You’ll build secure, scalable platforms and data pipelines across cloud and on‑prem/HPC, partnering closely with biologists and data scientists to translate scientific questions into reliable production systems.
Responsibilities
• Build end‑to‑end DevOps/MLOps foundations: CI/CD for code/data/models, containerization/orchestration, artifact/registry management, and secure configuration.
• Design and operate data engineering pipelines (batch/streaming) with data quality checks, lineage, schema contracts, and governance across lake/warehouse environments.
• Productionize scientific and digital twin workflows into services/APIs and lightweight UIs with reproducibility, versioning, auditability, and compliant deployment.
• Implement scalable training/inference (batch/real‑time) with observability, SLIs/SLOs, runbooks, incident response, and automated rollback strategies.
• Run distributed/HPC jobs (including GPU) and optimize storage, throughput, and cost across on‑prem and cloud; collaborate with scientists on experiment design, data/compute needs, and validation.
Requirements
• PhD in a quantitative field (computer science, ML, computational biology, applied math), or an MS/BS with equivalent senior-engineer-level experience working in a scientific domain.
• 6+ years building production systems; strong software engineering fundamentals.
• Expert in Python.
• Strong experience with a query language or data-processing model such as SQL, MapReduce, and/or Cypher.
• Proficiency in one of: C++, Go, Rust, Java, or Scala.
• Docker, Kubernetes, CI/CD (e.g., GitHub Actions), secure artifact/container registries.
• Data pipeline orchestration (e.g., Databricks, Dagster, Kedro); streaming (Kafka or Redis); data modeling with SQL/NoSQL/graph.
• MLOps: experiment tracking and model versioning (e.g., MLflow), model serving and monitoring.
• Cloud (AWS/Azure/GCP) and on‑prem/HPC (e.g., Slurm) experience.
• Experience on multidisciplinary projects and teams that include scientists and software engineers, and excellent communication with scientific stakeholders.
Nice-to-haves
• APIs and scientific apps: FastAPI; minimal UIs (Streamlit/React); scientific computing (NumPy, Pandas, SciPy).
• DevOps/IaC: Terraform; GitOps (Argo CD/Flux); Helm/Kustomize; Docker/Kubernetes; secure registries and config.
• Data engineering: dbt and feature stores; Parquet/Delta; schema/lineage with Avro/Protobuf, OpenLineage, Great Expectations.
• Observability/SRE: Prometheus/Grafana; ELK/OpenSearch; OpenTelemetry; SLIs/SLOs and performance profiling/optimization.
• Distributed compute and resilience: Dask, Ray, Spark; HPC/Slurm; GPU scheduling; service mesh (Istio/Linkerd), API gateways, ingress; encryption/secrets/KMS, audit trails, backup/restore, DR.
Benefits
• We offer a competitive and comprehensive benefits package, which includes healthcare, dental coverage, and retirement savings benefits, along with paid holidays, vacation, and disability insurance.