    Senior Machine Learning Engineer

    Engineering
    Boston, MA
    Full-time
    $220,000 - $320,000
    Team: ML Systems

    About the Role

    Regenerative AI builds self-adaptive AI systems that continuously monitor, regulate, and improve their own behavior during real-world operation. We are looking for a Senior Machine Learning Engineer to join our ML Systems team in Boston. You will design, build, and operate production ML systems with built-in feedback loops, runtime monitoring, and automated recalibration. This is a hands-on engineering role focused on system-level robustness, reliability, and continuous adaptation in deployment.

    What You'll Do

    • Design and implement end-to-end ML pipelines: data ingestion, training, evaluation, and serving
    • Build and maintain self-adaptive ML systems with runtime monitoring and drift detection
    • Develop automated recalibration workflows that respond to performance degradation
    • Create observability infrastructure for model health, latency, and prediction quality
    • Own incident response and reliability for ML services in production
    • Collaborate with platform engineers to integrate ML workloads into CI/CD pipelines
    • Establish and enforce governance standards for model versioning, rollback, and auditability
    • Optimize training and inference infrastructure for efficiency and cost
    • Contribute to internal tooling that accelerates ML development velocity

    Qualifications

    • 5+ years of experience building and deploying ML systems in production environments
    • Strong proficiency in Python and modern ML frameworks (PyTorch, TensorFlow, or JAX)
    • Hands-on experience with distributed training and GPU infrastructure
    • Deep understanding of MLOps practices: CI/CD for ML, model versioning, feature stores
    • Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
    • Solid understanding of data pipelines and streaming systems (Kafka, Spark, Airflow)
    • Track record of owning and operating ML services with high availability requirements
    • Strong software engineering fundamentals and code quality standards

    Nice to Have

    • Experience with online learning, adaptive models, or feedback-driven systems
    • Background in model monitoring, drift detection, or anomaly detection
    • Familiarity with control systems concepts or adaptive control
    • Experience with Kubernetes, Docker, and cloud-native ML infrastructure
    • Knowledge of reliability engineering practices (SLOs, error budgets, on-call)
    • Experience building internal ML platforms or developer tools

    Benefits & Perks

    • Competitive salary and meaningful equity
    • Comprehensive health, dental, and vision insurance
    • Flexible PTO and remote-friendly work arrangements
    • Annual learning and development budget ($5,000)
    • Home office setup allowance
    • 401(k) with company match
    • Modern tech stack: Python, PyTorch, Docker, Kubernetes, cloud (AWS/GCP)

    Ready to Join Us?

    We're excited to learn more about you. Apply now and take the next step in your career with Regenerative AI.