Senior Software Engineer (Data)

Amigo

Software Engineering
New York, NY, USA
Posted on Dec 4, 2025

About Amigo

Amigo builds AI agents for healthcare—systems that handle patient conversations, take clinical actions, and escalate to humans when needed.

Our agents operate autonomously within bounded clinical domains, each defined by clear inclusions, exclusions, and handoff protocols. The scope of autonomous operation expands over time as we validate performance across patient populations.

We own the data foundation end-to-end: patient interactions, agent reasoning traces, outcome data, and synthetic data with known fidelity. This enables population-level analytics and continuous improvement.

We are backed by a Series A from leading investors, and we carry out clinical validation and evidence generation in partnership with leading academic medical institutions.

About this role

As a Senior Software Engineer (Data) at Amigo, you'll build the data infrastructure that powers agent improvement, clinical analytics, and research collaboration. You'll own streaming and batch pipelines on Databricks that process agent conversations, clinical events, and patient outcomes at scale.

Our data platform is a strategic differentiator. We own the entire data foundation—from raw interaction data to agent reasoning traces to clinical outcomes. You'll build pipelines that enable population analysis, data mining, and the Research Platform backend.

What you'll do

  • Build and maintain streaming and batch pipelines on Databricks (Delta Lake, Spark)

  • Design CDC pipelines that sync operational databases to Delta Lake for analytics

  • Implement data mining pipelines for persona discovery, scenario extraction, and edge case detection

  • Build the data backend for the Research Platform, including natural language to SQL capabilities

  • Create data quality monitoring, staleness detection, and automated alerting

  • Build pipelines for voice and SMS analytics (call quality, engagement metrics)

  • Support multi-region data deployment and compliance requirements

  • Collaborate with agent engineers and data scientists to surface insights that improve agent performance

What we're looking for

  • 4+ years of production data engineering experience

  • Strong experience with Databricks, Spark, and Delta Lake

  • Proficiency in Python and SQL for pipeline development

  • Experience building streaming pipelines and CDC (change data capture) systems

  • Understanding of data modeling, medallion architecture (bronze/silver/gold), and query optimization

  • Experience with data quality frameworks and monitoring

  • Track record of building reliable, production-grade data infrastructure

  • Both execution-oriented and defensive-minded: you ship pipelines while anticipating failure modes

  • Strong debugging skills for distributed data systems

  • Clear communication with data scientists, backend engineers, and product teams

Nice to have

  • Experience with healthcare data or HIPAA compliance requirements

  • Background with ML pipelines (feature engineering, model training infrastructure)

  • Experience building natural language query interfaces or LLM-powered data tools

  • Familiarity with vector search and embedding pipelines

  • Experience with Delta Sharing or data collaboration protocols

Benefits

Health & Wellness

  • Comprehensive health, dental, and vision insurance

  • Mental health support and wellness coaching

  • Flexible wellness stipend for fitness, therapy, or personal growth

  • Daily catered lunch and dinner

Growth & Development

  • Annual learning budget for courses, books, or conferences

  • Conference attendance budget for professional development

  • Development setup of your choice

  • Academic collaboration opportunities