Harness Engineer
Nomic AI
Other Engineering
New York, NY, USA
Location: New York HQ
Employment Type: Full time
Location Type: Hybrid
Department: Technical Staff
Location: NYC
Reports to: CTO
About Nomic
Nomic builds AI agents and developer tools that power the built world. We help enterprise teams in architecture, engineering, and construction extract structured knowledge from decades of drawings, specs, and project files. Our platform combines embedding models, document parsing, and autonomous agents that reason over real-world data and take action in live environments.
The Role
Our agents reason over massive, messy, real-world document collections — construction drawings, specifications, decades of project history. Getting that right means solving retrieval, context assembly, and evaluation as first-class engineering problems, not afterthoughts bolted onto a prompt.
We're hiring a Harness Engineer to work on the systems that make our agents effective: how they find information, how they assemble context, how we know they're working, and how we make them better over time.
You should be the kind of engineer who knows what a vector database is and when not to use one. Who thinks about retrieval as an architecture problem, not a library call. Who's paying attention to how agent systems actually get built and deployed in 2026 — and has opinions about it.
What You'll Work On
Retrieval systems — search, ranking, chunking strategies, hybrid approaches, knowing which tool fits which problem
Context engineering — assembling the right information for agents operating over large, heterogeneous document sets
Evaluation and harnesses — building the infrastructure to continuously measure agent accuracy, regression-test retrieval quality, and close feedback loops
Agent pipelines — the orchestration layer between retrieval, models, and downstream actions
Scale — making all of the above work across thousands of customer document collections, not just a demo corpus
What We're Looking For
Strong software engineering skills in Python and/or TypeScript
Real experience with retrieval systems — embeddings, vector search, traditional IR, or some combination
You've built systems that had to work on messy, real-world data — not just clean benchmarks
Familiarity with LLMs and agent frameworks in practice, not just in theory
You think in systems — how components interact, where things break, what doesn't scale
Intellectual curiosity about the retrieval and agent tooling landscape as it exists right now
Even better if you have:
Experience with evaluation infrastructure — evals, benchmarks, regression testing for AI systems
Background in search, NLP, or information retrieval
Exposure to the AEC industry or other document-heavy domains