
Public Sector

Senior AI Engineer, XTAM Public Sector

Location
Remote
Type
4-month contract, follow-on expected

Hands-on role building a distributed Apache Beam ETL pipeline, Dgraph-based Knowledge Base, and LLM-powered analysis agent for a Tier 1 defense contractor. Must deploy identically across commercial cloud, on-prem, and airgapped environments.

About the role

We are hiring a Senior AI Engineer to help build a distributed ETL pipeline, graph-based Knowledge Base, and LLM-powered analysis agent for a Tier 1 defense contractor. This is a hands-on role producing production-grade AI infrastructure for large-scale engineering models and technical documents, with LLM-powered extraction running in parallel across workers. The system must deploy identically across commercial cloud, on-premises, and airgapped environments with only a runner configuration change.

The initial engagement is a four-month program delivering a distributed Apache Beam ingestion pipeline, a Dgraph-based Knowledge Base, an LLM-powered gap analysis agent, a Go-based API and agent service, and a production-ready docker-compose stack. Follow-on work across the XTAM Public Sector portfolio is expected.
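To give candidates a concrete picture of what "only a runner configuration change" means in practice, here is a minimal sketch. The profile names and placeholder values are illustrative assumptions, not project specifics; the flags shown (`--runner`, `--project`, `--region`, `--flink_master`) are standard Apache Beam pipeline options.

```python
# Hypothetical sketch of configuration-driven runner selection for a Beam
# pipeline. The transforms are identical in every environment; only these
# arguments change between dev, cloud, and airgapped deployments.
def runner_args(profile: str) -> list[str]:
    """Map a deployment profile to Beam pipeline arguments.

    Pipeline code never branches on the runner; it just receives
    whichever argument list the active profile produces.
    """
    profiles = {
        "dev": ["--runner=DirectRunner"],
        "gcp": [
            "--runner=DataflowRunner",
            "--project=example-project",  # placeholder GCP project
            "--region=us-central1",       # placeholder region
        ],
        "airgapped": [
            "--runner=FlinkRunner",
            "--flink_master=flink-jobmanager:8081",  # placeholder endpoint
        ],
    }
    return profiles[profile]
```

The same pipeline entry point is then invoked with the selected profile's arguments appended; nothing else in the codebase changes between environments.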

Responsibilities

  • Design and build a distributed Apache Beam pipeline (Python) that parses streaming XMI model exports and runs LLM-powered extractors in parallel across workers, with no full DOM loads and horizontal scaling across thousands of elements.
  • Build runner-agnostic pipeline code that executes identically on DirectRunner (dev), Dataflow (GCP), and Flink (on-prem and airgapped), with runner configuration as the only change.
  • Implement entity resolution and cross-referencing between pre-parsed documents (Docling JSON) and model elements, producing graph edges with confidence scores.
  • Architect and implement a graph-based Knowledge Base on Dgraph (or similar), including schema design, program-level data isolation, and query builders scoped to agent needs.
  • Build and maintain a pluggable extractor framework covering classification, completeness, relationship semantics, and document reference resolution, with a clean path for client domain-specific extractors.
  • Build an LLM provider abstraction supporting Anthropic Claude, Google Gemini, and local inference (Ollama, vLLM), with configuration-driven provider switching and no code changes to swap providers.
  • Build the Go-based API and agent service layer, with clean interfaces between the Python ETL layer and the Go query and agent layer.
  • Package the full system as a docker-compose stack and a standalone ETL pipeline artifact, with integration guides, extractor development guides, and API documentation.
  • Collaborate directly with client engineering teams on V2 REST API integration, sample data validation, and acceptance criteria.
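As a rough illustration of the provider-abstraction responsibility above, the pattern is a shared interface plus a configuration-driven factory, so swapping Claude for a local Ollama model is a config edit rather than a code change. All class and registry names here are illustrative assumptions; the real service would wrap the Anthropic, Gemini, and Ollama/vLLM clients rather than these stubs.

```python
# Hypothetical sketch of a configuration-driven LLM provider abstraction.
# The stubs only demonstrate the switching mechanism, not real API calls.
class LLMProvider:
    """Common interface every backend implements."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

class ClaudeProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"[claude] {prompt}"  # stub; real code calls the Anthropic SDK

class GeminiProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"[gemini] {prompt}"  # stub; real code calls the Gemini SDK

class OllamaProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"[ollama] {prompt}"  # stub; real code hits a local endpoint

_REGISTRY = {
    "claude": ClaudeProvider,
    "gemini": GeminiProvider,
    "ollama": OllamaProvider,
}

def provider_from_config(config: dict) -> LLMProvider:
    """Pick a backend by name from configuration; call sites never change."""
    return _REGISTRY[config["provider"]]()
```

Extractors depend only on `LLMProvider`, which is what lets airgapped deployments drop in local inference without touching pipeline code.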

Requirements

  • 5 to 8+ years in AI engineering, data engineering, or applied ML infrastructure, with demonstrated production experience.
  • Expert-level Python.
  • Depth in at least one of the following, with working familiarity in the others:
      ◦ Distributed processing frameworks (Apache Beam, Spark, Flink — Beam strongly preferred).
      ◦ Graph databases (Dgraph preferred; Neo4j, JanusGraph, or similar acceptable), including schema design and query optimization.
      ◦ Production LLM integration, including prompt engineering, provider abstraction, and cost and latency management.
  • Comfortable with distributed systems design, parallelization patterns, and large dataset throughput tuning.
  • Strong SQL and data modeling fundamentals.
  • Experience with Docker and containerized deployments (docker-compose, Kubernetes).
  • Experience with cloud data infrastructure (GCP strongly preferred for Dataflow; AWS or Azure acceptable).

Preferred

  • Working Go experience, sufficient to contribute to the API and agent service layer (a strong plus).
  • Prior work in defense, aerospace, or government systems integration environments.
  • Exposure to MBSE tooling and SysML (V1 XMI, V2 REST API, Cameo, Teamwork Cloud).
  • Experience deploying into airgapped or classified networks, including provisioning and integrating local LLM inference.
  • Hands-on work with local LLM inference stacks (Ollama, vLLM, TensorRT-LLM).
  • Familiarity with document AI and parsing pipelines (Docling, Unstructured, or similar).
  • Modern orchestration experience (Airflow, Dagster, Prefect).

Apply

Apply for Senior AI Engineer, XTAM Public Sector

Send us your resume and a short note. Applications go directly to the partners. We read everything and respond.
