Simplify Foundry

The document foundation for the modern enterprise.

Foundry turns raw documents into governed, structured data — one canonical source of truth your products and teams build on.

Enterprise knowledge is trapped in documents.

Contracts, statements, filings, and reports hold the data that runs your business — locked in PDFs and scans. The pipelines that extract it are brittle, lossy, and impossible to audit. Foundry replaces them with one governed foundation.

One pipeline. Raw documents to governed data.

Foundry ingests PDFs and images, reads their text, tables, and figures, and resolves them into a single canonical model. Every element is typed, located, scored for confidence, and traceable to its source.

The pipeline

  1. Ingest

    Bring in PDFs and images at any scale. Foundry automatically fans pages out across the pipeline.

  2. Parse

    Vision OCR reads text, tables, and figures — including scanned and low-quality pages — without brittle text extraction.

  3. Structure

    Every page resolves into one canonical document model: typed blocks, reading order, tables, and figures, each with stable identity.

  4. Govern

    Every value traces to its exact source page. Versioned, auditable, and built for regulated workloads.
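
To make the Structure and Govern steps concrete, here is a minimal sketch of what a canonical document model could look like in client code. It is an illustration, not Foundry's actual schema: the class names, the `order` field, the set of block types, and the meaning of the four `bbox` numbers are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Tuple

# Assumed set of block types; the real model may define more.
BlockType = Literal["text", "table", "figure"]


@dataclass(frozen=True)
class Lineage:
    """Where a block came from in the source document (hypothetical shape)."""
    page: int
    bbox: Tuple[float, float, float, float]  # coordinate convention is an assumption


@dataclass(frozen=True)
class Block:
    """One typed element with a stable identity."""
    id: str
    type: BlockType
    order: int          # hypothetical field: position in reading order
    confidence: float
    lineage: Lineage


@dataclass
class CanonicalDocument:
    id: str
    pages: int
    blocks: List[Block] = field(default_factory=list)

    def in_reading_order(self) -> List[Block]:
        """Return blocks sorted by their reading-order position."""
        return sorted(self.blocks, key=lambda b: b.order)

    def on_page(self, page: int) -> List[Block]:
        """Return the blocks whose lineage points at the given page."""
        return [b for b in self.in_reading_order() if b.lineage.page == page]


# Example data is made up for illustration.
doc = CanonicalDocument(
    id="doc_example",
    pages=2,
    blocks=[
        Block("b2", "table", 1, 0.98, Lineage(2, (142, 88, 512, 240))),
        Block("b1", "text", 0, 0.99, Lineage(1, (72, 72, 540, 120))),
    ],
)
print([b.id for b in doc.in_reading_order()])
```

Keeping lineage on every block, rather than on the document as a whole, is what lets each value be traced back to its page on its own.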

Built for documents you can't get wrong.

Foundry is built for banking, legal, and regulated industries — where a wrong number isn't an option, and where you have to prove where every answer came from.

  • Provenance

    Every extracted value traces back to its source page and position.

  • Confidence

    Every extraction is scored, so you know what to trust and what to review.

  • Audit & versioning

    Every run is versioned, reproducible, and fully logged.

  • Control

    Granular access control and data residency, including self-hosting.
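
As a rough sketch of how confidence scores can drive a review workflow, the snippet below splits extracted blocks into a trusted set and a review queue. The `triage` helper and the 0.95 threshold are hypothetical; the right cutoff depends on your documents and risk tolerance.

```python
from typing import Dict, List, Tuple

REVIEW_THRESHOLD = 0.95  # hypothetical cutoff, not a Foundry default


def triage(
    blocks: List[Dict], threshold: float = REVIEW_THRESHOLD
) -> Tuple[List[Dict], List[Dict]]:
    """Split blocks into (trusted, needs_review) by confidence score."""
    trusted: List[Dict] = []
    needs_review: List[Dict] = []
    for block in blocks:
        # A block without a score is treated as 0.0, so it is always reviewed.
        if block.get("confidence", 0.0) >= threshold:
            trusted.append(block)
        else:
            needs_review.append(block)
    return trusted, needs_review


# Sample blocks are made up for illustration.
sample = [
    {"type": "table", "page": 12, "confidence": 0.984},
    {"type": "text", "page": 3, "confidence": 0.71},
    {"type": "figure", "page": 5},
]
trusted, needs_review = triage(sample)
print(len(trusted), "trusted;", len(needs_review), "for review")
```

Sending unscored blocks to review is a deliberately conservative choice for regulated workloads.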

A foundation, not a feature.

Everything Foundry processes is available through one API. Simplify Studio is built on it — and so is everything you build next. One source of truth, every product on top.

Explore the API
GET /v1/documents/doc_8f2a…/canonical

{
  "id": "doc_8f2a",
  "pages": 47,
  "blocks": [
    { "type": "table", "page": 12, "confidence": 0.984,
      "lineage": { "page": 12, "bbox": [142, 88, 512, 240] } }
  ]
}
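
A response like the one above can be consumed with a few lines of client code. The sketch below parses the sample payload exactly as shown and prints a source citation for each block. It makes no network call, and the `describe_sources` helper is hypothetical. The layout of the four `bbox` numbers is not specified here, so the code prints them without interpreting them.

```python
import json

# The sample response body from above, embedded as a string.
RESPONSE_BODY = """
{
  "id": "doc_8f2a",
  "pages": 47,
  "blocks": [
    { "type": "table", "page": 12, "confidence": 0.984,
      "lineage": { "page": 12, "bbox": [142, 88, 512, 240] } }
  ]
}
"""


def describe_sources(payload: dict) -> list:
    """Build one human-readable citation line per block (hypothetical helper)."""
    lines = []
    for block in payload["blocks"]:
        lineage = block["lineage"]
        lines.append(
            f'{payload["id"]}: {block["type"]} '
            f'on page {lineage["page"]} of {payload["pages"]}, '
            f'bbox={lineage["bbox"]}, '
            f'confidence={block["confidence"]:.3f}'
        )
    return lines


doc = json.loads(RESPONSE_BODY)
citations = describe_sources(doc)
for line in citations:
    print(line)
# prints: doc_8f2a: table on page 12 of 47, bbox=[142, 88, 512, 240], confidence=0.984
```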

Sectors

  • Banking

    Statements, filings, and compliance documents, structured and governed.

  • Legal

    Contracts and case files, with clauses, parties, and dates resolved and traceable.

  • Regulated enterprise

    Any document workflow that has to be accurate and auditable.

Build on the document foundation.