Retab raises $3.5M and launches most powerful document AI platform on the market
Posted in Commentary with tags Retab on July 30, 2025 by itnerd

AI agents are poised to change the world, but they can’t read the documents that run it. Retab, a San Francisco-based startup founded by engineers frustrated by the broken state of document AI, is fixing that. Today, the company announces $3.5 million in pre-seed funding and the launch of its platform.
The round was backed by leading early-stage funds including VentureFriends, Kima Ventures, and K5 Global, alongside Eric Schmidt (via StemAI), Olivier Pomel (CEO, Datadog), and Florian Douetteau (CEO, Dataiku). The new capital will support platform development and community growth, as the company scales its infrastructure to meet rising demand from vertical AI startups and internal innovation teams alike.
Retab is a developer platform and SDK that redefines everything about document processing in the age of large language models. Developers define the schema of the data they need; Retab handles the rest – from dataset labeling and evaluations to automated prompt engineering and model selection.
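To make "define the schema of the data they need" concrete, here is a minimal sketch in plain Python. It does not use Retab's SDK, and the invoice field names, the `INVOICE_SCHEMA` dictionary, and the `conforms` helper are all hypothetical, chosen only for illustration. The check covers a small subset of JSON Schema (required keys and primitive types).

```python
# Hypothetical schema for illustration only; this is not Retab's schema format.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["invoice_number", "total_amount"],
}

# Map the JSON Schema primitive names used above to Python types.
_PY_TYPES = {"string": str, "number": (int, float)}


def conforms(record: dict, schema: dict) -> bool:
    """Check required keys and primitive types (a small subset of JSON Schema)."""
    for key in schema["required"]:
        if key not in record:
            return False
    for key, value in record.items():
        spec = schema["properties"].get(key)
        if spec is None:
            return False  # field not declared in the schema
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _PY_TYPES[spec["type"]]):
            return False
    return True


extracted = {"invoice_number": "A-1", "total_amount": 12.5}
print(conforms(extracted, INVOICE_SCHEMA))
```

The point of the sketch is the division of labor: the developer states what a valid record looks like, and anything a model returns can be checked against that statement before it reaches downstream code.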
Louis de Benoist and his co-founders cut their teeth building internal automation tools for document-heavy workflows in logistics. Over time, they realized their true value wasn’t in the output but in the orchestration layer they’d built to make the models work. That tooling became the foundation of Retab – now used by dozens of companies to extract structured data from messy, real-world inputs.
Retab is not another large language model. It’s the essential intelligence layer that makes the world’s most powerful models—from providers like OpenAI, Google, and Anthropic—usable for critical workflows. Developers define the data they need, and Retab’s platform manages the entire lifecycle to ensure verifiable accuracy.
The platform delivers guaranteed performance through a system of intelligent checks and balances:
Self-Optimizing Schemas: An AI agent automatically tests and refines instructions based on a user’s documents, maximizing accuracy before the system ever goes live.
Intelligent Model Routing: The platform is model-agnostic. It automatically benchmarks and routes each task to the best-performing model for the job, whether the priority is cost, speed, or accuracy. This can make processes up to 100x cheaper than other solutions.
Guided Reasoning & k-LLM Consensus: Retab forces models to “think” step-by-step and uses a consensus mechanism among multiple models to quantify uncertainty, acting as a powerful safety net to ensure trustworthy results.
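The consensus idea in the last item can be sketched as a per-field majority vote. This is an illustration of the general technique under simple assumptions, not Retab's implementation: the `consensus` function is hypothetical, each extraction is assumed to be a flat dictionary of hashable values, and the agreement share is used as a rough stand-in for confidence.

```python
from collections import Counter


def consensus(extractions: list[dict]) -> dict:
    """For each field, return (majority value, share of runs that agree with it)."""
    if not extractions:
        raise ValueError("need at least one extraction")
    fields = {key for run in extractions for key in run}
    result = {}
    for field in sorted(fields):
        # A run that omits the field votes for None.
        votes = Counter(run.get(field) for run in extractions)
        value, count = votes.most_common(1)[0]
        result[field] = (value, count / len(extractions))
    return result


# Three hypothetical runs over the same document, e.g. from different models.
runs = [
    {"total": 120.0, "currency": "EUR"},
    {"total": 120.0, "currency": "EUR"},
    {"total": 128.0, "currency": "EUR"},
]
for field, (value, agreement) in consensus(runs).items():
    print(field, value, round(agreement, 2))
```

Fields where the runs disagree get a lower agreement share, which is one simple way to flag values for human review.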
Customers across logistics, finance, and healthcare are already seeing results. A major trucking company used Retab to find the smallest, fastest model configuration that could meet its 99% accuracy threshold, dramatically lowering operational costs. A financial services firm uses Retab to extract specific quantitative metrics and qualitative risk factors from 200-page quarterly reports – a task that previously took a team of analysts days to complete. Others are automating claims processing, medical records, identity verification, and onboarding with minimal setup.
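The trucking example amounts to a selection rule: among benchmarked configurations, pick the cheapest one that still clears the accuracy threshold. A minimal sketch of that rule follows. The `Benchmark` class, the `cheapest_meeting` function, the model names, and every number are placeholders invented for illustration; they are not measured results and not Retab's routing logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Benchmark:
    model: str           # placeholder configuration name
    accuracy: float      # fraction of fields extracted correctly on a labeled set
    cost_per_doc: float  # cost in arbitrary units


def cheapest_meeting(benchmarks: list, threshold: float) -> Optional[Benchmark]:
    """Return the lowest-cost benchmark with accuracy >= threshold, or None."""
    eligible = [b for b in benchmarks if b.accuracy >= threshold]
    return min(eligible, key=lambda b: b.cost_per_doc, default=None)


# Placeholder numbers, not real measurements.
RESULTS = [
    Benchmark("small", 0.970, 0.001),
    Benchmark("medium", 0.992, 0.004),
    Benchmark("large", 0.995, 0.030),
]

choice = cheapest_meeting(RESULTS, threshold=0.99)
print(choice.model if choice else "no configuration meets the threshold")
```

Swapping the `key` to latency or the filter to a cost ceiling gives the other priorities the release mentions (speed or accuracy) without changing the structure.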
Looking ahead, Retab is expanding its platform to apply the same reliable extraction methods to websites and is launching integrations with automation platforms like n8n, Zapier, and Dify.
Retab is also building toward its long-term vision: to serve as the intelligent middleware layer between the world’s unstructured data and the AI agents that need to understand it. Whether it’s parsing a loan file, a contract, or a customs manifest, Retab makes unstructured data usable, safe, and programmable.
With just ten employees and a fast-growing developer base, Retab is already positioning itself as a foundational layer in the AI infrastructure stack – a tool that doesn’t just show what’s possible, but lets anyone build with it.