December 3, 2024

How to Extract Data from PDF Tables with AI

Kavian Braanaas
Reading time: 3 min.

In this guide, we’ll show you how to use Cradl AI to automate data extraction from PDF tables into JSON and make it accessible for your other apps.

Whether you’re processing transaction lists, inventory reports, invoices, or any other document, this guide will walk you through setting up Cradl AI to extract tabular data with minimal effort.

Screenshot of an invoice document containing a long table of transactional information

Create a Cradl AI model for your documents

Before we begin, make sure you’ve created a Cradl AI account.

Once you're inside the app, the first step is to define the information you want the AI model to extract from your documents, including information stored in tables.

To do so, add the Line Items field to your AI model. This setting allows the AI to recognise and process tabular sections of your documents.

Screenshot of the AI model configuration UI inside Cradl AI


When processing a table, the AI model reads every relevant row and returns its cells as key-value pairs, one pair per column.

For example, in an invoice containing a list of purchased products, one Line Items field captures multiple data points for each row, such as "description", "unit price", "quantity", and "VAT amount".

This means a single Line Items field can extract hundreds of data points across rows, all in one go.
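To make the row-by-row structure concrete, here is a minimal Python sketch of how a Line Items result might look once exported as JSON. The key names (`line_items`, `description`, `unit_price`, `quantity`, `vat_amount`) and the sample values are illustrative assumptions, not Cradl AI's documented output schema.

```python
import json

# Hypothetical export: key names and values are assumptions for illustration,
# not Cradl AI's documented schema.
SAMPLE_EXPORT = """
{
  "line_items": [
    {"description": "Widget", "unit_price": "10.00", "quantity": "3", "vat_amount": "7.50"},
    {"description": "Gadget", "unit_price": "4.50", "quantity": "2", "vat_amount": "2.25"}
  ]
}
"""


def count_data_points(export: dict) -> int:
    """Count every key-value pair across all table rows."""
    return sum(len(row) for row in export["line_items"])


export = json.loads(SAMPLE_EXPORT)
rows = export["line_items"]  # one dict per table row
print(len(rows), "rows,", count_data_points(export), "data points")
```

Each table row becomes one dictionary, so a table with a few hundred rows and four columns would yield well over a thousand data points from a single field.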

Extracting data and reviewing the output

Once your AI model is set up and saved, you can extract data from a document by simply uploading it. When the parsing is complete, you can review the results in the validation interface.

A particularly handy feature is the visual data-location mapping: when you click an extracted data field, the corresponding area in the original document is highlighted.


This, along with the confidence scores assigned to each data field by the AI, makes it easy for you to verify the AI’s output before exporting it as JSON to your other apps.
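If you also want to triage exported results in your own code, a confidence threshold is one simple approach. The sketch below assumes each extracted field carries a `value` and a `confidence` between 0 and 1; that layout, the field names, and the 0.9 threshold are hypothetical choices for illustration, not Cradl AI's documented format.

```python
from typing import Any

# Hypothetical cut-off: fields scoring below this are flagged for manual review.
REVIEW_THRESHOLD = 0.9


def fields_needing_review(
    fields: dict[str, dict[str, Any]],
    threshold: float = REVIEW_THRESHOLD,
) -> list[str]:
    """Return the names of fields whose confidence is below the threshold."""
    return sorted(
        name for name, field in fields.items() if field["confidence"] < threshold
    )


# Assumed shape of an extraction result; the values are made up.
extracted = {
    "total_amount": {"value": "1250.00", "confidence": 0.98},
    "invoice_date": {"value": "2024-12-03", "confidence": 0.72},
    "supplier_name": {"value": "Acme AS", "confidence": 0.95},
}

print(fields_needing_review(extracted))
```

A higher threshold sends more fields to a human reviewer, while a lower one lets more through automatically, so the right value depends on how costly an error is in your workflow.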

Integrating Cradl AI with your other apps

From here, you can connect Cradl AI to your other tools and automate the workflow end to end.

Cradl AI supports input and output integrations with a variety of popular automation tools and apps, such as Power Automate, UiPath, Zapier, Excel, Google Sheets, email, APIs, webhooks, and more.

Screenshot of the import and export configuration user interface inside Cradl AI
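If you receive the JSON yourself (for example through an API or webhook) and want it in a spreadsheet, the rows can be flattened to CSV with Python's standard library. This is a sketch that assumes each table row arrives as a flat dictionary of strings; the function name `line_items_to_csv` and the column names are hypothetical, and the pre-built Excel and Google Sheets integrations may make this step unnecessary.

```python
import csv
import io


def line_items_to_csv(rows: list[dict[str, str]]) -> str:
    """Serialise table rows to CSV text, using the union of all keys as the header."""
    if not rows:
        return ""
    # Collect column names in first-seen order so the header is stable.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


# Assumed row shape; the values are made up for illustration.
sample_rows = [
    {"description": "Widget", "quantity": "3"},
    {"description": "Gadget", "quantity": "2", "vat_amount": "2.25"},
]
csv_text = line_items_to_csv(sample_rows)
print(csv_text)
```

Taking the union of keys means a row that lacks a column still lines up with the others, which matters when some table rows have empty cells.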

Summary

Extracting data from PDF tables can be set up within minutes. With Cradl AI, you create an AI model, add the Line Items field for tabular data, and extract structured JSON from your documents. Once the data is validated, integrate it into your workflows using Cradl AI’s pre-built integrations or APIs.
