Feb 4, 2026
How to Convert PDFs to JSON using AI
Stig Zerener

Converting PDFs to JSON is a common challenge when working with documents like invoices, bills of lading, receipts, contracts, or forms. PDFs aren’t built for structured data exchange, and manual extraction into JSON is slow, brittle, and hard to maintain at scale. AI makes it possible to extract structured data from PDFs and convert documents into clean, machine-readable JSON with far fewer errors.
In this post, we show how to set up an AI model that automatically extracts data from PDFs and outputs validated JSON in minutes, using large language models with human-in-the-loop review.
Before we get started
You'll need:
A Cradl AI account
(Optionally) An HTTP endpoint (webhook)
Running your first PDF extraction
To get started, create a new agent and choose either one of the pre-configured agents or the Custom option. If prompted for integrations, skip that step for now; we'll add one later. When you create the agent, a document will automatically be uploaded and parsed. Click the document in the Runs tab to review the extracted data.

Now we're going to customize the fields that we want our agent to extract. Open the Workflow tab and select the Extract with AI node. From there, you can add, remove, or refine fields based on your needs. Once you're satisfied with your model, go back to Runs and upload a new document to see how your agent performs.
Configuring human-in-the-loop validation
By default, your agent is created with human-in-the-loop validation. This means that if the AI is uncertain whether it has extracted the right information, the document is routed to a manual reviewer. You can define how much uncertainty you tolerate in your agent by clicking the "Review by human" node in the workflow builder.
If you want all values to be sent directly to your webhook without any human review, remove all validators from all your fields.
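To make the idea of an uncertainty threshold concrete, here is a minimal sketch of confidence-based routing in Python. It is an illustration of the general technique only, not Cradl AI's implementation: the field names, the "confidence" key, and the 0.9 threshold are all assumptions made for this example.

```python
from typing import Any


def fields_needing_review(
    fields: dict[str, dict[str, Any]], threshold: float = 0.9
) -> list[str]:
    """Return the names of fields whose confidence falls below the threshold.

    The {"value": ..., "confidence": ...} shape is hypothetical and chosen
    for illustration. A field with no confidence score is treated as 0.0,
    so it is always flagged.
    """
    return [
        name
        for name, field in fields.items()
        if field.get("confidence", 0.0) < threshold
    ]


# Hypothetical extraction result for a single invoice.
extracted = {
    "invoice_number": {"value": "INV-1001", "confidence": 0.98},
    "total_amount": {"value": "1250.00", "confidence": 0.72},
}

flagged = fields_needing_review(extracted, threshold=0.9)

# One uncertain field is enough to send the whole document to a reviewer.
route = "human_review" if flagged else "export"
print(flagged, route)
```

Lowering the threshold sends fewer documents to review at the cost of letting more uncertain values through, which is the trade-off you tune in the "Review by human" node.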
Exporting the data to JSON
With your AI model set up, converting a PDF or image to JSON is as simple as clicking "Run," uploading a document, and waiting for your AI model to process it. Once the document is processed, you can view the extracted data. You can then either download the extracted data manually in JSON format or set up an automatic export through the Webhook integration. For the automatic export you'll need an HTTP endpoint.
You can easily create one in platforms like Zapier, Power Automate, n8n and more by choosing a Webhook trigger or HTTP Request trigger. We also have native Cradl AI integrations for these platforms, so if you're planning on integrating with them I'd recommend using those instead.
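If you would rather host the endpoint yourself, the sketch below shows a minimal webhook receiver using only Python's standard library. It assumes the export arrives as an HTTP POST with a JSON object in the body; the port and the helper names (parse_payload, WebhookHandler, serve) are hypothetical and chosen for this example, and the actual payload layout is not specified in this post.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_payload(body: bytes) -> dict:
    """Decode a request body and make sure it is a JSON object.

    Raises ValueError for invalid UTF-8, invalid JSON, or non-object JSON.
    """
    data = json.loads(body.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly as many bytes as the sender announced.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_payload(self.rfile.read(length))
        except ValueError:
            # Reject anything that is not a well-formed JSON object.
            self.send_response(400)
            self.end_headers()
            return
        # Replace this with your own processing (database insert, queue, ...).
        print("Received keys:", sorted(payload))
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen for webhook calls until the process is interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling serve() starts the listener. For the webhook to reach it, the endpoint has to be reachable from the internet, for example behind a public hostname or a tunnel.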


Automating PDF to JSON conversion with webhooks
In many industries, the volume of incoming documents makes manual PDF to JSON conversion impractical. Fortunately, Cradl AI lets you automate this process end to end. Let's use our webhook integration as an example:
Visit webhook.site, which auto-generates a unique webhook URL that you can use for testing. Copy it.
Back in Cradl AI, select “Webhook” from the export options and paste the webhook URL.
Upload and validate a document. Its JSON output will be sent directly to your test webhook, ready for integration with your other apps!
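Once the JSON shows up at your test webhook, a typical next step is to check it before passing it on to other apps. The following is a rough sketch of that check in Python. The field names invoice_number and total_amount are hypothetical; substitute the fields you configured in the Extract with AI node, and adjust the parsing to the payload you actually see on webhook.site.

```python
import json
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical field names, used only for this example.
REQUIRED_FIELDS = ("invoice_number", "total_amount")


@dataclass
class Invoice:
    invoice_number: str
    total_amount: Decimal


def to_invoice(raw: str) -> Invoice:
    """Parse a JSON string and convert it into a typed Invoice record.

    Raises ValueError if the JSON is not an object or a required field
    is missing or empty.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_FIELDS if data.get(key) in (None, "")]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Invoice(
        invoice_number=str(data["invoice_number"]),
        # Decimal avoids floating-point rounding on monetary amounts.
        total_amount=Decimal(str(data["total_amount"])),
    )


sample = '{"invoice_number": "INV-1001", "total_amount": "1250.00"}'
invoice = to_invoice(sample)
print(invoice)
```

Converting to a typed record early means downstream code fails loudly on an incomplete document instead of silently storing blank values.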

Cradl AI integrates with other popular apps and platforms like Excel, Google Sheets, Zapier, Power Automate, UiPath, any mailbox, and APIs.
Summary
Cradl AI makes it easy to convert and automate PDF to JSON workflows, using AI models to accurately extract information from your PDFs as JSON and send it to any app that supports webhooks. This helps businesses become more data-driven by making key information readily accessible for insights and analytics.