How to extract data from PDFs in Power Automate

Learn how to automate data extraction from PDF documents using Power Automate and the Cradl AI Document OCR Connector, complete with Human-in-the-Loop (HITL) capabilities to ensure human-level accuracy in your document processing workflow. Let’s dive in!

What we’re building

In this guide, we’ll show you how to add an AI-powered document data extraction step within a Power Automate flow. This setup requires no coding skills and can be completed in less than 30 minutes.

By the end, you’ll have a Power Automate flow that:

  1. Sends PDFs to Cradl AI whenever a Power Automate trigger fires
  2. Extracts key information from each document and converts it into structured JSON
  3. Routes uncertain predictions to a dedicated human-in-the-loop interface for validation
  4. Returns the validated JSON response back to Power Automate

Step 1: Set up your Cradl AI model with Human-in-the-Loop

Before jumping into Power Automate, you need a Cradl AI model that understands your document type.

  • Create a new project: Log in to Cradl AI and start a new project. You can use a pre-defined template for popular document types like invoices, order confirmations, or ID cards.
  • Configure the fields: Define which fields to extract — e.g., invoice number, due date, total amount.
  • Enable HITL: In your Cradl AI workflow, Human-in-the-Loop validation is enabled by default. This ensures that when the model is unsure, the document is routed to a human reviewer before the final output is sent. You are added as a Validator by default, but you can also invite your colleagues to review AI predictions.
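
To make the routing idea concrete, here is a minimal Python sketch of confidence-based review routing. It is an illustration only and not Cradl AI's actual implementation: the `Prediction` type, the field names, and the 0.9 threshold are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One extracted field with the model's confidence (hypothetical shape)."""
    field: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def needs_review(predictions: list[Prediction], threshold: float = 0.9) -> list[str]:
    """Return the names of fields whose confidence falls below the threshold."""
    return [p.field for p in predictions if p.confidence < threshold]


def route(predictions: list[Prediction], threshold: float = 0.9) -> str:
    """Send the document to a human if any field is uncertain, else pass it on."""
    return "human_review" if needs_review(predictions, threshold) else "auto_approve"


# Example document: the due date is uncertain, so the document goes to review.
invoice = [
    Prediction("invoice_number", "INV-1042", 0.99),
    Prediction("due_date", "2024-07-01", 0.62),
    Prediction("total_amount", "1250.00", 0.97),
]
print(route(invoice))  # human_review
```

In the real product this decision happens inside Cradl AI, so nothing like this needs to be written for the flow to work.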


Step 2: Connect Cradl AI to Power Automate

Now let’s connect your Cradl AI workflow to Power Automate.

In Cradl AI:

  • Open your workflow in the visual builder.
  • Add an Input Trigger and select Power Automate.
  • This will automatically generate the Client ID and Client Secret. We will use these later.

In Power Automate:

  • Create a new flow and choose your preferred trigger (e.g., when a file is uploaded to OneDrive, received via email, or added to SharePoint).
  • Add a new action: search for Cradl AI in the list of connectors and select Cradl AI → Create Document.
  • Configure the connection:
    • Give it a name
    • Paste your Client ID and Client Secret
  • In the action settings:
    • Select your Cradl AI workflow
    • Pass in the file content from your trigger (e.g., file content from SharePoint or an email attachment).

This will send the document to Cradl AI for parsing and validation.

Step 3: Create a Webhook in Power Automate to receive the response

To receive the structured (and validated) data back in Power Automate, we’ll set up a webhook.

In Power Automate:

  1. Add a new flow with the trigger: “When an HTTP request is received”.
  2. This will generate a webhook URL — copy it.

Back in Cradl AI:

  • In your workflow, add an Output Destination and select Power Automate.
  • Paste the webhook URL from Power Automate into the Webhook URL field.
  • Cradl AI will also generate a JSON schema based on your model’s output. Copy it and paste it into the Request Body JSON Schema field of the “When an HTTP request is received” trigger in Power Automate.

Now Cradl AI knows where to send the results — and Power Automate knows what to expect.
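
For a sense of what such a schema describes, the Python sketch below derives a simple JSON-schema-style description from a sample payload. The payload and its field names are hypothetical, and `infer_schema` is a helper written for this example; the schema to paste into Power Automate is the one Cradl AI generates for your model.

```python
import json


def infer_schema(value):
    """Build a minimal JSON-schema-style description of a sample value."""
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
        }
    if isinstance(value, list):
        # Describe list items using the first element, if there is one.
        schema = {"type": "array"}
        if value:
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if value is None:
        return {"type": "null"}
    return {"type": "string"}


# Hypothetical payload, standing in for what a validated invoice might look like.
sample_payload = {
    "invoice_number": "INV-1042",
    "due_date": "2024-07-01",
    "total_amount": 1250.0,
    "line_items": [{"description": "Consulting", "quantity": 2}],
}

print(json.dumps(infer_schema(sample_payload), indent=2))
```

The schema is what lets later actions in the flow refer to individual fields by name instead of parsing the raw request body.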

Step 4: Choose your output destination in Power Automate

Once the validated data is received, you can send it wherever it needs to go. In the webhook flow from Step 3, add an action for a destination such as:

  • 📊 Excel / Google Sheets
  • 🗃️ Dataverse / SQL
  • 🧾 SharePoint List
  • 📧 Email
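
Most of these destinations expect flat rows, so it helps to decide up front which fields become which columns. The following Python sketch shows that mapping under stated assumptions: the column names and sample payloads are hypothetical, and in Power Automate the same mapping is done in the connector's action settings.

```python
import csv
import io

# Hypothetical column order for the destination table.
COLUMNS = ["invoice_number", "due_date", "total_amount"]


def to_row(payload: dict) -> list[str]:
    """Pick the chosen columns in a fixed order; missing fields become empty cells."""
    return ["" if payload.get(name) is None else str(payload[name]) for name in COLUMNS]


def to_csv(payloads: list[dict]) -> str:
    """Render a header line plus one row per validated document."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for payload in payloads:
        writer.writerow(to_row(payload))
    return buffer.getvalue()


# Two hypothetical validated documents; the second one has no due date.
validated = [
    {"invoice_number": "INV-1042", "due_date": "2024-07-01", "total_amount": 1250.0},
    {"invoice_number": "INV-1043", "total_amount": 89.5},
]
print(to_csv(validated))
```

Fixing the column order in one place keeps rows aligned even when a document lacks an optional field.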

Wrapping up

With this setup, your Power Automate flow has an end-to-end document automation step: PDFs go in, uncertain predictions are checked by a human reviewer, and validated JSON comes back ready for your downstream systems.

