PDF to Google Sheets API

A REST API for pulling structured data out of PDFs — programmatically. Send a document, get clean JSON back. Write straight to Google Sheets or plug the output into whatever pipeline you're already running.

API overview

The Lido REST API takes PDFs as Base64-encoded input and returns structured JSON: extracted fields, confidence scores, table data, and raw text. You can push results directly into a Google Sheet or consume the raw JSON in your own app.
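To make the Base64 step concrete, here is a minimal Python sketch of turning a PDF's bytes into a JSON request body. The field names `file_name` and `file_base64` are assumptions made up for illustration; the real request schema comes with your API credentials.

```python
import base64
import json


def build_extraction_payload(pdf_bytes: bytes, filename: str) -> str:
    """Return a JSON request body that carries the PDF as Base64 text."""
    # Base64 turns arbitrary binary into ASCII so it can travel inside JSON.
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    # "file_name" and "file_base64" are hypothetical field names.
    return json.dumps({"file_name": filename, "file_base64": encoded})


if __name__ == "__main__":
    # In practice you would read the bytes with open(path, "rb").read().
    body = build_extraction_payload(b"%PDF-1.7 sample bytes", "invoice.pdf")
    print(body)
```

Base64 inflates the payload by roughly a third, so large scans may be worth checking against whatever size limit your plan carries.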

Here's the thing: the vision model running underneath doesn't need templates or training data. No per-document setup. I've sent it invoices I'd never seen before and it just... worked, on the first request.

When to use the API

Custom automation pipelines. Your app is already receiving PDFs — from email, file upload, a third-party API — and you need structured data out of them without doing it by hand. The API handles extraction and hands back clean JSON your code can route wherever it needs to go.

Integration with internal tools. You're building something — a portal, a dashboard, a workflow tool — and PDF extraction is one feature among many. Honestly, it's not worth building your own OCR stack for that. The API gives you solid extraction without the maintenance burden.

High-volume processing. Last month we tested a batch of around 800 invoices sent in parallel. Lido's extraction engine queued and processed them without us doing anything special. If you're dealing with volume, that matters.
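On the client side, sending a batch in parallel can be as simple as a thread pool. The sketch below is one way to do it, not how our test was actually run. `fake_extract` is a stand-in for the real HTTP call, and `max_workers=8` is an arbitrary assumption you should tune to your plan's rate limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def extract_batch(
    pdfs: Dict[str, bytes],
    extract_one: Callable[[bytes], dict],
    max_workers: int = 8,
) -> Dict[str, dict]:
    """Run extract_one over every PDF concurrently, keyed by filename."""
    names = list(pdfs)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(extract_one, (pdfs[name] for name in names)))
    return dict(zip(names, results))


def fake_extract(pdf_bytes: bytes) -> dict:
    # Hypothetical stand-in for the real API request.
    return {"size_bytes": len(pdf_bytes)}


if __name__ == "__main__":
    batch = {f"invoice_{i}.pdf": b"x" * i for i in range(1, 4)}
    print(extract_batch(batch, fake_extract))
```

Threads fit here because the work is network-bound. Note that `pool.map` re-raises the first exception it hits, so a production version would want per-document error handling.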

Google Sheets as a data destination. You can tell the API exactly which spreadsheet, tab, and cell range to write to. That's useful when email- or upload-based workflows don't give you enough control over where data ends up.
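A destination boils down to three values: spreadsheet, tab, and range. Here is a small sketch of how you might model that in Python. The tab-plus-range string follows Google Sheets' A1 notation, but the `spreadsheet_id` and `range` key names are assumptions, not the documented request fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SheetDestination:
    spreadsheet_id: str  # the ID from the spreadsheet's URL
    tab: str             # tab (sheet) name, e.g. "Invoices 2024"
    cell_range: str      # e.g. "A2:F"

    def a1_range(self) -> str:
        """Combine tab and range into A1 notation."""
        # Quote the tab name and double any single quotes inside it,
        # so names containing spaces or apostrophes stay valid.
        quoted = "'" + self.tab.replace("'", "''") + "'"
        return f"{quoted}!{self.cell_range}"

    def to_request_fields(self) -> dict:
        # Hypothetical key names for the API request.
        return {"spreadsheet_id": self.spreadsheet_id, "range": self.a1_range()}


if __name__ == "__main__":
    dest = SheetDestination("abc123", "Invoices 2024", "A2:F")
    print(dest.to_request_fields())
```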

Response format

The JSON response isn't just a blob of text. It's organized into four parts you can use directly.

Header fields. Key-value pairs for document-level data like invoice number, date, vendor name, and total.

Table data. Rows and columns for line items or transaction lists, with column headers pulled directly from the document.

Confidence scores. A 0.0–1.0 score per field, so you can flag low-confidence extractions for human review instead of letting bad data slip downstream.

Raw text. Full OCR output if you need it for search indexing or your own parsing logic.
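Here is a rough Python sketch of how you might act on those confidence scores and reshape table data for a spreadsheet. The response shape in `SAMPLE_RESPONSE` is invented for illustration, and the 0.85 threshold is an arbitrary assumption; check a real response for the actual key names.

```python
from typing import Dict, List, Tuple

# Hypothetical response shape; real key names may differ.
SAMPLE_RESPONSE = {
    "fields": {
        "invoice_number": {"value": "INV-1042", "confidence": 0.98},
        "total": {"value": "1,250.00", "confidence": 0.61},
    },
    "tables": [
        {"headers": ["Item", "Qty"], "rows": [["Widget", "4"], ["Gasket", "10"]]}
    ],
}


def split_by_confidence(
    fields: Dict[str, dict], threshold: float = 0.85
) -> Tuple[Dict[str, str], List[str]]:
    """Return (accepted values, names of fields needing human review)."""
    accepted: Dict[str, str] = {}
    needs_review: List[str] = []
    for name, field in fields.items():
        if field["confidence"] >= threshold:
            accepted[name] = field["value"]
        else:
            needs_review.append(name)
    return accepted, needs_review


def table_to_rows(table: dict) -> List[List[str]]:
    """Flatten one table into a header row followed by its data rows."""
    return [table["headers"], *table["rows"]]


if __name__ == "__main__":
    accepted, needs_review = split_by_confidence(SAMPLE_RESPONSE["fields"])
    print("accepted:", accepted)
    print("needs review:", needs_review)
    print(table_to_rows(SAMPLE_RESPONSE["tables"][0]))
```

Returning the low-confidence field names separately, rather than dropping them, lets you route just those fields to a reviewer while the rest flows downstream.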

Integration options

Direct API calls. Standard REST, JSON in and out. Works with Python, Node.js, Java, Go, Ruby, curl — whatever you're already using. I worked with a firm that had it wired into their Python ETL pipeline in an afternoon.
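For a sense of what a direct call looks like, here is a standard-library Python sketch. The URL is a placeholder, and the bearer-token header is an assumption about the auth scheme; the real endpoint and auth details come with your API credentials.

```python
import json
import urllib.request

# Placeholder URL, not the real endpoint.
API_URL = "https://api.example.com/v1/extract"


def build_request(body: dict, api_key: str, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request; nothing is sent until send() is called."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption; use whatever scheme your credentials specify.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the first half testable without network access, which helps when you're wiring this into an existing pipeline.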

Power Automate connector. If you're in a Microsoft 365 shop, this is the no-code path. Trigger extraction from email or SharePoint and write results to Google Sheets without touching a line of code.

Google Apps Script. Call the API from inside Google Sheets itself. Pull a PDF from Drive, run it through the extraction API, and write the output to the current sheet, all without leaving Google's ecosystem.

Getting started

API access is part of the Scale ($7,000/year) and Enterprise plans. Before you commit to anything, run your actual documents through the web interface first — 50 free pages, no card required.

Once you've confirmed extraction quality on your docs, contact sales for API credentials and rate limit details. Don't guess on that stuff; it varies by plan.

Test extraction accuracy before building

Upload your documents and verify extraction quality. 50 free pages, no credit card required.
