Pdf To Data Table

Extracts tables from a PDF file and converts them to data table format.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Waits in seconds before executing the node.
Delay After (sec) - Waits in seconds after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

Info: If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

PDF File Path - The path to the PDF file that contains the tables to extract.

Pages - Specifies which pages to extract tables from. Options include:
- "all" - Extract tables from all pages
- "1-2,3" - Extract tables from pages 1, 2, and 3
- "[1,2]" - Extract tables from pages 1 and 2
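
The three Pages formats above can be read as follows. This is only a minimal sketch of one way to interpret such values, not the node's actual implementation; the function name `parse_pages` is a hypothetical helper introduced for illustration.

```python
import json
import re


def parse_pages(spec):
    """Expand a Pages value into "all" or a sorted list of page numbers.

    Hypothetical helper for illustration; not the node's actual code.
    """
    text = spec.strip()
    if text.lower() == "all":
        return "all"
    if text.startswith("["):
        # List form such as "[1,2]"
        return sorted({int(p) for p in json.loads(text)})
    pages = set()
    for part in text.split(","):
        part = part.strip()
        match = re.fullmatch(r"(\d+)-(\d+)", part)
        if match:
            # Range form such as "1-2" (inclusive on both ends)
            start, end = int(match.group(1)), int(match.group(2))
            pages.update(range(start, end + 1))
        else:
            # Single page such as "3"
            pages.add(int(part))
    return sorted(pages)


print(parse_pages("1-2,3"))  # [1, 2, 3]
print(parse_pages("[1,2]"))  # [1, 2]
```

Under this reading, "1-2,3" and "[1,2,3]" select the same pages, so either form can be used.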

The Pdf To Data Table node extracts tables from a PDF file and converts them to data table format. When executed, the node:

The node will return specific errors in the following cases:

The output is a list of tables, because a single PDF file may contain multiple tables.
The Pages option lets you specify which pages to extract tables from.
NaN values in the extracted tables are automatically converted to empty strings.
The node uses the tabula library for PDF table extraction.
The first row of each extracted table is used as the column headers.
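
The header and NaN handling described above can be illustrated with a small standard-library sketch. It assumes a table arrives as a list of rows and that a data table is a dictionary with `columns` and `rows` keys; both the shape and the name `rows_to_table` are assumptions made for illustration, not the node's confirmed output format.

```python
import math


def rows_to_table(rows):
    """Build a table dict from raw rows, using the first row as headers.

    Hypothetical helper; the {"columns", "rows"} shape is an assumption.
    """
    if not rows:
        return {"columns": [], "rows": []}

    def clean(value):
        # Replace NaN cells with an empty string, leave other values as-is
        if isinstance(value, float) and math.isnan(value):
            return ""
        return value

    # The first row supplies the column headers
    columns = [str(clean(cell)) for cell in rows[0]]
    # Every remaining row becomes one record keyed by column name
    records = [
        {col: clean(cell) for col, cell in zip(columns, row)}
        for row in rows[1:]
    ]
    return {"columns": columns, "rows": records}


raw = [["Item", "Qty"], ["Bolt", 4.0], ["Nut", float("nan")]]
table = rows_to_table(raw)
print(table["columns"])
print(table["rows"])
```

Because a PDF may contain several tables, a caller would apply a conversion like this to each table in the returned list.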