Notebook Read

Reads a Jupyter notebook file and returns all cells with their outputs.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits in seconds before executing the node.
  • Delay After (sec) - Waits in seconds after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.

Inputs

  • Notebook Path - string - Absolute path to the Jupyter notebook file (.ipynb) to read (required).
  • Cell ID - string - ID of a specific cell to read; if not provided, all cells will be read (optional).

Outputs

  • Cells - array - Array of notebook cells with their content and outputs. Each cell includes:
    • id - Cell ID
    • cellType - Type of cell: "code" or "markdown"
    • source - Cell source code or markdown content
    • metadata - Cell metadata
    • outputs - Cell outputs (code cells only)
    • executionCount - Execution count (code cells only)
  • Cell Count - int - Total number of cells read.
  • Metadata - object - Notebook metadata including kernel info, language, etc.

How It Works

The Notebook Read node reads Jupyter notebook files (.ipynb format). When executed, the node:

  1. Validates that the notebook path is absolute and ends with .ipynb
  2. Checks if the file exists
  3. Reads and parses the JSON notebook format
  4. If Cell ID is provided:
    • Finds the specific cell by ID
    • Returns only that cell
  5. If Cell ID is not provided:
    • Returns all cells in the notebook
  6. Converts source to string format (handles array and string formats)
  7. Includes outputs and execution counts for code cells
  8. Returns notebook metadata (kernel info, language version, etc.)
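
The parsing steps above (3 to 8) can be sketched in Python as follows. This is a minimal illustration, not the node's actual implementation: the function names parse_notebook and read_notebook and the output keys cellCount and metadata are assumptions chosen to mirror the Outputs section, and path validation is left out.

```python
import json


def parse_notebook(text, cell_id=None):
    """Parse .ipynb JSON text into the output shape described on this page.

    Hypothetical helper for illustration; not the node's actual code.
    """
    # Step 3: parse the JSON notebook format.
    try:
        notebook = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid notebook format: {exc}") from exc
    if not isinstance(notebook, dict):
        raise ValueError("Invalid notebook format: top-level value is not an object")

    cells = []
    for raw in notebook.get("cells", []):
        # Step 6: source may be a string or an array of lines.
        source = raw.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        cell = {
            "id": raw.get("id"),
            "cellType": raw.get("cell_type"),
            "source": source,
            "metadata": raw.get("metadata", {}),
        }
        # Step 7: only code cells carry outputs and an execution count.
        if raw.get("cell_type") == "code":
            cell["outputs"] = raw.get("outputs", [])
            cell["executionCount"] = raw.get("execution_count")
        cells.append(cell)

    # Steps 4 and 5: return one cell when an ID is given, otherwise all cells.
    if cell_id is not None:
        cells = [c for c in cells if c["id"] == cell_id]
        if not cells:
            raise LookupError(f"Cell with ID '{cell_id}' not found")

    # Step 8: include the notebook-level metadata.
    return {
        "cells": cells,
        "cellCount": len(cells),
        "metadata": notebook.get("metadata", {}),
    }


def read_notebook(path, cell_id=None):
    """Read a notebook file from disk and parse it (path checks omitted)."""
    with open(path, encoding="utf-8") as handle:
        return parse_notebook(handle.read(), cell_id)
```

Keeping the parsing separate from the file access makes the logic easy to exercise with an in-memory JSON string.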

Requirements

  • Valid absolute path to .ipynb file
  • File must be a valid Jupyter notebook (JSON format)
  • Read permissions on the file
  • Valid cell ID (if reading specific cell)

Error Handling

The node will return specific errors in the following cases:

  • Missing notebook path - "Notebook path is required"
  • Relative path - "Notebook path must be absolute"
  • Wrong file extension - "File must be a Jupyter notebook (.ipynb)"
  • File not found - "Notebook not found: {{path}}"
  • Read error - "Failed to read notebook: {{error}}"
  • Invalid format - "Invalid notebook format: {{error}}"
  • Cell not found - "Cell with ID '{{id}}' not found"
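
To pre-check a path before running the node, the path-related cases above can be approximated as in the sketch below. The function name validate_notebook_path is hypothetical, and the order of the checks is an assumption; only the message texts come from the list above.

```python
import os


def validate_notebook_path(notebook_path):
    """Return the matching documented error message, or None if all checks pass.

    Hypothetical helper; the check order is assumed, not confirmed by the node.
    """
    if not notebook_path:
        return "Notebook path is required"
    if not os.path.isabs(notebook_path):
        return "Notebook path must be absolute"
    if not notebook_path.endswith(".ipynb"):
        return "File must be a Jupyter notebook (.ipynb)"
    if not os.path.isfile(notebook_path):
        return f"Notebook not found: {notebook_path}"
    return None
```

Returning the message instead of raising lets a workflow branch on the result without relying on Continue On Error.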

Usage Examples

Read Entire Notebook

Notebook Path: /home/user/analysis/data_analysis.ipynb

Read Specific Cell

Notebook Path: /home/user/project/notebook.ipynb
Cell ID: a1b2c3d4

Extract Code Cells

1. Notebook Read: read all cells
2. Filter: cells where cellType = "code"
3. Process code cells

Check Cell Outputs

1. Notebook Read: read all cells
2. For each cell:
   - If cellType = "code":
     - Check outputs for errors

Notebook Structure

Jupyter notebooks (.ipynb) are JSON files containing:

Notebook Level

  • cells - Array of cell objects
  • metadata - Notebook-level metadata
  • nbformat - Notebook format version
  • nbformat_minor - Minor version number

Cell Structure

  • cell_type - "code", "markdown", or "raw"
  • id - Unique cell identifier (present in nbformat 4.5 and later)
  • metadata - Cell-specific metadata
  • source - Cell content (code or markdown)
  • outputs - Execution outputs (code cells only)
  • execution_count - Execution number (code cells only)

Source Formats

Notebook source can be in two formats:

String Format

"source": "print('Hello World')"

Array Format (Common)

"source": [
  "import pandas as pd\n",
  "df = pd.read_csv('data.csv')\n",
  "df.head()"
]

The node handles both formats and returns source as a single string.
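
When working with raw notebook JSON outside this node, the same normalization can be sketched as below. The function name normalize_source is hypothetical. Lines in the array format already carry their trailing newline characters, so they are joined with an empty string.

```python
def normalize_source(source):
    """Return cell source as a single string, whichever format it arrived in."""
    if isinstance(source, list):
        # Array format: each element is one line, newline included.
        return "".join(source)
    # String format: already a single string.
    return source


lines = [
    "import pandas as pd\n",
    "df = pd.read_csv('data.csv')\n",
    "df.head()",
]
joined = normalize_source(lines)
```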

Output Types

Code cells can have various output types:

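Stream Output

Text written to stdout or stderr, such as output from print statements (output_type "stream").

Display Data

Rich output such as plots, images, and HTML (output_type "display_data").

Execute Result

The value of the last expression in a cell (output_type "execute_result").

Error Output

Error messages and tracebacks (output_type "error").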
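
A short sketch of handling each output type follows. It assumes the node passes output objects through in the standard nbformat shape (output_type, plus name and text for streams, data for rich output, and ename and evalue for errors). The function name summarize_output is hypothetical.

```python
def summarize_output(output):
    """Return a one-line description of a single cell output object."""
    kind = output.get("output_type")
    if kind == "stream":
        # Stream text may be a string or an array of lines.
        text = output.get("text", "")
        if isinstance(text, list):
            text = "".join(text)
        return f"stream[{output.get('name', 'stdout')}]: {text}"
    if kind in ("display_data", "execute_result"):
        # Rich outputs hold a bundle keyed by MIME type.
        mime_types = sorted(output.get("data", {}))
        return f"{kind}: {', '.join(mime_types)}"
    if kind == "error":
        return f"error: {output.get('ename')}: {output.get('evalue')}"
    return f"unknown output type: {kind}"
```

Listing the MIME types rather than the payload keeps summaries small when an output embeds a base64-encoded image.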

Usage Notes

  • All paths must be absolute
  • Only .ipynb files are supported
  • Cell IDs are unique within a notebook
  • Source is always returned as string (arrays are joined)
  • Markdown cells don't have outputs or execution counts
  • Metadata structure varies by notebook version and kernel
  • Empty notebooks return empty cells array
  • Cell order is preserved from the notebook

Common Use Cases

Code Extraction

Extract code from notebooks for testing or conversion to scripts.

Output Validation

Check notebook outputs for errors or expected results.

Documentation Generation

Extract markdown cells to generate documentation.

Notebook Analysis

Analyze notebook structure, cell types, and execution order.
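
As an illustration of this kind of analysis, the sketch below counts cells by type and checks whether the code cells were executed in top-to-bottom order. It assumes the Cells array uses the cellType and executionCount fields listed under Outputs. The function name analyze_cells and the result keys are hypothetical.

```python
from collections import Counter


def analyze_cells(cells):
    """Count cells per type and check that execution counts strictly increase."""
    type_counts = Counter(cell.get("cellType") for cell in cells)
    # Ignore code cells that were never run (execution count is missing or None).
    counts = [
        cell["executionCount"]
        for cell in cells
        if cell.get("cellType") == "code" and cell.get("executionCount") is not None
    ]
    in_order = all(earlier < later for earlier, later in zip(counts, counts[1:]))
    return {"typeCounts": dict(type_counts), "executedInOrder": in_order}
```

A notebook whose cells were run out of order may not reproduce its saved outputs when executed from top to bottom.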

Quality Assurance

Verify notebooks before committing to version control.

Automated Testing

Read test notebooks and verify outputs match expected values.

Best Practices

  • Validate paths before reading notebooks
  • Check cellType before accessing type-specific fields
  • Handle both source formats (string and array) when parsing raw .ipynb JSON yourself; this node always returns source as a string
  • Parse outputs according to output type
  • Use Cell ID for targeted cell access
  • Check nbformat version for compatibility
  • Handle metadata carefully as structure varies
  • Combine with Notebook Edit for notebook automation

Example Workflows

Extract All Code

1. Notebook Read: read entire notebook
2. Filter cells: cellType = "code"
3. Extract source from each cell
4. Concatenate into single script
5. Write to .py file
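
Steps 2 to 5 of this workflow might look like the following when applied to the Cells output. This is a sketch under the assumption that each cell has the cellType and source fields described under Outputs. The function names cells_to_script and write_script are hypothetical.

```python
def cells_to_script(cells):
    """Join the source of all code cells into one script, separated by blank lines."""
    code = [
        cell["source"].rstrip("\n")
        for cell in cells
        if cell.get("cellType") == "code"
    ]
    if not code:
        return ""
    return "\n\n".join(code) + "\n"


def write_script(cells, script_path):
    """Write the concatenated code cells to a .py file."""
    with open(script_path, "w", encoding="utf-8") as handle:
        handle.write(cells_to_script(cells))
```

Cells that contain IPython magics or shell escapes (lines starting with % or !) are not valid Python, so a script produced this way may need cleanup before it runs.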

Find Errors

1. Notebook Read: read all cells
2. For each code cell:
   - Check outputs array
   - Look for output_type = "error"
   - Collect error messages
3. Report all errors found
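
A minimal sketch of this workflow is shown below. It assumes error outputs keep the standard nbformat fields ename and evalue. The function name find_errors and the keys of the returned records are hypothetical.

```python
def find_errors(cells):
    """Collect one record per error output found in the code cells."""
    errors = []
    for index, cell in enumerate(cells):
        if cell.get("cellType") != "code":
            continue
        # Treat a missing or null outputs field as an empty list.
        for output in cell.get("outputs") or []:
            if output.get("output_type") == "error":
                errors.append({
                    "cellIndex": index,
                    "cellId": cell.get("id"),
                    "ename": output.get("ename"),
                    "evalue": output.get("evalue"),
                })
    return errors
```

Recording the cell index alongside the ID keeps the report usable for notebooks saved without cell IDs.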

Generate Documentation

1. Notebook Read: read all cells
2. Filter cells: cellType = "markdown"
3. Extract source from each cell
4. Concatenate markdown content
5. Write to .md file

Comparison with Other Nodes

  • Notebook Read vs Read: Notebook Read parses .ipynb format; Read reads plain text files
  • Notebook Read vs Notebook Edit: Notebook Read extracts content; Notebook Edit modifies notebooks
  • Notebook Read vs Grep: Notebook Read parses structured notebooks; Grep searches plain text