Version: 2.3

# Databricks Storage Nodes

Databricks Storage nodes enable pipelines to read and write files on Unity Catalog Volumes via the Databricks Files API. Use these nodes to automate file-based data ingestion, export, and exchange workflows through your Databricks workspace.

## Configuration Quick Reference

| Field | What you choose | Details |
| --- | --- | --- |
| Parameters | Connection, Function, Function Parameters, Timeout Override | Select the connection profile and function, configure function parameters (with expression support), and optionally override the timeout. |
| Settings | Description, Timeout (seconds), Retry on Timeout, Retry on Fail, On Error | Node description, maximum execution time, retry behavior on timeout or failure, and error-handling strategy. All execution settings default to pipeline-level values. |

## Databricks Storage Read Node

*Screenshot: Databricks Storage Read node configuration*

Read files from Unity Catalog Volumes with automatic content encoding detection.

Supported Function Types:

| Function Name | Purpose | Common Use Cases |
| --- | --- | --- |
| Read File | Download file content from a Unity Catalog Volume | Data imports, configuration retrieval, report processing, log analysis |

### Node Configuration

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Connection | Selection | Yes | Databricks Storage connection profile to use |
| Function | Selection | Yes | Read function from the selected connection |
| Function Parameters | Dynamic | Varies | Auto-populated from the function schema (e.g., volumePath). See your Databricks Storage connection functions for full parameter details. |
| Timeout Override | Number (seconds) | No | Override the default function timeout |

All function parameters support expression syntax ({{ expression }}) for dynamic values from the pipeline context.
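For example, a volumePath parameter could be templated from the upstream node's output (the catalog, schema, volume, and field names below are illustrative, not fixed values):

```
/Volumes/my_catalog/my_schema/my_volume/incoming/{{ $input.data.fileName }}
```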

### Input

The node receives the output of the previous node as input. Use expressions like {{ $input.data.fileName }} to dynamically specify which file to read.

### Output Structure

On success the node produces:

```json
{
  "success": true,
  "functionId": "<function-id>",
  "data": {
    "data": "header1,header2\nvalue1,value2\n...",
    "metadata": {
      "volumePath": "/Volumes/my_catalog/my_schema/my_volume/data/report.csv",
      "fileName": "report.csv",
      "sizeBytes": 4096,
      "encoding": "text"
    }
  },
  "durationMs": 312,
  "timestamp": "2026-04-09T10:30:00Z"
}
```

| Field | Type | Description |
| --- | --- | --- |
| data | string | File content (text for text-based files, base64-encoded for binary files) |
| metadata | object | File metadata (see below) |

Metadata Fields:

| Field | Type | Description |
| --- | --- | --- |
| volumePath | string | Full volume path of the file |
| fileName | string | File name extracted from the path |
| sizeBytes | number | File size in bytes |
| encoding | string | Content encoding used (text or base64) |
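Because `data` arrives as plain text or as a base64 string depending on the detected encoding, downstream code should branch on the `encoding` metadata field. A minimal sketch, assuming the payload shape shown above (the sample dict is hand-built for illustration, not a real API response):

```python
import base64


def decode_read_output(output: dict) -> bytes:
    """Return the file content as raw bytes, branching on the
    reported encoding ("text" or "base64")."""
    payload = output["data"]
    content = payload["data"]
    if payload["metadata"]["encoding"] == "base64":
        # Binary files: content is a base64 string.
        return base64.b64decode(content)
    # Text files: content is already plain text.
    return content.encode("utf-8")


# Sample payload shaped like the Read node's success output above.
sample = {
    "success": True,
    "data": {
        "data": "header1,header2\nvalue1,value2\n",
        "metadata": {"fileName": "report.csv", "encoding": "text"},
    },
}
print(decode_read_output(sample))  # b'header1,header2\nvalue1,value2\n'
```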

## Databricks Storage Write Node

*Screenshot: Databricks Storage Write node configuration*

Write files to Unity Catalog Volumes with overwrite control.

Supported Function Types:

| Function Name | Purpose | Common Use Cases |
| --- | --- | --- |
| Write File | Upload data to a Unity Catalog Volume | Data exports, pipeline result storage, report generation, artifact archival |

### Node Configuration

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Connection | Selection | Yes | Databricks Storage connection profile to use |
| Function | Selection | Yes | Write function from the selected connection |
| Function Parameters | Dynamic | Varies | Auto-populated from the function schema (e.g., volumePath, data, overwrite). See your Databricks Storage connection functions for full parameter details. |
| Timeout Override | Number (seconds) | No | Override the default function timeout |

### Input

The node receives the output of the previous node as input. Use expressions like {{ $input.data }} to dynamically pass content to write.
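For example, the Write File parameters could combine a templated path with upstream content (the values below are illustrative; the exact parameter set comes from the connection's function schema):

```
volumePath: /Volumes/my_catalog/my_schema/my_volume/output/{{ $input.data.fileName }}
data:       {{ $input.data }}
overwrite:  true
```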

### Output Structure

On success the node produces:

```json
{
  "success": true,
  "functionId": "<function-id>",
  "data": {
    "volumePath": "/Volumes/my_catalog/my_schema/my_volume/output/result.json",
    "bytesWritten": 2048,
    "overwritten": false
  },
  "durationMs": 245,
  "timestamp": "2026-04-09T10:30:00Z"
}
```

| Field | Type | Description |
| --- | --- | --- |
| volumePath | string | Full path of the written file |
| bytesWritten | number | Number of bytes written |
| overwritten | boolean | Whether an existing file was overwritten |
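A downstream step can branch on these fields, for instance to flag unexpected overwrites. A minimal sketch, assuming the success payload shape shown above (the sample dict is hand-built for illustration):

```python
def summarize_write(output: dict) -> str:
    """Build a one-line summary of a Write node result."""
    if not output.get("success"):
        return "write failed"
    result = output["data"]
    note = " (replaced existing file)" if result["overwritten"] else ""
    return f"wrote {result['bytesWritten']} bytes to {result['volumePath']}{note}"


# Sample payload shaped like the Write node's success output above.
sample = {
    "success": True,
    "data": {
        "volumePath": "/Volumes/my_catalog/my_schema/my_volume/output/result.json",
        "bytesWritten": 2048,
        "overwritten": False,
    },
}
print(summarize_write(sample))
```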

## Settings Tab

Both Databricks Storage node types share the same Settings tab:

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| Description | Text | — | Optional description displayed on the node |
| Timeout (seconds) | Number | Pipeline default | Maximum time the node may run before timing out |
| Retry on Timeout | Toggle | Pipeline default | Automatically retry the node if it times out |
| Retry on Fail | Toggle | Pipeline default | Automatically retry the node if it fails |
| On Error | Selection | Pipeline default | Error strategy: stop the pipeline, continue to the next node, or follow the error output path |

When left at their defaults, these settings inherit from the pipeline-level execution configuration.