Version: 2.0-dev

AWS S3 Integration Guide

Use MaestroHub's S3 connector to work with AWS S3 or S3‑compatible object storage (MinIO, Wasabi, DigitalOcean Spaces). This guide covers connection setup, function authoring, and pipeline integration.

Overview

The S3 connector provides:

  • AWS S3 and S3‑compatible support with region/endpoint controls and path‑style addressing
  • Reusable functions to fetch and write objects with templates and overrides
  • Secure credential handling with masked edits and server‑side encryption (SSE‑S3 and SSE‑KMS)
  • Operational limits for discovery and maximum file size to protect pipelines
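To make the path‑style option concrete, the sketch below shows how the two S3 addressing styles differ in the URL a client ends up calling. It is an illustration only: `object_url` is a hypothetical helper, not part of MaestroHub, and the bucket and endpoint values are placeholders.

```python
from urllib.parse import quote, urlsplit


def object_url(bucket: str, key: str, endpoint: str, force_path_style: bool) -> str:
    """Build the URL a client would address for bucket/key (illustrative helper)."""
    parts = urlsplit(endpoint)
    encoded_key = quote(key, safe="/")  # keep "/" so the key hierarchy stays readable
    if force_path_style:
        # Path style: the bucket is the first path segment (common for MinIO).
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{encoded_key}"
    # Virtual-hosted style: the bucket becomes a subdomain of the endpoint host.
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{encoded_key}"


print(object_url("reports", "data/report.csv", "http://localhost:9000", True))
print(object_url("reports", "data/report.csv", "https://s3.us-east-1.amazonaws.com", False))
```

Virtual‑hosted addressing depends on DNS resolving a per‑bucket hostname, which is why a local endpoint such as `http://localhost:9000` typically needs path style.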

Connection Configuration

Creating an AWS S3 Connection

From Connections → New Connection → AWS S3, configure the fields below.

S3 Connection Creation Fields

1. Profile Information

| Field | Default | Description |
| --- | --- | --- |
| Profile Name | - | A descriptive name for this connection profile (required, max 100 characters) |
| Description | - | Optional description for this S3 connection |

2. S3 Configuration

| Field | Default | Description |
| --- | --- | --- |
| Region | - | AWS region (e.g., us-east-1) or custom region string for S3‑compatible services – required |
| Bucket | - | Target bucket name (3–63 chars, lowercase, DNS‑compliant; cannot be an IP address) – required |
| Prefix | - | Optional base path prefix used as a security boundary for all operations |

3. AWS Credentials

| Field | Default | Description |
| --- | --- | --- |
| Access Key ID | - | Required. Masked on edit; leave empty to keep stored value |
| Secret Access Key | - | Required. Masked on edit; leave empty to keep stored value |
| Session Token | - | Optional STS token. Masked on edit; leave empty to keep stored value |

4. Advanced

| Field | Default | Description |
| --- | --- | --- |
| Discovery Limit | 2000 | Maximum objects to list during discovery (1–100000) – required |
| Max File Size (MB) | 25 | Maximum allowed object size for functions (1–25) – required |
| Connection Timeout (seconds) | 30 | Timeout for S3 API calls (5–300) – required |

5. Server‑Side Encryption

| Field | Default | Description |
| --- | --- | --- |
| Enable Server‑Side Encryption | true | Enable SSE for all writes by default |
| Encryption Type | AES256 | AES256 (SSE‑S3), aws:kms (SSE‑KMS), or none – required when SSE is enabled |
| KMS Key ID | - | Required when Encryption Type = aws:kms. Masked on edit; leave empty to keep stored value |

6. S3‑Compatible Services

| Field | Default | Description |
| --- | --- | --- |
| Quick Preset | Custom | MinIO, Wasabi, DigitalOcean Spaces, or Custom presets to apply recommended flags |
| Custom Endpoint | - | Optional endpoint URL for S3‑compatible services (e.g., http://localhost:9000). Leave empty for AWS S3 |
| Force Path Style | false | Use path‑style addressing (often required for MinIO) |
| Disable SSL | false | Disable HTTPS (local/dev only) |

7. Connection Labels

| Field | Default | Description |
| --- | --- | --- |
| Labels | - | Key‑value pairs to categorize and organize this S3 connection (max 10 labels) |
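
The ranges and dependencies listed in the field tables can be checked before a profile is saved. The following is a minimal sketch of such a check, not MaestroHub's implementation: the `S3ConnectionSettings` class, its field names, and `validate_settings` are hypothetical, and only the defaults and ranges are taken from the tables.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class S3ConnectionSettings:
    # Hypothetical container; field names are illustrative, not MaestroHub's schema.
    region: str
    bucket: str
    discovery_limit: int = 2000
    max_file_size_mb: int = 25
    connection_timeout_s: int = 30
    sse_enabled: bool = True
    encryption_type: str = "AES256"
    kms_key_id: Optional[str] = None


def validate_settings(s: S3ConnectionSettings) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not s.region:
        errors.append("Region is required")
    if not 1 <= s.discovery_limit <= 100_000:
        errors.append("Discovery Limit must be between 1 and 100000")
    if not 1 <= s.max_file_size_mb <= 25:
        errors.append("Max File Size (MB) must be between 1 and 25")
    if not 5 <= s.connection_timeout_s <= 300:
        errors.append("Connection Timeout must be between 5 and 300 seconds")
    if s.sse_enabled:
        if s.encryption_type not in ("AES256", "aws:kms", "none"):
            errors.append("Encryption Type must be AES256, aws:kms, or none")
        elif s.encryption_type == "aws:kms" and not s.kms_key_id:
            errors.append("KMS Key ID is required when Encryption Type is aws:kms")
    return errors


print(validate_settings(S3ConnectionSettings(region="us-east-1", bucket="my-bucket")))
```

Collecting every problem in a list, rather than raising on the first one, lets a form report all invalid fields at once.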

Example Labels

  • environment: production
  • team: data-platform
  • storage: s3
  • region: us-east-1

Notes

  • Bucket validation: Names must start and end with a letter or number and may contain only lowercase letters, numbers, hyphens, and periods. The sequences .., .-, and -. are not allowed.
  • Security: Credentials and KMS Key IDs are stored encrypted and displayed as masked on edit. Leave fields empty to keep stored values.
  • S3‑compatible tips: Use Quick Preset; enable Force Path Style for MinIO; use Disable SSL for local development only.
  • Limits: Discovery Limit and Max File Size protect pipelines from large scans/uploads.
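
The bucket naming rules from the Bucket field and the validation note can be combined into a single check. The sketch below is one way to express them, assuming the rules exactly as stated in this guide; `is_valid_bucket_name` is a hypothetical helper, and the connector's own validator may differ in detail.

```python
import ipaddress
import re

# 3-63 characters: lowercase letters, digits, hyphens, periods;
# the first and last characters must be a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Apply the bucket rules described in this guide (illustrative helper)."""
    if not _BUCKET_RE.fullmatch(name):
        return False
    # Adjacent periods, or a period next to a hyphen, are not allowed.
    if any(seq in name for seq in ("..", ".-", "-.")):
        return False
    # A name formatted like an IPv4 address is rejected.
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True
    return False


for candidate in ("data-platform.prod", "My-Bucket", "192.168.1.1", "logs..archive"):
    print(candidate, is_valid_bucket_name(candidate))
```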

Function Builder

Creating S3 Functions

After saving the connection:

  1. Go to Functions → New Function
  2. Choose S3 Fetch or S3 Write as the function type
  3. Select the S3 connection profile
  4. Configure object keys, limits, and overrides

S3 Function Creation

Design reusable S3 fetch and write operations with object key templates and overrides

Fetch Object

Purpose: Retrieve objects or lists of matching objects from the configured bucket/prefix using exact keys, wildcards, regex, or templates.

Configuration Fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| Object Key | String | Yes | - | Object path. Supports exact (data/report.csv), wildcards (logs/2024-*.log), regex (data/\\d{4}-\\d{2}-\\d{2}.csv), and templates (((date))/report.xlsx) |
| Timeout (ms) | Number | No | 60000 | Operation timeout (1000–600000) |
| Max File Size (MB) | Number | No | - | Overrides connection max (1–25) |
| Discovery Limit | Number | No | - | Overrides connection limit (1–100000) |
| Version ID | String | No | - | Specific object version for versioned buckets |

Use Cases: Retrieve daily reports, collect log partitions, hydrate datasets into pipelines
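
The difference between exact, wildcard, and regex keys is easiest to see against a list of object keys. The sketch below is an approximation under stated assumptions: `select_keys` and its explicit `mode` argument are hypothetical (this guide does not describe how the connector detects the pattern type), wildcards are matched with Python's `fnmatch`, and the Discovery Limit is modeled as a cap on how many keys are scanned.

```python
import fnmatch
import re
from typing import Callable, Iterable


def select_keys(keys: Iterable[str], pattern: str, mode: str,
                discovery_limit: int = 2000) -> list[str]:
    """Return keys matching pattern, scanning at most discovery_limit keys."""
    matches: Callable[[str], bool]
    if mode == "exact":
        matches = lambda key: key == pattern
    elif mode == "wildcard":
        # fnmatch's "*" also matches "/" characters inside the key.
        matches = lambda key: fnmatch.fnmatchcase(key, pattern)
    elif mode == "regex":
        compiled = re.compile(pattern)
        matches = lambda key: compiled.fullmatch(key) is not None
    else:
        raise ValueError(f"Unknown mode: {mode}")

    selected = []
    for scanned, key in enumerate(keys):
        if scanned >= discovery_limit:
            break  # stop listing once the limit is reached
        if matches(key):
            selected.append(key)
    return selected


listing = ["logs/2024-01.log", "logs/2023-12.log", "data/2024-01-15.csv", "data/report.csv"]
print(select_keys(listing, "logs/2024-*.log", "wildcard"))
print(select_keys(listing, r"data/\d{4}-\d{2}-\d{2}\.csv", "regex"))
```

With a limit lower than the number of objects under the prefix, later matches are never seen, so broad patterns on large prefixes may need a higher Discovery Limit or a narrower prefix.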

Write Object

Purpose: Upload objects to the configured bucket/prefix with optional overwrite policy, metadata/tags, storage class, and encryption overrides.

Configuration Fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| Object Key | String | Yes | - | Target path (max 1024). Supports templates e.g., data/output_((date)).csv |
| Data | String | Yes | - | Content to upload. Supports plain text and base64‑encoded content |
| Content Type | String | No | - | MIME type (auto‑detected if omitted) |
| Overwrite Existing | Boolean | No | true | If false and object exists, a timestamped filename is created |
| Storage Class | Enum | No | STANDARD | One of: STANDARD, REDUCED_REDUNDANCY, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, GLACIER, DEEP_ARCHIVE, GLACIER_IR |
| Server‑Side Encryption | String | No | - | Optional override: AES256, aws:kms, or none (inherits connection if omitted). For aws:kms, provide KMS Key ID |
| Cache Control | String | No | - | Value for HTTP Cache-Control header |
| Metadata | Object | No | {} | User metadata as key/value pairs |
| Tags | Object | No | {} | Tag set as key/value pairs (up to 10 tags) |
| Timeout (ms) | Number | No | 60000 | Operation timeout (1000–600000) |
| Max File Size (MB) | Number | No | - | Overrides connection max (1–25) |
| Generate Presigned URL | Boolean | No | false | Return a presigned URL for the uploaded object |
| Presigned URL Expiry (s) | Number | No | 3600 | Expiry time for presigned URL (60–604800) |

Use Cases: Export pipeline results, publish artifacts, archive backups with lifecycle tiers
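
The Overwrite Existing behavior can be pictured as a key‑resolution step that runs before the upload. The sketch below is illustrative only: `resolve_write_key` is a hypothetical helper, the timestamp format and its position before the file extension are assumptions (this guide does not specify them), and the 1024 limit is checked in UTF‑8 bytes, which is the unit S3 itself uses for key length.

```python
import posixpath
from datetime import datetime, timezone
from typing import Callable, Optional


def resolve_write_key(key: str, exists: Callable[[str], bool], overwrite: bool,
                      now: Optional[datetime] = None) -> str:
    """Pick the key to write to, avoiding a clobber when overwrite is False."""
    if len(key.encode("utf-8")) > 1024:
        raise ValueError("Object key exceeds 1024 bytes")
    if overwrite or not exists(key):
        return key
    now = now or datetime.now(timezone.utc)
    # Assumed naming scheme: insert a UTC timestamp before the extension.
    stem, ext = posixpath.splitext(key)
    return f"{stem}_{now:%Y%m%dT%H%M%SZ}{ext}"


existing = {"data/output.csv"}
print(resolve_write_key("data/output.csv", existing.__contains__, overwrite=True))
print(resolve_write_key("data/output.csv", existing.__contains__, overwrite=False))
```

Passing `exists` as a callable keeps the sketch independent of any S3 client; in practice it would wrap a head‑object style lookup.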

Using Parameters

Use ((parameterName)) in object keys, content templates, or metadata values to expose parameters for validation and runtime binding.

| Configuration | Description | Example |
| --- | --- | --- |
| Type | Validate incoming values | string, number, boolean, datetime, json, buffer |
| Required | Enforce presence | Required / Optional |
| Default Value | Provide fallbacks | 'reports', '{}', NOW() |
| Description | Document intent | "Object key suffix (YYYY‑MM‑DD)", "Custom metadata JSON" |

S3 Function Parameters

Parameter validation, defaults, and helper text for S3 object keys and payloads
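
As a rough model of how `((parameterName))` placeholders resolve at runtime, the sketch below extracts placeholder names from a template and substitutes bound values. It is not MaestroHub's template engine: `placeholder_names` and `render_template` are hypothetical helpers, and parameter names are assumed to consist of letters, digits, and underscores.

```python
import re

# Matches ((name)) and captures the parameter name.
_PLACEHOLDER = re.compile(r"\(\((\w+)\)\)")


def placeholder_names(template: str) -> list[str]:
    """List distinct parameter names in order of first appearance."""
    return list(dict.fromkeys(_PLACEHOLDER.findall(template)))


def render_template(template: str, params: dict) -> str:
    """Replace each ((name)) with its bound value; fail on missing parameters."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"Missing required parameter: {name}")
        return str(params[name])

    return _PLACEHOLDER.sub(substitute, template)


print(placeholder_names("data/output_((date)).csv"))
print(render_template("data/output_((date)).csv", {"date": "2024-01-15"}))
```

Listing the names first is what allows a builder to expose each placeholder as a typed parameter with its own default and description.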

Pipeline Integration

Use the S3 connection functions you configure here as nodes inside the Pipeline Designer to move files between systems. Drag in the fetch or write node, bind parameters to upstream outputs or constants, and tune retries or error branches.

For broader orchestration patterns that mix S3 with SQL, REST, or MQTT steps, see the Connector Nodes page.

S3 node in pipeline designer

S3 fetch node with connection, function, and parameter bindings

Common Use Cases

Data Lake Ingestion

Ingest CSV/JSON objects from S3 into analytical stores or trigger downstream normalization pipelines.

Backup and Restore

Store pipeline outputs and model artifacts with versioning; retrieve specific versions for rollback.

Report Distribution

Publish generated reports to S3 and optionally return presigned URLs for external access.