# Quick Start
This guide walks you through initializing a data catalog and deploying it to remote storage.
## Prerequisites
- Python 3.13+
- uv (recommended) or pip
- dbt (for pipeline execution)
- S3-compatible storage (for remote push/pull)
## 1. Install
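The guide does not show the install command, so the package name and install channel below are assumptions; adjust to however `fdl` is actually distributed:

```shell
# Install the fdl CLI with uv (recommended) -- package name is an assumption
uv tool install fdl

# Or with pip
pip install fdl
```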
## 2. Initialize Project
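Initialization is a single command, as listed in the workflow summary at the end of this guide:

```shell
# Create fdl.toml and the .fdl/ directory in the current project
fdl init
```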
This generates:
- `fdl.toml` — Project config (tracked in Git)
- `.fdl/` — DuckLake catalog and artifacts (auto-added to `.gitignore`)
For dlt integration, use a SQLite catalog:
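The guide does not show the command for this variant; the flag below is an assumption, so check `fdl init --help` for the real option name:

```shell
# Initialize with a SQLite catalog instead of the DuckLake default (flag name assumed)
fdl init --catalog sqlite
```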
## 3. Configure Remotes
Register remote storage as a Named Remote.
### S3 Storage
Set S3 endpoint and credentials:
```shell
fdl config s3.endpoint https://your-s3-endpoint.com
fdl config s3.access_key_id YOUR_ACCESS_KEY
fdl config s3.secret_access_key YOUR_SECRET_KEY
```
Register the remote:
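The registration command is not shown in the guide; the sketch below assumes a git-style `fdl remote add <name> <url>` interface, with a placeholder bucket URL:

```shell
# Register the S3 bucket as a named remote called "origin" (command shape assumed)
fdl remote add origin s3://your-bucket/fdl
```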
> **Tip**
>
> S3 credentials are stored in `~/.fdl/config` (user level).
> Remote URLs can be stored with `--local` in `.fdl/config` (workspace level),
> or written directly in `fdl.toml` to share with the team.
### Local Storage
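This section has no example in the source. For shared filesystems, a plain directory path can presumably serve as a remote URL; the sketch assumes a git-style `fdl remote add <name> <url>` interface, and the path is hypothetical:

```shell
# Register a local directory as a remote (command shape and path are assumptions)
fdl remote add local-backup /mnt/shared/fdl-remote
```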
## 4. Build Pipeline
Run your dbt pipeline. `fdl run` automatically injects the required environment variables
(`FDL_STORAGE`, `FDL_DATA_PATH`, etc.):
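The invocation, as given in the workflow summary at the end of this guide:

```shell
# fdl injects FDL_STORAGE, FDL_DATA_PATH, etc. before invoking dbt
fdl run -- dbt run
```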
## 5. Generate Metadata
Generate metadata from dbt artifacts:
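From the workflow summary, the command for this step is:

```shell
# Read dbt artifacts and write table/column definitions plus lineage
fdl metadata
```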
This creates `.fdl/metadata.json` containing table/column definitions and lineage information.
## 6. Push to Remote
Upload the catalog and metadata to the remote:
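Per the workflow summary, pushing to the remote named `origin`:

```shell
# Upload the catalog and metadata to the "origin" remote
fdl push origin
```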
SQLite catalogs are automatically converted to DuckDB format during push.
## 7. Pull from Remote
To retrieve a catalog in a different environment:
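The pull command is not shown in the guide; by symmetry with `fdl push origin`, it is presumably:

```shell
# Download the catalog and metadata from the "origin" remote (assumed by symmetry with push)
fdl pull origin
```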
## Typical Workflow
```mermaid
graph LR
    A[fdl init] --> B[fdl config]
    B --> C[fdl run -- dbt run]
    C --> D[fdl metadata]
    D --> E[fdl push origin]
```
- `fdl init` — Initialize the project
- `fdl config` — Configure remotes and credentials
- `fdl run -- dbt run` — Execute the pipeline
- `fdl metadata` — Generate metadata
- `fdl push origin` — Deploy to remote
## Next Steps
- Configuration — Details on 3-layer config management
- CLI Reference — Full command reference