Added troubleshooting section to filesystem docs #1900

Merged on Oct 1, 2024 (6 commits)
23 changes: 23 additions & 0 deletions docs/website/docs/dlt-ecosystem/destinations/filesystem.md
@@ -699,4 +699,27 @@ You will also notice `init` files being present in the root folder and the speci

**Note:** When a load generates a new state (for example, during incremental loads), a new state file appears in the `_dlt_pipeline_state` folder at the destination. To keep these files from accumulating, old state files are removed automatically, and only the latest 100 are retained by default. Use the filesystem configuration option `max_state_files` to change the maximum number of pipeline state files to retain. Setting this value to 0 or a negative number disables the cleanup of old states.

## Troubleshooting
When running your pipeline, you might encounter an error like `[Errno 36] File name too long`. This error occurs because a generated file name exceeds the maximum length allowed by your filesystem.
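
To check in advance whether a given name would hit this limit, the sketch below queries the filesystem's name limit with `os.pathconf`. It is an illustration only and not part of `dlt`: the helper names `name_limit` and `is_too_long` are hypothetical, and the fallback of 255 bytes is an assumption based on the limit common on Linux filesystems.

```python
import os


def name_limit(directory: str = ".") -> int:
    """Return the maximum file name length (in bytes) for `directory`."""
    try:
        limit = os.pathconf(directory, "PC_NAME_MAX")
    except (AttributeError, ValueError, OSError):
        # os.pathconf is unavailable (e.g. on Windows) or the query failed.
        limit = -1
    # Assumed fallback: 255 bytes, a common limit on Linux filesystems.
    return limit if limit > 0 else 255


def is_too_long(file_name: str, directory: str = ".") -> bool:
    """True if `file_name`, encoded as UTF-8, exceeds the directory's limit."""
    return len(file_name.encode("utf-8")) > name_limit(directory)
```

The limit applies to a single path component measured in bytes, so names containing non-ASCII characters can reach it sooner than their character count suggests.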

### Solution
To prevent the file name length error, set the `max_identifier_length` parameter for your destination. This truncates all identifiers (including filenames) to a specified maximum length.
For example:

```py
import dlt
from dlt.destinations import duckdb

pipeline = dlt.pipeline(
    pipeline_name="your_pipeline_name",
    destination=duckdb(
        max_identifier_length=200,  # Adjust the length as needed
    ),
)
```

**Notes**
- `max_identifier_length` truncates all identifiers, including table and column names. Choose a length that keeps identifiers unique to avoid collisions.

- Adjust `max_identifier_length` based on your data structure and filesystem limits.
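
To get a rough sense of the collision risk before picking a value, you can check which of your identifiers would become identical at a given length. The sketch below assumes naive prefix truncation, which is not necessarily how `dlt` shortens identifiers, and the helper `find_truncation_collisions` is hypothetical.

```python
from collections import defaultdict


def find_truncation_collisions(identifiers, max_length):
    """Group identifiers that become identical after naive prefix truncation."""
    groups = defaultdict(list)
    for name in identifiers:
        # Assumption: plain prefix cut; a real implementation may differ.
        groups[name[:max_length]].append(name)
    # Keep only the truncated names shared by more than one identifier.
    return {short: names for short, names in groups.items() if len(names) > 1}


columns = ["customer_address_line_1", "customer_address_line_2", "orders"]
collisions = find_truncation_collisions(columns, max_length=20)
```

With these sample names, a length of 20 cuts off the trailing digit that distinguishes the two address columns, while a length of 30 leaves all three names unchanged.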

<!--@@@DLT_TUBA filesystem-->