tap-csv-folder
tap-csv-folder is a Singer tap for CSV files stored in a folder on a local or remote filesystem.
Built with the Meltano Tap SDK for Singer Taps.
Install from GitHub:
pipx install git+https://github.com/MeltanoLabs/tap-csv-folder.git@main
Install in a Meltano project:
meltano add extractor tap-csv-folder --from-ref https://raw.githubusercontent.com/MeltanoLabs/tap-csv-folder/refs/heads/main/plugin.yml
Setting | Required | Default | Description |
---|---|---|---|
delimiter | False | , | Field delimiter character. |
quotechar | False | " | Quote character. |
escapechar | False | None | Escape character. |
doublequote | False | true | Whether quotechar inside a field should be doubled. |
lineterminator | False | | Line terminator character. |
filesystem | False | local | The filesystem to use. |
path | False | None | Path to the directory where the files are stored. |
read_mode | False | None | Use one_stream_per_file to read each file as a separate stream, or merge to merge all files into a single stream. |
stream_name | False | files | Name of the stream to use when read_mode is merge. |
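The settings above map onto a JSON config object. The following is a minimal sketch, assuming a hypothetical local directory ./data and the merge read mode; only the setting names come from the table, and every value is illustrative.

```python
import json

# Hypothetical values for illustration; only the setting names come from the table above.
config = {
    "filesystem": "local",
    "path": "./data",        # assumed directory that holds the CSV files
    "read_mode": "merge",    # or "one_stream_per_file"
    "stream_name": "files",  # only used when read_mode is "merge"
    "delimiter": ",",
    "quotechar": '"',
    "doublequote": True,
}

# Serialize to JSON; save this text as a config file of your choice.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Saved to a file, this text can be passed to the tap through its --config option.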
The following settings are provided by the Singer SDK for filesystems and supported by the tap.
Setting | Required | Default | Description |
---|---|---|---|
ftp | False | None | FTP connection settings |
ftp.host | True | None | FTP server host |
ftp.port | False | 21 | FTP server port |
ftp.username | False | None | FTP username |
ftp.password | False | None | FTP password |
ftp.timeout | False | 60 | Timeout of the FTP connection in seconds |
ftp.encoding | False | utf-8 | FTP server encoding |
sftp | False | None | SFTP connection settings |
sftp.host | True | None | SFTP server host |
sftp.ssh_kwargs | False | None | SSH connection settings |
sftp.ssh_kwargs.port | False | 22 | SFTP server port |
sftp.ssh_kwargs.username | True | None | SFTP username |
sftp.ssh_kwargs.password | False | None | SFTP password |
sftp.ssh_kwargs.pkey | False | None | Private key |
sftp.ssh_kwargs.timeout | False | 60 | Timeout of the SFTP connection in seconds |
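The dotted setting names above describe nested objects. A sketch of an SFTP configuration follows, assuming a hypothetical host, user, and remote path, and assuming that "sftp" is the filesystem value that activates the sftp block.

```python
import json

# Hypothetical host, user, and path; the nesting follows the dotted setting names above.
sftp_config = {
    "filesystem": "sftp",  # assumption: the value that selects the sftp settings block
    "path": "/exports/csv",
    "read_mode": "one_stream_per_file",
    "sftp": {
        "host": "sftp.example.com",
        "ssh_kwargs": {
            "port": 22,
            "username": "extract_user",
            "timeout": 60,
        },
    },
}

print(json.dumps(sftp_config, indent=2))
```

The password and private key are left out on purpose here; they are better supplied from a secret store or the environment than from a file kept in version control.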
The following settings are provided by the Singer SDK and automatically supported by the tap.
Setting | Required | Default | Description |
---|---|---|---|
stream_maps | False | None | Config object for stream maps capability. For more information check out Stream Maps. |
stream_map_config | False | None | User-defined config values to be used within map expressions. |
faker_config | False | None | Config for the Faker instance variable fake used within map expressions. Only applicable if the plugin specifies faker as an additional dependency (through the singer-sdk faker extra or directly). |
faker_config.seed | False | None | Value to seed the Faker generator for deterministic output: https://faker.readthedocs.io/en/master/#seeding-the-generator |
faker_config.locale | False | None | One or more LCID locale strings to produce localized output for: https://faker.readthedocs.io/en/master/#localization |
flattening_enabled | False | None | 'True' to enable schema flattening and automatically expand nested properties. |
flattening_max_depth | False | None | The max depth to flatten schemas. |
batch_config | False | None | |
batch_config.encoding | False | None | Specifies the format and compression of the batch files. |
batch_config.encoding.format | False | None | Format to use for batch files. |
batch_config.encoding.compression | False | None | Compression format to use for batch files. |
batch_config.storage | False | None | Defines the storage layer to use when writing batch files. |
batch_config.storage.root | False | None | Root path to use when writing batch files. |
batch_config.storage.prefix | False | None | Prefix to use when writing batch files. |
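As an illustration of stream_maps and stream_map_config, the sketch below hashes one column and drops the original. It assumes the SDK's documented stream map expression syntax, a hypothetical column named email, and the default merged stream name files; the seed value is a placeholder.

```python
import json

# Hypothetical column name ("email") and seed; "files" is the default stream_name in merge mode.
mapping_config = {
    "stream_maps": {
        "files": {
            "email": None,  # serialized as null, which removes the original column
            "email_hash": "md5(config['hash_seed'] + email)",
        }
    },
    "stream_map_config": {"hash_seed": "replace-me"},
}

print(json.dumps(mapping_config, indent=2))
```

These keys would be merged into the same config object as the tap's own settings.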
A full list of supported settings and capabilities for this tap is available by running:
tap-csv-folder --about
If --config=ENV is provided, this Singer tap automatically imports any environment variables defined in the working directory's .env file, so a config value is picked up whenever a matching environment variable is set either in the terminal context or in the .env file.
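To see which environment variable would match a given setting, the helper below derives the name. It is a sketch that assumes the Singer SDK's usual convention of the plugin name plus the setting name, upper-cased, with dashes replaced by underscores; env_var_name is a hypothetical helper and not part of the tap.

```python
def env_var_name(setting: str, plugin_name: str = "tap-csv-folder") -> str:
    """Build the environment variable name assumed to match a setting."""
    prefix = plugin_name.upper().replace("-", "_")
    return f"{prefix}_{setting.upper().replace('-', '_')}"


for setting in ("path", "read_mode", "delimiter"):
    print(env_var_name(setting))
```

Under that assumption, a line such as TAP_CSV_FOLDER_PATH=./data in the .env file would supply the path setting.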
You can easily run tap-csv-folder by itself or in a pipeline using Meltano.
tap-csv-folder --version
tap-csv-folder --help
tap-csv-folder --config CONFIG --discover > ./catalog.json
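The catalog written by --discover is a JSON document with a top-level streams list, as defined by the Singer specification. A small sketch for listing the discovered streams follows; the sample catalog text and its stream names are invented for illustration, and list_streams is a hypothetical helper.

```python
import json


def list_streams(catalog_text: str) -> list[str]:
    """Return the tap_stream_id of every stream in a Singer catalog."""
    catalog = json.loads(catalog_text)
    return [stream["tap_stream_id"] for stream in catalog.get("streams", [])]


# Invented sample standing in for the contents of a discovered catalog file.
sample_catalog = json.dumps(
    {
        "streams": [
            {"tap_stream_id": "orders", "schema": {"properties": {}}},
            {"tap_stream_id": "customers", "schema": {"properties": {}}},
        ]
    }
)

print(list_streams(sample_catalog))
```

In practice the text would be read from the generated catalog file rather than from an inline sample.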
Follow these instructions to contribute to this project.
pipx install poetry
poetry install
Create tests within the tests subfolder and then run:
poetry run pytest
You can also test the tap-csv-folder CLI directly using poetry run:
poetry run tap-csv-folder --help
Testing with Meltano
Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.
Next, install Meltano (if you haven't already) and any needed plugins:
# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-csv-folder
meltano install
Now you can test and orchestrate using Meltano:
# Test invocation:
meltano invoke tap-csv-folder --version
# OR run a test `elt` pipeline:
meltano run tap-csv-folder target-jsonl
See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.