From 0dec85f3331df5a1d9fcc2648d88a68383fa1ebf Mon Sep 17 00:00:00 2001
From: HarshvardhanJ
Date: Mon, 3 Mar 2025 16:40:37 +0530
Subject: [PATCH] Update README.md

Fixed inconsistency of the full stops
---
 basic_pitch/data/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/basic_pitch/data/README.md b/basic_pitch/data/README.md
index 3466be0..4f1eaea 100644
--- a/basic_pitch/data/README.md
+++ b/basic_pitch/data/README.md
@@ -1,11 +1,11 @@
 # Data / Training
 
 The code and scripts in this section deal with training basic pitch on your own. Scripts in the `datasets` folder allow one to download and process a selection of the datasets used to train the original model. Each of these download scripts has the following keyword arguments:
-* **--source**: Source directory to download raw data to. It defaults to `$HOME/mir_datasets/{dataset_name}`
+* **--source**: Source directory to download raw data to. It defaults to `$HOME/mir_datasets/{dataset_name}`.
 * **--destination**: Directory to write processed data to. It defaults to `$HOME/data/basic_pitch/{dataset_name}`.
 * **--runner**: The method used to run the Beam Pipeline for processing the dataset. Options include `DirectRunner`, running directly in the code process running the pipeline, `PortableRunner`, which can be used to run the pipeline in a docker container locally, and `DataflowRunner`, which can be used to run the pipeline in a docker container on Dataflow.
 * **--timestamped**: If passed, the dataset will be put into a timestamp directory instead of 'splits'.
 * **--batch-size**: Number of examples per tfrecord when partitioning the dataset.
-* **--sdk_container_image**: The Docker container image used to process the data if using `PortableRunner` or `DirectRunner` .
+* **--sdk_container_image**: The Docker container image used to process the data if using `PortableRunner` or `DirectRunner`.
 * **--job_endpoint**: the endpoint where the job is running. It defaults to `embed` which works for `PortableRunner`.
 Additional arguments that work with Beam in general can be used as well, and will be passed along and used by the pipeline. If using `DataflowRunner`, you will be required to pass `--temp_location={Path to GCS Bucket}`, `--staging_location={Path to GCS Bucket}`, `--project={Name of GCS Project}` and `--region={GCS region}`.
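
For context on how the flags documented in the patched README combine in practice, below is a minimal sketch of invoking one of the dataset scripts. The script path `basic_pitch/data/datasets/guitarset.py`, the bucket and project names, and every concrete value are assumptions for illustration; only the flag names themselves come from the README text above.

```bash
# Minimal sketch, not taken from the patch: the script path and all values
# below are hypothetical; only the flag names appear in the patched README.

# Local processing with DirectRunner (the pipeline runs in the current process).
python basic_pitch/data/datasets/guitarset.py \
  --source "$HOME/mir_datasets/guitarset" \
  --destination "$HOME/data/basic_pitch/guitarset" \
  --runner DirectRunner \
  --batch-size 100

# Remote processing on Dataflow; per the README, the four arguments below
# are required when using DataflowRunner.
python basic_pitch/data/datasets/guitarset.py \
  --runner DataflowRunner \
  --temp_location gs://my-bucket/tmp \
  --staging_location gs://my-bucket/staging \
  --project my-gcp-project \
  --region us-central1
```

A `DirectRunner` pass over a small dataset is the cheapest way to validate the pipeline end to end before committing to a Dataflow run.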