Commit c706234

Merge pull request #4 from Donavin97/feature/auto-dataset
- Updated readme to reflect dataset creation.
2 parents 6f421e2 + 8b0f624 commit c706234

1 file changed: +44 -1 lines changed

eqcctpro/README.md

Lines changed: 44 additions & 1 deletion
@@ -237,7 +237,7 @@ eqcct_runner.run_eqcctpro()
  - To use the optimal parameter value for this param, use the **EvaluateSystem** class (can be found below)
  - **`number_of_concurrent_timechunk_predictions (int)`: default = None**
  - The number of timechunks running in parallel
- - Avoids the sequential processing of timechunks by processing multiple timechunks in parallel, exponetially reducing runtime
+ - Avoids the sequential processing of timechunks by processing multiple timechunks in parallel, exponentially reducing runtime
  - **`best_usecase_config (bool)`: default = False**
  - If True, will override inputted cpu_id_list, number_of_concurrent_predictions, intra_threads, inter_threads values for the best overall use-case configurations
  - Best overall use-case configurations are defined as the best overall input configurations that minimize runtime while doing the most amount of processing with your available hardware
@@ -492,6 +492,49 @@ For **OptimalGPUConfigurationFinder.find_optimal_for()**, the function requires
  ## **Configuration**
  The `environment.yml` file specifies the dependencies required to run EQCCTPro. Ensure you have the correct versions installed by using the provided conda environment setup.

+ ## **Dataset creation**
+ You can now build the dataset structure that EQCCTPro requires from your own data using the provided script `create_dataset.py`.
+ The script:
+ 1. Retrieves waveform data from a user-defined FDSNWS web service.
+ 2. Selects data by network, station, channel, and location codes.
+ 3. Optionally splits the requested time range into time chunks sized to the user's requirements.
+ 4. Automatically downloads the data and creates the folder structure required by EQCCTPro.
+ 5. Optionally denoises the data using SeisBench as the backend.
+ The script's help output lists the available options:
+ ```sh
+ python create_dataset.py -h
+ ```
+ Output:
+ ```
+ usage: create_dataset.py [-h] [--start START] [--end END] [--networks NETWORKS] [--stations STATIONS] [--locations LOCATIONS]
+                          [--channels CHANNELS] [--host HOST] [--output OUTPUT] [--chunk CHUNK] [--denoise]
+
+ Download FDSN waveforms in equal-time chunks.
+
+ options:
+   -h, --help            show this help message and exit
+   --start START         Start time, e.g. 2024-12-03T00:00:00Z
+   --end END             End time, e.g. 2024-12-03T02:00:00Z
+   --networks NETWORKS   Comma-separated network codes or *
+   --stations STATIONS   Comma-separated station codes or *
+   --locations LOCATIONS
+                         Comma-separated location codes or *
+   --channels CHANNELS   Comma-separated channel codes or *
+   --host HOST           FDSNWS base URL
+   --output OUTPUT       Base output directory
+   --chunk CHUNK         Chunk size in minutes. Splits start-end into N windows.
+   --denoise             If set, apply seisbench.DeepDenoiser to each chunk.
+ ```
+ The following example downloads waveforms from a local FDSNWS server:
+ ```sh
+ python create_dataset.py --start 2025-10-31T00:00 --end 2025-10-31T04:00 --networks TX --stations "*" --locations "*" --channels HH?,HN? --host http://localhost:8080 --output waveforms_directory --chunk 60
+ ```
+ The resulting output folder contains the data to be processed by EQCCTPro.
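To illustrate what a comma-separated, wildcarded argument such as `--channels HH?,HN?` selects, here is a minimal Python sketch. It is not taken from `create_dataset.py`: the helper names `parse_codes` and `matches_any` and the list of available channels are hypothetical, and the real script presumably forwards the patterns to the FDSNWS server, which resolves the wildcards itself. The sketch only mimics the usual FDSN convention that `?` matches one character and `*` matches any number of characters.

```python
from fnmatch import fnmatchcase


def parse_codes(arg):
    """Split a comma-separated CLI value into a list of non-empty patterns."""
    return [code.strip() for code in arg.split(",") if code.strip()]


def matches_any(code, patterns):
    """Return True if the code matches at least one wildcard pattern."""
    return any(fnmatchcase(code, pattern) for pattern in patterns)


# Hypothetical channel codes offered by a station (illustration only).
available = ["HHZ", "HHN", "HHE", "HNZ", "BHZ", "EHZ"]

channel_patterns = parse_codes("HH?,HN?")
selected = [code for code in available if matches_any(code, channel_patterns)]
print(selected)
```

With these assumed inputs, the high-gain (`HH?`) and strong-motion (`HN?`) channels are kept, while `BHZ` and `EHZ` are filtered out.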
+ Note: Set the same chunk size in the download script and in EQCCTPro itself to avoid issues.
+ For example, if you set a time chunk of 20 minutes in the download script, also use 20 minutes as the chunk size when calling EQCCTPro.
+ Otherwise, the data may be processed incorrectly.
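The effect of the chunk size can be sketched with a few lines of Python. This is an illustration under stated assumptions, not the logic of `create_dataset.py`: the function `split_into_chunks` is hypothetical, and truncating a final partial window at the end time is an assumption about how leftover time might be handled.

```python
from datetime import datetime, timedelta, timezone


def split_into_chunks(start, end, chunk_minutes):
    """Return (window_start, window_end) pairs that cover start..end."""
    if chunk_minutes <= 0:
        raise ValueError("chunk_minutes must be positive")
    if end <= start:
        raise ValueError("end must be after start")
    step = timedelta(minutes=chunk_minutes)
    windows = []
    cursor = start
    while cursor < end:
        # Assumption: a final partial window is cut off at the end time.
        windows.append((cursor, min(cursor + step, end)))
        cursor += step
    return windows


# Same time range as the example command above.
start = datetime(2025, 10, 31, 0, 0, tzinfo=timezone.utc)
end = datetime(2025, 10, 31, 4, 0, tzinfo=timezone.utc)

windows = split_into_chunks(start, end, 60)
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())

# A different chunk size yields different window boundaries for the same data.
print(len(split_into_chunks(start, end, 60)), "windows at 60 minutes")
print(len(split_into_chunks(start, end, 20)), "windows at 20 minutes")
```

Because the window boundaries depend on the chunk size, data downloaded in 60-minute chunks will not line up with a 20-minute chunk setting at processing time, which is why the two values should match.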
+
+
  ## **License**
  EQCCTPro is provided under an open-source license. See LICENSE for details.
