docs/contributing.md (4 additions, 3 deletions)
@@ -116,8 +116,8 @@ When coding a new scraper, there are a few important conventions to follow:
 - If it's a new state folder, add an empty `__init__.py` to the folder
 - Create a `Site` class inside the agency's scraper module with the following attributes/methods:
   - `name` - Official name of the agency
-  - `scrape_meta` - generates a CSV with metadata about videos and other available files (file name, URL, and size at minimum)
-  - `scrape` - uses the CSV generated by `scrape_meta` to download videos and other files
+  - `scrape_meta` - generates a JSON with metadata about videos and other available files (file name, URL at a minimum)
+  - `download_agency` - uses the JSON generated by `scrape_meta` to download videos and other files

 Below is a pared down version of San Diego's [Site](https://github.com/biglocalnews/clean-scraper/blob/main/clean/ca/san_diego_pd.py) class to illustrate these conventions.

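The convention changed in this hunk (JSON metadata written by `scrape_meta`, then consumed by `download_agency`) can be sketched roughly as follows. This is a minimal illustration and not the project's actual `Site` class: the constructor arguments, the `example_pd.json` file name, the `name`/`url` metadata keys, and the injectable `fetch` callable are all assumptions made for the example.

```python
import json
import urllib.request
from pathlib import Path


def _http_fetch(url):
    """Return the raw bytes at url (plain stdlib, no retries or throttling)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


class Site:
    """Hypothetical scraper laid out per the conventions in the hunk above."""

    name = "Example Police Department"  # placeholder agency name

    def __init__(self, data_dir, cache_dir):
        # Assumed layout: metadata goes in data_dir, downloads in cache_dir.
        self.data_dir = Path(data_dir)
        self.cache_dir = Path(cache_dir)

    def scrape_meta(self, assets):
        """Write asset metadata (file name and URL at a minimum) as JSON."""
        # A real scraper would build `assets` by crawling the agency's site.
        self.data_dir.mkdir(parents=True, exist_ok=True)
        outfile = self.data_dir / "example_pd.json"
        outfile.write_text(json.dumps(assets, indent=2))
        return outfile

    def download_agency(self, meta_path, fetch=_http_fetch):
        """Download every asset listed in the JSON written by scrape_meta."""
        assets = json.loads(Path(meta_path).read_text())
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        saved = []
        for asset in assets:
            target = self.cache_dir / asset["name"]
            target.write_bytes(fetch(asset["url"]))
            saved.append(target)
        return saved
```

Passing `fetch` as a parameter is only a convenience for this sketch: it lets the download step be exercised with a stub instead of real network calls.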
@@ -285,6 +285,7 @@ Options:
 Commands:
   list         List all available agencies and their slugs.
   scrape-meta  Command-line interface for generating metadata CSV about...
+  download_agency  Downloads assets retrieved in scrape-meta
 ```

 Running a state is as simple as passing arguments to the appropriate subcommand.
@@ -299,7 +300,7 @@ pipenv run python -m clean.cli list
 pipenv run python -m clean.cli scrape-meta ca_san_diego_pd

 # Trigger file downloads using agency slug
-pipenv run python -m clean.cli scrape ca_san_diego_pd
+pipenv run python -m clean.cli download_agency ca_san_diego_pd
 ```

 For more verbose logging, you can ask the system to show debugging information.
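Both subcommands above take an agency slug such as `ca_san_diego_pd`, and the San Diego scraper lives at `clean/ca/san_diego_pd.py`. One plausible way a slug could map to its scraper module is sketched below. The helper `slug_to_module` is hypothetical and is not taken from the project's CLI code; it assumes a two-letter state prefix followed by the agency's module name.

```python
def slug_to_module(slug, package="clean"):
    """Turn an agency slug into a dotted module path (illustrative only)."""
    # Split on the first underscore: "ca_san_diego_pd" -> "ca", "san_diego_pd"
    state, _, agency = slug.partition("_")
    if len(state) != 2 or not agency:
        raise ValueError(f"expected '<state>_<agency>', got {slug!r}")
    return f"{package}.{state}.{agency}"


module_path = slug_to_module("ca_san_diego_pd")
```

Under these assumptions, `module_path` is `clean.ca.san_diego_pd`, which a CLI could pass to `importlib.import_module` before instantiating `Site`.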