Commit 3a7e2f1

jschloman and claude authored
feat: plugin fetch CLI, dotenv support, location assumptions plugin, and plugin READMEs (#47)
- Add `python autobiographer.py fetch <plugin>` CLI subcommand; fetchable plugins download data, non-fetchable ones print manual export steps
- Add python-dotenv support so .env is loaded automatically by both the CLI and Streamlit app — no more manual export in every terminal session
- Add Fetch Latest Data button in sidebar with live per-page progress shown in a sidebar-level placeholder (always visible, even when the plugin expander is collapsed)
- Show fetch identity (username) and save path before fetching
- Auto-populate config field after a successful fetch
- Auto-expand plugin expander and show warning when saved path no longer exists on disk
- Extract location assumptions into its own AssumptionsPlugin (location-context type) with its own sidebar section and config field; remove assumptions_file from SwarmPlugin
- Add pytest pythonpath = ["."] so tests run without AUTOBIO_PYTHONPATH
- Add per-plugin READMEs (lastfm, swarm, assumptions) and a Data Sources table in the main README linking to all three

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 8edfc18 commit 3a7e2f1

19 files changed

Lines changed: 1328 additions & 105 deletions

.env.example

Lines changed: 12 additions & 2 deletions
@@ -1,5 +1,15 @@
-# Copy to .env and fill in your values, then:
-# docker compose run --rm dashboard python autobiographer.py --user YOUR_USERNAME
+# Copy this file to .env and fill in your values.
+#
+# The app and CLI load .env automatically — no manual `export` needed.
+#
+# Fetch your Last.fm listening history:
+# python autobiographer.py fetch lastfm
+#
+# Or from Docker:
+# docker compose run --rm dashboard python autobiographer.py fetch lastfm
+#
+# Get a Last.fm API key at: https://www.last.fm/api/account/create
+
 AUTOBIO_LASTFM_API_KEY=your_api_key_here
 AUTOBIO_LASTFM_API_SECRET=your_api_secret_here
 AUTOBIO_LASTFM_USERNAME=your_username_here
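
As a rough illustration of what automatic `.env` loading amounts to, the sketch below reads `KEY=value` lines into `os.environ` without replacing variables that are already set. It is a simplified standard-library stand-in, not python-dotenv's actual implementation, and the helper name `load_env_file` is hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Load KEY=value pairs from a file into os.environ (hypothetical helper).

    Variables already present in the environment win, so a value exported
    in the shell is never replaced by the file.
    """
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key and key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded
```

Calling `load_env_file()` once at start-up, before any `os.getenv` lookup, would give the CLI and the dashboard the same view of the `AUTOBIO_LASTFM_*` values.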

README.md

Lines changed: 45 additions & 7 deletions
@@ -24,6 +24,18 @@ Autobiographer is a Python-based toolkit that allows you to fetch, store, and ex
 
 <img src="assets/flythrough.gif" style="width: 500">
 
+## Data Sources
+
+Autobiographer is built around a plugin system. Each plugin owns a specific data source — its format, how to obtain it, and how it maps into the common schema. All path configuration happens in the sidebar; nothing is hardcoded.
+
+| Plugin | Type | Description |
+|--------|------|-------------|
+| [Last.fm Music History](plugins/sources/lastfm/README.md) | what-when | Your complete listening history, fetched automatically via the Last.fm API. One click in the sidebar downloads everything. |
+| [Foursquare / Swarm Check-ins](plugins/sources/swarm/README.md) | where-when | Your check-in history from the Swarm app. Requires a one-time manual data export request from Foursquare. |
+| [Location Assumptions](plugins/sources/assumptions/README.md) | location-context | A user-authored JSON file that fills in your location for periods not covered by Swarm — trips, recurring holidays, and home residency rules. |
+
+Each plugin has its own README with setup instructions, data format details, and schema documentation.
+
 ## Quickstart (Docker)
 
 No Python knowledge required — just [Docker](https://www.docker.com/products/docker-desktop/).
@@ -45,7 +57,7 @@ To download your listening history directly into the mounted data volume, pass y
 ```bash
 cp .env.example .env # fill in your API key, secret, and username
 docker compose run --rm dashboard \
-python autobiographer.py --user YOUR_USERNAME
+python autobiographer.py fetch lastfm
 docker compose up
 ```
 
@@ -79,21 +91,45 @@ docker compose up
 
 ### 3. Configuration
 
-Set your Last.fm credentials in your environment:
+Set your credentials as environment variables. Required for Last.fm fetching:
 ```bash
 export AUTOBIO_LASTFM_API_KEY="your_api_key"
 export AUTOBIO_LASTFM_API_SECRET="your_api_secret"
 export AUTOBIO_LASTFM_USERNAME="your_username"
 ```
 
+Get a Last.fm API key at: https://www.last.fm/api/account/create
+
 ### 4. Usage
 
 #### Fetch Your Data
-Download your listening history to a local CSV file:
+
+The `fetch` command targets a specific plugin. Plugins that support automatic download
+will retrieve your data; plugins that require a manual export will print step-by-step
+instructions instead.
+
 ```bash
-python autobiographer.py --user your_username
+# Download Last.fm listening history (requires env vars above)
+python autobiographer.py fetch lastfm
+
+# Print Foursquare/Swarm manual export instructions
+python autobiographer.py fetch swarm
+
+# Limit to recent pages or a date range
+python autobiographer.py fetch lastfm --pages 5
+python autobiographer.py fetch lastfm --from-date 2024-01-01 --to-date 2024-12-31
+
+# Save to a custom location
+python autobiographer.py fetch lastfm --output data/my_tracks.csv
 ```
 
+If a plugin is not yet configured the command will list exactly which environment
+variables are missing and what each one is for.
+
+The app sidebar also exposes this directly: a **Fetch Latest Data** button appears for
+plugins that support automatic retrieval, and step-by-step manual instructions are
+shown for plugins that require a data export from the provider.
+
 #### Launch the Streamlit Dashboard
 Start the interactive Streamlit application:
 ```bash
@@ -205,7 +241,7 @@ Additional utility scripts are available in the `tools/` directory:
 ## Project Structure
 
 ```
-autobiographer.py # Last.fm API fetch + data save CLI
+autobiographer.py # Data-fetching CLI (`fetch <plugin>`) + Last.fm API client
 visualize.py # Streamlit dashboard (assembles views from plugins)
 export_html.py # Static HTML export — single self-contained report file
 analysis_utils.py # Shared data processing and caching logic
@@ -215,8 +251,9 @@ plugins/
 sources/
 base.py # SourcePlugin ABC + validate_schema()
 __init__.py # REGISTRY + @register decorator + load_builtin_plugins()
-lastfm/loader.py # Last.fm source plugin
-swarm/loader.py # Foursquare/Swarm source plugin
+lastfm/loader.py # Last.fm source plugin
+swarm/loader.py # Foursquare/Swarm source plugin
+assumptions/loader.py # Location assumptions plugin
 notebooks/ # Jupyter notebooks for custom analysis
 tools/ # Utility scripts (audio muxing, etc.)
 data/ # Local data storage (CSVs, cache, Swarm JSON exports)
@@ -389,6 +426,7 @@ Selected paths are persisted to `data/config.json` so they survive application r
 |---|---|
 | `what-when` | `timestamp`, `label`, `sublabel`, `category`, `source_id` |
 | `where-when` | `timestamp`, `lat`, `lng`, `place_name`, `place_type`, `source_id` |
+| `location-context` | No required columns — defines enrichment data, not a primary stream |
 
 `validate_schema()` raises `ValueError` at load time if any required column is absent.
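
The required-column rule in the table above can be sketched in a few lines. This is an illustrative stand-in, not the project's `validate_schema()` from `plugins/sources/base.py`: the signature (a plugin type plus a list of column names) and the `REQUIRED_COLUMNS` mapping are assumptions; only the column lists and the `ValueError` behaviour come from the README text.

```python
# Required columns per plugin type, copied from the README table.
REQUIRED_COLUMNS: dict[str, list[str]] = {
    "what-when": ["timestamp", "label", "sublabel", "category", "source_id"],
    "where-when": ["timestamp", "lat", "lng", "place_name", "place_type", "source_id"],
    "location-context": [],  # enrichment data: no required columns
}


def validate_schema(plugin_type: str, columns: list[str]) -> None:
    """Raise ValueError if any column required for plugin_type is absent.

    Hypothetical signature for illustration; the real function may take a
    DataFrame or a plugin instance instead.
    """
    if plugin_type not in REQUIRED_COLUMNS:
        raise ValueError(f"Unknown plugin type: {plugin_type!r}")
    missing = [c for c in REQUIRED_COLUMNS[plugin_type] if c not in columns]
    if missing:
        raise ValueError(f"{plugin_type} data is missing required columns: {missing}")
```

Under this sketch a `location-context` source passes with any set of columns, which matches its role as enrichment data.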

autobiographer.py

Lines changed: 164 additions & 42 deletions
@@ -1,10 +1,13 @@
 import argparse
 import os
 import time
-from typing import Optional
+from typing import Callable, Optional
 
 import pandas as pd
 import requests
+from dotenv import load_dotenv
+
+load_dotenv()
 
 
 class Autobiographer:
@@ -30,10 +33,26 @@ def fetch_recent_tracks(
         pages: Optional[int] = None,
         from_ts: Optional[int] = None,
         to_ts: Optional[int] = None,
+        progress_callback: Optional[Callable[[int, int], None]] = None,
     ) -> list[dict]:
-        """Fetch recent tracks for the user."""
+        """Fetch recent tracks for the user.
+
+        Args:
+            limit: Tracks per API page (max 200).
+            pages: Stop after this many pages; None fetches all.
+            from_ts: Unix timestamp lower bound (inclusive).
+            to_ts: Unix timestamp upper bound (inclusive).
+            progress_callback: Optional callable invoked after each page with
+                ``(current_page, total_pages)`` so callers can report progress.
+                Also receives ``(0, total_pages)`` before the first page fetch
+                once the total is known.
+
+        Returns:
+            List of raw track dicts from the Last.fm API.
+        """
         all_tracks = []
         current_page = 1
+        total_pages = 1
 
         while True:
             print(f"Fetching page {current_page}...")
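
The `from_ts` and `to_ts` bounds documented in this hunk are inclusive Unix timestamps. A minimal sketch of turning two `YYYY-MM-DD` dates into such a range follows, assuming local-time interpretation via `time.mktime`; the helper name `day_bounds` is hypothetical.

```python
import time
from typing import Optional


def day_bounds(
    from_date: Optional[str], to_date: Optional[str]
) -> tuple[Optional[int], Optional[int]]:
    """Convert YYYY-MM-DD strings to inclusive Unix timestamp bounds.

    Hypothetical helper. Dates are read as local midnight; the upper bound
    is shifted to 23:59:59 so the whole end day is included. A malformed
    date raises ValueError from time.strptime.
    """

    def to_midnight(date_str: Optional[str]) -> Optional[int]:
        if not date_str:
            return None
        return int(time.mktime(time.strptime(date_str, "%Y-%m-%d")))

    from_ts = to_midnight(from_date)
    to_raw = to_midnight(to_date)
    # 86399 seconds = 24 hours minus one second.
    to_ts = to_raw + 86399 if to_raw is not None else None
    return from_ts, to_ts


start, end = day_bounds("2024-01-01", "2024-01-01")
print(end - start)  # one full day minus a second
```

The fixed 86399-second offset assumes a 24-hour day, so on a daylight-saving transition day the upper bound can be off by an hour.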
@@ -54,6 +73,9 @@ def fetch_recent_tracks(
             all_tracks.extend(tracks)
 
             total_pages = int(data.get("recenttracks", {}).get("@attr", {}).get("totalPages", 1))
+            if progress_callback:
+                progress_callback(current_page, total_pages)
+
             if pages and current_page >= pages:
                 break
             if current_page >= total_pages:
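
The callback added in this hunk lets a caller observe per-page progress without the fetch loop knowing how it is displayed. The sketch below shows the pattern in isolation; `fetch_pages` is a hypothetical stand-in for the paged API loop and makes no network calls.

```python
from typing import Callable, Optional


def fetch_pages(
    total_pages: int,
    progress_callback: Optional[Callable[[int, int], None]] = None,
) -> list[int]:
    """Stand-in for a paged API fetch: 'downloads' pages 1..total_pages."""
    fetched: list[int] = []
    for current_page in range(1, total_pages + 1):
        fetched.append(current_page)  # a real client would request the page here
        if progress_callback:
            # Report (current_page, total_pages) after each page.
            progress_callback(current_page, total_pages)
    return fetched


def print_progress(current_page: int, total_pages: int) -> None:
    """Example consumer: a CLI might print, a UI might update a progress bar."""
    print(f"Fetched page {current_page} of {total_pages}")


pages = fetch_pages(3, progress_callback=print_progress)
```

Keeping the callback optional means existing callers that pass nothing keep working unchanged.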
@@ -87,52 +109,152 @@ def save_tracks_to_csv(self, tracks: list[dict], filename: Optional[str] = None)
         print(f"Saved {len(df)} tracks to {filename}")
 
 
-def main() -> None:
-    parser = argparse.ArgumentParser(description="Fetch Last.fm listening history.")
-    parser.add_argument(
-        "--user", help="Last.fm username (defaults to AUTOBIO_LASTFM_USERNAME env var)"
-    )
-    parser.add_argument("--pages", type=int, help="Limit number of pages to fetch")
-    parser.add_argument("--from_date", help="Start date (YYYY-MM-DD)")
-    parser.add_argument("--to_date", help="End date (YYYY-MM-DD)")
+def _parse_date(date_str: str, label: str) -> Optional[int]:
+    """Parse a YYYY-MM-DD date string and return a Unix timestamp.
+
+    Args:
+        date_str: Date string in YYYY-MM-DD format.
+        label: Human-readable label used in error messages (e.g. "from_date").
+
+    Returns:
+        Unix timestamp as int, or None if date_str is empty.
+
+    Raises:
+        SystemExit: If the date string is not in the expected format.
+    """
+    if not date_str:
+        return None
+    try:
+        return int(time.mktime(time.strptime(date_str, "%Y-%m-%d")))
+    except ValueError as exc:
+        print(f"Error: Invalid {label} format '{date_str}'. Use YYYY-MM-DD.")
+        raise SystemExit(1) from exc
 
-    args = parser.parse_args()
 
-    api_key = os.getenv("AUTOBIO_LASTFM_API_KEY")
-    api_secret = os.getenv("AUTOBIO_LASTFM_API_SECRET")
-    username = args.user or os.getenv("AUTOBIO_LASTFM_USERNAME")
+def _run_fetch(args: argparse.Namespace) -> None:
+    """Execute the ``fetch`` subcommand.
 
-    if not all([api_key, api_secret, username]):
+    Routes to the named plugin's fetch() method if it supports programmatic
+    retrieval, or prints manual download instructions if it does not.
+
+    Args:
+        args: Parsed CLI arguments including ``plugin``, ``output``, ``pages``,
+            ``from_date``, and ``to_date``.
+    """
+    from plugins.sources import REGISTRY, load_builtin_plugins
+
+    load_builtin_plugins()
+
+    plugin_id: str = args.plugin
+    if plugin_id not in REGISTRY:
+        available = ", ".join(sorted(REGISTRY))
+        print(f"Error: Unknown plugin '{plugin_id}'. Available plugins: {available}")
+        raise SystemExit(1)
+
+    plugin = REGISTRY[plugin_id]()
+
+    if not plugin.FETCHABLE:
+        print(f"{plugin.DISPLAY_NAME} does not support automatic fetching.\n")
+        print(plugin.get_manual_download_instructions())
+        return
+
+    # Validate env vars before attempting the fetch.
+    env_vars = plugin.get_fetch_env_vars()
+    missing = [v for v in env_vars if not os.getenv(v["var"])]
+    if missing:
+        print("Error: Missing required configuration for fetching. Set the following:\n")
+        for v in missing:
+            print(f" {v['var']}: {v['description']}")
         print(
-            "Error: AUTOBIO_LASTFM_API_KEY, AUTOBIO_LASTFM_API_SECRET, and "
-            "AUTOBIO_LASTFM_USERNAME must be set."
+            f"\nThen re-run: python autobiographer.py fetch {plugin_id}\n"
+            "See README.md for full configuration instructions."
         )
-        return
+        raise SystemExit(1)
 
-    from_ts = None
-    if args.from_date:
-        try:
-            # Beginning of the day
-            from_ts = int(time.mktime(time.strptime(args.from_date, "%Y-%m-%d")))
-        except ValueError:
-            print(f"Error: Invalid from_date format '{args.from_date}'. Use YYYY-MM-DD.")
-            return
-
-    to_ts = None
-    if args.to_date:
-        try:
-            # End of the day (23:59:59)
-            to_struct = time.strptime(args.to_date, "%Y-%m-%d")
-            to_ts = int(time.mktime(to_struct)) + 86399
-        except ValueError:
-            print(f"Error: Invalid to_date format '{args.to_date}'. Use YYYY-MM-DD.")
-            return
-
-    if not api_key or not api_secret or not username:
-        return
-    visualizer = Autobiographer(api_key, api_secret, username)
-    tracks = visualizer.fetch_recent_tracks(pages=args.pages, from_ts=from_ts, to_ts=to_ts)
-    visualizer.save_tracks_to_csv(tracks)
+    from_ts = _parse_date(args.from_date or "", "from_date")
+    to_ts_raw = _parse_date(args.to_date or "", "to_date")
+    # Shift to-date to end of day so the full day is included.
+    to_ts = to_ts_raw + 86399 if to_ts_raw is not None else None
+
+    print(f"Fetching {plugin.DISPLAY_NAME} data...")
+    plugin.fetch(
+        output_path=args.output or None,
+        pages=args.pages,
+        from_ts=from_ts,
+        to_ts=to_ts,
+    )
+
+
+def main() -> None:
+    """Entry point for the Autobiographer data-fetching CLI.
+
+    Subcommands
+    -----------
+    fetch <plugin>
+        Fetch data for the named plugin. For plugins that support programmatic
+        retrieval (e.g. ``lastfm``) this downloads and saves data locally.
+        For manual-export plugins (e.g. ``swarm``) this prints step-by-step
+        instructions for obtaining the data from the provider.
+
+    Examples
+    --------
+    ::
+
+        python autobiographer.py fetch lastfm
+        python autobiographer.py fetch lastfm --pages 5
+        python autobiographer.py fetch lastfm --from-date 2024-01-01
+        python autobiographer.py fetch swarm
+    """
+    parser = argparse.ArgumentParser(
+        description="Autobiographer data-fetching CLI.",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    subparsers = parser.add_subparsers(dest="command", metavar="COMMAND")
+
+    fetch_parser = subparsers.add_parser(
+        "fetch",
+        help="Fetch or get instructions for a plugin's data.",
+        description=(
+            "Fetch data for a registered source plugin. "
+            "Plugins that support automatic retrieval will download and save data. "
+            "Plugins that require a manual export will print step-by-step instructions."
+        ),
+    )
+    fetch_parser.add_argument(
+        "plugin",
+        metavar="PLUGIN",
+        help="Plugin ID to target (e.g. lastfm, swarm).",
+    )
+    fetch_parser.add_argument(
+        "--output",
+        metavar="PATH",
+        help="Output file or directory path (overrides the plugin's default location).",
+    )
+    fetch_parser.add_argument(
+        "--pages",
+        type=int,
+        metavar="N",
+        help="Limit to N pages of results (Last.fm only).",
+    )
+    fetch_parser.add_argument(
+        "--from-date",
+        dest="from_date",
+        metavar="YYYY-MM-DD",
+        help="Only fetch records on or after this date.",
+    )
+    fetch_parser.add_argument(
+        "--to-date",
+        dest="to_date",
+        metavar="YYYY-MM-DD",
+        help="Only fetch records on or before this date.",
+    )
+
+    args = parser.parse_args()
+
+    if args.command == "fetch":
+        _run_fetch(args)
+    else:
+        parser.print_help()
 
 
 if __name__ == "__main__":
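
The dispatch in `_run_fetch` depends on a plugin registry and a per-plugin `FETCHABLE` flag. The sketch below shows one way such a registry could be wired up. `REGISTRY`, `register`, `FETCHABLE`, `DISPLAY_NAME`, `fetch()`, and `get_manual_download_instructions()` are names that appear in this commit; the `PLUGIN_ID` attribute, the decorator body, the two toy plugin classes, and the `run_fetch` helper are assumptions made for illustration and are not the project's code.

```python
REGISTRY: dict[str, type] = {}


def register(cls: type) -> type:
    """Class decorator: add a plugin class to REGISTRY under its PLUGIN_ID.

    PLUGIN_ID is an assumed attribute name used only in this sketch.
    """
    REGISTRY[cls.PLUGIN_ID] = cls
    return cls


@register
class ToyFetchablePlugin:
    PLUGIN_ID = "toy_api"
    DISPLAY_NAME = "Toy API"
    FETCHABLE = True

    def fetch(self, **kwargs) -> str:
        # A real plugin would download and save data here.
        return f"fetched with {kwargs}"


@register
class ToyManualPlugin:
    PLUGIN_ID = "toy_export"
    DISPLAY_NAME = "Toy Export"
    FETCHABLE = False

    def get_manual_download_instructions(self) -> str:
        return "1. Request an export from the provider.\n2. Save it under data/."


def run_fetch(plugin_id: str, **kwargs) -> str:
    """Route to fetch() for fetchable plugins, else return manual steps."""
    if plugin_id not in REGISTRY:
        available = ", ".join(sorted(REGISTRY))
        raise SystemExit(f"Unknown plugin '{plugin_id}'. Available plugins: {available}")
    plugin = REGISTRY[plugin_id]()
    if not plugin.FETCHABLE:
        return plugin.get_manual_download_instructions()
    return plugin.fetch(**kwargs)


print(run_fetch("toy_export"))
print(run_fetch("toy_api", pages=5))
```

With this shape the CLI needs no per-plugin branches: adding a data source means registering one more class.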
