0.4.0-alpha: save on windows, remove form history, cli (#44)
breaking/important changes:
* bugfix: stream didn't print newline after each item
* removed form-history backup, see #43
* lib: `browserexport.save.backup_history` can return None, if you passed `to="-"` (this tries to print the database to STDOUT)
New Features/Improvements:
* Supports lots more windows paths
* Added opera, librewolf, floorp
* better CLI error handling/help text
* can parse jsonl, jsonl.gz, json.gz files
* can write database to STDOUT/read databases from STDIN
-  --form-history [firefox]  Browser name to backup form (input field)
-                            history for
-  --pattern TEXT            Pattern for the resulting timestamped
-                            filename, should include an str.format
-                            replacement placeholder
-  -p, --profile TEXT        Use to pick the correct profile to back up.
-                            If unspecified, will assume a single profile
-                            [default: *]
-  --path FILE               Specify a direct path to a database to back
-                            up
-  -t, --to DIRECTORY        Directory to store backup to [required]
-  --help                    Show this message and exit.
+  --pattern TEXT          Pattern for the resulting timestamped filename, should include an
+                          str.format replacement placeholder for the date [default:
+                          browser_name-{}.extension]
+  -p, --profile TEXT      Use to pick the correct profile to back up. If unspecified, will assume a
+                          single profile [default: *]
+  --path FILE             Specify a direct path to a database to back up
+  -t, --to DIRECTORY      Directory to store backup to. Pass '-' to print database to STDOUT
+                          [required]
+  -h, --help              Show this message and exit.
 ```

-Must specify one of `--browser`, `--form-history` or `--path`
+Must specify one of `--browser`, or `--path`

 After your browser history reaches a certain size, browsers typically remove old history over time, so I'd recommend backing up your history periodically, like:

 ```shell
-$ browserexport save -b firefox --to ~/data/browser_history
-$ browserexport save -b chrome --to ~/data/browser_history
-$ browserexport save -b safari --to ~/data/browser_history
+$ browserexport save -b firefox --to ~/data/browsing
+$ browserexport save -b chrome --to ~/data/browsing
+$ browserexport save -b safari --to ~/data/browsing
 ```

 That copies the sqlite databases which contains your history `--to` some backup directory.
@@ -99,7 +103,7 @@ If a browser you want to backup is Firefox/Chrome-like (so this would be able to

 ```shell
 $ browserexport save --path ~/.somebrowser/profile/places.sqlite \
-  --to ~/data/browser_history
+  --to ~/data/browsing
 ```

 The `--pattern` argument can be used to change the resulting filename for the browser, e.g. `--pattern 'places-{}.sqlite'` or `--pattern "$(uname)-{}.sqlite"`. The `{}` is replaced by the timestamp of the backup.
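
As a rough illustration of how such a pattern expands, the sketch below fills the `{}` placeholder with a timestamp using `str.format`. The `expand_pattern` helper and the `%Y%m%d%H%M%S` layout are assumptions made for illustration (the layout is inferred from backup filenames such as `places-20200828223058.sqlite`), not browserexport's actual implementation.

```python
from datetime import datetime


def expand_pattern(pattern: str, when: datetime) -> str:
    """Hypothetical helper: fill the single {} placeholder with a timestamp."""
    # assumed timestamp layout: YYYYMMDDHHMMSS
    return pattern.format(when.strftime("%Y%m%d%H%M%S"))


when = datetime(2020, 8, 28, 22, 30, 58)
print(expand_pattern("places-{}.sqlite", when))  # places-20200828223058.sqlite
```

A pattern without a `{}` placeholder would come back unchanged from `str.format`, so every backup would get the same filename; that is presumably why the help text asks for a placeholder.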
@@ -125,19 +129,7 @@ For Firefox Android [Fenix](https://github.com/mozilla-mobile/fenix/), the datab

 ### `inspect`/`merge`

-```
-Usage: browserexport inspect [OPTIONS] SQLITE_DB
-
-  Extracts visits from a single sqlite database
-
-  Provide a history database as the first argument
-  Drops you into a REPL to access the data
-
-Options:
-  -s, --stream  Stream JSON objects instead of printing a JSON list
-  -j, --json    Print result to STDOUT as JSON
-  --help        Show this message and exit.
-```
+These work very similarly, `inspect` is for a single database, `merge` is for multiple databases.
   -s, --stream  Stream JSON objects instead of printing a JSON list
   -j, --json    Print result to STDOUT as JSON
-  --help        Show this message and exit.
+  -h, --help    Show this message and exit.
 ```

-Logs are hidden by default. To show the debug logs set `export BROWSEREXPORT_LOGS=10` (uses [logging levels](https://docs.python.org/3/library/logging.html#logging-levels)) or pass the `--debug` flag.
 [D 210417 21:12:18 merge:38] merging information from 24 sources...
 [D 210417 21:12:18 parse:19] Reading visits from /home/sean/data/firefox/places-20200828223058.sqlite...
@@ -180,12 +172,35 @@ Use vis to interact with the data
 [1] ...
 ```

+You can also read from STDIN, so this can be used in conjunction with `save`, to merge databases you've backed up and combine your current browser history:
+Or, use [process substitution](https://tldp.org/LDP/abs/html/process-sub.html) to save multiple dbs in parallel and then merge them:
+
+```bash
+$ browserexport merge <(browserexport save -b firefox -t -) <(browserexport save -b chrome -t -)
+```
+
+Logs are hidden by default. To show the debug logs set `export BROWSEREXPORT_LOGS=10` (uses [logging levels](https://docs.python.org/3/library/logging.html#logging-levels)) or pass the `--debug` flag.
-Merged files like `history.json` above can also be used as inputs files themselves, this reads those by mapping the JSON onto the `Visit` schema directly. If you don't care about keeping the raw databases for any other auxiliary info like form, bookmark data, or [from_visit](https://github.com/seanbreckenridge/browserexport/issues/30) info and just want the URL, visit date and metadata, you could use `merge` to periodically merge the bulky `.sqlite` files into a JSON dump:
+Merged files like `history.json` can also be used as inputs files themselves, this reads those by mapping the JSON onto the `Visit` schema directly.
+
+In addition to `.json` files, this can parse `.jsonl` ([JSON lines](http://jsonlines.org/)) files, which are files which contain newline delimited JSON objects. This allows you to parse JSON objects one at a time, instead of loading the entire file into memory. The `.jsonl` file can be generated with the `--stream` flag:
+_Additionally_, this can parse gzipped versions of those files - files like `history.json.gz` or `history.jsonl.gz`
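
To make the "one object at a time" point concrete, here is a minimal sketch of writing and lazily reading newline-delimited JSON, with optional gzip compression chosen by file suffix. It uses only the Python standard library; the helper names (`write_json_lines`, `iter_json_lines`) and the single `url` field are assumptions for illustration, not browserexport's own parsing code or the full `Visit` schema.

```python
import gzip
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterable, Iterator


def _open_text(path: Path, mode: str):
    """Open a plain or gzip-compressed file in text mode, based on its suffix."""
    if path.suffix == ".gz":
        return gzip.open(path, mode + "t", encoding="utf-8")
    return open(path, mode, encoding="utf-8")


def write_json_lines(path: Path, items: Iterable[Dict[str, Any]]) -> None:
    """Write one JSON object per line (newline-delimited JSON)."""
    with _open_text(path, "w") as f:
        for item in items:
            f.write(json.dumps(item) + "\n")


def iter_json_lines(path: Path) -> Iterator[Dict[str, Any]]:
    """Lazily yield one parsed object per line, without loading the whole file."""
    with _open_text(path, "r") as f:
        for line in f:
            if line.strip():  # ignore blank lines
                yield json.loads(line)


# the "url" field is illustrative, not the full Visit schema
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "history.jsonl.gz"
    write_json_lines(target, [{"url": "https://example.com"}, {"url": "https://example.org"}])
    urls = [visit["url"] for visit in iter_json_lines(target)]

print(urls)  # ['https://example.com', 'https://example.org']
```

Because `iter_json_lines` is a generator, only the current line is held in memory at any time, which is the property that makes JSONL attractive for large history dumps.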
+
+If you don't care about keeping the raw databases for any other auxiliary info like form, bookmark data, or [from_visit](https://github.com/seanbreckenridge/browserexport/issues/30) info and just want the URL, visit date and metadata, you could use `merge` to periodically merge the bulky `.sqlite` files into a gzipped JSONL dump:

 ```bash
-cd ~/data/browsing
 # backup databases
 rsync -Pavh ~/data/browsing ~/.cache/browsing
-# merge all sqlite databases into a single JSON file
 I do this every couple months with a script [here](https://github.com/seanbreckenridge/bleanser/blob/master/bin/merge-browser-history), and then sync my old databases to a harddrive for more long-term storage

+## Shell Completion
+
+This uses `click`, which supports [shell completion](https://click.palletsprojects.com/en/8.1.x/options/) for `bash`, `zsh` and `fish`. To generate the completion on startup, put one of the following in your shell init file (`.bashrc`/`.zshrc` etc)
+_BROWSEREXPORT_COMPLETE=fish_source browserexport | source # fish
+```
+
+Instead of `eval`ing, you could of course save the generated completion to a file and/or lazy load it in your shell config, see [bash completion docs](https://github.com/scop/bash-completion/blob/master/README.md#faq), [zsh functions](https://zsh.sourceforge.io/Doc/Release/Functions.html), [fish completion docs](https://fishshell.com/docs/current/completions.html). For example for `zsh` that might look like:
+# update fpath to include the directory you saved the completion file to
+fpath=(~/.config/zsh/functions $fpath)
+autoload -Uz compinit && compinit
+```
+
 ## HPI

 If you want to cache the merged results, this has a [module in HPI](https://github.com/karlicoss/HPI) which handles locating/caching and querying the results. See [setup](https://github.com/karlicoss/HPI/blob/master/doc/SETUP.org#install-main-hpi-package) and [module setup](https://github.com/karlicoss/HPI/blob/master/doc/MODULES.org#mybrowser).
@@ -257,7 +306,7 @@ from browserexport.merge import read_and_merge
-You can also use [`sqlite_backup`](https://github.com/seanbreckenridge/sqlite_backup) to copy your current browser history into a sqlite connection in memory, without ever writing to disk:
+You can also use [`sqlite_backup`](https://github.com/seanbreckenridge/sqlite_backup) to copy your current browser history into a sqlite connection in memory, as a `sqlite3.Connection`
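
To illustrate the general idea of loading a database into an in-memory `sqlite3.Connection`, here is a minimal sketch that uses only the standard library's `sqlite3.Connection.backup`. It is not `sqlite_backup`'s API, and the `copy_to_memory` helper and the `visits` table are made up for the demonstration.

```python
import sqlite3
import tempfile
from pathlib import Path


def copy_to_memory(db_path: Path) -> sqlite3.Connection:
    """Copy an on-disk SQLite database into a new in-memory connection."""
    # open the source read-only so the original file is never modified
    source = sqlite3.connect(db_path.resolve().as_uri() + "?mode=ro", uri=True)
    dest = sqlite3.connect(":memory:")
    try:
        source.backup(dest)  # stdlib online-backup API (Python 3.7+)
    finally:
        source.close()
    return dest


# build a throwaway database to copy (hypothetical schema)
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "demo.sqlite"
    conn = sqlite3.connect(db)
    conn.execute("CREATE TABLE visits (url TEXT)")
    conn.execute("INSERT INTO visits VALUES ('https://example.com')")
    conn.commit()
    conn.close()

    mem = copy_to_memory(db)

# the in-memory copy stays usable after the source file is gone
rows = mem.execute("SELECT url FROM visits").fetchall()
mem.close()
print(rows)  # [('https://example.com',)]
```

A plain `backup` like this assumes the source file can be opened read-only while the browser is running; a dedicated tool may handle locked or in-use databases more carefully.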