Commit c908875

Update README.md
1 parent 93a58dd commit c908875

1 file changed

Lines changed: 10 additions & 9 deletions

README.md

@@ -42,8 +42,9 @@ add specific fields (like DOIs, or ISSN) to a manually curated bib file.
 
 It is designed to be as simple to use as possible: just give it a bib file and
 let **btac** work its magic! It combines multiple sources and runs consistency
-and normalization checks on the added fields (check that URLs lead to a valid
-webpage, that DOIs exist at https://dx.doi.org/).
+and normalization checks on the added fields (only adds URLs that lead to a valid
+webpage, DOIs that exist at https://dx.doi.org/, ISSN/ISBN with valid check
+digits...).
 
 It attempts to complete a BibTeX file by querying the following domains:
 - [openalex.org](https://openalex.org/): ~240 million entries
@@ -101,11 +102,11 @@ entries that don't have one of those two fields *will not* be completed.
 for full details)
 - If the year is known, entries with different years will also not match.
 
-**Disclaimers**
+**Disclaimers:**
 
 - There is no guarantee that the script will find matches for your entries, or
 that the websites will have any data to add to your entries, (or even that the
-website data is correct, but that's not for me to say...)
+website data is correct).
 
 - The script is designed to minimize the chance of false positives - that is
 adding data from another similar-ish entry to your entry. If you find any such
@@ -119,21 +120,21 @@ by performing a majority vote among the sources. To do so it uses smart
 normalization and merging tactics for each field:
 - Authors (and editors) match if they have same last names and, if both first
 names present, the first name of one is equal/an abbreviation of the other.
-Author list match they have at least one author in common.
+Author lists match they have at least one author in common.
 - ISSN and ISBN are normalized and have their check digits verified. ISBN are converted
-to their 13 digit representation
+to their 13 digit representation.
 - URL and DOI are checked for valid format, and further validated by querying
 them online to ensure they exist. DOI are normalized to strip any leading URL
 and converted to lowercase.
 - Many fields match with abbreviation detection (journal, institution, booktitle,
 organization, publisher, school and series). So `ACM` will match
-`Association for Computer Machinery`
-- Pages are normalized to use `--` as separator
+`Association for Computer Machinery`.
+- Pages are normalized to use `--` as separator.
 - All other fields are compared excluding case and punctuation.
 
 The script will not overwrite any user given non-empty fields, unless the
 `-f/--force-overwrite` flag is given. If you want to check what fields are
-added, you can use `-v/--verbose` to have them printed to stdout (with
+added, you can use `-v/--verbose` to have them printed to standard output (with
 source information), or `-p/--prefix` to have the new fields be prefixed with
 `BTAC` in the output file.
 
