Commit e29c3af

Fix issues #50, #53, and #58
Issue #50: Add validation for invalid page separators
- Add detection for en-dash, em-dash, minus sign, and other invalid dash characters in page numbers
- Provide suggested corrections by replacing invalid dashes with standard hyphens
- Return error message when invalid dash characters are detected

Issue #53: Fix capitalization validation with surrounding punctuation
- Add strip_leading_trailing_non_letters() helper function to extract leading/trailing non-alphabetic characters
- Update format_title() to strip punctuation before checking caps.txt, then reapply after adding braces
- Update format_journal_name() with same fix
- Fixes incorrect brace placement like {(BHI}) → ({BHI})

Issue #58: Add MacOS TeXShop/TeX Live setup instructions
- Add new section "MacOS Setup with TeXShop and TeX Live" to README
- Document TeX Live's ~/Library/texmf/ approach for GUI applications
- Explain why environment variables don't work with Mac GUI apps
- Provide step-by-step instructions with symbolic link setup
- Update table of contents with new subsections
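
The Issue #50 check described above can be sketched in isolation. This is a minimal illustration, not the repository's code: `INVALID_DASHES` and `find_invalid_dashes` are hypothetical names, and the sketch assumes that every listed dash should be replaced in a single pass, whereas the committed `valid_pages` returns as soon as it finds the first offending character.

```python
# Hypothetical stand-alone sketch of the Issue #50 check; names are illustrative.
INVALID_DASHES = {
    "\u2013": "en-dash",
    "\u2014": "em-dash",
    "\u2212": "minus sign",
    "\u2010": "Unicode hyphen",
    "\u2011": "non-breaking hyphen",
}


def find_invalid_dashes(pages):
    """Return (ok, found_names, suggestion) for a BibTeX pages string."""
    # Names of every non-ASCII dash that occurs in the field.
    found = [name for ch, name in INVALID_DASHES.items() if ch in pages]
    # Suggest a fix by mapping each listed dash to an ASCII hyphen.
    suggestion = pages
    for ch in INVALID_DASHES:
        suggestion = suggestion.replace(ch, "-")
    return (not found), found, suggestion


if __name__ == "__main__":
    print(find_invalid_dashes("123\u2013145"))
    print(find_invalid_dashes("123--145"))
```

Replacing all dash variants at once means a field that mixes, say, an en-dash and a minus sign gets a fully corrected suggestion; whether that is preferable to the early return in the commit is a design choice this sketch does not settle.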
1 parent ce48365 commit e29c3af

2 files changed: +117 -23 lines changed


README.md

Lines changed: 26 additions & 0 deletions
@@ -16,6 +16,8 @@ The main bibtex file ([cdl.bib](https://raw.githubusercontent.com/ContextLab/CDL
 - [`compare`](#compare)
 - [`commit`](#commit)
 - [Using the bibtex file as a common bibliography for all *local* LaTeX files](#using-the-bibtex-file-as-a-common-bibliography-for-all-local-latex-files)
+- [General Unix/Linux Setup (Command Line Compilation)](#general-unixlinux-setup-command-line-compilation)
+- [MacOS Setup with TeXShop and TeX Live](#macos-setup-with-texshop-and-tex-live)
 - [Using the bibtex file on Overleaf](#using-the-bibtex-file-on-overleaf)
 - [Acknowledgements](#acknowledgements)

@@ -190,6 +192,8 @@ called, and a pull request must be submitted in order to integrate the changes
 into the main ContextLab fork.

 # Using the bibtex file as a common bibliography for all *local* LaTeX files
+
+## General Unix/Linux Setup (Command Line Compilation)
 1. Check out this repository to your home directory
 2. Add the following lines to your `~/.bash_profile` (or `~/.zshrc`, etc.):
 ```
@@ -208,6 +212,28 @@ latex filename
 pdflatex filename
 ```

+## MacOS Setup with TeXShop and TeX Live
+
+Mac GUI applications like TeXShop don't execute within your shell environment, which means the environment variable approach described above won't work when compiling through the TeXShop GUI. Instead, use TeX Live's built-in support for personal files:
+
+1. Check out this repository (we'll assume you cloned it to your home directory: `~/CDL-bibliography`)
+2. Create the TeX Live personal texmf directory structure for bibliography files:
+```bash
+mkdir -p ~/Library/texmf/bibtex/bib
+```
+3. Create a symbolic link from your personal texmf directory to the CDL-bibliography repository. **Important**: You must use the absolute path (not relative paths or `~`):
+```bash
+ln -s /Users/YOUR_USERNAME/CDL-bibliography/cdl.bib ~/Library/texmf/bibtex/bib/cdl.bib
+```
+Replace `YOUR_USERNAME` with your actual macOS username, or use `$HOME` instead:
+```bash
+ln -s $HOME/CDL-bibliography/cdl.bib ~/Library/texmf/bibtex/bib/cdl.bib
+```
+4. In your .tex file, use the line `\bibliography{cdl}` to generate a bibliography using the citation keys defined in cdl.bib
+5. Compile your document using TeXShop's GUI or from the command line
+
+**Note**: This approach also works for command-line compilation, so you don't need to set up the environment variables if you use this method.
+
 # Using the bibtex file on Overleaf
 You can use [git submodules](https://blog.github.com/2016-02-01-working-with-submodules/) to maintain a reference to the cdl.bib file in this repository that you can easily keep in sync with latest version. This avoids the need to maintain a separate .bib file in each Overleaf project.
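
The directory and symbolic-link steps in the macOS section of the README diff above could also be scripted. The following is a rough sketch under stated assumptions, not part of the commit: `link_bibliography` is a hypothetical helper, and the `home` parameter exists only so the function can be tried against a scratch directory instead of a real home folder.

```python
from pathlib import Path


def link_bibliography(repo_dir, home=None):
    """Symlink repo_dir/cdl.bib into the personal texmf tree under home."""
    home = Path(home) if home is not None else Path.home()
    # Resolve to an absolute path so the link target does not depend on the cwd.
    source = (Path(repo_dir) / "cdl.bib").resolve()
    bib_dir = home / "Library" / "texmf" / "bibtex" / "bib"
    bib_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    link = bib_dir / "cdl.bib"
    if link.is_symlink():
        link.unlink()  # replace an older link so reruns do not fail
    # Raises FileExistsError if a regular file already sits at this path.
    link.symlink_to(source)
    return link
```

Calling `link_bibliography(Path.home() / "CDL-bibliography")` would mirror the shell commands in the README, assuming the repository was cloned to that location.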

bibcheck/helpers.py

Lines changed: 91 additions & 23 deletions
@@ -427,6 +427,21 @@ def roman2int(s):


 def valid_pages(p):
+    # Check for invalid dash characters (en-dash, em-dash, minus sign, etc.)
+    invalid_dashes = {
+        '\u2013': 'en-dash (–)',
+        '\u2014': 'em-dash (—)',
+        '\u2212': 'minus sign (−)',
+        '\u2010': 'hyphen (‐)',
+        '\u2011': 'non-breaking hyphen (‑)',
+    }
+
+    for dash_char, dash_name in invalid_dashes.items():
+        if dash_char in p:
+            # Return False with error message
+            suggested_fix = p.replace(dash_char, '-')
+            return False, [p, suggested_fix]
+
     valid, kind, val = valid_page(p)
     if valid: #"single" page
         return True, [p, p]
@@ -483,21 +498,39 @@ def format_journal_name(n, key=journal_key, force_caps=force_caps):
     #words = ['-'.join([format_journal_name(x) for x in w.split('-')]) if len(w.split('-')) > 1 else w for w in words] #deal with hyphens

     for i, w in enumerate(words):
-        words[i] = w.capitalize()
-
-        #deal with hyphens
-        if len(w.split('-')) > 1:
-            words[i] = '-'.join(format_journal_name(c, key=key, force_caps=force_caps) for c in w.split('-'))
-
-        if (i > 0) and (w.lower() in uncaps):
-            words[i] = words[i].lower()
-
-        correct_caps = [f for f in force_caps if f.lower() == remove_non_letters(w.lower())]
+        # Check if word is fully braced (starts and ends with braces around the whole word)
+        is_fully_braced = before_letters(w, '{') and after_letters(w, '}')
+
+        if is_fully_braced:
+            # Remove outer braces for processing, we'll check caps on the content
+            unbraced = remove_curlies(w, join=' ')
+            prefix, core, suffix = strip_leading_trailing_non_letters(unbraced)
+        else:
+            # Not fully braced, process normally
+            prefix, core, suffix = strip_leading_trailing_non_letters(w)
+
+        # Skip if no core (word is only punctuation)
+        if not core:
+            words[i] = w
+            continue
+
+        correct_caps = [f for f in force_caps if f.lower() == remove_non_letters(core.lower())]
+
         if len(correct_caps) >= 1:
             c = correct_caps[-1]
             if not (c[0] == '{' and c[-1] == '}'):
-                c = insert_non_letters('{' + c + '}', remove_curlies(w, join=' '))
-            words[i] = c
+                c = insert_non_letters('{' + c + '}', remove_curlies(core, join=' '))
+            # Add back prefix and suffix (but not the outer braces we removed, since c already has them)
+            words[i] = prefix + c + suffix
+        else:
+            words[i] = w.capitalize()
+
+            #deal with hyphens
+            if len(w.split('-')) > 1:
+                words[i] = '-'.join(format_journal_name(c, key=key, force_caps=force_caps) for c in w.split('-'))
+
+            if (i > 0) and (w.lower() in uncaps):
+                words[i] = words[i].lower()
     return ' '.join(words)

 #rearrange author name (first middle last suffix)
@@ -624,7 +657,36 @@ def before_letters(s, c): #true if c occurs before the first letter in s
 def after_letters(s, c): #true if c occurs after the last letter in s
     return before_letters(s[::-1], c)

-def insert_non_letters(x, y):
+def strip_leading_trailing_non_letters(s):
+    """Strip leading and trailing non-alphabetic characters from a string.
+    Returns (prefix, core, suffix) where core contains only letters and internal punctuation."""
+    if len(s) == 0:
+        return '', '', ''
+
+    # Find first letter
+    first_letter = -1
+    for i, c in enumerate(s):
+        if c.lower() in ascii_lowercase:
+            first_letter = i
+            break
+
+    if first_letter == -1: # No letters found
+        return s, '', ''
+
+    # Find last letter
+    last_letter = -1
+    for i in range(len(s) - 1, -1, -1):
+        if s[i].lower() in ascii_lowercase:
+            last_letter = i
+            break
+
+    prefix = s[:first_letter]
+    core = s[first_letter:last_letter + 1]
+    suffix = s[last_letter + 1:]
+
+    return prefix, core, suffix
+
+def insert_non_letters(x, y):
     z = ''
     i = 0 #position in x
     j = 0 #position in y
@@ -641,15 +703,15 @@ def insert_non_letters(x, y):
             j += 1
         else: #x[i] and y[j] are both in ascii_lowercase but x[i] != x[j] -- throw an error
             raise Exception(f'"{y}" is not a compatable template for "{x}"')
-
+
     #insert trailing punctuation from y
     if (j < len(y)) and (remove_non_letters(y[j:]) == ''):
         z += y[j:]
-
+
     #insert trailing punctuation from x
     if (i < len(x)) and (remove_non_letters(x[i:]) == ''):
         z += x[i:]
-
+
     return z

 def format_title(title):
@@ -689,16 +751,22 @@ def ends_in_punctuation(s):

         #leave "a" and specified caps unchanged
         if w.lower() == 'a' or (before_letters(w, '{') and after_letters(w, '}')):
-            reformatted_title.append(w)
+            reformatted_title.append(w)
         #if w contains curly braces, just append it unchanged
         elif (w.count('{') > 0) or (w.count('}') > 0):
-            reformatted_title.append(w)
+            reformatted_title.append(w)
         else:
-            caps_match = [f for f in force_caps if f.lower() == remove_non_letters(w.lower())]
-            if len(caps_match) > 0:
-                w = insert_non_letters('{' + caps_match[-1] + '}', w)
-            elif not ends_in_punctuation(prev_w):
-                w = w.lower()
+            # Strip leading/trailing non-letters before checking caps
+            prefix, core, suffix = strip_leading_trailing_non_letters(w)
+            # Only check core if it has letters
+            if core:
+                caps_match = [f for f in force_caps if f.lower() == remove_non_letters(core.lower())]
+                if len(caps_match) > 0:
+                    # Apply braces only to the core, then add back prefix and suffix
+                    core_with_braces = insert_non_letters('{' + caps_match[-1] + '}', remove_curlies(core, join=' '))
+                    w = prefix + core_with_braces + suffix
+                elif not ends_in_punctuation(prev_w):
+                    w = w.lower()
             reformatted_title.append(w)
         prev_w = w
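
The brace-placement fix for Issue #53 can be illustrated with a simplified, self-contained sketch. It is not the repository's implementation: `split_non_letters` and `brace_forced_caps` are hypothetical names, and the sketch assumes an exact case-insensitive match on the core, whereas the committed code also passes the core through `remove_non_letters` and `insert_non_letters` to handle punctuation inside the word.

```python
from string import ascii_letters


def split_non_letters(word):
    """Split word into (prefix, core, suffix) around its first and last ASCII letters."""
    positions = [i for i, ch in enumerate(word) if ch in ascii_letters]
    if not positions:
        # No letters at all: treat the whole word as prefix, as the commit does.
        return word, "", ""
    first, last = positions[0], positions[-1]
    return word[:first], word[first:last + 1], word[last + 1:]


def brace_forced_caps(word, force_caps):
    """Wrap only the core in braces when it matches a forced-capitalization entry."""
    prefix, core, suffix = split_non_letters(word)
    for entry in force_caps:
        if core and entry.lower() == core.lower():
            # Punctuation stays outside the braces: "(" + "{BHI}" + ")".
            return prefix + "{" + entry + "}" + suffix
    return word


if __name__ == "__main__":
    print(brace_forced_caps("(bhi)", ["BHI"]))
```

Splitting first and bracing second is what keeps the parentheses outside the braces, which is the `({BHI})` result the commit message describes.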
