Commit c65133f

Merge pull request #39 from uic-evl/mep
Mep
2 parents 43602cc + 2e1d1e7 commit c65133f

8 files changed: +5565 -294 lines changed


_bibliography/README-bib.md

Lines changed: 41 additions & 1 deletion
@@ -1 +1,41 @@
-# Do not make changes to this file, if you have additions or corrections to the EVL master bibliography make them in its repo ([https://github.com/uic-evl/evl_biblio](https://github.com/uic-evl/evl_biblio))
+# EVL Bibliography Directory
+
+## Important Note
+Do not make changes to this file, if you have additions or corrections to the EVL master bibliography make them in its repo ([https://github.com/uic-evl/evl_biblio](https://github.com/uic-evl/evl_biblio))
+
+## Files in this Directory
+
+### Bibliography Files
+- **papers.bib** - Main bibliography file containing all EVL publications in BibTeX format. This is synchronized from the EVL bibliography repository.
+- **papers-enhanced.bib** - Enhanced version of papers.bib with additional fields (bibtex_show, selected) added by enhanceBib.py for website display purposes
+- **sage3.bib** - SAGE3-specific bibliography file for SAGE3-related publications
+
+### Scripts
+- **enhanceBib.py** - Python script that processes BibTeX files to add website-specific fields:
+  - `bibtex_show = {true}` - enables BibTeX display on website
+  - `selected = {false}` - marks papers as selected/featured
+  - Usage: `python enhanceBib.py <bibtex_file>`
+  - Outputs: `<filename>-enhanced.bib`
+
+- **cleanbib.py** - Python script that cleans BibTeX files:
+  - Removes fields with empty values
+  - Converts abbreviated month names to full names (e.g., apr → April, jan → January)
+  - Handles both braced and unbraced month formats
+  - Usage: `python cleanbib.py <bibtex_file>`
+  - Outputs: `<filename>-updated.bib`
+
+- **checkBranches.sh** - Shell script to compare papers.bib across all git branches
+  - Shows which branches have identical versions to master
+  - Identifies branches with different versions
+  - Lists branches missing the file
+  - Useful for tracking bibliography synchronization across branches
+
+- **tidyBib.sh** - BibTeX formatting and cleanup script using bibtex-tidy
+  - Formats BibTeX files with EVL-specific configuration
+  - Removes duplicates, empty fields, and standardizes formatting
+  - Sorts entries by year and applies consistent field ordering
+  - Setup: `npm install -g bibtex-tidy`
+  - Usage: `./tidyBib.sh <file.bib>`
+
+### Documentation
+- **README-bib.md** - This file, explaining the directory structure and file purposes
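
Note on enhanceBib.py: the script itself is not shown in this part of the diff, so the following is only a minimal Python sketch of the behavior the README describes (adding `bibtex_show = {true}` and `selected = {false}` to entries that lack them). The function name `add_default_fields`, the `DEFAULT_FIELDS` table, and the line-based approach are assumptions made for illustration; the real script may work differently. The sketch also assumes each entry starts on its own line in the form `@type{key,`.

```python
import re

# Hypothetical defaults, taken from the README's description of enhanceBib.py.
DEFAULT_FIELDS = {"bibtex_show": "true", "selected": "false"}

# Matches the opening line of a BibTeX entry such as "@article{key,".
ENTRY_START = re.compile(r"^\s*@(\w+)\s*\{")


def add_default_fields(bibtex_text, defaults=DEFAULT_FIELDS):
    """Insert each missing default field right after an entry's opening line."""
    lines = bibtex_text.splitlines()
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        m = ENTRY_START.match(line)
        # Skip non-entry lines and the special @comment/@string/@preamble blocks.
        if not m or m.group(1).lower() in ("comment", "string", "preamble"):
            continue
        # Gather this entry's body (everything up to the next entry start).
        body = []
        for nxt in lines[i + 1:]:
            if ENTRY_START.match(nxt):
                break
            body.append(nxt)
        body_text = "\n".join(body)
        for name, value in defaults.items():
            pattern = rf"^\s*{re.escape(name)}\s*="
            if not re.search(pattern, body_text, re.MULTILINE | re.IGNORECASE):
                out.append(f"  {name} = {{{value}}},")
    return "\n".join(out) + "\n"


sample = "@article{doe2020,\n  title = {A Title},\n  selected = {true}\n}\n"
print(add_default_fields(sample))
```

In this sketch an existing `selected` value is left untouched and only the missing `bibtex_show` field is added, which keeps any manual curation in the input file.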

_bibliography/checkBranches.sh

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+#!/bin/sh
+
+# File to compare
+targetFile="./papers.bib"
+
+echo "🔍 Comparing '$targetFile' across all branches..."
+echo "⚠️ Reminder: Run 'git fetch --all --prune' first to ensure remotes are up to date."
+echo
+
+results=$(mktemp)
+
+# Collect hash per branch
+for branch in $(git branch -a --format='%(refname:short)' | sort -u); do
+  fileHash=$(git ls-tree -r "$branch" -- "$targetFile" | awk '{print $3}')
+  if [ -n "$fileHash" ]; then
+    echo "$fileHash $branch" >> "$results"
+  else
+    echo "MISSING $branch" >> "$results"
+  fi
+done
+
+echo "===== Branches analyzed ====="
+git branch -a --format='%(refname:short)'
+echo
+
+# Get master hash
+masterHash=$(grep " master$" "$results" | awk '{print $1}')
+
+if [ -z "$masterHash" ]; then
+  echo "❌ Could not find '$targetFile' in master branch."
+  rm -f "$results"
+  exit 1
+fi
+
+echo "===== Version Groups ====="
+
+# Same as master
+sameAsMaster=$(grep "^$masterHash " "$results" | awk '{print $2}' | tr '\n' ' ')
+echo "Same as master: $sameAsMaster"
+echo
+
+# Different from master
+grep -v '^MISSING' "$results" | grep -v "^$masterHash " | awk '
+{
+  hash=$1
+  branch=$2
+  groups[hash]=groups[hash] branch " "
+}
+END {
+  if (length(groups) == 0) {
+    print "No branches differ from master."
+  } else {
+    groupNum=1
+    for (h in groups) {
+      print "Different group " groupNum ": " groups[h]
+      groupNum++
+    }
+  }
+}
+'
+echo
+
+# Missing file
+if grep -q '^MISSING' "$results"; then
+  echo "Branches missing file '$targetFile':"
+  grep '^MISSING' "$results" | cut -d' ' -f2- | tr '\n' ' '
+  echo
+fi
+
+rm -f "$results"
+echo
+echo "✅ Summary complete."
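
Note on the grouping step: checkBranches.sh records one `hash branch` pair per branch and then splits the branches into "same as master", "different", and "missing" with grep and awk. The same classification can be written as a small pure function, which may be easier to test. The Python sketch below is an illustration only and is not part of this commit; `group_by_hash`, its `records` argument, and the sample hashes are hypothetical.

```python
from collections import defaultdict


def group_by_hash(records, reference="master"):
    """Classify branches by the blob hash of the tracked file.

    records: iterable of (file_hash, branch); file_hash is None when the
    branch does not contain the file (the script's MISSING marker).
    Returns (same_as_reference, different_groups, missing).
    """
    by_hash = defaultdict(list)
    missing = []
    for file_hash, branch in records:
        if file_hash is None:
            missing.append(branch)
        else:
            by_hash[file_hash].append(branch)

    # Find the hash recorded for the reference branch, as the script does for master.
    ref_hash = next((h for h, bs in by_hash.items() if reference in bs), None)
    if ref_hash is None:
        raise LookupError(f"file not found in branch {reference!r}")

    same = by_hash.pop(ref_hash)
    return same, list(by_hash.values()), missing


records = [
    ("aaa111", "master"),
    ("aaa111", "origin/master"),
    ("bbb222", "mep"),
    (None, "gh-pages"),
]
same, different, missing = group_by_hash(records)
print("Same as master:", same)
print("Different groups:", different)
print("Missing:", missing)
```

As in the shell script, a missing reference branch is treated as an error rather than silently producing an empty comparison.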

_bibliography/cleanbib.py

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+#!/usr/bin/env python3
+import argparse
+import os
+import re
+
+# Mapping for month abbreviations to their full names.
+MONTH_MAPPING = {
+    'jan': 'January',
+    'january': 'January',
+    'feb': 'February',
+    'february': 'February',
+    'mar': 'March',
+    'march': 'March',
+    'apr': 'April',
+    'april': 'April',
+    'may': 'May',
+    'jun': 'June',
+    'june': 'June',
+    'jul': 'July',
+    'july': 'July',
+    'aug': 'August',
+    'august': 'August',
+    'sep': 'September',
+    'september': 'September',
+    'oct': 'October',
+    'october': 'October',
+    'nov': 'November',
+    'november': 'November',
+    'dec': 'December',
+    'december': 'December'
+}
+
+def clean_bib_file(input_file):
+    """
+    Reads the input BibTeX file line-by-line, performs two tasks:
+      1. Removes any lines defining a field with an empty value (e.g. "editor = {}").
+      2. Processes month field lines: if the month value is an abbreviated word
+         (and not numeric), it is replaced with the full month name.
+    The cleaned lines are written to a new file named <original_file>-updated.bib.
+    """
+    # Regex for an empty field line.
+    empty_field_pattern = re.compile(r'^\s*[\w\-]+\s*=\s*\{\s*\}\s*,?\s*$')
+
+    # Regex to capture the month field.
+    # It will match a line beginning with "month", then "=", then optionally "{", then capture a complete word,
+    # then optionally "}" and optional trailing comma and whitespace.
+    # This handles both "month = apr," and "month = {apr}," formats
+    # Word boundaries ensure we only match complete month names, not substrings
+    month_field_pattern = re.compile(r'^(\s*month\s*=\s*)(\{?)(\b[a-zA-Z]+\b)(\}?)(\s*,?\s*)$', re.IGNORECASE)
+
+    with open(input_file, "r", encoding="utf-8") as f:
+        lines = f.readlines()
+
+    cleaned_lines = []
+    for line in lines:
+        # First, if the line matches an empty field, skip it.
+        if empty_field_pattern.match(line):
+            continue
+
+        # Next, if it is a month field, try to process it.
+        m = month_field_pattern.match(line)
+        if m:
+            prefix, open_brace, month_val, close_brace, suffix = m.groups()
+            # Remove extra whitespace from the month value.
+            month_val_clean = month_val.strip()
+            # If the month value is purely numeric, leave it as is.
+            if month_val_clean.isdigit():
+                cleaned_lines.append(line)
+            else:
+                # Check if it is an abbreviated month; if found, replace it.
+                lower_val = month_val_clean.lower()
+                if lower_val in MONTH_MAPPING and lower_val != MONTH_MAPPING[lower_val].lower():
+                    new_month = MONTH_MAPPING[lower_val]
+                    # Build the new line with the full month name, preserving brace style.
+                    new_line = prefix + open_brace + new_month + close_brace + suffix
+                    cleaned_lines.append(new_line)
+                else:
+                    cleaned_lines.append(line)
+        else:
+            # All other lines are left unchanged.
+            cleaned_lines.append(line)
+
+    # Build the new file name: original basename with "-updated" appended before the extension.
+    base, ext = os.path.splitext(input_file)
+    output_file = base + "-updated.bib"
+
+    with open(output_file, "w", encoding="utf-8") as f:
+        f.writelines(cleaned_lines)
+
+    print(f"Updated file created: {output_file}")
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(
+        description="Clean a BibTeX file by removing fields with empty values and process month fields while preserving formatting."
+    )
+    parser.add_argument("bibtex_file", help="Path to the BibTeX file to clean")
+    args = parser.parse_args()
+    clean_bib_file(args.bibtex_file)
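
Note on the two regular expressions: their effect is easiest to see on a few sample lines. The sketch below copies both patterns from cleanbib.py but wraps them in a hypothetical per-line helper, `clean_line`, and uses a shortened `SAMPLE_MAPPING` instead of the full `MONTH_MAPPING`; the helper and the sample lines are assumptions for illustration, not part of the commit.

```python
import re

# Both patterns are copied from cleanbib.py.
EMPTY_FIELD = re.compile(r'^\s*[\w\-]+\s*=\s*\{\s*\}\s*,?\s*$')
MONTH_FIELD = re.compile(
    r'^(\s*month\s*=\s*)(\{?)(\b[a-zA-Z]+\b)(\}?)(\s*,?\s*)$', re.IGNORECASE
)

# Shortened stand-in for the script's MONTH_MAPPING, for illustration only.
SAMPLE_MAPPING = {"jan": "January", "apr": "April", "sep": "September"}


def clean_line(line):
    """Return the cleaned line, or None if the line should be dropped."""
    if EMPTY_FIELD.match(line):
        return None
    m = MONTH_FIELD.match(line)
    if m:
        prefix, open_brace, value, close_brace, suffix = m.groups()
        full = SAMPLE_MAPPING.get(value.lower())
        if full:
            # Rebuild the line around the full name, keeping the brace style.
            return prefix + open_brace + full + close_brace + suffix
    return line


for sample in [
    "  month = {apr},\n",         # braced abbreviation: expanded
    "  month = sep,\n",           # unbraced abbreviation: expanded
    "  month = 4,\n",             # numeric: pattern does not match, kept
    "  month = {April 2020},\n",  # more than one word: kept
    "  editor = {},\n",           # empty field: dropped
]:
    print(repr(sample), "->", repr(clean_line(sample)))
```

One point that may be worth checking against the site's BibTeX toolchain: an unbraced `month = sep` is a predefined macro in the standard BibTeX styles, whereas an unbraced `month = September` is a bare word that classic BibTeX treats as an undefined macro. If that matters for the consumers of the output, adding braces when expanding unbraced values would be a safer variant.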
