- PR #93
  - Removed ML/BERT code, documentation, and tests.
- PR #92
  - Update bibcat LLM metrics to collect bibcodes for confusion matrix cells.
- PR #86
  - GitHub Action for auto-release when a git tag is pushed, and minor doc updates.
- PR #80
  - Refactored `check_truematch` for readability.
  - Unpinned the `tensorflow-metal` version.
- PR #82
  - Added a `field_validator` to ensure that the LLM output classification is only one of the allowed keywords (see the first sketch below).
  - Updated the confusion matrix (CM) plot text annotation of the mission names.
  - Fixed the bug in `grouped_df["llm_mission"] = mission_df["llm_mission"].str.upper()` in `prepare_output()`. This bug caused a `KeyError: nan` due to a mismatched index between `mission_df` and `grouped_df` when both papertypes were present for the same bibcode in `grouped_df` (see the second sketch below).
  - When there is no LLM output for a bibcode but a human classification exists, the human classification is still written to `summary_output`.
  - Updated `metrics.py` and `stats.py` to account for human classifications, but not to count them when the source paper is not found.
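A `field_validator` of this kind is a small pydantic hook; a minimal sketch, assuming pydantic v2 (the model and the keyword set are illustrative, not bibcat's actual schema):

```python
from pydantic import BaseModel, field_validator

# Hypothetical allowed classification keywords, for illustration only.
ALLOWED_KEYWORDS = {"SCIENCE", "MENTION", "UNRELATED"}

class LLMOutput(BaseModel):
    bibcode: str
    classification: str

    @field_validator("classification")
    @classmethod
    def check_allowed(cls, value: str) -> str:
        # Reject any LLM classification that is not one of the allowed keywords.
        if value.upper() not in ALLOWED_KEYWORDS:
            raise ValueError(f"{value!r} is not an allowed classification keyword")
        return value.upper()
```

The `KeyError: nan` fixed in `prepare_output()` is the usual pandas pitfall: assigning a Series from one DataFrame into another aligns on index labels, not position, so unmatched labels become NaN. A sketch of the failure mode and one possible fix (the data are invented; the actual bibcat fix may differ):

```python
import pandas as pd

# Toy frames: grouped_df holds two rows (both papertypes) for the same bibcode,
# so its index labels no longer line up with mission_df's.
mission_df = pd.DataFrame({"llm_mission": ["hst", "jwst"]}, index=[0, 1])
grouped_df = pd.DataFrame(
    {"bibcode": ["A", "A"], "llm_mission": ["hst", "jwst"]}, index=[0, 2]
)

# Buggy: the right-hand Series aligns on index labels, so label 2 gets NaN,
# and downstream lookups on that column raise KeyError: nan.
buggy = mission_df["llm_mission"].str.upper()
print(buggy.reindex(grouped_df.index))  # label 2 -> NaN

# Fix: uppercase grouped_df's own column so the indexes cannot diverge.
grouped_df["llm_mission"] = grouped_df["llm_mission"].str.upper()
```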
## 0.2.1 - 2025-09-26
- PR #85
  - Changed Sphinx theme to `book` and updated the documentation.
## 0.2.0 - 2025-09-22
- Removed conda env file.
- Deleted test_bibcat.py
- Deleted the same set of test global variables that was assigned in multiple test scripts
- Deleted all previous code and files for a fresh start
- Updated the combined dataset link to use the updated papertrack with the flagship gold sample verdict
- Reorganized the bibcat CLI commands (see the sketch below)
  - All LLM-based commands grouped under the `llm` sub-command
  - Batch LLM commands grouped under the `llm batch` sub-command
  - All `_` or `-` command names shortened, e.g. `run-gpt` to `llm run`, or `audit_llm` to `llm audit`
  - Added a new `ml` sub-command group and moved the NLP CLI commands underneath
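The new layout maps naturally onto nested `click` groups; a minimal sketch of the structure (the command bodies are placeholders, not bibcat's actual implementation):

```python
import click

@click.group()
def cli():
    """bibcat command-line interface."""

@cli.group()
def llm():
    """All LLM-based commands."""

@llm.command()
def run():
    """Formerly `run-gpt`."""
    click.echo("running the LLM classifier")

@llm.group()
def batch():
    """Batch LLM commands, e.g. submit/retrieve."""

@cli.group()
def ml():
    """NLP/ML commands moved under this group."""

if __name__ == "__main__":
    cli()  # invoked as: bibcat llm run, bibcat llm batch ..., bibcat ml ...
```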
- Expanded the list of keyword objects in `parameters.py`
- Fixed a bug that falsely identified mission names used in `kw_mission` in `user_prompt`, `in_text`, and `hallucinated_by_llm` in `summary_output` due to uppercasing mission names, and updated the relevant tests. Missions that we pass into `identify_missions_in_text()` need to be in their original case so that paper processing correctly handles ambiguous keyword phrases.
- A minor update to `user_prompt` to spell out IUE
- Pip installation updates in `README.md`
- PR #68
  - `Inconsistent_classifications.json` was revised and separated from `bibcat stats-llms`
  - Updated `metrics_summary.json` to include confusion matrix metrics (see the sketch below)
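Per-class metrics of the kind now written to `metrics_summary.json` fall straight out of the confusion matrix; a sketch with scikit-learn (the labels and data are invented, not the bibcat schema):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

labels = ["SCIENCE", "MENTION", "UNRELATED"]          # illustrative classes
y_true = ["SCIENCE", "MENTION", "SCIENCE", "UNRELATED"]
y_pred = ["SCIENCE", "SCIENCE", "SCIENCE", "UNRELATED"]

cm = confusion_matrix(y_true, y_pred, labels=labels)  # rows = truth, cols = prediction
tp = np.diag(cm)
precision = tp / np.maximum(cm.sum(axis=0), 1)        # per-class precision
recall = tp / np.maximum(cm.sum(axis=1), 1)           # per-class recall
```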
- Moved `_process_database_ambig`, `_extract_core_from_phrase`, `_streamline_phrase`, and `_check_truematch` from `base.py` to `paper.py`
- Updated tests to read from the `Paper()` object instead of the `Base()` object
- PR #64 Update ROC input and docs
- PR #63 Refactored to use the newer OpenAI Responses API and removed the deprecated Assistants API (see the sketch below)
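For context, the migration replaces the multi-step Assistants thread/run calls with a single Responses call; a minimal sketch with the OpenAI Python SDK (the model name and prompt are placeholders):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One call replaces the Assistants API's create-thread / add-message / run loop.
resp = client.responses.create(
    model="gpt-4o-mini",
    input="Classify this paper abstract ...",
)
print(resp.output_text)  # convenience accessor for the concatenated text output
```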
- PR #62 Update `metrics.py` and its pytest
- PR #61 Update InfoModel response with enum
- PR #56 Adds the MAST mission simple keyword text match to the user prompt
- PR #54 Sanitizing keywords
- PR #53 ROC curve bug fixes, add more evaluation metrics, etc.
- PR #47 New calculations for evaluation confidence values for multiple GPT runs
- Grouping the BERT model method into the pretrained folder
- Created PRETRAINED_README.md and updated the main README.md
- Refactored the ML classifier to allow for other `tensorflow` models, and for adding other libraries, e.g. `pytorch`, down the line.
- Setting a new config for the directory of papers for operational classification with a fake JSON file
- Refactored `fetch_paper.py`
- Other relevant and minor updates
- The `is_keyword` method is replaced with the `identify_keyword` method.
- `evaluate` and `classify` are now separate CLI options.
- `get_config()` error fix
- `_add_word()` temporary fix
- Merge error fix for config parameters
- Fix `ddict` type errors
- Consolidated all config into a single `bibcat_config.yaml` YAML file
- Moved `bibcat` output to outside the package directory
- Added support for user custom configuration and settings
- Migrated code to use the new `config` object, a dottable dictionary, to retain the old config syntax (see the sketch below)
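A dottable dictionary like the `config` object described above can be as small as a `dict` subclass that routes attribute access through item access; a sketch (not bibcat's actual implementation), loading the consolidated YAML file:

```python
import yaml

class DotDict(dict):
    """A dict whose keys are also reachable as attributes (config.key.subkey)."""

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc
        # Wrap nested dicts so chained dotted access keeps working.
        return DotDict(value) if isinstance(value, dict) else value

with open("bibcat_config.yaml") as fh:
    config = DotDict(yaml.safe_load(fh))
# e.g. config.some_section.some_key and config["some_section"]["some_key"]
# now resolve to the same value.
```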
- Fixed various type annotation errors while refactoring `classify_papers.py` and other related modules such as `performance.py` or `operator.py`
- All output results will be saved under a subdirectory of the given model run in the `output` directory
- `classify_papers.py` will produce both evaluation results and classification results per method, rather than combined results of both the RB and ML methods. This will allow users to choose a classification method using the CLI once the CLI is enabled.
- Enabled build_model.py to be both a module and a main script (see the sketch below)
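Being both a module and a main script is the standard `__main__` guard; a sketch of the pattern (the function name is hypothetical):

```python
def build_model() -> None:
    """Build and train a model; importable from other modules."""
    ...

if __name__ == "__main__":
    # Runs only under `python build_model.py`, not on import.
    build_model()
```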
- Refactoring build_model.py has started; the first part includes:
  - Extracting `generate_direcotry_TVT()` from `core/classifiers/textdata.py` to create a stand-alone module, `split_dataset.py` (see the sketch below)
  - Modifying it to store the training, validation, and test (TVT) dataset directories under the `data/partitioned_datasets` directory
- The second part of the refactoring required some relevant changes to implement the new modules and to update the affected files accordingly: `build_model.py`, `base.py`, `operator.py`, `config.py`, etc.
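A stand-alone TVT split like the one `split_dataset.py` factors out is typically two chained `train_test_split` calls; a hedged sketch (the ratios and helper name are illustrative, not bibcat's defaults):

```python
from sklearn.model_selection import train_test_split

def split_tvt(records, val_frac=0.15, test_frac=0.15, seed=42):
    """Split records into train/validation/test subsets."""
    holdout_frac = val_frac + test_frac
    train, holdout = train_test_split(records, test_size=holdout_frac, random_state=seed)
    val, test = train_test_split(
        holdout, test_size=test_frac / holdout_frac, random_state=seed
    )
    return train, val, test
```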
- Renamed create_model.py to build_model.py
- Updated README.md
- Updated config.py to create variables to support the new script, build_dataset.py
- Renamed `test_core` to `core`
- Renamed `test_data` to `data`
- The test global variables are called directly in the script rather than being redundantly reassigned to other variables.
- Moved test Keyword-object lookup variables to parameters.py
- Refactored classes.py into several individual class files below, which were moved to the new folders, `core` and `core/classifiers`:
  - `core`: base.py, grammar.py, keyword.py, operator.py, paper.py, performance.py
  - `core/classifiers`:
    - _Classfier(): ClassifierBase() in textdata.py
    - Classifier_ML: MachineLearningClassifier() in ml.py
    - Classifier_Rules: RuleBasedClassifier() in rules.py
- Continued formatting and styling these class files using `Ruff`, with the line length set to 120
- Updated modules according to the refactoring
- Updated CHANGELOG.md and pyproject.toml
- Updated the main README file
- Updated formatting and styling
- Added .readthedocs.yaml for the Read the Docs documentation pages
- Added support for chunk planning/submission for large batches to the OpenAI Batch API (see the sketch below)
- Adding new batch CLI commands for submitting a batch job using OpenAI's Batch API
  - Added new `bibcat llm batch submit` and `bibcat llm batch retrieve` for submitting and retrieving batch jobs
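Chunk planning for the Batch API amounts to splitting the request JSONL into pieces that stay under the per-batch limits and submitting each piece; a sketch with the OpenAI Python SDK (the chunk size, file names, and endpoint are placeholders, not bibcat's actual values):

```python
from pathlib import Path
from openai import OpenAI

client = OpenAI()
CHUNK = 50_000  # illustrative; check the current Batch API request limit

lines = Path("batch_input.jsonl").read_text().splitlines()
for i in range(0, len(lines), CHUNK):
    part = Path(f"batch_input_{i // CHUNK:03d}.jsonl")
    part.write_text("\n".join(lines[i : i + CHUNK]) + "\n")
    upload = client.files.create(file=part.open("rb"), purpose="batch")
    client.batches.create(
        input_file_id=upload.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
```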
- PR #74 Add a bash script to run bibcat on multiple batch input files serially
- PR #68 Add a new CLI, `audit-llms`, to create a JSON file that stores failure-mode stats and the breakdown information for failed bibcodes.
- Set up Sphinx autodoc build
- Updated the LLM prompt to include its rationale and reasoning in the output
- Switched to OpenAI Structured Response output, using pydantic models to control the output (see the sketch below)
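Structured Response output pins the reply to a pydantic schema; a minimal sketch, assuming the SDK's `responses.parse` helper (the model class is illustrative, not bibcat's actual output model):

```python
from openai import OpenAI
from pydantic import BaseModel

class Classification(BaseModel):
    classification: str
    rationale: str  # the updated prompt asks for the model's reasoning

client = OpenAI()
resp = client.responses.parse(
    model="gpt-4o-mini",
    input="Classify this paper ...",
    text_format=Classification,
)
result = resp.output_parsed  # a validated Classification instance
```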
- pre-commit hook setup
- GitHub CI/CD action pipeline for linting/formatting and pytests
- Add `stats-llm.py` to output statistics results from the evaluation summary output and the operational GPT results
- pytests (`test_stats_llm.py`) and the llm `README.md` updated
- Add option to run gpt-batch multiple times
- Implement performance evaluation metrics and plots
- Added a summary output code for evaluation
- Added unit test for `build_dataset.py`
- Implemented a ChatGPT agent prompt engineering approach to classify papers
- Added a basic classification output
- Added a CLI option to build the combined dataset from the papertrack data and papertext (from ADS) data, and refactored `build_dataset.py`
- Enabled dynamic version control
- README update: clarify the workflow in Quick Start and the use of fetching papers with the `do_evaluation` keyword when running `bibcat classify` and `bibcat evaluate`
- Added new `click` CLI for `bibcat`
- Refactored `classify_papers.py` and created a few modules, which are called in `classify_papers.py`. These modules could be executed based on CLI options once they are employed.
  - `fetch_papers.py`: fetches papers from the `dir_testdata` directory into the bibcat pipeline. This needs an update to fetch operational data using the `dir_datasets` argument in this module.
  - `operate_classifier.py`: the main purpose of this module is to use only one method, classify the input papers, and output classification results as a JSON file for operation.
  - `evaluate_basic_performance.py`: this module employs two performance functions to evaluate test paper classification and produce relevant files, and a confusion matrix if an ML method is used.
- Created `fakedata.txt` in `/bibcat/data/operational_data/` to test operational classification with simple ASCII text
- Created `fake_testdata.json`, which has paper classification with its associated simple text, for testing and performance evaluation
- Included additional VS Code ruff settings in `pyproject.toml`
- The second part of refactoring `build_model.py` includes:
  - Created a new module, `model_settings.py`, to set up various model-related variables. Other model-related variables in `config.py` will eventually be relocated to this module in the near future.
  - Created `streamline_dataset.py` to streamline the source data into an ML input dataset. It loads the source dataset and streamlines it.
  - Created `partition_dataset.py` to split the streamlined dataset into the train, validation, and test datasets for DL models.
- Created a new script, `build_dataset.py`, to build the input dataset
- Added some information about the data folder in README.rst
- Added `__init__.py`
- test_bibcat.py was refactored into several sub test scripts:
  - tests/test_core/test_base.py
  - tests/test_core/test_grammar.py
  - tests/test_core/test_keyword.py
  - tests/test_core/test_operator.py
  - tests/test_core/test_paper.py
  - tests/test_data/test_dataset.py
- Created a new folder named `core` to store all refactored class scripts
- Added more description to each class script and other main scripts
- Started with the OpenAstronomy cookiecutter template for bibcat
- Re-organized the file structure (e.g., bibcat/bibcat/) and modified the file names:
  - bibcat_classes.py to classes.py
  - bibcat_config.py to config.py
  - bibcat_parameters.py to parameters.py
  - bibcat_tests.py to test_bibcat.py
- Refactor classes.py into several individual class scripts under the `core` directory
- Created two main scripts:
  - create_model.py: this script can be run to create a new training model
  - classify_papers.py: this script will fetch input papers, classify them into the designated paper categories, and produce performance evaluation materials such as a confusion matrix and plots
- Created CHANGELOG.md
## 0.1.0 - 2024-01-29
Initial tag to preserve code before refactor