Commit 6899a18

Add Python API usage examples (issue #25) (#50)
* Add Python API usage examples for journal assessment

  - Create examples/ directory with standalone demonstration scripts
  - Add basic_assessment.py: single journal and batch assessment examples
  - Add bibtex_processing.py: BibTeX file processing and result aggregation
  - Add a comprehensive README.md with setup instructions and usage guidance

  Addresses issue #25: enhances the JOSS submission and improves developer onboarding.

* Add SPDX license identifiers to example files

  Fixes quality-check failures by adding the required MIT license headers to both Python example files. [AI-assisted]

* Fix API usage in example files

  - Fix the import in basic_assessment.py to use dispatcher.query_dispatcher
  - Fix bibtex_processing.py to use the correct result attributes (insufficient_data_count instead of unknown_count, assessment_results instead of entries, is_retracted instead of retracted)

  Both examples now execute successfully and demonstrate the API correctly. [AI-assisted]

* Add example execution validation to quality checks and CI/CD

  - Create a check-examples.py script to validate that all example files execute successfully
  - Add the example execution check to the run-quality-checks.sh script
  - Add an example execution step to the CI/CD pipeline
  - Fix API usage in the example files (the import in basic_assessment.py, the result attributes in bibtex_processing.py); all examples now execute successfully and are validated in CI/CD

  This ensures the Python API examples provided for JOSS remain functional and serve as reliable documentation for users. [AI-assisted]

* Address PR review comments for the examples README

  1. Fix setup requirements: make sync mandatory and config optional (config displays the configuration; it does not change it)
  2. Add a sync step to download backend data before running the examples
  3. Update the expected outputs to indicate they may vary based on enabled backends
  4. Fix the configuration section to correctly describe precedence (local .aletheia-probe first)
  5. Remove the Support section as requested
  6. Add a reference to the main configuration documentation

  All review comments addressed and replied to on GitHub. [AI-assisted]

Co-authored-by: florath-ai-assistant[bot] <Andreas.Florath@telekom.de>
1 parent ac623cd commit 6899a18

File tree

6 files changed: +407 additions, 0 deletions


.github/workflows/ci.yml

Lines changed: 4 additions & 0 deletions
```diff
@@ -55,6 +55,10 @@ jobs:
         run: |
           python scripts/check-logging.py

+      - name: Check example execution
+        run: |
+          python scripts/check-examples.py
+
   test:
     name: Tests
     runs-on: ubuntu-latest
```
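The CI step above runs `scripts/check-examples.py`, which is not shown in this diff. A minimal sketch of what such a validator might look like (hypothetical; the actual script in the repository may differ) simply runs each example as a subprocess and checks its exit code:

```python
#!/usr/bin/env python3
"""Hypothetical sketch of a check-examples.py validator.

Runs every Python script under examples/ and reports whether each one
exits cleanly; the real script in the repository may differ.
"""
import subprocess
import sys
from pathlib import Path

EXAMPLES_DIR = Path("examples")


def run_example(script: Path) -> bool:
    """Run one example script and report whether it exited with code 0."""
    proc = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        timeout=120,
    )
    if proc.returncode != 0:
        print(f"FAIL {script}: exit code {proc.returncode}")
        print(proc.stderr)
        return False
    print(f"OK   {script}")
    return True


def main() -> int:
    """Run all examples; return 1 if any example fails or none are found."""
    scripts = sorted(EXAMPLES_DIR.glob("*.py"))
    if not scripts:
        print("No example scripts found")
        return 1
    failures = [s for s in scripts if not run_example(s)]
    return 1 if failures else 0


# Only auto-run when an examples/ directory actually exists alongside us.
if __name__ == "__main__" and EXAMPLES_DIR.is_dir():
    sys.exit(main())
```

In CI this gives a single non-zero exit code when any example breaks, which is what makes the `python scripts/check-examples.py` step fail the build.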

examples/README.md

Lines changed: 126 additions & 0 deletions
# Aletheia Probe Python API Examples

This directory contains standalone Python scripts demonstrating how to use the Aletheia Probe Python API for journal assessment.

## Setup Requirements

Before running the examples, ensure you have:

1. **Installed Aletheia Probe:**
   ```bash
   pip install aletheia-probe
   ```

2. **Synced backend data:**
   ```bash
   aletheia-probe sync
   ```
   This downloads the predatory journal lists and other data sources needed for assessment.

3. **(Optional) Viewed the configuration:**
   ```bash
   aletheia-probe config
   ```
   This displays the current configuration. The tool works with default settings, but you can customize backends and other options.

## Examples

### basic_assessment.py

Demonstrates core journal assessment functionality:

- **Single journal assessment** - Assess one journal with detailed results
- **Batch assessment** - Process multiple journals efficiently
- **Result interpretation** - How to work with assessment results

**Usage:**
```bash
python basic_assessment.py
```

**Expected Output (similar to):**
```
=== Single Journal Assessment ===
Journal: Nature Communications
Assessment: legitimate
Confidence: 98%
Backend Results: 13 sources checked

=== Batch Assessment ===
Science: legitimate (98% confidence)
PLOS ONE: legitimate (100% confidence)
Journal of Biomedicine: legitimate (83% confidence)
```

*Note: Results may vary based on enabled backends and available data.*

### bibtex_processing.py

Shows how to process BibTeX bibliography files:

- **BibTeX file parsing** - Extract journals from bibliography files
- **Batch journal assessment** - Assess all journals in a bibliography
- **Result aggregation** - Summarize findings and generate reports

**Usage:**
```bash
python bibtex_processing.py
```

**Expected Output (similar to):**
```
=== BibTeX File Processing ===
Created sample BibTeX file: /tmp/tmplbecdqnw.bib

=== Assessment Summary ===
Total entries processed: 3
Legitimate journals: 2
Predatory journals: 1
Insufficient data: 0

=== Detailed Results ===
Journal: Nature Communications
Assessment: legitimate
Confidence: 93%
Risk Level: LOW - Safe to publish
```

*Note: Results may vary based on enabled backends and available data.*

## Integration Tips

### Error Handling

Always wrap API calls in try/except blocks:

```python
try:
    result = await query_dispatcher.assess_journal(query)
    # Process result...
except Exception as e:
    print(f"Assessment failed: {e}")
```

### Configuration

The API uses configuration in this order of precedence:

1. Local `.aletheia-probe/` directory (project-specific settings)
2. User configuration directory (`~/.config/aletheia-probe/` or platform equivalent)
3. Default settings

For more details, see the [Configuration documentation](https://github.com/sustainet-guardian/aletheia-probe#configuration).

### Async/Await

All assessment functions are asynchronous. Use them within async functions or with `asyncio.run()`:

```python
import asyncio

async def main():
    result = await query_dispatcher.assess_journal(query)
    return result

# Run the async function
result = asyncio.run(main())
```

examples/basic_assessment.py

Lines changed: 84 additions & 0 deletions
```python
#!/usr/bin/env python3
# SPDX-License-Identifier: MIT
"""
Basic journal assessment examples using the Aletheia Probe Python API.

This script demonstrates:
1. Single journal assessment
2. Batch assessment of multiple journals
3. Result interpretation examples
"""

import asyncio

from aletheia_probe.dispatcher import query_dispatcher
from aletheia_probe.models import QueryInput


async def single_assessment():
    """Assess a single journal and interpret results."""
    print("=== Single Journal Assessment ===")

    # Create query for Nature Communications
    query = QueryInput(
        raw_input="Nature Communications",
        normalized_name="nature communications",
        identifiers={"issn": "2041-1723"}
    )

    # Perform assessment
    result = await query_dispatcher.assess_journal(query)

    # Display results
    print(f"Journal: {query.raw_input}")
    print(f"Assessment: {result.assessment}")
    print(f"Confidence: {result.confidence:.0%}")
    print(f"Backend Results: {len(result.backend_results)} sources checked")

    return result


async def batch_assessment():
    """Assess multiple journals in batch."""
    print("\n=== Batch Assessment ===")

    # List of journals to assess
    journals = [
        {"name": "Science", "issn": "1095-9203"},
        {"name": "PLOS ONE", "issn": "1932-6203"},
        {"name": "Journal of Biomedicine", "issn": None}  # Potentially suspicious
    ]

    results = []

    for journal in journals:
        query = QueryInput(
            raw_input=journal["name"],
            normalized_name=journal["name"].lower(),
            identifiers={"issn": journal["issn"]} if journal["issn"] else {}
        )

        result = await query_dispatcher.assess_journal(query)
        results.append((journal["name"], result))

        print(f"{journal['name']}: {result.assessment} ({result.confidence:.0%} confidence)")

    return results


async def main():
    """Run all examples."""
    try:
        # Single assessment
        await single_assessment()

        # Batch assessment
        await batch_assessment()

        print("\n=== Assessment Complete ===")

    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    asyncio.run(main())
```

examples/bibtex_processing.py

Lines changed: 121 additions & 0 deletions
```python
#!/usr/bin/env python3
# SPDX-License-Identifier: MIT
"""
BibTeX processing examples using the Aletheia Probe Python API.

This script demonstrates:
1. BibTeX file processing
2. Journal extraction and assessment
3. Result aggregation and reporting
"""

import asyncio
import tempfile
from pathlib import Path

from aletheia_probe.batch_assessor import BibtexBatchAssessor


def create_sample_bibtex():
    """Create a sample BibTeX file for demonstration."""
    bibtex_content = """
@article{smith2023nature,
  title={A groundbreaking study},
  author={Smith, John},
  journal={Nature Communications},
  volume={14},
  year={2023},
  issn={2041-1723}
}

@article{doe2023plos,
  title={Another important study},
  author={Doe, Jane},
  journal={PLOS ONE},
  volume={18},
  year={2023},
  issn={1932-6203}
}

@article{unknown2023suspicious,
  title={Suspicious research},
  author={Unknown, Author},
  journal={International Journal of Advanced Research},
  volume={1},
  year={2023}
}
"""

    # Create temporary file
    temp_file = tempfile.NamedTemporaryFile(mode='w', suffix='.bib', delete=False)
    temp_file.write(bibtex_content)
    temp_file.close()

    return Path(temp_file.name)


async def process_bibtex_file():
    """Process a BibTeX file and assess all journals."""
    print("=== BibTeX File Processing ===")

    # Create sample BibTeX file
    bibtex_path = create_sample_bibtex()
    print(f"Created sample BibTeX file: {bibtex_path}")

    try:
        # Initialize the batch assessor
        assessor = BibtexBatchAssessor()

        # Process the BibTeX file
        result = await assessor.assess_bibtex_file(bibtex_path, verbose=True)

        # Display summary results
        print("\n=== Assessment Summary ===")
        print(f"Total entries processed: {result.total_entries}")
        print(f"Legitimate journals: {result.legitimate_count}")
        print(f"Predatory journals: {result.predatory_count}")
        print(f"Insufficient data: {result.insufficient_data_count}")

        return result

    finally:
        # Clean up temporary file
        bibtex_path.unlink()


async def analyze_results(result):
    """Analyze and display detailed results."""
    print("\n=== Detailed Results ===")

    for bibtex_entry, assessment in result.assessment_results:
        print(f"\nJournal: {bibtex_entry.journal_name}")
        print(f"  Assessment: {assessment.assessment}")
        print(f"  Confidence: {assessment.confidence:.0%}")

        if bibtex_entry.is_retracted:
            print("  Warning: Contains retracted articles")

        if assessment.assessment == "predatory":
            print("  Risk Level: HIGH - Avoid this journal")
        elif assessment.assessment == "legitimate":
            print("  Risk Level: LOW - Safe to publish")
        else:
            print("  Risk Level: UNKNOWN - Requires manual review")


async def main():
    """Run all examples."""
    try:
        # Process BibTeX file
        result = await process_bibtex_file()

        # Analyze results
        await analyze_results(result)

        print("\n=== Processing Complete ===")

    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    asyncio.run(main())
```
