A Python tool that verifies claims made in Wikipedia articles by checking if the cited sources support those claims using AI language models.
- Extracts claims and their corresponding sources from Wikipedia articles
- Fetches content from cited sources (HTML and PDF)
- Uses AI models to verify if sources support the claims
- Supports both local execution and Google Colab
Install the required dependencies:
pip install -r requirements.txt
This project can use Google's Generative AI API. To use it:
- Get a Google API key from Google AI Studio
- Set it as an environment variable:
export GOOGLE_API_KEY="your-api-key-here"
- Or modify the code to use your preferred method of API key management
Important: Never commit API keys to version control. Always use environment variables or secure configuration files.
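As a minimal sketch of the environment-variable approach, the key can be read at startup and the program can fail early with a readable message when it is missing. The helper name `load_api_key` is hypothetical and is not taken from this project's code; only the `GOOGLE_API_KEY` variable name comes from the instructions above.

```python
import os


def load_api_key(env_var="GOOGLE_API_KEY"):
    """Return the API key stored in the given environment variable.

    Hypothetical helper: raises RuntimeError when the variable is missing
    or empty, so the failure happens before any API call is attempted.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell before running the tool."
        )
    return key
```

Reading the key this way keeps it out of the source tree, which fits the rule about never committing secrets.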
python main.py
By default, the script analyzes the "Albert Einstein" Wikipedia article. To analyze a different article, change the wikipedia_article_title variable in main.py.
Open wikipedia_verifier.ipynb in Jupyter or Google Colab and follow the instructions in the notebook.
- Article Fetching: Downloads the HTML content of a Wikipedia article
- Claim Extraction: Identifies sentences with citations and extracts the corresponding source URLs
- Source Fetching: Downloads content from the cited sources
- AI Verification: Uses language models to determine if the source content supports the claim
- Results: Provides a summary of verified vs unverified claims
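The claim extraction step above can be illustrated with a small standard-library sketch. It is not the code in main.py. It assumes Wikipedia-style markup in which inline citations are `<sup>` links to `#cite_note-...` anchors and each footnote is an `<li id="cite_note-...">` containing an external link. For simplicity it works on whole paragraphs rather than individual sentences. The names `Claim`, `CitationParser`, and `extract_claims` are hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Claim:
    text: str                                  # paragraph text that carries a citation
    source_urls: list = field(default_factory=list)


class CitationParser(HTMLParser):
    """Collect paragraph text, the footnote ids each paragraph cites,
    and the first external URL found inside each footnote."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []    # list of (text, [footnote ids])
        self.note_urls = {}     # footnote id -> first external URL
        self._text = None       # text buffer while inside <p>
        self._cited = []
        self._in_sup = False
        self._note_id = None    # set while inside a footnote <li>

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "p":
            self._text, self._cited = [], []
        elif tag == "sup":
            self._in_sup = True
        elif tag == "li" and (a.get("id") or "").startswith("cite_note"):
            self._note_id = a["id"]
        elif tag == "a":
            href = a.get("href") or ""
            if self._in_sup and self._text is not None and href.startswith("#cite_note"):
                self._cited.append(href[1:])
            elif self._note_id and href.startswith("http"):
                self.note_urls.setdefault(self._note_id, href)

    def handle_endtag(self, tag):
        if tag == "p" and self._text is not None:
            self.paragraphs.append(("".join(self._text).strip(), self._cited))
            self._text = None
        elif tag == "sup":
            self._in_sup = False
        elif tag == "li":
            self._note_id = None

    def handle_data(self, data):
        # Skip the "[1]" marker inside <sup>; keep ordinary paragraph text.
        if self._text is not None and not self._in_sup:
            self._text.append(data)


def extract_claims(html):
    """Return one Claim per paragraph that cites at least one resolvable URL."""
    parser = CitationParser()
    parser.feed(html)
    claims = []
    for text, cited in parser.paragraphs:
        urls = [parser.note_urls[c] for c in cited if c in parser.note_urls]
        if urls:
            claims.append(Claim(text, urls))
    return claims
```

Each returned `Claim` pairs a piece of article text with the URLs to fetch in the source fetching step. Paragraphs without a resolvable citation are dropped.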
- HallOumi-8B (default in main.py)
- Google Generative AI (used in notebook version)
- Can be extended to support other language models
- Some sources may be behind paywalls or require authentication
- PDF extraction may not be perfect for all document formats
- AI verification accuracy depends on the model used
- Rate limiting may apply to external APIs
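One common way to cope with rate limiting is to retry a failed call after an exponentially growing delay. The sketch below is an illustration only and is not part of this project. The function name `call_with_backoff` and its parameters are hypothetical. It catches every exception for brevity, whereas real code would catch only the specific rate-limit error raised by the API client in use.

```python
import random
import time


def call_with_backoff(func, *, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt plus jitter, then retry.

    Re-raises the last exception once `retries` attempts have failed.
    """
    for attempt in range(retries):
        try:
            return func()
        except Exception:
            if attempt == retries - 1:
                raise
            # Exponential delay with a little random jitter to avoid bursts.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Passing `sleep` as a parameter makes the helper easy to test without real waiting.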
- Fork the repository
- Create a feature branch
- Make your changes
- Ensure no sensitive information is committed
- Submit a pull request
This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE file for details.
- Never commit API keys, passwords, or other sensitive information
- Use environment variables for configuration
- Review all changes before committing to ensure no secrets are exposed