
Conversation

@sokole (Collaborator) commented Jan 28, 2026

Initial commit of the FAIR4AI evaluation agent, including core scripts, GUI, example usage, requirements, form checklist, template outputs, and sample evaluation results. Provides full documentation in README.md and an environment variable template for API configuration.

@egrace479 (Member) commented Jan 28, 2026

Could you remove the `__pycache__` directory and update the `.gitignore` accordingly? (Code to do that is given as an example in the guide cheat-sheet.) Pasting in the Python template `.gitignore` should cover everything moving forward (after removing the cached files).
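A minimal sketch of the usual cleanup (this assumes the cached files are already tracked; the guide cheat-sheet referenced above may differ in detail):

```bash
# Stop tracking the compiled bytecode (files stay on disk but leave the index)
git rm -r --cached __pycache__

# Ignore it going forward
printf '__pycache__/\n*.pyc\n' >> .gitignore

git add .gitignore
git commit -m "Stop tracking __pycache__ and ignore compiled bytecode"
```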

@sokole (Collaborator, Author) commented Jan 28, 2026

ah, thanks, will do!

@egrace479 (Member) left a comment

Please see in-line suggestions and comments. I'll do the docs extraction we discussed.

Done! I think we should pull all the .env setup and API token instructions to a docs page too, then we can just link to it.
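The `.env` template being discussed would presumably contain entries like the following; the variable names here follow the defaults read by the OpenAI and Anthropic Python SDKs (including the `AzureOpenAI` client), and the project's actual template may differ:

```
# .env -- set the key(s) for whichever provider you use
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
```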

@@ -0,0 +1,991 @@
# FAIR4AI Automated Evaluation Agent

An AI-powered system that automatically evaluates datasets against the FAIR4AI checklist using Large Language Models (LLMs). The agent analyzes metadata files or dataset landing pages to provide comprehensive FAIR (Findable, Accessible, Interoperable, Reusable) assessments.

Suggested change
An AI-powered system that automatically evaluates datasets against the FAIR4AI checklist using Large Language Models (LLMs). The agent analyzes metadata files or dataset landing pages to provide comprehensive FAIR (Findable, Accessible, Interoperable, Reusable) assessments.
An AI-powered system that automatically evaluates datasets against the [FAIR4AI checklist form](https://forms.gle/P3MWmJJAi5vq248E8) using Large Language Models (LLMs). The agent checks metadata files or dataset landing pages for the information requested in the checklist to provide comprehensive FAIR (Findable, Accessible, Interoperable, Reusable) assessments.

- Generates structured outputs in JSON and CSV formats with FAIR score estimates

**Key Features:**
- ✅ Multiple LLM providers (OpenAI, Azure OpenAI, Anthropic Claude)

Suggested change
- ✅ Multiple LLM providers (OpenAI, Azure OpenAI, Anthropic Claude)
- ✅ Multiple LLM providers supported (OpenAI, Azure OpenAI, Anthropic Claude)

**System Requirements:**
- Python 3.8 or higher
- API key for OpenAI, Azure OpenAI, or Anthropic
- Dependencies: pandas, openai, anthropic, requests, beautifulsoup4

Suggested change
- Dependencies: pandas, openai, anthropic, requests, beautifulsoup4

This is covered by the install and doesn't need to be included in the README.

Comment on lines +35 to +41

Or install individually:
```bash
pip install openai pandas requests beautifulsoup4 # For OpenAI
# OR
pip install anthropic pandas requests beautifulsoup4 # For Anthropic Claude
```

Suggested change
Or install individually:
```bash
pip install openai pandas requests beautifulsoup4 # For OpenAI
# OR
pip install anthropic pandas requests beautifulsoup4 # For Anthropic Claude
```

This seems like an unnecessary complication -- I don't think there should be issues installing openai and anthropic with the requirements file.

## Quick Start

### 1. Install Dependencies


Suggested change
We recommend creating a [virtual environment](https://imageomics.github.io/Collaborative-distributed-science-guide/wiki-guide/Virtual-Environments/) in which to install the requirements as described below.
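In shell terms, the recommended setup is something like the following (the environment name and the presence of a `requirements.txt` are conventional assumptions, not prescribed by the README):

```bash
# Create an isolated environment and install the project's pinned dependencies
python3 -m venv .venv
source .venv/bin/activate        # on Windows: .venv\Scripts\activate
pip install -r requirements.txt
```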

5. Save all extracted metadata to output directory
6. Run the evaluation using extracted metadata

Supported URLs include NEON, DataONE, Zenodo, Dryad, and other repositories with structured metadata.

Suggested change
Supported URLs include NEON, DataONE, Zenodo, Dryad, and other repositories with structured metadata.
Supported URLs include those for NEON, DataONE, Zenodo, Dryad, Hugging Face Datasets, and other repositories with structured metadata.

- **Progress Bar**: Shows completion percentage
- **Log Window**: Real-time messages and status updates

#### Example Workflow: URL-Based Evaluation

Could be cool to add a screenshot or two of the GUI, but that'd be for bigger docs. If this is more fleshed-out and public then we could use MkDocs to set something up (more user-friendly than a super long README).

- `options`: Multiple choice options (pipe-separated)
- `required`: Whether required (TRUE/FALSE)

Default form: `form_ai_checklist_automated.csv` (135 questions across 9 sections)

Suggested change
Default form: `form_ai_checklist_automated.csv` (135 questions across 9 sections)
Default form: `form_ai_checklist_automated.csv` (135 questions across 9 sections)
The code used to create this file from the [FAIR4AI Checklist form]() is in the [checklist workflow directory](../checklist-workflow/00_README.md).

├── {prefix}.json # FAIR4AI evaluation results
└── {prefix}.csv # FAIR4AI evaluation results
```


Suggested change
`prefix` is the name passed to the `--output` parameter.
