@caufieldjh caufieldjh commented Dec 22, 2025

Fixes the following:

caufieldjh and others added 11 commits December 20, 2025 13:05
Transform minimal stub into comprehensive registry entry with detailed documentation of Bio2RDF's semantic web infrastructure for biomedical integration.

**Bio2RDF Overview:**
- Large-scale open-source semantic web integration project
- Converts 35+ biomedical databases into standardized RDF linked data
- 11 billion RDF triples in Release 3 (July 2014)
- Maintained by Maastricht University Institute of Data Science

**Data Integration:**
- Core databases: MGI, HGNC, KEGG, Entrez Gene, OMIM, Gene Ontology, OBO, PDB, ChEBI
- On-demand sources: UniProt, Reactome, Prosite, PubMed, GenBank, PubChem
- Release 3 additions: ClinicalTrials.gov, dbSNP, GenAge, GenDR, LSR, OrphaNet, SIDER, WormBase

**Access Methods:**
- SPARQL 1.1 endpoint (bio2rdf.org/sparql) powered by OpenLink Virtuoso
- REST API for programmatic access
- Download service (download.bio2rdf.org) with multiple RDF formats
- BioSearch semantic search engine
- Web interface for exploration
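
As a rough illustration of the SPARQL access route listed above, the sketch below builds a GET request URL for a query. It assumes the endpoint is served over HTTPS at `bio2rdf.org/sparql` and that the `format` parameter is honoured in the way Virtuoso endpoints commonly accept it; `build_sparql_url` and the sample query are hypothetical names introduced here, not part of Bio2RDF.

```python
from urllib.parse import urlencode

# Assumption: HTTPS access at the endpoint path named in the commit message.
ENDPOINT = "https://bio2rdf.org/sparql"


def build_sparql_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL carrying a SPARQL query (hypothetical helper)."""
    params = {
        # "query" is the parameter name defined by the SPARQL 1.1 Protocol.
        "query": query,
        # Assumption: a Virtuoso-style "format" parameter selects JSON results.
        "format": "application/sparql-results+json",
    }
    return f"{endpoint}?{urlencode(params)}"


# A deliberately generic query: any five triples in the store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
url = build_sparql_url(QUERY)
```

The resulting `url` could then be fetched with `urllib.request.urlopen(url)`; whether the endpoint is currently online is not something this sketch checks.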

**Technical Architecture:**
- RDF triplestore with OpenLink Virtuoso
- Semanticscience Integrated Ontology (SIO) framework
- SPARQL, OWL, and W3C standards
- ETL pipeline using Common Workflow Language (CWL)
- Docker containerization
- PHP, Perl, Ruby conversion tools

**Use Cases:**
- Drug discovery and repurposing
- Disease research and mechanistic understanding
- Chemical-biological integration (Chem2Bio2RDF)
- Complex biomedical queries and hypothesis generation
- Knowledge discovery and data mashups

**Status:**
- Active project with ongoing maintenance
- Updated from "inactive" to "active" to reflect current state
- Continuous updates to datasets and infrastructure
- Community-driven development model

**Publications:**
- 5 peer-reviewed papers on Bio2RDF, SIO ontology, and Chem2Bio2RDF
- Foundational work: Belleau & Dumontier (2011)
- Release 2 improvements: Callahan et al. (2013)
- SIO ontology: Dumontier et al. (2014)
- BioSearch: Hu et al. (2017)

**Licensing and Access:**
- CC-BY 3.0 license
- Open-source code on GitHub
- Freely accessible endpoints and data

Create comprehensive resource entry for Pathway Commons, the centralized aggregation of biological pathway and molecular interaction data from 22 major public databases.

**Pathway Commons Overview:**
- Aggregates data from 22 major public pathway and interaction databases
- 4,794 pathways and 2.3+ million molecular interactions (Version 11)
- Standardized BioPAX Level 3 format for pathway representation
- Human-focused with intentional curation for supported species

**Integrated Databases:**
- Core sources: BioGRID, CORUM, CTD, DIP, DrugBank, HPRD, HumanCyc, IntAct, KEGG, MSKCC Cancer Cell Map, MINT, MirTarBase, NCI/Nature PID, PhosphoSitePlus, Reactome, RECON, SBCNY IMID, TRANSFAC

**Data Content:**
- Biochemical reactions and metabolic pathways
- Signaling pathways and molecular interactions
- Protein-protein interactions
- Gene regulatory networks
- Post-translational modifications
- Drug-target interactions
- 18,490 genes (HGNC identifiers)
- 11,437 small molecules (ChEBI, HMDB, KEGG, DrugBank)

**Access Methods:**
- Web interface (https://www.pathwaycommons.org/)
- RESTful API (http://www.pathwaycommons.org/pc/webservice.do)
- Bulk downloads in multiple formats
- Cytoscape, R/Bioconductor, and Python library integrations

**Data Formats:**
- BioPAX Level 3 (primary)
- SIF (Simple Interaction Format)
- GMT (Gene Matrix Transposed, for gene set enrichment)
- JSON-LD (Linked Data)
- SBGN-ML, SBML
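
To make the two tabular formats in this list more concrete, here is a minimal parsing sketch. It assumes SIF lines are tab-separated as `source`, `interaction type`, then one or more targets, and GMT lines are tab-separated as `set name`, `description`, then member genes. The function names and the `GENE_A`/`SET_1` sample values are hypothetical placeholders, not taken from a Pathway Commons export.

```python
def parse_sif(lines):
    """Yield (source, interaction_type, target) tuples from SIF lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        source, interaction, *targets = fields
        for target in targets:
            yield source, interaction, target


def parse_gmt(lines):
    """Return {set_name: set_of_genes} from GMT lines."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # a gene set needs a name, a description, and a gene
        name, _description, *genes = fields
        gene_sets[name] = {gene for gene in genes if gene}
    return gene_sets


# Placeholder input, for illustration only.
sif_edges = list(parse_sif(["GENE_A\tinteracts-with\tGENE_B\n", "malformed\n"]))
gmt_sets = parse_gmt(["SET_1\tdemo description\tGENE_A\tGENE_B\n"])
```

Edges in this form can be loaded into a graph library for network analysis, and the gene sets can be handed to an enrichment tool; both steps are outside the scope of this sketch.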

**Products (8 listed):**
- REST API for programmatic access
- Web interface for interactive exploration
- Data downloads in multiple formats
- Integrated BioPAX model
- SIF network format
- GMT gene set format
- JSON-LD linked data
- Comprehensive API documentation

**Technical Infrastructure:**
- Built on cPath2 open-source platform (Java)
- Paxtools library for BioPAX handling
- OpenLink Virtuoso for RDF support (optional)
- Docker containerization
- GitHub open-source development

**Use Cases:**
- Cancer genomics and precision medicine
- Systems biology network analysis
- Pathway enrichment analysis (GSEA)
- Gene and protein neighborhood discovery
- Drug target identification
- Hypothesis generation from integrated data
- Multi-omics integration

**Organizations:**
- Memorial Sloan Kettering Cancer Center (MSKCC) - Primary host
- University of Toronto - Collaborative partner

**Publications:**
- 2020 Nucleic Acids Research update (Rodchenkov et al.)
- 2011 NAR foundational paper (Cerami et al.)
- 2006 BMC Bioinformatics cPath original publication

**Status:**
- Actively maintained
- Open source (cpath2, paxtools, paxtoolsr, etc.)
- 40+ tools support BioPAX format
- Continuous updates and improvements
- Community-driven development

**License:**
- CC BY 3.0 for aggregated content
- Individual sources follow their respective licenses
- Completely free and open access

Create comprehensive resource entry for Catalogue of Life, the global authoritative taxonomic checklist of all known species on Earth.

**Catalogue of Life Overview:**
- Most comprehensive global taxonomic resource
- 2.2+ million living species + 153,000 extinct species
- 5.9+ million scientific names
- 165 peer-reviewed taxonomic databases
- 500+ global taxonomic experts

**Data Content:**
- All major organism groups (Animalia, Plantae, Fungi, Chromista, Bacteria, Archaea, Protozoa)
- 95%+ coverage of vertebrates and vascular plants
- 59,668 data sources worldwide
- Annual + monthly release cycle

**Access Methods:**
- Web portal for interactive exploration
- RESTful API for programmatic access
- Multiple download formats (ColDP, Darwin Core Archive, ACEF, TextTree, MySQL)
- ChecklistBank repository infrastructure

**Products (6 total):**
- Web portal
- RESTful API
- Annual releases
- Data downloads
- ChecklistBank repository
- Documentation

**Data Formats:**
- Catalogue of Life Data Package (ColDP) - primary
- Darwin Core Archive
- Multiple serialization options
- Permanent archival with DOI

**Use Cases:**
- Species identification and taxonomy
- IUCN Red List (172,000+ assessed species)
- Conservation planning
- Global policy implementation (CBD, CITES, IPBES)
- GBIF backbone
- Educational and citizen science

**Organizations:**
- Naturalis Biodiversity Center (Netherlands)
- Illinois Natural History Survey (USA)
- GBIF Secretariat (administrative host)

**Status:**
- Active and continuously maintained
- Monthly + annual releases
- Latest: July 2025
- Open-source development

**License:**
- CC-BY 4.0 (transitioning to CC-0)
- Free and open access
- FAIR data principles

…Issue #502)

Updated YAML frontmatter description with key statistics: 1,913 curated pathways across 27 species, 36,334 gene products, 7,052 metabolites, and active community of 201+ authors. Added extensive markdown documentation covering database content and scale, species coverage, pathway types, data access methods, technical infrastructure, use cases, organizational structure, funding, standards compliance, and citation recommendations.

Generated with Claude Code
…documentation (Issue #495)

Updated YAML description with key details: community-driven OWL-based ontology with 6,000+ terms covering environments from microscopic to planetary scales, supporting FAIR data practices and semantic interoperability. Added extensive markdown documentation covering:

- Community-driven mission and FAIR data support
- Multi-scale environmental coverage (microscopic to planetary)
- Core concepts (environmental systems, components, processes, properties)
- Disciplinary applications (microbiology, ecology, genomics, environmental health)
- OWL 2 technical development and templating methodology
- OBO Foundry membership and community governance
- Multiple download formats (OWL, OBO, JSON)
- Subset versions for specialized use cases
- Standards compliance and semantic web integration
- Cross-ontology integration with GO, CHEBI, UBERON, Mondo, etc.
- Collaborative partnerships with ESIP, UN Environment, IOC-UNESCO
- Leadership and community model
- Research applications and use cases
- Integration with KG-Microbe and bioinformatics workflows
- Citation recommendations and key resources

Generated with Claude Code
…tion

Update the GO resource entry with detailed documentation covering:
- Enhanced metadata with creation and modification dates
- Expanded description highlighting GO's role as a foundational ontology
- Comprehensive overview of the three main ontologies (BP, MF, CC)
- Detailed scope and coverage information with scale metrics
- Complete data access and products documentation
- Information on organizational structure and governance
- Integration and interoperability details with other ontologies
- Use cases and applications across research domains
- Citation information and key resources
- Leadership and community structure details

This resolves issue #496 by providing a fully documented Gene Ontology resource
entry matching the quality standards of other resource entries in the registry.

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

Update the Reactome resource entry with detailed documentation covering:
- Enhanced metadata with creation date (2003) and modification date
- Expanded description highlighting Reactome's role as a curated pathway knowledgebase
- Comprehensive overview of human biological pathway coverage
- Detailed scope and coverage of pathway domains and species
- Information on data organization and reaction representation
- Complete data access and products documentation (BioPAX, SBML, Neo4j, etc.)
- Details on curation process and quality assurance standards
- Integration and interoperability with other databases and standards
- Use cases across research and clinical applications
- Leadership structure and international consortium information
- Citation information and key resources
- Standards compliance and FAIR principles

This resolves issue #489 by providing comprehensive documentation of the Reactome
pathway database resource entry matching registry quality standards.

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

Add comprehensive product entries for Reactome's primary data formats:
- BioPAX (OWL-based pathway exchange format)
- SBML (Systems Biology Markup Language for computational modeling)
- RDF/XML (Semantic web representation)
- Neo4j Graph Database (Native graph database format)
- Flat Files TSV (Tabular pathway data)
- REST API (Programmatic web service access)
- Pathway Browser (Interactive visualization interface)

These entries document Reactome's diverse data access methods referenced in
https://reactome.org/download-data and provide users with comprehensive
information about available formats and their use cases.

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@caufieldjh caufieldjh marked this pull request as ready for review December 22, 2025 16:13
@caufieldjh caufieldjh merged commit 6f49954 into main Dec 22, 2025
6 checks passed
@caufieldjh caufieldjh deleted the resource_updates_20122025 branch December 22, 2025 16:14