
Add credit class schemas to LinkML model #28

Open: wants to merge 22 commits into base: main


paul121 (Collaborator) commented Dec 27, 2024

WIP PR to add schemas for BT01 credit class + project + batches in schemas/src/BT01 directory

Closes #14

paul121 (Collaborator, Author) left a comment

@S4mmyb just opening a draft so we can use comments to discuss further changes to this.

Collaborator Author

@S4mmyb right now this is the BT01 credit class file you had started, except I have modified the ranges to be valid and work with linkml. The issue was we had ranges like schema:Duration and URL, but ranges need to be proper data types that linkml understands OR custom data classes.

I'm still not sure what our best option is for specifying an ISO 8601 Duration datatype. I wonder if this might be an easy contribution to the LinkML core data types (they have other 8601 date/time types). For now I have specified this in the description, e.g. "Crediting term duration for the project. An ISO 8601 duration.".
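One possible interim option, sketched here under assumptions, is a custom LinkML type that constrains a string with a pattern. The type name `Iso8601Duration`, the slot name `creditingTerm`, and the schema `id` are hypothetical, and the regular expression is a deliberately loose approximation (it also accepts a bare `P`), not a full ISO 8601 validator.

```yaml
id: https://example.org/schema/duration-sketch   # placeholder id
name: duration_sketch
prefixes:
  linkml: https://w3id.org/linkml/
  xsd: http://www.w3.org/2001/XMLSchema#
imports:
  - linkml:types

types:
  Iso8601Duration:                # hypothetical type name
    typeof: string
    uri: xsd:duration
    description: An ISO 8601 duration such as P10Y or P1Y6M.
    # Loose pattern: years, months, weeks, days, then an optional time part.
    pattern: '^P(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(\d+H)?(\d+M)?(\d+S)?)?$'

slots:
  creditingTerm:                  # hypothetical slot name
    range: Iso8601Duration
    description: Crediting term duration for the project. An ISO 8601 duration.
```

With something like this in place, the "An ISO 8601 duration" note would be enforced by the range instead of living only in the description.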

Comment on lines 59 to 62
SDGs:
  description: List of relevant Sustainable Development Goals for this impact.
  range: SDG
  multivalued: true
Collaborator Author

@S4mmyb currently SDGs as a slot name doesn't fit our linkml linting configuration (slots should be lower camel case). What about just SDG?

In the past I have tried to avoid plural wording for technical properties and stick with singular words everywhere, knowing that some singular properties might indeed have multiple values. The thinking is that the schema specifies whether a property is multivalued, not the name itself. Curious what you think about this more generally.

Collaborator Author

I made this change to rename the slots to (lowercase) sdg.
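As a rough illustration of the singular naming convention discussed above, the renamed slot could look like the fragment below. The description, range, and `multivalued` flag come from the reviewed snippet; only the slot name changes, and the cardinality is carried by `multivalued` instead of a plural name.

```yaml
slots:
  sdg:                      # singular, lower camel case slot name
    description: List of relevant Sustainable Development Goals for this impact.
    range: SDG
    multivalued: true       # the schema, not the name, says there can be many
```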

paul121 (Collaborator, Author) left a comment

Things to pull out of BT01 Credit Class into a common credit class:

  • Methodology/Methodology list (added for all credit classes)
  • SDG

Used on both credit classes and projects

  • Primary impact
  • Project benefit

I'll start with these changes and create the new structure, then we can continue iterating from there.

paul121 force-pushed the semantic-enumerations branch 2 times, most recently from ee7e543 to 84ea3da, February 21, 2025 18:52
Base automatically changed from semantic-enumerations to main February 21, 2025 19:45
Comment on lines 43 to 50
MethodologyList:
  class_uri: rfs:MethodologyList
  description: List of methodologies for credit generation.
  attributes:
    itemListElement:
      range: Methodology
      multivalued: true
      description: "List of methodology items."
Collaborator Author

I don't think we need this intermediary class. I see this was added to try and model https://schema.org/ItemList

Instead, I think we can just reference Methodology with multivalued: true

Collaborator Author

I've made this change
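A minimal sketch of the simplification described above, dropping the intermediary list class and pointing a multivalued slot straight at Methodology. The slot name `methodology` is an assumption for illustration; the actual name used in the PR is not shown in this thread.

```yaml
slots:
  methodology:              # hypothetical slot name
    description: List of methodologies for credit generation.
    range: Methodology
    multivalued: true       # replaces MethodologyList.itemListElement
```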

paul121 (Collaborator, Author) left a comment

Okay, I've just rebased on previously merged PRs, these changes should be simpler now.

Unfortunately moving the BT01 Project + Credit class into a src/BT01 subdirectory is breaking the linkml lint and thus builds are failing ☹️ - but the gen-doc still works locally. It seems like a bug with the linkml lint command, so for now I'll just remove the BT01 subdirectory. Here is the error: https://github.com/regen-network/regen-data-standards/actions/runs/13501803621/job/37721823950#step:5:59

Comment on lines 20 to 25
classes:
  CreditClassInfo:
    class_uri: rfs:CreditClassInfo
    abstract: true
    title: Credit Class info
    description: Base class with common fields used to define credit class info.
Collaborator Author

I learned that there is additional common metadata that we can use on all parts of the data model. The title is nice because it gives a more human-readable string that is rendered at the top of the markdown pages (Credit Class info instead of CreditClassInfo).

abstract is also useful to demonstrate that this is an abstract base class and should not be instantiated/used directly.
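To make the role of `abstract` concrete, here is a small sketch in which a concrete class extends the abstract base through `is_a`. The subclass name `BT01CreditClassInfo` and its title are hypothetical; only `CreditClassInfo` and its metadata come from the reviewed snippet.

```yaml
classes:
  CreditClassInfo:
    abstract: true                       # base class, not instantiated directly
    title: Credit Class info
    description: Base class with common fields used to define credit class info.

  BT01CreditClassInfo:                   # hypothetical concrete subclass
    is_a: CreditClassInfo                # inherits the common fields
    title: BT01 Credit Class info
    description: Credit class info specific to BT01.
```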

slot_uri: schema:name
required: true
description: Name of the credit class.
rank: 1
Collaborator Author

rank is another piece of common metadata that we can use to set the order of slots.
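For example, two slots could be ordered as in the sketch below, where lower rank values are meant to come first. The second slot and its rank are assumptions added for illustration, and how strictly a given generator honors `rank` is not confirmed in this thread.

```yaml
slots:
  name:
    slot_uri: schema:name
    required: true
    description: Name of the credit class.
    rank: 1                 # intended to be listed first
  description:              # hypothetical second slot
    slot_uri: schema:description
    description: Description of the credit class.
    rank: 2                 # intended to follow name
```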

Comment on lines 12 to 14
- SDG
- ProjectBenefit
- PrimaryImpact
Collaborator Author

I created these classes in separate yaml files because these classes might be used outside of credit classes. In each of these files I also defined slots that can be used to reference these classes. It's possible that a credit class might have multiple slots referencing the same class for different semantic purposes, in which case each class can define these specific slots themselves.
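The layout described above might be sketched as two schema files, shown here as two YAML documents separated by `---`. File names, schema ids, and the `projectBenefit` slot name are assumptions for illustration; the PR's real files may differ.

```yaml
# ProjectBenefit.yaml (hypothetical file): the class plus a reusable slot
id: https://example.org/schema/ProjectBenefit     # placeholder id
name: ProjectBenefit
prefixes:
  linkml: https://w3id.org/linkml/
imports:
  - linkml:types
classes:
  ProjectBenefit:
    description: A benefit delivered by a project.
slots:
  projectBenefit:                 # hypothetical slot that references the class
    range: ProjectBenefit
    multivalued: true
---
# CreditClass.yaml (hypothetical file): imports the shared class and slot
id: https://example.org/schema/CreditClass        # placeholder id
name: CreditClass
prefixes:
  linkml: https://w3id.org/linkml/
imports:
  - linkml:types
  - ProjectBenefit                # local import, as in the reviewed snippet
classes:
  CreditClassInfo:
    abstract: true
    slots:
      - projectBenefit            # reuse the slot defined next to its class
```

If a credit class needed a second, semantically different reference to the same class, it could define that extra slot itself while still importing the shared definition.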

Collaborator Author

Instead of "Primary Impact", would it make sense to rename this to a generic "Project Impact" class? That way projects, credit classes, etc. could reference multiple impacts, not only a single primary impact.

Comment on lines 24 to 26
attributes:
  id:
    identifier: true
Collaborator Author

Some of these classes have id attributes, I believe from @S4mmyb's initial migration trying to match the existing schemas. I added the identifier property so that linkml treats them as such. But I'm curious whether we actually want full identifiers for impacts, benefits, etc. If we are using identifiers, does that mean we should be pulling from a taxonomy and have a semantic enum?

I need to review existing data more, just leaving the comment while it's on my mind. cc @clevinson

Member

I don't think it necessarily means that we need a semantic enum, but it should be something that is a full URI.

From the LinkML docs:

the range of an identifier can be any type, but it is a good idea to have these be of type Uriorcurie

A class must not have more than one identifier (asserted or derived). identifier marks the primary identifier.

I'm fine with us using app.regen.network/credit-class/C01 , or http://mainnet.regen.network:1317/regen/ecocredit/v1/class/C01 for these, or some other resource list that we can work on making fully resolvable later (e.g. registry.regen.network/C01 for any arbitrary resource that exists on-chain)
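Following the LinkML guidance quoted above, the attribute could declare a URI-typed range explicitly. This is a sketch: the enclosing class name `CreditClassInfo` is borrowed from elsewhere in this PR for illustration, and the description text is an assumption.

```yaml
classes:
  CreditClassInfo:
    attributes:
      id:
        identifier: true          # at most one identifier per class
        range: uriorcurie         # full URI or CURIE, per the LinkML docs
        description: Full URI identifying this resource.
```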

S4mmyb (Member) commented Feb 28, 2025

  • Primary Impact and Secondary Benefits should just be combined into one impact data type. Within the Credit Class and Project schemas we can add a field to reference which one is the primary; the rest are considered secondary and are unordered (i.e. no rank).
  • For now let's migrate the cobenefit list to a secondary benefit list, but we might make the decision to stick with the name co-benefits
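One way the combined impact type could be modeled is sketched below: a single class, one single-valued slot for the primary impact, and one multivalued slot for the unordered secondary ones. The names `Impact`, `primaryImpact`, and `secondaryBenefit` are hypothetical and not taken from the PR.

```yaml
classes:
  Impact:                         # hypothetical combined impact type
    description: An ecological or social impact of a credit class or project.

slots:
  primaryImpact:                  # hypothetical slot name
    range: Impact
    description: The single primary impact.
  secondaryBenefit:               # hypothetical slot name (formerly co-benefits)
    range: Impact
    multivalued: true             # unordered, no rank among secondary benefits
    description: Secondary benefits.
```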

S4mmyb (Member) commented Feb 28, 2025

  • Source Registry should stay in
  • Things that I think are missing in the current Credit Class schema:
    • program which refers to the program the protocol is using. This is optional
    • creditProtocol is the root document of the credit protocol that defines the requirements and process to register and issue credits under a credit protocol. It could be registered under a program or independent. It could include the methods to measure and monitor (like ERA) or it could reference other methodology documents (like CarbonPlus or Ecometric).
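The two missing fields could be sketched as slots like the ones below. The slot names come from the comment above; the `uriorcurie` ranges are placeholders chosen for illustration, since the thread does not say which class or type these slots should point to, and the `required` value on `creditProtocol` is left unset for the same reason.

```yaml
slots:
  program:
    description: The program the protocol is using.
    required: false               # stated as optional in the comment above
    range: uriorcurie             # placeholder range, an assumption
  creditProtocol:
    description: >-
      Root document of the credit protocol that defines the requirements and
      process to register and issue credits under a credit protocol.
    range: uriorcurie             # placeholder range, an assumption
```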

clevinson (Member) left a comment

A few comments here from my call with @paul121 a few weeks ago. I'll go ahead and work on a PR to resolve these issues and to address the comments raised by @S4mmyb.


clevinson (Member) commented:

  • Source Registry should stay in

  • Things that I think are missing in the current Credit Class schema:

    • program which refers to the program the protocol is using. This is optional
    • creditProtocol is the root document of the credit protocol that defines the requirements and process to register and issue credits under a credit protocol. It could be registered under a program or independent. It could include the methods to measure and monitor (like ERA) or it could reference other methodology documents (like CarbonPlus or Ecometric).

Added these in 07ccd43

clevinson (Member) left a comment

Notes from a live review with @S4mmyb today.

description: Eligible activities for registered projects.
multivalued: true
todos:
- What's the relationship between creditProtocol and approvedMethodologies, in terms of which is required and not? Should one be folded into the other, or atleast align on a single class for the fields' ranges?
Member

The current descriptions are good enough. We would like to update the slot names to be better aligned with RDF-style predicate names.

clevinson changed the title from "Add BT01 schemas" to "Add credit class schemas to LinkML model" Mar 20, 2025
clevinson (Member) commented:

Follows from this PR:


Successfully merging this pull request may close these issues.

Add credit class definitions to schema folder
3 participants