[DMP 2026]: Minimum Viable Product (MVP) requirements #8

@Sunilshah-7

Ticket Contents

Description

  1. Decide on a platform to host the UN-SDG classifier tool so that maintainers can start using it. GitHub Pages is a good option for static content, but the tool also has dynamic parts: it fetches data from third-party APIs and gets predictions from sentence transformers, so hosting it on GitHub Pages might be tricky. Feel free to change the architecture if needed. To be clear, we don't have a budget for hosting, so GitHub Pages is our first choice; if its limited capabilities rule it out, Vercel or PythonAnywhere are other hosting options.
    Link to Vercel: https://vercel.com/
    Link to PythonAnywhere: https://www.pythonanywhere.com/

  2. Test the tool on the open source projects in the Digital Public Goods (DPG) Registry. Link: https://www.digitalpublicgoods.net/registry. Determine where our tool disagrees with the DPG Alliance Registry in identifying SDGs. Each project in the registry has an SDG Relevance section that lists the SDGs the project is actually relevant to. We want to run the tool on these projects and measure how much of its output matches the SDG Relevance shown on the DPG website.

Goals & Mid-Point Milestone

Goals

  • Create a spreadsheet of test data that contains, for each project, the SDG relevance predicted by our tool and the actual SDG relevance according to the DPG Alliance registry
  • Improve the accuracy of the tool
  • Search for datasets or relevant models for this project
  • Decide on the hosting platform for the tool
  • Implement the new architecture to host our tool
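
As a starting point for the test-data spreadsheet, here is a minimal sketch that writes one row per tested project to CSV. The column names, the semicolon-separated SDG encoding, and the `format_sdgs` / `write_rows` helpers are assumptions made for illustration; the ticket does not prescribe a layout.

```python
import csv
import io

# Hypothetical column names; the ticket does not prescribe a spreadsheet layout.
FIELDNAMES = ["project", "input_text", "predicted_sdgs", "actual_sdgs"]


def format_sdgs(sdgs):
    """Serialize SDG numbers as a sorted, de-duplicated, semicolon-separated string."""
    return ";".join(str(n) for n in sorted(set(sdgs)))


def write_rows(rows, handle):
    """Write a header plus one row per tested project to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "project": row["project"],
            "input_text": row["input_text"],
            "predicted_sdgs": format_sdgs(row["predicted"]),
            "actual_sdgs": format_sdgs(row["actual"]),
        })


# Placeholder record for illustration only; this is not real test data.
example_rows = [
    {"project": "example-project", "input_text": "Placeholder text.",
     "predicted": [4, 3], "actual": [3]},
]
buffer = io.StringIO()
write_rows(example_rows, buffer)
csv_text = buffer.getvalue()
```

Writing to a real file works the same way: pass `open("results.csv", "w", newline="")` as the handle instead of the in-memory buffer.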

Setup/Installation

See the Manual Approach section of the Readme.md file to set up and install the project.

Expected Outcome

The expected outcome of this program is a publicly accessible tool: anyone should be able to use it and get the JSON or YAML output file from it. The tool's output is also expected to be as close as possible to the SDG Relevance listed in the DPG registry.

Acceptance Criteria

There are three acceptance criteria for this feature:

  • 100 test results from the DPG registry.
  • 85% accuracy on the DPG registry test results.
  • Set up the hosting platform and a CI/CD pipeline that deploys the code to production.
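
The ticket does not define how accuracy is computed, so the sketch below shows two possible readings rather than the official metric: exact-match accuracy (the predicted SDG set must equal the registry's set) and mean Jaccard overlap (partial credit for partially correct predictions). The function names and the semicolon-separated cell format are assumptions for illustration.

```python
def parse_sdgs(cell):
    """Parse a cell such as '3;4' into the set {3, 4}; an empty cell gives an empty set."""
    return {int(part) for part in cell.split(";") if part.strip()}


def exact_match_accuracy(pairs):
    """Fraction of projects whose predicted SDG set equals the actual SDG set."""
    if not pairs:
        return 0.0
    hits = sum(1 for predicted, actual in pairs if predicted == actual)
    return hits / len(pairs)


def mean_jaccard(pairs):
    """Average of |predicted & actual| / |predicted | actual| over all projects."""
    if not pairs:
        return 0.0
    total = 0.0
    for predicted, actual in pairs:
        union = predicted | actual
        # Two empty sets agree completely, so count that case as 1.0.
        total += len(predicted & actual) / len(union) if union else 1.0
    return total / len(pairs)


# Placeholder (predicted, actual) pairs for illustration only.
pairs = [
    (parse_sdgs("3;4"), parse_sdgs("4;3")),
    (parse_sdgs("1"), parse_sdgs("1;2")),
]
exact = exact_match_accuracy(pairs)
overlap = mean_jaccard(pairs)
```

Whichever definition is adopted, it should be agreed with the mentor before the 85% threshold is checked, since the two metrics can differ widely on the same data.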

Implementation Details

  1. The tool currently uses Next.js for the frontend and Flask for the backend. If a different architecture suits your chosen hosting platform better, feel free to migrate to it.
  2. To test the tool, find the most relevant text for each DPG project, either from the Readme file of its GitHub repository or from the About section of the project's website, and enter it into the tool. Record the JSON output the tool provides, and make a spreadsheet of all the inputs and outputs.
  3. Analyze the spreadsheet using Jupyter to find areas where the tool can be improved.
  4. Two external APIs, the Aurora API and the OSDG API, are integrated into the tool. Both are rate limited: you can only make 1 request per second, so please keep the request logic within that limit so that the API servers don't get overwhelmed and return a "Too many requests" error.
  5. The tool is designed so that no paid services are involved. Please maintain that, and don't add any code or services that would require the tool to use paid services.
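
One way to respect the 1 request per second limit from item 4 is a small client-side throttle that enforces a minimum gap between calls. This is a minimal sketch, not the tool's existing request logic: the `Throttle` class and `call_with_throttle` helper are hypothetical names, and the sketch is not thread-safe, so a multi-worker Flask deployment would need a lock or a shared limiter.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the throttle can be tested without waiting
        self._sleep = sleep
        self._last_call = None   # time of the most recent permitted call

    def wait(self):
        """Block until at least min_interval has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def call_with_throttle(throttle, func, *args, **kwargs):
    """Wait for the throttle, then call func, for example a function that sends one API request."""
    throttle.wait()
    return func(*args, **kwargs)
```

Using one `Throttle` instance per external API keeps the Aurora and OSDG limits independent, assuming each service tracks its own limit.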

Mockups/Wireframes

No response

Product Name

CHAOSS UN-SDG Classifier Tool

Organisation Name

CHAOSS

Domain

Open Source Library

Tech Skills Needed

Artificial Intelligence, CI/CD, Debugging, Flask, HTML, JavaScript, Machine Learning, Python, React, TypeScript

Mentor(s)

David Lippert

Category

API, Backend, CI/CD, Data Science, Deployment, Frontend, Maintenance, Performance Improvement, Refactoring, Testing
