Massive file restructure to match hex architecture; adding a FastAPI app skeleton #19

Draft
wants to merge 13 commits into
base: main
Empty file removed CHANGELOG.md
Empty file.
15 changes: 5 additions & 10 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,31 +19,26 @@ If you find a bug or have a feature request, please create an issue by following

1. **Fork the Repository**: Fork the repository to your own GitHub account.
2. **Create a Branch**: Create a new branch for your changes.
- Use a descriptive name for your branch, e.g., `feature/add-new-feature` or `bugfix/fix-issue`.
3. **Make Your Changes**: Implement your changes in your branch.
4. **Write Tests**: If applicable, write tests for your changes.
5. **Commit Your Changes**: Write clear and descriptive commit messages.
- **Commit Message Format**: Use the present tense. Example: `Add new feature` instead of `Added new feature`.

### Submitting a Pull Request

1. **Push Your Branch**: Push your branch to your forked repository.
2. **Open a Pull Request**: Open a pull request (PR) to the `main` branch of the original repository.
1. **Contact Jonathan Starr**: Send a message to the project manager (jring-o), who can loop you into our regular Wednesday working sessions.
2. **Push Your Branch**: Push your branch to your forked repository.
3. **Open a Pull Request**: Open a pull request (PR) to the `main` branch of the original repository.
- **Title**: Provide a descriptive title for your PR.
- **Description**: Include a detailed description of your changes, the motivation behind them, and any related issues.
3. **Review Process**:
4. **Review Process**:
- **Automatic Checks**: Your PR will undergo automated checks.
- **Review by Maintainers**: Your PR will be reviewed by the maintainers. They may request changes or provide feedback.

### Code of Conduct

Please note that this project adheres to a [Code of Conduct](link-to-code-of-conduct). By participating, you are expected to uphold this code.

### Additional Information

- **Human Verification**: During PR reviews, we strive to ensure that any contributions (especially those generated using language models) are thoroughly reviewed and verified by human maintainers for accuracy and relevance.
- **Documentation**: Ensure that your changes are well-documented. Update any relevant documentation in the project.
- **Contact**: If you have any questions or need further assistance, feel free to [contact us](link-to-contact-information).
- **Contact**: If you have any questions or need further assistance, feel free to contact Jonathan Starr (jring-o).

## Acknowledgments

Expand Down
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
Expand Up @@ -198,4 +198,4 @@
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
limitations under the License.
61 changes: 53 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,21 +3,66 @@ MOSS is a project of [OSSci](https://www.opensource.science/), an initiative of


## Overview
This project aims to visualize the intersection of open source software and scientific research.

> The Map of Open Source Science is a proof of concept and as such, nothing is accurate.
This Map of Open Source Science is a proof of concept right now and as such, nothing is accurate.

This project aims to map open source software and scientific (e.g., peer-reviewed) research in one comprehensive project. This repository houses the backend (database, API endpoints, etc.) as well as various frontends (in /src/frontends) that provide visualizations.

## [Getting Started](./scripts/README.md)
## Running

The backend is deployed in production at [backend.some-domain-we-need-to-buy.com](backend.some-domain-we-need-to-buy.com) and in beta/development at [beta.some-domain-we-need-to-buy.com](beta.some-domain-we-need-to-buy.com). To start it, run:

## Goal
Here is an earlier iteration built using Kumu. We want to build something similar but better.
> (Instructions for dependency installations go here; Contact Mark Eyer for details.)
> python3 main.py

To start a frontend, follow the instructions in its /src/frontends subdirectory. The primary web-based frontend is deployed in production at [some-domain-we-need-to-buy.com](some-domain-we-need-to-buy.com). An earlier iteration of the frontend was built using Kumu. We want to build something similar but better.
- [kumu instance](https://embed.kumu.io/6cbeee6faebd8cc57590da7b83c4d457#default)
- [demo video](https://www.youtube.com/watch?v=jZyLSRCba_M)

## Data Sources
## File Structure

```
├── CONTRIBUTING.md **Outlines how to contribute to the project. Still under construction.**
├── LICENSE **Standard Apache2 license.**
├── README.md **Information about this repository.**
├── docs **This is the directory where all the documentation is stored.**
├── main.py **Launchpoint for the app's backend. Run python3 main.py**
├── mkdocs.yml **Documentation configuration settings. (TODO: Determine if can be moved into /docs)**
├── pdm.lock **Project dependency file, generated via PDM.**
├── pyproject.toml **Project dependency settings, used by PDM.**
├── src **Source code directory.**
│ ├── backend **The backend, a standalone hub. Internally organized using [hexagonal architecture](https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)).**
│ │ ├── administration **Used by repository maintainers to hold code for internal administrative tools.**
│ │ ├── biz_logic **The main business logic "guts" of the application. Organized around the ["Harvest, Bottle, Mix"](https://docs.google.com/presentation/d/1jE0-VBikgAd-E6XSRTEkt_RxI190uVlsWg11fB6YgXw/edit?usp=sharing) architecture developed by Schwartz et al.**
│ │ │ ├── bottle
│ │ │ ├── harvest
│ │ │ │ ├── endpoint.py **The app uses RESTful endpoints to connect with frontend spokes, via FastAPI.**
│ │ │ │ └── otherfiles.py **A bit tongue-in-cheek, otherfiles.py is a placeholder for the various other files related to business logic (such as ETL pipelines).**
│ │ │ ├── mix
│ │ │ └── scripts **Directory for miscellaneous stand-alone scripts which predate our overall architecture, primarily used for harvesting.**
│ │ ├── notification **The module for centralized notifications (e.g., sending emails when background scripts complete.)**
│ │ └── persistence **The module for all things database and data persistence related.**
│ └── frontends
│ └── moss-react-app **A standalone react-based website "spoke" which makes RESTful API calls to the backend "hub"**
└── tests **A directory/module which contains all unit/integration tests for src/**
```
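As a rough illustration of how the hexagonal layout above might keep `biz_logic` independent of `persistence`, the sketch below defines a port (a `typing.Protocol`) and a dict-backed adapter. The names `HarvestStore`, `InMemoryHarvestStore`, and `record_harvest` are hypothetical and do not appear in this PR; treat this as one possible shape, not the project's design.

```python
from typing import Protocol


class HarvestStore(Protocol):
    """Port: what the business logic needs from persistence (hypothetical)."""

    def save(self, harvest_id: str, repo_urls: list[str]) -> None: ...

    def load(self, harvest_id: str) -> list[str]: ...


class InMemoryHarvestStore:
    """Adapter: a dict-backed store, e.g. for tests (hypothetical)."""

    def __init__(self) -> None:
        self._harvests: dict[str, list[str]] = {}

    def save(self, harvest_id: str, repo_urls: list[str]) -> None:
        # Copy so later caller-side mutations do not leak into the store.
        self._harvests[harvest_id] = list(repo_urls)

    def load(self, harvest_id: str) -> list[str]:
        return list(self._harvests[harvest_id])


def record_harvest(store: HarvestStore, harvest_id: str, repo_urls: list[str]) -> int:
    """Business logic: de-duplicate URLs (keeping order), persist, return the count."""
    unique = list(dict.fromkeys(repo_urls))
    store.save(harvest_id, unique)
    return len(unique)
```

Because `record_harvest` depends only on the port, a database-backed adapter could presumably be swapped in later without touching the business logic.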
## Contributing
We are using the [fork and pull](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/getting-started/about-collaborative-development-models#fork-and-pull-model) collaborative development model, and we welcome [pull requests](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork).
- Check issues for anything to work on

We are still writing up our formal procedures for contributing code to this repository. A rough draft is located [here](CONTRIBUTING.md). While we accept outside pull requests, the best way to get your contribution accepted is to contact Jon Starr (jring-o), who can connect you with our weekly technical meetings and give you a brief orientation.

We follow NumFOCUS's [code of conduct](https://numfocus.org/code-of-conduct).

## Core Maintainers

Here are the people who regularly attend the weekly meetings where we discuss the technical details of the project. People are listed alphabetically by last/family name. (TODO: Add contact email and/or GitHub username for each person.)

* Dave Bunten
* Mark Eyer
* Victor Lu
* Guy Pavlov
* Sam Schwartz ([email protected])
* Jon Starr
* Peculiar Umeh
* Max Vasiliev
* Boris Veytsman
* Susie Yu
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
103 changes: 103 additions & 0 deletions main.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
"""
Copyright 2025 the MOSS project.
Point person for this file: Sam Schwartz ([email protected])
Description:
This is the main.py file, which should be run to start the app.
This can be done by running the following command:

python3 main.py
"""

import uvicorn
from fastapi import FastAPI
from hex.biz_logic import router as biz_router


def _ini_api_app() -> FastAPI:
    """Helper/factory function for initializing and returning a FastAPI instance.

    Returns:
        FastAPI: a fresh FastAPI instance
    """
    app = FastAPI()
    return app


def _ini_hex_administration(app: FastAPI) -> FastAPI:
    """Helper function for initiating anything to do with CLI administration.

    Args:
        app (FastAPI): The FastAPI app

    Returns:
        FastAPI: The app, possibly changed with CLI-related modifications.
    """
    return app


def _ini_hex_biz(app: FastAPI) -> FastAPI:
    """Helper function for initiating anything to do with business logic.
    Specifically, adding RESTful endpoints, routers, and versioning.

    Args:
        app (FastAPI): The application

    Returns:
        FastAPI: The app, now with base routes added.
    """
    # Mount all business-logic routes under the versioned /v1 prefix.
    app.include_router(
        biz_router,
        prefix="/v1",
        tags=["v1"],
        responses={404: {"description": "Not found"}},
    )

    @app.get("/")
    def read_root():
        return {"Hello": "World"}

    return app


def _ini_hex_notification(app: FastAPI) -> FastAPI:
    """Helper function for setting up any notifications to the application.

    Args:
        app (FastAPI): The application

    Returns:
        FastAPI: The app, possibly changed with notification-related modifications.
    """
    return app


def _ini_hex_persistance(app: FastAPI) -> FastAPI:
    """Helper function for setting up database connections for the application.

    Args:
        app (FastAPI): The application

    Returns:
        FastAPI: The app, possibly changed with database-related modifications.
    """
    return app



def main() -> FastAPI:
    """Initializes the app when called from the command line.

    Returns:
        FastAPI: The FastAPI app for the api server to serve.
    """
    app = _ini_api_app()
    _ini_hex_persistance(app)
    _ini_hex_notification(app)
    _ini_hex_administration(app)
    _ini_hex_biz(app)
    return app


if __name__ == "__main__":
    app = main()
    uvicorn.run(app, host="0.0.0.0", port=8000)
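In `main()` each `_ini_*` helper's return value is discarded, which works only because every helper returns the same app object it was given. A minimal sketch of an alternative that threads the app through each step and uses the return values is shown here. To keep it dependency-free it uses a hypothetical `StubApp` class in place of `FastAPI`, and the `init_*` and `build_app` names are likewise assumptions for illustration, not code from this PR.

```python
from functools import reduce
from typing import Callable


class StubApp:
    """Hypothetical stand-in for FastAPI so this sketch has no dependencies."""

    def __init__(self) -> None:
        self.steps: list[str] = []


# An initializer takes the app and returns it (possibly modified).
Initializer = Callable[[StubApp], StubApp]


def init_persistence(app: StubApp) -> StubApp:
    app.steps.append("persistence")
    return app


def init_notification(app: StubApp) -> StubApp:
    app.steps.append("notification")
    return app


def init_administration(app: StubApp) -> StubApp:
    app.steps.append("administration")
    return app


def init_biz(app: StubApp) -> StubApp:
    app.steps.append("biz_logic")
    return app


def build_app(initializers: list[Initializer]) -> StubApp:
    """Thread one app through each initializer in order, using each return value."""
    return reduce(lambda app, init: init(app), initializers, StubApp())


app = build_app([init_persistence, init_notification, init_administration, init_biz])
print(app.steps)  # ['persistence', 'notification', 'administration', 'biz_logic']
```

With this shape, a helper that returned a wrapped or replaced app would still be honored, and the initialization order is stated in one list.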
19 changes: 19 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,25 @@ readme = "README.md"
license = {text = "MIT"}


[tool.ruff]
line-length = 88
lint.select = [
"F", # pyflakes rules
"E", # pycodestyle error rules
"W", # pycodestyle warning rules
"B", # flake8-bugbear rules
"I", # isort rules
]
lint.ignore = [
"E501", # line-too-long
]

[tool.ruff.format]
indent-style = "space"
quote-style = "single"



[tool.pdm]
distribution = false

Expand Down
4 changes: 4 additions & 0 deletions src/backend/administration/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
"""
The backend adheres to a hex architecture. See: https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)
This module contains code for internal tools that the maintainers/administrators use for maintenance.
"""
21 changes: 21 additions & 0 deletions src/backend/biz_logic/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
"""
The backend adheres to a hex architecture. See: https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)
This module contains code for all business logic of the application.
The business logic of the application is primarily driven by the "Harvest, Bottle, Mix" architecture outlined by Sam Schwartz.
See this slide deck for more details: https://docs.google.com/presentation/d/1jE0-VBikgAd-E6XSRTEkt_RxI190uVlsWg11fB6YgXw/edit?usp=sharing
In particular, this module uses FastAPI to create RESTful endpoints.
The "unsightly cables behind the desk" which connect routers to various sections of the code are also included in these
__init__.py files.
"""

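from fastapi import APIRouter

router = APIRouter(
    prefix="",
    responses={404: {"description": "Not found"}},
)

# Imported after `router` is defined, keeping the original ordering.
# `.harvest` resolves to the same module as `..biz_logic.harvest`.
from .harvest.endpoint import router as harvest_router  # noqa: E402

router.include_router(harvest_router)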

17 changes: 17 additions & 0 deletions src/backend/biz_logic/bottle/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
"""
The Bottle module is for anything related to extracting repository data from various sources, and,
specifically, providing RESTful CRUD (create, read, update, delete) operations relating to this repository data,
providing the interface with our persistence layer.

These sources could include:
GitHub's multiple APIs (default repo information, contributor networks, SBOMs, etc.)
Google's BigQuery data about a repository
PyPI data about a repository
Data provided by cloning the repository and mining with PyDriller
Custom data provided by a user
and so on.

Note: Each of these data bottles is based on a proper subset of repositories stored in a harvest. A bottle can contain
data for one repository, or it can contain the data for many repositories. Each bottle will contain the same fields for
all repositories. It will also contain meta information about when the bottling happened.
"""
16 changes: 16 additions & 0 deletions src/backend/biz_logic/harvest/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
"""
The Harvest module is for anything related to extracting raw lists of repositories from various sources, and,
specifically, providing RESTful CRUD (create, read, update, delete) operations relating to these lists of repositories,
providing the interface with our persistence layer.

These sources could include:
GitHub search results
Spack and other build system configuration files
arXiv
University websites
URLs within scientific papers
and so on.

Note: Harvesting only refers to repositories (and the metadata about how they were harvested) themselves; harvesting
does not include any data associated with the repository. (That comes in the bottling stage.)
"""