Feat/pypi stats #5

Open
wants to merge 7 commits into
base: main
3 changes: 3 additions & 0 deletions .config
@@ -0,0 +1,3 @@
[DEFAULT]
REPO_FILE_PATH=repository_list.tsv
DEBUG=False
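
`main.py` looks up `REPO_FILE_PATH` via `get_config_var("DEFAULT", "REPO_FILE_PATH")`, but the helper itself lives in `src/utils` and is not shown in this diff. As a rough sketch only, a lookup like this could be built on the standard-library `configparser`; the name `read_config_var` and its `path` parameter are hypothetical, and returning `None` for a missing key is an assumption based on how `main()` checks the result.

```python
import configparser
from typing import Optional


def read_config_var(section: str, key: str, path: str = ".config") -> Optional[str]:
    """Hypothetical stand-in for get_config_var; returns None if the key is absent."""
    parser = configparser.ConfigParser()
    # parser.read() silently skips files that do not exist, so a missing
    # .config simply falls through to the fallback below.
    parser.read(path)
    # Option names are case-insensitive by default, so REPO_FILE_PATH
    # and repo_file_path refer to the same entry.
    return parser.get(section, key, fallback=None)
```

Values come back as strings, so `DEBUG=False` is read as the text `"False"`; `ConfigParser.getboolean` is the usual way to turn it into a real boolean.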
24 changes: 24 additions & 0 deletions .editorconfig
@@ -0,0 +1,24 @@
# http://editorconfig.org

root = true

[*]
indent_style = space
indent_size = 4
trim_trailing_whitespace = true
insert_final_newline = true
charset = utf-8
end_of_line = lf

[*.bat]
indent_style = tab
end_of_line = crlf

[LICENSE]
insert_final_newline = false

[Makefile]
indent_style = tab

[*.{yml,yaml}]
indent_size = 2
34 changes: 34 additions & 0 deletions .github/workflows/actions.yml
@@ -0,0 +1,34 @@
name: specdatri reporting
on:
  schedule:
    - cron: 0 0 * * 1 # At 00:00 on Monday
jobs:
  build:
    runs-on: ubuntu-22.04
    steps:
      - name: checkout repo content
        uses: actions/checkout@v4
      - name: setup python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: install python packages
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: execute py script
        env:
          github_token: '${{ secrets.github_token }}'

Review comment (Member): need to create and add the correct token

          pepy_x_api_key: '${{ secrets.pepy_x_api_key }}'
        run: python main.py
      - name: commit files
        run: |
          git config --local user.email "[email protected]"
          git config --local user.name "GitHub Action"
          git add -A
          git diff-index --quiet HEAD || (git commit -a -m "updated files" --allow-empty)
      - name: push changes
        uses: ad-m/[email protected]
        with:
          github_token: '${{ secrets.github_token }}'
          branch: main
25 changes: 25 additions & 0 deletions .github/workflows/codeql-analysis.yml
@@ -0,0 +1,25 @@
name: "Code Scanning with CodeQL"

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  schedule:
    - cron: '40 17 * * 3'

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-22.04
    permissions:
      security-events: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Initialize
        uses: github/codeql-action/init@v3
        with:
          languages: python
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
35 changes: 35 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,35 @@
exclude: '.*\.tsv$'
default_stages: [pre-commit]

default_language_version:
  python: python3.12

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-json
      - id: check-toml
      - id: check-xml
      - id: check-yaml
      - id: debug-statements
      - id: check-builtin-literals
      - id: check-case-conflict
      - id: check-docstring-first
      - id: detect-private-key

  # Run the Ruff linter.
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.8.3
    hooks:
      # Linter
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]

# sets up .pre-commit-ci.yaml to ensure pre-commit dependencies stay up to date
ci:
  autoupdate_schedule: weekly
  skip: []
  submodules: false
42 changes: 42 additions & 0 deletions README.md
@@ -1,2 +1,44 @@
# specdatri_reporting
The code base includes a collection of scripts and GitHub Actions designed to gather various metrics on RECETOX's impact.

## Local development

### Project setup
It is assumed you can clone and change directories into the development repo.
Create a virtualenv or conda environment (whatever your poison)
Review comment (Member), suggested change:
- Create a virtualenv or conda environment (whatever your poison)
+ Create a virtualenv or conda environment (whatever your poison).


Once in the repos directory, activate your env then run the following command to install th needed python libraries.
Review comment (Member), suggested change:
- Once in the repos directory, activate your env then run the following command to install th needed python libraries.
+ Once in the repos directory, activate your env then run the following command to install the needed python libraries.


> pip install -r .\requirements\local.txt

### Simulating GitHub Actions

You need [act](https://nektosact.com/) to test your code in development mode.
Install act for your chosen OS.
In your terminal, run the following to simulate the scheduled GitHub Action on your local device:

> act --secret-file .env schedule

### Things to note

1: Do not push local development changes from the `tmp` and `reports` folders. In fact, do not edit them at all!

2: When testing with `act` do not use a token that has the permission to make push requests else your test data wil mess with "production" data
Review comment (Member), suggested change:
- 2: When testing with `act` do not use a token that has the permission to make push requests else your test data wil mess with "production" data
+ 2: When testing with `act` do not use a token that has the permission to make push requests else your test data will mess with "production" data.


3: When testing with `act` know that the push may fail due to the fact that you can't directly push to main
Review comment (Member), suggested change:
- 3: When testing with `act` know that the push may fail due to the fact that you can't directly push to main
+ 3: When testing with `act` know that the push may fail due to the fact that you can't directly push to main.


4: Always,I repeat always devlop on another branch not main and never push directly to main.
Review comment (Member), suggested change:
- 4: Always,I repeat always devlop on another branch not main and never push directly to main.
+ 4: Always, I repeat always devlop on another branch not main and never push directly to main.


5: You need tokens to test the code locally. Place them in `example.env` and rename the file to `.env`.
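
Note 5 implies that the scripts read `github_token` and `pepy_x_api_key` from the environment. `get_env_var` is defined in `src/utils` and is not shown in this diff, so the following is only a sketch of how such a helper might fail fast on a missing token. The name `require_env_var` is hypothetical, and loading the `.env` file itself (for example with `python-dotenv`, which is pinned in `requirements/base.txt`) is assumed to have happened beforehand.

```python
import os


def require_env_var(name: str) -> str:
    """Hypothetical helper: return the variable's value or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Pointing at example.env makes the fix obvious to a new contributor.
        raise RuntimeError(f"Environment variable {name!r} is not set; see example.env")
    return value
```

Raising early like this would surface a missing token before any API call is attempted, rather than as an authentication error later.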

### Running tests:

#### Running with unittest
> python -m unittest discover -s tests

#### Running with coverage
> coverage run -m unittest discover -s tests

> coverage report -m

> coverage html
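
`unittest discover -s tests` collects modules matching `test*.py` inside `tests/`. The test modules are not part of this excerpt, so here is a minimal sketch of what one could look like, assuming a hypothetical file `tests/test_headers.py`. The function under test is defined inline to keep the example self-contained; it mirrors `_get_headers` from `src/github.py`, which a real test would import instead.

```python
import unittest


def build_headers(github_token: str) -> dict:
    # Inline stand-in mirroring _get_headers in src/github.py (assumption:
    # a real test module would import the function rather than copy it).
    return {
        "Accept": "application/vnd.github.v3+json",
        "X-GitHub-Api-Version": "2022-11-28",
        "Authorization": f"Bearer {github_token}",
    }


class TestBuildHeaders(unittest.TestCase):
    def test_authorization_uses_bearer_scheme(self):
        headers = build_headers("abc")
        self.assertEqual(headers["Authorization"], "Bearer abc")

    def test_api_version_is_pinned(self):
        headers = build_headers("abc")
        self.assertEqual(headers["X-GitHub-Api-Version"], "2022-11-28")
```

Because the test never touches the network, it can run under `coverage` without any token in the environment.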
2 changes: 2 additions & 0 deletions example.env
@@ -0,0 +1,2 @@
github_token="your_github_token"
pepy_x_api_key="your_pepy_x_api_key"
72 changes: 72 additions & 0 deletions main.py
@@ -0,0 +1,72 @@
import pandas as pd
from pandas import DataFrame

from src.github import process_github_repositories
from src.pypi import process_pypi_repositories
from src.utils import get_config_var, get_env_var, log_function, setup_logger

logger = setup_logger()


@log_function(logger)
def load_repositories(
    file_path: str,
) -> DataFrame:
    """
    Reads a list of repositories from a TSV file and returns it as a DataFrame.

    :param file_path: Path to the TSV file containing the list of repositories.
    :return: DataFrame containing the list of repositories.
    """
    return pd.read_csv(file_path, sep="\t")


@log_function(logger)
def process_repositories(
    repositories_df: DataFrame,
    github_token: str,
    pepy_x_api_key: str,
):
    """
    Args:
        repositories_df (DataFrame): DataFrame containing the list of repositories.
        github_token (str): GitHub token to access the GitHub API.
    Returns:
        None
    """
    for _, row in repositories_df.iterrows():
        source = row["source"].lower()
        repository = row["repository"]
        action = row["action"]
        project = row["project"]
        package = row["package"]
        if source == "github":
            owner, repo = repository.split("/")
            process_github_repositories(
                owner, repo, github_token, action, project, package
            )
        elif source == "pypi":
            process_pypi_repositories(
                package, pepy_x_api_key, action, project
            )
        else:
            logger.error(f"Unknown source: {source}")


@log_function(logger)
def main():
    repo_file_path = get_config_var("DEFAULT", "REPO_FILE_PATH")
    if repo_file_path:
        logger.info("REPO_FILE_PATH found in .config file")
        repositories_df = load_repositories(repo_file_path)
        github_token = get_env_var("github_token")
        pepy_x_api_key = get_env_var("pepy_x_api_key")
        process_repositories(repositories_df, github_token, pepy_x_api_key)
        logger.debug(f"Repositories DataFrame: \n{repositories_df}")
    else:
        logger.error("REPO_FILE_PATH not found in .config file")
        print("REPO_FILE_PATH not found in .config file")


if __name__ == "__main__":
    main()
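
`process_repositories` expects every TSV row to provide `source`, `repository`, `action`, `project`, and `package`. For GitHub rows it unpacks `repository.split("/")` into two names, which raises `ValueError` whenever the value is not exactly `owner/repo`. The sketch below illustrates that row shape and a more defensive split. The sample rows (including the `downloads` action for PyPI) and the helper names `split_repository` and `plan_rows` are made up for illustration, and the standard-library `csv` module stands in for pandas to keep the example dependency-free.

```python
import csv
import io

# Illustrative rows only; the real repository_list.tsv is not part of this diff.
SAMPLE_TSV = (
    "source\trepository\taction\tproject\tpackage\n"
    "github\texample-org/example-repo\tclones\texample-project\texample-package\n"
    "pypi\texample-package\tdownloads\texample-project\texample-package\n"
)


def split_repository(repository: str) -> tuple[str, str]:
    """Split 'owner/repo' into its two parts, rejecting anything else."""
    parts = repository.strip().split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Expected 'owner/repo', got {repository!r}")
    return parts[0], parts[1]


def plan_rows(tsv_text: str) -> list[tuple[str, ...]]:
    """Return one tuple per row describing which processor it would be sent to."""
    plan = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        source = row["source"].lower()
        if source == "github":
            owner, repo = split_repository(row["repository"])
            plan.append(("github", owner, repo, row["action"]))
        elif source == "pypi":
            plan.append(("pypi", row["package"], row["action"]))
        else:
            plan.append(("unknown", source))
    return plan
```

Validating the split up front would turn a malformed row into a readable error message instead of an unpacking failure in the middle of the loop.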
Empty file added reports/.gitkeep
Empty file.
File renamed without changes.
1 change: 1 addition & 0 deletions requirements.txt
@@ -0,0 +1 @@
-r requirements/base.txt
4 changes: 4 additions & 0 deletions requirements/base.txt
@@ -0,0 +1,4 @@
pandas==2.2
orjson==3.10.15
requests==2.32.3
python_dotenv==1.0.1
3 changes: 3 additions & 0 deletions requirements/local.txt
@@ -0,0 +1,3 @@
-r base.txt
# Used in development
pre-commit==4.1.0
102 changes: 102 additions & 0 deletions src/github.py
@@ -0,0 +1,102 @@
import requests
from src.reports import write_make_request_response

from .utils import (
    log_function,
    make_api_request,
    setup_logger,
)

logger = setup_logger()


def _get_headers(github_token: str) -> dict:
    return {
        "Accept": "application/vnd.github.v3+json",
        "X-GitHub-Api-Version": "2022-11-28",
        "Authorization": f"Bearer {github_token}",
    }


@log_function(logger)
def get_clone_stats(
    owner: str,
    repo: str,
    github_token: str,
) -> requests.Response:
    """
    Fetches the clone statistics for a given GitHub repository.

    Args:
        owner (str): The owner of the repository.
        repo (str): The name of the repository.
        github_token (str): The GitHub token.

    Returns:
        requests.Response: The API response containing the clone statistics.
    """

    url = f"https://api.github.com/repos/{owner}/{repo}/traffic/clones"
    headers = _get_headers(github_token)
    response = make_api_request(http_method="GET", url=url, headers=headers)
    return response


@log_function(logger)
def get_repo_views(
    owner: str,
    repo: str,
    github_token: str,
) -> requests.Response:
    """
    Fetches the view statistics for a given GitHub repository.

    Args:
        owner (str): The owner of the repository.
        repo (str): The name of the repository.
        github_token (str): The GitHub token.

    Returns:
        requests.Response: The API response containing the view statistics.
    """

    url = f"https://api.github.com/repos/{owner}/{repo}/traffic/views"
    headers = _get_headers(github_token)
    response = make_api_request(http_method="GET", url=url, headers=headers)
    return response


@log_function(logger)
def process_github_repositories(
    owner: str,
    repo: str,
    github_token: str,
    action: str,
    project: str,
    package: str,
):
    """
    Processes the specified GitHub repository to fetch clone and view statistics.

    Args:
        owner (str): The owner of the GitHub repository.
        repo (str): The name of the GitHub repository.
        github_token (str): The GitHub token to access the GitHub API.
        action (str): The action to be performed on the repository.
        project (str): The project name.
        package (str): The specific package name.

    Returns:
        None

    Logs:
        Logs the clone and view statistics for the specified repository.
    """
    if action == "clones":
        clone_stats = get_clone_stats(owner, repo, github_token)
        write_make_request_response(clone_stats, project, package, "github", "clones")
    elif action == "views":
        view_stats = get_repo_views(owner, repo, github_token)
        write_make_request_response(view_stats, project, package, "github", "views")
    else:
        logger.error(f"Invalid action: {action}")
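
The two fetchers return the raw response from GitHub's `traffic/clones` and `traffic/views` endpoints, and how `write_make_request_response` stores it is not shown in this diff. To illustrate what a report step would be working with, here is a small sketch that summarises a payload in the shape GitHub documents for these endpoints (a total `count`, a `uniques` figure, and a per-day list under `clones` or `views`). The helper name `summarize_traffic` and the numbers in the sample payload are made up.

```python
import json

# Illustrative payload following the documented shape of the traffic/clones
# endpoint; the values are invented for this example.
SAMPLE_CLONES = json.loads(
    """
    {
      "count": 5,
      "uniques": 3,
      "clones": [
        {"timestamp": "2024-01-01T00:00:00Z", "count": 2, "uniques": 1},
        {"timestamp": "2024-01-02T00:00:00Z", "count": 3, "uniques": 2}
      ]
    }
    """
)


def summarize_traffic(payload: dict, key: str) -> dict:
    """Collapse a traffic payload into totals plus the busiest day.

    `key` is "clones" or "views", matching the per-day list in the payload.
    """
    daily = payload.get(key, [])
    busiest = max(daily, key=lambda day: day["count"], default=None)
    return {
        "total": payload.get("count", 0),
        "uniques": payload.get("uniques", 0),
        "days_reported": len(daily),
        "busiest_day": busiest["timestamp"] if busiest else None,
    }
```

Calling `summarize_traffic(response.json(), "clones")` on a successful response would give a compact record per run. GitHub documents these traffic endpoints as requiring push access to the repository, which is relevant to the review comment about supplying the correct token.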