|
| 1 | +# GitHub Stars Index |
| 2 | + |
| 3 | +English | [中文](README.md) |
| 4 | + |
| 5 | +> Automatically fetch GitHub Stars, generate AI summaries, and make them easily searchable. |
| 6 | +
|
| 7 | +## Contents |
| 8 | + |
| 9 | +- [Features](#features) |
| 10 | +- [Quick Start](#quick-start) |
| 11 | +- [Configuration Reference](#configuration-reference) |
| 12 | +- [Obsidian Sync (Optional)](#obsidian-sync-optional) |
| 13 | +- [Local Installation](#local-installation) |
| 14 | + |
| 15 | +--- |
| 16 | + |
| 17 | +## Features |
| 18 | + |
| 19 | +- 🤖 **Automatic Sync**: Fetches all starred repositories from your GitHub account. |
| 20 | +- 📝 **AI Summaries**: Reads each repository's README and uses AI to generate concise summaries and technical tags. |
| 21 | +- 🏷️ **Smart Tagging**: Built-in `TAG_MAPPING` for automatic synonym merging and tech stack normalization (e.g., LLM -> Large Language Model), preventing tag explosion. |
| 22 | +- ⚡️ **High Performance**: Supports **concurrency** for AI API calls, significantly speeding up the processing of new projects. |
| 23 | +- 🗃️ **Data Driven**: Uses `data/stars.json` at runtime and publishes it to `gh-pages/data/stars.json` for custom development. |
| 24 | +- 🎨 **Template Driven**: Uses Jinja2 templates to generate Markdown and static HTML search pages. |
| 25 | +- ⏭️ **Smart Incremental Updates**: Uses AI for new projects, while **automatically updating star counts and metadata** for existing ones. |
| 26 | +- ⏰ **Automated Workflow**: Regularly runs via GitHub Actions with customizable cron schedules. |
| 27 | +- 🔄 **Vault Sync (Optional)**: Automatically pushes generated `stars_zh.md` & `stars_en.md` to your **Obsidian Vault**. |
| 28 | +- 🌐 **GitHub Pages (Optional)**: Deploys a static search page with multi-language (ZH/EN) support and real-time search. |
| 29 | +- 💻 **Flexible AI Providers**: Compatible with any **OpenAI-format API** (OpenAI, Azure, local Ollama, etc.). |
| 30 | + |
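The synonym merging described under **Smart Tagging** can be pictured with a minimal sketch. The mapping entries and the `normalize_tags` helper below are hypothetical; this README does not show the project's actual `TAG_MAPPING` contents or lookup rules.

```python
# Hypothetical mapping; the real TAG_MAPPING entries may differ.
TAG_MAPPING = {
    "llm": "Large Language Model",
    "large language models": "Large Language Model",
    "k8s": "Kubernetes",
}


def normalize_tags(tags):
    """Map each tag to its canonical form and drop duplicates, keeping order."""
    seen = set()
    result = []
    for tag in tags:
        cleaned = tag.strip()
        # Look up case-insensitively; unknown tags pass through unchanged.
        canonical = TAG_MAPPING.get(cleaned.lower(), cleaned)
        if canonical.lower() not in seen:
            seen.add(canonical.lower())
            result.append(canonical)
    return result


if __name__ == "__main__":
    print(normalize_tags(["LLM", "large language models", "k8s", "Python"]))
    # → ['Large Language Model', 'Kubernetes', 'Python']
```

Merging on a lowercased key is one simple way to stop near-duplicate tags from multiplying; the project may apply different rules.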
| 31 | +--- |
| 32 | + |
| 33 | +## Process Overview |
| 34 | + |
| 35 | +```mermaid |
| 36 | +graph TD |
| 37 | + Start([Start]) --> Trigger{Trigger Mode} |
| 38 | + Trigger -- "Actions (Schedule/Manual)" --> Sync[Run sync_stars.py] |
| 39 | + Trigger -- "Local (Manual Run)" --> Sync |
| 40 | + |
| 41 | + Sync --> FetchGH[Fetch GitHub Stars] |
| 42 | + FetchGH --> Filter{Incremental Check} |
| 43 | + Filter -- "Processed Projects" --> UpdateMeta[Update Stars/Metadata] |
| 44 | + Filter -- "New Projects" --> FetchRD[Fetch README] |
| 45 | + |
| 46 | + FetchRD --> AI[AI Summarization/Tagging] |
| 47 | + AI --> Norm[Tag Governance/Normalization] |
| 48 | + Norm --> Store[(data/stars.json)] |
| 49 | + UpdateMeta --> Store |
| 50 | + Store --> Render |
| 51 | + |
| 52 | + Render[[Jinja2 Template Rendering]] --> Output |
| 53 | + |
| 54 | + subgraph Output [Output Results] |
| 55 | + MD[Markdown Archive] |
| 56 | + HTML[Static HTML Search Page] |
| 57 | + end |
| 58 | + |
| 59 | + Output --> Dispatch{Distribution} |
| 60 | + Dispatch -- "VAULT_SYNC" --> Obs[Push to Obsidian Vault] |
| 61 | + Dispatch -- "PAGES_SYNC" --> Pages[Deploy GitHub Pages] |
| 62 | + |
| 63 | + Obs --> End([Finish]) |
| 64 | + Pages --> End |
| 65 | +``` |
| 66 | + |
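The "Incremental Check" branch in the diagram above could look roughly like the sketch below. The field names (`full_name`, `stars`, `summary`) and the `partition_repos` helper are assumptions for illustration, not the actual schema of `data/stars.json`.

```python
def partition_repos(fetched, stored):
    """Split fetched repos into new ones (need AI) and refreshed existing ones.

    fetched: list of dicts from the GitHub API (hypothetical fields).
    stored:  dict keyed by full_name, as previously saved (hypothetical layout).
    """
    new_repos = []
    refreshed = {}
    for repo in fetched:
        name = repo["full_name"]
        if name in stored:
            # Existing entry: keep the AI summary, refresh volatile metadata only.
            entry = dict(stored[name])
            entry["stars"] = repo["stars"]
            refreshed[name] = entry
        else:
            # New entry: goes on to README fetching and AI summarization.
            new_repos.append(repo)
    return new_repos, refreshed
```

Only the repositories in `new_repos` would go through the README fetch and AI step, which is what keeps repeat runs cheap.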
| 67 | +--- |
| 68 | + |
| 69 | +## Quick Start |
| 70 | + |
| 71 | +### Step 1: Fork This Repository |
| 72 | + |
| 73 | +Click the **Fork** button in the top right corner to copy this repository to your account. |
| 74 | + |
| 75 | +### Step 2: Configure Environment (Choose One) |
| 76 | + |
| 77 | +This project is driven by environment variables. **Priority: GitHub Secrets > .env file**. |
| 78 | + |
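One way to picture the stated priority is a loader in which process environment variables override values read from `.env`. This sketch assumes the workflow exposes GitHub Secrets to the script as environment variables; `parse_dotenv` and `resolve_config` are hypothetical helpers, not the project's actual loader.

```python
import os


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(keys, dotenv_text, environ=None):
    """Return settings for `keys`, preferring the environment over .env."""
    environ = os.environ if environ is None else environ
    file_values = parse_dotenv(dotenv_text)
    config = {}
    for key in keys:
        # A non-empty environment value wins; otherwise fall back to .env.
        value = environ.get(key) or file_values.get(key)
        if value is not None:
            config[key] = value
    return config
```

For example, with `AI_MODEL` set both in the environment and in `.env`, `resolve_config` returns the environment value.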
| 79 | +#### Method A: Using GitHub Environment Variables (Recommended for continuous running) |
| 80 | + |
| 81 | +Go to **Settings → Secrets and variables → Actions** in your repository: |
| 82 | + |
| 83 | +**🔐 Required Secrets/Variables** |
| 84 | +- `GH_USERNAME`: The GitHub username whose stars you want to crawl. |
| 85 | +- `AI_API_KEY`: The API key for your AI provider. |
| 86 | + |
| 87 | +**📋 Optional Variables** |
| 88 | +These have built-in defaults and usually don't need configuration: |
| 89 | +- `AI_BASE_URL`: AI API endpoint (defaults to OpenAI). |
| 90 | +- `AI_MODEL`: Model name (defaults to `gpt-4o-mini`). |
| 91 | +- `OUTPUT_FILENAME`: Base name for generated files (defaults to `stars`). |
| 92 | +- `VAULT_SYNC_PATH`: Save directory in your Vault (defaults to `GitHub-Stars/`). |
| 93 | +- `PAGES_SYNC_ENABLED`: Whether to sync to Pages (defaults to `true`). |
| 94 | + |
| 95 | +> [!TIP] |
| 96 | +> **About GitHub API Limits**: |
| 97 | +> - **Running Online (Actions)**: The workflow automatically injects `GITHUB_TOKEN`, which allows 1,000 requests/hour per repository, usually enough even for large star lists. |
| 98 | +> - **Running Locally**: Without a `GH_TOKEN`, the limit is 60 requests/hour. If you have many stars, it's recommended to add a `GH_TOKEN` to your `.env` to increase the limit to 5,000 requests/hour. |
| 99 | +
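To see which limit applies to you, you can query GitHub's `/rate_limit` REST endpoint. The sketch below uses only the standard library; the helper names and the `User-Agent` string are illustrative assumptions and not part of this project.

```python
import json
import urllib.request


def build_headers(token=None):
    """Build request headers; the token is optional (unauthenticated if None)."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "stars-index-example",  # illustrative value
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def fetch_core_limit(token=None):
    """Return the 'core' rate-limit bucket (limit, remaining, reset, used)."""
    request = urllib.request.Request(
        "https://api.github.com/rate_limit", headers=build_headers(token)
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return payload["resources"]["core"]
```

Calling `fetch_core_limit(os.environ.get("GH_TOKEN"))` (after `import os`) and reading the `limit` and `remaining` fields shows whether the token is being picked up.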
|
| 100 | +#### Method B: Using a .env File (Best for local development) |
| 101 | + |
| 102 | +1. Copy `.env.example` to `.env` in the root directory. |
| 103 | +2. Fill in the required fields in `.env`. |
| 104 | + |
| 105 | +--- |
| 106 | + |
| 107 | +### Step 3: Customize Schedule Frequency |
| 108 | + |
| 109 | +Edit `.github/workflows/sync.yml` to modify the `cron` expression: |
| 110 | + |
| 111 | +```yaml |
| 112 | +schedule: |
| 113 | + - cron: "0 2 * * 1" # Example: run every Monday at 02:00 UTC (GitHub Actions schedules use UTC) |
| 114 | +``` |
| 115 | +
|
| 116 | +### Step 4: Manually Trigger the First Run |
| 117 | +
|
| 118 | +Go to **Actions → 🌟 GitHub Stars Index 同步** (the sync workflow) and click **Run workflow**. |
| 119 | +
|
| 120 | +--- |
| 121 | +
|
| 122 | +## Configuration Reference |
| 123 | +
|
| 124 | +| Variable | Type | Description | Default Value | |
| 125 | +| -------------------- | ------------------------ | --------------------------------------------- | --------------------------- | |
| 126 | +| `GH_USERNAME` | Required | GitHub username to sync | - | |
| 127 | +| `AI_API_KEY` | Required | AI API Key | - | |
| 128 | +| `AI_BASE_URL` | Optional | OpenAI-compatible API endpoint | `https://api.openai.com/v1` | |
| 129 | +| `AI_MODEL` | Optional | AI model to use | `gpt-4o-mini` | |
| 130 | +| `OUTPUT_FILENAME` | Optional | Base name for generated MD/HTML files | `stars` | |
| 131 | +| `VAULT_SYNC_ENABLED` | Optional | Whether to enable Obsidian sync | `false` | |
| 132 | +| `VAULT_REPO` | Optional | Vault repository (`owner/repo`) | - | |
| 133 | +| `VAULT_SYNC_PATH` | Optional | Directory path for Vault sync | `GitHub-Stars/` | |
| 134 | +| `PAGES_SYNC_ENABLED` | Optional | Whether to deploy to GitHub Pages | `true` | |
| 135 | +| `MAX_CONCURRENCY` | Optional | AI concurrency limit (recommended 1-10) | `1` | |
| 136 | +| `GH_TOKEN` | **Strongly Recommended** | Increases API limits to prevent rate-limiting | - | |
| 137 | + |
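As a rough illustration of what `MAX_CONCURRENCY` controls, the sketch below caps parallel AI calls with a thread pool. The use of `ThreadPoolExecutor` and the `summarize_all` helper are assumptions; the project may bound concurrency differently (for example with asyncio).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def summarize_all(repos, summarize, max_concurrency=None):
    """Apply `summarize` to every repo with at most `max_concurrency` in flight."""
    if max_concurrency is None:
        # Fall back to the documented default of 1 when the variable is unset.
        max_concurrency = int(os.environ.get("MAX_CONCURRENCY", "1"))
    workers = max(1, max_concurrency)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even if calls finish out of order.
        return list(pool.map(summarize, repos))
```

With the default of `1` the calls run one at a time; higher values trade provider rate-limit headroom for speed.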
| 138 | +--- |
| 139 | + |
| 140 | +## Obsidian Sync (Optional) |
| 141 | + |
| 142 | +This feature pushes the generated star summaries to the GitHub repository that backs your Obsidian Vault (or to any other repository), so your notes stay up to date without manual steps. |
| 143 | + |
| 144 | +### Core Mechanism |
| 145 | +**Cross-repo sync**: Many Obsidian users use GitHub to store and sync their notes. This project uses the GitHub API to push the generated Markdown files directly to your designated Vault repository. |
| 146 | + |
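One way such a cross-repo push can be done is GitHub's REST "create or update file contents" endpoint (`PUT /repos/{owner}/{repo}/contents/{path}`). The sketch below only builds the request; whether this project uses this endpoint is an assumption, and `build_put_contents_request` is a hypothetical helper.

```python
import base64
import json
from urllib.parse import quote


def build_put_contents_request(repo, path, markdown, sha=None,
                               message="Update star index"):
    """Return (url, json_body) for a PUT to the repository contents endpoint.

    repo: "owner/name" of the Vault repository.
    sha:  blob SHA of the existing file; required by GitHub when updating.
    """
    url = f"https://api.github.com/repos/{repo}/contents/{quote(path)}"
    body = {
        "message": message,
        # The endpoint expects the file content base64-encoded.
        "content": base64.b64encode(markdown.encode("utf-8")).decode("ascii"),
    }
    if sha:
        body["sha"] = sha
    return url, json.dumps(body)
```

The request would be sent with the `VAULT_PAT` token in an `Authorization: Bearer` header, which is why that token needs **Contents: Read and write** on the Vault repository.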
| 147 | +### Setup Steps |
| 148 | + |
| 149 | +1. **Prepare Target Repository**: Ensure your Obsidian Vault is already hosted on GitHub. |
| 150 | +2. **Create Personal Access Token (PAT)**: |
| 151 | + - Visit the [Fine-grained PAT configuration page](https://github.com/settings/personal-access-tokens). |
| 152 | + - **Repository access**: Choose "Only select repositories" and select your **Vault repository**. |
| 153 | + - **Permissions**: Under "Repository permissions," set **Contents** to **Read and write**. |
| 154 | + - Once generated, add it as a secret named `VAULT_PAT` under this project's **Settings → Secrets and variables → Actions**. |
| 155 | +3. **Enable Sync Configuration**: |
| 156 | + - In this project's **Settings → Secrets and variables → Actions**, open the **Variables** tab: |
| 157 | + - Set `VAULT_SYNC_ENABLED` to `true`. |
| 158 | + - Set `VAULT_REPO` to `your-username/repo-name` (e.g., `iblogc/my-obsidian-vault`). |
| 159 | + - Set `VAULT_SYNC_PATH` to the desired folder in your Vault (e.g., `Reading/GitHub-Stars/`). |
| 160 | +4. **Save and Finish**: The next time the Action runs, `stars_zh.md` and `stars_en.md` will automatically appear in your Vault repository. |
| 161 | + |
| 162 | +> [!TIP] |
| 163 | +> **How to view locally?** |
| 164 | +> Once the remote sync is complete, just use the **Obsidian Git** plugin to "Pull," or run `git pull` in your local vault directory. The latest star summaries will then appear in your note library. |
| 165 | + |
| 166 | +--- |
| 167 | + |
| 168 | +## GitHub Pages Deployment (Optional) |
| 169 | + |
| 170 | +This project automatically generates multi-language static web pages with real-time search functionality. |
| 171 | + |
| 172 | +1. Ensure `PAGES_SYNC_ENABLED=true`. |
| 173 | +2. After running the Action once, go to **Settings -> Pages**. |
| 174 | +3. Select `gh-pages` branch and `/(root)` directory, then click Save. |
| 175 | + |
| 176 | +> [!IMPORTANT] |
| 177 | +> **Data Source Migration (Compatibility for Forks)**: |
| 178 | +> - The current recommended data source is `gh-pages/data/stars.json`. |
| 179 | +> - `data/stars.json` in the `main` branch is only used for initial migration compatibility. |
| 180 | +> - Normal runs will no longer commit `data/stars.json` back to the `main` branch. |
| 181 | + |
| 182 | +--- |
| 183 | + |
| 184 | +## Docker Deployment |
| 185 | + |
| 186 | +If you want to run this long-term on a server with automatic synchronization, Docker Compose is recommended. |
| 187 | + |
| 188 | +### 1. Configuration |
| 189 | +Copy `.env.example` to `.env` and fill in the necessary information: |
| 190 | +```bash |
| 191 | +cp .env.example .env |
| 192 | +# Edit .env to fill in GH_USERNAME, AI_API_KEY, and GH_TOKEN |
| 193 | +``` |
| 194 | + |
| 195 | +> [!IMPORTANT] |
| 196 | +> **GH_TOKEN is Mandatory**: In Docker environments, calling the GitHub API without a token easily triggers [Rate Limiting](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api). Setting a token raises the limit from 60 to 5,000 requests per hour. |
| 197 | + |
| 198 | +### 2. Start Service |
| 199 | +Launch with Docker Compose: |
| 200 | +```bash |
| 201 | +docker compose up -d |
| 202 | +``` |
| 203 | +This starts two containers: |
| 204 | +- `sync`: The core sync script. By default, it runs every **24 hours**. You can adjust this by setting `SCHEDULE_HOURS` in your `.env`. |
| 205 | +- `web`: An Nginx-based static server for viewing the generated index. |
| 206 | + |
| 207 | +### 3. Access the Page |
| 208 | +Open your browser and visit: `http://localhost:8080` |
| 209 | + |
| 210 | +### 4. Management Commands |
| 211 | +```bash |
| 212 | +# View sync logs |
| 213 | +docker logs -f github-stars-sync |
| 214 | +
|
| 215 | +# Run a manual sync immediately |
| 216 | +docker compose run --rm sync |
| 217 | +
|
| 218 | +# Update page rendering only (skip AI calls) |
| 219 | +docker compose run --rm sync --render-only |
| 220 | +``` |
| 221 | + |
| 222 | +--- |
| 223 | + |
| 224 | +## Local Installation |
| 225 | + |
| 226 | +```bash |
| 227 | +# Clone the repository |
| 228 | +git clone https://github.com/iblogc/GithubStarsIndex.git |
| 229 | +cd GithubStarsIndex |
| 230 | +
|
| 231 | +# Install dependencies |
| 232 | +pip install -r requirements.txt |
| 233 | +# Or use uv (recommended) |
| 234 | +uv pip install -r requirements.txt |
| 235 | +
|
| 236 | +# Configure using .env |
| 237 | +cp .env.example .env |
| 238 | +# Edit .env and fill in AI_API_KEY and GH_USERNAME |
| 239 | +
|
| 240 | +# [Normal Run] Fetch metadata, call AI for summaries, and render pages |
| 241 | +python scripts/sync_stars.py |
| 242 | +# Or |
| 243 | +uv run scripts/sync_stars.py |
| 244 | +
|
| 245 | +# [Render Only] Skip fetching/AI, re-render HTML/MD from local stars.json |
| 246 | +python scripts/sync_stars.py --render-only |
| 247 | +``` |
| 248 | + |
| 249 | +--- |
| 250 | + |
| 251 | +## File Structure |
| 252 | + |
| 253 | +| File | Description | |
| 254 | +| :--------------------------- | :------------------------------------------------ | |
| 255 | +| `data/stars.json` | Temporary runtime data (migration entry point) | |
| 256 | +| `templates/` | Jinja2 generation templates (Markdown/HTML) | |
| 257 | +| `dist/` | Automatically generated local results (HTML / MD) | |
| 258 | +| `scripts/sync_stars.py` | Core sync and generation script | |
| 259 | +| `.github/workflows/sync.yml` | GitHub Actions scheduled workflow | |
| 260 | +| `.env.example` | Configuration example file | |
| 261 | + |
| 262 | +--- |
| 263 | + |
| 264 | +## Appendix: Applying for a GitHub Token (GH_TOKEN) |
| 265 | + |
| 266 | +To ensure the program can smoothly crawl all your starred repositories, it's recommended to create a Personal Access Token (PAT). |
| 267 | + |
| 268 | +### Steps: |
| 269 | +1. Go to the [GitHub Fine-grained PAT page](https://github.com/settings/personal-access-tokens/new). |
| 270 | +2. **Token name**: `Stars-Index-Sync` (or any name you prefer). |
| 271 | +3. **Expiration**: `90 days` or `Custom` is recommended. |
| 272 | +4. **Resource owner**: Select your personal account. |
| 273 | +5. **Repository access**: Choose `Public Repositories (read-only)` (or `All repositories`). |
| 274 | +6. **Permissions**: No special permissions are required; default public access is enough to fetch your stars list. |
| 275 | +7. Click **Generate token**, then **copy and save** it immediately. |
| 276 | +8. Add this token to the `GH_TOKEN` field in your `.env` file. |
| 277 | + |
| 278 | +> [!TIP] |
| 279 | +> If you've enabled **Vault Sync (Obsidian Sync)**, you can reuse the same `VAULT_PAT` (with write permissions) as your `GH_TOKEN`. |