A Next.js + TypeScript project leveraging OpenAI or Ollama for ingesting, embedding, clustering, and exploring news articles.
```bash
git clone https://github.com/jeromecovington/silver-skates.git
cd silver-skates
yarn install
```

Install Bun (used to run the data scripts):

```bash
npm install -g bun
```

Create a `.env.local` file at the project root:
```
NEWS_API_KEY=your_newsapi_key
INGEST_SECRET=your_custom_token
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/mydb
```

Start Postgres, for example with Docker:

```bash
docker run --name news-postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=mydb \
  -p 5432:5432 \
  -d postgres:15
```

Make sure Postgres is running and create the `mydb` database (the Docker command above already creates it):
```bash
createdb mydb
```

Generate the Prisma client and apply migrations:

```bash
npx prisma@6 generate
npx prisma@6 migrate dev --name init
```

Run ingestion:

```bash
bun run ingest
```

Or trigger it through the API:

```bash
curl -X POST http://localhost:3000/api/ingest \
  -H "Authorization: Bearer your_custom_token"
```

This will:
- Fetch new articles
- Deduplicate
- Extract keywords
- Generate MiniLM embeddings
- Store in Postgres
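The deduplication step above can be sketched as comparing MiniLM embedding vectors by cosine similarity. This is an illustrative, dependency-free sketch, not the project's actual code; the `dedupe` helper name and the 0.95 threshold are assumptions:

```typescript
// An article with its MiniLM embedding vector (hypothetical shape).
type Article = { url: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Keep only articles that are not near-duplicates of one already kept,
// matching either on URL or on embedding similarity above the threshold.
function dedupe(articles: Article[], threshold = 0.95): Article[] {
  const kept: Article[] = [];
  for (const article of articles) {
    const isDup = kept.some(
      (k) =>
        k.url === article.url ||
        cosineSimilarity(k.embedding, article.embedding) > threshold
    );
    if (!isDup) kept.push(article);
  }
  return kept;
}
```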
```bash
bun run cluster
```

This runs K-Means clustering on the embeddings and stores a cluster ID on each article.
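For reference, here is a minimal dependency-free sketch of the K-Means algorithm that `ml-kmeans` implements (this simplified version seeds centroids from the first `k` points, whereas real libraries pick them randomly):

```typescript
// Assign each embedding vector to one of k clusters; returns a label per point.
function kmeans(points: number[][], k: number, iterations = 20): number[] {
  // Seed centroids from the first k points (a simplification).
  let centroids = points.slice(0, k).map((p) => [...p]);
  let labels: number[] = new Array(points.length).fill(0);

  const dist = (a: number[], b: number[]) =>
    a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);

  for (let iter = 0; iter < iterations; iter++) {
    // Assignment step: each point joins its nearest centroid.
    labels = points.map((p) => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (dist(p, centroids[c]) < dist(p, centroids[best])) best = c;
      }
      return best;
    });
    // Update step: each centroid becomes the mean of its assigned points.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => labels[i] === c);
      if (members.length === 0) return old;
      return old.map(
        (_, d) => members.reduce((s, m) => s + m[d], 0) / members.length
      );
    });
  }
  return labels;
}
```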
```bash
bun run summarize
```

This uses OpenAI’s `gpt-3.5-turbo` model to generate concise 2–3 sentence summaries for articles that do not yet have one. Summaries are stored in each article’s `summary` field and surfaced via the `/api/preview` endpoint.
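The summarize step might look roughly like the following sketch. The prompt wording and helper names are hypothetical; only the chat-completions endpoint, the `gpt-3.5-turbo` model, and the "articles without a summary" selection come from the description above:

```typescript
// A stored article row (hypothetical shape).
type Row = { id: number; content: string; summary: string | null };

// Pick articles that do not yet have a summary.
function needsSummary(rows: Row[]): Row[] {
  return rows.filter((r) => r.summary === null || r.summary === "");
}

// Hypothetical prompt; the project's actual wording may differ.
function buildPrompt(content: string): string {
  return `Summarize the following news article in 2-3 concise sentences:\n\n${content}`;
}

// Call OpenAI's chat completions API (requires OPENAI_API_KEY).
async function summarize(content: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: buildPrompt(content) }],
    }),
  });
  const data: any = await res.json();
  return data.choices[0].message.content;
}
```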
To summarize with a local model instead, set the LLM environment variables:

```bash
LLM_MODE=local \
LLM_BASE_URL=http://localhost:11434 \
LLM_MODEL=llama3.1:8b \
bun run summarize
```

This assumes Ollama is running locally (or on your LAN) and that the `llama3.1:8b` model has been pulled.
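Dispatching to Ollama in local mode might be sketched as below. The `LLM_*` variable names come from this README; the `resolveLlmConfig` helper and defaults are illustrative. Ollama's native `/api/generate` endpoint returns `{ response: "..." }` when `stream` is false:

```typescript
type LlmConfig = { baseUrl: string; model: string };

// Choose between the local Ollama instance and OpenAI based on LLM_MODE.
function resolveLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  if (env.LLM_MODE === "local") {
    return {
      baseUrl: env.LLM_BASE_URL ?? "http://localhost:11434",
      model: env.LLM_MODEL ?? "llama3.1:8b",
    };
  }
  return { baseUrl: "https://api.openai.com", model: "gpt-3.5-turbo" };
}

// Hypothetical call to Ollama's generate endpoint.
async function summarizeWithOllama(cfg: LlmConfig, prompt: string): Promise<string> {
  const res = await fetch(`${cfg.baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: cfg.model, prompt, stream: false }),
  });
  const data: any = await res.json();
  return data.response;
}
```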
Ingestion, clustering, and summarizing can be run sequentially with:

```bash
bun run pipeline
```

To preview the results:

```bash
curl "http://localhost:3000/api/preview?token=your_custom_token"
```

Returns the latest articles, including keywords and cluster assignments.
Start the application locally:

```bash
yarn run dev
```

Then visit http://localhost:3000/clusters to explore the clustered articles.
- Embeddings generated via `@xenova/transformers` (MiniLM)
- Clustering via `ml-kmeans`
- Keywords via TF-IDF from `natural`
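For reference, the TF-IDF scoring that `natural`'s `TfIdf` performs can be sketched without dependencies. This simplified version assumes pre-tokenized documents and uses a plain `log(N / df)` inverse document frequency, which may differ from `natural`'s exact weighting:

```typescript
// Rank the top-N keywords for one document by term frequency times
// inverse document frequency across the corpus.
function tfidfKeywords(docs: string[][], docIndex: number, topN = 3): string[] {
  const doc = docs[docIndex];
  const tf = new Map<string, number>();
  for (const term of doc) tf.set(term, (tf.get(term) ?? 0) + 1);

  const score = (term: string): number => {
    const docsWithTerm = docs.filter((d) => d.includes(term)).length;
    const idf = Math.log(docs.length / docsWithTerm);
    return (tf.get(term)! / doc.length) * idf;
  };

  return [...tf.keys()]
    .sort((a, b) => score(b) - score(a))
    .slice(0, topN);
}
```

Terms that appear in many documents (like "markets" below) score low even when frequent, which is why TF-IDF surfaces article-specific keywords.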