NewsBro - MLOPS Final Project

Distributed resilient architecture for news recommendation

⚠ App is now down (23/12/2025)

Feel free to have a look at our website to create your account and interact with 100k+ articles.

We provide automatic updates every day (9am / 6pm UTC), with 1k articles ingested from various sources. We also provide new recommendations based on your liked articles.
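As a small illustration of the schedule above, the sketch below computes the next ingestion slot from a given time. It only assumes the two UTC slots stated in the text; the function name `next_ingestion_run` is hypothetical and is not taken from the project code.

```python
from datetime import datetime, time, timedelta, timezone

# Ingestion slots stated above: 09:00 and 18:00 UTC.
INGESTION_SLOTS_UTC = (time(9, 0), time(18, 0))


def next_ingestion_run(now: datetime) -> datetime:
    """Return the first ingestion slot strictly after `now` (must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    # Look at today's slots first, then tomorrow's.
    for day_offset in (0, 1):
        day = (now_utc + timedelta(days=day_offset)).date()
        for slot in INGESTION_SLOTS_UTC:
            candidate = datetime.combine(day, slot, tzinfo=timezone.utc)
            if candidate > now_utc:
                return candidate
    raise AssertionError("unreachable: tomorrow's first slot is always later")
```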

Privacy Note

Passwords are never stored; authentication is handled through secure, short-lived JWTs.
Internal credentials are randomly generated for each session and securely stored in Vault. They are never embedded in code, only injected as environment variables.
All feedback data is anonymized before being used for training or evaluation (only user_id is used), ensuring that no personally identifiable information is retained.
Data transmission is encrypted (TLS) between the client and the DNS provider, and between the DNS provider and the server.
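To make the idea of a short-lived JWT concrete, here is a minimal sketch using only the Python standard library. The signing algorithm (HS256), the 5-minute lifetime, and the function names `issue_token` and `verify_token` are assumptions for illustration; the project does not state which library or settings it uses.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Build an HS256-signed token that expires after `ttl_seconds` (assumed value)."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Check signature and expiry; return the payload or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

In practice a maintained JWT library is usually preferable to hand-rolled signing; the sketch only shows why a short `exp` limits how long a leaked token stays useful.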

All internal data will be securely erased once the project concludes.

Project Structure

.
├── apps
│   ├── repo-account
│   ├── repo-article
│   ├── repo-feed
│   ├── port-front
│   ├── srvc-scrapping
│   ├── srvc-search
│   ├── srvc-drift
│   └── srvc-inference
├── docs
├── dev                          # dev folder for various dev operations
├── k8s
│   ├── apps                     # deployment of each app
│   ├── capacitor                # Frontend to manage Flux
│   ├── cert-manager             # TLS certificate manager
│   ├── external-secrets         # ESO to communicate with vault
│   ├── flux/flux-system         # deployment of flux repo (gitops tools)
│   ├── ingress-nginx            # Ingress nginx
│   ├── kafka                    # Kafka brokers, controllers, redpanda console and topic definition
│   ├── metallb                  # Metal LB
│   ├── minio                    # Minio cluster storage (datalake + store for mlflow artifacts)
│   ├── mlflow
│   ├── postgres                 # CNPG CRD, database definition, user definition
│   ├── qdrant                   # Vector database
│   ├── redis                    # Redis Operator, Insight and Sentinels
│   └── vault                    # Vault server to store secrets, acts as authenticator for service accounts (SA) in Kubernetes
├── .github/workflows/
└── README.md

Architecture

The project is built on a microservices architecture with the following components:

repo-account: User account and login management
repo-article: Article and feedback management
srvc-scrapping: Scrapes articles and ingests them into Kafka
srvc-search: Search service
srvc-inference: Inference service that updates user feeds
srvc-drift: Drift service that provides insights on potential data drift
repo-feed: Service providing the user feed
port-front: Main frontend

Light Speed Recommendation

We want light-speed recommendations and a scalable infrastructure. Behind the scenes:

Kafka consumers consume articles, process them in batches and create new recommendations.
Qdrant stores vector embeddings so that recommendations can be precomputed as articles arrive. These representations are then reused in various ways.
Redis Sentinel with zsets (sorted sets) provides an efficient feed queue.
Repo Feed is a dedicated service that manages the feed and stores feedback and articles for short periods.
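The pipeline above can be sketched in plain Python: a batch of new article embeddings is scored against a user profile, and the results land in a score-ordered, capped feed. This is a simplified model under stated assumptions, not the project's implementation: the profile is assumed to be the average of liked-article embeddings, cosine similarity is assumed as the score, and `FeedQueue` is an in-memory stand-in for a Redis sorted set (its methods mirror `ZADD`, `ZREMRANGEBYRANK` and `ZPOPMAX`). All names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def user_profile(liked: List[Sequence[float]]) -> List[float]:
    """Average the embeddings of liked articles into one profile vector (assumption)."""
    return [sum(values) / len(liked) for values in zip(*liked)]


class FeedQueue:
    """In-memory stand-in for a Redis sorted set: article id -> score, capped in size."""

    def __init__(self, max_size: int = 100) -> None:
        self.max_size = max_size
        self._scores: Dict[str, float] = {}

    def add(self, article_id: str, score: float) -> None:
        # Like ZADD: re-adding a member overwrites its score.
        self._scores[article_id] = score
        if len(self._scores) > self.max_size:
            # Like ZREMRANGEBYRANK key 0 0: drop the lowest-scored member.
            lowest = min(self._scores, key=self._scores.get)
            del self._scores[lowest]

    def pop_best(self, count: int = 1) -> List[Tuple[str, float]]:
        # Like ZPOPMAX: remove and return the highest-scored members.
        best = sorted(self._scores.items(), key=lambda item: item[1], reverse=True)[:count]
        for article_id, _ in best:
            del self._scores[article_id]
        return best


def process_batch(batch: Dict[str, Sequence[float]], profile: Sequence[float],
                  feed: FeedQueue) -> None:
    """Score every article of a consumed batch and push it into the user's feed."""
    for article_id, embedding in batch.items():
        feed.add(article_id, cosine_similarity(profile, embedding))
```

Because scoring happens when a batch is consumed, serving a feed reduces to popping the top entries of a sorted set, which is what keeps reads cheap.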

North-South Traffic

North-South traffic is handled by two separate ingress controllers.

Public Ingress: on a dedicated IP behind a NAT, exposing public services, including frontend and backend services.
Private Ingress: on a VPN IP managed by WireGuard, exposing critical and back-office services.

We also use MetalLB, a dedicated load balancer, to expose IP pools for the two ingresses.

Secrets Rollout

We use External Secrets Operator (ESO) to create our Kubernetes secrets.

Secrets are centralized in Vault, which is provisioned by Terraform.
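On the application side, secrets injected as environment variables can be read once at startup and validated early. The sketch below is a hedged illustration: the variable names `DATABASE_URL` and `REDIS_PASSWORD`, the `Settings` class and `load_settings` are hypothetical and are not taken from the project.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # repr=False keeps secret values out of logs and tracebacks.
    database_url: str = field(repr=False)
    redis_password: str = field(repr=False)


# Hypothetical mapping from settings fields to injected environment variables.
REQUIRED_VARIABLES = {
    "database_url": "DATABASE_URL",
    "redis_password": "REDIS_PASSWORD",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read injected secrets and fail fast if any of them is missing or empty."""
    missing = sorted(name for name in REQUIRED_VARIABLES.values() if not env.get(name))
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(**{attr: env[name] for attr, name in REQUIRED_VARIABLES.items()})
```

Failing at startup makes a misconfigured secret visible as a crashed pod rather than as a runtime error on the first request.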

Dev

We provide a generic Docker Compose file to run a minimal stack at apps/docker-compose.yml.

CI/CD

Our CI/CD pipeline lints, builds and tests the backend code.
To deploy a new version, just create a tag named <srvc-name>-vx.x.x.
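The tag convention can be checked mechanically. The sketch below parses a tag into a service name and a version tuple, assuming that each `x` is a non-negative integer and that the service name is one of the apps listed under Project Structure. The helper `parse_release_tag` is hypothetical and is not part of the project's workflows.

```python
import re
from typing import NamedTuple, Tuple

# Service names taken from the apps listed in the project structure.
KNOWN_SERVICES = {
    "repo-account", "repo-article", "repo-feed", "port-front",
    "srvc-scrapping", "srvc-search", "srvc-drift", "srvc-inference",
}

# <srvc-name>-vx.x.x, with each x assumed to be a non-negative integer.
TAG_PATTERN = re.compile(
    r"(?P<service>[a-z0-9]+(?:-[a-z0-9]+)*)-v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
)


class ReleaseTag(NamedTuple):
    service: str
    version: Tuple[int, int, int]


def parse_release_tag(tag: str) -> ReleaseTag:
    """Split a release tag into service and version, or raise ValueError."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"tag does not match <srvc-name>-vx.x.x: {tag!r}")
    service = match["service"]
    if service not in KNOWN_SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    version = (int(match["major"]), int(match["minor"]), int(match["patch"]))
    return ReleaseTag(service, version)
```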

Commits

Examples of commit messages we use:

k8s: redis: increase redis sentinel for minimal config
apps: port-front: now using get profile to retrieve user info as cookies could not be read by frontend
apps: review swagger, specify dto for errors
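Judging from these examples, a subject line appears to follow the shape `<area>: [<component>: ]<summary>`. That reading is an inference from the three examples, not a documented rule. Under that assumption, a small checker could look like the sketch below; `parse_commit_subject` is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional

# <area>: [<component>: ]<summary>; area and component contain no spaces.
SUBJECT_PATTERN = re.compile(
    r"(?P<area>[\w./-]+): (?:(?P<component>[\w./-]+): )?(?P<summary>\S.*)"
)


class CommitSubject(NamedTuple):
    area: str
    component: Optional[str]
    summary: str


def parse_commit_subject(subject: str) -> CommitSubject:
    """Split a commit subject into area, optional component and summary."""
    match = SUBJECT_PATTERN.fullmatch(subject)
    if match is None:
        raise ValueError(f"subject does not follow '<area>: <summary>': {subject!r}")
    return CommitSubject(match["area"], match["component"], match["summary"])
```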

Deployment

Our project is deployed with Kubernetes. We use FluxCD as a GitOps tool to automate the deployment of new manifests.

To deploy the whole stack, just use:

kubectl apply -k k8s/flux/flux-system
