Commit efc1616

Maas api add gpu and restructure (#95)
* Working version with qwen
* Initial restructured commit
* Semi-working token-based limiting; for enterprise, for example, it stops at 120 tokens but should stop at 1000
* Restructuring as per PR comments + update token rate limiting; however, there are still issues with tiers that need to be debugged and fixed
* Fixes as per coderabbit comments
* fix: Remove duplicate HTTPRoute definition after rebase
  - Removed duplicate gateway-routes.yaml from base
  - HTTPRoute for maas-api is now only in base/networking/httproute.yaml
  - Fixed kustomization build error after rebasing on main
* fix: Restore maas-api/deploy files that exist on main
  - Restored kuadrant.yaml, gateway.yaml, httproute.yaml files
  - Restored gateway-auth-policy.yaml for model policies
  - Fixed duplicate ServiceAccount in deployment/samples/models
  - All validations now pass except 2 ODH-related that also fail on main (external GitHub dependency issue with kustomize)

  These files were incorrectly removed during rebase thinking they were duplicates, but they serve different purposes than the ones in deployment/
* Fixes to deployment, added updated opendatahub/maas-api image, added a script for fast openshift deployment
* Referenced PR comments, tested deployment
* Improve script, removed dynamic AWS ELB address retrieving and added openshift route retrieving
* Removed kustomization config as it is redundant
* Updating policies
* Updating rbac and some minor changes around the install script and instructions
* Trying to move VLLMInferenceService to deployment/; not everything works, but simulator should work
* Updating instructions and cluster role issues
* Fixed qwen for LLMInferenceService, not finished facebook cpu model
* Removed conflicting parts from maas-api
* Revert maas-api/ directory to latest main branch state

---------

Co-authored-by: Jamie Land <hokie10@gmail.com> & Bartos
1 parent d744fa3 commit efc1616

File tree

105 files changed

+2806
-3046
lines changed


.gitignore

Lines changed: 1 addition & 1 deletion
@@ -37,4 +37,4 @@ pip-delete-this-directory.txt
3737
htmlcov/
3838
apps/frontend/.env.local
3939
apps/backend/.env
40-
CLAUDE.md
40+
CLAUDE.md

POLICY_METRICS_IMPLEMENTED.md

Lines changed: 0 additions & 30 deletions
This file was deleted.

README.md

Lines changed: 83 additions & 95 deletions
@@ -8,8 +8,8 @@ Our goal is to create a comprehensive platform for **Models as a Service** with
88
## 📦 Technology Stack
99

1010
- **Kuadrant/Authorino/Limitador**: API gateway and policy engine
11-
- **Istio**: Service mesh and traffic management
12-
- **Gateway API**: Traffic routing and management
11+
- **Gateway API**: Traffic routing and management (OpenShift native implementation)
12+
- **OpenShift Service Mesh**: Automatically provisioned when Gateway API is enabled (includes Istio)
1313
- **React**: Frontend framework
1414
- **Go**: Backend frameworks
1515

@@ -26,7 +26,7 @@ Our goal is to create a comprehensive platform for **Models as a Service** with
2626
## πŸ—οΈ Architecture
2727

2828
### Backend Components
29-
- **API Gateway**: Istio/Envoy with Gateway API support and Kuadrant integration
29+
- **API Gateway**: OpenShift Gateway API implementation with Envoy proxy and Kuadrant integration
3030
- **Policy Engine**: Real-time policy enforcement through Kuadrant (Authorino + Limitador)
3131
- **Model Serving**: KServe-based AI model deployment with vLLM runtime
3232
- **Model Discovery**: Automatic model listing model resources
@@ -46,14 +46,25 @@ Our goal is to create a comprehensive platform for **Models as a Service** with
4646

4747
## 🚀 Quick Start
4848

49-
For deployment instructions, see the READMEs in the deployment directory:
49+
### Deploy Infrastructure
5050

51-
- **[Infrastructure](deployment/infrastructure/README.md)** - Base platform components (Istio, KServe, Kuadrant operators)
52-
- **[Example Usage](deployment/examples/README.md)** - Complete deployment examples with models, authentication, and observability
51+
See the comprehensive [Deployment Guide](deployment/README.md) for detailed instructions.
5352

54-
## Development Setup
53+
Quick deployment for OpenShift:
54+
```bash
55+
export CLUSTER_DOMAIN="apps.your-openshift-cluster.com"
56+
kustomize build deployment/overlays/openshift | envsubst | kubectl apply -f -
57+
```
58+
59+
Quick deployment for Kubernetes:
60+
```bash
61+
export CLUSTER_DOMAIN="your-kubernetes-domain.com"
62+
kustomize build deployment/overlays/kubernetes | envsubst | kubectl apply -f -
63+
```
64+
65+
### Start Development Environment
5566

56-
After deploying the infrastructure, start the frontend and backend:
67+
After deploying the infrastructure:
5768

5869
#### Option A: One-Command Start (Recommended)
5970
```bash
@@ -107,117 +118,94 @@ This will:
107118
## 🔧 Development
108119

109120
### Project Structure
110-
```
111-
maas-billing/
112-
├── apps/
113-
│   ├── frontend/ # React frontend with Material-UI
114-
│   │   ├── src/
115-
│   │   │   ├── components/ # Policy Manager, Metrics Dashboard, etc.
116-
│   │   │   ├── hooks/ # API integration hooks
117-
│   │   │   └── services/ # API client
118-
│   │   └── package.json
119-
│   └── backend/ # Node.js/Express API server
120-
│       ├── src/
121-
│       │   ├── routes/ # API endpoints
122-
│       │   ├── services/ # Kuadrant integration
123-
│       │   └── utils/ # Logging and utilities
124-
│       └── package.json
125-
├── deployment/kuadrant/ # Kuadrant infrastructure
126-
└── start-*.sh # Development scripts
127-
```
128121

129-
### API Endpoints
130-
- `GET /api/v1/policies` - List all policies
131-
- `POST /api/v1/policies` - Create new policy
132-
- `PUT /api/v1/policies/:id` - Update policy
133-
- `DELETE /api/v1/policies/:id` - Delete policy
134-
- `GET /api/v1/metrics/live-requests` - Real-time metrics
135-
- `GET /api/v1/metrics/dashboard` - Dashboard statistics
122+
| Directory | Description | Documentation |
123+
|-----------|-------------|---------------|
124+
| `apps/frontend/` | React frontend with Material-UI | [Frontend Guide](apps/frontend/README.md) |
125+
| `apps/backend/` | Node.js/Express API server | [Backend Guide](apps/backend/README.md) |
126+
| `maas-api/` | Go API for key management | [MaaS API Guide](maas-api/README.md) |
127+
| `deployment/` | Kubernetes/OpenShift deployments | [Deployment Guide](deployment/README.md) |
128+
| `scripts/` | Automation and utility scripts | - |
136129

137-
### Environment Variables
138-
```bash
139-
# Backend (.env)
140-
PORT=3001
141-
FRONTEND_URL=http://localhost:3000
142-
```
130+
### Available Scripts
143131

144-
## 🛑 Stopping the Platform
132+
From the repository root:
133+
- `./start-dev.sh` - Start full development environment
134+
- `./stop-dev.sh` - Stop all development services
135+
- `./start-backend.sh` - Start backend only
136+
- `./start-frontend.sh` - Start frontend only
137+
- `./scripts/test-gateway.sh` - Test gateway endpoints
145138

146-
```bash
147-
# Stop all services
148-
./stop-dev.sh
139+
### Backend API Endpoints
149140

150-
# Or manually stop individual components
151-
pkill -f "npm start" # Stop frontend
152-
pkill -f "npm run dev" # Stop backend
153-
```
141+
The backend provides these key endpoints:
142+
- `GET /api/v1/models` - List available models
143+
- `GET /api/v1/policies` - Retrieve current policies
144+
- `POST /api/v1/policies` - Create/update policies
145+
- `GET /api/v1/metrics/live-requests` - Live metrics stream
146+
- `POST /api/v1/simulator/run` - Run policy simulation
154147

155-
## 📊 Monitoring & Logs
148+
### Frontend Components
156149

157-
### Application Logs
158-
```bash
159-
# Real-time logs
160-
tail -f backend.log # Backend API logs
161-
tail -f frontend.log # Frontend build logs
162-
163-
# Service logs
164-
kubectl logs -n kuadrant-system -l app=limitador
165-
kubectl logs -n kuadrant-system -l app=authorino
166-
```
150+
Key React components:
151+
- `PolicyBuilder` - Drag-and-drop policy editor
152+
- `MetricsDashboard` - Real-time metrics visualization
153+
- `RequestSimulator` - Policy testing interface
154+
- `TokenManagement` - API key management
167155

168-
### Metrics and Health Checks
169-
- Backend health: `curl http://localhost:3001/health`
170-
- Kuadrant status: `kubectl get pods -n kuadrant-system`
171-
- Live metrics: `curl http://localhost:3001/api/v1/metrics/live-requests`
156+
## 🧪 Testing
172157

173-
## πŸ” Troubleshooting
174-
175-
### Common Issues
176-
177-
**Port Already in Use**
158+
### Test Infrastructure
178159
```bash
179-
# Kill processes on ports 3000/3001
180-
lsof -ti:3000 | xargs kill -9
181-
lsof -ti:3001 | xargs kill -9
182-
```
160+
# Use the test script
161+
./scripts/test-gateway.sh
183162

184-
**Kuadrant Not Ready**
185-
```bash
186-
# Check Kuadrant deployment
187-
kubectl get pods -n kuadrant-system
188-
kubectl get gateways -A
163+
# Or manually test endpoints
164+
curl http://localhost:3001/health
189165
```
190166

191-
**Frontend Not Loading**
167+
### Run Frontend Tests
192168
```bash
193-
# Clear browser cache and restart frontend
194-
rm -rf apps/frontend/node_modules/.cache
195-
./start-frontend.sh
169+
cd apps/frontend
170+
npm test
196171
```
197172

198-
**No Metrics Data**
173+
### Run Backend Tests
199174
```bash
200-
# Check Kuadrant components
201-
kubectl port-forward -n kuadrant-system svc/limitador 8080:8080
202-
curl http://localhost:8080/metrics
175+
cd apps/backend
176+
npm test
203177
```
204178

179+
## 📚 Documentation
180+
181+
- [Deployment Guide](deployment/README.md) - Complete deployment instructions
182+
- [Platform-Specific Overlays](deployment/overlays/README.md) - OpenShift vs Kubernetes
183+
- [MaaS API Documentation](maas-api/README.md) - Go API for key management
184+
- [OAuth Setup Guide](OAUTH_SETUP.md) - Configure OAuth authentication
185+
205186
## 🤝 Contributing
206187

188+
We welcome contributions! Please:
207189
1. Fork the repository
208-
2. Create a feature branch: `git checkout -b feature/amazing-feature`
209-
3. Commit changes: `git commit -m 'Add amazing feature'`
210-
4. Push to branch: `git push origin feature/amazing-feature`
211-
5. Open a Pull Request
190+
2. Create a feature branch
191+
3. Make your changes
192+
4. Submit a pull request
193+
194+
## πŸ“ License
212195

213-
## 📄 License
196+
This project is licensed under the Apache 2.0 License.
214197

215-
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
198+
## πŸ™ Acknowledgments
216199

217-
---
200+
Built with:
201+
- [Kuadrant](https://kuadrant.io/) for API management
202+
- [KServe](https://kserve.github.io/) for model serving
203+
- [OpenShift](https://www.openshift.com/) with Service Mesh for infrastructure
204+
- [React](https://react.dev/) and [Material-UI](https://mui.com/)
218205

219-
## 📚 Additional Resources
206+
## 📞 Support
220207

221-
- **Kuadrant Documentation**: https://kuadrant.io/
222-
- **KServe Documentation**: https://kserve.github.io/website/
223-
- **Istio Gateway API**: https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/
208+
For questions or issues:
209+
- Open an issue on GitHub
210+
- Check the [deployment guide](deployment/README.md) for troubleshooting
211+
- Review the [samples](deployment/samples/models/) for examples
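
The quick-deployment commands added to the README pipe `kustomize build` through `envsubst`, so an unset `CLUSTER_DOMAIN` or a mistyped overlay name would only surface at apply time. A minimal sketch of a guard around that pipeline follows. The helper name `deploy_overlay_cmd` is hypothetical and not part of this commit; the overlay paths and the `CLUSTER_DOMAIN` variable are taken from the README diff above. The function only prints the pipeline so it can be reviewed before running.

```shell
#!/usr/bin/env bash
# Sketch only: deploy_overlay_cmd is a hypothetical helper, not a script in the repository.
# It validates the inputs the README's quick-deployment pipeline depends on,
# then prints the pipeline instead of executing it.

deploy_overlay_cmd() {
  local platform="${1:-}"

  # Only the two overlays named in the README are accepted.
  case "$platform" in
    openshift|kubernetes) ;;
    *)
      echo "usage: deploy_overlay_cmd openshift|kubernetes" >&2
      return 2
      ;;
  esac

  # envsubst reads CLUSTER_DOMAIN from the environment, so it must be set and non-empty.
  if [ -z "${CLUSTER_DOMAIN:-}" ]; then
    echo "CLUSTER_DOMAIN is not set" >&2
    return 1
  fi

  # Emit the same pipeline the README shows, with the chosen overlay filled in.
  printf 'kustomize build deployment/overlays/%s | envsubst | kubectl apply -f -\n' "$platform"
}
```

For example, `CLUSTER_DOMAIN=apps.example.com deploy_overlay_cmd openshift` prints the OpenShift pipeline, which can then be inspected and run manually.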
