
Commit 03a0da5

Merge branch 'master' into patch-17
2 parents f124d48 + 4f56f8b commit 03a0da5

File tree

6 files changed (+230 −23 lines changed)

subjects/ai/matrix-factorization/README.md

Lines changed: 48 additions & 14 deletions
@@ -19,31 +19,69 @@ The goal of this project is to understand and apply advanced matrix factorization

1. **Download the [MovieLens Dataset](https://grouplens.org/datasets/movielens/1m/)** (ratings, users, and movies).
2. Preprocess the dataset to remove null values and prepare it for matrix factorization.
3. Create a user-item interaction matrix from the data.
4. Split the data into training and testing sets using a fixed `random_state = 42`.
5. Normalize the user–item interaction matrix and save it under `processed/user_item_matrix.csv`.
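Steps 3–5 can be sketched as follows (a minimal sketch on toy data; the column names and mean-centering as the normalization scheme are assumptions — adapt them to your loader):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the MovieLens ratings table (userId/movieId/rating
# column names are assumed here).
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3],
    "movieId": [10, 20, 10, 30, 20, 30],
    "rating":  [4.0, 3.0, 5.0, 2.0, 4.0, 1.0],
}).dropna()

# Step 4: reproducible split.
train, test = train_test_split(ratings, test_size=0.2, random_state=42)

# Step 3: user-item interaction matrix (unrated cells filled with 0).
matrix = train.pivot_table(index="userId", columns="movieId",
                           values="rating", fill_value=0)

# Step 5: mean-center each user's row, then persist.
normalized = matrix.sub(matrix.mean(axis=1), axis=0)
# normalized.to_csv("processed/user_item_matrix.csv")
```

Mean-centering is one common choice; min-max scaling per user is another, as long as the same scheme is undone when predictions are produced.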

#### Singular Value Decomposition (SVD) Model

1. Implement the SVD algorithm using the **scipy.sparse.linalg.svds** function for matrix factorization.
2. Train the SVD model on the MovieLens dataset to generate predicted ratings for all users.
3. Compute RMSE on the test set and append the value to `reports/model_metrics.json`.
4. Save the full predicted rating matrix as `reports/svd_predictions.npy`.
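The factorization step might look like this (a sketch on random toy data standing in for the real normalized matrix; `k` is an assumed hyperparameter to tune):

```python
import numpy as np
from scipy.sparse.linalg import svds

# Toy mean-centered user-item matrix standing in for the real one.
rng = np.random.default_rng(42)
R = rng.normal(size=(20, 15))

k = 5  # number of latent factors (tune on validation data)
U, sigma, Vt = svds(R, k=k)          # truncated SVD of the interaction matrix
preds = U @ np.diag(sigma) @ Vt      # rank-k predicted rating matrix

# Reconstruction RMSE; in the project, compute RMSE on held-out test ratings
# (and undo the normalization) before appending it to model_metrics.json.
rmse = float(np.sqrt(np.mean((R - preds) ** 2)))
# np.save("reports/svd_predictions.npy", preds)
```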

#### Probabilistic Matrix Factorization (PMF) Model

1. Implement the PMF algorithm.
2. Train the PMF model and visualize the model's convergence (e.g., plot Mean Squared Error over iterations).
3. During training, log the Mean Squared Error (MSE) at each iteration/epoch.
4. Generate and save a convergence plot (`MSE vs. iteration`) as `reports/pmf_convergence.png`.
5. Save the learned latent factor matrices (`U` and `V`) under `reports/pmf_factors/`.
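One way to sketch PMF is full-batch gradient descent on the MAP objective over observed entries only; the learning rate, regularization, and toy data below are illustrative assumptions:

```python
import numpy as np

def train_pmf(R, mask, k=5, lr=0.01, reg=0.05, epochs=100, seed=42):
    """Minimal PMF: factor R ≈ U @ V.T on observed entries (mask == 1)."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((R.shape[0], k))
    V = 0.1 * rng.standard_normal((R.shape[1], k))
    mse_history = []
    for _ in range(epochs):
        E = mask * (R - U @ V.T)        # error on observed ratings only
        U += lr * (E @ V - reg * U)     # gradient steps on both factor matrices
        V += lr * (E.T @ U - reg * V)
        mse_history.append(float((E ** 2).sum() / mask.sum()))
    return U, V, mse_history

# Toy low-rank ratings with roughly half of the entries observed.
rng = np.random.default_rng(0)
R_true = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))
mask = (rng.random((20, 15)) < 0.5).astype(float)
U, V, mse_history = train_pmf(R_true, mask)
# Plot mse_history with matplotlib and save it as reports/pmf_convergence.png;
# save U and V under reports/pmf_factors/.
```

Logging `mse_history` per epoch gives you both the convergence plot and a stopping criterion for free.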

#### Model Comparison and Evaluation

1. Compare the performance of SVD and PMF using evaluation metrics such as **Mean Squared Error (MSE)**.
2. Provide visual comparisons between the models using **matplotlib** to plot predicted vs. actual ratings.
3. Save consolidated evaluation results as JSON in `reports/model_metrics.json`.

   Example format:

   ```json
   {
     "SVD_RMSE": 0.91,
     "PMF_RMSE": 0.85,
     "PMF_vs_SVD_improvement_%": 6.6
   }
   ```

4. Generate and save comparison plots:
   - Predicted vs Actual ratings: `reports/predicted_vs_actual.png`
   - RMSE comparison (bar chart): `reports/rmse_comparison.png`
5. Minimum expected performance:
   - SVD RMSE ≤ 0.90
   - PMF RMSE ≤ 0.85
   - PMF improvement ≥ 5% over SVD
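The improvement field is derived from the two RMSE values; a sketch using the numbers from the example format (the RMSE values here are stand-ins for your computed test-set metrics):

```python
import json

svd_rmse, pmf_rmse = 0.91, 0.85  # stand-ins for the computed test-set RMSEs

metrics = {
    "SVD_RMSE": svd_rmse,
    "PMF_RMSE": pmf_rmse,
    # Relative improvement of PMF over SVD, in percent.
    "PMF_vs_SVD_improvement_%": round(100 * (svd_rmse - pmf_rmse) / svd_rmse, 1),
}
# with open("reports/model_metrics.json", "w") as f:
#     json.dump(metrics, f, indent=2)
```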

#### Recommendation Generation

1. Implement a function that generates movie recommendations for a user based on the predicted ratings from both the SVD and PMF models.
2. Display top-rated movies for users and compare recommendations from both models.
3. Implement in `utils/recommendation.py`:

   ```python
   def generate_recommendations(user_id, model, top_n=10):
       ...
   ```

4. Save the top-10 recommendations for each evaluated user in `reports/user_<id>_recommendations.csv`.
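A sketch of what the required function might look like. It assumes `model` carries a full predicted-rating matrix plus lookup tables; the dict layout below is purely illustrative — adapt it to however your SVD/PMF classes expose predictions:

```python
import numpy as np
import pandas as pd

def generate_recommendations(user_id, model, top_n=10):
    """Return the top-n unseen movies for `user_id`, best first."""
    row = model["predictions"][model["user_index"][user_id]]
    scores = pd.Series(row, index=model["movie_ids"])
    # Exclude movies the user has already rated.
    already_rated = model["rated"].get(user_id, set())
    scores = scores.drop(labels=list(already_rated), errors="ignore")
    return scores.sort_values(ascending=False).head(top_n)

# Toy usage: user 7 has already rated movie 20, so it is filtered out.
model = {
    "predictions": np.array([[3.2, 4.8, 1.5]]),
    "user_index": {7: 0},
    "movie_ids": [10, 20, 30],
    "rated": {7: {20}},
}
top = generate_recommendations(7, model, top_n=2)
# top.to_csv("reports/user_7_recommendations.csv")
```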

#### Analysis and Visualization

1. Provide visualizations comparing SVD and PMF predictions for the same user.
2. Offer insights into how the models differ in recommending movies for specific users based on their ratings history.
3. Save the following plots under `reports/`:
   - `user_comparison.png` — SVD vs PMF predictions for a selected user
   - `top_recommendations.png` — Histogram (or bar chart) of top recommended movies

#### Streamlit Dashboard

@@ -52,6 +90,7 @@ The goal of this project is to understand and apply advanced matrix factorization

   - Movie recommendations from both the **SVD** and **PMF** models.
   - Visual comparison of the SVD vs. PMF predictions for the user.
2. Ensure real-time interaction, with recommendations and visualizations updating dynamically based on user input.
3. The app must run successfully via: `streamlit run app.py`

### Project Repository Structure

@@ -72,6 +111,15 @@ matrix-factorization-project/
│   ├── matrix_creation.py
│   ├── recommendation.py

├── reports/
│   ├── model_metrics.json
│   ├── pmf_convergence.png
│   ├── rmse_comparison.png
│   ├── predicted_vs_actual.png
│   ├── user_comparison.png
│   ├── top_recommendations.png
│   └── user_<id>_recommendations.csv

├── app.py
├── requirements.txt
├── Movie_Recommender_System.ipynb
@@ -85,20 +133,6 @@ matrix-factorization-project/
- **Movie_Recommender_System.ipynb**: A notebook for initial experiments, data exploration, and visualization of the model training and recommendations.
- **README.md**: Project documentation with an overview of the recommender system, instructions for setup and running the dashboard, and additional resources.

-### Timeline (1-2 weeks)
-
-**Week 1:**
-
-- **Days 1-2:** Load and preprocess the dataset, create user-item interaction matrix.
-- **Days 3-4:** Implement and train the SVD model.
-- **Days 5-7:** Implement and train the PMF model, visualize MSE vs. iterations for PMF.
-
-**Week 2:**
-
-- **Days 1-2:** Compare SVD and PMF models, evaluate using MSE.
-- **Days 3-4:** Implement recommendation generation for both models.
-- **Days 5-7:** Build the Streamlit dashboard, create visualizations, and finalize the project.
-
### Tips

Remember, a great recommender system needs to understand both the users and the content. Keep in mind the trade-off between model complexity and interpretability. Here are some additional considerations:

subjects/ai/matrix-factorization/audit/README.md

Lines changed: 57 additions & 0 deletions
@@ -8,6 +8,14 @@

###### Is there a `requirements.txt` or `environment.yml` file listing all necessary libraries and their versions?

###### Do the core files exist: `app.py`, `models/svd_model.py`, `models/pmf_model.py`, and `utils/recommendation.py`?

###### Do the main dependencies import without error?

```bash
python -c "import numpy, pandas, scipy, streamlit, matplotlib"
```

##### Data Processing and Exploratory Data Analysis

###### Is there an exploratory data analysis notebook describing insights from the MovieLens dataset?
@@ -16,6 +24,10 @@

###### Has a user-item interaction matrix been created from the data?

###### Was a reproducible split used (e.g., `random_state = 42`)?

###### Does the normalized user–item matrix exist at `processed/user_item_matrix.csv`?

##### Matrix Factorization Models

###### Has the Singular Value Decomposition (SVD) model been implemented using scipy.sparse.linalg.svds?
@@ -24,6 +36,12 @@

###### Have both models been trained on the MovieLens dataset?

###### Is the SVD predicted rating matrix saved as `reports/svd_predictions.npy`?

###### Does the PMF implementation save a convergence plot (`reports/pmf_convergence.png`)?

###### Are the learned factor matrices (`U`, `V`) saved (e.g., under `reports/pmf_factors/`)?

##### Model Evaluation

###### Is the Root Mean Square Error (RMSE) calculated for both models on a test set?
@@ -36,12 +54,38 @@

###### Is there a justification for when to stop training based on the learning curves?

###### Does `reports/model_metrics.json` exist with the following fields?

```json
{
  "SVD_RMSE": ...,
  "PMF_RMSE": ...,
  "PMF_vs_SVD_improvement_%": ...
}
```

###### Are the following thresholds met?

- SVD RMSE ≤ 0.90
- PMF RMSE ≤ 0.85
- PMF improvement ≥ 5%

###### Are the plots `reports/rmse_comparison.png` and `reports/predicted_vs_actual.png` saved?

##### Recommendation Generation

###### Is there a function that generates movie recommendations for a user based on both SVD and PMF models?

###### Does the recommendation system return the top 10 movie recommendations for a given user?

###### Does `utils/recommendation.py` expose the following function?

```python
def generate_recommendations(user_id, model, top_n=10):
    ...
```

###### Are user-level outputs saved as `reports/user_<id>_recommendations.csv`?

##### Model Interpretability

###### Is there an analysis of the key latent factors that drive recommendations (global interpretability)?
@@ -58,12 +102,25 @@

###### For the 2 users from the training set, is there an analysis of why the recommendations were accurate for one and less accurate for the other?

###### Are the required visuals present in `reports/` with proper titles and labeled axes?

- `pmf_convergence.png`
- `rmse_comparison.png`
- `predicted_vs_actual.png`
- `user_comparison.png`

##### Streamlit Dashboard

###### Has a Streamlit dashboard been implemented?

###### Does the dashboard take a user ID as input and return recommendations and required visualizations?

###### Does `streamlit run app.py` launch the dashboard successfully?

###### Does the dashboard update recommendations dynamically on user ID input?

###### Does it handle invalid user IDs gracefully (error shown, no crash)?

##### Additional Considerations

###### Is the code well-documented, and does it follow these good coding practices:

subjects/ai/vision-track/README.md

Lines changed: 78 additions & 1 deletion
@@ -84,6 +84,74 @@ The primary goal of **VisionTrack** is to develop practical skills in building a

- Evaluate the app's performance with multi-stream support using metrics like **precision**, **recall**, and **F1-score**.
- Display performance analysis within the app to inform users of the detection and tracking accuracy.

#### Validation

To ensure project completeness and audit validation, include the following:

1. **Model Artifacts**
   - Save all trained and optimized YOLO model weights in:

     ```
     models/checkpoints/
     ├── best.pt
     ├── best_quantized.onnx
     └── config.yaml
     ```

   - Include logs or configuration files documenting training and optimization steps.

2. **Evaluation Metrics**
   - Generate and save a report file: `reports/performance_metrics.json`
   - Example format:

     ```json
     {
       "detection_precision": 0.92,
       "detection_recall": 0.9,
       "f1_score": 0.91,
       "average_fps_per_stream": 18.5,
       "average_latency_ms": 85.0
     }
     ```

   - Minimum passing thresholds:
     - Precision ≥ 0.85
     - Recall ≥ 0.80
     - F1-score ≥ 0.85
     - Average FPS ≥ 15 (for 720p video)

3. **Real-Time App Test**
   - The app must run using:

     ```
     streamlit run app.py
     ```

   - The app should:
     - Display real-time detection overlays and FPS/latency counters.
     - Allow toggling of detection and tracking features per stream.
     - Handle missing or broken video sources gracefully.

4. **ROI Counting Validation**
   - Demonstrate ROI-based counting of people entering/exiting the region.
   - Save examples in:

     ```
     reports/demo_results/
     ├── roi_counting_example.png
     └── multi_stream_demo.mp4
     ```
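Entry/exit counting against an ROI reduces to an inside-test on consecutive track positions. A minimal sketch with a rectangular ROI (real ROIs may be polygons, e.g. tested with `cv2.pointPolygonTest`; coordinates below are illustrative):

```python
def inside(point, box):
    """True if (x, y) lies within the axis-aligned ROI box (x1, y1, x2, y2)."""
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2

def crossing(prev_point, curr_point, box):
    """+1 when a track enters the ROI, -1 when it exits, 0 otherwise."""
    return int(inside(curr_point, box)) - int(inside(prev_point, box))

# Toy track: a person walks into the ROI between two frames.
roi = (100, 100, 400, 300)
entries = exits = 0
delta = crossing((50, 200), (150, 200), roi)
if delta > 0:
    entries += 1
elif delta < 0:
    exits += 1
```

Applying `crossing` per tracked ID per frame pair keeps the counters stable even when detections flicker, as long as the tracker IDs are consistent.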

5. **GPU and Fallback Test**
   - Check for CUDA availability in your code:

     ```python
     import torch
     print("Using CUDA:", torch.cuda.is_available())
     ```

   - The app must still run on CPU if CUDA is unavailable (with lower FPS).

6. **Error Handling**
   - The app must not crash on missing files or failed streams.
   - Log errors to:

     ```
     logs/app_errors.log
     ```

### Project Repository Structure

```
@@ -97,15 +165,24 @@ vision-track/
├── models/
│   ├── yolo_person_detection.py
│   ├── __init__.py
│   └── checkpoints/
│       ├── best.pt
│       ├── best_quantized.onnx
│       └── config.yaml

├── utils/
│   ├── data_loader.py
│   ├── preprocessing.py
│   ├── multi_stream_tracking_helpers.py
│   ├── counting_logic.py
│   ├── VisionTrack_Analysis.ipynb
│   └── __init__.py

├── reports/demo_results/
│   ├── roi_counting_example.png
│   └── multi_stream_demo.mp4

├── app.py
├── README.md              # Project overview and setup instructions
└── requirements.txt       # List of dependencies
```

subjects/ai/vision-track/audit/README.md

Lines changed: 39 additions & 0 deletions
@@ -8,6 +8,8 @@

###### Is a `requirements.txt` file included with all dependencies and specific library versions required to run the project?

###### Do the main dependencies import without error? `python -c "import torch, supervision, cv2, streamlit"`

##### Data Processing and Exploratory Data Analysis

###### Does the Jupyter notebook (`VisionTrack_Analysis.ipynb`) include EDA showcasing data distribution, object detection samples, and preprocessing methods?
@@ -16,6 +18,9 @@

###### Does data preprocessing include resizing and normalization, ensuring compatibility with YOLO model input formats?

- Validate YOLO-compatible annotations (`.txt` files with class, x, y, w, h).
- Confirm frames are resized and normalized properly before inference.

##### Model Implementation

###### Is the YOLO model implemented for person detection with configuration options for detection thresholds and class-specific tuning?
@@ -32,6 +37,8 @@

###### Does the project include logic for tracking and counting entries and exits within specified regions of interest (ROIs)?

###### Are trained weights saved in `models/checkpoints/best.pt`?

##### Streamlit App Development

###### Is the **Streamlit** app implemented to display video feeds with overlaid detection, tracking, and counting information?
@@ -56,6 +63,38 @@

###### Are evaluation metrics presented, showcasing precision, recall, and F1-score to assess the effectiveness of detection and tracking?

###### Check the following:

- The metrics file `reports/performance_metrics.json` exists.
- The JSON includes:

  ```json
  {
    "detection_precision": ...,
    "detection_recall": ...,
    "f1_score": ...,
    "average_fps_per_stream": ...,
    "average_latency_ms": ...
  }
  ```

- Minimum thresholds are met:
  - Precision ≥ 0.85
  - Recall ≥ 0.80
  - F1 ≥ 0.85
  - FPS ≥ 15 (720p)
- Metrics are visible in the Streamlit dashboard (FPS and latency shown live).

##### Additional Considerations

###### Is the codebase documented with comments and explanations for readability and maintainability?

subjects/guess-it-1/README.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ Each of the numbers will be your standard input and the purpose of your program

This range should have a space separating the lower limit from the upper one like in the example:

```console
->$ ./your_program
+$ ./your_program
189 --> the standard input
120 200 --> the range for the next input, in this case for the number 113
113 --> the standard input
