Ben backend #40
Open
wants to merge 76 commits into
base: master
ba52b30
Add data exploration file
Mar 17, 2025
b7badaf
Add a books json file
Mar 19, 2025
6fac54d
Editted app.py by adding get books function to return first 10 books
Mar 19, 2025
3cf12dd
Modify endpoint
Mnu02 Mar 19, 2025
dd3b634
Updated base.html, removed old style.css, added new styles.css and sc…
Mar 19, 2025
a0d93eb
Add new dataset class with unimplemented functions
Mar 21, 2025
7a3628c
Add processing class
Mnu02 Mar 21, 2025
030e398
Merge pull request #1 from Peter-N-Wainaina/frank-backend
Mnu02 Mar 21, 2025
e4a9f44
Remove stuff dealing with dataset from processing
Mnu02 Mar 21, 2025
6d442d1
Add empty __init__.py file in backend
Mnu02 Mar 21, 2025
410f598
modified: .gitignore
Mar 20, 2025
e503ad4
Merge branch 'master' into brandon-frontend
kipkorir98 Mar 21, 2025
dfbbe2e
Merge pull request #2 from Peter-N-Wainaina/brandon-frontend
kipkorir98 Mar 21, 2025
81f82b4
Add implementatin for getting recs by author and genre using Jaccard
Mnu02 Mar 21, 2025
28ab54f
Implemented get recommendations by title together with its helper fun…
Mar 21, 2025
42721d0
Resolved merge conflict in backend/app.py and merged changes
Mar 21, 2025
e36a22b
Merge pull request #3 from Peter-N-Wainaina/ben_backend
Benja1958 Mar 21, 2025
ab54373
Implement and test Database class, Add utils and constants files.
Mar 21, 2025
12319b9
Add implementation to endpoints
Mnu02 Mar 21, 2025
7c14a79
Merge branch 'master' into frank-processing
Mnu02 Mar 21, 2025
7cb35bb
Merge pull request #4 from Peter-N-Wainaina/frank-processing
Mnu02 Mar 21, 2025
14ff308
remove script.js and add script to base.html
Mar 21, 2025
8e9b17d
implement getting user input, modifying it and sending it to backend
Mar 21, 2025
ef50a74
Merge pull request #5 from Peter-N-Wainaina/vera-frontend
kipkorir98 Mar 21, 2025
8df2c6b
Remove unnecessary endpoints from app
Mnu02 Mar 21, 2025
e1da22a
Add score to book info
Mnu02 Mar 21, 2025
5a95b06
implemented main get_all_books function to be called by fronend
Mar 21, 2025
8459aea
resolved conflicts
Mar 21, 2025
8f9ea44
Merge pull request #6 from Peter-N-Wainaina/ben_backend
Benja1958 Mar 21, 2025
964f4b3
Fix errors in app.py, and refactor tests folder
Mar 21, 2025
fdf09c0
Change class name from Processing to Processor
Mnu02 Mar 21, 2025
4edf159
Add popular books dataset
Mar 21, 2025
dff281b
Extract test dataset variable
Mnu02 Mar 21, 2025
881eca1
Add json input to Processor initializer
Mnu02 Mar 21, 2025
c1fd738
Remove test dataset declaration
Mnu02 Mar 21, 2025
df1fc58
Modified the frontend
Mar 21, 2025
75f6d7c
Add and test titles index
Mar 22, 2025
f2f0fa4
Added tests for category and author
Mnu02 Mar 22, 2025
c9fa53b
Modify inputs for getting recs by categories and authors
Mnu02 Mar 22, 2025
d98d78b
Merge pull request #7 from Peter-N-Wainaina/frank-testing
Mnu02 Mar 22, 2025
cac7abf
worked on get_recommended_books and the frontend logic for that
Mar 22, 2025
76c08cb
Merge pull request #8 from Peter-N-Wainaina/brandon
Mnu02 Mar 22, 2025
50fc56e
Implemented a placeholder function for get recomendations by title
Mar 22, 2025
54096e7
Merge pull request #9 from Peter-N-Wainaina/ben_backend
kipkorir98 Mar 22, 2025
e0bfb49
Fix errors in processing and frontend
Mar 22, 2025
eb21610
Add dataset for testing processor
Mnu02 Mar 22, 2025
39e285f
Add tests for processor
Mnu02 Mar 22, 2025
58794f8
Debug processor according to expected inputs and jaccard
Mnu02 Mar 22, 2025
83412d4
Implemented get recommendations by title to use cosine similarity
Mar 23, 2025
89747d4
Merged 'master' into 'ben_backend' to bring updates from the remote r…
Mar 23, 2025
4d6b3fa
Merge pull request #10 from Peter-N-Wainaina/ben_backend
Benja1958 Mar 23, 2025
c3e6824
Updated get recommendations by title to use cosine similarity
Mar 23, 2025
d35208d
Merge branch 'master' of https://github.com/Peter-N-Wainaina/rank_n_r…
Mar 23, 2025
46c1dc4
Merge pull request #11 from Peter-N-Wainaina/ben_backend
Benja1958 Mar 23, 2025
512c259
Fixed the Frontend - added a favicon, added head tags, fixed images a…
Mar 23, 2025
7aa4317
add functionality for suggestions while user types input
Mar 24, 2025
0187653
Merge branch 'master' of github-personal:Peter-N-Wainaina/rank_n_read…
Mar 24, 2025
29068c4
fix identation
Mar 24, 2025
8d6fa31
Update .gitignore
Mar 24, 2025
5f93932
Merge branch 'vera-frontend'
Mar 24, 2025
60af1c6
Implement get recommendations function and remove NaN values from dat…
Mar 24, 2025
d13aa2a
added test cases for get recommendations by title
Mar 24, 2025
774e613
Resolved merge conflicts in processing.py and test_processing.py
Mar 24, 2025
4f83e2f
Merge pull request #12 from Peter-N-Wainaina/ben_backend
Benja1958 Mar 24, 2025
f632d84
Implement title vocab frequency and cutoff threshold for title search
Mar 25, 2025
1376613
implemented refresh and feedback when no books found
Mar 25, 2025
395facf
Added get books recomndations by description
Apr 12, 2025
e08bfe1
Added get books recommendations by description
Benja1958 Apr 12, 2025
4435f59
Refactor tokenizing funcs to clean up the code - tests still work
Mnu02 Apr 12, 2025
b6bc3b2
Implement book data dictionary
Apr 12, 2025
52a792e
Add skeleton funcs for svd
Mnu02 Apr 13, 2025
e4e372d
Fix return typr for create tfidf
Mnu02 Apr 13, 2025
8492473
Implemented create_tfidf_matrix and transform_query
Apr 13, 2025
61d46ee
Implemented create_tfidf_matrix and transform_query
Benja1958 Apr 13, 2025
307e21c
Resolve merge conflict in test_processing.py
Mnu02 Apr 13, 2025
8933a05
changed create_tfidf_matrix function to take in books metadata
Apr 13, 2025
6 changes: 5 additions & 1 deletion .gitignore
@@ -3,6 +3,8 @@ env/
Docker/__pycache__

venv/
project-venv/
test-env/

*.pyc
__pycache__/
@@ -17,4 +19,6 @@ dist/
build/
*.egg-info/
helpers/*
json_template/
json_template/

.history
12 changes: 12 additions & 0 deletions backend/.coveragerc
@@ -0,0 +1,12 @@
# .coveragerc
[run]
branch = True
source = backend

[report]
omit =
    */__init__.py
    */tests/*
    */config.py
show_missing = True
skip_covered = True
Empty file added backend/__init__.py
Empty file.
86 changes: 55 additions & 31 deletions backend/app.py
@@ -1,46 +1,70 @@
import json
import os
from flask import Flask, render_template, request
from flask_cors import CORS
from helpers.MySQLDatabaseHandler import MySQLDatabaseHandler
import pandas as pd
from flask import Flask, render_template, request,jsonify
from flask_cors import CORS

# ROOT_PATH for linking with all your files.
# Feel free to use a config.py or settings.py with a global export variable
os.environ['ROOT_PATH'] = os.path.abspath(os.path.join("..",os.curdir))

# Get the directory of the current script
current_directory = os.path.dirname(os.path.abspath(__file__))

# Specify the path to the JSON file relative to the current script
json_file_path = os.path.join(current_directory, 'init.json')

# Assuming your JSON data is stored in a file named 'init.json'
with open(json_file_path, 'r') as file:
    data = json.load(file)
    episodes_df = pd.DataFrame(data['episodes'])
    reviews_df = pd.DataFrame(data['reviews'])
from .processing import Processor
from .dataset import Dataset

app = Flask(__name__)
CORS(app)

# Sample search using json with pandas
def json_search(query):
    matches = []
    merged_df = pd.merge(episodes_df, reviews_df, left_on='id', right_on='id', how='inner')
    matches = merged_df[merged_df['title'].str.lower().str.contains(query.lower())]
    matches_filtered = matches[['title', 'descr', 'imdb_rating']]
    matches_filtered_json = matches_filtered.to_json(orient='records')
    return matches_filtered_json
processor = Processor()

@app.route("/")
def home():
    return render_template('base.html',title="sample html")

@app.route("/episodes")
def episodes_search():
    text = request.args.get("title")
    return json_search(text)
@app.route("/getbooks", methods=["POST"])
def books_search():
    user_input = request.get_json()
    books = processor.get_recommended_books(user_input)
    result_json = jsonify(books)
    return result_json

# @app.route("/titles")
# def get_title_suggestions():
#     query = request.args.get("q", "").lower()

#     suggestions = []
#     for title in processor.books.keys():
#         if query in title.lower():
#             suggestions.append(title)

#     return jsonify(suggestions[:10])

# @app.route("/authors")
# def get_author_suggestions():
#     query = request.args.get("q", "").lower()
#     seen = set()
#     suggestions = []

#     for book_list in processor.books.values():
#         for book in book_list:
#             for author in book.get("authors", []):
#                 author_lower = author.lower()
#                 if query in author_lower and author_lower not in seen:
#                     suggestions.append(author)
#                     seen.add(author_lower)

#     return jsonify(suggestions[:10])

@app.route("/categories")
def get_category_suggestions():
    query = request.args.get("q", "").lower()
    seen = set()
    suggestions = []

    for book_list in processor.books.values():
        for book in book_list:
            for category in book.get("categories", []):
                category_lower = category.lower()
                if query in category_lower and category_lower not in seen:
                    suggestions.append(category)
                    seen.add(category_lower)

    return jsonify(suggestions[:5])

if 'DB_NAME' not in os.environ:
    app.run(debug=True,host="0.0.0.0",port=5000)
    app.run(debug=True,host="0.0.0.0",port=5001)
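
Review note on the `/categories` handler above: its matching logic could be exercised without a running Flask app if it were factored into a plain function. The following is only a sketch, assuming `processor.books` maps each title to a list of record dicts, as the loop in the diff suggests. The names `suggest_categories`, `limit`, and `sample_books` are hypothetical and not part of this PR.

```python
def suggest_categories(books, query, limit=5):
    """Return up to `limit` distinct categories that contain `query`.

    Matching and de-duplication are case-insensitive; the first spelling
    encountered is the one returned, mirroring the handler in the diff.
    """
    query = (query or "").lower()
    seen = set()
    suggestions = []
    for book_list in books.values():
        for book in book_list:
            # Records without a "categories" key contribute nothing.
            for category in book.get("categories", []):
                key = category.lower()
                if query in key and key not in seen:
                    suggestions.append(category)
                    seen.add(key)
                    if len(suggestions) == limit:
                        return suggestions
    return suggestions


# Hypothetical sample data, shaped like the assumed processor.books mapping.
sample_books = {
    "Dune": [{"categories": ["Science Fiction", "Fiction"]}],
    "Emma": [{"categories": ["fiction", "Romance"]}],
    "Untitled": [{}],
}

matches = suggest_categories(sample_books, "fic")
```

Unlike the handler, which slices the full list at the end, this version stops scanning once `limit` matches are found. The returned items are the same either way; the early exit only avoids walking the rest of the dataset.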
4 changes: 4 additions & 0 deletions backend/config.py
@@ -0,0 +1,4 @@
import os

ROOT_PATH = os.path.abspath(os.path.dirname(__file__))
DEFAULT_BOOKS_JSON_FILE = os.path.join(ROOT_PATH,'data', 'popular_books.json')
25 changes: 25 additions & 0 deletions backend/constants.py
@@ -0,0 +1,25 @@
from types import SimpleNamespace

# Database Book Record Keys
CATEGORY_KEY = "categories"
AUTHOR_KEY = "authors"
SCORE_KEY = "score"
DESCRIPTION_KEY = "description"

NOT_AVAILABLE = "NOT AVAILABLE"

# Frontend Input Keys
INPUT_TITLES_KEY = "titles"
INPUT_AUTHORS_KEY = "authors"
INPUT_CATEGORIES_KEY = "categories"

DEFAULT_RECS_WEIGHTS = SimpleNamespace(
    TITLES = 0.6,
    AUTHORS = 0.3,
    CATEGORIES = 0.1
)

DEFAULT_RECS_SIZE = 20

# Processor Constants
NUM_LATENT_SEMANTIC_CONCEPTS = 100
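
Review note: the diff does not show how `DEFAULT_RECS_WEIGHTS` is consumed. One plausible reading, given that the commit history mentions Jaccard similarity for authors and genres, is a weighted sum of per-signal scores. The sketch below illustrates that reading only; `jaccard` and `combined_score` are hypothetical helpers, not functions from this PR, and each score is assumed to lie in [0, 1].

```python
from types import SimpleNamespace

# Values copied from backend/constants.py in this PR.
DEFAULT_RECS_WEIGHTS = SimpleNamespace(TITLES=0.6, AUTHORS=0.3, CATEGORIES=0.1)


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; defined here as 0.0 when both are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def combined_score(title_score, author_score, category_score,
                   weights=DEFAULT_RECS_WEIGHTS):
    """Hypothetical weighted sum of the three per-signal scores."""
    return (
        weights.TITLES * title_score
        + weights.AUTHORS * author_score
        + weights.CATEGORIES * category_score
    )


# One shared author out of two distinct authors overall.
author_score = jaccard({"Frank Herbert"}, {"Frank Herbert", "Brian Herbert"})
# Identical category sets.
category_score = jaccard({"Fiction"}, {"Fiction"})
# The title score is an arbitrary placeholder for illustration.
score = combined_score(0.5, author_score, category_score)
```

Because the three weights sum to 1.0, the combined score stays in [0, 1] whenever the inputs do, which keeps results comparable across queries that supply different mixes of titles, authors, and categories.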