pull request #3

Open
wants to merge 455 commits into
base: master
9d4dcb9
removed olf comma
camille-bb Apr 12, 2025
3ecc1e0
UI Changed
KAsqech Apr 12, 2025
0de2687
Search bar changes
KAsqech Apr 12, 2025
2b8dc18
Gif
KAsqech Apr 12, 2025
7fabe52
Gif changes
KAsqech Apr 12, 2025
c773d06
More gif changes
KAsqech Apr 12, 2025
677b77b
Site header
KAsqech Apr 12, 2025
fa40f4a
app.py update - finally works!!
danic77 Apr 12, 2025
1137c5d
updated title header
danic77 Apr 12, 2025
7524677
background color
danic77 Apr 12, 2025
88aa54e
added Entry object
camille-bb Apr 12, 2025
ba0a8b5
new header!
danic77 Apr 12, 2025
9b3cf37
updating changes
danic77 Apr 12, 2025
da92475
removed %s in init.sql
danic77 Apr 12, 2025
507713c
entry class
danic77 Apr 12, 2025
35b8217
new changes
camille-bb Apr 12, 2025
41393e5
updated header again
danic77 Apr 12, 2025
61194bd
added dumbly
camille-bb Apr 12, 2025
669bdd4
added ourentries
yoojungjoe3 Apr 12, 2025
467eb5e
added entry and final sorting
yoojungjoe3 Apr 12, 2025
0b33663
Entry __repr__
danic77 Apr 12, 2025
e3767cb
mini
danic77 Apr 12, 2025
79f922e
min
danic77 Apr 12, 2025
c8f4ab8
updated ourentries to jsonify
yoojungjoe3 Apr 12, 2025
32656ad
changed
camille-bb Apr 12, 2025
fa0ddea
mini change
danic77 Apr 12, 2025
1a96aa8
html fix attempt #1
yoojungjoe3 Apr 12, 2025
e8f1fdb
html fix attempt #2
yoojungjoe3 Apr 12, 2025
827a4c7
html fix attempt #3
yoojungjoe3 Apr 12, 2025
f82fe6c
html fix attempt #4
yoojungjoe3 Apr 12, 2025
9bda83b
results work!!!!!!
danic77 Apr 13, 2025
1e28923
fixed formatting
danic77 Apr 13, 2025
8ef7462
added padding
camille-bb Apr 13, 2025
7fab8dc
fixed link 1
camille-bb Apr 13, 2025
e49a0f3
fixed link 2
camille-bb Apr 13, 2025
ed37a2c
addited message
camille-bb Apr 13, 2025
01f0be2
making box bigger 1
camille-bb Apr 13, 2025
c91d3e4
new no fanfics message
camille-bb Apr 13, 2025
96a4fb5
commented out start to image stuff
camille-bb Apr 13, 2025
1306acf
made bigger
camille-bb Apr 13, 2025
a9cca61
added one pic
camille-bb Apr 13, 2025
01b77ee
added [ictures
camille-bb Apr 13, 2025
926cf65
changed HG to PD
camille-bb Apr 13, 2025
77fa186
elif to else if
camille-bb Apr 13, 2025
624a41b
made images string
camille-bb Apr 13, 2025
c5ebf1f
fix to images
camille-bb Apr 13, 2025
cf02288
change to images
camille-bb Apr 13, 2025
5c95b36
added comma
camille-bb Apr 13, 2025
904b3f3
wrapped image
camille-bb Apr 13, 2025
eb39e8e
made console logs
camille-bb Apr 13, 2025
8eb9025
removed or
camille-bb Apr 13, 2025
55d3422
try again
camille-bb Apr 13, 2025
56e3148
edit
camille-bb Apr 13, 2025
0a8fe1a
changed path
camille-bb Apr 13, 2025
f8d3a25
fixed bugs
camille-bb Apr 13, 2025
cab7f8d
added qutoes
camille-bb Apr 13, 2025
d6e66f3
improved vector_search w/ averages
danic77 Apr 13, 2025
9cbe445
new images
camille-bb Apr 13, 2025
58d1bac
new data in init.sql
danic77 Apr 13, 2025
60f8204
added merlin
camille-bb Apr 13, 2025
91aa4f7
new entries to database!
danic77 Apr 13, 2025
ccecc41
fixed init.sql
danic77 Apr 13, 2025
a79c832
fixed init.sql formatting issues
danic77 Apr 13, 2025
fea73a9
cut down init.sql and images
danic77 Apr 14, 2025
c60e875
SVD integration
KAsqech Apr 14, 2025
5edb243
added search button; fixed svd_vector_search; improved performance
danic77 Apr 14, 2025
65e626e
added my imortal
camille-bb Apr 14, 2025
6801d4d
prettier button
camille-bb Apr 14, 2025
201a333
changed colours of button
camille-bb Apr 14, 2025
349d30c
switched fandom and ship
camille-bb Apr 14, 2025
70d5e1b
replaced all None with nothing in init
camille-bb Apr 14, 2025
f8d961a
fixed avg afer SVD to reduce valuable information cuts
yoojungjoe3 Apr 14, 2025
0736da6
limit to 10 outputs
yoojungjoe3 Apr 14, 2025
66bae6e
changed weights
yoojungjoe3 Apr 14, 2025
c0fe02d
print statements
yoojungjoe3 Apr 14, 2025
1cb764c
fixed print error
danic77 Apr 14, 2025
bf49a83
erased print
yoojungjoe3 Apr 14, 2025
25540d2
Modified app.py and base.html to implement user feedback
KAsqech Apr 17, 2025
8e1f43e
Removed old vector search from app.py
KAsqech Apr 17, 2025
a958fe0
Adding like and dislike buttons for user feedback
KAsqech Apr 17, 2025
330b446
Modified session access
KAsqech Apr 18, 2025
fb146cf
Modified feedback
KAsqech Apr 18, 2025
bf3f016
Modified Rocchio algorithm
KAsqech Apr 18, 2025
f274bab
Modified Rocchio and submit_feedback
KAsqech Apr 18, 2025
8e8d520
Modified submit_feedback
KAsqech Apr 18, 2025
fc34708
Explicitly defining host
KAsqech Apr 18, 2025
5e5b2ba
Reverting mysql_engine
KAsqech Apr 18, 2025
254244f
Changed database credentials to test
KAsqech Apr 18, 2025
74b0852
Removed user for db credentials
KAsqech Apr 19, 2025
d5d67f3
Changed setup of db credentials
KAsqech Apr 19, 2025
d909d66
Changed host
KAsqech Apr 19, 2025
284cac0
Changed host back to correct db
KAsqech Apr 19, 2025
5b29687
Changed db credentials
KAsqech Apr 19, 2025
a940d82
Tried removing flask secret key
KAsqech Apr 19, 2025
0fff92f
Vectorizer thru initialize_precomputed
KAsqech Apr 19, 2025
007e510
Defined all_text to combine all text fields
KAsqech Apr 19, 2025
52e9fdd
Typo
KAsqech Apr 19, 2025
dd01f1c
Changed host to test
KAsqech Apr 20, 2025
b715de7
Reincluded session secret key
KAsqech Apr 20, 2025
40d00a6
Trying different secret key
KAsqech Apr 20, 2025
c64f928
Delaying intialize_precomputed() until first request
KAsqech Apr 20, 2025
7bcbe88
Loading old frontend
KAsqech Apr 20, 2025
c86b810
Testing new iteration of base.html with user feedback in frontend
KAsqech Apr 20, 2025
fca4380
Added a check to filterText()
KAsqech Apr 20, 2025
93ded36
Modified apply rocchio to safely handle feedback
KAsqech Apr 20, 2025
852d344
Testing new backend with functional frontend
KAsqech Apr 20, 2025
83d67aa
changed to new frontend
KAsqech Apr 20, 2025
f3df092
Updated submit_feedback
KAsqech Apr 20, 2025
e4f99a7
Reverted to the prior version of submit_feedback
KAsqech Apr 20, 2025
54c2ba7
Passing the matrix
KAsqech Apr 20, 2025
5d11933
Moved portions of initialize_precomputed around
KAsqech Apr 20, 2025
cce8030
Moved rocchio vars
KAsqech Apr 20, 2025
ff62afc
Moved things around and added raw fields in initialize_precomputed()
KAsqech Apr 20, 2025
236933e
declared global precomputed dictionary
KAsqech Apr 20, 2025
ff3be15
Calling initialize_precomputed
KAsqech Apr 20, 2025
6f0a7bc
removed call to initialize_precomputed
KAsqech Apr 20, 2025
28b2490
changed initialization of precomputed
KAsqech Apr 20, 2025
d9a3a5a
removed call to initialize_precomputed
KAsqech Apr 20, 2025
85564ee
Using field-specfic vectorizers
KAsqech Apr 20, 2025
9ab1656
Changed to field-specific logic
KAsqech Apr 20, 2025
76f09e8
fixed syntax error
KAsqech Apr 20, 2025
71d2265
Prints to test
KAsqech Apr 20, 2025
d631678
Changed precompute function arguments
KAsqech Apr 20, 2025
1c9fd9f
Prints included to test
KAsqech Apr 20, 2025
6a45207
change to rocchio
KAsqech Apr 20, 2025
723bdfb
testing feedback db
KAsqech Apr 20, 2025
4bf8b3e
reverted changes
KAsqech Apr 20, 2025
69a059a
updated db to include feedback table
KAsqech Apr 20, 2025
e8ad36b
Created store_feedback_in_db method
KAsqech Apr 20, 2025
3063a93
Removed startup_precompute
KAsqech Apr 21, 2025
6bfede4
Added prints to test
KAsqech Apr 21, 2025
68d70b4
Moved tests
KAsqech Apr 21, 2025
f6313ae
Added test prints
KAsqech Apr 21, 2025
20c6ad2
edited vectorizer
KAsqech Apr 21, 2025
5549f59
Added function to wat for mysql connection
KAsqech Apr 21, 2025
b02f89d
Edited waiting function
KAsqech Apr 21, 2025
51e36d4
Added execute query function to MySQlDatabaseHandler
KAsqech Apr 21, 2025
69d6811
Added self.connection
KAsqech Apr 21, 2025
d8f5794
edited execute_query
KAsqech Apr 21, 2025
b486b1a
modified syntax error
KAsqech Apr 21, 2025
b61cfb0
modified call to execute_query
KAsqech Apr 21, 2025
0607586
changed svd query vector search
KAsqech Apr 23, 2025
31f6b77
Fixed typo
KAsqech Apr 23, 2025
c3c3733
changed pop to get
KAsqech Apr 23, 2025
a2c831d
Corrected typos
KAsqech Apr 23, 2025
3aa3934
Changed alpha, beta, and gamma
KAsqech Apr 23, 2025
a956a1f
Filtering out disliked search results
KAsqech Apr 23, 2025
0546acb
Fixed sorted_keys
KAsqech Apr 23, 2025
ff7b306
Added def for avg
KAsqech Apr 23, 2025
04bbf30
adjusted apply_rocchio_feedback
KAsqech Apr 23, 2025
4acc340
uncommented
KAsqech Apr 23, 2025
b723c7d
Modified alpha, beta, and gamma
KAsqech Apr 23, 2025
22c7c01
Modified liked and disliked
KAsqech Apr 23, 2025
8be0e9c
Changed alpha
KAsqech Apr 23, 2025
6ea9a89
Added prints for debugging
KAsqech Apr 23, 2025
270f2e9
prints to test
KAsqech Apr 23, 2025
396f595
added images to image file
camille-bb Apr 24, 2025
a227985
added enter
camille-bb Apr 24, 2025
6764e2b
revert back to P04
yoojungjoe3 Apr 24, 2025
e69b355
revert back to P04 pt.2
yoojungjoe3 Apr 24, 2025
af810da
updated database w/ new fandoms
danic77 Apr 24, 2025
0b5bc1c
revert back to P04 working code
danic77 Apr 24, 2025
85b23a2
update to database
danic77 Apr 24, 2025
9c1680a
deleted base1 and app1
camille-bb Apr 24, 2025
ba5c2ab
added enter key
camille-bb Apr 24, 2025
ed56f93
added image implementation into the code
camille-bb Apr 24, 2025
bf34738
updated database
danic77 Apr 24, 2025
580abac
update database (again)
danic77 Apr 24, 2025
ee4a79a
fixed button
camille-bb Apr 24, 2025
2eaf805
database >(
danic77 Apr 24, 2025
86934c4
cleaning up
danic77 Apr 24, 2025
0808f40
fix
danic77 Apr 24, 2025
b3174c2
fix
camille-bb Apr 24, 2025
4b60304
removed useless comments
camille-bb Apr 24, 2025
ec0fce5
dropdown version 1
camille-bb Apr 24, 2025
13d6d69
undo
camille-bb Apr 24, 2025
6949cb6
try again
camille-bb Apr 24, 2025
fa16e1f
changed location
camille-bb Apr 24, 2025
f719b64
made all names the same for dropdown
camille-bb Apr 24, 2025
79368a1
added all fandoms to dropdown
camille-bb Apr 25, 2025
160511e
got fandom dropdown to the next
camille-bb Apr 25, 2025
5f2353e
added into SVD Vector Search
camille-bb Apr 25, 2025
cc67db3
like/dislike buttons
danic77 Apr 25, 2025
e8e64b4
new values for the dropdown list
camille-bb Apr 25, 2025
b0d6b87
fixed spelling
camille-bb Apr 25, 2025
e201233
added Select none option
camille-bb Apr 25, 2025
bd4fb8e
alphabetised drop down list
camille-bb Apr 25, 2025
a010e23
restructured with flex box
camille-bb Apr 25, 2025
34c93cd
decorated the drop down list
camille-bb Apr 25, 2025
3443524
changed colours
camille-bb Apr 25, 2025
4d89303
changed colours again
camille-bb Apr 25, 2025
eb1be00
added padding
camille-bb Apr 25, 2025
fdc109a
put back under
camille-bb Apr 25, 2025
ea79c16
columned them
camille-bb Apr 25, 2025
5d1630f
changed colour and padding
camille-bb Apr 25, 2025
6107b90
limited to 3 and rounded box
camille-bb Apr 25, 2025
e72b6d5
red version
camille-bb Apr 25, 2025
3a32137
full round edges coloured
camille-bb Apr 25, 2025
0be97c9
added quill cursor
camille-bb Apr 25, 2025
388dc51
made the quill more tiny
camille-bb Apr 25, 2025
a12cf1b
blah
camille-bb Apr 25, 2025
34df084
moved body style into its own thing
camille-bb Apr 25, 2025
238b05d
fixed url
camille-bb Apr 25, 2025
f050c0f
tried moving scroll-box into css bit for clarity
camille-bb Apr 25, 2025
7b24068
trying to make entries into a scroll shape
camille-bb Apr 25, 2025
88af00d
simplified it
camille-bb Apr 25, 2025
858b23f
changed location of scroll
camille-bb Apr 25, 2025
446d35b
added morw padding to move text down
camille-bb Apr 25, 2025
7b71395
changed scroll and added no repeat
camille-bb Apr 25, 2025
3237295
added padding to the rest of the sides
camille-bb Apr 25, 2025
3e27494
changed padding a bit more
camille-bb Apr 25, 2025
671f153
adjust padding
camille-bb Apr 25, 2025
eb2fb54
changed way of adjusting padding and added font everywhere
camille-bb Apr 25, 2025
95b0c7b
size adjust agin
camille-bb Apr 25, 2025
d704ead
keep width uniform
camille-bb Apr 25, 2025
d878245
make bigger
camille-bb Apr 25, 2025
1e3ae03
fied size of image
camille-bb Apr 25, 2025
78d2dfd
adjusted size
camille-bb Apr 25, 2025
0abf428
rocchio, dislike/like functionality
danic77 Apr 25, 2025
ac4efd2
base.html like/dislike chages
danic77 Apr 25, 2025
a0e4384
Added new scroll
KAsqech Apr 25, 2025
ca271a1
Adding cursor to fandom dropdown
KAsqech Apr 25, 2025
e0be7d0
Removed quill
KAsqech Apr 25, 2025
d0c7eea
Changes to cursor
KAsqech Apr 25, 2025
25bb8bd
Changes to cursor
KAsqech Apr 25, 2025
f148153
new scroll image
danic77 Apr 25, 2025
90721b5
image revamp
danic77 Apr 25, 2025
8565491
fixed merge issue
danic77 Apr 25, 2025
85e85da
fixed scroll bar
danic77 Apr 25, 2025
05478f0
new scroll
danic77 Apr 25, 2025
6c0729c
fixed scroll ui
danic77 Apr 25, 2025
f61138b
init fix
danic77 Apr 25, 2025
882d588
changed colours of the rating
danic77 Apr 25, 2025
34dd2ae
changed colours
danic77 Apr 25, 2025
756010e
fixed scroll and colours
camille-bb Apr 25, 2025
155c7f5
try colours again
camille-bb Apr 25, 2025
1081d4b
removed rocchio
danic77 Apr 25, 2025
530f385
blahhh
camille-bb Apr 25, 2025
18d8698
white when do back
camille-bb Apr 25, 2025
02d5e6d
removed spanish
camille-bb Apr 25, 2025
27c7c91
fixed liked changing colors
danic77 Apr 25, 2025
17a12fb
changed border radius and font of stuff
camille-bb Apr 25, 2025
30fd8c0
changed to go back to grey
camille-bb Apr 25, 2025
eb08d14
replaced some images with transparent
camille-bb Apr 25, 2025
6031a3e
rocchio
danic77 Apr 25, 2025
e9dbbed
cleaning up code
danic77 Apr 25, 2025
8f9e76b
cleaned up print statements
danic77 Apr 25, 2025
a51a96a
cleaned old database
camille-bb Apr 25, 2025
004f540
removed problem
camille-bb Apr 25, 2025
62c1351
put database back
camille-bb Apr 25, 2025
3 changes: 2 additions & 1 deletion .gitignore
@@ -16,4 +16,5 @@ htmlcov/
dist/
build/
*.egg-info/
helpers/*
helpers/*
/backend/myvenv4300
72 changes: 72 additions & 0 deletions DataSet.csv

Large diffs are not rendered by default.

67 changes: 67 additions & 0 deletions DataSet_clean.csv

Large diffs are not rendered by default.

302 changes: 277 additions & 25 deletions backend/app.py
@@ -1,46 +1,298 @@
import json
import os
from flask import Flask, render_template, request
import re
import numpy as np
from flask import Flask, render_template, request, jsonify, Response
from flask_cors import CORS
from helpers.MySQLDatabaseHandler import MySQLDatabaseHandler
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import normalize

# ROOT_PATH for linking with all your files.
# Feel free to use a config.py or settings.py with a global export variable
os.environ['ROOT_PATH'] = os.path.abspath(os.path.join("..",os.curdir))
from helpers.MySQLDatabaseHandler import MySQLDatabaseHandler
# import sqlalchemy as db
# engine = db.create_engine("mysql+pymysql://admin:[email protected]/kardashiandb")
# cursor = engine.connect()

# Set ROOT_PATH for linking with all your files.
os.environ['ROOT_PATH'] = os.path.abspath(os.path.join("..", os.curdir))

# These are the DB credentials for your OWN MySQL
# Don't worry about the deployment credentials, those are fixed
# You can use a different DB name if you want to
# Database credentials (adjust if needed)
LOCAL_MYSQL_USER = "root"
LOCAL_MYSQL_USER_PASSWORD = "admin"
LOCAL_MYSQL_USER_PASSWORD = "bobbob"
LOCAL_MYSQL_PORT = 3306
LOCAL_MYSQL_DATABASE = "kardashiandb"

mysql_engine = MySQLDatabaseHandler(LOCAL_MYSQL_USER,LOCAL_MYSQL_USER_PASSWORD,LOCAL_MYSQL_PORT,LOCAL_MYSQL_DATABASE)

# Path to init.sql file. This file can be replaced with your own file for testing on localhost, but do NOT move the init.sql file
# Initialize database handler and load init.sql into the database
mysql_engine = MySQLDatabaseHandler(LOCAL_MYSQL_USER, LOCAL_MYSQL_USER_PASSWORD, LOCAL_MYSQL_PORT, LOCAL_MYSQL_DATABASE)
mysql_engine.load_file_into_db()

app = Flask(__name__)
CORS(app)

# Sample search, the LIKE operator in this case is hard-coded,
# but if you decide to use SQLAlchemy ORM framework,
# there's a much better and cleaner way to do this
def sql_search(episode):
    query_sql = f"""SELECT * FROM episodes WHERE LOWER( title ) LIKE '%%{episode.lower()}%%' limit 10"""
    keys = ["id","title","descr"]
    data = mysql_engine.query_selector(query_sql)
    return json.dumps([dict(zip(keys,i)) for i in data])
current_dir = os.path.dirname(os.path.abspath(__file__))
init_sql_path = os.path.join(current_dir, "..", "init.sql")

# Precomputed models will be stored here:
precomputed = {}

def precompute_field(field_texts, n_components=100):
    """
    Given a list of texts, create and return:
    - a TfidfVectorizer (configured with analyzer='char_wb', ngram_range=(3,5), stop_words='english')
    - a fitted TruncatedSVD model (with a proper n_components)
    - the SVD-reduced matrix for the field texts.
    Replace empty texts with a single space.
    """
    # Ensure no field is empty:
    texts = [text if text.strip() != "" else " " for text in field_texts]
    vectorizer = TfidfVectorizer(analyzer='char_wb', ngram_range=(3, 5), stop_words='english')
    tfidf_matrix = vectorizer.fit_transform(texts)
    # Ensure n_components is at least 1 and does not exceed available features - 1:
    n_comp = max(1, min(n_components, tfidf_matrix.shape[1] - 1))
    svd = TruncatedSVD(n_components=n_comp)
    reduced_matrix = svd.fit_transform(tfidf_matrix)
    return {"vectorizer": vectorizer, "svd": svd, "matrix": reduced_matrix}

def initialize_precomputed():
    """
    Loads all entries from the database and precomputes the TF-IDF + SVD representations
    for fields: names, fandoms, ships, reviews, abstracts.
    Also stores the raw lists for later reconstruction.
    """
    query = "SELECT Name, Fandom, Ships, Rating, Link, Review, Abstract FROM fics;"
    rows = list(mysql_engine.query_selector(query))
    # Extract raw fields:
    names = [r[0] for r in rows]
    fandoms = [r[1] for r in rows]
    ships = [r[2] for r in rows]
    ratings = [r[3] for r in rows]
    links = [r[4] for r in rows]
    reviews = [r[5] for r in rows]
    abstracts = [r[6] for r in rows]

    global precomputed
    precomputed['names'] = precompute_field(names)
    precomputed['fandoms'] = precompute_field(fandoms)
    precomputed['ships'] = precompute_field(ships)
    precomputed['reviews'] = precompute_field(reviews)
    precomputed['abstracts'] = precompute_field(abstracts)

    # Store raw versions to reconstruct final Entry objects:
    precomputed['names_raw'] = names
    precomputed['fandoms_raw'] = fandoms
    precomputed['ships_raw'] = ships
    precomputed['ratings'] = ratings
    precomputed['links'] = links
    precomputed['reviews_raw'] = reviews
    precomputed['abstracts_raw'] = abstracts

    ratings_raw = [r[3] for r in rows]

    def safe_num(x):
        try:
            return float(x)
        except (TypeError, ValueError):
            return 0

    precomputed['ratings'] = [safe_num(x) for x in ratings_raw]

# Precompute on startup
initialize_precomputed()

#function creates object in the format that we want printed out
class Entry:
    def __init__(self, name, ship, fandom, rating, abstract, link):
        self.name = name
        self.ship = ship
        self.fandom = fandom
        self.rating = rating
        self.abstract = abstract
        self.link = link
        if fandom == ('"Harry Potter"') or fandom == ('Harry Potter'):
            self.image = "/static/images/dumbly.jpg"
        elif fandom == ('"Kardashians"') or fandom == ('Kardashians'):
            self.image ="/static/images/KIM.jpg"
        elif fandom == ('"Merlin"'):
            self.image ="/static/images/Merlin.jpg"
        elif fandom == ('"One Direction"'):
            self.image = "/static/images/OneD.jpeg"
        elif fandom == ('"Hunger Games"') or fandom == ('Hunger Games'):
            self.image = "/static/images/HG.jpeg"
        elif fandom == ('"The Princess Diaries"') or fandom == ('Princess Diaries'):
            self.image = "/static/images/PD.jpg"
        #DC
        elif fandom == ('DC Superheroes'):
            self.image = "/static/images/DC.jpeg"
        #Sherlock
        elif fandom == ('Sherlock'):
            self.image = "/static/images/Sherlock.jpeg"
        #MCU
        elif fandom == ('Marvel'):
            self.image = "/static/images/MCU.jpeg"
        #Supernatural
        elif fandom == ('Supernatural'):
            self.image = "/static/images/Supernatural.jpeg"
        #My hero academia
        elif fandom == ('My Hero Academia'):
            self.image = "/static/images/MHA.jpeg"
        #Star Wars
        elif fandom == ('Star Wars'):
            self.image = "/static/images/SW.jpeg"
        #Doctor Who
        elif fandom == ('Doctor Who'):
            self.image = "/static/images/DW.jpeg"
        #Naruto
        elif fandom == ('Naruto'):
            self.image = "/static/images/Naruto.jpeg"
        #Star Trek
        elif fandom == ('Star Trek'):
            self.image = "/static/images/StarTrek.jpeg"
        #Teen Wolf
        elif fandom == ('Teen Wolf'):
            self.image = "/static/images/TeenWolf.jpeg"
        #How to train your dragon
        elif fandom == ('How to Train Your Dragon'):
            self.image = "/static/images/HTTYD.jpeg"
        #greek myths
        elif fandom == ('Greek Mythology'):
            self.image = "/static/images/GM.png"
        #Pirates of the carribean
        elif fandom == ('Pirates of the Caribbean'):
            self.image = "/static/images/POTC.jpeg"
        else:
            self.image = "/static/images/fandom.jpeg"

    def to_dict(self):
        return {
            "name": self.name,
            "ship": self.ship,
            "fandom": self.fandom,
            "rating": self.rating,
            "abstract": self.abstract,
            "link": self.link,
            "image": self.image
        }

    def __repr__(self):
        return f"Entry(Name: {self.name}, Ships: {self.ship}, Fandoms: {self.fandom}, Ratings: {self.rating}, Abstracts: {self.abstract}, Links: {self.link}, Image: {self.image})"

def compute_precomputed_similarity(precomputed_obj, query):
    """
    Given a precomputed dictionary (with vectorizer, svd, and matrix)
    and a query, transform the query and compute cosine similarities.
    Returns a 1D array of cosine similarities.
    """
    vectorizer = precomputed_obj["vectorizer"]
    svd = precomputed_obj["svd"]
    matrix = precomputed_obj["matrix"]
    query_tfidf = vectorizer.transform([query])
    query_reduced = svd.transform(query_tfidf)
    return cosine_similarity(query_reduced, matrix).flatten()

def SVD_vector_search(user_query, fandom_dropdown):
    """
    Compute combined similarity scores using precomputed TF-IDF + SVD representations.
    Applies field weights: Name (3.0), Fandom (2.0), Ship (1.5), Abstract (1.0), Review (1.0).
    Returns a list of Entry objects for the best matches.
    """
    cleaned_query = clean_text(user_query)

    # Compute similarities for each field using the precomputed objects:
    sim_names = compute_precomputed_similarity(precomputed['names'], cleaned_query)
    sim_fandoms = compute_precomputed_similarity(precomputed['fandoms'], cleaned_query)
    sim_ships = compute_precomputed_similarity(precomputed['ships'], cleaned_query)
    sim_abstracts = compute_precomputed_similarity(precomputed['abstracts'], cleaned_query)
    sim_reviews = compute_precomputed_similarity(precomputed['reviews'], cleaned_query)

    # Set weights for each field
    weight_names = 3.0
    weight_fandoms = 2.0
    weight_ships = 1.5
    weight_abstracts = 1.0
    weight_reviews = 1.0

    # Combine weighted similarities (elementwise sum)
    combined_similarities = (
        weight_names * sim_names +
        weight_fandoms * sim_fandoms +
        weight_ships * sim_ships +
        weight_abstracts * sim_abstracts +
        weight_reviews * sim_reviews
    )

    # Create dictionary mapping record index (starting at 1) to similarity score
    total_sim_dict = {i + 1: float(score) for i, score in enumerate(combined_similarities)}
    sorted_keys = sorted(total_sim_dict, key=total_sim_dict.get, reverse=True)


    # Optional: filter by threshold relative to average nonzero similarity
    nonzero = [score for score in total_sim_dict.values() if score != 0]
    avg = sum(nonzero)/len(nonzero) if nonzero else 0

    ourentries = []
    for idx in sorted_keys:
        if total_sim_dict[idx] > avg and len(ourentries) < 10:
            i = idx - 1


            fandom = precomputed['fandoms_raw'][i]

            if fandom == fandom_dropdown or fandom_dropdown == "":
                entry = Entry(
                    precomputed['names_raw'][i],
                    precomputed['ships_raw'][i],
                    precomputed['fandoms_raw'][i],
                    precomputed['ratings'][i],
                    precomputed['abstracts_raw'][i],
                    precomputed['links'][i]
                )
                ourentries.append(entry)
    return ourentries


@app.route("/")
def home():
    return render_template('base.html',title="sample html")
    return render_template('base.html', Name="sample html")


def clean_text(user_query):
    """Convert text to lowercase and remove punctuation."""
    return re.sub(r'[^\w\s]', '', user_query.lower())


@app.route("/fics")
def fics_search():
    user_query = request.args.get("Name")
    fandom_dropdown = request.args.get("Fandom_Dropdown")
    if not user_query:
        return ("Please input a query :)"), 400


    ourentries = SVD_vector_search(user_query, fandom_dropdown)
    for e in ourentries:
        e.rating = mysql_engine.get_rating(e.name)
    ourentries_dicts = [entry.to_dict() for entry in ourentries]

    return jsonify({
        "ourentries": ourentries_dicts,
    })

@app.route('/like', methods=['POST'])
def like():
    data = request.get_json()
    name = data.get('name')
    # increment in database, return new rating
    new_rating = mysql_engine.increment_rating(name)
    return jsonify({'rating': new_rating})

@app.route("/episodes")
def episodes_search():
    text = request.args.get("title")
    return sql_search(text)
@app.route('/dislike', methods=['POST'])
def dislike():
    data = request.get_json()
    name = data.get('name')
    new_rating = mysql_engine.decrement_rating(name)
    return jsonify({'rating': new_rating})

if 'DB_NAME' not in os.environ:
    app.run(debug=True,host="0.0.0.0",port=5000)
    app.run(debug=True, host="0.0.0.0", port=5000)
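
Review note on `SVD_vector_search`: the ranking step in this file multiplies each field's cosine similarities by a fixed weight, sums them per record, and keeps only records whose combined score beats the mean of the nonzero scores, capped at 10. The sketch below isolates that step in pure Python so the cutoff can be reasoned about without a database or scikit-learn. `rank_records`, `FIELD_WEIGHTS`, and the sample scores are hypothetical names and values introduced here for illustration; the weights are the ones listed in the docstring, and the fandom dropdown filter is left out.

```python
from typing import Dict, List

# Field weights as listed in the SVD_vector_search docstring.
FIELD_WEIGHTS: Dict[str, float] = {
    "names": 3.0,
    "fandoms": 2.0,
    "ships": 1.5,
    "abstracts": 1.0,
    "reviews": 1.0,
}


def rank_records(field_scores: Dict[str, List[float]], limit: int = 10) -> List[int]:
    """Return 0-based record indices, best first, that beat the mean nonzero score."""
    n_records = len(next(iter(field_scores.values())))
    # Weighted sum of the per-field similarities for every record.
    combined = [
        sum(FIELD_WEIGHTS[field] * scores[i] for field, scores in field_scores.items())
        for i in range(n_records)
    ]
    # Cutoff: mean of the nonzero combined scores (0 when nothing matched).
    nonzero = [s for s in combined if s != 0]
    avg = sum(nonzero) / len(nonzero) if nonzero else 0.0
    order = sorted(range(n_records), key=lambda i: combined[i], reverse=True)
    return [i for i in order if combined[i] > avg][:limit]


# Hypothetical similarity scores for three records (not taken from the PR's data).
example = {
    "names": [0.9, 0.0, 0.1],
    "fandoms": [0.2, 0.0, 0.1],
    "ships": [0.0, 0.0, 0.2],
    "abstracts": [0.1, 0.0, 0.3],
    "reviews": [0.0, 0.0, 0.1],
}
print(rank_records(example))  # [0]
```

With these sample scores the combined values are roughly 3.2, 0, and 1.2, so the cutoff is about 2.2 and only record 0 survives. A single strong match can therefore push every other partial match below the threshold, which may be worth keeping in mind when tuning the weights.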
27 changes: 27 additions & 0 deletions backend/clean_csv.py
@@ -0,0 +1,27 @@
import re
import csv

input_path = "DataSet.csv"
output_path = "DataSet_clean.csv"

# This regex handles quoted fields with commas and inner quotes
pattern = re.compile(r'"((?:[^"]|"")*?)"|([^,]+)')

def extract_fields(line):
    fields = []
    for match in pattern.finditer(line):
        value = match.group(1) or match.group(2)
        if value is not None:
            value = value.replace('""', '"').strip()
            fields.append(value)
    return fields[:7] # Return only first 7 fields

with open(input_path, "r", encoding="utf-8") as infile, \
     open(output_path, "w", newline='', encoding="utf-8") as outfile:

    writer = csv.writer(outfile, quoting=csv.QUOTE_ALL)
    for line in infile:
        if line.strip(): # skip empty lines
            fields = extract_fields(line)
            if len(fields) == 7:
                writer.writerow(fields)
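
Review note on `clean_csv.py`: the script parses each line with a hand-written regex, which skips empty fields (an empty quoted value falls through to `group(2)`, which is `None`) and so can shift later columns to the left. As a possible alternative, here is a minimal sketch that relies on the standard library's `csv.reader`, which already handles quoted commas and doubled quotes. `clean_rows`, `EXPECTED_FIELDS`, and the sample row are hypothetical and only illustrate the idea; unlike the regex version, this one keeps empty fields in place.

```python
import csv
import io
from typing import Iterable, List

EXPECTED_FIELDS = 7  # same column count that clean_csv.py keeps


def clean_rows(lines: Iterable[str]) -> List[List[str]]:
    """Parse CSV text and keep rows that have at least seven fields, trimmed to seven."""
    cleaned = []
    for row in csv.reader(lines):
        # Strip surrounding whitespace and drop any columns past the seventh.
        fields = [value.strip() for value in row][:EXPECTED_FIELDS]
        if len(fields) == EXPECTED_FIELDS:
            cleaned.append(fields)
    return cleaned


# Hypothetical input: one valid row, one blank line, one row that is too short.
sample = io.StringIO(
    '"A ""quoted"" title",Fandom,"Ship A, Ship B",5,http://example.com,Review,Abstract\n'
    "\n"
    "too,short\n"
)
rows = clean_rows(sample)
print(rows)
```

Passing the open file object to `csv.reader` instead of iterating line by line would also let quoted fields span multiple lines, which the line-based loop cannot do.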