
Commit d7a887d

Merge pull request #114 from dipamsen/crawler-update: Crawler Overhaul

Parents: 6e676f3 + 8af0a27

16 files changed: +630 −114 lines

.github/workflows/test_and_lint.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -6,7 +6,7 @@ on:
       - main
   pull_request:
     paths:
-      - 'backend/'
+      - 'backend/**'


 env:
```

GitHub Actions path filters are glob patterns matched against changed file paths, so `'backend/'` on its own effectively matches nothing; `'backend/**'` matches every file under the directory.

.gitignore

Lines changed: 2 additions & 2 deletions
```diff
@@ -6,8 +6,8 @@ iqps.db

 crawler/cache
 crawler/qp/*.pdf
-qp.csv
-qp.tar.xz
+qp.json
+qp.tar.gz

 go.work.sum
```

README.md

Lines changed: 10 additions & 2 deletions
```diff
@@ -35,6 +35,10 @@

 - [About The Project](#about-the-project)
 - [Development](#development)
+  - [Database](#database)
+  - [Authentication](#authentication)
+    - [OAuth Flow](#oauth-flow)
+  - [Crawler](#crawler)
 - [Deployment](#deployment)
   - [Backend](#backend)
     - [Environment Variables](#environment-variables)
```
```diff
@@ -67,7 +71,8 @@ IQPS was originally created by [Shubham Mishra](https://github.com/grapheo12) in
    - Set up the database (see [Database](#database))
    - Start the Rust backend by running `cargo run .`
 3. Set up the frontend by running `pnpm install` and then `pnpm start` in the `frontend/` directory.
-4. Profit.
+4. (Optional) Set up an HTTP file server for serving static files from the `STATIC_FILE_STORAGE_LOCATION` directory (e.g. `python3 -m http.server 8081`). The host of this server should be set in the `.env` file as `STATIC_FILES_URL`.
+5. Profit.

 ### Database

```

```diff
@@ -121,7 +126,10 @@ A user is considered as an admin if they are a part of the team `GH_ORG_TEAM_SLU

 ### Crawler

-[WIP: Steps to locally set up crawler]
+1. Change directory to `crawler/` and run `go mod tidy`.
+2. Run the crawler with `go run crawler.go`. (Make sure you are connected to the campus network.)
+3. This will generate a `qp.tar.gz` file. Transfer this file to the server's `backend/` folder.
+4. In the backend, run `cargo run --bin import-papers` to import the data into the database. (Make sure the database is set up and running.)

 ## Deployment

```
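The crawler's output also moves from `qp.csv` to `qp.json` (see the `.gitignore` diff above), so the import step in step 4 presumably parses a JSON manifest before writing to the database. As a rough sketch only: the `QuestionPaper` struct below and all of its field names are illustrative assumptions, not taken from this PR.

```rust
use serde::Deserialize;
use std::{fs::File, io::BufReader};

// Hypothetical shape of one crawler record; the actual field
// names in qp.json are not shown in this diff.
#[derive(Debug, Deserialize)]
struct QuestionPaper {
    course_code: String,
    course_name: String,
    year: u32,
    exam: String,
    filename: String,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse the JSON manifest produced by the crawler.
    let file = File::open("qp.json")?;
    let papers: Vec<QuestionPaper> = serde_json::from_reader(BufReader::new(file))?;
    println!("parsed {} paper records", papers.len());
    Ok(())
}
```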

backend/.gitignore

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,3 +1,3 @@
 # Added by cargo

-/target
+/target
```

(The removed and added lines are textually identical; this is most likely a newline-at-end-of-file change.)

backend/Cargo.lock

Lines changed: 129 additions & 16 deletions
Some generated files are not rendered by default.

backend/Cargo.toml

Lines changed: 4 additions & 0 deletions
```diff
@@ -2,6 +2,7 @@
 name = "iqps-backend"
 version = "0.1.0"
 edition = "2021"
+default-run = "iqps-backend"

 [dependencies]
 axum = { version = "0.7.7", features = ["multipart"] }
@@ -10,6 +11,7 @@ clap = { version = "4.5.20", features = ["derive", "env"] }
 color-eyre = "0.6.3"
 dotenvy = "0.15.7"
 duplicate = "2.0.0"
+flate2 = "1.0"
 hmac = "0.12.1"
 http = "1.1.0"
 jwt = "0.16.0"
@@ -18,6 +20,8 @@ serde = { version = "1.0.210", features = ["serde_derive"] }
 serde_json = "1.0.128"
 sha2 = "0.10.8"
 sqlx = { version = "0.8.2", features = ["postgres", "runtime-tokio", "chrono"] }
+tar = "0.4"
+tempfile = "3.17.1"
 tokio = { version = "1.40.0", features = ["full"] }
 tower-http = { version = "0.6.1", features = ["cors", "trace"] }
 tracing = "0.1.40"
```
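The new `flate2`, `tar`, and `tempfile` dependencies line up with the `qp.tar.gz` archive the crawler now produces. Below is a minimal sketch of how these crates are typically combined to unpack such an archive into a temporary directory; it is illustrative, not the actual `import-papers` implementation.

```rust
use flate2::read::GzDecoder;
use std::fs::File;
use tar::Archive;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Decompress the gzip stream and read the enclosed tar archive.
    let file = File::open("qp.tar.gz")?;
    let mut archive = Archive::new(GzDecoder::new(file));

    // Unpack into a temporary directory that is removed when `tmp` drops.
    let tmp = tempfile::tempdir()?;
    archive.unpack(tmp.path())?;
    println!("unpacked archive into {}", tmp.path().display());
    Ok(())
}
```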
