Database creation
- Ensure that Docker is installed and running
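One possible way to check this from a terminal is sketched below. The helper name `docker_ready` is hypothetical and not part of this repository; it relies only on the standard `docker info` command, which fails when the daemon cannot be reached.

```shell
# Hypothetical helper (not part of this repository).
# Returns 0 when the docker CLI is available and the daemon answers, 1 otherwise.
docker_ready() {
    command -v docker >/dev/null 2>&1 || return 1
    docker info >/dev/null 2>&1 || return 1
    return 0
}
```

For example, `docker_ready && echo "Docker is up" || echo "Docker is not available"` prints which of the two cases applies.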
- Stop and Remove the Existing Container
docker stop chatlse-postgres
docker rm chatlse-postgres
- Recreate the PostgreSQL Container and Database
docker run -itd --name chatlse-postgres --restart unless-stopped -p 5432:5432 -e POSTGRES_PASSWORD=chatlse -e POSTGRES_USER=chatlse -e POSTGRES_DB=chatlse pgvector/pgvector:0.7.1-pg16
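The container may need a few seconds before PostgreSQL accepts connections. A minimal sketch of a wait loop follows, assuming the container name, user, and database from the command above. The helper name `wait_for_postgres` is hypothetical; `pg_isready` is the readiness utility that ships with PostgreSQL and is assumed to be present in the image.

```shell
# Hypothetical helper (not part of this repository).
# Polls pg_isready inside the container until the server accepts connections.
# $1: maximum number of attempts (default 30), one second apart.
wait_for_postgres() {
    attempts="${1:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if docker exec chatlse-postgres pg_isready -U chatlse -d chatlse >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

Calling `wait_for_postgres 60` would wait for up to roughly a minute. A zero exit status only indicates that the server is accepting connections, not that any tables exist yet.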
- Ensure a .env file exists in the directory. It has the same contents as the .env.sample file:
POSTGRES_HOST = localhost
POSTGRES_USERNAME = chatlse
POSTGRES_PASSWORD = chatlse
POSTGRES_DATABASE = chatlse
POSTGRES_PORT = 5432
POSTGRES_SSL = disable
- Run the crawler
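Before starting the crawl, it can help to confirm that the .env file defines the six POSTGRES_ variables listed in the .env step. The sketch below is one way to do that; the helper name `check_env_file` is hypothetical, and it only checks that each key is present, not that its value is correct.

```shell
# Hypothetical helper (not part of this repository).
# Reports any of the six POSTGRES_ variables that the given file does not define.
# $1: path to the env file (default ./.env). Returns 0 if all keys are present.
check_env_file() {
    env_file="${1:-.env}"
    missing=0
    [ -f "$env_file" ] || { echo "not found: $env_file" >&2; return 1; }
    for key in POSTGRES_HOST POSTGRES_USERNAME POSTGRES_PASSWORD \
               POSTGRES_DATABASE POSTGRES_PORT POSTGRES_SSL; do
        # Accept both "KEY=value" and "KEY = value".
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$env_file"; then
            echo "missing: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_env_file .env` in the directory that holds the file prints one `missing:` line per absent key and returns a non-zero status if any are absent.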
scrapy crawl lse_crawler
The crawler takes about 10 minutes to run. Expect error messages in the terminal for some URLs; these are mostly links to external websites that the crawler is forbidden from accessing. The final message in the terminal will be "ItemToPostgresPipeline close spider", which means that the crawl has finished.
- Run Queries on the Database
** Adding more here
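One way to start exploring the data is to run `psql` inside the container, assuming the container name, user, and database used in the `docker run` step. The helper name `run_sql` is hypothetical. Because the table names created by the crawler are not listed here, the sketch sticks to psql's built-in listing commands rather than guessing at a schema.

```shell
# Hypothetical helper (not part of this repository).
# Runs one psql meta-command or SQL statement inside the chatlse-postgres container.
# $1: the command to run, e.g. '\dt' or a SELECT statement.
run_sql() {
    docker exec -i chatlse-postgres psql -U chatlse -d chatlse -c "$1"
}
```

For example, `run_sql '\dt'` lists the tables in the database and `run_sql '\dx'` lists the installed extensions. Once the table names are known, a statement such as a `SELECT count(*)` against one of them can be passed the same way.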