Database creation

Riya Chhikara edited this page Jul 1, 2024 · 3 revisions

Steps to recreate the PostgreSQL Database

  1. Ensure that Docker is installed and running

  2. Stop and Remove the Existing Container

docker stop chatlse-postgres

docker rm chatlse-postgres

  3. Recreate the PostgreSQL Container and Database

docker run -itd --name chatlse-postgres --restart unless-stopped -p 5432:5432 -e POSTGRES_PASSWORD=chatlse -e POSTGRES_USER=chatlse -e POSTGRES_DB=chatlse pgvector/pgvector:0.7.1-pg16
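The container can take a few seconds before PostgreSQL accepts connections, so it may help to wait before moving on. The following is a minimal sketch of a retry loop; `wait_for` and the `WAIT_INTERVAL` variable are hypothetical names introduced here, not part of the project.

```shell
# wait_for is a hypothetical helper: retry a command until it succeeds
# or the attempt budget is used up.
# Usage: wait_for ATTEMPTS COMMAND [ARGS...]
wait_for() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Run the command quietly; return success as soon as it passes.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Pause between attempts (1 second unless WAIT_INTERVAL is set).
    sleep "${WAIT_INTERVAL:-1}"
  done
  return 1
}
```

One possible use, assuming the container name and credentials from the command above, is `wait_for 30 docker exec chatlse-postgres pg_isready -U chatlse -d chatlse`, which relies on the `pg_isready` utility that ships with PostgreSQL.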

  4. Ensure that a .env file exists in the directory. Its contents are the same as the .env.sample file:

POSTGRES_HOST = localhost

POSTGRES_USERNAME = chatlse

POSTGRES_PASSWORD = chatlse

POSTGRES_DATABASE = chatlse

POSTGRES_PORT = 5432

POSTGRES_SSL = disable
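A quick way to confirm that the .env file defines every key listed above is sketched below. `check_env` is a hypothetical helper written for this page; it only checks that each key is present, not that its value is correct.

```shell
# check_env is a hypothetical helper: report any expected key that is
# missing from the given .env file. Returns 0 only if all keys are present.
check_env() {
  env_file=$1
  missing=0
  for key in POSTGRES_HOST POSTGRES_USERNAME POSTGRES_PASSWORD \
             POSTGRES_DATABASE POSTGRES_PORT POSTGRES_SSL; do
    # Accept optional whitespace around "=", as in the sample values.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$env_file"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_env .env && echo "env OK"` prints `env OK` only when all six keys are defined.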

  5. Run the crawler

scrapy crawl lse_crawler

The crawler takes about 10 minutes to run. Expect error messages in the terminal for some URLs; these are mostly links to external websites that the crawler is forbidden from accessing. The final message in the terminal is "ItemToPostgresPipeline close spider", which indicates that crawling has finished.
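Because the closing message can be hard to spot among the error output, one option is to save the crawler output to a file and search it afterwards. The sketch below assumes the output was saved to a log file; `crawl_finished` and the file name `crawl.log` are hypothetical.

```shell
# crawl_finished is a hypothetical helper: succeed only if the given log
# file contains the pipeline's closing message.
crawl_finished() {
  grep -q "ItemToPostgresPipeline close spider" "$1"
}
```

A possible usage is to run `scrapy crawl lse_crawler 2>&1 | tee crawl.log` and then `crawl_finished crawl.log && echo "crawl complete"`.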

  6. Run Queries on the Database

** Adding more here
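To connect to the database for running queries, the .env values can be combined into a connection URI. This is a minimal sketch, not the project's own tooling: `env_value` and `pg_uri_from_env` are hypothetical helpers, and mapping POSTGRES_SSL to the libpq `sslmode` parameter is an assumption.

```shell
# env_value is a hypothetical helper: print the value of key $2 in file $1,
# trimming whitespace around "=" and at the end of the line.
env_value() {
  sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
    | sed 's/[[:space:]]*$//' \
    | head -n 1
}

# pg_uri_from_env is a hypothetical helper: build a libpq connection URI
# from the .env keys. Assumes POSTGRES_SSL holds a valid sslmode value.
pg_uri_from_env() {
  f=$1
  printf 'postgresql://%s:%s@%s:%s/%s?sslmode=%s\n' \
    "$(env_value "$f" POSTGRES_USERNAME)" \
    "$(env_value "$f" POSTGRES_PASSWORD)" \
    "$(env_value "$f" POSTGRES_HOST)" \
    "$(env_value "$f" POSTGRES_PORT)" \
    "$(env_value "$f" POSTGRES_DATABASE)" \
    "$(env_value "$f" POSTGRES_SSL)"
}
```

If `psql` is installed on the host, `psql "$(pg_uri_from_env .env)"` opens an interactive session. The sketch does not URL-encode values, so a password containing special characters would need encoding first.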