SJEC_session1_4SO21AI062_Vijeth_Fernandes #78

Open
wants to merge 1 commit into
base: main

@21Vijeth 21Vijeth commented Jun 22, 2024

Changed docker-compose.yml and the Dockerfile.

Added a Python file containing the code that extracts data from blog.python.org.

The scraper keeps extracting data from blog.python.org until there are no posts left.
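
The "scrape until there are no posts left" behaviour can be sketched roughly as below. This is a minimal illustration and not the code in this PR: `scrape_all` and `fetch_page` are hypothetical names, and it assumes each fetched page yields a list of posts plus an optional URL for the next (older) page.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Post = Dict[str, str]
# A page fetcher returns the posts found on a page and the next page URL (or None).
PageResult = Tuple[List[Post], Optional[str]]


def scrape_all(
    start_url: str,
    fetch_page: Callable[[str], PageResult],  # hypothetical fetch-and-parse helper
    max_pages: int = 1000,
) -> Iterator[Post]:
    """Yield posts page by page until a page has no posts or no next link."""
    url: Optional[str] = start_url
    visited = set()
    while url is not None and url not in visited and len(visited) < max_pages:
        visited.add(url)
        posts, next_url = fetch_page(url)
        if not posts:
            break  # an empty page means there is nothing left to scrape
        yield from posts
        url = next_url
```

Passing the fetcher in as a parameter keeps the stop condition separate from the HTTP and HTML parsing details, and the `visited` set and `max_pages` cap guard against a pager link that loops back on itself.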

Scraper.py and PostgreSQL each run in a container; the output is stored in the PostgreSQL container.

The data is stored in the columns (id, Date, Title, Author, Content).
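
For reviewers, here is a hedged sketch of how rows could be written into such a table. It is not taken from Scraper.py: the schema, the column types, and the `insert_posts` helper are assumptions based on the column names in this description and on the `date` column used in the query further down. The default `%s` placeholder assumes a psycopg2-style driver.

```python
from typing import Any, Iterable, Mapping

# Assumed schema, based only on the column names mentioned in this description.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS blog_posts (
    id SERIAL PRIMARY KEY,
    date TEXT,
    title TEXT NOT NULL,
    author TEXT,
    content TEXT
)
"""


def insert_posts(
    cursor: Any,
    posts: Iterable[Mapping[str, str]],
    placeholder: str = "%s",  # psycopg2-style marker; other drivers may differ
) -> int:
    """Insert posts through a DB-API cursor using a parameterized query."""
    marks = ", ".join([placeholder] * 4)
    sql = f"INSERT INTO blog_posts (date, title, author, content) VALUES ({marks})"
    rows = [
        (p.get("date"), p["title"], p.get("author"), p.get("content"))
        for p in posts
    ]
    cursor.executemany(sql, rows)
    return len(rows)
```

Using placeholders instead of string formatting lets the driver escape the scraped text, which matters here because post content can contain quotes and other special characters.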

docker-compose build (to build the image)
docker-compose up (to start the containers)

docker exec -it postgres_container psql -U postgres (to access PostgreSQL in the container)
\c blogdata;
SELECT * FROM blog_posts; (the data is stored correctly, but this output is hard to read on the command line because the long text in the "content" column breaks the table layout)

SELECT id, date, title, author
FROM blog_posts; (shows the data stored in these columns in a readable form)
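
If the content column also needs a quick visual check, one option is to shorten it before printing. The helper below is only a sketch and is not part of this PR: `preview_rows` is a hypothetical name, and it assumes rows arrive as (id, title, author, content) tuples, for example from a cursor's `fetchall()`.

```python
import textwrap
from typing import Iterable, List, Sequence


def preview_rows(rows: Iterable[Sequence], content_width: int = 60) -> List[str]:
    """Format (id, title, author, content) rows with the content shortened."""
    lines = []
    for post_id, title, author, content in rows:
        # shorten() collapses whitespace and cuts the text at a word boundary
        snippet = textwrap.shorten(content or "", width=content_width, placeholder="...")
        lines.append(f"{post_id} | {title} | {author} | {snippet}")
    return lines
```

Each returned string fits on one terminal line, so long posts no longer break the layout the way `SELECT *` does.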

@21Vijeth 21Vijeth changed the title from "Add Webscraper" to "SJEC_session1_4SO21AI062_Vijeth_Fernandes" on Jun 23, 2024