
Support Chinese Characters (non-ASCII) by UTF-8 #40

Open: wants to merge 2 commits into base branch master


@ElevenyChen commented Dec 16, 2023

1. Chinese Character Support: The script now correctly handles Chinese characters in CSV and text file outputs.
2. Progress Tracking: It prints the scraping progress, showing the current work ID and count, enhancing user experience during long scraping sessions.
3. Richer Text File Output: Output text files are now reformatted to include detailed metadata such as title, author, rating, relationship, language, status, and chapters, offering more context for each work.
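
For readers who want to see what UTF-8-safe CSV output can look like, here is a minimal sketch. It is not the code from this pull request; the function names `save_csv` and `load_csv` and the sample columns are hypothetical. The key point is passing `encoding="utf-8"` explicitly when opening the file, so the result does not depend on the platform's default encoding.

```python
import csv


def save_csv(path, rows):
    """Write rows to a CSV file as UTF-8 (hypothetical helper, not from the PR)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)


def load_csv(path):
    """Read the rows back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8", newline="") as f:
        return [row for row in csv.reader(f)]
```

Calling `save_csv("works.csv", [["work_id", "title"], ["1", "中文标题"]])` should store the title as Chinese characters rather than escaped or transliterated text. If the file is meant to be opened in a spreadsheet program that needs a byte order mark to detect UTF-8, `encoding="utf-8-sig"` is a possible alternative.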

1. Support Chinese characters in CSV and output .txt files.
2. Print progress while scraping.
3. Reformat the .txt files to include more information about each work.
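
As a rough illustration of the progress message and the richer text file layout described above, the sketch below formats a metadata header and a progress line. It assumes the metadata is available as a plain dictionary; the names `FIELDS`, `format_work_text`, and `progress_line` are hypothetical and the actual output format in this pull request may differ.

```python
# Metadata fields listed in the PR description; the order here is an assumption.
FIELDS = ["title", "author", "rating", "relationship", "language", "status", "chapters"]


def format_work_text(meta, body):
    """Build the text file content: one 'Field: value' line per field, then the body."""
    header = [f"{field.capitalize()}: {meta.get(field, 'Unknown')}" for field in FIELDS]
    return "\n".join(header) + "\n\n" + body + "\n"


def progress_line(work_id, index, total):
    """Return a one-line progress message for the work currently being scraped."""
    return f"[{index}/{total}] Scraping work {work_id}"


def save_work_text(path, meta, body):
    """Write the formatted work as UTF-8 so non-ASCII text is preserved."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(format_work_text(meta, body))
```

In a scraping loop, `print(progress_line(work_id, i, len(work_ids)))` before each request would give the kind of running count the description mentions.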
@caizhuoyue77

When I try to scrape Chinese articles, I can't get the Chinese characters, only the Pinyin?

@ElevenyChen closed this Jun 5, 2024
@ElevenyChen reopened this Jun 5, 2024
@ElevenyChen
Author

When I try to scrape Chinese articles, I can't get the Chinese characters, only the Pinyin?

Hi, I've successfully scraped Chinese characters with my version of the code. See an example. Since the owner of the main branch hasn't responded to my request, you can use the version from my repository directly.
