Main #45
Changes from all commits
646a454
f409dca
c479468
88364a1
ea9d484
d851493
6deb9da
03667a1
1d1e186
931ab8c
304e7be
f035f17
7df9772
a6c2dc6
e64207e
4cc0351
@@ -8,6 +8,18 @@
import os
import glob


def ensure_nltk_resources():
    try:
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        nltk.download("punkt")
        nltk.download("punkt_tab")

    try:
        nltk.data.find("corpora/stopwords")
    except LookupError:
        nltk.download("stopwords")


ensure_nltk_resources()
There is a mismatch between the NLTK data download location and the runtime NLTK_DATA environment variable. NLTK resources are downloaded to '/usr/local/share/nltk_data' during the build (lines 8-10), but the NLTK_DATA environment variable is set to '/tmp/nltk_data' at runtime (line 23). This will cause NLTK to be unable to find the downloaded resources, leading to runtime errors. Either download NLTK data to '/tmp/nltk_data' during build, or change the NLTK_DATA environment variable to '/usr/local/share/nltk_data'.
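One way to keep the download location and the lookup location from drifting apart is to derive both from the same value. The sketch below is only an illustration, not the code in this PR: `resolve_nltk_dir`, `ensure_resources`, `RESOURCES`, and `main` are hypothetical names, and the fallback directory is simply the build-time path quoted in the comment above. It assumes `nltk.download` accepts a `download_dir` keyword and that `nltk.data.find("tokenizers/punkt_tab")` is a valid lookup for the `punkt_tab` package. Each resource is checked on its own, so `punkt_tab` is fetched even when `punkt` is already present.

```python
import os


def resolve_nltk_dir(env=None, default="/usr/local/share/nltk_data"):
    """Return the directory named by NLTK_DATA, or a fallback default."""
    env = os.environ if env is None else env
    # NLTK_DATA may list several paths separated by os.pathsep; use the first.
    first = env.get("NLTK_DATA", "").split(os.pathsep)[0]
    return first or default


def ensure_resources(resources, find, download, data_dir):
    """Download each missing resource into data_dir; return the names fetched.

    `resources` maps a download name to the path used to look it up.
    `find` and `download` are passed in so the logic can be tested offline.
    """
    fetched = []
    for name, lookup_path in resources.items():
        try:
            find(lookup_path)
        except LookupError:
            download(name, download_dir=data_dir)
            fetched.append(name)
    return fetched


# Assumption: lookup paths for the three resources used in the diff.
RESOURCES = {
    "punkt": "tokenizers/punkt",
    "punkt_tab": "tokenizers/punkt_tab",
    "stopwords": "corpora/stopwords",
}


def main():
    """Wire the helpers to NLTK itself (requires nltk to be installed)."""
    import nltk

    data_dir = resolve_nltk_dir()
    # Search the same directory that downloads are written to.
    if data_dir not in nltk.data.path:
        nltk.data.path.insert(0, data_dir)
    ensure_resources(RESOURCES, nltk.data.find, nltk.download, data_dir)
```

Calling `main()` once at import time would stand in for the `ensure_nltk_resources()` call in the diff. Whether the directory should be `/tmp/nltk_data` or `/usr/local/share/nltk_data` remains a deployment decision; the sketch only guarantees that downloads and lookups use the same one.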