
Commit 39d54cf

first prototype
1 parent 835b5ce commit 39d54cf


4 files changed: +11 -2 lines changed


app/questions_generator/Dockerfile

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@ RUN pip install --no-cache-dir --upgrade pip \
 # 4. NLTK
 RUN python -m nltk.downloader punkt stopwords

+RUN python -m nltk.downloader punkt
+
 # 5. Copy local model
 COPY rut5-base/ /app/rut5-base/


app/questions_generator/README.md

Lines changed: 1 addition & 1 deletion
@@ -7,6 +7,6 @@
 - `hf download cointegrated/rut5-base-multitask --local-dir rut5-base`
 ## Selecting the VKR (thesis) file
 - replace the path to the VKR file in the `main` function of `run.py`
-## Running
+## Running (after any changes)
 - `docker build -t vkr-generator .`
 - `docker run -it --rm vkr-generator`

app/questions_generator/run.py

Lines changed: 7 additions & 0 deletions
@@ -3,6 +3,7 @@
 import sys
 import os
 from docx import Document
+import nltk


 def load_vkr_text(path: str) -> str:
@@ -19,6 +20,12 @@ def load_vkr_text(path: str) -> str:


 def main():
+    try:
+        nltk.data.find('tokenizers/punkt_tab/english')
+    except LookupError:
+        print("Loading required NLTK data...")
+        nltk.download('punkt_tab')
+
     print("=== Loading VKR text ===")
     text = load_vkr_text("vkr_examples/VKR1.docx")


app/questions_generator/validator.py

Lines changed: 1 addition & 1 deletion
@@ -17,8 +17,8 @@ def __init__(self, vkr_text: str):
             vkr_text: Full text of the VKR (thesis)
         """
         self.vkr_text = vkr_text.lower()
-        self.keywords = self._extract_keywords()
         self.stopwords = set(stopwords.words('russian'))
+        self.keywords = self._extract_keywords()

     def _extract_keywords(self) -> Dict[str, Set[str]]:
         """
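The reordering above matters if `_extract_keywords` reads `self.stopwords`. The diff does not show that method's body, so this is an assumption. Under that assumption, calling it before the attribute is assigned would raise `AttributeError`. The sketch below reproduces the situation with hypothetical stand-in classes (`KeywordIndex` and `BrokenKeywordIndex` are illustrative names, not the project's validator):

```python
from typing import Set


class KeywordIndex:
    """Stand-in class: assigns stopwords before extracting keywords."""

    def __init__(self, text: str, stopwords: Set[str]):
        self.text = text.lower()
        # Must be assigned first, because _extract_keywords reads it.
        self.stopwords = set(stopwords)
        self.keywords = self._extract_keywords()

    def _extract_keywords(self) -> Set[str]:
        # Assumed behavior: keep every word that is not a stopword.
        return {w for w in self.text.split() if w not in self.stopwords}


class BrokenKeywordIndex(KeywordIndex):
    """Same class with the pre-commit ordering, for comparison."""

    def __init__(self, text: str, stopwords: Set[str]):
        self.text = text.lower()
        # self.stopwords does not exist yet, so this call fails.
        self.keywords = self._extract_keywords()
        self.stopwords = set(stopwords)
```

Constructing `BrokenKeywordIndex` raises `AttributeError`, while `KeywordIndex` with the same arguments succeeds.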
