Adding transliterate feature #69
Pull Request Overview
This pull request adds a transliterate feature to the application, which converts text from one script to another (e.g., Cyrillic to Latin characters). The feature accepts a 2-letter ISO language code and can be combined with the --non-ascii option for further character conversion.
- Adds transliterate functionality with language code parameter
- Includes test coverage for Serbian transliteration
- Documents the new feature with usage examples
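The two-step behaviour described above (script conversion first, then an optional non-ASCII fold) can be sketched as follows. This is only an illustration under stated assumptions: the mapping table is a hypothetical, partial Serbian Cyrillic to Latin table, and the function names `transliterate_sr` and `strip_diacritics` are made up for this example. It is not the code added by this pull request.

```python
import unicodedata

# Hypothetical, partial Serbian Cyrillic -> Latin table (illustration only,
# not the table used by the application or by any library).
SR_CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l",
    "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "h",
    "ц": "c", "ч": "č", "ш": "š",
}


def transliterate_sr(text):
    """Replace each mapped Cyrillic letter with its Latin counterpart."""
    out = []
    for ch in text:
        latin = SR_CYRILLIC_TO_LATIN.get(ch.lower())
        if latin is None:
            out.append(ch)  # unmapped characters pass through unchanged
        elif ch.isupper():
            out.append(latin.capitalize())  # "Љ" -> "Lj"
        else:
            out.append(latin)
    return "".join(out)


def strip_diacritics(text):
    """Rough stand-in for a non-ASCII fold: drop combining marks after NFKD."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


print(transliterate_sr("Лорем ипсум"))              # Lorem ipsum
print(strip_diacritics(transliterate_sr("Чешаљ")))  # Cesalj
```

Running the fold after the script conversion mirrors the combination with --non-ascii mentioned in the overview: the first step yields "Češalj", and the second reduces it to plain ASCII.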
Reviewed Changes
Copilot reviewed 3 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_app.py | Adds test case for transliterate feature using Serbian test data |
| tests/conftest.py | Creates test input file with Cyrillic text for transliteration testing |
| docs/usage.rst | Documents the new transliterate feature with usage instructions and examples |
Comments suppressed due to low confidence (1)
tests/test_app.py:1017
- The test should verify that the output file was created successfully and handle potential file I/O errors. Consider adding error handling or using a context manager with proper cleanup.
with open('testdata/output55') as f:
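One way to act on this suggestion is a small helper that checks for the file before reading it. The sketch below is an assumption about how that could look; the helper name `read_output` is hypothetical and does not appear in the test suite.

```python
import os
import tempfile


def read_output(path):
    """Return the contents of an output file, failing clearly if it is missing."""
    # Fail with an explicit message if the application never wrote the file.
    assert os.path.isfile(path), f"expected output file missing: {path}"
    # The context manager closes the file even if reading raises.
    with open(path, encoding="utf-8") as f:
        return f.read()


# Usage with a temporary directory, so nothing is left behind after the check.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "output")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("Lorem ipsum\n")
    print(read_output(sample).strip())  # Lorem ipsum
```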
Co-authored-by: Copilot <[email protected]>
Wineh
left a comment
A few small points, otherwise approved!
| becomes u, ç becomes c. | ||
| --trim Enables removing newlines representations from end and beginning. Newline | ||
| representations detected are '\\n', '\\r', '\n', '\r', '<br>', and '<br />'. | ||
| --transliterate <language> Transliterate a strings, for example "ipsum" becomes "իպսում". Language is iso |
Same here: the -s on "strings".
| --trim Enables removing newlines representations from end and beginning. Newline | ||
| representations detected are '\\n', '\\r', '\n', '\r', '<br>', and '<br />'. | ||
| --transliterate <language> Transliterate a strings, for example "ipsum" becomes "իպսում". Language is iso | ||
| 2 letter code. Examples: ru, sr, ua |
Perhaps add the full output of
>>> transliterate.get_available_language_codes()
['ka', 'sr', 'l1', 'ru', 'mn', 'uk', 'mk', 'el', 'hy', 'bg']
here?
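A language code could also be checked against that list before any transliteration is attempted. The sketch below hard-codes the codes exactly as quoted in this comment. The function name `check_language_code` is hypothetical, and the real list may differ between library versions. Note that "ua", one of the examples in the help text under review, is not in the quoted list, while "uk" is.

```python
# Codes copied from the output quoted in the review comment (assumption:
# a real run may return a different list depending on the library version).
AVAILABLE_LANGUAGE_CODES = ['ka', 'sr', 'l1', 'ru', 'mn', 'uk', 'mk', 'el', 'hy', 'bg']


def check_language_code(code):
    """Return the normalized code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in AVAILABLE_LANGUAGE_CODES:
        raise ValueError(
            f"unsupported language code {code!r}; "
            f"expected one of: {', '.join(sorted(AVAILABLE_LANGUAGE_CODES))}"
        )
    return normalized


print(check_language_code("SR"))  # sr
try:
    check_language_code("ua")
except ValueError as err:
    print(err)
```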