
Commit 525dd19

hlomzik, robot-ci-heartex, and makseq authored
docs: LEAP-1657: Add a note to Text tag about \r\n (#6645)
Co-authored-by: robot-ci-heartex <[email protected]> Co-authored-by: Max Tkachenko <[email protected]>
1 parent 92c85c5 commit 525dd19


2 files changed: +15 -0 lines changed


docs/source/tags/text.md (+7)
@@ -12,6 +12,13 @@ Every space in the text sample is counted when calculating result offsets, for e
 
 Use with the following data types: text.
 
+### How to read my text files in python?
+The Label Studio editor counts `\r\n` as two different symbols, displaying them as `\n\n`, making it look like there is extra margin between lines.
+You should either preprocess your files to replace `\r\n` with `\n` completely, or open files in Python with `newline=''` to avoid converting `\r\n` to `\n`:
+`with open('my-file.txt', encoding='utf-8', newline='') as f: text = f.read()`
+This is especially important when you are doing span NER labeling and need to get the correct offsets:
+`text[start_offset:end_offset]`
+
 ### Parameters
 
 | Param | Type | Default | Description |
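
As a rough illustration of the note added in this diff (not part of the commit itself), the Python sketch below shows why `newline=''` matters for span offsets. The sample sentence, the temporary file, and the helper names `read_preserving_newlines` and `read_with_translation` are made up for this example, and it assumes the offsets were computed against text that still contains `\r\n`.

```python
import os
import tempfile


def read_preserving_newlines(path):
    # newline='' disables newline translation, so CR LF stays two characters.
    with open(path, encoding="utf-8", newline="") as f:
        return f.read()


def read_with_translation(path):
    # Default text mode: CR LF is translated to a single LF on read.
    with open(path, encoding="utf-8") as f:
        return f.read()


# Hypothetical sample with Windows line endings, written as bytes so that
# nothing is translated on the way to disk.
sample = "Alice met Bob\r\nin Paris".encode("utf-8")
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(sample)
    path = tmp.name

raw = read_preserving_newlines(path)
translated = read_with_translation(path)
os.remove(path)

# Offsets computed against the untranslated text (assumption: this is the
# text the annotation offsets refer to).
start_offset = raw.index("Paris")
end_offset = start_offset + len("Paris")

print(repr(raw[start_offset:end_offset]))         # slice on matching text
print(repr(translated[start_offset:end_offset]))  # slice on translated text
```

With the untranslated text the slice returns the labeled span. With the translated text the same offsets are shifted by one character for every `\r\n` that precedes the span.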

web/libs/editor/src/tags/object/Text.js (+8)
@@ -6,6 +6,14 @@
  * Every space in the text sample is counted when calculating result offsets, for example for NER labeling tasks.
  *
  * Use with the following data types: text.
+ *
+ * ### How to read my text files in python?
+ * The Label Studio editor counts `\r\n` as two different symbols, displaying them as `\n\n`, making it look like there is extra margin between lines.
+ * You should either preprocess your files to replace `\r\n` with `\n` completely, or open files in Python with `newline=''` to avoid converting `\r\n` to `\n`:
+ * `with open('my-file.txt', encoding='utf-8', newline='') as f: text = f.read()`
+ * This is especially important when you are doing span NER labeling and need to get the correct offsets:
+ * `text[start_offset:end_offset]`
+ *
  * @example
  * <!--Labeling configuration to label text for NER tasks with a word-level granularity -->
  * <View>
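
The other option the note mentions, preprocessing files so that `\r\n` becomes `\n`, could look roughly like the Python sketch below. The helper name `normalize_newlines` and the sample string are hypothetical, and the sketch assumes the normalized text is what gets imported for labeling, so that offsets are computed against the same characters.

```python
def normalize_newlines(text):
    # Replace every CR LF pair with a single LF; lone CR characters are left as-is.
    return text.replace("\r\n", "\n")


# Hypothetical sample with Windows line endings.
raw = "Alice met Bob\r\nin Paris\r\n"
clean = normalize_newlines(raw)

# Offsets must be computed against the normalized text, since each removed
# CR shifts every later character one position to the left.
start_offset = clean.index("Paris")
end_offset = start_offset + len("Paris")
print(repr(clean[start_offset:end_offset]))
```

Whichever option is used, the key point is consistency: the text that is sliced with `text[start_offset:end_offset]` should be character-for-character the same text the offsets were produced from.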
