| 1 | | -# parsent |
| 1 | +# parsent |
| 2 | +A Python module for analyzing parts of a sentence enclosed by specific symbols. |
| 3 | + |
| 4 | +## Features |
| 5 | +* Separation of parts enclosed by specific symbols from those that are not. |
| 6 | +* Hierarchical classification of parts enclosed by specific symbols within another enclosed part. |
| 7 | +* Ability to set multiple types of delimiters in a single process. |
| 8 | +* Grouping of only specific parts when classification of lower hierarchies is not needed. |
| 9 | + |
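As a rough illustration of the first three features, the sketch below splits a sentence into `[string, level]` pairs using a stack of pending closing delimiters. It is not parsent's implementation: `split_by_delimiters` is a hypothetical name, escaping is ignored, delimiters are assumed to be non-empty strings, and each delimiter is kept with the surrounding (outer) part.

```python
def split_by_delimiters(text, delimiters):
    """Split text into [string, level] pairs (illustrative sketch only)."""
    parts = []   # accumulated [string, level] pairs
    stack = []   # closing delimiters we are still waiting for
    buf = ""     # characters collected for the current part
    i = 0
    while i < len(text):
        # The closing delimiter of the innermost open region is checked first,
        # so symmetric delimiters such as '"' close before they can reopen.
        if stack and text.startswith(stack[-1], i):
            close = stack.pop()
            parts.append([buf, len(stack) + 1])  # the enclosed part
            buf = close                          # delimiter goes to the outer part
            i += len(close)
            continue
        for open_, close in delimiters:
            if text.startswith(open_, i):
                parts.append([buf + open_, len(stack)])  # outer part keeps the opener
                stack.append(close)
                buf = ""
                i += len(open_)
                break
        else:
            # No delimiter starts here: ordinary character.
            buf += text[i]
            i += 1
    parts.append([buf, len(stack)])
    return parts


print(split_by_delimiters('This is a "test".', [('"', '"')]))
# [['This is a "', 0], ['test', 1], ['".', 0]]
```

Passing several `(open, close)` pairs, for example `[('"', '"'), ('[', ']')]`, lets the same loop track nested regions, with the level equal to the number of regions currently open.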
| 10 | +## How to use |
| 11 | +1. Importing the parsent module |
| 12 | +Note: For the import shown below to work, the importing script and the parsent module must be in the same directory. |
| 13 | + ``` |
| 14 | + import parsent |
| 15 | + ``` |
| 16 | +2. Sentence Analysis |
| 17 | +Analyze the sentence and obtain a SentenceStructureInformation object. |
| 18 | + ``` |
| 19 | + result = parsent.analyze_sentence('This is a "test".', ('"', '"')) |
| 20 | + ``` |
| 21 | + Note: Although not shown here, the `analyze_sentence` function also accepts a `delimiter_handling_mode` argument, which controls whether delimiters are attached to the enclosed part itself or not attached to any part. |
| 22 | + The `consider_escaping` argument controls whether escaping is taken into account. |
| 23 | + To recover the original sentence from the SentenceStructureInformation object, convert it to a string: |
| 24 | + ``` |
| 25 | + text = str(result) |
| 26 | + ``` |
| 27 | +3. Result Verification |
| 28 | + ``` |
| 29 | + print(result.get()) |
| 30 | + ``` |
| 31 | + Note: To omit empty strings from the output, pass `omit_empty_strings=True`: |
| 32 | + ``` |
| 33 | + print(result.get(omit_empty_strings=True)) |
| 34 | + ``` |
| 35 | +4. Shrinking Analysis Results (optional) |
| 36 | + The code below drops the classification of every hierarchy level below the first. |
| 37 | + ``` |
| 38 | + text1 = 'The book says, "[Man] is only one of many creatures on Earth."' |
| 39 | + result1 = parsent.analyze_sentence(text1, [('"', '"'), ('[', ']')]) |
| 40 | + print(result1.get()) |
| 41 | + result2 = result1.shrink_analysis_results(bottom_hierarchy=1) |
| 42 | + print(result2.get()) |
| 43 | + ``` |
| 44 | +5. Utilizing the Analyzed Results |
| 45 | +The analyzed results can be used in many ways; the example below prints only the quoted strings. |
| 46 | + ``` |
| 47 | + for part in result.get(): |
| 48 | +     if part[1] == 1: |
| 49 | +         print(part[0]) |
| 50 | + ``` |
| 51 | + |
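To make the idea of shrinking more concrete, here is a minimal sketch that works on plain `[string, level]` lists rather than on a SentenceStructureInformation object. It assumes that shrinking caps every level at `bottom_hierarchy` and merges neighbouring parts that end up on the same level; `shrink_levels` is a hypothetical helper, and the sample data is illustrative, not actual parsent output.

```python
def shrink_levels(parts, bottom_hierarchy):
    """Cap levels at bottom_hierarchy and merge adjacent parts (assumed behaviour)."""
    merged = []
    for text, level in parts:
        level = min(level, bottom_hierarchy)
        if merged and merged[-1][1] == level:
            # Same level as the previous part: extend it instead of adding a new one.
            merged[-1][0] += text
        else:
            merged.append([text, level])
    return merged


# Hand-written sample data with three levels (0, 1 and 2).
sample = [
    ['The book says, "', 0],
    ['[', 1],
    ['Man', 2],
    ['] is only one of many creatures on Earth.', 1],
    ['"', 0],
]
for part in shrink_levels(sample, bottom_hierarchy=1):
    print(part)
```

The input list is left unchanged, because every merged entry is a newly created list.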
| 52 | +## Handy Utility Functions |
| 53 | +The `get` method of a SentenceStructureInformation object typically returns data such as `[['This is a "', 0], ['test', 1], ['".', 0]]`, that is, a list of the form `[[(string), (hierarchy number)], [(string), (hierarchy number)], ... ]`. When you need this data in a different format, or only need part of it, the library provides functions for manipulating it. See the source code for the available functions and how to use them. |
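For a sense of the kind of manipulation meant here, the following sketch works directly on the list shown above. The helper names `strings_at_level` and `join_parts` are hypothetical and are not taken from parsent; check the source code for the functions the library actually provides.

```python
# Data in the [[string, hierarchy number], ...] form described above.
parts = [['This is a "', 0], ['test', 1], ['".', 0]]


def strings_at_level(parts, level):
    """Return only the strings whose hierarchy number equals level."""
    return [text for text, lvl in parts if lvl == level]


def join_parts(parts):
    """Concatenate all strings, ignoring the hierarchy numbers."""
    return ''.join(text for text, _ in parts)


print(strings_at_level(parts, 1))  # ['test']
print(join_parts(parts))           # This is a "test".
```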
| 54 | + |
| 55 | +## About the Project Name |
| 56 | +The name "parsent" is a portmanteau of "parse sentences", which describes what the program does, and "%" (percent). |
| 57 | + |
| 58 | +## Licence |
| 59 | +Please refer to the [license file](LICENSE). |