14 changes: 14 additions & 0 deletions backend/spellchecker/README.md
@@ -0,0 +1,14 @@
### huntojson.sh
Convert hunspell output to JSON for spellchecker.

Launch hunspell on input file, convert output to spellchecker JSON format and write to standard output.
Usage: `./huntojson.sh [options...] [file]`
Options:
`-h` display this help and exit
`-d dict` use custom dictionaries

Example:
```bash
./huntojson.sh demo.tex # launch hunspell on 'demo.tex'
./huntojson.sh -d ru_RU ru_demo.tex # use russian dictionary
```
70 changes: 70 additions & 0 deletions backend/spellchecker/huntojson.sh
@@ -0,0 +1,70 @@
#!/bin/bash

# usage info
show_help() {
cat << EOF
Usage: ${0##*/} [options...] [file]
Launch hunspell on file, convert output to JSON format
and write to standard output.

-h display this help and exit
-d dict use custom dictionaries

Example: ${0##*/} demo.tex # launch hunspell on 'demo.tex'
${0##*/} -d ru_RU ru_demo.tex # use russian dictionary
EOF
}

DICT=
OPTIND=1

# command line arguments processing
while getopts ":hd:" opt; do
case "$opt" in
h)
show_help
exit 0
;;
d)
DICT="-p $OPTARG"
;;
?)
echo -e "Invalid option: -$OPTARG\n" >&2
show_help >&2
exit 1
;;
esac
done
shift $((OPTIND-1))

if [ "$#" != "1" ]
then
echo -e "Missing input file\n" >&2
show_help >&2
exit 2
fi

INFILE=$1

JSON=$(
cat $INFILE |
Contributor

Quote `$INFILE`, otherwise you're going to have trouble with spaces in the path.
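
To illustrate the quoting point, here is a minimal sketch. The helper `print_file` and the temporary file are hypothetical and not part of the PR; they only show that a double-quoted expansion keeps a path containing a space as a single argument.

```bash
#!/bin/bash
# Hypothetical helper (not in the PR): print a file whose path may contain spaces.
# Quoting "$1" passes the whole path to cat as one word.
print_file() {
    cat -- "$1"
}

# Demo with a temporary file whose name contains a space.
tmpdir=$(mktemp -d)
INFILE="$tmpdir/my demo.tex"
printf 'teh quick brown fox\n' > "$INFILE"

quoted_output=$(print_file "$INFILE")   # one argument, works
printf '%s\n' "$quoted_output"
# cat $INFILE   # unquoted: split at the space into two nonexistent paths

rm -rf "$tmpdir"
```

In the script itself the same effect should be achievable with `cat "$INFILE"`, or by redirecting the file into the first command of the pipeline with `< "$INFILE"`.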

hunspell -a -t |
grep "^&.*" |
Contributor

I think it would be safer to use single quotes here, to avoid shell expansion.
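
A small sketch of the single-quoted filter, run on made-up lines in the style of `hunspell -a` output (the sample words and offsets are invented for illustration, and `filter_misspelled` is a hypothetical name). In this particular pattern the double quotes happen to be harmless, since neither `&` nor `*` is expanded inside them; single quotes simply rule out any `$`, backtick, or backslash processing. The trailing `.*` is also redundant, so `'^&'` selects the same lines.

```bash
#!/bin/bash
# Made-up sample in the hunspell -a (ispell pipe) style:
#   "*"  correct word, "&" misspelled with suggestions, "#" misspelled without.
sample_output='*
& teh 3 0: the, tea, ten
# xyzzy 4
& wrold 2 8: world, wold'

# Hypothetical helper: keep only lines that start with "&".
# Single quotes pass the pattern to grep untouched by the shell.
filter_misspelled() {
    grep '^&'
}

misspelled=$(printf '%s\n' "$sample_output" | filter_misspelled)
printf '%s\n' "$misspelled"
```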

awk $DICT '
BEGIN { print "{" }
{
print "\t\""$2"\": [" ;
Contributor @dbarashev, Aug 11, 2016

I think you can remove these pretty-printing efforts as well. Those who want pretty-printed output will use `python -m json.tool` or `jq '.'`.
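
As a rough sketch of that workflow: the script could emit compact JSON and leave formatting to the caller. The sample object below is invented, and the example assumes `python3` is the available interpreter name (the comment above says `python`); it falls back to the compact text when Python is missing so the snippet still runs.

```bash
#!/bin/bash
# Invented compact output, standing in for what the script might print.
compact='{"teh":["the","tea"]}'

if command -v python3 >/dev/null 2>&1; then
    # json.tool re-indents valid JSON and fails on invalid input.
    pretty=$(printf '%s\n' "$compact" | python3 -m json.tool)
else
    pretty=$compact   # fallback: no pretty-printer available
fi
printf '%s\n' "$pretty"
```

With `jq` installed, `printf '%s\n' "$compact" | jq '.'` should serve the same purpose.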


split($0, split_string, ": ");
options_number=split(split_string[2], options, ", ");

for (i = 1; i <= options_number; i++)
{print "\t\t\""options[i]"\","}
Contributor @dbarashev, Aug 11, 2016

A comma after the last element of the array produces invalid JSON, at least from jq's perspective. The same applies to the comma after the last entry.
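
One possible way to avoid both trailing commas is to print the separator before every item except the first, rather than after each item. The sketch below is an assumption-laden rewrite of the awk stage only, not the author's final code: the function name `hunspell_to_json` is hypothetical, the input lines are made up, the output is compact (no tabs or newlines inside the object), and words are not escaped, so a suggestion containing `"` or `\` would still break the JSON.

```bash
#!/bin/bash
# Hypothetical replacement for the awk stage: reads "& word count offset: a, b, c"
# lines on stdin and writes one compact JSON object without trailing commas.
hunspell_to_json() {
    awk '
    BEGIN { printf "{" }
    {
        # Separate the "& word count offset" header from the suggestion list.
        split($0, parts, ": ")
        n = split(parts[2], options, ", ")
        # The comma goes before every entry except the first one.
        printf "%s\"%s\":[", (NR > 1 ? "," : ""), $2
        for (i = 1; i <= n; i++)
            printf "%s\"%s\"", (i > 1 ? "," : ""), options[i]
        printf "]"
    }
    END { print "}" }
    '
}

# Made-up input in the hunspell -a style, already filtered to "&" lines.
json=$(printf '%s\n' '& teh 3 0: the, tea, ten' '& wrold 2 8: world, wold' | hunspell_to_json)
printf '%s\n' "$json"
```

For the two sample lines this prints `{"teh":["the","tea","ten"],"wrold":["world","wold"]}`, and an empty input yields `{}`.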


print "\t]," }
END { print "}" }
'
)

echo $JSON