const int kGramSize = 3;
const int kDictSize = 750;
/// To add (train) a new language, follow these steps:
///
/// - Add the new language to [Language] (e.g. `ukr`) and implement all
///   switches with [UnimplementedError].
/// - Download the top 500 Wikipedia page views for the language's project from
///   https://pageviews.wmcloud.org/topviews/?project=it.wikipedia.org&platform=all-access&date=last-year&excludes=
///   (this example URL points at the Italian project).
/// - Save the file as `assets/topviews/[THREE_LETTER_NEW_LANG].json`
///   (e.g. deu, eng, ...).
/// - Download the articles by running
///   `dart bin/fetch.dart [THREE_LETTER_NEW_LANG] > assets/blob/[THREE_LETTER_NEW_LANG].json`.
/// - Then fetch the positives (you can, and should, add new positives to this
///   file whenever you find articles that need to be covered):
///   `dart bin/fetch_positives.dart [THREE_LETTER_NEW_LANG] > assets/positives/[THREE_LETTER_NEW_LANG].json`.
/// - Optionally, add new gibberish examples to `gibberish.json`.
/// - Run `bin/train.sh`.
/// - Implement the switches with the generated dict files (e.g. `ukrDictionary`).
/// - Verify that `dart test` passes.
///
/// If a test does not pass, you can modify the test class to cover the
/// expected alternative coverage. The goal is always 100% on the positives
/// and almost 0% on the negatives.
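///
/// The placeholder-switch step above can be sketched roughly as follows. This
/// is only an illustration: `displayName` is a hypothetical helper that does
/// not exist in this package, and the real switches may look different.
///
/// ```dart
/// // Hypothetical helper, shown only to illustrate the placeholder pattern.
/// String displayName(Language language) {
///   switch (language) {
///     case Language.ita:
///       return 'Italian';
///     case Language.ukr:
///       // Newly added value: keep a placeholder until training has
///       // produced the generated dictionary for it.
///       throw UnimplementedError('ukr is not trained yet');
///     default:
///       throw UnimplementedError('$language is not covered by this sketch');
///   }
/// }
/// ```
///
/// Once training has finished, the placeholder branch would be replaced with
/// the real implementation for the new language.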
enum Language {
  eng,
  deu,
  dut,
  fre,
  pol,
  esp,
  ukr,
  ita;
}