@@ -22,3 +22,41 @@ Elias Grünewald

## License
General Public License v3

---

# Research Agenda
## Problem
- users have certain rights, such as the right to transparency information, but are not able to comprehend them
- if data is transferred to multiple parties, the resulting data flow network is not visible to users
- there is no representation format for describing transparency information

## Research questions
- what should a transparency representation format look like?
- how can transparency information be extracted automatically?
- how can data flow networks be extracted?

## Sketched solution process
1. define transparency representation format
2. make use of existing corpora
   - **Privacy policies**, e.g. the OPP-115 Corpus
   - **Transparency information keywords or categories**: list of
     (sensitive) personal data terms such as name, birthday, bank
     account details, picture, IP address…
   - **Third parties**: list of top *N* companies, institutions
3. use NLP for semantics extraction (link each transparency keyword to
   a third party, e.g. by distance)
4. save *n*-tuples (incl. purpose, duration) to the previously defined
   representation format
5. visualize data flow networks

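Steps 3 to 5 could be prototyped roughly as follows. This is only a minimal sketch under stated assumptions: the vocabularies `DATA_TERMS` and `THIRD_PARTIES`, the `Flow` tuple, and the sample policy text are all hypothetical placeholders, and "distance" is taken to mean the character offset between two mentions. A real pipeline would likely use token or dependency distance from an NLP library instead.

```python
import re
from typing import NamedTuple

# Hypothetical, tiny vocabularies standing in for the corpora of step 2.
DATA_TERMS = ["name", "birthday", "ip address"]
THIRD_PARTIES = ["examplecorp", "acme analytics"]


class Flow(NamedTuple):
    data_term: str
    third_party: str
    distance: int  # character offset between the two mentions


def find_mentions(text: str, vocabulary: list[str]) -> list[tuple[str, int]]:
    """Return (term, start offset) for every whole-word, case-insensitive match."""
    lowered = text.lower()
    mentions = []
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            mentions.append((term, match.start()))
    return mentions


def link_by_distance(text: str) -> list[Flow]:
    """Link each data-term mention to the nearest third-party mention."""
    parties = find_mentions(text, THIRD_PARTIES)
    flows = []
    if not parties:
        return flows
    for term, term_pos in find_mentions(text, DATA_TERMS):
        party, party_pos = min(parties, key=lambda p: abs(p[1] - term_pos))
        flows.append(Flow(term, party, abs(party_pos - term_pos)))
    return flows


policy = (
    "We share your IP address with Acme Analytics. "
    "ExampleCorp receives your birthday for age verification."
)
flows = link_by_distance(policy)

# Input for step 5: adjacency of the data flow network
# (third party -> set of data terms it receives).
network: dict[str, set[str]] = {}
for flow in flows:
    network.setdefault(flow.third_party, set()).add(flow.data_term)
```

A pure distance heuristic is easy to fool, for example when a sentence names two third parties close together, so it is best read as a baseline to compare more elaborate extraction methods against.
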
## Implementation
- may extend the Polisis framework
- transparency representation is defined as a JSON example/schema
- make use of established machine learning frameworks or NLP services such as TensorFlow, PyTorch, Google Natural Language API, Amazon Comprehend
- common web technologies for visualization

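To make the JSON representation more concrete, one possible shape is sketched below. The record layout is an assumption for illustration, not a defined schema: `TransparencyRecord`, its field names, the `version` marker, and the sample values are all hypothetical. Only purpose and duration are taken from the *n*-tuple description above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TransparencyRecord:
    # Field names are illustrative assumptions, not an agreed format.
    data_category: str
    third_party: str
    purpose: Optional[str] = None
    duration: Optional[str] = None  # e.g. an ISO 8601 duration such as "P6M"


def to_json(records: list[TransparencyRecord]) -> str:
    """Serialize records into one JSON document with a version marker."""
    document = {"version": "0.1", "records": [asdict(r) for r in records]}
    return json.dumps(document, indent=2, sort_keys=True)


def from_json(payload: str) -> list[TransparencyRecord]:
    """Parse a document produced by to_json back into records."""
    document = json.loads(payload)
    return [TransparencyRecord(**item) for item in document["records"]]


records = [
    TransparencyRecord("ip address", "acme analytics",
                       purpose="analytics", duration="P6M"),
    TransparencyRecord("birthday", "examplecorp",
                       purpose="age verification"),
]
payload = to_json(records)
```

Keeping unknown values as explicit `null` entries, rather than dropping the keys, makes it visible where a policy does not state a purpose or duration.
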
## Evaluation
- test on unseen privacy policies and report precision, recall, and F1-score
- measure performance
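
The metrics could be computed by comparing extracted tuples against a hand-annotated gold standard, treating both as sets. The sketch below assumes exact-match comparison of (data term, third party) pairs; the function name and the sample data are hypothetical.

```python
def precision_recall_f1(predicted: set, expected: set) -> tuple[float, float, float]:
    """Compute set-based precision, recall, and F1 with exact matching."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold annotations for one unseen policy.
expected = {
    ("ip address", "acme analytics"),
    ("birthday", "examplecorp"),
    ("name", "examplecorp"),
}
# Hypothetical extractor output: one correct pair, one wrong pair.
predicted = {
    ("ip address", "acme analytics"),
    ("birthday", "acme analytics"),
}
precision, recall, f1 = precision_recall_f1(predicted, expected)
```

With one of two predicted pairs correct and one of three expected pairs found, this yields a precision of 0.5, a recall of about 0.33, and an F1-score of 0.4.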