vignettes/review_design_notes.Rmd
Lines changed: 24 additions & 21 deletions
@@ -16,7 +16,7 @@ knitr::opts_chunk$set(
16
16
17
17
## User Requirements
18
18
19
-
These requirements are based on initial brainstorming conversations. Some of them are direct requests and some of them are educated guesses from the feature developer. All of them are subject to change until confirmed by the interested parties. The list is not exhaustive:
19
+
These requirements are based on initial brainstorming conversations and a few rounds of user feedback. The list is not exhaustive:
20
20
21
21
- User can self-select a reviewer role and, under that capacity, annotate each row of any given dataset with a value chosen from those available on a dropdown menu.
22
22
- Several users can interact under _strictly non-overlapping_ roles with the application, each annotating individual rows of _possibly overlapping_ datasets.
@@ -25,10 +25,7 @@ These requirements are based on initial brainstorming conversations. Some of the
25
25
- A subset of the columns (which we call `tracked` and which does not overlap the `identifier` columns) is considered necessary and sufficient for review purposes.
26
26
- Updates to the provided datasets are expected during the course of a study.
27
27
- Changes to contents of `tracked` columns of a previously reviewed dataset row will be highlighted in the user interface and require re-confirmation.
28
-
29
-
#### Open questions
30
-
- Do all datasets share the same decision dropdown choices?
31
-
- Should we guard against or track "disappearing" rows (those whose `identifier` values vanish during a dataset update)?
28
+
- All datasets share the same decision dropdown choices.
32
29
33
30
## API
34
31
This feature can be implemented by adding an extra parameter to `mod_listings`. The names of fields and subfields are all temporary placeholders:
@@ -56,34 +53,38 @@ A possible simplification would be to make `"USUBJID"` optional on `id_vars`, si
56
53
57
54
**Beware**: Once the application is configured and run once, the only change permitted to the `datasets` subfield will be to *add* extra datasets. Changes to previously configured `id_vars` or `tracked_vars` sub-subfields could potentially render the collected review information inconsistent. The module should disallow the editing controls until such a situation is addressed. Review choices and roles do not suffer from that problem.
58
55
59
-
#### Open questions
60
-
- Do we need to keep track of row numbers? They don't have an assigned column name, so this draft API would be insufficient to specify that they should/should not be tracked.
56
+
## Data Stability Requirements
57
+
The module only has access to the latest version of any given dataset. In order to inform users about modified and newly added records, it relies on stored summary hashes of previously seen data. Thus, it is necessary that some aspects of the representation of data are kept constant over the life of a study. Currently, these are:
58
+
59
+
- Values assigned to the sub-parameters `id_vars` and `tracked_vars` are set once and remain the same for the duration of the study.
60
+
- Variables identified by `id_vars` and `tracked_vars` retain their types (factor, numeric, ...) and are available on each revision of each dataset.
61
+
- All rows of each provided dataset are identified uniquely by the combination of `id_vars` configured at the beginning of the study.
62
+
- No data rows are dropped during the study. In other words, if a combination of `id_vars` is present on revision `n` of a dataset, it will be available on revision `n+1`.
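The requirements above could be verified whenever a new revision of a dataset arrives. The following is a minimal sketch, not the module's implementation: `check_stability()` is a hypothetical helper, and the `AESEQ` and `AETERM` columns are assumed only for illustration.

```r
# Hypothetical helper: compares two consecutive revisions of a dataset
# against the stability requirements and returns the violations found.
check_stability <- function(prev, curr, id_vars, tracked_vars) {
  problems <- character(0)
  vars <- c(id_vars, tracked_vars)

  # Variables must be available on each revision
  missing_vars <- setdiff(vars, names(curr))
  if (length(missing_vars) > 0) {
    return(paste("missing variables:", paste(missing_vars, collapse = ", ")))
  }

  # Variables must retain their types (factor, numeric, ...)
  for (v in vars) {
    if (!identical(class(prev[[v]]), class(curr[[v]]))) {
      problems <- c(problems, paste("type changed:", v))
    }
  }

  # The combination of id_vars must identify each row uniquely
  if (anyDuplicated(curr[id_vars]) > 0) {
    problems <- c(problems, "id_vars do not identify rows uniquely")
  }

  # No row present on the previous revision may be dropped
  key <- function(df) do.call(paste, c(unname(as.list(df[id_vars])), sep = "\r"))
  if (!all(key(prev) %in% key(curr))) {
    problems <- c(problems, "rows dropped since previous revision")
  }

  problems
}

# Illustrative data: the second revision drops subject "02"
rev1 <- data.frame(
  USUBJID = c("01", "02"), AESEQ = c(1L, 1L), AETERM = c("Headache", "Nausea"),
  stringsAsFactors = FALSE
)
rev2 <- data.frame(
  USUBJID = "01", AESEQ = 1L, AETERM = "Migraine",
  stringsAsFactors = FALSE
)
problems <- check_stability(rev1, rev2, c("USUBJID", "AESEQ"), "AETERM")
```

An empty result means the revision satisfies the checks. A changed `AETERM` value alone is not a violation, since modified tracked values are expected and handled by the review workflow.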
61
63
62
64
## User Interface
63
-
Basic features (sufficient for initial user feedback):
65
+
Basic features:
64
66
65
67
- Isolated drop-down to choose reviewer role. Blank every time the application starts. Not bookmarked. Only when a non-empty role is selected can the user review data.
66
-
- A listing set up for review will have *at least* two extra columns:
67
-
- Latest decision
68
-
- Row status: unreviewed data, reviewed data, data modified after review.
69
-
Sorting/Filtering by "row status" should allow to conduct reviews of incremental changes to the underlying dataset.
68
+
- A listing set up for review will have three extra columns:
69
+
- Latest review decision
70
+
- Latest reviewer role
71
+
- Row status: unreviewed data, reviewed data, data modified after review, conflict across reviewers.
70
72
71
-
Future features (not requested, so not planned for this development phase):
73
+
Sorting/Filtering by "row status" should allow reviewers to review incremental changes to the underlying dataset.
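As a rough illustration of how the "row status" column might be derived, the sketch below compares the hashes of the current data with the hashes stored at review time. The function name, the column names (`id_hash`, `tracked_hash`, `role`, `decision`) and the precedence between statuses are all assumptions made for this example, not part of the design.

```r
# Hypothetical sketch: derive one status per row of the current dataset.
# current: data.frame with columns id_hash and tracked_hash (one row per dataset row)
# reviews: data.frame with columns id_hash, tracked_hash, role, decision,
#          assumed to be stored in chronological order
row_status <- function(current, reviews) {
  vapply(seq_len(nrow(current)), function(i) {
    r <- reviews[reviews$id_hash == current$id_hash[i], , drop = FALSE]
    if (nrow(r) == 0) return("unreviewed")

    # Keep only the latest review recorded by each role
    latest <- r[!duplicated(r$role, fromLast = TRUE), , drop = FALSE]

    # A review refers to tracked values that have changed since
    if (any(latest$tracked_hash != current$tracked_hash[i])) {
      return("modified after review")
    }
    # Roles disagree on the decision for unchanged data
    if (length(unique(latest$decision)) > 1) {
      return("conflict across reviewers")
    }
    "reviewed"
  }, character(1))
}

# Illustrative data: hashes are shortened to single characters
current <- data.frame(
  id_hash = c("a", "b", "c", "d"),
  tracked_hash = c("1", "2", "3", "4"),
  stringsAsFactors = FALSE
)
reviews <- data.frame(
  id_hash = c("b", "c", "d", "d"),
  tracked_hash = c("2", "9", "4", "4"),
  role = c("role_1", "role_1", "role_1", "role_2"),
  decision = c("OK", "OK", "OK", "Not OK"),
  stringsAsFactors = FALSE
)
status <- row_status(current, reviews)
```

Filtering the listing on `status != "reviewed"` would then leave only the rows that still need attention after a dataset update.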
72
74
73
-
- Hover-on decision info detail: date and reviewer role.
74
-
- Warn against simultaneous conflicting editing.
75
+
Future features (not requested, so not planned for this development phase):
75
76
- User upload/download of review information. For manual backup purposes. Stored data consists mostly of hashes, so plaintext download should be OK. However, if necessary we could encrypt it using a symmetric key configured as an app secret and provided as an extra parameter to the module.
- The module allows to tweak column visibility. Is it OK to allow review actions performed while some `tracked_vars` are not visible?
84
83
85
-
86
84
## Server storage
85
+
86
+
_None of the proposals of this section are in scope for the first version of the review functionality. Only the alternative "Client storage" explained in the next section is implemented_.
87
+
87
88
Currently, the two available forms of storage on Connect are:
88
89
89
90
- Pins
@@ -108,7 +109,7 @@ The optional `review_store_path` parameter allows to point to an arbitrary folde
108
109
- Will client-controlled mount points become available at some point on Connect?
109
110
110
111
## Client Storage
111
-
An alternative approach to review data storage is to use Google's [File System Access API](https://wicg.github.io/file-system-access/) that is currently available in Chrome-derived browsers. To use it, reviewers would have to point the app to a folder shared by the team.
112
+
An alternative approach to review data storage is to use Google's [File System Access API](https://wicg.github.io/file-system-access/) that is currently available in Chrome-derived browsers. To use it, reviewers have to point the app to a folder shared by the team at the beginning of each session.
112
113
113
114
## Data structures
114
115
We will use a small collection of files for each input dataset configured for review. If we take an imaginary "ae" domain, we would store the following files:
@@ -121,7 +122,9 @@ There will use a small collection of files for each input dataset configured for
121
122
- 1 complete hash of "ae" data.frame
122
123
- 1 domain string ("ae")
123
124
- n `id_vars` column names
125
+
- **MISSING**: n `id_vars` column types
124
126
- m `tracked_vars` column names
127
+
- **MISSING**: m `tracked_vars` column types
125
128
- 1 row count
126
129
- p (1 per "ae" row) `hash_id(ae[id_vars])`
127
130
- p (1 per "ae" row, *m* bytes long) `hash_tracked(ae[tracked_vars])`
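The fields listed above can be sketched as an in-memory record. This is only an illustration under stated assumptions: `toy_hash()` is a simple stand-in rather than a real hash function, the bodies of `hash_id()` and `hash_tracked()` are guesses at what those names could compute, and `build_review_record()` and the `AESEQ`, `AETERM` and `AESEV` columns are hypothetical.

```r
# Stand-in polynomial hash over the characters of the pasted values.
# NOT a real hash function; used only to keep the sketch dependency-free.
toy_hash <- function(values, modulus) {
  codes <- utf8ToInt(paste(values, collapse = "\x1f"))
  h <- 0
  for (code in codes) h <- (h * 31 + code) %% modulus
  h
}

# Values of row i as a character vector, one element per column
row_values <- function(df, i) {
  vapply(df, function(col) as.character(col[i]), character(1))
}

# One hash per row, computed over all identifier columns together
hash_id <- function(df) {
  vapply(seq_len(nrow(df)), function(i) {
    toy_hash(row_values(df, i), 2^31 - 1)
  }, numeric(1))
}

# One raw vector per row, holding one byte per tracked column
hash_tracked <- function(df) {
  lapply(seq_len(nrow(df)), function(i) {
    as.raw(vapply(df, function(col) toy_hash(as.character(col[i]), 256), numeric(1)))
  })
}

# Hypothetical constructor gathering the fields listed above
build_review_record <- function(df, domain, id_vars, tracked_vars) {
  list(
    data_hash = toy_hash(unlist(lapply(df, as.character)), 2^31 - 1),
    domain = domain,
    id_vars = id_vars,
    tracked_vars = tracked_vars,
    row_count = nrow(df),
    id_hashes = hash_id(df[id_vars]),
    tracked_hashes = hash_tracked(df[tracked_vars])
  )
}

# Illustrative "ae" data
ae <- data.frame(
  USUBJID = c("01", "02"), AESEQ = c(1L, 1L),
  AETERM = c("Headache", "Nausea"), AESEV = c("MILD", "SEVERE"),
  stringsAsFactors = FALSE
)
record <- build_review_record(ae, "ae", c("USUBJID", "AESEQ"), c("AETERM", "AESEV"))
```

Storing one byte per tracked column, rather than a single hash per row, would presumably let the module point at the specific column whose value changed, at the cost of occasional one-byte collisions.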