Meeting notes - August 28th, 2023 #16

Chris-ECE · 2023-08-29T14:03:32Z


Participants: Rogayah Alsanabrah, Inkyung Choi, Christopher Jones, Nadia Mignolli, Helda Mitre, Antti Santaharju, Giorgia Simeoni, Florian Vucko

1. Reviewing Inkyung’s proposed text for 6.1 (Explaining ML-based results)
Returning to the discussion of 20th June on 6.2 and 6.3 (explaining ML-based results), where it was suggested to add a sentence to either 6.3 or 6.1 about explaining methodology (e.g. of ML), InKyung proposed adding the explanation to 6.1, as an additional component for people to think about (whereas for 6.3 this might be more of an internal discussion).

Action: InKyung offered to add some text to 6.1, regarding methodology, additional explanation/communication, and that the output can be microdata.

2. Reviewing Chris’s proposed text for 6.2 and 6.3
Proposed 6.2 description:

It was agreed to remove the reference to automated checking.
Removed the term “subjectivity”.
Removed the first example, “Checking that the population coverage and response rates are as required”, as it seemed that this would be done in an earlier phase.
Removed the line stating that 6.2 and 6.3 can be simultaneous.

Proposed 6.3 description:

Removed the analysis examples (e.g. mirror data)
Removed the line about the most believable interpretation.

Reviewing InKyung’s proposed changes to 6.4
In the meeting of 11th July, there was a point about disclosure control measures being applied to internally-stored data, and not just disseminated data. For this reason, InKyung suggested removing reference to finalised outputs. She also added some text for 6.4 about extra confidentiality challenges arising from geospatial data.

Data Ethics
Statistics Israel is currently translating a document about data ethics into English, which InKyung can put onto GitHub.

This closes phase 6.

Process phase
5.1 (Integrate Data) and pseudo-anonymisation
It was noted that different countries have different practices. Some countries might use the same pseudo-identifier across all statistical domains, while others might not. The creation of pseudo-identifiers might happen in the collection phase, so it is debatable as to which phase it belongs in. In Finland, pseudo-anonymisation is considered part of data processing.
In the context of admin data, pseudo-anonymisation lends itself more to the data integration subprocess, although Giorgia suggested it might be a separate subprocess.
InKyung suggested adding pseudo-anonymisation under Collect and also under 5.1. InKyung can add it for the collection phase (i.e. when data is captured, pseudo-anonymisation is performed to protect the identities of people within the collected data), and additionally add some text to 5.1 to indicate that pseudo-anonymisation can happen there, without specifying the exact method used to do it (it could be integration of survey and admin data, or of admin data sources that possess different pseudo-identifiers, etc.).
Action: InKyung to propose some text for the Collect phase (maybe for subprocess 4.4).
Action: Antti to consult with colleagues, and possibly propose some text for 5.1 (or 4.4) on pseudo-anonymisation in the context of admin data sources.
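
For illustration only, the difference noted above between using the same pseudo-identifier across all statistical domains and using domain-specific ones can be sketched as follows. This is a minimal sketch, not a method discussed in the meeting: it assumes keyed hashing (HMAC-SHA256) as the pseudo-anonymisation technique, and the function name, keys, domain names and sample identifier are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a keyed pseudo-identifier (HMAC-SHA256 hex digest) for a raw identifier."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical keys: one shared across all domains, and one per statistical domain.
SHARED_KEY = b"example-shared-key"
DOMAIN_KEYS = {
    "health": b"example-health-key",
    "labour": b"example-labour-key",
}

raw_id = "1234567890"  # hypothetical personal identifier

# Practice A: the same key everywhere, so one person gets the same
# pseudo-identifier in every domain and records can be linked directly.
shared_health = pseudonymise(raw_id, SHARED_KEY)
shared_labour = pseudonymise(raw_id, SHARED_KEY)

# Practice B: a different key per domain, so pseudo-identifiers differ between
# domains and linking them needs an extra step during data integration.
domain_health = pseudonymise(raw_id, DOMAIN_KEYS["health"])
domain_labour = pseudonymise(raw_id, DOMAIN_KEYS["labour"])

print("Shared key, identifiers match across domains:", shared_health == shared_labour)
print("Domain keys, identifiers match across domains:", domain_health == domain_labour)
```

Under these assumptions, the second practice is the case where integration in 5.1 would have to reconcile sources that possess different pseudo-identifiers.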

Next meeting
18th at 3pm CEST
