Telemetry message wording is "inaccurate" #10262

Open
@arekbal

Description

Steps to reproduce

Just run dotnet in a console.
A notification appears about the use of telemetry and how to disable it with an environment variable. The message states that telemetry data is shared with the community.
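
For reference, the opt-out mentioned in that notification can be sketched as below. The environment variable name DOTNET_CLI_TELEMETRY_OPTOUT is the one documented for the .NET CLI; the helper name with_telemetry_opt_out is hypothetical and used only for illustration.

```python
import os


def with_telemetry_opt_out(env=None):
    """Return a copy of an environment mapping with .NET CLI telemetry disabled.

    with_telemetry_opt_out is a hypothetical helper name. The variable
    DOTNET_CLI_TELEMETRY_OPTOUT is the documented .NET CLI opt-out switch.
    """
    new_env = dict(os.environ if env is None else env)
    new_env["DOTNET_CLI_TELEMETRY_OPTOUT"] = "1"
    return new_env


# Possible usage (requires the dotnet CLI to be installed):
#   import subprocess
#   subprocess.run(["dotnet", "--info"], env=with_telemetry_opt_out())
```

Passing the environment per process avoids changing the global shell configuration, but the variable has to be set before the first invocation to have any effect on that run.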

Actual behavior

The wording of the message is inaccurate and misleading, as only "some" of the data "used to be" shared with the community.
According to https://docs.microsoft.com/en-us/dotnet/core/tools/telemetry
only 5 out of 13(?) data points "used to be" shared.
Also, the data blobs are not raw data: they are already preprocessed and are missing a lot of useful information that is gathered by telemetry but not exposed to the public, such as time of invocation.
By "used to be" I mean that the last blob currently available comes from late 2017. The latest data (the last 5 quarters as of now) is not publicly available, even in this tiny aggregated form.

Suggested change

I don't expect more than a change to the wording, but that at the very least. Either remove the sentence about sharing data with the community, or change the wording to reflect that only some portion of the gathered data was shared with the community.

EU GDPR Sidenote

According to this page
https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-personal-data_en
the gathered data, especially MAC addresses (even hashed ones, obviously), counts as personal information, which should be handled scrupulously.
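
To illustrate why a hashed MAC address is still an identifier, here is a minimal sketch. It assumes plain SHA-256 over a normalized address string; the exact normalization used by the telemetry code is not known here, and the function names and the sample address are hypothetical.

```python
import hashlib


def hash_mac(mac: str) -> str:
    """Hash a MAC address with SHA-256 (assumed scheme, for illustration only)."""
    # Assumption: separators are stripped and hex digits upper-cased before hashing.
    normalized = mac.replace(":", "").replace("-", "").upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def matches(candidate_mac: str, observed_hash: str) -> bool:
    """Check whether a known MAC address produced an observed hash."""
    return hash_mac(candidate_mac) == observed_hash


# The same machine always yields the same value, so the hash works as a
# stable pseudonym that links records together over time.
observed = hash_mac("00:1A:2B:3C:4D:5E")  # made-up example address

# Anyone who knows or guesses a MAC address can confirm whether it is in the data.
print(matches("00-1a-2b-3c-4d-5e", observed))
```

Because the hash is deterministic and unsalted in this sketch, it hides the raw address but does not stop correlation or confirmation of a known address.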

ML.NET telemetry (added a day later)

The mlnet CLI tool uses this messaging: "The data is anonymous and doesn't include personal information or data from your datasets." which is also inaccurate and somewhat misleading. Describing this dataset as anonymous and free of personal information is false. Also, I wasn't aware that the mlnet CLI was going to use telemetry, so I disabled it only after I had run it for the first time with the quickstart example. The basic mlnet command doesn't tell you about telemetry. That is a bit concerning.

Opinion

IMHO some variant of 99% anonymised (as in nearly impossible to cross-check or correlate) telemetry makes sense for usage metrics... The data used by dotnet and mlnet is not that anonymous. Publicly sharing only a portion of the data invites different conclusions about how the data is intended to be used.
