Classification feedback #3

@willemarcel

Hi @jremillard! Congratulations again on this great work. I'll use this ticket to post some feedback.

Imports

These changesets didn't create anything but were classified as Import:
https://osmcha.mapbox.com/changesets/44070150
https://osmcha.mapbox.com/changesets/44067113

https://osmcha.mapbox.com/changesets/44054299 - it has a considerable number of deletions and modifications, so it shouldn't be considered an import.

I use some logic in OSMCha to detect imports, mass modifications and mass deletions. Take a look at:
https://github.com/willemarcel/osmcha/blob/master/osmcha/changeset.py#L377

I consider a changeset more suspect when the numbers of elements it creates, modifies and deletes are not balanced.
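
The balance idea can be sketched roughly as follows. This is only an illustration and not the code in the linked `changeset.py`: the `ChangesetCounts` type, the `classify_mass_edit` function and both thresholds (`share`, `min_actions`) are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ChangesetCounts:
    """Number of elements created, modified and deleted in one changeset."""
    create: int
    modify: int
    delete: int


def classify_mass_edit(counts, share=0.97, min_actions=200):
    """Return 'import', 'mass modification', 'mass deletion' or None.

    Both thresholds are illustrative assumptions, not OSMCha's values.
    """
    total = counts.create + counts.modify + counts.delete
    if total == 0:
        return None
    candidates = (
        ("import", counts.create),
        ("mass modification", counts.modify),
        ("mass deletion", counts.delete),
    )
    for label, n in candidates:
        # One action type must be large in absolute terms and dominate the changeset.
        if n >= min_actions and n / total >= share:
            return label
    return None
```

With a rule like this, a changeset that creates nothing can never be labelled an import, and one that mixes many creations with many deletions and modifications gets no mass-edit label at all.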

Spam

Some changesets with long comments and description tags are classified as spam, but are good edits:
https://osmcha.mapbox.com/changesets/43890576
https://osmcha.mapbox.com/changesets/44725218
https://osmcha.mapbox.com/changesets/43711394
https://osmcha.mapbox.com/changesets/44758435
https://osmcha.mapbox.com/changesets/43953212

Tagging error

This classification seems to work very well. Are you able to identify which features are tagged wrongly? It can be difficult for users to find which features have inappropriate tags in a changeset with a large number of features. In OSMCha we can also attach a suspicion reason to individual features, so that we can list the flagged features of a changeset.

Maybe you need to consider the name:xx variants. For example, this changeset has no wrong tags, although it has a name tag with a value that is not 100% correct: https://osmcha.mapbox.com/changesets/44026170

Another point that I find useful is to flag an edit only if it adds a wrong tag. For example, https://osmcha.mapbox.com/changesets/44386947 modifies some features that have tagging errors, but the wrong tags were added in another edit. It would also be great to take into consideration the number of wrongly tagged features in a changeset.
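
To make the "only if it adds a wrong tag" idea concrete, here is a minimal sketch. It assumes the previous and new tag dictionaries of each feature are available, and that some predicate `is_bad(key, value)` already decides whether a tag is wrong; both function names and the predicate are hypothetical, not part of any existing classifier.

```python
def newly_added_bad_tags(old_tags, new_tags, is_bad):
    """Return the bad tags that this edit introduced or changed.

    Tags that were already present with the same value are ignored,
    even if `is_bad` rejects them, because an earlier edit added them.
    """
    return {
        key: value
        for key, value in new_tags.items()
        if old_tags.get(key) != value and is_bad(key, value)
    }


def count_newly_mistagged(features, is_bad):
    """Count features for which this edit added at least one bad tag.

    `features` is an iterable of (old_tags, new_tags) pairs; a newly
    created feature has an empty `old_tags` dict.
    """
    return sum(
        1 for old, new in features if newly_added_bad_tags(old, new, is_bad)
    )
```

The count returned by `count_newly_mistagged` could then be compared with a threshold, or reported alongside the flagged features so reviewers know where to look.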

Some tips

We have more problems with changesets from new users, so it could be more useful to apply the classification only to new users' edits. For example, you could apply it only to edits by users with fewer than 500 edits and less than 3 months of experience. We try to reduce the number of false positives, and our great challenge is to save reviewers' time by filtering suspect changesets as well as possible.
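
A filter along those lines might look like the sketch below. The function name `is_new_user` and its parameters are hypothetical, and "3 months" is approximated as 90 days; where the edit count and the date of the first edit come from is left open.

```python
from datetime import datetime, timedelta, timezone


def is_new_user(edit_count, first_edit, now=None,
                max_edits=500, max_age=timedelta(days=90)):
    """Return True if the user has few edits and a short editing history.

    `first_edit` and `now` must both be timezone-aware datetimes.
    The defaults mirror the example figures above (500 edits, about 3 months).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return edit_count < max_edits and (now - first_edit) < max_age
```

Only changesets whose author passes this check would then be sent to the classifier.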

You can get the changesets reviewed as bad in OSMCha to train the classifier: https://osmcha.mapbox.com/api-docs/#!/changesets/changesets_harmful_list
