Combined name for basque language #2484


Open. Crashillo wants to merge 17 commits into base: dev

Conversation

Crashillo
Contributor

Names in the Basque Country become very long when written in both Basque and Spanish, so it is quite common to shorten them by merging their common parts.

This PR introduces a function to merge two names, as another valid naming form inside the sp_eu rule.

For instance:

name=Calle Severo Ochoa kalea
name:eu=Severo Ochoa kalea
name:es=Calle Severo Ochoa
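
As a rough illustration of the merge shown in the example above, here is a minimal sketch. It is not the PR's actual implementation: the function name `merge_common_part` and the word-overlap strategy are assumptions made for this example only. It joins the two names on the longest run of words that ends the Spanish name and starts the Basque name.

```python
from typing import Optional


def merge_common_part(name_es: str, name_eu: str) -> Optional[str]:
    """Hypothetical helper: merge two names on their shared words.

    Looks for the longest run of words that ends the Spanish name and
    starts the Basque name, and writes that run only once.
    Returns None when the two names share no such run.
    """
    es_words = name_es.split()
    eu_words = name_eu.split()
    # Try the longest possible overlap first, then shrink.
    for size in range(min(len(es_words), len(eu_words)), 0, -1):
        if es_words[-size:] == eu_words[:size]:
            return " ".join(es_words + eu_words[size:])
    return None


merged = merge_common_part("Calle Severo Ochoa", "Severo Ochoa kalea")
# No overlap here because "Ibáñez" and "Ibañez" are spelled differently.
unmerged = merge_common_part("Calle Vicente Blasco Ibáñez", "Vicente Blasco Ibañez kalea")
```

In this sketch an exact word match is required, so names whose shared part is spelled differently in each language cannot be merged and fall back to the existing separator forms.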


@Crashillo
Contributor Author

I don't know what's wrong with the linter :(

@Famlam
Collaborator

Famlam commented Apr 1, 2025

The linter is unrelated, I'm testing a fix but you can ignore that for now
Can you add a test case (at the bottom of the file)?

@Crashillo
Contributor Author

My expertise in Python is quite limited. AFAIK, the assertion should be true or false, not None... I guess I'm missing something.

@Famlam
Collaborator

Famlam commented Apr 1, 2025

The test functions (node, way and relation; in this case they're equal as they refer to each other) return a list containing errors if an error is found, or None or False if no error is found. So if you have to check for an error (=not adhering to the rules), use assert self.p.way(...), and if you need to check that no error is thrown, use assert not self.p.way(...)

assert doesn't need to receive a real True or False, just something truthy or falsey :)
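
A minimal sketch of the truthy/falsy convention described above. `check_name` is a hypothetical stand-in for a plugin method, not the real `Name_Multilingual.way`; it only mimics the return contract (a non-empty list on error, `None` otherwise).

```python
def check_name(tags):
    """Hypothetical checker: returns a list of errors, or None if all is fine."""
    variants = (tags.get("name:es"), tags.get("name:eu"))
    if tags.get("name") not in variants:
        # A non-empty list is truthy, so `assert check_name(...)` passes.
        return [{"text": "name matches no language variant"}]
    # None is falsy, so `assert not check_name(...)` passes.
    return None


# Expect an error: name differs from both variants.
assert check_name({"name": "A", "name:es": "B", "name:eu": "C"})

# Expect no error: name equals one of the variants.
assert not check_name({"name": "B", "name:es": "B", "name:eu": "C"})
```

Writing `assert check_name(...)` for a case that returns `None` produces exactly the `AssertionError: assert None` pattern, because `None` is falsy.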

@frodrigo
Member

frodrigo commented Apr 1, 2025

Thank you for the PR submission. But first: is this way of naming a good practice attested by the community? Is there a wiki page or forum thread about this?

@Crashillo
Contributor Author

Crashillo commented Apr 1, 2025

The test functions (node, way and relation; in this case they're equal as they refer to each other) return a list containing errors if an error is found, or None or False if no error is found. So if you have to check for an error (=not adhering to the rules), use assert self.p.way(...), and if you need to check that no error is thrown, use assert not self.p.way(...)

assert doesn't need to receive a real True or False, just something truthy or falsey :)

I don't see what the problem is with this clause compared with the rest of the assertions I added:

assert self.p.way(None, {"name": u"Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez", "name:es": u"Calle Vicente Blasco Ibáñez", "name:eu": u"Vicente Blasco Ibañez kalea"}, None)
E       AssertionError: assert None
E        +  where None = <bound method Name_Multilingual.way of <plugins.Name_Multilingual.Name_Multilingual object at 0x7f22ccabf490>>(None, {'name': 'Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez', 'name:es': 'Calle Vicente Blasco Ibáñez', 'name:eu': 'Vicente Blasco Ibañez kalea'}, None)
E        +    where <bound method Name_Multilingual.way of <plugins.Name_Multilingual.Name_Multilingual object at 0x7f22ccabf490>> = <plugins.Name_Multilingual.Name_Multilingual object at 0x7f22ccabf490>.way
E        +      where <plugins.Name_Multilingual.Name_Multilingual object at 0x7f22ccabf490> = <plugins.Name_Multilingual.Test testMethod=test_eu>.p

assert self.p.way(None, {"name": u"Carretera Ollaretxe errepidea", "name:es": u"Carretera Ollaretxe", "name:eu": u"Ollaretxe errepidea"}, None)
assert self.p.way(None, {"name": u"Kale Nagusia / Calle Mayor", "name:es": u"Calle Nagusia", "name:eu": u"Kale Nagusia"}, None)
assert self.p.way(None, {"name": u"Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez", "name:es": u"Calle Vicente Blasco Ibáñez", "name:eu": u"Vicente Blasco Ibañez kalea"}, None)
assert self.p.way(None, {"name": u"Calle San Diego kalea", "name:es": u"Calle San Diego", "name:eu": u"San Diego kalea"}, None)
assert self.p.way(None, {"name": u"Calle Islas Canarias / Kanariar Uharteen kalea", "name:es": u"Calle Islas Canarias", "name:eu": u"Kanariar Uharteen kalea"}, None)
assert not self.p.way(None, {"name": u"Calle Islas Canarias / Kanariar Uharteen kalea", "name:es": u"Calle Canarias", "name:eu": u"Kanarias kalea"}, None)
assert not self.p.way(None, {"name": u"Calle San Diego", "name:es": u"", "name:eu": u"San Diego kalea"}, None)
assert not self.p.way(None, {"name": u"Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez", "name:es": u"", "name:eu": u""}, None)
assert not self.p.way(None, {"name": u"Kale Nagusia", "name:es": u"Calle Nagusia", "name:eu": u""}, None)
assert not self.p.way(None, {"name": u"Carretera Ollaretxe", "name:es": u"Carretera Ollaretxe", "name:eu": u"Ollaretxe errepidea"}, None)

@Crashillo
Contributor Author

Crashillo commented Apr 1, 2025

Is this way of naming a good practice attested by the community? Is there a wiki page or forum thread about this?

There is no Basque community as such, only standalone mappers. Because of the language (and recent history), naming is somewhat contentious, as you know. Besides, Basque names plus Spanish names produce very long names (example; also note the renaming that users do after normalization).

This combined form is very common in the Basque Country and much simpler than <name:es> / <name:eu>, which often leads to confusion, even for tourists. I will document this in the Spanish wiki; however, Basque mappers tend to ignore it. My purpose with this PR is to reduce these errors by allowing a common way of naming things.

The Osmose name-multilingual error is by far one of the most frequently reported issues.

@Famlam
Collaborator

Famlam commented Apr 11, 2025

Regarding the failing test:

assert self.p.way(None, {"name": u"Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez", "name:es": u"Calle Vicente Blasco Ibáñez", "name:eu": u"Vicente Blasco Ibañez kalea"}, None)

means: check that there IS an issue with this combination of tags. (The analyser returns None, or at least something falsy, if there's nothing to complain about.)

However, the following piece of code:

[
    {"name": tags["name:"+lang[0]].strip()},
    {"name": tags["name:"+lang[1]].strip()},
    {"name": tags["name:"+lang[0]].strip() + separator + tags["name:"+lang[1]].strip()},
    {"name": tags["name:"+lang[1]].strip() + separator + tags["name:"+lang[0]].strip()},
    {"name": self.merge_sp_eu(tags["name:"+lang[0]], tags["name:"+lang[1]]).strip()}
]

gives the following result on this input (for valid name values):

[
    {'name': 'Calle Vicente Blasco Ibáñez'},
    {'name': 'Vicente Blasco Ibañez kalea'},
    {'name': 'Calle Vicente Blasco Ibáñez / Vicente Blasco Ibañez kalea'},
    {'name': 'Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez'},
    {'name': ''}
]

As you can see, the 4th option equals the name tag, so it is considered a valid result and no error is returned.

p.s.: I'm a bit worried the '' will end up as a fix suggestion; from memory, I thought it was only stripped if it was None. But we'll see that later.
p.p.s.: note that the original lint issue was fixed but the remaining one is due to this PR
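
To make the candidate comparison in this comment concrete, here is a minimal sketch. `candidate_names` is a hypothetical helper written for this example; it reproduces only the first four list entries quoted above and leaves out the `merge_sp_eu` entry, whose implementation is not shown in this conversation. The `" / "` separator is taken from the printed result.

```python
def candidate_names(tags, lang=("es", "eu"), separator=" / "):
    """Hypothetical helper: the name values accepted as valid for these tags."""
    first = tags["name:" + lang[0]].strip()
    second = tags["name:" + lang[1]].strip()
    return [
        first,
        second,
        first + separator + second,
        second + separator + first,
    ]


tags = {
    "name": "Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez",
    "name:es": "Calle Vicente Blasco Ibáñez",
    "name:eu": "Vicente Blasco Ibañez kalea",
}

candidates = candidate_names(tags)
# The name tag matches the 4th candidate (index 3), so nothing is reported
# and an `assert self.p.way(...)` on these tags fails.
is_valid = tags["name"] in candidates
matching_index = candidates.index(tags["name"])
```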

@Crashillo
Contributor Author

I misunderstood how the assert works: I thought it was the opposite of what you just described. I have now rewritten the whole aggregator function for clarity, and I think it is quite straightforward.

@Crashillo
Contributor Author

Any reason to keep it open?

@Famlam
Collaborator

Famlam commented May 5, 2025

Any reason to keep it open?

I haven't had the time to review it yet, busy times. Sorry for the wait

Comment on lines 405 to 406
assert self.p.way(None, {"name": "Calle San Diego", "name:es": "", "name:eu": "San Diego kalea"}, None)
assert self.p.way(None, {"name": "Kale Nagusia", "name:es": "Calle Mayor", "name:eu": ""}, None)
Member

Suggested change
- assert self.p.way(None, {"name": "Calle San Diego", "name:es": "", "name:eu": "San Diego kalea"}, None)
- assert self.p.way(None, {"name": "Kale Nagusia", "name:es": "Calle Mayor", "name:eu": ""}, None)
+ assert self.p.way(None, {"name": "Calle San Diego"}, None)
+ assert self.p.way(None, {"name:eu": "San Diego kalea"}, None)
+ assert self.p.way(None, {"name:es": "Calle Mayor"}, None)
+ assert self.p.way(None, {"name": "Calle San Diego", "name:eu": "San Diego kalea"}, None)
+ assert self.p.way(None, {"name": "Calle San Diego", "name:es": "", "name:eu": "San Diego kalea"}, None)
+ assert self.p.way(None, {"name": "Kale Nagusia", "name:es": "Calle Mayor"}, None)
+ assert self.p.way(None, {"name": "Kale Nagusia", "name:es": "Calle Mayor", "name:eu": ""}, None)

Contributor Author

Crashillo commented May 16, 2025

I'd say the failing test comes from this change. As I understand it, the cases are not the same: in the first clause I was testing the presence of name and name:eu with different values, and no name:es. The second one is similar: name and name:es differ, and there is no name:eu.

Member

I'm not sure I understand your response. The function should not fail.

Contributor Author

Look at the failing job: it fails only because of the changes you made. Undo these changes and let's try again.

As far as I understand, you're not testing the same cases I did, are you? I wanted to test the cases I described earlier: two tags, name and name:lang, with different values. It should warn the user "hey, something is missing here".

Member

I removed the case where there is only name.

The code should not crash, even if there is nothing to be done.

@frodrigo
Member

I added a few tests that fail. Can you please review my changes?

Crashillo requested a review from frodrigo May 18, 2025 12:42
@frodrigo
Member

@Crashillo I think there is a misunderstanding here. The tests need to pass when only one name variant is present. Currently the plugin crashes. The plugin must not crash: a crash stops the whole check for the "country". I cannot merge a plugin that crashes.

See: I ran it on real data and, as I expected, it crashes.

time ./osmose_run.py --country=spain_euskadi_vizcaya --no-clean --analyser sax --plugin=Name_Multilingual
[...]
2025-05-18 14:14:46   Analysing file /data/work/fred/extracts/spain_euskadi_vizcaya.osm.pbf
[INFO] Reading the file/data/work/fred/extracts/spain_euskadi_vizcaya.osm.pbf
2025-05-18 14:14:48   error: Fail on <bound method Name_Multilingual.node of <plugins.Name_Multilingual.Name_Multilingual object at 0x7f9dda96a8e0>> with {'id': 4379264289, 'lon': -2.9604194000000033, 'lat': 43.274954500000035, 'tag': {'amenity': 'pharmacy', 'healthcare': 'pharmacy', 'name:eu': 'Botika'}, 'timestamp': 1670352603}, {'amenity': 'pharmacy', 'healthcare': 'pharmacy', 'name:eu': 'Botika'}
2025-05-18 14:14:48   Closing reader and parser
2025-05-18 14:14:48   error: error on analyse sax...
2025-05-18 14:14:48     Traceback (most recent call last):
2025-05-18 14:14:48       File "/home/fred/osmose-backend/./osmose_run.py", line 275, in execc
2025-05-18 14:14:48         analyser_obj.analyser()
2025-05-18 14:14:48       File "/home/fred/osmose-backend/analysers/analyser_sax.py", line 69, in analyser
2025-05-18 14:14:48         self._run_analyse()
2025-05-18 14:14:48       File "/home/fred/osmose-backend/analysers/analyser_sax.py", line 491, in _run_analyse
2025-05-18 14:14:48         self.parser.CopyTo(self)
2025-05-18 14:14:48       File "/home/fred/osmose-backend/modules/OsmPbf_libosmbf.py", line 85, in CopyTo
2025-05-18 14:14:48         osm_pbf_parser.read_osm_pbf(self._pbf_file, self)
2025-05-18 14:14:48       File "/home/fred/osmose-backend/modules/OsmPbf_libosmbf.py", line 98, in node
2025-05-18 14:14:48         self._output.NodeCreate(data)
2025-05-18 14:14:48       File "/home/fred/osmose-backend/analysers/analyser_sax.py", line 183, in NodeCreate
2025-05-18 14:14:48         res = meth(data, tags)
2025-05-18 14:14:48       File "/home/fred/osmose-backend/plugins/Name_Multilingual.py", line 171, in node
2025-05-18 14:14:48         a = self.aggregator(tags)
2025-05-18 14:14:48       File "/home/fred/osmose-backend/plugins/Name_Multilingual.py", line 75, in aggregator
2025-05-18 14:14:48         if str1 and name != str1 and separator not in name:
2025-05-18 14:14:48     TypeError: argument of type 'NoneType' is not iterable
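
The traceback above comes from testing `separator not in name` when the object has no `name` tag, so `name` is `None`. A minimal sketch of a guard follows. `needs_combined_name` is a hypothetical function, not the PR's `aggregator`; only the variable `name`, the `separator` comparison, and the failing pharmacy tags are taken from the log.

```python
def needs_combined_name(tags, lang_tag, separator=" / "):
    """Hypothetical guard: True only when both tags exist, differ,
    and name does not already contain the separator."""
    name = tags.get("name")        # None when the object has no name tag
    variant = tags.get(lang_tag)   # None when the language tag is missing
    if not name or not variant:
        # Nothing to compare: return early instead of evaluating `in None`.
        return False
    return name != variant and separator not in name


# Tags of the node that triggered the failure in the log.
pharmacy = {"amenity": "pharmacy", "healthcare": "pharmacy", "name:eu": "Botika"}

# The unguarded membership test raises TypeError on a missing name.
try:
    " / " not in pharmacy.get("name")
    crashed = False
except TypeError:
    crashed = True

result = needs_combined_name(pharmacy, "name:eu")
```

Whether a lone `name:eu` should be reported by some other rule is a separate design decision; this sketch only shows how to avoid the crash.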

@Crashillo
Contributor Author

The tests need to pass for only one name variant

I don't understand what you mean by this. The goal of the PR is to allow the combined form as a valid result.

Look at the following table, which gives a more readable view of the test cases. I hope it makes the purpose of this PR clear:

| name | name:es | name:eu | isValid | comment |
|---|---|---|---|---|
| Carretera Ollaretxe errepidea | Carretera Ollaretxe | Ollaretxe errepidea | | New combined form |
| Kale Nagusia / Calle Mayor | Calle Mayor | Kale Nagusia | | Correct form already exists (Basque first, Spanish second) |
| Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez | Calle Vicente Blasco Ibáñez | Vicente Blasco Ibañez kalea | | Correct form already exists (Basque first, Spanish second) |
| Calle San Diego kalea | Calle San Diego | San Diego kalea | | New combined form |
| Calle Islas Canarias / Kanariar Uharteen kalea | Calle Islas Canarias | Kanariar Uharteen kalea | | Correct form already exists (Spanish first, Basque second) |
| Vicente Blasco Ibañez kalea / Calle Vicente Blasco Ibáñez | (empty) | (empty) | | Only name with separator, it's valid |
| Calle Islas Canarias / Kanariar Uharteen kalea | Calle Canarias | Kanarias kalea | | name is different from both Basque and Spanish |
| Calle San Diego | (empty) | San Diego kalea | | name and name:eu differ, Spanish is missing and name should be combined |
| Kale Nagusia | Calle Mayor | (empty) | | name and name:es differ, Basque is missing and name should be combined |
| Carretera Ollaretxe | Carretera Ollaretxe | Ollaretxe errepidea | | Should be correct as a new combined form |


3 participants