-
-
Notifications
You must be signed in to change notification settings - Fork 742
Update linestopolygons algorithm description #9784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I didn't attach visualization (print screen) because it's a pretty long one in render. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @phidrho
Really nice stuff. It will certainly help a lot of users.
Lines to polygons on linestrings | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Features with 3 or more vertices are expected on input: **start point - vertex/vertices - end point** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first example contradicts this statement. And as I suggest later, I'd just merge concept of single and multiline segments together
Lines to polygons on linestrings | |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
Features with 3 or more vertices are expected on input: **start point - vertex/vertices - end point** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I am aware that first example is opposite of description, but as noted in previous comment it's for people who will immediately jump to examples. My logic is to prove the opposite - just like in math - this is an example how algorithm works - and it is only example that show that in this case algorithm will have an output visible in attribute table only, with EMPTY geometry so it does fit in this table since output does exist it is not rejected.
I didn't merge examples from single segment lines and multi segment lines because the first table explains how algorithm works and it's much more important, and second table is like an appendix for users that are coming from CAD background and finding name of algorithm - "lines to polygons" confusing. The second table could be dismissed at all, but I would like to keep it separate because I find it important for understanding how algorithm works in these cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ADDITIONAL NOTES:
- first example from second table with examples (appendix - as I call it) is also to avoid possible bug reports for this case, where a user might be confused why the algorithm doesn't work - it is not obvious in normal conditions (see image below) - and if not documented with example image could possibly be hard to determine cause of behavior (imagine thousands of features on input):
- same layer, lines look exactly the same
- second example from second table is overkill, I agree, but it is very important for connection with description of Polygonize algorithm - user then can see clearly that this algorithm is not what he/she needs - and we have a "see also" note where he/she can jump to Polygonize algorithm and then user can see there an example that on the same input - algorithm produces polygon for this case
- same case, example that will be in Polygonize algorithm documentation:
So both of these examples are intended for better user experience and not directly for describing algorithm so I moved them to another table/section.
algorithm will automatically connect last to first point when forming a polygon. | ||
Result is always promoted to MultiPolygon. | ||
LineString geometry that have less than three vertices will produce | ||
new polygon features with **EMPTY** MultiPolygon geometry, attributes are kept. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new polygon features with **EMPTY** MultiPolygon geometry, attributes are kept. | |
new polygon features with **EMPTY** geometry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The attribute table of the output layer is the same as the one of | ||
the input layer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The attribute table of the output layer is the same as the one of | |
the input layer. | |
Attributes of the input line feature are ALWAYS kept in the output polygon geometry. |
And then I wonder if we need to repeat it for every example below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add this new sentence to new PR, but I would like to keep it repeated for every example.
Why? Because of users who do not read documentation thoroughly but are more of a "visual persons" and want to see examples immediately. Me personally is one of these persons - I would jump straight to examples - probably without reading entire description properly. But I also agree with you that's not needed redundancy - please make a decision and I'll fix it the way it should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t have a strong opinion on this, but I’m concerned that if we adopt this kind of repetition for every example in every algorithm (we will probably add more examples in the future) it could lead to a lot of unnecessary text in the documentation?
| | ||
| **Output has a valid (empty) geometry.** | ||
|
||
* - **Feature b**: LineString with three vertices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the number of vertice/segments matters? It is more the situation of a not closed linestring with more than one segment, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linestring doesn't need to be closed. Algorithm always assume vertices are closing a loop - that's why I included example as open linestring with three vertices. I have also tried some other examples when all three vertices lie on same line but these are special cases, and I think that I decided to discard these examples.
:width: 25 em | ||
:align: center | ||
|
||
Two distinct LineStrings, one geometrically contained inside another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(here and below) Are LineString and MultiPolygon (written with that casing) pluralizable? How about
Two distinct LineStrings, one geometrically contained inside another | |
Two distinct line features, one geometrically contained inside another |
- | **Feature e**: MultiPolygon | ||
| **Feature f**: MultiPolygon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this render in HTML?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the delay, I will catch up with this PR till the end of the week |
Hi @DelazJ, please see my comments and resolve conversations, and give your opinion on things that are left open from me. Then I'll create a new PR and we can finish it refining details. Sorry once again for delay, I couldn't find free time earlier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @phidrho this is a great improvement.
Just a tip: it might be easier to create a new commit for the suggestions (you can make more than one), rather than opening a whole new PR.
or more vertices as polygon rings. Input LineString geometry doesn't need to close a loop, | ||
algorithm will automatically connect last to first point when forming a polygon. | ||
Result is always promoted to MultiPolygon. | ||
LineString geometry that have less than three vertices will produce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @phidrho , "vertices" is also a term used in the log message - "One or more line ignored due to geometry not having a minimum of three vertices."
algorithm will automatically connect last to first point when forming a polygon. | ||
Result is always promoted to MultiPolygon. | ||
LineString geometry that have less than three vertices will produce | ||
new polygon features with **EMPTY** MultiPolygon geometry, attributes are kept. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code show us a MultiPolygon, but there is no need to repeat that information twice in the same description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Selma, this two sentences describe different things. Let me try to explain:
Row 3855 explains that you will never get Polygon geometry as a result - it will always be MultiPolygon even when not needed - this is a decision made by developer and should be documented. Many results could be simple Polygons, but result is always promoted to MultiPolygon because then we would need two layers on output - Polygon layer and MultiPolygon layer for specific cases (see table below for difference between Polygon and MultiPolygon for the same geometry shape - they are pretty different in WKB (in database or on hard disk, while on screen they are identical). Although to casual QGIS user doesn't mean much - it is still different geometry type, just like Point and Polygon are different just on a different level. This sentence is very important because if user wants Polygons in final output he must additionally run native:multiparttosingleparts
algorithm (rough estimation from AI - multipolygons are 5% larger in storage size compared to same geometry stored as polygons).
Row 3857 explains that result is EMPTY geometry - but technically there is no such thing as "EMPTY geometry", you can only have data-types (literally Point EMPTY
, LineString EMPTY
, Polygon EMPTY
etc.) that are defined by standard, and if you want to parse them you need to know what to expect - Point EMPTY
is very different from MultiPolygon EMPTY
, see table:
Geometry Type | WKT | WKB (Hexadecimal) |
---|---|---|
POINT | POINT EMPTY |
01 01000000 000000000000F87F 000000000000F87F |
POINT (1 2) |
01 01000000 000000000000F03F 0000000000000040 |
|
MULTIPOINT | MULTIPOINT EMPTY |
01 04000000 00000000 |
MULTIPOINT ((1 2), (3 4)) |
01 04000000 02000000 0101000000000000000000F03F0000000000000040 010100000000000000000008400000000000001040 |
|
LINESTRING | LINESTRING EMPTY |
01 02000000 00000000 |
LINESTRING (1 2, 3 4) |
01 02000000 02000000 000000000000F03F 0000000000000040 0000000000000840 0000000000001040 |
|
MULTILINESTRING | MULTILINESTRING EMPTY |
01 05000000 00000000 |
MULTILINESTRING ((1 2, 3 4)) |
01 05000000 01000000 010200000002000000000000000000F03F000000000000004000000000000008400000000000001040 |
|
POLYGON | POLYGON EMPTY |
01 03000000 00000000 |
POLYGON ((1 2, 3 4, 5 6, 1 2)) |
01 03000000 01000000 04000000 000000000000F03F 0000000000000040 0000000000000840 0000000000001040 0000000000001440 0000000000001840 000000000000F03F 0000000000000040 |
|
MULTIPOLYGON | MULTIPOLYGON EMPTY |
01 06000000 00000000 |
MULTIPOLYGON (((1 2, 3 4, 5 6, 1 2))) |
01 06000000 01000000 01030000000100000004000000000000000000F03F00000000000000400000000000000840000000000000104000000000000014400000000000001840000000000000F03F0000000000000040 |
So, because of technical reasons I would like to keep name MultiPolygon EMPTY
whenever we are explaining output. That's at least how I would write documentation - so that it is technically correct and exact, if this is not practice in QGIS User Guide, or if you consider this overkill we can remove this and keep only "EMPTY" geometry, I'm not against it by any mean. Please let me know your final decision.
P.S. I've been personally through these problems before - trying to parsing binary geometry in PHP (I didn't have any good library to parse this for me) - so - in this type of cases it is very useful to know whether should you expect MULTIPOLYGON (((0 0, 0 0, 0 0, 0 0)))
or MULTIPOLYGON EMPTY
data-type, or possibly even more problematic POINT EMPTY
.
The attribute table of the output layer is the same as the one of | ||
the input layer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t have a strong opinion on this, but I’m concerned that if we adopt this kind of repetition for every example in every algorithm (we will probably add more examples in the future) it could lead to a lot of unnecessary text in the documentation?
Examples | ||
........ | ||
|
||
Lines to polygons on linestrings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note: I think the titles “Lines to polygons on lines” vs “...on linestrings” could be a bit confusing, especially for users who aren’t deeply familiar with GIS geometry types (as you mentioned someone with a CAD background). Since both examples use LineString inputs, maybe we could rephrase the titles to reflect the real difference, like “with two points” vs “with three or more points” or “simple vs complex LineStrings”?
Might also be worth adding a short intro sentence before each table to explain the behavior upfront (e.g. that lines with only two points will result in empty polygons)...
Hi @selmaVH1, thanks, yes, I meant new commit, not pull request... But I do not know if these comments will be deleted if I push new commit for changes that I've fixed or will they stay? That's why I didn't pushed new commit yet, I would like to hear opinions on some of my comments from Harissou if that's OK? |
If you push a commit with changes on the commented lines, they will be labeled as “Outdated,” but you can still see them if you click that label.
Of course 😄 |
@phidrho Sorry for the delay but no worries, i didn't forget this PR. |
no problem, there's no rush |
@phidrho May I request that you commit all the changes we agree on, please. I'm trying to review but I'm struggling to find my way among the different comments (and am afraid of contradictory suggestions). Sorry for the inconvenience. Thanks. |
Hi @DelazJ, I committed my fixes and marked resolved wherever applicable, here is local render: |
Goal:
Create a better documentation for native LinesToPolygons processing algorthm.
Please review and suggest additional edits if needed.
After we finish final version, I will prepare similar one for native Polygonize algorithm.
Ticket(s): #