Skip to content

Update linestopolygons algorithm description #9784

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

phidrho
Copy link
Contributor

@phidrho phidrho commented Apr 11, 2025

Goal:
Create a better documentation for native LinesToPolygons processing algorthm.

Please review and suggest additional edits if needed.

After we finish final version, I will prepare similar one for native Polygonize algorithm.

Ticket(s): #

  • Backport to LTR documentation is requested

@phidrho
Copy link
Contributor Author

phidrho commented Apr 11, 2025

I didn't attach visualization (print screen) because it's a pretty long one in render.
But I can deliver it if that will be needed.

Copy link
Collaborator

@DelazJ DelazJ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @phidrho
Really nice stuff. It will certainly help a lot of users.

Comment on lines +3865 to +3868
Lines to polygons on linestrings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Features with 3 or more vertices are expected on input: **start point - vertex/vertices - end point**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first example contradicts this statement. And as I suggest later, I'd just merge concept of single and multiline segments together

Suggested change
Lines to polygons on linestrings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Features with 3 or more vertices are expected on input: **start point - vertex/vertices - end point**

Copy link
Contributor Author

@phidrho phidrho May 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am aware that first example is opposite of description, but as noted in previous comment it's for people who will immediately jump to examples. My logic is to prove the opposite - just like in math - this is an example how algorithm works - and it is only example that show that in this case algorithm will have an output visible in attribute table only, with EMPTY geometry so it does fit in this table since output does exist it is not rejected.

I didn't merge examples from single segment lines and multi segment lines because the first table explains how algorithm works and it's much more important, and second table is like an appendix for users that are coming from CAD background and finding name of algorithm - "lines to polygons" confusing. The second table could be dismissed at all, but I would like to keep it separate because I find it important for understanding how algorithm works in these cases.

Copy link
Contributor Author

@phidrho phidrho May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ADDITIONAL NOTES:

  1. first example from second table with examples (appendix - as I call it) is also to avoid possible bug reports for this case, where a user might be confused why the algorithm doesn't work - it is not obvious in normal conditions (see image below) - and if not documented with example image could possibly be hard to determine cause of behavior (imagine thousands of features on input):
  • same layer, lines look exactly the same

image

image

  1. second example from second table is overkill, I agree, but it is very important for connection with description of Polygonize algorithm - user then can see clearly that this algorithm is not what he/she needs - and we have a "see also" note where he/she can jump to Polygonize algorithm and then user can see there an example that on the same input - algorithm produces polygon for this case
  • same case, example that will be in Polygonize algorithm documentation:

image


So both of these examples are intended for better user experience and not directly for describing algorithm so I moved them to another table/section.

algorithm will automatically connect last to first point when forming a polygon.
Result is always promoted to MultiPolygon.
LineString geometry that have less than three vertices will produce
new polygon features with **EMPTY** MultiPolygon geometry, attributes are kept.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
new polygon features with **EMPTY** MultiPolygon geometry, attributes are kept.
new polygon features with **EMPTY** geometry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know if this is just QGIS representation, or there is actually some information that generated geometry is indeed MultiPolygon:
image

Comment on lines 3859 to 3860
The attribute table of the output layer is the same as the one of
the input layer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The attribute table of the output layer is the same as the one of
the input layer.
Attributes of the input line feature are ALWAYS kept in the output polygon geometry.

And then I wonder if we need to repeat it for every example below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add this new sentence to new PR, but I would like to keep it repeated for every example.
Why? Because of users who do not read documentation thoroughly but are more of a "visual persons" and want to see examples immediately. Me personally is one of these persons - I would jump straight to examples - probably without reading entire description properly. But I also agree with you that's not needed redundancy - please make a decision and I'll fix it the way it should be.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t have a strong opinion on this, but I’m concerned that if we adopt this kind of repetition for every example in every algorithm (we will probably add more examples in the future) it could lead to a lot of unnecessary text in the documentation?

|
| **Output has a valid (empty) geometry.**

* - **Feature b**: LineString with three vertices
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the number of vertice/segments matters? It is more the situation of a not closed linestring with more than one segment, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linestring doesn't need to be closed. Algorithm always assume vertices are closing a loop - that's why I included example as open linestring with three vertices. I have also tried some other examples when all three vertices lie on same line but these are special cases, and I think that I decided to discard these examples.

:width: 25 em
:align: center

Two distinct LineStrings, one geometrically contained inside another
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(here and below) Are LineString and MultiPolygon (written with that casing) pluralizable? How about

Suggested change
Two distinct LineStrings, one geometrically contained inside another
Two distinct line features, one geometrically contained inside another

Comment on lines +3976 to +3977
- | **Feature e**: MultiPolygon
| **Feature f**: MultiPolygon
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this render in HTML?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems that this forces a paragraph in rendering, without it, it would all be in same line:
image

and by adding new line they these two rows are too close in my opinion, see images:
image
image

I would like to keep my version if you agree.

@DelazJ DelazJ added Processing Alg (Features of) An algorithm. Use the Help button in the application to find the page. backport release_3.40 On merge create a backported pull request to 3.40 labels Apr 12, 2025
@phidrho
Copy link
Contributor Author

phidrho commented Apr 29, 2025

sorry for the delay, I will catch up with this PR till the end of the week

@phidrho
Copy link
Contributor Author

phidrho commented May 4, 2025

Hi @DelazJ,

please see my comments and resolve conversations, and give your opinion on things that are left open from me. Then I'll create a new PR and we can finish it refining details.

Sorry once again for delay, I couldn't find free time earlier.

Copy link
Collaborator

@selmaVH1 selmaVH1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @phidrho this is a great improvement.
Just a tip: it might be easier to create a new commit for the suggestions (you can make more than one), rather than opening a whole new PR.

or more vertices as polygon rings. Input LineString geometry doesn't need to close a loop,
algorithm will automatically connect last to first point when forming a polygon.
Result is always promoted to MultiPolygon.
LineString geometry that have less than three vertices will produce
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @phidrho , "vertices" is also a term used in the log message - "One or more line ignored due to geometry not having a minimum of three vertices."

algorithm will automatically connect last to first point when forming a polygon.
Result is always promoted to MultiPolygon.
LineString geometry that have less than three vertices will produce
new polygon features with **EMPTY** MultiPolygon geometry, attributes are kept.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code show us a MultiPolygon, but there is no need to repeat that information twice in the same description?

Copy link
Contributor Author

@phidrho phidrho May 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Selma, this two sentences describe different things. Let me try to explain:

Row 3855 explains that you will never get Polygon geometry as a result - it will always be MultiPolygon even when not needed - this is a decision made by developer and should be documented. Many results could be simple Polygons, but result is always promoted to MultiPolygon because then we would need two layers on output - Polygon layer and MultiPolygon layer for specific cases (see table below for difference between Polygon and MultiPolygon for the same geometry shape - they are pretty different in WKB (in database or on hard disk, while on screen they are identical). Although to casual QGIS user doesn't mean much - it is still different geometry type, just like Point and Polygon are different just on a different level. This sentence is very important because if user wants Polygons in final output he must additionally run native:multiparttosingleparts algorithm (rough estimation from AI - multipolygons are 5% larger in storage size compared to same geometry stored as polygons).

Row 3857 explains that result is EMPTY geometry - but technically there is no such thing as "EMPTY geometry", you can only have data-types (literally Point EMPTY, LineString EMPTY, Polygon EMPTY etc.) that are defined by standard, and if you want to parse them you need to know what to expect - Point EMPTY is very different from MultiPolygon EMPTY, see table:

Geometry Type WKT WKB (Hexadecimal)
POINT POINT EMPTY 01 01000000 000000000000F87F 000000000000F87F
POINT (1 2) 01 01000000 000000000000F03F 0000000000000040
MULTIPOINT MULTIPOINT EMPTY 01 04000000 00000000
MULTIPOINT ((1 2), (3 4)) 01 04000000 02000000 0101000000000000000000F03F0000000000000040 010100000000000000000008400000000000001040
LINESTRING LINESTRING EMPTY 01 02000000 00000000
LINESTRING (1 2, 3 4) 01 02000000 02000000 000000000000F03F 0000000000000040 0000000000000840 0000000000001040
MULTILINESTRING MULTILINESTRING EMPTY 01 05000000 00000000
MULTILINESTRING ((1 2, 3 4)) 01 05000000 01000000 010200000002000000000000000000F03F000000000000004000000000000008400000000000001040
POLYGON POLYGON EMPTY 01 03000000 00000000
POLYGON ((1 2, 3 4, 5 6, 1 2)) 01 03000000 01000000 04000000 000000000000F03F 0000000000000040 0000000000000840 0000000000001040 0000000000001440 0000000000001840 000000000000F03F 0000000000000040
MULTIPOLYGON MULTIPOLYGON EMPTY 01 06000000 00000000
MULTIPOLYGON (((1 2, 3 4, 5 6, 1 2))) 01 06000000 01000000 01030000000100000004000000000000000000F03F00000000000000400000000000000840000000000000104000000000000014400000000000001840000000000000F03F0000000000000040

So, because of technical reasons I would like to keep name MultiPolygon EMPTY whenever we are explaining output. That's at least how I would write documentation - so that it is technically correct and exact, if this is not practice in QGIS User Guide, or if you consider this overkill we can remove this and keep only "EMPTY" geometry, I'm not against it by any mean. Please let me know your final decision.

P.S. I've been personally through these problems before - trying to parsing binary geometry in PHP (I didn't have any good library to parse this for me) - so - in this type of cases it is very useful to know whether should you expect MULTIPOLYGON (((0 0, 0 0, 0 0, 0 0))) or MULTIPOLYGON EMPTY data-type, or possibly even more problematic POINT EMPTY.

Comment on lines 3859 to 3860
The attribute table of the output layer is the same as the one of
the input layer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t have a strong opinion on this, but I’m concerned that if we adopt this kind of repetition for every example in every algorithm (we will probably add more examples in the future) it could lead to a lot of unnecessary text in the documentation?

Examples
........

Lines to polygons on linestrings
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note: I think the titles “Lines to polygons on lines” vs “...on linestrings” could be a bit confusing, especially for users who aren’t deeply familiar with GIS geometry types (as you mentioned someone with a CAD background). Since both examples use LineString inputs, maybe we could rephrase the titles to reflect the real difference, like “with two points” vs “with three or more points” or “simple vs complex LineStrings”?
Might also be worth adding a short intro sentence before each table to explain the behavior upfront (e.g. that lines with only two points will result in empty polygons)...

@phidrho
Copy link
Contributor Author

phidrho commented May 12, 2025

Hi @selmaVH1,

thanks, yes, I meant new commit, not pull request...

But I do not know if these comments will be deleted if I push new commit for changes that I've fixed or will they stay?

That's why I didn't pushed new commit yet, I would like to hear opinions on some of my comments from Harissou if that's OK?

@selmaVH1
Copy link
Collaborator

But I do not know if these comments will be deleted if I push new commit for changes that I've fixed or will they stay?

If you push a commit with changes on the commented lines, they will be labeled as “Outdated,” but you can still see them if you click that label.
Don’t worry, I don’t think there’s any case where github deletes comments automatically... Even if you marked the conversation as resolved, you can still see the comments by clicking on the "Show resolved".

That's why I didn't pushed new commit yet, I would like to hear opinions on some of my comments from Harissou if that's OK?

Of course 😄

@DelazJ
Copy link
Collaborator

DelazJ commented May 15, 2025

@phidrho Sorry for the delay but no worries, i didn't forget this PR.

@phidrho
Copy link
Contributor Author

phidrho commented May 15, 2025

no problem, there's no rush

@DelazJ
Copy link
Collaborator

DelazJ commented Jun 5, 2025

@phidrho May I request that you commit all the changes we agree on, please. I'm trying to review but I'm struggling to find my way among the different comments (and am afraid of contradictory suggestions). Sorry for the inconvenience. Thanks.

@phidrho
Copy link
Contributor Author

phidrho commented Jun 20, 2025

Hi @DelazJ,

I committed my fixes and marked resolved wherever applicable, here is local render:

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport release_3.40 On merge create a backported pull request to 3.40 Processing Alg (Features of) An algorithm. Use the Help button in the application to find the page.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants