Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvements 🚀 #488

Open
juliohm opened this issue Jun 29, 2023 · 15 comments
Open

Performance improvements 🚀 #488

juliohm opened this issue Jun 29, 2023 · 15 comments
Labels
help wanted Extra attention is needed performance

Comments

@juliohm
Copy link
Member

juliohm commented Jun 29, 2023

This issue tracks our efforts to improve the performance of different algorithms and data structures implemented in the project. The public API is stabilizing and there is only one major breaking change planned before a possible v1.0.

Given the solid design that separates user-facing functions from internals, we can easily refactor the code without breaking code downstream. We kindly ask the community to add comments here whenever they encounter a performance problem.

@juliohm juliohm added help wanted Extra attention is needed performance labels Jun 29, 2023
@juliohm juliohm pinned this issue Jun 29, 2023
@juliohm
Copy link
Member Author

juliohm commented Jul 3, 2023

Copying from Zulip:

One slightly breaking change I would consider is changing the vararg constructors of geometries to use static arrays (possibly mutable). This would make it much easier to write performant code that handles a lot of small geometries (e.g. segments or triangles) and avoid code such as Segment(view(v, [i, i + 1])) that currently exists (which in this case allocates the index vector plus possibly the view, afaik). I think I also have a general expectation that the arguments of a constructor are stored within the struct and not allocated, so having Triangle(a, b, c) allocate [a, b, c] is rather surprising to me and having expressions such as centroid(Segment(a, b)) and area(Triangle(a, b, c)) perform well would be very desirable imo. The only drawback that I see is that the number of points becomes fixed, which is arguably a good thing for Ngon but maybe not in the case of Chain/Bezier (though constructing them with a regular vector would still be possible of course).

comment by @mfsch

@juliohm
Copy link
Member Author

juliohm commented Jul 3, 2023

That is something we have in mind. In the past we had considered StaticVector for the list of vertices inside some of the Polytope types, but refactored and erased this. We may need to reintroduce this optimization and use NTuple directly whenever possible. Some of our Polytope types like PolyArea and Chain's subtypes require dynamic vectors due to the use cases.

@BloodWorkXGaming
Copy link

Related to this, I found another potential bottleneck: segments() function for chains is slower by about a factor of 6 than the "manual" version with StaticArrays:
This small change had an improvement from 30 to 5 seconds in my case:

# @views for seg in segments(chain)
@views for (p1, p2) in zip(verts[1:n-1], verts[2:n])
        seg = Segment(SA[p1, p2])
       .... do stuff ....
end

@juliohm
Copy link
Member Author

juliohm commented Jul 7, 2023 via email

@mfsch
Copy link
Contributor

mfsch commented Jul 7, 2023

Switching Segment(view(v, [i, i + 1])) to Segment(SVector(v[i], v[i+1])) in the segments function should correspond to what @BloodWorkXGaming did, but as I wrote I think it would be more useful to make Segment(a, b) expand to Segment(SVector(a, b)) (and similarly for all n-gons), instead of changing individual uses of Segment. If you prefer not to change the varargs constructor of Ngon, then it might make sense to start improving the performance of individual use cases, but I think it would be helpful to first make that more general decision.

@BloodWorkXGaming
Copy link

BloodWorkXGaming commented Jul 7, 2023

Another point of performance improvement might be the boundingbox implementations:
These have a lot of allocations, which shouldn't be necessary for any boundingbox calculation.
E.g. for this shape it's over 17kb of RAM

julia> poly
2 GeometrySet{2,Float64}
  └─PolyArea(50-Ring)
  └─PolyArea(356-Ring, 302-Ring)

julia> @time boundingbox(poly)
  0.000021 seconds (21 allocations: 17.172 KiB)
Box{2, Float64}(Point(0.604086399, -13.3619730947), Point(149.4063610679, 13.8699076501))

This isn't really a priority as it is still fast, but could be a problem when running in a tight loop.

@juliohm
Copy link
Member Author

juliohm commented Jul 7, 2023 via email

@juliohm
Copy link
Member Author

juliohm commented Jul 7, 2023

@BloodWorkXGaming regarding the boundingbox allocations, did you identify the harmful lines? Feel free to submit PRs for review. If it is also related to the current internal representation of vertices in Polytope types, we will fix it in a more systematic way.

@BloodWorkXGaming
Copy link

Ye I found some harmful lines, it is rather related to the MVectors and collect operations in the functions. Will try to open a PR after the weekend with some improvements :)

@juliohm
Copy link
Member Author

juliohm commented Jul 17, 2023

Update: we already implemented a series of changes to the vertices representation inside Polytope geometries. They are NTuple of Point now, which means, more compile-time optimizations.

@juliohm
Copy link
Member Author

juliohm commented Jul 21, 2023

Update: refactored boundingbox and hull methods to allocate zero memory whenever possible.

@BloodWorkXGaming all cases covered now from your previous PR.

@juliohm juliohm changed the title Performance improvements Performance improvements 🚀 Jul 21, 2023
@BloodWorkXGaming
Copy link

Thanks! Sorry for not refactoring the PR, was very low on time the past weeks

@juliohm
Copy link
Member Author

juliohm commented Jul 26, 2023

Update: pointwise transforms no longer allocate intermediate vectors. Rotate, Translate and Stretch are examples.

@juliohm
Copy link
Member Author

juliohm commented Feb 1, 2024

We need to improve the performance of the FIST triangulation method. I think it is already type stable, we need algorithmic improvements and multiple threads now.

@juliohm
Copy link
Member Author

juliohm commented Jul 4, 2024

The new sideof(point, ring) algorithm by Hao et al, which we implemented with multiple-threads leads to ~12x speedup. point in poly queries benefit equally from the improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed performance
Projects
None yet
Development

No branches or pull requests

3 participants