Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ind2x mvec #32

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Conversation

mattuntergassmair
Copy link
Contributor

It seems that pre-specifying the vector size using MVector from StaticArrays helps to significantly improve the performance of ind2x
Screenshot from 2020-03-05 23-57-18

@mattuntergassmair
Copy link
Contributor Author

Could potentially be a breaking change if users were relying on the return type of ind2x to be Array rather than AbstractArray

@coveralls
Copy link

coveralls commented Mar 5, 2020

Coverage Status

Coverage decreased (-0.07%) to 95.726% when pulling 16403a7 on mattuntergassmair:ind2x_mvec into 5dcb458 on sisl:master.

@mykelk
Copy link
Member

mykelk commented Mar 6, 2020

@MaximeBouton shall we merge this?

@zsunberg
Copy link
Member

zsunberg commented Mar 6, 2020

I like using StaticArrays! Shouldn't you be able to get it down to 0 allocs though? You have only reduced it to 1000000 from 2000000

ndims = dimensions(grid)
x = Array{Float64}(undef, ndims)
x::MVector{D, Float64} = @MVector zeros(D)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we want to reduce allocation to zero, adding a container for x in grid might do it. @zsunberg @mattuntergassmair

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean adding a MVector{D, Float64} as a field to the RectangleGrid struct, so we can simply use that to store the result instead of allocating - did I understand the suggestion correctly?
It would help with the allocations, but I think what would happen is that if you call ind2x() twice, and you are still holding on to the result computed in the first call, then it gets overwritten in the second call without the user noticing, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in that case users should just pre-allocate memory on their side and use ind2x!() rather than ind2x()

Copy link
Member

@zsunberg zsunberg Mar 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x is typically pretty low-dimensional, right? I think ind2x should return an SVector. I think that should get rid of all the allocations and save a large amount of time.

In fact, I think the calls between ind2x and ind2x! should be switched, i.e. ind2x should contain the logic, and ind2x! should be implemented as

function ind2x!(grid, ind, x)
x[:] = ind2x(grid, ind)
end

ind2x!(grid, ind, x)
x::Array{Float64}
x::MVector{D, Float64}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need the type annotation here? I would check with @code_warntype that this is type stable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just kept the type annotation that was already present in the form of ::Array{Float64}, but should not be necessarry, agreed. My understanding was that it's more explicit than required, but also doesn't harm, right?

@mattuntergassmair
Copy link
Contributor Author

I like using StaticArrays! Shouldn't you be able to get it down to 0 allocs though? You have only reduced it to 1000000 from 2000000

As for the allocations, I'm running the function benchmark_ind2x() with default parameters n_dims = 6, n_points_per_dim = 10, leading to 10^6 points. The benchmark function then calls ind2x() 10^6=1000000 times. So I think the allocs are as expected (as opposed to ind2x!() which should be non-allocating). I think what was happening earlier was that there was a copy somewhere leading to 2 x 10^6 allocs.

@zsunberg
Copy link
Member

zsunberg commented Mar 6, 2020

Could potentially be a breaking change if users were relying on the return type of ind2x to be Array rather than AbstractArray

I think this is fine. If people are relying on a concrete array type, I will not feel sorry if their code gets broken :)

@MaximeBouton
Copy link
Member

Shouldn't we expect a similar speed up by replacing those:
https://github.com/sisl/GridInterpolations.jl/blob/master/src/GridInterpolations.jl#L36
by mutable statically sized arrays?

@mattuntergassmair
Copy link
Contributor Author

Shouldn't we expect a similar speed up by replacing those:
https://github.com/sisl/GridInterpolations.jl/blob/master/src/GridInterpolations.jl#L36
by mutable statically sized arrays?

True, we could probly fix the size of most vectors that are part of the RectangleGrid struct and that should help with allocation and performance.
I could look into it but might need some help since I don't feel I understand the code base sufficiently well. Anyone want to team up on this (for a different PR)?

@zsunberg
Copy link
Member

zsunberg commented Mar 7, 2020 via email

@mattuntergassmair
Copy link
Contributor Author

Those allocations are in the constructor, so I don't think it will make too much difference because they are only done once when the object is constructed.

On Fri, Mar 6, 2020 at 4:29 PM Maxime @.***> wrote: Shouldn't we expect a similar speed up by replacing those: https://github.com/sisl/GridInterpolations.jl/blob/master/src/GridInterpolations.jl#L36 by mutable statically sized arrays? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#32?email_source=notifications&email_token=ABALI26NZBQ4T2K4OVNV5OTRGGBODA5CNFSM4LCUEXH2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEODFVIY#issuecomment-596007587>, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABALI27K3ONJYOXWJJKLJTLRGGBODANCNFSM4LCUEXHQ .

Agreed that it may not make a difference in the constructor itself, but we could make the member variables static in size, maybe the fixed size helps with memory optimization. I can give it a shot and report back.

@zsunberg
Copy link
Member

zsunberg commented Mar 7, 2020

As for the allocations, I'm running the function benchmark_ind2x() with default parameters n_dims = 6, n_points_per_dim = 10, leading to 10^6 points. The benchmark function then calls ind2x() 10^6=1000000 times. So I think the allocs are as expected (as opposed to ind2x!() which should be non-allocating). I think what was happening earlier was that there was a copy somewhere leading to 2 x 10^6 allocs.

@MVector zeros(D) Allocates. We should really do this with SVectors. Right now we are getting a 2x speedup with mvectors. I bet we will get a 10 or 100x speedup with svectors

@MaximeBouton
Copy link
Member

For a broader utilization of MVector for the interpolation code, one thing to keep in mind is that if the size is larger than 100 we should expect decrease in performance according to the StaticArrays.jl readme. Since the size is given by 2^num_dimensions, it grows pretty fast.

@mykelk
Copy link
Member

mykelk commented Dec 5, 2023

What is the status of this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants