Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codegen for multi-dimensional arrays is poor #35293

Closed
tannergooding opened this issue Apr 22, 2020 · 11 comments
Closed

Codegen for multi-dimensional arrays is poor #35293

tannergooding opened this issue Apr 22, 2020 · 11 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@tannergooding
Copy link
Member

tannergooding commented Apr 22, 2020

The overall codegen for multi-dimensional arrays is poor and has worse performance than a jagged array, even though the former is linear in memory.
This has resulted in multiple StackOverflow posts on the topic and even Unity recommending that multi-dimensional arrays not be used: https://docs.unity3d.com/2019.3/Documentation/Manual/BestPracticeUnderstandingPerformanceInUnity8.html

It looks like there are a few obvious issues that should hopefully be trivially handled:

  1. The Array.GetLength() method currently generates fairly unoptimal code. It actually winds up as a call rather than just a lookup to the underlying array data. Ideally these would be straight lookups when the dimension being looked up is constant and definitely in bounds of the rank.
  2. The JIT doesn't appear to be eliding bounds checks against Array.GetLength(), whether cached or otherwise
  3. The JIT doesn't recognize that for (y = 0; y < yLength; y++) { for (x = 0; x < xLength; x++) { data[x, y]; } } results in linear access of the underlying data

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIGYACMhgYQYG8aHunGBLAHYYGAZWz4ADgBsY5ABSCMAbTQBdBgBNsGbAEoOXHkb4AzBnK06GfXAwEBXKVP3EA7AwAMAbkNHuvv0UGWFxHYQBeTx9qPx4AoxNocyCATwZI7wY0gB5NbWwAOgBxGAwAGRgBAHMMAAs5cl0vLIBqFt14nk4Y2L9EqGShBgR0qO4R3MtCkvLKmvqPJuG2jp7ers717hCwhhbIqaUENCzVaK2eAF9Nhmu1vxu3YJhQqQxzq4CA+mshsUkZKQFEIVOopvpuutTOYptZbA4nC53N4bjcgjs3qMUfcjDcAG7YAYpCrVOqjKbFUok+YNJr4wnDalkg75SmzUkLOk4uLc7j9QbCNIZZo5LJM2oilY3SEXBj84HCEbC8YMXIIcXNBBS3l+GWyngY4T7PI6I4nFJnG5GO7rG29R7uQ0fbh2/z3H5Bf7SGB0BWgk16Aw66EWfJwuyOZxMZHOh7BoaGrGx3E6gkDdVzZkBtniuSLZM8NNizO1cmsma5xoFt3reVBJVjYaqxklzXa9Z6rZ1oZCsai4mt1rtK0bHXrRPGw7HU7VvyuueonVPJ0BefffhDCsljx+tTDCEO4Y5kt5rnWr7ujfCLcc+SKf0IA9L9wIY8c2mxu6XIA

category:cq
theme:md-arrays
skill-level:expert
cost:large

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Apr 22, 2020
@kunalspathak
Copy link
Member

Related: #35056

@jkotas
Copy link
Member

jkotas commented Apr 23, 2020

The JIT doesn't appear to be eliding bounds checks against Array.GetLength()

Multidimensional arrays have lower-bounds. It makes eliding bounds checks against Array.GetLength() impossible (without loop cloning for arrays with zero lower bounds). It would need to be against Array.GetLowerBound/GetUpperBound.

@tannergooding
Copy link
Member Author

It looks like the same issues exist with GetLowerBound and GetUpperBound. They are calls when they could be direct reads (like is done for the bounds checks) and there is no eliding happening.

@BruceForstall
Copy link
Member

@jkotas It's my understanding that neither C# nor VB allow non-zero lower bounds. Could we consider a breaking change to .NET Core to eliminate the non-zero lower bounds for multi-dimensional arrays?

@jkotas
Copy link
Member

jkotas commented Apr 24, 2020

I think we can entertain that option. We would need data about the impact and benefits. https://apisof.net/catalog/System.Array.CreateInstance(Type,Int32(),Int32()) shows these arrays may be used by 3.2% apps.

@svick
Copy link
Contributor

svick commented Apr 24, 2020

If completely eliminating non-zero-based multidimensional arrays is being considered, a softer breaking change could be just to prevent casting them to zero-based multidimensional arrays. This way, such arrays could still be used as Array, but would no longer be valid C# multidimensional arrays, so bounds check elimination when GetLength() is used would become possible for those.

E.g. consider this code:

var a = (int[,])Array.CreateInstance(typeof(int), new[] { 1, 1 }, new[] { 1, 1 });
_ = a[0,0];

The relevant parts of the IL are:

…
call class [System.Private.CoreLib]System.Array [System.Private.CoreLib]System.Array::CreateInstance(class [System.Private.CoreLib]System.Type, int32[], int32[])
castclass int32[0..., 0...]
…
call instance int32 int32[0..., 0...]::Get(int32, int32)

Today, this throws at the call to the indexer/Get: notice that even though castclass specifies a zero-based array, this is ignored and the cast succeeds.

If non-zero-based arrays were eliminated, this would presumably already throw at the call to CreateInstance.

Under my suggestion, this would throw at the cast to int[,].

@jkotas
Copy link
Member

jkotas commented Apr 24, 2020

The multidimensional arrays with and without lower bounds use the same type. The lower bounds in the signatures have no meaning at runtime, they are for compile-time only (e.g. like nullable annotations). Preventing casting would require introducing a distinct type for these that seems hard to reason about.

If we were to consider introducing new types, we may also look at multidimensional spans (e.g. Span2, Span3, Span4, 5+ dimensions are probably too rare to bother with) that would also address the problem with accessing unmanaged multidimensional memory efficiently.

@AndyAyersMS
Copy link
Member

See also #5481. We don't recognize and optimize invariant parts of md array address computations early enough; doing so would at least bring perf back in line with what one could get with jit64. Impact of that is probably more significant than eliding bounds checks.

I'm going to mark this as future. We may have some opportunity to look at improvements here for 5.0, but not sure yet if this will make the cut.

@AndyAyersMS AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label May 6, 2020
@AndyAyersMS AndyAyersMS added this to the Future milestone May 6, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@kunalspathak kunalspathak modified the milestones: Future, 6.0.0 Nov 30, 2020
@kunalspathak kunalspathak removed the JitUntriaged CLR JIT issues needing additional triage label Nov 30, 2020
@kunalspathak
Copy link
Member

This need to happen in .NET 6.0

@JulieLeeMSFT JulieLeeMSFT added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Mar 23, 2021
@kunalspathak kunalspathak modified the milestones: 6.0.0, Future Apr 1, 2021
@kunalspathak
Copy link
Member

I don't think we will have bandwidth to do this in .NET 6.0. Marking this for future.

@JulieLeeMSFT JulieLeeMSFT removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Jun 7, 2021
@BruceForstall
Copy link
Member

This is addressed in #70271. There are follow-up items there to continue to improve MD array performance.

@ghost ghost locked as resolved and limited conversation to collaborators Aug 5, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
Archived in project
Development

No branches or pull requests

8 participants