in-place (scaled) matrix transposition: imatcopy #1017

Open
rileyjmurray opened this issue May 21, 2024 · 6 comments

@rileyjmurray

Intel MKL has a useful function for (scaled) in-place transposition:

https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-fortran/2024-1/mkl-imatcopy.html

I raised the issue of having this function as a utility in LAPACK proper, and Julian (@langou) expressed support for that. I'm making a note of it here so we don't lose track of this.

@thijssteel
Collaborator

I had a look at in-place matrix transposes at some point, but they are far from trivial to implement.

If the matrix is square, it is just a question of swapping A(i,j) and A(j,i). A simple recursive implementation is even reasonably efficient.
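
For illustration, a minimal sketch of the square case, assuming column-major storage in a flat list with leading dimension `lda`. The function name and the `alpha` scaling argument are hypothetical and only mirror the "scaled" part of the request; this is not a proposed LAPACK interface.

```python
def transpose_square_inplace(a, n, lda, alpha=1.0):
    """Overwrite the n-by-n column-major matrix stored in `a` with alpha * A^T.

    A(i, j) lives at a[i + j * lda]; rows n..lda-1 of each column are padding.
    """
    for j in range(n):
        # Diagonal entries do not move; they are only scaled.
        a[j + j * lda] *= alpha
        for i in range(j + 1, n):
            upper = a[j + i * lda]  # A(j, i)
            lower = a[i + j * lda]  # A(i, j)
            a[i + j * lda] = alpha * upper
            a[j + i * lda] = alpha * lower
    return a
```

Each off-diagonal pair is visited once, so no workspace beyond two temporaries is needed.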

If the matrix is not square, the cycle lengths can be longer, so we almost certainly require some memory (not just for an efficient implementation, but for a reference too).
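
A rough sketch of the non-square case, under the assumption that the m-by-n matrix is stored contiguously in column-major order (so `lda = m` on input and `ldb = n` on output). The entry at flat position `p` moves to `(p * n) mod (m*n - 1)`, and the code follows each such cycle once. The function name is hypothetical, and the one-byte-per-entry flag array stands in for the workspace mentioned above; more economical schemes exist.

```python
def transpose_inplace_cycles(a, m, n, alpha=1.0):
    """Overwrite a contiguous column-major m-by-n matrix with alpha * A^T.

    On return `a` holds the n-by-m result, again column-major and contiguous.
    """
    size = m * n
    visited = bytearray(size)  # workspace: one flag per matrix entry
    for start in range(size):
        if visited[start]:
            continue
        p = start
        value = a[p]  # value currently being carried along the cycle
        while True:
            visited[p] = 1
            # Destination of the entry at p; the last entry is a fixed point.
            q = p if p == size - 1 else (p * n) % (size - 1)
            # Drop the carried value at q and pick up what was stored there.
            value, a[q] = a[q], alpha * value
            p = q
            if p == start:
                break
    return a
```

For a 2-by-3 input the permutation has one cycle of length four plus two fixed points, which already shows why a plain pairwise swap is not enough.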

@ilayn
Contributor

ilayn commented May 22, 2024

Indeed, a cache-oblivious transpose is really hard to get right, and it involves quite challenging low-level decisions. But I would be really happy if we get it. I don't expect it to be as performant as the copy-transpose, but there is still a lot to gain compared to a double loop.
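
To make the recursive idea concrete, here is a divide-and-conquer sketch for the square case in the cache-oblivious style. All names and the `cutoff` parameter are hypothetical. Python lists will not show any cache benefit, so this only illustrates the recursion structure, not the low-level tuning a real kernel would need.

```python
def _swap_blocks(a, lda, r0, c0, nr, nc, cutoff):
    """Swap the off-diagonal block A[r0:r0+nr, c0:c0+nc] with its mirror image."""
    if nr <= cutoff and nc <= cutoff:
        for j in range(c0, c0 + nc):
            for i in range(r0, r0 + nr):
                p, q = i + j * lda, j + i * lda
                a[p], a[q] = a[q], a[p]
    elif nr >= nc:
        h = nr // 2  # split the longer dimension (rows)
        _swap_blocks(a, lda, r0, c0, h, nc, cutoff)
        _swap_blocks(a, lda, r0 + h, c0, nr - h, nc, cutoff)
    else:
        h = nc // 2  # split the longer dimension (columns)
        _swap_blocks(a, lda, r0, c0, nr, h, cutoff)
        _swap_blocks(a, lda, r0, c0 + h, nr, nc - h, cutoff)


def _transpose_diag(a, lda, k0, n, cutoff):
    """Transpose the diagonal block A[k0:k0+n, k0:k0+n] in place."""
    if n <= cutoff:
        for j in range(k0, k0 + n):
            for i in range(j + 1, k0 + n):
                p, q = i + j * lda, j + i * lda
                a[p], a[q] = a[q], a[p]
    else:
        h = n // 2
        _transpose_diag(a, lda, k0, h, cutoff)
        _transpose_diag(a, lda, k0 + h, n - h, cutoff)
        # The lower-left block lies strictly below the diagonal.
        _swap_blocks(a, lda, k0 + h, k0, n - h, h, cutoff)


def transpose_recursive(a, n, lda, cutoff=16):
    """Transpose an n-by-n column-major matrix in place by recursive halving."""
    _transpose_diag(a, lda, 0, n, max(1, cutoff))
    return a
```

The `cutoff` controls where recursion hands over to the plain double loop; picking it well is exactly the kind of low-level decision that makes a tuned version hard.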

@mgates3
Contributor

mgates3 commented May 22, 2024

copy is an odd name and description for an in-place operation. From Intel's docs, "A transposition operation can be a normal matrix copy, a transposition, a conjugate transposition, or just a conjugation. The operation is defined as follows:

AB := alpha*op(AB)."

(emphasis mine)

Since the source and destination are both AB, I gather that "a normal matrix copy" (trans=N) is basically changing the stride in place, from lda to ldb? (And applying scaling.)
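
If that reading is right, the `trans=N` case could look roughly like the sketch below: an in-place change of leading dimension with scaling, for column-major storage. This is an assumption about the semantics, not Intel's implementation, and the function name is hypothetical; only `alpha`, `lda`, and `ldb` come from the quoted description.

```python
def restride_inplace(ab, m, n, alpha, lda, ldb):
    """Move an m-by-n column-major matrix in `ab` from leading dimension lda
    to ldb, scaling every entry by alpha. `ab` must hold max(lda, ldb) * n items.
    """
    if lda < m or ldb < m:
        raise ValueError("leading dimensions must be at least m")
    if ldb <= lda:
        # Entries move toward the front: walk forward so that every source
        # is read before any write can reach it.
        for j in range(n):
            for i in range(m):
                ab[i + j * ldb] = alpha * ab[i + j * lda]
    else:
        # Entries move toward the back: walk backward for the same reason.
        for j in reversed(range(n)):
            for i in reversed(range(m)):
                ab[i + j * ldb] = alpha * ab[i + j * lda]
    return ab
```

The traversal direction is the only subtle part: it guarantees an entry is never overwritten before it has been read, so no workspace is needed for this case.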

Gustavson has several papers on in-place transpose, e.g., https://doi.org/10.1145/2168773.2168775. Something akin to this is implemented in PLASMA.

@rileyjmurray
Author

@mgates3, while I agree that imatcopy is not a great name, it's a fairly common one:

In situations like these I think it's preferable to go with the crowd, and just have good documentation about what can be done with the function.

@thijssteel and @ilayn: it'd be fine with me to use some workspace. When I said in-place, I really just meant "don't force the user to make a full copy."

@mgates3
Contributor

mgates3 commented May 24, 2024

@rileyjmurray Respectfully, I disagree. If Reference BLAS / LAPACK is a (de facto) standard, it should be reasonably self-consistent. imatcopy and omatcopy don't follow the BLAS naming scheme in any way, though the arguments seem conforming. If we propose a standard name, it should be trivial for libraries that have imatcopy to add a new name.

This goes to my point on gemmt and batch gemm, that we as a community need a mechanism to propose and adopt new standard routines.

@rileyjmurray
Author

@mgates3, fair enough! I don't actually have a strong opinion on what the function should be called.
