Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xGEQP3 changes July 2024 #1035

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

frjohnst
Copy link

This PR for xGEQP3 has two purposes:

  1. to a fix a work-space bug in DGEQP3, SGEQP3

  2. to add a quick return for M=0 to all 4 xGEQP3 routines
    since lower level routines currently return non-zero INFO
    when M=0, N>1, LWORK=1 even though the top level xGEQP3
    accept this combination.

These are described in more detail below:


  1. fix work-space bug in DGEQP3, SGEQP3

The computation of a block size in dgeqp3.f and sgeqp3.f
is based in a variable "SN", possibly smaller than the
input argument "N",

  +272           SN = N - NFXD
...
  +293                 MINWS = 2*SN + ( SN+1 )*NB
  +294                 IWS = MAX( IWS, MINWS )
  +295                 IF( LWORK.LT.MINWS ) THEN
  +296  *
  +297  *                 Not enough workspace to use optimal NB: Reduce NB and
  +298  *                 determine the minimum value of NB.
  +299  *
  +300                    NB = ( LWORK-2*SN ) / ( SN+1 )
"lapack/SRC/dgeqp3.f"

But subsequently, offsets into the "WORK" array, passed to lower
level routines, are based on "N" rather than "SN" which is
incorrect/inconsistent and can cause problems. For example:


  +335                 CALL DLAQPS( M, N-J+1, J-1, JB, FJB, A( 1, J ), LDA,
  +336       $                      JPVT( J ), TAU( J ), WORK( J ), WORK( N+J ),
  +337       $                      WORK( 2*N+1 ), WORK( 2*N+JB+1 ), N-J+1 )

dgeqp3.f and sgeqp3.f should be changed to use "N" instead of "SN" for the work-space
check (using "MINWS") and blocksize ("NB") calculation.

In other words,
MINWS = 2*SN + ( SN+1 )*NB
should be changed to
MINWS = 2* N + ( SN+1 )*NB
and
NB = ( LWORK-2*SN ) / ( SN+1 )
should be changed to
NB = ( LWORK-2* N ) / ( SN+1 )

in both files.


  1. add quick return for M=0 to all 4 xGEQP3 routines.

As currently coded, CGEQP3, DGEQP3, SGEQP3, ZGEQP3 all
accept M=0, N>1 with LWORK=1, but this combination causes
lower level routines (e.g. DORMQR in the case of DGEQP3)
to return non-zero INFO for insufficient work space.

For example when M=0 in DGEQP3, LWORK=1
passes the initial check (since IWS will be 1, not
greater than LWORK, which equals 1 in this example)

  +202        IF( INFO.EQ.0 ) THEN
  +203           MINMN = MIN( M, N )
  +204           IF( MINMN.EQ.0 ) THEN
  +205              IWS = 1
  +206              LWKOPT = 1
  +207           ELSE
  +208              IWS = 3*N + 1
  +209              NB = ILAENV( INB, 'DGEQRF', ' ', M, N, -1, -1 )
  +210              LWKOPT = 2*N + ( N + 1 )*NB
  +211           END IF
  +212           WORK( 1 ) = LWKOPT
  +213  *
  +214           IF( ( LWORK.LT.IWS ) .AND. .NOT.LQUERY ) THEN
  +215              INFO = -8
  +216           END IF
  +217        END IF
"lapack/SRC/dgeqp3.f"

If M=0, N>1 DORMQR will be called

  +259              CALL DORMQR( 'Left', 'Transpose', M, N-NA, NA, A, LDA,
  +260       $                   TAU,
  +261       $                   A( 1, NA+1 ), LDA, WORK, LWORK, INFO )
"lapack/SRC/dgeqp3.f"

but DORMQR returns non-zero INFO if N>1, LWORK=1, even though LWORK=1 passed
the check in DGEQP3: (in this example NW will equal N which is greater than
both 1 and LWORK)

  +214        IF( LEFT ) THEN
  +215           NQ = M
  +216           NW = MAX( 1, N )
  +217        ELSE
...
  +235        ELSE IF( LWORK.LT.NW .AND. .NOT.LQUERY ) THEN
  +236           INFO = -12
  +237        END IF
"lapack/SRC/dormqr.f"

To improve the end-user experience, We believe this should be
fixed by adding a quick return for the M=0 case to
all four xGEQP3 routines.

For example in DGEQP3, change (around line 242)

   10 CONTINUE
      NFXD = NFXD - 1
*
*     Factorize fixed columns
"lapack/SRC/dgeqp3.f"

to (add quick return for M=0)

   10 CONTINUE
*
*     Quick return if possible
*
      IF( M.EQ.0 ) RETURN
*
      NFXD = NFXD - 1
*
*     Factorize fixed columns

*
* Quick return if possible.
*
IF( M.EQ.0 ) RETURN
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use MINMN.EQ.0?

Copy link
Author

@frjohnst frjohnst Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used M.EQ.0 because I was trying to change the code as little as possible.
MINMN.EQ.0 would be equivalent to M.EQ.0 .OR. N.EQ.0.
The N=0 case appears to work with the existing code (NFXD is 0 when N=0 and most code is skipped).

But I have no objection to replacing M.EQ.0 with MINMN.EQ.0.
If this additional change is required, what would be the best way to accomplish this?
(Please forgive my ignorance, I am not very familiar with git.)

Could the PR be accepted in it's current form in order to pick up the work space fix and since it does function correctly as it is now? I would be happy to submit a second PR which changes M.EQ.0 to MINMN.EQ.0 in all 4 xGEQP3 files. Is that acceptable? (I am afraid I don't know how to update the contents of an open PR, but I can try to find out, if needed.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, perhaps I would only need to change M.EQ.0 to MINMN.EQ.0 and do a second
git commit ...
from the same branch? Do you want me to try that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants