Skip to content

Conversation

angsch
Copy link
Collaborator

@angsch angsch commented Jun 11, 2023

Following the removal of LABAD, it looks reasonable to remove the comment warning about non IEEE 754 compliant machines. The machines mentioned in the comment date back to the 80s.

Besides deleting the comments in LAED3, also the call to LAMC3 is removed. The purpose of LAMC3 is to add 2 real numbers without compiler optimization. LAMC3 appears to me like a outdated function that should be fully removed from LAPACK. Besides LAED3, there are a bunch of other routines such as SLALS0 where LAMC3 is called. None of these explain explicitly why LAMC3 is required. I think that these occurrences serve the same reason as in LAED3, ie avoid accuracy problems on non-IEEE compliant machines. Do you agree?

@langou
Copy link
Contributor

langou commented Jun 15, 2023

Hi Angelika,

From Jim Demmel:

Looking at all the uses of slamc3 in SRC code, most of them are indeed motivated by non-IEEE754 machines, like long-gone Crays. But both slals0 and slasd8 appear to use it to enforce parentheses in an expression of the form (x+y)+z, where x, y and z are all different, so not (x+x)-x, which is used to protect against non-IEEE754 arithmetic, by using (x+x)-x to (sometimes) zero-out the bottom bit of x. Presumably, if the compiler is asked to aggressively optimize, we can't guarantee that (x+y)+z respects parentheses, so we should keep slamc3. Of course, if the compiler in-lines slamc3 and then ignores parentheses, something could go wrong. I'm not sure if compilers are likely to be this aggressive.

SLASD8 uses SLAMC3 in two places:
(1) to enforce parentheses in a way worth keeping, and
(2) to deal with non-IEEE arithmetic.
So we only want to keep case (1).

Cheers,
Julien

langou
langou previously approved these changes Jun 15, 2023
@langou
Copy link
Contributor

langou commented Jun 15, 2023

So based on Jim's comments, it looks like all the changes of Angelika are great. 👍

The changes to (S/D)LAED3 are good.

Then I think to be consistent, we might, in addition, want to do the "same" changes to (S/D)LAED9 and (S/D)LASD3.

This would leave the use of (S/D)LAMC3 in (S/D)LALS0 and (S/D)LASD8. Which is what, I think, we want.

This agrees with Jim's email.

So here are a few more suggested changes.

(S/D)LAED9:
Remove these three lines of code:

lapack/SRC/dlaed9.f

Lines 233 to 235 in a83d8d2

DO 10 I = 1, N
DLAMDA( I ) = DLAMC3( DLAMDA( I ), DLAMDA( I ) ) - DLAMDA( I )
10 CONTINUE

And then DLAMDA (sic.) would be INPUT (as opposed to IN/OUT), but it is already declared as INPUT only (by mistake), so nothing to change then:
*> \param[in] DLAMDA

(S/D)LASD3:
Remove these three lines of code:

lapack/SRC/dlasd3.f

Lines 330 to 332 in a83d8d2

DO 20 I = 1, K
DSIGMA( I ) = DLAMC3( DSIGMA( I ), DSIGMA( I ) ) - DSIGMA( I )
20 CONTINUE

And then DSIGMA would be INPUT (as opposed to IN/OUT), so need to edit this line of comment:
*> \param[in,out] DSIGMA

@langou
Copy link
Contributor

langou commented Jun 15, 2023

If possible, since we are editing (S/D)LAED3 and the likes, maybe we can change DLAMDA to DLAMBDA?

A grep of LAMDA reveals that we have LAMDA in the following ten files:

  • (C/D/S/Z)LAED8
  • (D/S)LAED2
  • (D/S)LAED3
  • (D/S)LAED9

So, it would, be nice to change all these LAMDA to LAMBDA.

@langou
Copy link
Contributor

langou commented Jun 15, 2023

@angsch writes

Besides LAED3, there are a bunch of other routines such as SLALS0 where LAMC3 is called. None of these explain explicitly why LAMC3 is required.

The fact that we are lacking comments is a good point.

Maybe, in (S/D)LALS0 and (S/D)LASD8, just before calling LACM3, we add something like:

We use LACM3 to perform (x+y)+z to try to enforce the correct parenthesis. The goal is to try to avoid the compiler doing x+(y+z).

And, in (S/D)LAMC3, we write

This routine is currently used by (S/D)LALS0 and (S/D)LASD8. This routine is used to to try to enforce the correct parenthesis in the expression (x+y)+z. The goal is to try to avoid the compiler doing x+(y+z).

@angsch
Copy link
Collaborator Author

angsch commented Jun 15, 2023

Thank you, Julien, for following up and checking with Prof Demmel 👍. It really makes sense to fix LAED9 and LASD3, too. I will add a commit with the changes you suggest tomorrow.

@angsch
Copy link
Collaborator Author

angsch commented Jun 16, 2023

Summary of changes:

  • The initial commit missed some comments on non-IEEE 743 compliant machines.
  • Rename lamba to lambda, change formatting to meet the line length limit when needed
  • Remove calls to LAMC3 when used as a workaround for Cray machines
  • Update the documentation when the removal of LAMC3 leaves the eigenvalues as in rather than inout
  • Add comment on the purpose of LAMC3 where it used to enforce the execution order

@angsch
Copy link
Collaborator Author

angsch commented Jun 16, 2023

We are getting a bunch of errors with CMake due to missing executables. At least some appear to be related to the CBLAS suffix change.

 80/123 Test  #28: example2_64_CBLAS ................***Not Run   0.00 sec
        Start  63: LAPACK-xeigtstd_dsg_in
Unable to find executable: /home/runner/work/lapack/lapack/build/bin/xexample2_64_CBLAS

@langou
Copy link
Contributor

langou commented Jun 21, 2023

Thanks. I reverted #856. Can you try again to run the test?

angsch added 3 commits June 21, 2023 18:01
This comment is irrelevant on any machine that realizes
IEEE 754.
* Rename dlamda -> dlambda
* Remove those calls to lamc3 that are a workaround for
  old Cray machines
* Document the purpose of the remaining lamc3 calls
* Update documentation where dlambda has become an input
  rather than input & output

Thanks to @langou and Prof Demmel for investigating
the purpose of the lamc3 calls and suggesting
a solution.
@angsch angsch force-pushed the clean-no-guard-digit branch from 991c965 to e186ae4 Compare June 21, 2023 16:07
@codecov
Copy link

codecov bot commented Jun 21, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (ccdc1a5) 0.00% compared to head (8901f33) 0.00%.

❗ Current head 8901f33 differs from pull request most recent head e186ae4. Consider uploading reports for the commit e186ae4 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #852   +/-   ##
=======================================
  Coverage    0.00%    0.00%           
=======================================
  Files        1916     1916           
  Lines      188509   188485   -24     
=======================================
+ Misses     188509   188485   -24     
Impacted Files Coverage Δ
SRC/cgelsd.f 0.00% <ø> (ø)
SRC/cgesdd.f 0.00% <ø> (ø)
SRC/chbevd.f 0.00% <ø> (ø)
SRC/chbevd_2stage.f 0.00% <ø> (ø)
SRC/chbgvd.f 0.00% <ø> (ø)
SRC/cheevd.f 0.00% <ø> (ø)
SRC/cheevd_2stage.f 0.00% <ø> (ø)
SRC/chegvd.f 0.00% <ø> (ø)
SRC/chpevd.f 0.00% <ø> (ø)
SRC/chpgvd.f 0.00% <ø> (ø)
... and 64 more

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@angsch
Copy link
Collaborator Author

angsch commented Jun 21, 2023

Rebasing solved the CI problems. 👍

@langou langou merged commit 934313c into Reference-LAPACK:master Jun 21, 2023
@angsch angsch deleted the clean-no-guard-digit branch June 21, 2023 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants