Setting default JAVA_TOOL_OPTIONS #77

Closed
mboisson opened this issue Jun 4, 2021 · 13 comments

@mboisson
Member

mboisson commented Jun 4, 2021

Java has common memory issues, documented here:
https://docs.computecanada.ca/wiki/Java#Memory_Issues

The question is: would anything break if we set this by default?

export _JAVA_OPTIONS="-Xms256m -Xmx2g"
@poquirion

This overrides the JVM options entered on the command line, so yes, it would break things. However, it could be good to have it by default on the login nodes only, not on the compute nodes. I do not know how difficult that would be to implement.

Also note that as soon as cgroups are implemented on the login nodes, replacing the current ulimit-based RAM confinement, this Java memory problem will go away.

@ostueker
Member

ostueker commented Jun 6, 2021

Indeed, if _JAVA_OPTIONS takes precedence over command line options, we shouldn’t set it by default.

This thread on StackOverflow discusses this variable and also concludes that _JAVA_OPTIONS takes precedence over command line options and, furthermore, is undocumented.

However, there is also the (documented) variable JAVA_TOOL_OPTIONS, which has lower precedence than command line options and could be used for our purpose.

@poquirion : Setting _JAVA_OPTIONS only on login nodes won’t work, because environment variables from the user’s sessions will propagate to the jobs.
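
The propagation point can be illustrated with a minimal sketch. A child bash process stands in for a batch job here, which is an assumption made for illustration; on a Slurm cluster the same inheritance happens because sbatch exports the submitting environment by default. The second command shows that a variable can be stripped for a single command with `env -u`, leaving the session itself untouched.

```shell
#!/usr/bin/env bash
# Illustration only: a child bash process stands in for a batch job.
export _JAVA_OPTIONS="-Xms256m -Xmx2g"

# The child process inherits the value exported by the parent session.
inherited=$(bash -c 'printf "%s" "${_JAVA_OPTIONS-<unset>}"')
echo "child sees: $inherited"

# Remove the variable for one command only; the parent session keeps it.
scrubbed=$(env -u _JAVA_OPTIONS bash -c 'printf "%s" "${_JAVA_OPTIONS-<unset>}"')
echo "child with env -u sees: $scrubbed"
```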

@mboisson
Member Author

mboisson commented Jun 7, 2021

Thanks @ostueker and @poquirion, that's good feedback. I would never have imagined an environment variable overriding command line options.

Changing the issue to define JAVA_TOOL_OPTIONS instead.

@mboisson mboisson changed the title Setting default _JAVA_OPTIONS Setting default JAVA_TOOL_OPTIONS Jun 7, 2021
@mboisson
Member Author

mboisson commented Jun 11, 2021

I tested by installing rJava on a login node, which fails with a Java OOM by default. Either

JAVA_TOOL_OPTIONS="-Xmx2g"

or

_JAVA_OPTIONS="-Xmx2g"

resolves the OOM. Setting -Xms is not necessary (though it probably does no harm). I therefore suggest setting

JAVA_TOOL_OPTIONS="-Xmx2g"

The question is then where to set it. Three options:

  1. In each of the Java modules (this requires redeploying every Java module).
  2. Globally, in the CCconfig.lua. This would avoid changing the behavior on non-CC systems. https://github.com/ComputeCanada/software-stack-custom/blob/main/modules/CCconfig.lua
  3. In the gentoo and nixpkgs modules. This changes it everywhere but avoids redeploying the Java modules.

@mboisson
Member Author

mboisson commented Jun 14, 2021

This is now deployed. I redeployed the Java modules (option 1), which seemed like the cleanest approach.

@mboisson
Member Author

mboisson commented Aug 9, 2021

Apparently, this causes this issue:
nextflow-io/nextflow#1716

@mboisson mboisson reopened this Aug 9, 2021
@clemgoub

Hello! My student and I are indeed encountering this problem now on beluga. Our Nextflow pipeline stopped working with the same error as nextflow-io/nextflow#1716

CAPSULE EXCEPTION: Could not parse version line: Picked up JAVA_TOOL_OPTIONS: -Xmx2g (for stack trace, run with -Dcapsule.log=verbose)
USAGE: <options> ../../cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/nextflow/20.10.0/bin/nextflow

Actions:
  capsule.version - Prints the capsule and application versions.
  capsule.modes - Prints all available capsule modes.
  capsule.jvms - Prints a list of all JVM installations found.
  capsule.help - Prints this help message.

Options:
  capsule.mode=<value> - Picks the capsule mode to run.
  capsule.reset - Resets the capsule cache before launching. The capsule to be re-extracted (if applicable), and other possibly cached files will be recreated.
  capsule.log=<value> (default: quiet) - Picks a log level. Must be one of none, quiet, verbose, or debug.
  capsule.java.home=<value> - Sets the location of the Java home (JVM installation directory) to use; If 'current' forces the use of the JVM that launched the capsule.
  capsule.java.cmd=<value> - Sets the path to the Java executable to use.
  capsule.jvm.args=<value> - Sets additional JVM arguments to use when running the application.
Unable to initialize nextflow environment

Setting JAVA_TOOL_OPTIONS="-Xmx2g" did not change the outcome. Unfortunately, I am not familiar enough with Java and Nextflow, so I am willing to hear from you! Thanks!

Clément

@mboisson
Member Author

@clemgoub, for now, the only workaround is to unset JAVA_TOOL_OPTIONS. This environment variable is required for other Java usage. Nextflow has a bug: it does not handle this variable being set.
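
To make the failure mode concrete, here is a rough sketch of what the error message suggests is happening: when JAVA_TOOL_OPTIONS is set, the JVM prints a "Picked up JAVA_TOOL_OPTIONS: ..." notice before its usual output, so a parser that expects the version on the first line reads the notice instead. This is a simulation, not Capsule's actual parsing code; `fake_java_version` is a hypothetical stand-in for the JVM, and the version string in it is only an illustrative example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the JVM's version output when JAVA_TOOL_OPTIONS is set.
# The first line is the notice quoted in the error above; the second line is an
# illustrative version string, not taken from this thread.
fake_java_version() {
  echo 'Picked up JAVA_TOOL_OPTIONS: -Xmx2g'
  echo 'openjdk version "11.0.2" 2019-01-15'
}

# Naive approach: assume the first line is the version line.
naive_line=$(fake_java_version | head -n 1)

# More tolerant approach: drop "Picked up ..." notices, then take the first line.
robust_line=$(fake_java_version | grep -v '^Picked up ' | head -n 1)

echo "naive:  $naive_line"
echo "robust: $robust_line"
```

The naive variant ends up with the notice, which is the text that appears after "Could not parse version line" in the error above.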

@clemgoub

clemgoub commented Aug 10, 2021

Thank you @mboisson; I tried to add JAVA_TOOL_OPTIONS='' to my sbatch file; I also added export JAVA_TOOL_OPTIONS='' and sourced my ~/.bashrc with no success. $JAVA_TOOL_OPTIONS is empty when echoed in my work environment though.
Would you mind walking me through the proper procedure? -- Thanks again!

EDIT: FIXED. Following @robsyme's advice, adding this to the sbatch file did the trick:

export NXF_OPTS=$JAVA_TOOL_OPTIONS
unset JAVA_TOOL_OPTIONS
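
A possible variation on the two lines above, for anyone who would rather not unset the variable for the whole job script: wrap only the Nextflow call. This is a sketch under assumptions, not something tested in this thread; `without_java_tool_options` is a hypothetical helper name, and the demonstration uses a harmless bash command in place of nextflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one command with JAVA_TOOL_OPTIONS removed from its
# environment and the previous value handed over as NXF_OPTS instead.
without_java_tool_options() {
  env -u JAVA_TOOL_OPTIONS NXF_OPTS="${JAVA_TOOL_OPTIONS-}" "$@"
}

# Demonstration with a harmless command standing in for nextflow.
export JAVA_TOOL_OPTIONS="-Xmx2g"
seen=$(without_java_tool_options bash -c 'printf "%s|%s" "${JAVA_TOOL_OPTIONS-unset}" "$NXF_OPTS"')
echo "$seen"
```

In a job script the call would look something like `without_java_tool_options nextflow run main.nf`, where `main.nf` is a placeholder pipeline name. Other Java tools started later in the same script would still see JAVA_TOOL_OPTIONS.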

@mboisson
Member Author

This will be fixed once I reinstall nextflow and patch it in place:
ComputeCanada/easybuild-computecanada-config@c4135d2

@clemgoub

Excellent, thanks all for the support!

@mboisson
Member Author

This is now fixed.

@clemgoub

Thanks a lot @mboisson!
