Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI on macOS failing due to IQTREE #382

Closed
tsibley opened this issue Jul 24, 2024 · 11 comments
Closed

CI on macOS failing due to IQTREE #382

tsibley opened this issue Jul 24, 2024 · 11 comments
Assignees
Labels
bug Something isn't working

Comments

@tsibley
Copy link
Member

tsibley commented Jul 24, 2024

IQTREE is triggering an illegal instruction:

ERROR: Shell exited from fatal signal SIGILL when running: iqtree2 -ntmax 1 -s results/aligned-delim.fasta -m GTR -ninit 2 -n 2 -me 0.05 -nt AUTO -redo > results/aligned-delim.iqtree.log
@tsibley tsibley added the bug Something isn't working label Jul 24, 2024
@tsibley
Copy link
Member Author

tsibley commented Jul 29, 2024

This seems persistent, so I suspect some issue with an IQ-TREE version change in Bioconda, which is used to setup and test an ambient runtime for CI.

Notably, the last successful CI run on 23 July got bioconda/osx-64::iqtree==2.3.5==h63051a7_1. The first failing CI run, just the next day on 24 July, got bioconda/osx-64::iqtree==2.3.5==h49642b6_2. The build number change from 1 → 2 potentially implicates the Bioconda recipe change. Both ran on macOS 14 arm64 runners, with the same runner image (20240714.2).

@tsibley
Copy link
Member Author

tsibley commented Jul 29, 2024

blab@blab-mac ~ % sw_vers
ProductName:	macOS
ProductVersion:	12.6.8
BuildVersion:	21G725

blab@blab-mac ~ % arch
arm64

blab@blab-mac ~ % CONDA_SUBDIR=osx-64 micromamba create -p /tmp/iqtree -c bioconda -c conda-forge 'bioconda/osx-64::iqtree 2.3.5 h49642b6_2'
Found conda-prefix at '/private/tmp/iqtree'. Overwrite?: [y/N] y
bioconda/osx-64                                             Using cache
bioconda/noarch                                             Using cache
conda-forge/osx-64                                          Using cache
conda-forge/noarch                                          Using cache

Transaction

  Prefix: /private/tmp/iqtree

  Updating specs:

   - bioconda/osx-64::iqtree==2.3.5=h49642b6_2


  Package        Version  Build       Channel           Size
──────────────────────────────────────────────────────────────
  Install:
──────────────────────────────────────────────────────────────

  + libcxx        18.1.8  hef8daea_1  conda-forge     Cached
  + llvm-openmp   18.1.8  h15ab845_0  conda-forge     Cached
  + iqtree         2.3.5  h49642b6_2  bioconda        Cached

  Summary:

  Install: 3 packages

  Total download: 0 B

──────────────────────────────────────────────────────────────


Confirm changes: [Y/n] y

Transaction starting
Linking libcxx-18.1.8-hef8daea_1
Linking llvm-openmp-18.1.8-h15ab845_0
Linking iqtree-2.3.5-h49642b6_2

Transaction finished

To activate this environment, use:

    micromamba activate /private/tmp/iqtree

Or to execute a single command in this environment, use:

    micromamba run -p /private/tmp/iqtree mycommand

blab@blab-mac ~ % file /tmp/iqtree/bin/iqtree2                                                                                         
/tmp/iqtree/bin/iqtree2: Mach-O 64-bit executable x86_64

blab@blab-mac ~ % /tmp/iqtree/bin/iqtree2                                                                                              
zsh: illegal hardware instruction  /tmp/iqtree/bin/iqtree2

blab@blab-mac ~ % echo $?
132

@tsibley
Copy link
Member Author

tsibley commented Jul 29, 2024

The h63051a7_1 build installed the same way as above doesn't trigger an illegal instruction.

@corneliusroemer I think you might want to look into this? It seems to me the build changes in bioconda/bioconda-recipes@f8218b4 broke the osx-64 build, at least on arm64 hardware. I can't test it on x86_64 hardware at the moment.

tsibley added a commit that referenced this issue Jul 29, 2024
The latest version will now use Conda's osx-arm64 channel on macOS when
available.  As it seems we're encountering issues with osx-64 packages
on arm64 hardware¹, let's see if this switch helps avoid those issues.

¹ <#382>
@corneliusroemer
Copy link
Member

Ah damn. I did see this illegal instruction in local tests but didn't connect the dots.

Need to disable AVX for the intel build so it can be emulated by Rosetta2.

This is similar to e.g. https://news.ycombinator.com/item?id=25547198

@tsibley tsibley self-assigned this Jul 30, 2024
@tsibley
Copy link
Member Author

tsibley commented Jul 30, 2024

@corneliusroemer Ah, that makes sense. Will you handle fixing the Bioconda iqtree build?

@tsibley
Copy link
Member Author

tsibley commented Jul 30, 2024

summarizing #384 (comment)

#384 fixes the illegal instruction for the ambient runtime tests, by causing the set up to use the osx-arm64 package instead of the osx-64 package. This avoids the AVX instructions in the latter that aren't emulated by macOS Rosetta 2.

However, we then run into the same illegal instructions in the Conda runtime tests since our nextstrain-base package is only published for osx-64 and incorporates the bad iqtree build.

There are two sustainable ways forward:

  1. Fix the Bioconda iqtree build to not generate AVX instructions for the osx-64 package, thus making it runnable under Rosetta.
  2. Release an osx-arm64 version of nextstrain-base, i.e. finish feat: Make osx-arm64 version of conda-base conda-base#80 and Use osx-arm64 packages on Apple silicon #379.

Notably, (1) requires no changes downstream and will fix the issue for anyone else who's installing osx-64 packages on arm64 hardware, unlike (2). But also, we want to do (2) anyway.

@corneliusroemer
Copy link
Member

@tsibley sorry for the disturbance I caused here. I've now disabled AVX for both iqtree and cmaple bioconda recipes, so this issue should also be fixed upstream. I've tested that I can run the new iqtree osx-64 build through Rosetta2.

As an aside, the latest macOS 15 (=Sequoia) currently in public beta will add Rosetta2 support for AVX. Won't have an impact here as we'll want to stay compatible with old OSes, but interesting coincidence.

Fixes:

@corneliusroemer
Copy link
Member

Fixed now after having triggering conda-base CI.

A third option would have been to exclude the offending iqtree build in conda-base recipe.

@tsibley
Copy link
Member Author

tsibley commented Jul 31, 2024

@corneliusroemer Awesome, thank you. 🙏 No worries about the breakage. It's hard to foresee.

A third option would have been to exclude the offending iqtree build in conda-base recipe.

That was the first thing I looked into, but I don't think it's possible to exclude a specific Conda package build, just a package version (which would have been too broad an exclusion). I tried a few things based on what I knew about package specs, was unsuccessful, then re-read the package specs spec and concluded it wasn't possible.

tsibley added a commit that referenced this issue Jul 31, 2024
The latest version will now use Conda's osx-arm64 channel on macOS when
available.  While the original reason for this change was because we
were encountering issues with osx-64 packages on arm64 hardware¹, this
upgrade alone wasn't enough to fix our issues.²  Still, let's upgrade
since v3 is the supported version, and it'll be good to use non-emulated
osx-arm64 binaries going forward anyway.

¹ <#382>
² <#384 (comment)>
@corneliusroemer
Copy link
Member

That was the first thing I looked into, but I don't think it's possible to exclude a specific Conda package build, just a package version (which would have been too broad an exclusion). I tried a few things based on what I knew about package specs, was unsuccessful, then re-read the package specs spec and concluded it wasn't possible.

That peaked my curiosity and I went down the rabbit hole looking into conda match specs to see if this was at all possible and the answer is yes and no.

In theory, match specs support regexes in build numbers, so to exclude a particular build, the following should work:

micromamba create -n matchspec2 'iqtree[build="^(?!h4.*_2).*$"]' --platform osx-64

This would exclude pretty much only the exact build in question (except possibly also the build number 2 of the next version in case the build string doesn't change).

In practice, conda and micromamba v1 have issues with regexes like that.

However, micromamba v2 will work with this.

Not sure about boa - since we use it as our build system that's what really matters here. Maybe worth bearing in mind to try out if we ever need to exclude a particular build in the future.

corneliusroemer added a commit to nextstrain/conda-base that referenced this issue Aug 1, 2024
@tsibley
Copy link
Member Author

tsibley commented Aug 1, 2024

@corneliusroemer Ah, nice, thanks for looking into that. I vaguely recall using/seeing the square-bracketed match selectors now. Where are those documented? I was looking at the package match specs docs. I'm also not sure the square-bracketed match selectors are valid within a recipe file; are they only supported on the command-line?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants