bot: status fails in a setup with multiple bot instances configured to build for (partially) different architectures #256

Open
trz42 opened this issue Feb 24, 2024 · 1 comment


trz42 commented Feb 24, 2024

Below is part of the log on Saga where we run the bot for NESSI. The bot also runs on eX3 and AWS where it is configured to build for partially different targets.

[20240224-T07:03:17] get_architecture_targets(): arch target map '{"linux/x86_64/intel/skylake_avx512": "--partition=normal --exclude=c1-47,c2-3,c10-[1-60],c11-[1-60]", "linux/x86_64/intel/broadwell": "--partition=hugemem --mem=120G --ntasks-per-node=20 --time=1-00:00:00", "linux/x86_64/generic": "--partition=normal"}'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-intel-skylake_avx512' not found in first line 'New job on instance `AWS-MC-NESSI` for architecture `x86_64-generic` for repository `nessi-2023.06-swl-deb11` in job dir `/project/def-nessi/SHARED/jobs/2024.02/pr_281/6520`'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-intel-broadwell' not found in first line 'New job on instance `AWS-MC-NESSI` for architecture `x86_64-generic` for repository `nessi-2023.06-swl-deb11` in job dir `/project/def-nessi/SHARED/jobs/2024.02/pr_281/6520`'
[20240224-T07:03:17] get_architecture_targets(): arch target map '{"linux/x86_64/intel/skylake_avx512": "--partition=normal --exclude=c1-47,c2-3,c10-[1-60],c11-[1-60]", "linux/x86_64/intel/broadwell": "--partition=hugemem --mem=120G --ntasks-per-node=20 --time=1-00:00:00", "linux/x86_64/generic": "--partition=normal"}'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-intel-skylake_avx512' not found in first line 'New job on instance `eX3-NESSI` for architecture `aarch64-generic` for repository `nessi-2023.06-swl-deb10` in job dir `/home/thomarob/pilot.nessi.no/jobs/2024.02/pr_281/135494`'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-intel-broadwell' not found in first line 'New job on instance `eX3-NESSI` for architecture `aarch64-generic` for repository `nessi-2023.06-swl-deb10` in job dir `/home/thomarob/pilot.nessi.no/jobs/2024.02/pr_281/135494`'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-generic' not found in first line 'New job on instance `eX3-NESSI` for architecture `aarch64-generic` for repository `nessi-2023.06-swl-deb10` in job dir `/home/thomarob/pilot.nessi.no/jobs/2024.02/pr_281/135494`'
[20240224-T07:03:17] get_architecture_targets(): arch target map '{"linux/x86_64/intel/skylake_avx512": "--partition=normal --exclude=c1-47,c2-3,c10-[1-60],c11-[1-60]", "linux/x86_64/intel/broadwell": "--partition=hugemem --mem=120G --ntasks-per-node=20 --time=1-00:00:00", "linux/x86_64/generic": "--partition=normal"}'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-intel-skylake_avx512' not found in first line 'New job on instance `AWS-MC-NESSI` for architecture `x86_64-amd-zen2` for repository `nessi-2023.06-swl-deb11` in job dir `/project/def-nessi/SHARED/jobs/2024.02/pr_281/6521`'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-intel-broadwell' not found in first line 'New job on instance `AWS-MC-NESSI` for architecture `x86_64-amd-zen2` for repository `nessi-2023.06-swl-deb11` in job dir `/project/def-nessi/SHARED/jobs/2024.02/pr_281/6521`'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-generic' not found in first line 'New job on instance `AWS-MC-NESSI` for architecture `x86_64-amd-zen2` for repository `nessi-2023.06-swl-deb11` in job dir `/project/def-nessi/SHARED/jobs/2024.02/pr_281/6521`'
[20240224-T07:03:17] get_architecture_targets(): arch target map '{"linux/x86_64/intel/skylake_avx512": "--partition=normal --exclude=c1-47,c2-3,c10-[1-60],c11-[1-60]", "linux/x86_64/intel/broadwell": "--partition=hugemem --mem=120G --ntasks-per-node=20 --time=1-00:00:00", "linux/x86_64/generic": "--partition=normal"}'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-intel-skylake_avx512' not found in first line 'New job on instance `Saga-NESSI` for architecture `x86_64-intel-broadwell` for repository `nessi-2023.06-swl-deb11` in job dir `/cluster/projects/nn9992k/pilot.nessi.no/jobs/2024.02/pr_281/10753816`^M'
[20240224-T07:03:17] request_bot_build_issue_comments(): target_arch 'x86_64-generic' not found in first line 'New job on instance `Saga-NESSI` for architecture `x86_64-intel-broadwell` for repository `nessi-2023.06-swl-deb11` in job dir `/cluster/projects/nn9992k/pilot.nessi.no/jobs/2024.02/pr_281/10753816`^M'
[20240224-T07:03:18] get_architecture_targets(): arch target map '{"linux/x86_64/intel/skylake_avx512": "--partition=normal --exclude=c1-47,c2-3,c10-[1-60],c11-[1-60]", "linux/x86_64/intel/broadwell": "--partition=hugemem --mem=120G --ntasks-per-node=20 --time=1-00:00:00", "linux/x86_64/generic": "--partition=normal"}'
[20240224-T07:03:18] request_bot_build_issue_comments(): target_arch 'x86_64-intel-broadwell' not found in first line 'New job on instance `Saga-NESSI` for architecture `x86_64-intel-skylake_avx512` for repository `nessi-2023.06-swl-deb11` in job dir `/cluster/projects/nn9992k/pilot.nessi.no/jobs/2024.02/pr_281/10753817`'
[20240224-T07:03:18] request_bot_build_issue_comments(): target_arch 'x86_64-generic' not found in first line 'New job on instance `Saga-NESSI` for architecture `x86_64-intel-skylake_avx512` for repository `nessi-2023.06-swl-deb11` in job dir `/cluster/projects/nn9992k/pilot.nessi.no/jobs/2024.02/pr_281/10753817`'
[20240224-T07:03:18] get_architecture_targets(): arch target map '{"linux/x86_64/intel/skylake_avx512": "--partition=normal --exclude=c1-47,c2-3,c10-[1-60],c11-[1-60]", "linux/x86_64/intel/broadwell": "--partition=hugemem --mem=120G --ntasks-per-node=20 --time=1-00:00:00", "linux/x86_64/generic": "--partition=normal"}'
[20240224-T07:03:18] request_bot_build_issue_comments(): target_arch 'x86_64-intel-skylake_avx512' not found in first line 'New job on instance `Saga-NESSI` for architecture `x86_64-intel-broadwell` for repository `nessi-2023.06-swl-deb11` in job dir `/cluster/projects/nn9992k/pilot.nessi.no/jobs/2024.02/pr_281/10753824`'
[20240224-T07:03:18] request_bot_build_issue_comments(): target_arch 'x86_64-generic' not found in first line 'New job on instance `Saga-NESSI` for architecture `x86_64-intel-broadwell` for repository `nessi-2023.06-swl-deb11` in job dir `/cluster/projects/nn9992k/pilot.nessi.no/jobs/2024.02/pr_281/10753824`'
[20240224-T07:03:18] Unexpected err=list index out of range, type(err)=<class 'IndexError'>
[20240224-T07:03:18] WARNING: A crash occurred!
Traceback (most recent call last):
  File "/cluster/projects/nn9992k/pilot.nessi.no/venv_bot_p39/lib/python3.9/site-packages/pyghee/lib.py", line 170, in process_event
    self.handle_event(event_info, log_file=log_file)
  File "/cluster/projects/nn9992k/pilot.nessi.no/venv_bot_p39/lib/python3.9/site-packages/pyghee/lib.py", line 102, in handle_event
    handler(event_info, log_file=log_file)
  File "/cluster/projects/nn9992k/pilot.nessi.no/eessi-bot-software-layer/eessi_bot_event_handler.py", line 222, in handle_issue_comment_event
    update = self.handle_bot_command(event_info, cmd)
  File "/cluster/projects/nn9992k/pilot.nessi.no/eessi-bot-software-layer/eessi_bot_event_handler.py", line 399, in handle_bot_command
    return handler(event_info, bot_command)
  File "/cluster/projects/nn9992k/pilot.nessi.no/eessi-bot-software-layer/eessi_bot_event_handler.py", line 504, in handle_bot_command_status
    comment_status += f"\n|{status_table['arch'][x]}|"
IndexError: list index out of range

One idea for a fix is to split the first line of the comment for a new job into the following parts, each configurable via app.cfg:

  • a prefix: `New job`
  • information about the instance: `on instance {instance}`
  • information about the architecture: `for architecture {architecture}`
  • information about the repository: `for repository {repository}`
  • information about the job directory: `in job dir {job_working_directory}`
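As a rough sketch of the split described above, the first line could be assembled from one template per part. The dictionary keys and the helper `build_new_job_line` are hypothetical names used only for illustration; they are not existing bot settings or functions, and how the templates would actually be stored in app.cfg is left open.

```python
# Hypothetical templates; in the bot they would be read from app.cfg.
# The keys below are illustrative only, not existing configuration options.
NEW_JOB_TEMPLATES = {
    "prefix": "New job",
    "instance": "on instance `{instance}`",
    "architecture": "for architecture `{architecture}`",
    "repository": "for repository `{repository}`",
    "job_dir": "in job dir `{job_working_directory}`",
}

# Order in which the parts appear in the first line of the comment.
PART_ORDER = ("prefix", "instance", "architecture", "repository", "job_dir")


def build_new_job_line(templates, **values):
    """Join the configured parts, in order, into the first line of a job comment."""
    # str.format ignores keyword arguments a template does not use,
    # so every template can be given the full set of values.
    return " ".join(templates[key].format(**values) for key in PART_ORDER)


line = build_new_job_line(
    NEW_JOB_TEMPLATES,
    instance="Saga-NESSI",
    architecture="x86_64-intel-broadwell",
    repository="nessi-2023.06-swl-deb11",
    job_working_directory="/cluster/projects/nn9992k/pilot.nessi.no/jobs/2024.02/pr_281/10753816",
)
print(line)
```

With the default templates shown here, the result has the same shape as the first lines quoted in the log above, so existing comments would presumably still be recognised.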

Then, the function that parses the comments could use these parts to

  • check whether a comment belongs to the instance that is processing the `bot: status` command
  • (as before) check whether a comment belongs to the architecture that is being processed (the list of architectures is still generated from arch_target_map)
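
The two checks above could look roughly like the following. This is only a sketch: the regular expression assumes the first-line format seen in the log excerpt, the helpers `parse_new_job_line`, `target_arch_names` and `belongs_to` are hypothetical, and the conversion from `arch_target_map` keys (`linux/x86_64/generic`) to names such as `x86_64-generic` is inferred from the log rather than taken from the bot's code.

```python
import re

# Assumed first-line format, taken from the log excerpt in this issue.
NEW_JOB_RE = re.compile(
    r"^New job on instance `(?P<instance>[^`]+)`"
    r" for architecture `(?P<architecture>[^`]+)`"
    r" for repository `(?P<repository>[^`]+)`"
    r" in job dir `(?P<job_dir>[^`]+)`"
)


def parse_new_job_line(first_line):
    """Return the parts of a 'New job' line as a dict, or None if it does not match."""
    # strip() also drops a trailing carriage return (shown as ^M in the log).
    match = NEW_JOB_RE.match(first_line.strip())
    return match.groupdict() if match else None


def target_arch_names(arch_target_map):
    """Turn keys like 'linux/x86_64/generic' into names like 'x86_64-generic'."""
    # Assumption: the OS prefix is dropped and '/' becomes '-'.
    return {key.split("/", 1)[1].replace("/", "-") for key in arch_target_map}


def belongs_to(first_line, instance, arch_target_map):
    """True if the comment came from `instance` and targets a configured architecture."""
    parts = parse_new_job_line(first_line)
    if parts is None:
        return False
    return (
        parts["instance"] == instance
        and parts["architecture"] in target_arch_names(arch_target_map)
    )
```

Comparing the instance first means that a comment from `AWS-MC-NESSI` for `x86_64-generic` would be skipped on Saga even though `x86_64-generic` is also one of Saga's targets.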

trz42 commented Feb 24, 2024

Changed difficulty to medium because a change as outlined in the description would require carefully checking whether other functions have to be changed too, e.g., functions that identify a comment based on what it contains.
