feat: GWAS catalog top-hit DAG #33
Another note: we need to merge the change to use yarn as soon as possible. This will be done in #34.
@@ -0,0 +1,47 @@
dataproc:
  cluster_name: otg-tophit-gwascatalog
  autoscaling_policy: otg-gwascatalog-tophit
Nit: could we keep the naming consistent?
for step in top_hits_config["nodes"]:
    task = submit_gentropy_step(
        cluster_name=config["dataproc"]["cluster_name"],
        step_name=step["id"],
Just a note to myself: the 'params' need to include 'step:step_name', so the 'step_name' parameter should refer to the 'task_id' only.
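To make the note above concrete, here is a minimal sketch of that separation, not the code in this PR. It assumes a hypothetical node layout in which each entry of `nodes` carries an `id` (used only as the task ID) and a `params` mapping that itself holds the `step` key. The helper name `split_node` and the example values are illustrative assumptions.

```python
from typing import Any


def split_node(node: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return (task_id, params) for one DAG node.

    Assumed (hypothetical) node layout:
        {"id": "<task id>", "params": {"step": "<step name>", ...}}
    """
    # The node id is used only to name the task.
    task_id = node["id"]
    # Copy the params so the caller's config is not mutated.
    params = dict(node.get("params", {}))
    # The step to run must be named inside the params, not derived from the id.
    if "step" not in params:
        raise ValueError(f"node {task_id!r} has no 'step' entry in its params")
    return task_id, params


# Illustrative values only; real step names and params come from the config file.
example_node = {"id": "example_task", "params": {"step": "example_step"}}
task_id, params = split_node(example_node)
```

With this split, the task ID and the step name can differ freely, and a node that forgets to name its step fails early instead of at submission time.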
This DAG handles GWAS catalog top hits independently of any other run.
It was tested and produced good outputs in 3h (slow because it ran only on the master node).
This PR requires the following gentropy PR to be merged first: opentargets/gentropy#808
Co-authored with @DSuveges