Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

on_failure_callback to alert during test failure #341

Closed
ajithshetty opened this issue Jun 24, 2023 · 10 comments
Closed

on_failure_callback to alert during test failure #341

ajithshetty opened this issue Jun 24, 2023 · 10 comments

Comments

@ajithshetty
Copy link

ajithshetty commented Jun 24, 2023

There is a feature to send call back during WARN.
https://astronomer.github.io/astronomer-cosmos/dbt/configuration.html#warn-notification
Which sends the list of tests which are in WARN state.

In case, if the test fails, the task itself will fail.

With on_failure_callback, Requirement is to see the tests which are failed.

@CorsettiS
Copy link
Contributor

since cosmos wraps over airflow original functionalities, why not just use on_failure_callback provided by airflow itself?

@ajithshetty
Copy link
Author

The main requirement is to able to see the dbt test results which are failed just like we have for WARN
on_failure_callback context doesnt include the result from the underneath task.
Not sure if my understanding is fully correct here.

@CorsettiS
Copy link
Contributor

@ajithshetty I see. Indeed, although ideal, it would be a very hard work actually. I was the one that actually implemented the on_warn_callback feature, and to do so I basically scan the log output, do some regex and add the key-values to the airflow context. To accomplish the same for the on_failure_callback it would be harder for some technical reasons though.

If you set the airflow on_failure_callback - although without the context - you can still receive the notifications via email or slack channel, for instance. For warnings, this feature was non-existent, which motivated me to work on it. failures will stop your process, forcing you to look into them. warns, on the other hand, are soft and may go unnoticed. By the nature of the problem and complexity involved, it does not sound worth it to work on an on_failure_callback feature as of now.

@jlaneve
Copy link
Collaborator

jlaneve commented Jun 26, 2023

I wonder if we could modify the subprocess hook to throw a Python exception we define (with all the metadata we need) - and then a user could use it as described in: https://stackoverflow.com/questions/51822029/get-exception-details-on-airflow-on-failure-callback-context

@CorsettiS
Copy link
Contributor

@jlaneve it is possible for sure, however I do not know how easy to implement it would be. Ideally we would pass it to the task level, but then a DbtDag method would overwrite it with airflow default, breaking the consistency. maybe changing the name from on_failure_callback to on_test_failure_callback and working on it as a virgin project ? and obviously do the same for warn (on_warn_callback to on_test_warn_callback) for consistency. That way we keep the airflow default method clean and it is easier to work from there.

@jlaneve
Copy link
Collaborator

jlaneve commented Jun 28, 2023

@jlaneve it is possible for sure, however I do not know how easy to implement it would be. Ideally we would pass it to the task level, but then a DbtDag method would overwrite it with airflow default, breaking the consistency. maybe changing the name from on_failure_callback to on_test_failure_callback and working on it as a virgin project ? and obviously do the same for warn (on_warn_callback to on_test_warn_callback) for consistency. That way we keep the airflow default method clean and it is easier to work from there.

Love this idea! on_test_failure_callback makes it super clear what it's doing, and I agree we should update the warning to on_test_warn_callback as well

@CorsettiS
Copy link
Contributor

@jlaneve I can work on that when I have some free time then 💯 I have a good idea of how to implement it tbh.
As of now I am still trying to solve some errors I am facing after the recent changes in the project (#346 & #342 ) so I will first see how I can solve that first, also because I am trying to implement cosmos in the current company I am working at so I can also take some dedicated tickets to improve the tool with the aim to benefit us. I believe that some patches (#352 ) may help me out in some of the problems I am facing. As soon as I have cosmos up & running properly in my company I will work on this issue then.

@ajithshetty
Copy link
Author

I'm not fully sure if it goes against the core principles, but do you think having "soft fail" makes sense for DBT tests which fails but the task still succeeds.

@CorsettiS
Copy link
Contributor

@ajithshetty sorry, I did not totally understand the point you made. astronomer-cosmos consider the tests as a task itself, therefore if it fails, it makes sense the task should fail. A soft fail would be the same as a warning from my perspective, no?

Copy link

dosubot bot commented Nov 9, 2023

Hi, @ajithshetty! I'm Dosu, and I'm helping the Cosmos team manage their backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, you requested a new feature to add an on_failure_callback that will alert users when a test fails during a task. There was a discussion about using the on_failure_callback provided by Airflow itself, but it was mentioned that it does not include the result from the underneath task. There was also a suggestion to modify the subprocess hook to throw a Python exception with all the necessary metadata. Additionally, there was a suggestion to change the name of the callback to on_test_failure_callback and on_test_warn_callback for consistency.

Before we close this issue, we wanted to check if it is still relevant to the latest version of the Cosmos repository. If it is, please let the Cosmos team know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your contribution!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 9, 2023
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 16, 2023
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants