[BUG] test_yyyyMMdd_format_for_legacy_mode failed in Dataproc Serverless integration tests #11501

Closed
yinqingh opened this issue Sep 26, 2024 · 7 comments
Labels: ? - Needs Triage, bug

Comments

@yinqingh
Collaborator

Describe the bug
A test failed in rapids-it-dataproc-serverless-2.2#32. The full test logs are in the Dataproc Serverless job named "rapids-it-dataproc-serverless-22-32-3-20240925124028".

FAILED rapids-it-dataproc-serverless-32/integration_tests/src/main/python/date_time_test.py::test_yyyyMMdd_format_for_legacy_mode[DATAGEN_SEED=0, TZ=UTC]
a = ('xro1511250', <py4j.clientserver.JavaClient object at 0x7f3ebd714160>, 'o1511249', 'collectToPython')
kw = {}, converted = IllegalArgumentException()

    def deco(*a: Any, **kw: Any) -> Any:
        try:
            return f(*a, **kw)
        except Py4JJavaError as e:
            converted = convert_exception(e.java_exception)
            if not isinstance(converted, UnknownException):
                # Hide where the exception came from that shows a non-Pythonic
                # JVM exception message.
>               raise converted from None
E               pyspark.errors.exceptions.captured.IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec
E               Project [unix_timestamp(a#175292, yyyyMMdd, Some(Etc/UTC), false) AS unix_timestamp(a, yyyyMMdd)#175294L, from_unixtime(unix_timestamp(a#175292, yyyyMMdd, Some(Etc/UTC), false), yyyyMMdd, Some(Etc/UTC)) AS from_unixtime(unix_timestamp(a, yyyyMMdd), yyyyMMdd)#175295, date_format(gettimestamp(a#175292, yyyyMMdd, TimestampType, Some(Etc/UTC), false), yyyyMMdd, Some(Etc/UTC)) AS date_format(to_timestamp(a, yyyyMMdd), yyyyMMdd)#175296]
E               +- Scan ExistingRDD[a#175292]
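
For readers unfamiliar with the three projected expressions in the plan above, here is a minimal sketch of the round trip they describe: parse a `yyyyMMdd` string into Unix seconds in UTC, then format it back. This is a plain Python standard-library analogue, not Spark or the RAPIDS plugin, and the helper names `unix_timestamp_yyyymmdd` and `from_unixtime_yyyymmdd` are hypothetical. It does not model Spark's legacy parser behavior for malformed input.

```python
from datetime import datetime, timezone


def unix_timestamp_yyyymmdd(text: str) -> int:
    """Parse a yyyyMMdd string as midnight UTC and return Unix seconds.

    Hypothetical helper: a stdlib analogue of unix_timestamp(a, 'yyyyMMdd')
    with the session time zone set to UTC.
    """
    parsed = datetime.strptime(text, "%Y%m%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def from_unixtime_yyyymmdd(seconds: int) -> str:
    """Format Unix seconds back into a yyyyMMdd string in UTC.

    Hypothetical helper: a stdlib analogue of from_unixtime(ts, 'yyyyMMdd').
    """
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y%m%d")


def round_trip(text: str) -> str:
    """Parse then re-format, mirroring the nested expression in the plan."""
    return from_unixtime_yyyymmdd(unix_timestamp_yyyymmdd(text))
```

For well-formed input the round trip returns the original string, so a CPU and a GPU run of the real test would be expected to agree on such values.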

Environment details (please complete the following information)

  • Dataproc Serverless version 2.2.20
  • Scala 2.13

@yinqingh added the "? - Needs Triage" and "bug" labels on Sep 26, 2024
@pxLi
Collaborator

pxLi commented Sep 26, 2024

The test case was added in #11449.

cc @res-life to help

@res-life
Collaborator

I tried to reproduce this on Spark 3.5.1 with Scala 2.13, and the test case passed.
I will test on Dataproc next.

@res-life
Collaborator

Verified on Dataproc; the failure did not reproduce there either.

@pxLi
Collaborator

pxLi commented Sep 26, 2024

Let's wait for another round of this job; the failing run may have missed some cudf updates. Also cc @yinqingh to trigger some runs after tonight's latest snapshot is out. Thanks.

@yinqingh
Collaborator Author

I will monitor the test builds with the latest snapshot jar.

@res-life
Collaborator

res-life commented Sep 27, 2024

Run #33 shows that this test case passed:

[2024-09-26T15:44:30.917Z] 
[2024-09-26T15:44:30.917Z] rapids-it-dataproc-serverless-33/integration_tests/src/main/python/date_time_test.py::test_yyyyMMdd_format_for_legacy_mode[DATAGEN_SEED=0, TZ=UTC] PASSED [ 41%]
[2024-09-26T15:44:30.917Z] 24/09/26 12:28:09 WARN GpuOverrides: 
[2024-09-26T15:44:30.917Z]   ! <RDDScanExec> cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.RDDScanExec
[2024-09-26T15:44:30.917Z]     @Expression <AttributeReference> a#175305 could run on GPU
[2024-09-26T15:44:30.917Z] 

@yinqingh Could you check whether CI used an old plugin jar?

@yinqingh
Collaborator Author

Confirmed that the CI job used mismatched builds: the plugin jar was at 00cd422 while the integration test (IT) package was at a34f33e, due to prerelease version shifting. Closing this issue.
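
A guard against this kind of mismatch could look like the sketch below. It is only an illustration, assuming the CI job can obtain the git revision of both the plugin jar and the IT package as (possibly abbreviated) hex hashes; the function names `same_commit` and `check_build_inputs` are hypothetical and are not part of the project's CI scripts.

```python
def same_commit(plugin_rev: str, it_rev: str) -> bool:
    """Return True if two git revisions refer to the same commit.

    Revisions may be abbreviated, so one hash being a prefix of the
    other counts as a match. Empty input never matches.
    """
    a = plugin_rev.strip().lower()
    b = it_rev.strip().lower()
    if not a or not b:
        return False
    return a.startswith(b) or b.startswith(a)


def check_build_inputs(plugin_rev: str, it_rev: str) -> None:
    """Fail fast when the plugin jar and the IT package come from different commits."""
    if not same_commit(plugin_rev, it_rev):
        raise RuntimeError(
            f"plugin jar ({plugin_rev}) and IT package ({it_rev}) "
            "were built from different commits"
        )
```

With the revisions reported here, `check_build_inputs("00cd422", "a34f33e")` would raise before any test ran, which makes the cause visible without reading the test logs.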
