Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some async tests are missing asyncStart/asyncEnd #2398

Open
osa1 opened this issue Nov 27, 2023 · 10 comments
Open

Some async tests are missing asyncStart/asyncEnd #2398

osa1 opened this issue Nov 27, 2023 · 10 comments
Assignees
Labels
bad-test Report tests in need of updates. When closed, the tests should be considered good

Comments

@osa1
Copy link
Member

osa1 commented Nov 27, 2023

While debugging dart-lang/sdk#54140 @mkustermann realized that some of the async tests in co19 are missing asyncStart and asyncEnd:

When testing in the browser, without asyncStart and asyncEnd, the test driver is not waiting for the test to finish and declaring a success even when the test actually fails.

I quickly searched for all the files with an async main and an asyncStart, and when I compared the outputs I see that there are dozens of files where main is async but the file doesn't have asyncStart. For example:

  • LanguageFeatures/Control-flow-collections/dynamic_semantics_list_A03_t02.dart
  • Language/Statements/For/Asynchronous_For_in/execution_A02_t01.dart

I used ag -Q 'main() async' -c to find tests with async main and ag -Q 'asyncStart' -c to find the tests with asyncStart. (I think it might make sense to search for await instead of main() async).

@sgrekhov sgrekhov self-assigned this Nov 27, 2023
@sgrekhov sgrekhov added the bad-test Report tests in need of updates. When closed, the tests should be considered good label Nov 27, 2023
sgrekhov added a commit to sgrekhov/co19 that referenced this issue Nov 27, 2023
@sgrekhov
Copy link
Contributor

@osa1 thank you! I added some missed asyncStart/End(). It's not supposed that all async test should have asyncStart/End(). Tests like

main() async {
  await test1();
  ...
  await test2();
}

don't need asyncStart/End(). But I found a lot of tests that have excessive async.

eernstg pushed a commit that referenced this issue Nov 30, 2023
Remove excessive `async`. Add an explicit return type in declarations of `main` that had none.
eernstg pushed a commit that referenced this issue Nov 30, 2023
…ncMultiTest (#2406)

Update asyncStart/End() to correspond SDK version. Replace asyncMultiStart by asyncStart.
sgrekhov added a commit to sgrekhov/co19 that referenced this issue Dec 1, 2023
eernstg pushed a commit that referenced this issue Dec 1, 2023
…guage and LanguageFeatures tests (#2407)

Update async tests to avoid false-positive results on web, Language and LanguageFeatures tests
copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Dec 4, 2023
2023-12-01 49699333+dependabot[bot]@users.noreply.github.com Bump actions/setup-java from 3.13.0 to 4.0.0 (dart-lang/co19#2410)
2023-12-01 [email protected] dart-lang/co19#2398. Update async tests to avoid false-positive results on web. Language and LanguageFeatures tests (dart-lang/co19#2407)
2023-12-01 [email protected] Fixes dart-lang/co19#2408. Fix roll failures (dart-lang/co19#2409)
2023-11-30 [email protected] dart-lang/co19#2398. Update asyncStart/End() to correspond SDK version. Replace asyncMultiTest (dart-lang/co19#2406)
2023-11-30 [email protected] dart-lang/co19#2398. Remove excessive async. Add explicit `void` (dart-lang/co19#2400)
2023-11-28 [email protected] dart-lang/co19#2350. Update existing factory constructor tests. Part 1 (dart-lang/co19#2353)
2023-11-28 [email protected] Fixes dart-lang/co19#2390. Add expected error to static_analysis_extension_types_A30_t02.dart (dart-lang/co19#2401)
2023-11-28 [email protected] Fixes dart-lang/co19#2399. Update expected errors locations for CFE (dart-lang/co19#2402)
2023-11-24 [email protected] dart-lang/co19#2388. Rename and reorder static_analysis_member_invocation_A06_t* tests (dart-lang/co19#2397)

Change-Id: Ie4b51caa12a9a0896c893cc02b099a07ef09fbd7
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/339560
Reviewed-by: Alexander Thomas <[email protected]>
Reviewed-by: Erik Ernst <[email protected]>
Commit-Queue: Erik Ernst <[email protected]>
sgrekhov added a commit to sgrekhov/co19 that referenced this issue Dec 6, 2023
copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Dec 11, 2023
2023-12-07 [email protected] Fixes dart-lang/co19#2413. Add missing expected compile-time errors for CFE (dart-lang/co19#2418)
2023-12-07 [email protected] dart-lang/co19#2119. Run dart formatter on LibTest/async tests (dart-lang/co19#2417)
2023-12-06 [email protected] dart-lang/co19#2398. Make asyncStart/End() safe in LibTest/async (dart-lang/co19#2416)
2023-12-06 [email protected] Fixes dart-lang/co19#2355. Add more typed_date.setRange() tests (dart-lang/co19#2412)
2023-12-06 [email protected] dart-lang/co19#2404. Small code-style improvements and description update (dart-lang/co19#2414)
2023-12-04 [email protected] dart-lang/co19#2004. Add deferred libraries tests according to the changed spec (dart-lang/co19#2411)
2023-12-04 [email protected] Fixes dart-lang/co19#2383. Add more constant evaluation tests (dart-lang/co19#2396)

Change-Id: I15e0d681538fa0f2a311f74d1930fad7270b81a0
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/340561
Commit-Queue: Alexander Thomas <[email protected]>
Reviewed-by: Erik Ernst <[email protected]>
Reviewed-by: Alexander Thomas <[email protected]>
@nshahan
Copy link

nshahan commented Sep 3, 2024

@sgrekhov
I've noticed that these tests are flaky for dart2js and DDC when running on Firefox and Safari even though they all use the TouchEvent constructor which is not present in those browsers.

LibTest/html/Element/onTouchCancel_A01_t01
LibTest/html/Element/onTouchEnd_A01_t01
LibTest/html/Element/onTouchEnter_A01_t01
LibTest/html/Element/onTouchLeave_A01_t01
LibTest/html/Element/onTouchMove_A01_t01
LibTest/html/Element/onTouchStart_A01_t01
LibTest/html/Element/touchCancelEvent_A01_t01
LibTest/html/Element/touchEndEvent_A01_t01
LibTest/html/Element/touchEnterEvent_A01_t01
LibTest/html/Element/touchLeaveEvent_A01_t01
LibTest/html/Element/touchMoveEvent_A01_t01
LibTest/html/Element/touchStartEvent_A01_t01
LibTest/html/IFrameElement/onTouchCancel_A01_t01
LibTest/html/IFrameElement/onTouchEnd_A01_t01
LibTest/html/IFrameElement/onTouchEnter_A01_t01
LibTest/html/IFrameElement/onTouchLeave_A01_t01
LibTest/html/IFrameElement/onTouchMove_A01_t01
LibTest/html/IFrameElement/onTouchStart_A01_t01

They should consistently fail but the fact that they can sometimes pass leads me to believe that they might complete without running the code or recognizing the failure. It makes me question the reliability of the asyncStart/End guards. Can you think of other explanations?

Since the API simply doesn't exist I'm going to start skipping these tests on those browsers but I wanted to call out this observation.

@sgrekhov
Copy link
Contributor

sgrekhov commented Sep 4, 2024

@nshahan where do you see these flaky results? According to the https://dart-current-results.web.app/#/filter=co19/LibTest/html/Element/,co19/LibTest/html/IFrameElement/ all of these tests are not flaky but fail with TouchEvent is not defined error.
May be it makes sense to change the expected result in this case? If we have no event to test then test should succeed. It's not an error if we have nothing to test. I'm going to update the tests this way.

sgrekhov added a commit to sgrekhov/co19 that referenced this issue Sep 4, 2024
@mkustermann
Copy link
Member

I've noticed that these tests are flaky for dart2js and DDC when running on Firefox and Safari even though they all use the TouchEvent constructor which is not present in those browsers.

How did you determine they are flaky?
Did they actually flip between Pass and Fail/RuntimeError?

They should consistently fail but the fact that they can sometimes pass leads me to believe that they might complete without running the code or recognizing the failure. It makes me question the reliability of the asyncStart/End guards. Can you think of other explanations?

The go/dart-current-results page seems to indicate they consistently pass.

When SQL querying the test results from dart-ci.results.results_60d table for the co19/LibTest/html/Element/touch% name it seems that they are flaking between Timeout and RuntimeError.

=> @nshahan Can it be that it's actually flakily timeout instead of failing?

/cc @athomas

The asyncStart & asyncEnd mechanism should be working correctly. It's a collaboration between the test itself and the test controller (running in browser) and tools/test.py - once an asyncStart has been issued the test is considered not done until an error or a asyncEnd.

@nshahan
Copy link

nshahan commented Sep 4, 2024

How did you determine they are flaky?

I keep seeing them turn our bots red from time to time when they change from flaky -> RuntimeError.

Did they actually flip between Pass and Fail/RuntimeError?

Oh good intuition! They are never actually passing. Sorry about that assumption. Sometimes they get the Runtime error I would expect and sometimes they timeout.

You can see an example of oscillating between flaky and RuntimeError here:
https://dart-ci.web.app/#showLatestFailures=false&configurations=ddc-linux-firefox&test=co19/LibTest/html/Element/touchEndEvent_A01_t01

@sgrekhov
Copy link
Contributor

sgrekhov commented Sep 5, 2024

With this change these tests should start passing

@mkustermann
Copy link
Member

@nshahan @sgrekhov The asyncStart / asyncEnd machinery works perfectly fine here. The fact that the tests fail on some browsers should also be perfectly fine (so we shouldn't change the tests because of it). I think the root cause may be something else entirely: For example it can mean that the test runner starts too many browsers (it runs N tests in parallel) and that one test may consume a lot of CPU or RAM and cause other browser tabs to crash, ...

@sgrekhov
Copy link
Contributor

sgrekhov commented Sep 5, 2024

If there is nothing to tests in some brausers why the tests should fail? And the change (already commited but I can revert it if needed) should fix the flakeness. IMHO it's fine

@mkustermann
Copy link
Member

If there is nothing to tests in some brausers why the tests should fail? And the change (already commited but I can revert it if needed) should fix the flakeness. IMHO it's fine

In general our tests should test the behavior we want/expect. So if an implementation doesn't adhere to this behavior/spec, the test should fail.

This way we can go to our test results database and check: Are e.g. touch events working on all browsers? And we'll see all the browsers where they work and those where they don't work (Instead of seeing all touch event tests passing, giving us the illusion of this working everywhere - even though on some browsers the API throws)

There's a gray line when APIs are e.g. optional (e.g. not in official W3C spec, ...). Since it's web-specific, I'll leave this to web team @nshahan @sigmundch.

@eernstg
Copy link
Member

eernstg commented Sep 5, 2024

We have several tests where a specific behavior is expected based on the number semantics. They would make the distinction that some configurations have web numbers and others have native numbers, and the expectations differ. So perhaps we need to make more than one distinction:

  • Feature F comes in variants F1 for configuration C11 .. C1k, F2 for C21 .. C2m, ...
  • Feature F is broken on configurations C1 .. Ck

In the former case we just need a run-time test that determines which kind of configuration we're currently running, or we may have several tests that are only executed with specific configurations (using status files, no less ;-).

If we have a situation like the latter then we could have a test that expects the known outcome (e.g., it could expect that there's a timeout, or the browser crashes with a bus error, or whatever), but the test runner doesn't (perhaps can't) support that. So even though item 2 looks like a special case of item 1, we may not be able to test it in the usual way.

Next, even if we can arrange for every test succeeding on every configuration (that is, it works, or we expect that it crashes and it actually crashes, etc.), the information about known deficiencies of specific platforms/tools is still implicit.

It seems likely that we'd want some kind of provenance for this. E.g., it could be useful to be able to query things like "which features are known to be broken with this and that specific browser", in particular if the status changes and a bunch of tests need to be updated. A bit like the approval associations where a test failure should allow us to look up the issue where that failure is being addressed.

In any case, I'm not convinced that we should consider a test failure to be a normal behavior that nobody should ever bother to look into.

copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Sep 10, 2024
2024-09-10 [email protected] Fixes dart-lang/co19#2852. Fix the new roll failures (dart-lang/co19#2853)
2024-09-09 [email protected] dart-lang/co19#2825. Add scope tests. Part 2. (dart-lang/co19#2847)
2024-09-09 [email protected] Fixes dart-lang/co19#2849. Fix new roll failures (dart-lang/co19#2850)
2024-09-09 [email protected] dart-lang/co19#2559. Replace augmentation libraries by parts (dart-lang/co19#2848)
2024-09-05 [email protected] dart-lang/co19#2825. Add scope tests. Part 1. (dart-lang/co19#2845)
2024-09-04 [email protected] dart-lang/co19#2398. Update TouchEvent tests to not fail on platforms where it is not defined (dart-lang/co19#2844)
2024-09-03 [email protected] dart-lang/co19#2825. Add import inheritance and grammar tests (dart-lang/co19#2843)
2024-09-02 [email protected] dart-lang/co19#2559. Add augmenting members tests. Part 2 (dart-lang/co19#2799)
2024-09-02 [email protected] dart-lang/co19#2825. Add enhanced parts tests for top-level declarations and ownership (dart-lang/co19#2842)
2024-09-01 49699333+dependabot[bot]@users.noreply.github.com Bump actions/setup-java from 4.2.1 to 4.2.2 in the github-actions group (dart-lang/co19#2841)
2024-08-30 [email protected] dart-lang/co19#2559. Add extension types to augmenting types tests (dart-lang/co19#2839)

Cq-Include-Trybots: luci.dart.try:analyzer-linux-release-try
Change-Id: I26f83579708b877918b5a0a904f8172aa1cf4724
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/384480
Reviewed-by: Erik Ernst <[email protected]>
Reviewed-by: Alexander Thomas <[email protected]>
Commit-Queue: Alexander Thomas <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bad-test Report tests in need of updates. When closed, the tests should be considered good
Projects
None yet
Development

No branches or pull requests

5 participants