collector: always consider all monomorphic functions to be 'mentioned' #122862

RalfJung · 2024-03-22T07:40:45Z

This would fix #122814. But it's probably not going to be cheap...

Ideally we'd avoid building the optimized MIR for these new roots, and only request mir_drops_elaborated_and_const_checked -- but that MIR is often getting stolen so I don't see a way to do that. (Zulip)
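To illustrate what "new roots" means here, below is a toy sketch of a worklist-style collector. It is not rustc's collector: `Item`, `collect`, `monomorphic_roots`, and the sample item names are all hypothetical, and the model ignores the real distinction between items that get codegen'd and items that are only checked. It only shows how treating every monomorphic function as a root grows the set of visited items.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy stand-in for a function item. Hypothetical; not rustc's data structure.
struct Item {
    /// True if the function has no generic parameters.
    monomorphic: bool,
    /// Names of the functions this item's body refers to.
    callees: Vec<&'static str>,
}

type Items = BTreeMap<&'static str, Item>;

/// Worklist traversal: visit every item transitively referenced from `roots`.
fn collect(items: &Items, roots: &[&'static str]) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut worklist = roots.to_vec();
    while let Some(name) = worklist.pop() {
        // `insert` returns false if the item was already visited.
        if !seen.insert(name) {
            continue;
        }
        if let Some(item) = items.get(name) {
            worklist.extend(item.callees.iter().copied());
        }
    }
    seen
}

/// Every monomorphic function, used as the extra root set.
fn monomorphic_roots(items: &Items) -> Vec<&'static str> {
    items
        .iter()
        .filter(|(_, item)| item.monomorphic)
        .map(|(name, _)| *name)
        .collect()
}

/// A small made-up crate: `dead_caller` is never referenced from `main`.
fn sample_items() -> Items {
    let mut items = Items::new();
    items.insert("main", Item { monomorphic: true, callees: vec!["helper"] });
    items.insert("helper", Item { monomorphic: true, callees: vec![] });
    // Only reached if it is itself a root.
    items.insert("dead_caller", Item { monomorphic: true, callees: vec!["checked"] });
    items.insert("checked", Item { monomorphic: true, callees: vec![] });
    // Generic functions cannot be roots without concrete type arguments.
    items.insert("generic", Item { monomorphic: false, callees: vec![] });
    items
}

fn main() {
    let items = sample_items();
    let baseline = collect(&items, &["main"]);
    let extended = collect(&items, &monomorphic_roots(&items));
    println!("baseline: {baseline:?}");
    println!("all monomorphic roots: {extended:?}");
}
```

In this toy model the second traversal visits `dead_caller` and `checked` in addition to what the baseline sees, which is the kind of extra work the perf concern is about.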

r? @oli-obk @tmiasko

RalfJung · 2024-03-22T07:47:31Z

@bors try @rust-timer queue

bors · 2024-03-22T07:48:41Z

⌛ Trying commit 59803ef with merge d0df954...

collector: always consider all monomorphic functions to be 'mentioned'

This would fix rust-lang#122814. But it's probably not going to be cheap...

Ideally we'd avoid building the optimized MIR for these new roots, and only request `mir_drops_elaborated_and_const_checked` -- but that MIR is often getting stolen so I don't see a way to do that.

TODO before landing:
- [ ] Figure out if there is a testcase [here](rust-lang#122814 (comment)).

r? `@oli-obk` `@tmiasko`

bors · 2024-03-22T09:19:16Z

☀️ Try build successful - checks-actions
Build commit: d0df954 (d0df954d8bedc6b4baa80485170b02fda0e0042f)

rust-timer · 2024-03-22T10:38:01Z

Finished benchmarking commit (d0df954): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.1% | [0.2%, 4.0%] | 66 |
| Regressions ❌ (secondary) | 1.3% | [0.3%, 4.3%] | 23 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.1% | [0.2%, 4.0%] | 66 |

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.9% | [1.9%, 1.9%] | 1 |
| Regressions ❌ (secondary) | 4.8% | [2.9%, 8.4%] | 14 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.9% | [1.9%, 1.9%] | 1 |

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.0% | [1.5%, 7.2%] | 16 |
| Regressions ❌ (secondary) | 2.2% | [1.4%, 2.7%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.2% | [-2.2%, -2.2%] | 1 |
| All ❌✅ (primary) | 3.0% | [1.5%, 7.2%] | 16 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 667.777s -> 669.759s (0.30%)
Artifact size: 315.07 MiB -> 315.10 MiB (0.01%)

RalfJung · 2024-03-22T10:42:43Z

It's again mostly incr builds which are affected -- I guess that makes sense as then the collector represents a larger fraction of the total rustc execution time than for full builds.
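A rough way to spell out that intuition, under the assumption (not measured in this thread) that the change adds a roughly fixed amount of collector work $\Delta > 0$ in every scenario, with $T_{\text{full}}$ and $T_{\text{incr}}$ standing for the baseline cost of a full and an incremental build:

```latex
\[
  \frac{\Delta}{T_{\text{incr}}} > \frac{\Delta}{T_{\text{full}}}
  \quad\Longleftrightarrow\quad
  T_{\text{incr}} < T_{\text{full}}
\]
```

So the same absolute overhead would show up as a larger relative regression in the cheaper incremental scenarios.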

This seems to affect different benchmarks than #122568.

Would be interesting to figure out where the extra time is spent; this time it can't be metadata (de)serialization. I wonder if skipping MIR opts would help or if the actual cost is elsewhere. @saethlin I think you had a working setup for getting cachegrind diffs?

RalfJung · 2024-03-22T10:45:46Z

Oli made a suggestion for how to go about the MIR opts here; that should probably be the next step. I have to put this on hold for now though.

Kobzol · 2024-03-22T11:20:07Z

FWIW, if you click on a row with a specific benchmark result, it will show you a command that you can copy-paste to get a cachegrind diff.

RalfJung · 2024-03-22T11:46:25Z

Yeah, and last time I tried that, Debian's valgrind was too old and that command did not work. Maybe I should check if it has been updated in the meantime.

saethlin · 2024-03-22T12:10:42Z

Here's the top of the cachegrind diff for the most-regressed primary benchmark (unicode-normalization debug incr-unchanged):

35,048,416  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir          file:function
--------------------------------------------------------------------------------
-3,749,950  ???:<hashbrown::raw::RawTable<(rustc_ast::node_id::NodeId, rustc_hir::hir_id::ItemLocalId)>>::reserve_rehash::<hashbrown::map::make_hasher<rustc_ast::node_id::NodeId, rustc_hir::hir_id::ItemLocalId, core::hash::BuildHasherDefault<rustc_hash::FxHasher>>::{closure#0}>
 3,749,950  ???:<hashbrown::raw::RawTable<(rustc_span::def_id::LocalDefId, rustc_hir::hir_id::ItemLocalId)>>::reserve_rehash::<hashbrown::map::make_hasher<rustc_span::def_id::LocalDefId, rustc_hir::hir_id::ItemLocalId, core::hash::BuildHasherDefault<rustc_hash::FxHasher>>::{closure#0}>
 2,790,508  ???:<rustc_query_system::dep_graph::graph::DepGraphData<rustc_middle::dep_graph::DepsType>>::try_mark_previous_green::<rustc_query_impl::plumbing::QueryCtxt>
 2,640,064  ???:<rustc_metadata::rmeta::decoder::DecodeContext as rustc_span::SpanDecoder>::decode_span
 1,781,794  <all-jemalloc-files>:<all-jemalloc-functions>
 1,627,701  ???:<rustc_middle::mir::BasicBlockData as rustc_data_structures::stable_hasher::HashStable<rustc_query_system::ich::hcx::StableHashingContext>>::hash_stable
 1,575,493  ???:<rustc_middle::mir::interpret::AllocId as rustc_data_structures::stable_hasher::HashStable<rustc_query_system::ich::hcx::StableHashingContext>>::hash_stable::{closure#0}
-1,515,559  ???:<rustc_middle::mir::interpret::AllocId as rustc_data_structures::stable_hasher::HashStable<rustc_query_system::ich::hcx::StableHashingContext>>::hash_stable
 1,248,878  ???:<rustc_span::caching_source_map_view::CachingSourceMapView>::span_data_to_lines_and_cols
 1,148,391  ???:<rustc_middle::mir::Body as rustc_data_structures::stable_hasher::HashStable<rustc_query_system::ich::hcx::StableHashingContext>>::hash_stable
 1,129,590  ???:<rustc_data_structures::sip128::SipHasher128>::short_write_process_buffer::<8>
   971,146  ???:<rustc_middle::ty::context::CtxtInterners>::intern_ty
   920,975  ???:<rustc_middle::mir::Body as rustc_serialize::serialize::Decodable<rustc_metadata::rmeta::decoder::DecodeContext>>::decode
   848,301  ???:<rustc_middle::ty::Ty as rustc_serialize::serialize::Decodable<rustc_metadata::rmeta::decoder::DecodeContext>>::decode
   846,228  ???:rustc_incremental::persist::load::setup_dep_graph
   807,165  ???:rustc_monomorphize::collector::collect_items_rec
   526,864  ???:<rustc_data_structures::sip128::SipHasher128>::finish128
   465,911  ???:<rustc_middle::query::on_disk_cache::CacheDecoder as rustc_span::SpanDecoder>::decode_def_id
   425,077  ???:rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::DynamicConfig<rustc_query_system::query::caches::DefIdCache<rustc_middle::query::erase::Erased<[u8; 8]>>, false, false, false>, rustc_query_impl::plumbing::QueryCtxt, true>
   380,687  ???:rustc_data_structures::unord::hash_iter_order_independent::<rustc_query_system::ich::hcx::StableHashingContext, (&rustc_span::def_id::DefId, &rustc_span::def_id::DefId), std::collections::hash::map::Iter<rustc_span::def_id::DefId, rustc_span::def_id::DefId>>
   360,997  ???:<rustc_middle::ty::generic_args::ArgFolder as rustc_type_ir::fold::TypeFolder<rustc_middle::ty::context::TyCtxt>>::fold_ty
   342,871  /usr/src/debug/glibc/glibc/string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:__memcpy_avx_unaligned_erms
   333,648  ???:rustc_monomorphize::collector::visit_instance_use

I looked at a few of the other top primary regressions, and they all look almost exactly the same in cachegrind.

RalfJung · 2024-03-22T12:32:13Z

Thanks! Hm... MIR opts don't seem to show up there as far as I can tell. Strange.

saethlin · 2024-03-22T13:32:47Z

If MIR opts were significant, you'd probably see them in this view: https://perf.rust-lang.org/detailed-query.html?commit=d0df954d8bedc6b4baa80485170b02fda0e0042f&benchmark=unicode-normalization-0.1.19-debug&scenario=incr-unchanged&base_commit=cdb683f6e4b0774b85c60eebe12af87f29d8ee4d&sort_idx=-11 (they're all called mir_pass_something).

RalfJung · 2024-03-22T13:40:32Z

I do see metadata_decode_entry_optimized_mir there. So... all the extra time is spent loading the MIR...?

saethlin · 2024-03-22T13:49:08Z

Yup, that would be my first guess.

RalfJung · 2024-03-22T14:37:50Z

In that case maybe a dedicated query for "mentioned and required items" would indeed help as it would not have to load the entire MIR for that. (This was proposed by Oli.)
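As a sketch of why a dedicated query could help, here is a toy model in which the item lists are stored separately from the body, so that looking them up never touches the full body. This is only an illustration under stated assumptions: `Body`, `ItemLists`, `Metadata`, and the method names are hypothetical and do not reflect rustc's actual metadata encoding or query system.

```rust
use std::collections::HashMap;

/// Toy model of a function body as stored in crate metadata (hypothetical).
struct Body {
    /// Stands in for the bulk of the MIR, which is expensive to decode.
    statements: Vec<String>,
    required: Vec<String>,
    mentioned: Vec<String>,
}

/// The small summary that the collector actually needs.
#[derive(Clone, Debug, PartialEq)]
struct ItemLists {
    required: Vec<String>,
    mentioned: Vec<String>,
}

struct Metadata {
    bodies: HashMap<String, Body>,
    /// Summaries stored next to the bodies, filled in once up front.
    lists: HashMap<String, ItemLists>,
    /// Counts how often a full body had to be "decoded".
    bodies_decoded: usize,
}

impl Metadata {
    fn new(bodies: HashMap<String, Body>) -> Self {
        let lists = bodies
            .iter()
            .map(|(name, body)| {
                let summary = ItemLists {
                    required: body.required.clone(),
                    mentioned: body.mentioned.clone(),
                };
                (name.clone(), summary)
            })
            .collect();
        Metadata { bodies, lists, bodies_decoded: 0 }
    }

    /// Expensive path: hands out the whole body and records the decode.
    fn optimized_body(&mut self, name: &str) -> Option<&Body> {
        self.bodies_decoded += 1;
        self.bodies.get(name)
    }

    /// Cheap path: only the item lists, no body decode.
    fn required_and_mentioned(&self, name: &str) -> Option<&ItemLists> {
        self.lists.get(name)
    }
}

/// One made-up function `f` that requires `h` and mentions `g`.
fn sample_metadata() -> Metadata {
    let mut bodies = HashMap::new();
    bodies.insert(
        "f".to_string(),
        Body {
            statements: vec!["_0 = const ()".to_string(), "return".to_string()],
            required: vec!["h".to_string()],
            mentioned: vec!["g".to_string()],
        },
    );
    Metadata::new(bodies)
}

fn main() {
    let mut md = sample_metadata();
    let lists = md.required_and_mentioned("f").cloned();
    println!("lists = {lists:?}, bodies decoded = {}", md.bodies_decoded);
    let len = md.optimized_body("f").map(|body| body.statements.len());
    println!("statements = {len:?}, bodies decoded = {}", md.bodies_decoded);
}
```

The trade-off in this toy is extra storage for the summaries in exchange for skipping the body decode; whether that pays off in rustc itself is exactly what a perf run would have to show.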

RalfJung · 2024-04-03T17:24:48Z

> In that case maybe a dedicated query for "mentioned and required items" would indeed help as it would not have to load the entire MIR for that. (This was proposed by Oli.)

I have zero knowledge about the crate metadata handling and I'm unlikely to have the time to learn about it any time soon -- so if anyone wants to pick this up, please feel free to do so. Meanwhile I will close this PR as I'm not currently working on this.

Create a separate query for required and mentioned items instead of tracking them in the MIR body

Implements rust-lang#122862 (comment). May permit further improvements without sacrificing perf... iff this PR isn't horrible for perf 🙃

rustbot assigned oli-obk Mar 22, 2024

rustbot added S-waiting-on-review and T-compiler labels Mar 22, 2024

rustbot added the S-waiting-on-perf label Mar 22, 2024

rustbot added perf-regression and removed S-waiting-on-perf labels Mar 22, 2024

RalfJung added S-waiting-on-author and removed S-waiting-on-review labels Mar 22, 2024

RalfJung marked this pull request as draft March 24, 2024 08:48

RalfJung added 2 commits March 24, 2024 09:55

collector: always consider all monomorphic functions to be 'mentioned'

f076f17

adjust incremental tests

a84f508

RalfJung force-pushed the collect-all-mono branch from 59803ef to a84f508 Compare March 24, 2024 08:55

RalfJung mentioned this pull request Apr 3, 2024

Which functions are "reachable", and therefore subject to monomorphization-time checks, is optimization-dependent #122814

Open

RalfJung closed this Apr 3, 2024

oli-obk mentioned this pull request Apr 5, 2024

Create a separate query for required and mentioned items instead of tracking them in the MIR body #123488

Closed
