-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow SHARED_MEMORY
flag to be used independently.
#22683
Conversation
Currently, there are a couple of restrictions on the `SHARED_MEMORY` flag: * Attempting to use the `SHARED_MEMORY` flag without the `PTHREADS` or `WASM_WORKERS` flags simply fails, because we don't have variations of the system libraries that have this flag standalone. Instead, emcc attempts to use the single-threaded system libraries, which are not compatible with the user's build because they don't have the correct wasm feature flags (i.e. -matomics and -mbulk-memory). This PR adds `SHARED_MEMORY`-only variations of the system libraries, which allows users to build with a standalone `SHARED_MEMORY` flag without error. * Currently, `SHARED_MEMORY` also implies `IMPORTED_MEMORY`. However, `IMPORTED_MEMORY` is only important for builds that use the `PTHREADS` or `WASM_WORKERS` flags, since they actually do multi-threading and need to read and write to the same memory across workers. This PR changes this so that if a standalone `SHARED_MEMORY` flag is specified, it can still avoid importing the memory. The use-case for enabling the `SHARED_MEMORY` flag without actually having an actual threading implementation is to be compatible with other modules which may import that memory whose import declaration specifies the memory as shared. This is an important use-case for Flutter, as we want to create two variations of the emscripten-generated module (one single-threaded and one multi-threaded, depending on whether the environment supports it) while keeping the module that imports the emscripten-generated module the same (which means keeping the memory import declaration as shared).
So (just to make sure I understand) the use case here is you want to be able to import Emscripten's memory into your Flutter module as shared, so you can use the same Flutter module with either the single or multi-threaded version. The problem with 1. is that if you did what you describe, you'd also need to build all of your user code with |
To answer the second question first:
That is essentially what this PR is trying to accomplish. The problem is that the system libs cache fails in that case because it tries to link against the single-threaded variants of the system libs, which were not compiled with
The question here is basically "why not just enable a multi-threading strategy and simply not use it?" I also tried this, but it causes the module not to work in a non-crossOriginIsolated context. The reason is that when emscripten/src/runtime_init_memory.js Line 49 in aa5778e
crossOriginIsolated context, this fails because SharedArrayBuffer doesn't exist. This PR fixes this issue by making sure that SHARED_MEMORY doesn't actually imply IMPORTED_MEMORY , and we can just directly embed and export the module's memory (marked as shared) and that omits the code that was failing in a non-crossOriginIsolated context.
|
Would it be enough for that use case to modify the module at the very end? (something like |
I actually tried this as well by just editing the module that emscripten built in a hex editor. The problem is that there actually are some necessary differences in the JavaScript code emscripten emits based on whether the memory is shared. See https://github.com/emscripten-core/emscripten/blob/aa5778efeaf9f1707cad8e7ca1fb6af504a6562d/src/parseTools.mjs#L1034C16-L1034C16 for one example. |
This seems like something it could make sense to fix or change. I think a shared memory can be created even when not origin-isolated (it just can't be |
I'm happy to change this too if you'd like, but I do think it is useful to be able to just specify |
Ah, good point, that does make the simple solution not work, unfortunately... |
The one concern I have here is of adding another variation to the system libraries. Each time we double the variations of core system libraries it has a cost, so we really try to avoid that. Otherwise, though, all of your reasons make sense to me.
What about using wasm workers instead of pthreads? That mode is intentionally designed to be minimal and pay-as-you-go. I am hoping you can enable WASM_WORKERS and just never create a wasm worker, basically, and without the downsides of pthreads. As for the imported memory, perhaps that can be fixed specifically, that is, make whether the memory is imported customizable? |
This cost you're talking about is general project maintenance cost, right? Not compile/runtime cost? I always assumed that these variations were created and populated in the cache on demand based on the user's flags, so if people don't actually ever use that configuration, they never pay the cost for building the system libraries with that particular set of flags.
I also tried One other thing I'm wondering is, how should that check actually be changed in order to make this work? It seems like the intention is to have a expressive hard failure when we detect that the pthreads or wasm workers implementations will simply not work. I could change the check to be based on whether we are doing pthreads or wasm workers specifically (so that it doesn't error based on shared memory alone) but that would still prevent me from using the strategy you're suggesting and enabling pthreads or wasm workers and just not using them. I feel like completely removing the check altogether might degrade the user experience (give failure which is much more difficult to understand) for people who are actually trying to use multithreading but have some sort of browser/environment limitation. I think the decision around this check illustrates my hesitation to compile against pthreads or wasm workers for our project. Compiling our module with a threading runtime that is actually broken in the environments I am attempting to run in feels like a footgun. Even if we remove this check and it works right now, it seems pretty easy to imagine that someone could make a change to emscripten later which makes the reasonable assumption that if I'm compiling with pthreads/wasm workers that I should have access to a
I'm not sure if this is exactly what you mean, but it actually is customizable (see the docs). The change I am making regarding imported vs shared memory is to just decouple those two options from each other, so that one doesn't imply the other. That seems to work perfectly fine, and it is supported by llvm/clang. I think overall, I do understand not wanting to increase the surface area of support for emscripten builds. However, passing the |
I've actually been pushing to remove If we can find a way to make Alon's suggestion above of some kind of post-link tool ( |
Mostly the former, but we do also need to build all the variations on CI, and to decide which to prebuild and bundle in the SDK. Not a dealbreaker from my perspective, but a downside.
Sorry, I've lost you here regarding that check. Is it that you want a single build that can work with and without SAB enabled by the browser? That is, you want fixed JS that can work in both cases? (If so then I've misunderstood you before; if not then I am not certain what your goal is yet.)
We can post-process the wasm easily, but post-processing the JS is the problem. I'm not sure how we would handle that. There are just some tweaks to the JS that require us to know in advance that we are compiling with SAB. |
I wonder if we can work around these "necessary differences" somehow? Do you happen to know if there are others in addition to https://github.com/emscripten-core/emscripten/blob/aa5778efeaf9f1707cad8e7ca1fb6af504a6562d/src/parseTools.mjs#L1034C16-L1034C16? |
I can't think of any, but I'd guess if there are they are similar to that one, workarounds for things SAB can't be used directly. I do agree it's worth investigating, but I'd expect to be able to use it we'd need a new option "support optional SAB in JS", which seems unpromising. |
This is regarding Derek's suggestion, which was to (1) change or remove this check so that the build will actually work in a non-crossOriginIsolated context and (2) use a pthreads build for the single-threaded renderer (I have yet to confirm this would work, but let's assume for the sake of argument it does). So what I'm asking is, how can this check actually be changed that actually allows me to do this strategy? The check's purpose is to make sure that the pthreads implementation will actually not function properly in this environment, a fact that is true in my intended use case (non-crossOriginIsolated). So would the check be simply removed so that a pthreads build can actually support my use-case?
I understand this sentiment, but it is unfortunate for me as "just shared memory please" is my exact use case. I will say though that the reality is that whether the flag exists or not, users with this use-case exist (myself, and I found another example of someone asking for it: #11750). If the flag is removed, it just means you are supporting this other use case under the That being said, I obviously defer to your judgment on how y'all want to move forward. But I would like to find a resolution that unblocks me. So far the two suggestions are: Option 1: Accept the PR, which adds the downside of a new configuration for system libraries in the cache. Or is there another option that I haven't covered? |
How about if we moved this check so that it occurs on thread creation instead of on startup? |
That sounds good to me. With that, I am hoping that building with WASM_WORKERS and not creating threads should work here, and with minimal overhead. The one thing that might be missing is if WASM_WORKERS/pthreads assume SAB in JS somewhere, but if some code assumes that I think we can generalize it in the same way as that check (that is, WASM_WORKERS/pthreads could allow non-SAB up until the moment a thread is created, at which time SAB would be a problem). |
c67c4ea
to
a2618d9
Compare
Hey folks. I made some changes here that I'm hoping will address your concerns. Instead of adding a new build variant to I think in the future, it might be possible to even get the Also to note: my use-case is kind of niche, but I do think that a mode that allows the user to BYO-threading implementation is useful, and that's sort of what a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still really like to see if we #22710 works for you so we can have a path to removing the -sSHARED_MEMORY
flag completely.
I'm not super keep on the new -sm
suffix, but we can bikeshed that if we decide to move forward with this.
.globaltype __stack_base, PTR | ||
.globl __stack_base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love that these are now globals symbols...I wish we could find a way to avoid the need for divergence here between wasm workers and pthreads, but for now this seems reasonable.
I feel the same as @sbc100 : This is definitely a big improvement, and I could be ok with it, but #22710 is even simpler and I hope it can work for you equally well @eyebrowsoffire ? If not I'd be curious what the downsides to that are. |
Also, I think it might be worth looking into the "single threaded build with shared memory approach". I noticed that this works:
So I think that might be worth looking into. It should result in the optimally small binary too, without any of pthread or WW overhead. |
This is the exact approach I am trying to actually do here in this PR. The |
tools/system_libs.py
Outdated
|
||
def get_files(self): | ||
files = [] | ||
if (self.is_ww): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No braces here
Sorry, there was some back and forth on that in the meeting and I thought we ultimately landed on leaving it as-is, but maybe I misunderstood. I am happy to change it back to ww though, let me do that. |
tools/system_libs.py
Outdated
return super().get_default_variation(is_mt=settings.PTHREADS, is_ww=settings.WASM_WORKERS and not settings.PTHREADS, **kwargs) | ||
return super().get_default_variation( | ||
is_mt=settings.PTHREADS, | ||
is_sm=settings.SHARED_MEMORY and not settings.PTHREADS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can revert all changes to this class except this one line?
tools/system_libs.py
Outdated
@@ -725,7 +725,7 @@ def get_cflags(self): | |||
if self.is_mt: | |||
cflags += ['-pthread', '-sWASM_WORKERS'] | |||
if self.is_ww: | |||
cflags += ['-sWASM_WORKERS'] | |||
cflags += ['-sSHARED_MEMORY=1'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is now the only remaining controversial change. This does mean that no -ww
libraries can ever use __EMSCRIPTEN_WASM_WORKERS__
which seems odd. So perhaps revert this line too.
The net result of this PR would then me "SHARED_MEMORY uses WASM_WORKERS libraries for now, but most them will work just fine even without the wasm workers library itself".
I think we could land that and then iterate from there if it doesn't work in all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the long run I agree we want to almost all libraries to be built only with SHARED_MEMORY
and then only libwasm_workers and libpthreads (and perhaps one or two more) to be build with specificly wasm workers or pthreads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. I'll look into changing those
tools/system_libs.py
Outdated
@@ -1420,6 +1425,8 @@ def __init__(self, **kwargs): | |||
|
|||
def get_cflags(self): | |||
cflags = super().get_cflags() | |||
if self.is_ww: | |||
cflags += ['-sWASM_WORKERS'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you make the above change you can also revert this change.
Sorry to keep asking your to make the change smaller, but I think if we can land it in its minimal form we can iterate from there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! Thanks for working on this!
I think I have a change that can help with the new/current build failure.. |
The MTLibrary subclass is designed for libraries that build in all different modes. I think the wasm_workers library and the plans pthreads library should not be inheriting from this class. This is needed for both emscripten-core#22683 and emscripten-core#22735.
The MTLibrary subclass is designed for libraries that build in all different modes. I think the wasm_workers library and the plans pthreads library should not be inheriting from this class. This is needed for both emscripten-core#22683 and emscripten-core#22735.
system/lib/wasm_worker/wasm_worker.S
Outdated
@@ -0,0 +1,64 @@ | |||
.extern __stack_pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, one last change. Can we call this file emscripten_wasm_worker_initialize.S
or wasm_worker_initialize.S
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
Followup to emscripten-core#22683. This change avoids making __stack_base and __stack_end into public/external symbols and avoids reimplementing `emscripten_stack_set_limits`.
Followup to emscripten-core#22683. This change avoids making __stack_base and __stack_end into public/external symbols and avoids reimplementing `emscripten_stack_set_limits`.
Followup to emscripten-core#22683. This change avoids making __stack_base and __stack_end into public/external symbols and avoids reimplementing `emscripten_stack_set_limits`.
Followup to emscripten-core#22683. This change avoids making __stack_base and __stack_end into public/external symbols and avoids reimplementing `emscripten_stack_set_limits`.
Followup to emscripten-core#22683. This change avoids making __stack_base and __stack_end into public/external symbols and avoids reimplementing `emscripten_stack_set_limits`.
Followup to #22683. This change avoids making __stack_base and __stack_end into public/external symbols and avoids reimplementing `emscripten_stack_set_limits`. This is a minor code size win too.
Currently, there are a couple of restrictions on the
SHARED_MEMORY
flag:SHARED_MEMORY
flag without thePTHREADS
orWASM_WORKERS
flags simply fails, because we don't have variations of the system libraries that have this flag standalone. Instead, emcc attempts to use the single-threaded system libraries, which are not compatible with the user's build because they don't have the correct wasm feature flags (i.e. -matomics and -mbulk-memory). This PR addsSHARED_MEMORY
-only variations of the system libraries, which allows users to build with a standaloneSHARED_MEMORY
flag without error.SHARED_MEMORY
also impliesIMPORTED_MEMORY
. However,IMPORTED_MEMORY
is only important for builds that use thePTHREADS
orWASM_WORKERS
flags, since they actually do multi-threading and need to read and write to the same memory across workers. This PR changes this so that if a standaloneSHARED_MEMORY
flag is specified, it can still avoid importing the memory.The use-case for enabling the
SHARED_MEMORY
flag without actually having an actual threading implementation is to be compatible with other modules which may import that memory whose import declaration specifies the memory as shared. This is an important use-case for Flutter, as we want to create two variations of the emscripten-generated module (one single-threaded and one multi-threaded, depending on whether the environment supports it) while keeping the module that imports the emscripten-generated module the same (which means keeping the memory import declaration as shared).