emscripten threading #855

Open
digitalsignalperson opened this issue May 28, 2024 · 16 comments

Comments

@digitalsignalperson

Hi, I'm curious what the challenges are to moving forward with Emscripten threading.

As of today:

> Emscripten has support for multithreading using SharedArrayBuffer in browsers. That API allows sharing memory between the main thread and web workers as well as atomic operations for synchronization, which enables Emscripten to implement support for the Pthreads (POSIX threads) API. This support is considered stable in Emscripten.

from https://emscripten.org/docs/porting/pthreads.html

@mackron
Owner

mackron commented May 31, 2024

That first note at the top of that article isn't something I find appealing. I'm not much of a web person and I don't know anything about COOP or COEP so not entirely sure what the implications are on that front, but miniaudio needs to "just work", so certainly enabling pthreads wholesale without an option to disable it sounds bad considering that note.

When using pthreads with Emscripten, is it using actual real threads, or is it just emulating it? If it's just emulating it, what are the tangible real-world benefits you'd get out of it? Looking at that article they make it sound like it's real threads?

@digitalsignalperson
Author

digitalsignalperson commented Jun 1, 2024

Thanks for those questions. Looking into it a bit, this is what I understand:

  • by default emscripten implements the pthreads api, and any code using pthreads will appear to work but will actually run in a single thread
  • using -pthread compiler flag actually makes code using pthreads multi-threaded, but the COOP/COEP headers need to be set by the webserver for this to work

So for miniaudio, I think if the emscripten builds used pthreads, everything "just works". And if anyone wants to do the extra work to compile with -pthread and serve their site with COOP/COEP headers, then it doesn't actually change any code on the miniaudio side.
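
A minimal sketch of code that behaves sensibly in both build modes: it tries to start a worker with pthread_create() and runs the job inline if that call fails. The example_ names are hypothetical and not part of miniaudio, and the sketch deliberately checks the return value instead of assuming how a build without -pthread behaves.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical job type used only for this illustration. */
typedef struct {
    int input;
    int result;
} example_job;

/* Thread entry point: doubles the input and stores it in the job. */
static void* example_worker(void* arg)
{
    example_job* job = (example_job*)arg;
    job->result = job->input * 2;
    return NULL;
}

/*
 * Runs the job on a separate thread when one can be created, otherwise
 * runs it inline on the calling thread.
 * Returns 1 if a thread was used, 0 if the job ran inline.
 *
 * Note: pthread_join() blocks the caller. In a browser build, blocking
 * the main thread is discouraged, so real code would likely avoid
 * joining there.
 */
static int example_run_job(example_job* job)
{
    pthread_t thread;

    if (pthread_create(&thread, NULL, example_worker, job) == 0) {
        pthread_join(thread, NULL);
        return 1;
    }

    example_worker(job); /* Fallback: no thread available. */
    return 0;
}
```

Either way the caller sees the same result in the job struct, which is the sense in which the calling code does not need to change between the two build modes.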

This blog was helpful https://unlimited3d.wordpress.com/2021/12/21/webassembly-and-multi-threading/ including the sections on "Cross-origin isolation headers" / "Isolating multi-threaded WebAssembly – what for?" to motivate why COOP/COEP are involved.

@mackron
Owner

mackron commented Jun 3, 2024

If I'm reading the Emscripten documentation correctly, it looks like __EMSCRIPTEN_PTHREADS__ will be defined if -pthread is being used. That, combined with it using actual real threads, probably makes it a reasonable thing to support in miniaudio. I'm assuming if __EMSCRIPTEN_PTHREADS__ is enabled, we just use pthreads like any other platform, and otherwise just leave it like it is now. Don't expect there to be too much additional code maintenance. I'll leave this ticket open and investigate when I get a chance. No time frame. Thanks for making me aware of this.
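
A minimal sketch of the compile-time switch described above, assuming __EMSCRIPTEN_PTHREADS__ is defined only when -pthread is used. The example_config struct and the example_ and EXAMPLE_ names are hypothetical stand-ins so the snippet compiles on its own; they are not miniaudio's real types or macros.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a library config; not miniaudio's real struct. */
typedef struct {
    unsigned job_thread_count; /* Number of worker threads to start. */
    bool     no_threading;     /* True when all work must run inline. */
} example_config;

/*
 * Assumption: real threads exist everywhere except Emscripten builds
 * that were compiled without -pthread.
 */
#if defined(__EMSCRIPTEN__) && !defined(__EMSCRIPTEN_PTHREADS__)
    #define EXAMPLE_HAS_THREADS 0
#else
    #define EXAMPLE_HAS_THREADS 1
#endif

/* Builds a config, forcing the thread count to zero when threads are unavailable. */
static example_config example_config_init(unsigned requested_threads)
{
    example_config config;
    config.job_thread_count = EXAMPLE_HAS_THREADS ? requested_threads : 0;
    config.no_threading     = (config.job_thread_count == 0);
    return config;
}
```

On a native build, or an Emscripten build with -pthread, the requested thread count passes through unchanged; on an Emscripten build without -pthread it collapses to zero, which matches the "leave it like it is now" behaviour.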

@teropa

teropa commented Aug 20, 2024

> So for miniaudio, I think if the emscripten builds used pthreads, everything "just works". And if anyone wants to do the extra work to compile with -pthread and serve their site with COOP/COEP headers, then it doesn't actually change any code on the miniaudio side.

This seems to be the case. Building for the Emscripten worklets API with pthreads API enabled I can see an ma_resource_manager_job_thread running as a separate Web Worker process. Seems to work just fine.

@digitalsignalperson
Author

@teropa is the performance ok for you? My miniaudio dataCallback does some heavy lifting but curiously I didn't notice any performance difference with pthreads enabled.

I've been experimenting with this in a sokol project.
I had to include -pthread -Wl,-u,_emscripten_run_callback_on_thread for it to compile.

Also, as a hack for the http-server the project uses, I modify .local/lib/node_modules/http-server/lib/http-server.js to add

  this.headers['Cross-Origin-Embedder-Policy'] = 'require-corp';
  this.headers['Cross-Origin-Opener-Policy'] = 'same-origin';

I haven't explored all the considerations in https://emscripten.org/docs/porting/pthreads.html and there's other flags like PTHREAD_POOL_SIZE https://emscripten.org/docs/tools_reference/settings_reference.html#pthread-pool-size

I'm also curious about AudioWorklets per
https://emscripten.org/docs/api_reference/wasm_audio_worklets.html

> Audio Worklets API is based on the Wasm Workers feature. It is possible to also enable the -pthread option while targeting Audio Worklets, but the audio worklets will always run in a Wasm Worker, and not in a Pthread.

Which sounds like audio will be in another thread (without pthread support, and no change if including pthread support), but my program freezes at runtime when I try including -DMA_ENABLE_AUDIO_WORKLETS -sAUDIO_WORKLET=1 -sWASM_WORKERS=1 -sASYNCIFY, and I haven't debugged further.

@mackron
Owner

mackron commented Aug 20, 2024

@teropa It surprised me to read that you have a ma_resource_manager_job_thread instance running because I thought I explicitly disabled threading on the Emscripten build:

/* The Emscripten build cannot use threads. */
#if defined(MA_EMSCRIPTEN)
{
    resourceManagerConfig.jobThreadCount = 0;
    resourceManagerConfig.flags |= MA_RESOURCE_MANAGER_FLAG_NO_THREADING;
}
#endif

Are you using ma_engine? Or are you using a self-managed ma_resource_manager? I'm wondering if that might be working for you by coincidence rather than by design.

@digitalsignalperson
Author

For me, I'm using:

#define MINIAUDIO_IMPLEMENTATION
#define MA_ENABLE_ONLY_SPECIFIC_BACKENDS
#if defined(__EMSCRIPTEN__)
    #define MA_ENABLE_WEBAUDIO
    #define MA_NO_RESOURCE_MANAGER
#endif

@teropa

teropa commented Aug 21, 2024

@digitalsignalperson Performance seems good, though I've yet to measure it systematically. My audio processing is fairly light, and this translates to the audio thread being about 98% idle most of the time. The job pthread where I'm doing opus decoding looks much busier though.

Compilation flags: -pthread
Linker flags: -sASYNCIFY -sAUDIO_WORKLET=1 -sWASM_WORKERS=1 -pthread -sPTHREAD_POOL_SIZE=2 -sALLOW_MEMORY_GROWTH
miniaudio flags: -DMA_ENABLE_AUDIO_WORKLETS -DMA_AUDIO_WORKLETS_THREAD_STACK_SIZE=524288

The pthread pool size could probably be just 1, but I'm using an additional one for my own purposes. The trickiest bit was finding out I had to increase MA_AUDIO_WORKLETS_THREAD_STACK_SIZE as there was an obscure Emscripten error from the worklet thread otherwise. Running with the clang address sanitizer uncovered that problem as running out of stack space.

And yeah, we do also have COOP/COEP headers enabled. I assume the shared memory via SharedArrayBuffer just would not work otherwise.

> Which sounds like audio will be in another thread (without pthread support, and no change if including pthread support), but my program freezes at runtime when I try including the -DMA_ENABLE_AUDIO_WORKLETS -sAUDIO_WORKLET=1 -sWASM_WORKERS=1 -sASYNCIFY and I haven't debugged further.

Right, with worklets enabled audio will always be on the Web Audio thread created by the browser, not a pthread created by emscripten. I was happy to find that's all pretty transparent with the Emscripten Audio Worklet support though. It creates the audio context, thread, and worklet, and I didn't really have to think about it. I haven't experienced any freezes either. I'm not doing any capture, so I assume that also simplifies things somewhat.

@teropa

teropa commented Aug 21, 2024

@mackron Right, yes, I'm wiring up my own ma_resource_manager with an Opus decoder backend. So I assume that's why I'm not hitting the code path where you disable threading.

@mackron
Owner

mackron commented Aug 21, 2024

Looking at the code, it looks like I disable threading in ma_engine, but I don't at the ma_resource_manager level. This was unintentional. When I first added Emscripten support, pthreads was experimental and my intention was to just not do any threading at all. With the exception of that code snippet I posted earlier, is there anything I need to do to allow you to use -pthread as miniaudio stands right now in your particular cases?

@digitalsignalperson
Author

I don't think I have it working, but I'm still learning the ropes of how to actually debug things in the browser.

Is the worklet part required? With just -pthread -Wl,-u,_emscripten_run_callback_on_thread -sPTHREAD_POOL_SIZE=1 I see the extra thread created in the devtools debugger on firefox

image

If I try to pause execution in this thread, the button greys out and says "Waiting for next execution".

image

In the debug build without -pthread my demo is showing 90fps initially, then when I click "Allow" to use my microphone, it drops to 54fps. When I enable -pthread it's exactly the same.

If I additionally try to enable audio worklets with -DMA_ENABLE_AUDIO_WORKLETS -sAUDIO_WORKLET=1 -sWASM_WORKERS=1 -sASYNCIFY, I get an assertion failure during ma_device_init() with this traceback:

printErr
abort
___assert_fail
x
ma_device__on_notification
ma_device__on_notification_unlocked
x
createExportWrapper
unlock
(Async: promise callback)
unlock
(Async: EventListener.handleEvent)
881419
881419
runEmAsmFunction
_emscripten_asm_const_int
x
ma_context_init__webaudio
ma_context_init
ma_device_init_ex
ma_device_init

For other things on the miniaudio side, I saw one #if !defined(__EMSCRIPTEN__) that seems like it can be removed. All the pthread functions are implemented, and while pthread_attr_setschedpolicy() and pthread_attr_setschedparam() are no-op, pthread_attr_setstacksize() does set the stacksize and pthread_create() uses it. Emscripten implementations here https://github.com/emscripten-core/emscripten/tree/main/system/lib/libc/musl/src/thread
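
As an illustration of the stack-size point above, here is a sketch that requests an explicit stack size through pthread_attr_setstacksize() before calling pthread_create(). The example_ names are hypothetical and not taken from miniaudio or Emscripten.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread entry point: sets the flag it was given so the caller can see it ran. */
static void* example_entry(void* arg)
{
    *(int*)arg = 1;
    return NULL;
}

/*
 * Starts a thread with the requested stack size and waits for it.
 * Returns 0 on success, or the error code of the first call that failed.
 */
static int example_run_with_stack(size_t stack_size, int* ran)
{
    pthread_attr_t attr;
    pthread_t thread;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }

    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0) {
        rc = pthread_create(&thread, &attr, example_entry, ran);
        if (rc == 0) {
            rc = pthread_join(thread, NULL);
        }
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```

Calling example_run_with_stack(512 * 1024, &ran) would request the same 524288 bytes used for MA_AUDIO_WORKLETS_THREAD_STACK_SIZE earlier in this thread; that figure is only reused here for illustration, not as a recommendation.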

@mackron
Owner

mackron commented Aug 22, 2024

@digitalsignalperson Try doing a fresh sync of the dev branch and try again. It might be fixed with this PR #888.

@digitalsignalperson
Author

@mackron On the dev branch, using the audio worklet flags does seem to resolve the assertion failure, but my app now renders one frame and then freezes, with no error messages and no response to input. I can get the devtools debugger to pause, seemingly only in a registerOrRemoveHandler() JavaScript function. Maybe it's something on my end, though it works as expected without the audio worklet flags. I'll have to figure out how to debug the wasm and step through it or something.

@teropa

teropa commented Aug 22, 2024

@digitalsignalperson Does your app work if you do pure playback (no microphone activation / capture)? I haven't tested that side of things and I know capture brings in a whole bunch of additional machinery on the web. Might help narrow things down.

@teropa

teropa commented Aug 22, 2024

> With the exception of that code snippet I posted earlier, is there anything I need to do to allow you to use -pthread as miniaudio stands right now in your particular cases?

For our case (playback only, managed resource manager, audio worklets enabled) everything seems to be running smoothly with pthreads with the latest from dev branch. Have tested on current versions of Chrome, Firefox, Safari (Mac+iOS).

On Safari, especially on iOS, I'm seeing some memory issues triggered by pthreads and shared memory but I don't believe that's a miniaudio problem: emscripten-core/emscripten#19374

@digitalsignalperson
Author

@teropa great suggestion, thank you. My app is entirely audio capture.

To test this, I hacked my init function:

#ifndef NO_CAPTURE_TEST
    ma_device_config deviceConfig = ma_device_config_init(ma_device_type_capture);
    deviceConfig.capture.format = ma_format_f32;
    deviceConfig.capture.channels = 2;
#else
    ma_device_config deviceConfig = ma_device_config_init(ma_device_type_playback);
    deviceConfig.playback.format = ma_format_f32;
    deviceConfig.playback.channels = 2;
#endif

and in my dataCallback

void dataCallback(ma_device* pDevice, void* pOutput, const void* pInput, ma_uint32 num_frames) {
    (void) pOutput; // unused
#ifndef NO_CAPTURE_TEST
    float* in_buffer = (float*)pInput;
    unsigned channels = pDevice->capture.channels;
#else
    (void) pInput;
    float in[2048];
    float* in_buffer = in;
    unsigned channels = 1;
    for (int i = 0; i < 2048; i++) {
        in[i] = float(i)/512 - 1.0f;
    }
#endif

So now I'm faking capture with a simple triangle wave, meanwhile it's operating in playback mode.

When I do this, I'm still seeing no performance improvement with or without -pthread. I'll have to figure out how to do proper wasm debugging when I have time.
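
For reference, the fill loop above writes a single ramp from -1.0 up to roughly 3.0 across the 2048 samples rather than a bounded triangle. If a true triangle wave in [-1, 1] is wanted as the fake input, a sketch could look like this. example_fill_triangle and its period parameter are hypothetical names, not from the original code.

```c
#include <stddef.h>

/*
 * Fills `out` with `count` samples of a triangle wave in [-1, 1].
 * `period` is the length of one full cycle in samples. The wave starts
 * at -1, rises to +1 at the half period, then falls back towards -1.
 * Does nothing if `period` is zero.
 */
static void example_fill_triangle(float* out, size_t count, size_t period)
{
    size_t i;

    if (period == 0) {
        return;
    }

    for (i = 0; i < count; i++) {
        /* Position within the current cycle, normalised to [0, 1). */
        float t = (float)(i % period) / (float)period;

        /* Rising half maps [0, 0.5) to [-1, 1); falling half maps [0.5, 1) to (1, -1]. */
        out[i] = (t < 0.5f) ? (4.0f * t - 1.0f) : (3.0f - 4.0f * t);
    }
}
```

In the callback shown above this could replace the loop with something like example_fill_triangle(in, 2048, 256), where 256 is an arbitrary period chosen for illustration.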
