Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jit: Threads: Add Fence, Wait, and Notify #285

Merged
merged 1 commit into from
Sep 23, 2024

Conversation

matetokodi
Copy link
Contributor

No description provided.

@matetokodi matetokodi force-pushed the jit_threads_fence_wait_notify branch 2 times, most recently from 6baa21d to b33af1f Compare September 2, 2024 15:02
src/jit/ByteCodeParser.cpp Outdated Show resolved Hide resolved
src/jit/ByteCodeParser.cpp Outdated Show resolved Hide resolved
static void atomicNotifyCallback(ExecutionContext* context, sljit_s32 memOffset)
{
Instance* instance = context->instance;
Memory** memories = reinterpret_cast<Memory**>(reinterpret_cast<uintptr_t>(instance) + Instance::alignedSize());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we can get this with a helper function.

src/jit/MemoryUtilInl.h Outdated Show resolved Hide resolved
Instance* instance = context->instance;
Memory** memories = reinterpret_cast<Memory**>(reinterpret_cast<uintptr_t>(instance) + Instance::alignedSize());
if (!memories[0]->isShared()) {
context->tmp1[0] = 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an error code? What about returning with an error code from this function.

sljit_s32 offset = atomicWait32Operation->offset();

Operand* operands = instr->operands();
MemAddress addr(MemAddress::CheckNaturalAlignment, instr->requiredReg(0), instr->requiredReg(1), instr->requiredReg(2));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this uses MemAddress, then it needs to be in the MemoryInl.h

src/jit/MemoryUtilInl.h Outdated Show resolved Hide resolved

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp1), count.arg, count.argw);

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_R1, 0, memOffset.arg, memOffset.argw);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the address is already computed, can we used it in any way? The offset is not passed in any way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callbacks use offsets and not the absolute address, so I do not think so.

static sljit_s32 atomicWaitCallback(ExecutionContext* context, sljit_s32 memOffset, sljit_s32 size)
{
Instance* instance = context->instance;
if (!instance->memory(0)->isShared()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add some newlines to this function.

CompileContext* context = CompileContext::get(compiler);
sljit_s32 size = (instr->opcode() == ByteCode::MemoryAtomicWait64Opcode ? 8 : 4);

MemoryAtomicWait32* atomicWait32Operation = reinterpret_cast<MemoryAtomicWait32*>(instr->byteCode());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this type casting is valid for AtomicWait64?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still valid.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks somewhat confusing, because MemoryAtomicWait32 and MemoryAtomicWait64 are handled together here.
Would it be better to merge these two bytecodes into one unfied bytecode like MemoryAtomicWait?

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp2), timeout.arg, timeout.argw);
#endif /* SLJIT_32BIT_ARCHITECTURE */

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_R1, 0, memOffset.arg, memOffset.argw);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not this be SLJIT_MOV32? Furthermore, how the constant offset is passed?


sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp1), count.arg, count.argw);

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_R1, 0, memOffset.arg, memOffset.argw);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue with offset

@matetokodi matetokodi force-pushed the jit_threads_fence_wait_notify branch 2 times, most recently from 8f946f1 to 668df67 Compare September 9, 2024 15:09
Copy link
Collaborator

@zherczeg zherczeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch should be ready for review

@@ -271,6 +271,7 @@ jobs:
- name: Run Tests
run: |
$RUNNER --engine="$GITHUB_WORKSPACE/out/extended/walrus" wasm-test-extended
$RUNNER --jit --engine="$GITHUB_WORKSPACE/out/extended/walrus" wasm-test-extended
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this patch, does the extension director work with jit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by extension director?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean all tests will pass with jit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes they do.

@@ -1958,6 +1961,51 @@ static void compileFunction(JITCompiler* compiler)
operands[3] = STACK_OFFSET(atomicRmwCmpxchg->dstOffset());
break;
}
case ByteCode::AtomicFenceOpcode: {
Instruction* instr = compiler->append(byteCode, Instruction::AtomicFence, opcode, 0, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no other similar opcode? Data drop maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it together with UnreachableOpcode, which is also a 0-0 argument non-callback instruction (Data Drop is marked as a callback)

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp2), timeout.arg, timeout.argw);
#endif /* SLJIT_32BIT_ARCHITECTURE */

sljit_emit_op1(compiler, SLJIT_MOV32, SLJIT_R1, 0, memOffset.arg, memOffset.argw);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waht about sljit_s32 offset = atomicWait32Operation->offset(); immediate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized I forgot to use the modified wait callback that uses the absolute address; so I no longer need the offset, and use addr.basereg instead, like it was already used in notify.

CompileContext* context = CompileContext::get(compiler);
sljit_s32 size = (instr->opcode() == ByteCode::MemoryAtomicWait64Opcode ? 8 : 4);

MemoryAtomicWait32* atomicWait32Operation = reinterpret_cast<MemoryAtomicWait32*>(instr->byteCode());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still valid.

@@ -4,6 +4,7 @@
(memory 1 1 shared)

(func (export "init") (param $value i64) (i64.store (i32.const 0) (local.get $value)))
(func (export "initOffset") (param $value i64) (param $offset i32) (i64.store (local.get $offset) (local.get $value)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these files comes from the standard. We should not add custom modifications here. They should go separate files.

@matetokodi matetokodi force-pushed the jit_threads_fence_wait_notify branch 2 times, most recently from 1bced88 to 5bafd6d Compare September 10, 2024 11:47
@matetokodi matetokodi marked this pull request as ready for review September 10, 2024 12:37

(func (export "initOffset") (param $value i64) (param $offset i32) (i64.store (local.get $offset) (local.get $value)))

(func (export "i32.atomic.load") (param $addr i32) (result i32) (i32.atomic.load (local.get $addr)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the offset? It should look like:

i32.atomic.load offset=32 (local.get $addr))

The offset does not need to be aligned, just the final address.

(module
(memory 1 1 shared)

(func (export "initOffset") (param $value i64) (param $offset i32) (i64.store offset=33 (local.get $offset) (local.get $value)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better, but could we add some variations to the offset not just 33? Some aligned, some unaligned, and some bigger values.

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp2), timeout.arg, timeout.argw);
#endif /* SLJIT_32BIT_ARCHITECTURE */

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_R1, 0, addr.baseReg, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure it is if used with the MemAddress::CheckNaturalAlignment | MemAddress::AbsoluteAddress memory address flags; or should we use SLJIT_EXTRACT_REG(addr.memArg.arg) like it is in emitAtomic?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proper way is the latter.

@matetokodi matetokodi force-pushed the jit_threads_fence_wait_notify branch 2 times, most recently from 831011f to ae0618a Compare September 11, 2024 12:04
Copy link
Collaborator

@zherczeg zherczeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Instruction* instr = compiler->append(byteCode, Instruction::AtomicNotify, opcode, 2, 1);
instr->addInfo(Instruction::kIsCallback);

MemoryAtomicNotify* memoryAtomicWait = reinterpret_cast<MemoryAtomicNotify*>(byteCode);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a typo (memoryAtomicWait -> memoryAtomicNotify)

CompileContext* context = CompileContext::get(compiler);
sljit_s32 size = (instr->opcode() == ByteCode::MemoryAtomicWait64Opcode ? 8 : 4);

MemoryAtomicWait32* atomicWait32Operation = reinterpret_cast<MemoryAtomicWait32*>(instr->byteCode());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks somewhat confusing, because MemoryAtomicWait32 and MemoryAtomicWait64 are handled together here.
Would it be better to merge these two bytecodes into one unfied bytecode like MemoryAtomicWait?

@matetokodi matetokodi force-pushed the jit_threads_fence_wait_notify branch 2 times, most recently from 012326d to 815c9b5 Compare September 13, 2024 15:19
@@ -827,6 +828,33 @@ class Const128 : public ByteCode {
uint32_t m_value[4];
};

// dummy ByteCode for 3-input 1-output operation with offset
class TernaryOperationOffset : public ByteCode {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the other common bases, the name should be ByteCodeOffset4Value. The class should be declared after ByteCodeOffset2Value.

@matetokodi matetokodi force-pushed the jit_threads_fence_wait_notify branch 2 times, most recently from 161c745 to 430f844 Compare September 16, 2024 09:48
Add a few test cases with non-zero offsets

Signed-off-by: Máté Tokodi [email protected]
@zherczeg
Copy link
Collaborator

@clover2123 could you review this patch?

Copy link
Collaborator

@clover2123 clover2123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@clover2123 clover2123 merged commit e5e2c64 into Samsung:main Sep 23, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants