Describe the bug
A user reported the following sequence of events:
At some point, their Worker logs the following ERROR-level message: { code: "ERR_WORKER_OUT_OF_MEMORY" }.
From that point on, the Worker appears to make no progress on pending Workflow Tasks, even though the Worker process is still running.
On the Server side, metrics show a considerable increase in calls to PollWorkflowExecutionHistory and a considerable reduction in calls to PollWorkflowTaskQueue. The data provided does not identify which Workers those calls come from (the user normally runs ~12 active Workers on that Namespace).
The situation continues until the Worker process is restarted, after which Workflow progress resumes.
Analysis
ERR_WORKER_OUT_OF_MEMORY is an error from Node itself. It means that Node terminated a Worker Thread because that thread reached its memory limit.
Given the symptoms, it is reasonable to assume that the Worker Thread killed by Node was the Workflow Worker Thread, which means the Temporal Worker can no longer process incoming Workflow Tasks.
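To illustrate the failure mode outside of Temporal, here is a minimal sketch using only Node's `worker_threads` module. The helper names `isWorkerOutOfMemory` and `runLeakyThread`, the 16 MB limit, and the allocation loop are assumptions made for this example; none of them come from the SDK or from the user's report.

```typescript
import { Worker } from "node:worker_threads";

// Returns true when `err` carries the code Node attaches to the error it
// emits after terminating a worker thread that reached its memory limit.
function isWorkerOutOfMemory(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "ERR_WORKER_OUT_OF_MEMORY"
  );
}

// Spawns a thread with a deliberately small heap limit (an arbitrary value
// chosen for this sketch) and lets it allocate until Node terminates it.
// Resolves with whatever error the parent thread receives.
function runLeakyThread(): Promise<unknown> {
  const source = `
    const chunks = [];
    while (true) chunks.push(new Array(1e6).fill(0));
  `;
  return new Promise((resolve) => {
    const thread = new Worker(source, {
      eval: true,
      resourceLimits: { maxOldGenerationSizeMb: 16 },
    });
    thread.once("error", resolve);
  });
}
```

Calling `runLeakyThread().then((err) => console.log(isWorkerOutOfMemory(err)))` should let you observe the error on the parent side. Only the thread is terminated, and the parent process keeps running unless something reacts to the error, which is consistent with the reported symptom of a live process that makes no progress.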
What happens next is not clear, but the correct behavior would be for the Worker to initiate shutdown. There’s very little we can do at that point, but at least, let’s not pretend that everything’s all right. The Worker should probably print a clear CRITICAL-level message to the log and terminate the process ASAP.
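The fail-fast behavior proposed above could look roughly like the sketch below. It is not the SDK's implementation: `FailFastDeps`, `defaultDeps`, and `onWorkflowThreadError` are hypothetical names, and the logger and exit function are injected only so that the policy can be exercised without actually killing a process.

```typescript
// Hypothetical dependencies, injected so the policy can be tested in isolation.
interface FailFastDeps {
  logCritical: (message: string, meta: Record<string, unknown>) => void;
  exit: (code: number) => void;
}

// Defaults for real use: write to stderr and terminate the process.
const defaultDeps: FailFastDeps = {
  logCritical: (message, meta) => {
    console.error(`CRITICAL: ${message}`, meta);
  },
  exit: (code) => {
    process.exit(code);
  },
};

// Intended to be called from the 'error' listener of the thread that runs
// Workflow code. Node emits 'error' only after the thread has terminated,
// so there is nothing left to recover: log loudly, then exit non-zero.
function onWorkflowThreadError(
  err: unknown,
  deps: FailFastDeps = defaultDeps
): void {
  const code =
    typeof err === "object" && err !== null
      ? (err as { code?: unknown }).code
      : undefined;
  deps.logCritical(
    "Workflow thread terminated; this Worker can no longer process Workflow Tasks",
    { code }
  );
  deps.exit(1);
}
```

Exiting with a non-zero status leaves the restart to whatever supervises the process, which matches the observation that Workflow progress resumes once the Worker process is restarted.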