Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node-redis Client Refuses to Buffer Commands Upon An ECONNRESET Error From Socket Connection #2785

Open
sohhaofeng1989 opened this issue Jul 5, 2024 · 1 comment
Labels

Comments

@sohhaofeng1989
Copy link

sohhaofeng1989 commented Jul 5, 2024

Description

Recently, we noticed that the node-redis client refuses to buffer redis commands upon encountering an ECONNRESET Error from the underlying socket connection. This is demonstrate-able by doing the following.

  1. Set up a node-redis client that connects to a locally spun-up redis-server
  2. Perform the following hardcoded 10s sleep/delay before a SET command gets executed
    try {
        console.log(
          'Execute iptables command to reject TCP connections with RST: iptables -A OUTPUT -p tcp --dport 6379 -j REJECT --reject-with tcp-reset\n' +
            'Execute iptables command to drop packets received on TCP connection (simulating request timeouts): iptables -A OUTPUT -p tcp --dport 6379 -j DROP\n',
          'Execute iptables command to simulate host unreachable (icmp-host-unreachable): iptables -A OUTPUT -p tcp --dport 6379 -j REJECT --reject-with icmp-host-unreachable',
        )
       await new Promise((resolve) => setTimeout(resolve, 10000))
       await this.client.set(key, value, setConfig)
    } catch (error) {
      console.log(error)
    }
  1. Execute the script/server and once the 10s timer starts, run the following iptables command to simulate an ECONNRESET.
iptables -A OUTPUT -p tcp --dport 6379 -j REJECT --reject-with tcp-reset
  1. Wait for the 10s to be up and notice that the SET command did not get buffered into the offline queue (to be re-sent once the socket connection gets re-established). Instead, it throws an ECONNRESET error right away like the below
2024-07-05 19:40:22     error: {
2024-07-05 19:40:22       "errno": -104,
2024-07-05 19:40:22       "code": "ECONNRESET",
2024-07-05 19:40:22       "syscall": "read",
2024-07-05 19:40:22       "name": "Error",
2024-07-05 19:40:22       "message": "read ECONNRESET",
2024-07-05 19:40:22       "stack":
2024-07-05 19:40:22           Error: read ECONNRESET
2024-07-05 19:40:22               at TCP.onStreamRead (node:internal/stream_base_commons:217:20)
2024-07-05 19:40:22               at TCP.callbackTrampoline (node:internal/async_hooks:130:17)
2024-07-05 19:40:22     }
  1. Corresponding logs that were captured by subscribing to the error Connection Events.
2024-07-05 19:40:22 [11:40:22.131] INFO (410): redis store client disconnected Error: read ECONNRESET
2024-07-05 19:40:22     status: "INFO"
2024-07-05 19:40:22     scope: "RedisStore"
2024-07-05 19:40:22 [11:40:22.128] INFO (410): redis store client attempting to reconnect to redis server
2024-07-05 19:40:22     status: "INFO"
2024-07-05 19:40:22     scope: "RedisStore"
2024-07-05 19:40:22 [11:40:22.134] INFO (410): redis store client disconnected Error: connect ECONNREFUSED 172.18.0.3:6379
2024-07-05 19:40:22     status: "INFO"
2024-07-05 19:40:22     scope: "RedisStore"
2024-07-05 19:40:22 [11:40:22.136] INFO (410): redis store client attempting to reconnect to redis server
2024-07-05 19:40:22     status: "INFO"
2024-07-05 19:40:22     scope: "RedisStore"
2024-07-05 19:40:22 [11:40:22.137] INFO (410): redis store client disconnected Error: connect ECONNREFUSED 172.18.0.3:6379
2024-07-05 19:40:22     status: "INFO"
2024-07-05 19:40:22     scope: "RedisStore"

It's noteworthy to point out 3 things

  1. node-redis does buffer redis commands for all other socket connection errors like ECONNREFUSED , EADDRNOTAVAIL etc. Only ECONNRESET doesn't trigger the offline queue buffering mechanism
  2. Just to make sure that we ain't trippin or this isn't some Redis-prescribed agreement to handle ECONNRESET Errors. We did a side-by-side comparison with the other popular redis client library ioredis and found out that ioredis does buffer redis commands into its offline queue upon ECONNRESET.
  3. Some context here is that we're using AWS Elasticache Serverless for Redis. From time to time, AWS does some Service Updates that are transparent to its users. These Service Updates runs upgrades on their Elasticache Serverless proxies every 1 to 3 weeks, and whenever the proxy is upgraded, there will be disconnections that result in the aforementioned ECONNRESET errors. Whenever this happens, we sometimes drop user traffic which is definitely undesirable.

Node.js Version

20.8.1

Redis Server Version

7.0.7-alpine

Node Redis Version

4.6.12

Platform

Alpine Linux

Logs

2024-07-05 19:40:22     error: {
2024-07-05 19:40:22       "errno": -104,
2024-07-05 19:40:22       "code": "ECONNRESET",
2024-07-05 19:40:22       "syscall": "read",
2024-07-05 19:40:22       "name": "Error",
2024-07-05 19:40:22       "message": "read ECONNRESET",
2024-07-05 19:40:22       "stack":
2024-07-05 19:40:22           Error: read ECONNRESET
2024-07-05 19:40:22               at TCP.onStreamRead (node:internal/stream_base_commons:217:20)
2024-07-05 19:40:22               at TCP.callbackTrampoline (node:internal/async_hooks:130:17)
2024-07-05 19:40:22     }
@leibale
Copy link
Collaborator

leibale commented Jul 22, 2024

I'll need to debug a bit more, but for now I'll at least explain what should happen:
node-redis have 2 queues: waiting to be sent, and waiting for reply. When you execute a command it starts in the "waiting to be sent" queue, and after writing it to the socket (using nodejs socket.write) it moves to the "waiting for reply" queue.
When a socket drops, the client rejects all "waiting for reply" commands since they might have been executed and might not, and then if the "offline queue" is enabled (see disabledOfflineQueue), it'll keep the "waiting to be sent" queue, and executed them once a new connection established.
maybe socket.write does not write the data immediately in some cases (?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants