Is it possible to pass a socket handle directly to uWS? #766

Open
hunterloftis opened this issue Jul 20, 2022 · 28 comments

Comments

@hunterloftis

hunterloftis commented Jul 20, 2022

E.g., something equivalent to handleUpgrade.

The reason this is useful is that, combined with passing socket handles across processes, you can route all sockets based on the same topic to the same process in a node cluster:

@hunterloftis
Author

That's what it looked like to me too, just wanted to check. Thanks for the quick reply @e3dio!

@ghost

ghost commented Jul 20, 2022

It's still a good feature request

@ghost ghost reopened this Jul 20, 2022
@uNetworkingAB
Contributor

I'm closing this. It's not a bad idea, but it is too complex for the gain: just let sockets fall wherever they fall and use pub/sub or direct connections based on some URL query value or something like that. The point is, you can achieve pretty good results without this kind of low-level hackery. But yes, it is a good idea. But no, I have no time to implement and maintain this edge case (and it kind of goes against the simple nature of uWS).

@myndzi

myndzi commented Aug 28, 2024

Wondering if you'll reconsider this. My use-case here is graceful deployments for an application with long-lived websockets.

I help maintain a game server that uses websockets to communicate. Since interrupting game sessions is very painful to the players, deployments are done by launching a new instance of the server and sending new connections there, which allows the old instance to remain running while existing players finish their games. It's not perfect, since if players get disconnected they can't reconnect. What I'm looking for is the ability to route connections based on ... anything I can control. The hostname (via SNI), a header, a path or query component. This would allow me to tell the game which address to connect to and route it to the same process every time, as well as change the "active" server to execute deployments, without needing proxy infrastructure.

I've tried various solutions.

  • On-host reverse proxy (nginx) adds an unacceptable amount of CPU overhead. I've done my best to minimize the bulk of the position data being passed frequently, but the fanout still gets extreme with ~100 players, and passing the data between two processes on the host instead of one more or less doubles the CPU usage.
  • Listening on different ports causes problems for some users when the websocket connection is not using the default port (443)
  • Cloudflare can proxy websockets, but it frequently terminates them (presumably as their infrastructure shifts around)
  • DigitalOcean does allow you to bind up to two IP addresses per host, but it's a bit weird and painful to manage. I'm currently using Cloudflare, but can't recall the specific other problems I encountered with the multi-IP solution
  • The previous comment mentions "direct connections based on some URL query value or something like that", but I don't see any way to do that with this library
  • As best I can tell, listening on the same port via multiple processes (as mentioned in uWebSockets#1360, "Ability to handle an incoming connection from a socket descriptor (enabling multi-process)") results in the receiving process being chosen at random. I'm also not sure whether I can use that approach when running the server under Docker.
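
The "route on anything I can control" requirement above can be read as a pure lookup from the request URL to a server instance. A minimal sketch, assuming a hypothetical `instance` query parameter and made-up instance names; none of this is uWS.js API, it only shows the selection rule an acceptor or proxy would need to apply:

```javascript
// Hypothetical routing rule: a client that already belongs to an instance
// names it in the query string; everyone else goes to the active instance.
function pickInstance(requestUrl, liveInstances, activeInstance) {
  // The base is only needed so that URL can parse a path-only request target.
  const { searchParams } = new URL(requestUrl, 'http://placeholder.invalid');
  const requested = searchParams.get('instance');
  if (requested !== null && liveInstances.has(requested)) {
    return requested; // a reconnect lands on the instance that holds the game
  }
  return activeInstance; // new players, or the requested instance is gone
}

// During a deployment both instances are alive, but only "v2" is active.
const live = new Set(['v1', 'v2']);
console.log(pickInstance('/ws?instance=v1', live, 'v2')); // v1
console.log(pickInstance('/ws', live, 'v2'));             // v2
console.log(pickInstance('/ws?instance=v0', live, 'v2')); // v2
```

Changing the "active" instance is then just a matter of passing a different third argument, while existing players keep reconnecting to the instance they name.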

There are still solutions available, such as standing up another server to act as a reverse proxy, but the game is too niche to justify the extra operational overhead and it's undesirable to double the infrastructure cost when a software solution is available. I'm considering just switching back to ws since it will allow me to make the user experience better (assuming I can integrate ws with Node's IPC socket transfer), and uWS.js would have to be twice as performant as ws for that to be a downgrade vs the on-host reverse-proxy solution. Last time I measured it, that was not the case (since the limiting factor is pubsub throughput).

I understand the perspective of "we have defined the scope of our library and are holding to it", but it's a bit of a shame whenever a great library exists and I can't use it to meet my goals :(

@uasan

uasan commented Aug 28, 2024

But we need a mapping from FD to user context (session). This is necessary to restore the connection to the user's session, and for such a mapping we need a response.getFd() method.

Then the easiest option is to make a cluster (as in your example) where the main process stores the FDs and the session mapping. Child workers of this process can be restarted; when a worker starts, it takes the open FDs and the session mapping from the parent process.
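
The bookkeeping described here could look roughly like the sketch below. It assumes the parent process can learn each connection's descriptor somehow (response.getFd() is a request in this comment, not a method the thread shows to exist); `SessionRegistry` and its method names are made up for illustration:

```javascript
// Hypothetical parent-process registry: file descriptor -> session context.
class SessionRegistry {
  constructor() {
    this.sessions = new Map();
  }

  // Remember which session a descriptor belongs to.
  bind(fd, session) {
    this.sessions.set(fd, session);
  }

  // Forget a descriptor once its connection is closed.
  release(fd) {
    return this.sessions.delete(fd);
  }

  // Snapshot handed to a freshly started worker so it can restore sessions.
  snapshot() {
    return [...this.sessions].map(([fd, session]) => ({ fd, session }));
  }
}

const registry = new SessionRegistry();
registry.bind(12, { player: 'alice', room: 'lobby' });
registry.bind(13, { player: 'bob', room: 'lobby' });
registry.release(13);
console.log(JSON.stringify(registry.snapshot()));
// [{"fd":12,"session":{"player":"alice","room":"lobby"}}]
```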

@uNetworkingAB uNetworkingAB reopened this Aug 28, 2024
@uNetworkingAB
Contributor

Have you looked at the very latest release which changes approach for worker threads? See https://github.com/uNetworking/uWebSockets.js/releases/tag/v20.48.0

This release adds App.getDescriptor() and App.addChildAppDescriptor

What you need, if I understand, is exactly this but you also need App.removeChildAppDescriptor so that you can unregister a child (worker) instance when swapping.

Alternatively we just expose App.adoptSocket(fd) and you do the distribution yourself (not as efficient). With adoptSocket(fd) you should be able to do whatever you want, really. Using the Node.js net.Server as listener.

@uasan

uasan commented Aug 28, 2024

Oh, thanks, I didn't know about these methods.

Instead of a Node.js server for accepting sockets, it's better to use systemd socket units (systemd.socket). By the way, they can keep socket descriptors open when the process is restarted.

Thanks!

@myndzi

myndzi commented Aug 28, 2024

Adding removeChildAppDescriptor does seem like it would get the job done.

Exposing adoptSocket would potentially be more generally useful since it allows for more flexibility. Being a little "less efficient" in accepting the sockets is a completely worthwhile tradeoff here (to me) since it allows for accepting reconnects to the "shutting down" instance and less complex code for clustering players in the same room together(?).

It remains unclear, though, whether it's possible to gain any information about the socket before distributing it to the worker. If I were to use SNI for example, I still have to accept the TCP connection and negotiate the TLS handshake to know which host the connection was intended for. So the socket would need to be handed over once the connection is established. Using HTTP data (path, query string, header...), the initial request would have to be parsed before selecting a worker, handing the socket over before handling the websocket upgrade.
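
To make the HTTP case concrete: before a worker can be chosen from the path, query string, or a header, something has to read the request head off the socket. A minimal sketch using only Node.js built-ins; `parseRequestHead` is a hypothetical helper, not part of uWS.js, and it skips the validation a real HTTP parser performs:

```javascript
// Extract the routing inputs from the bytes received so far.
// Returns null until the full request head (ending in CRLF CRLF) is buffered.
function parseRequestHead(buffer) {
  const text = buffer.toString('latin1'); // one char per byte, so indexes match
  const end = text.indexOf('\r\n\r\n');
  if (end === -1) return null; // keep reading, the head is incomplete

  const [requestLine, ...headerLines] = text.slice(0, end).split('\r\n');
  const [method, target] = requestLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // ignore malformed lines in this sketch
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  // headLength is the number of bytes consumed from the socket so far.
  return { method, target, headers, headLength: end + 4 };
}

const head = Buffer.from(
  'GET /ws?room=42 HTTP/1.1\r\nHost: game.example\r\nUpgrade: websocket\r\n\r\n',
  'latin1'
);
const parsed = parseRequestHead(head);
console.log(parsed.target, parsed.headers.host); // /ws?room=42 game.example
console.log(parseRequestHead(Buffer.from('GET /ws HTTP/1.1\r\nHost: ga'))); // null
```

Any bytes consumed this way would have to reach the chosen worker along with the socket, which is part of what makes handing over an already-established connection complex.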

@uNetworkingAB
Contributor

I can add both adoptSocket(fd) and removeChildAppDescriptor. SNI sounds like a nightmare.

@uNetworkingAB
Contributor

Once the latest binaries have built, they should include App.adoptSocket(fd), which should make it work for you.

@uasan

uasan commented Aug 28, 2024

It's strange, in FD you store context, usually context specific to all applications and can be expanded as you like, therefore mapping FD and context is a more scalable solution

@myndzi

myndzi commented Aug 28, 2024

Just to be clear, SNI is not actually my preferred solution, it just allows routing "earlier" in the request lifecycle. My preferred solution would be to hand off before websocket upgrade, but I'm not sure how viable that one is. They both sound like they could be relatively complex since they're handing off a live socket rather than a socket that hasn't been accepted yet :\

@uNetworkingAB
Contributor

adoptSocket should be called right after accepting the socket. So as soon as you get it, just hand it over to whatever process you want.

@myndzi

myndzi commented Sep 20, 2024

Took me a while to get around to digging into this in detail; after managing to reproduce the build process and modifying the typedefs, app.adoptSocket was just doing nothing (no request handling, no error logged, etc). The patch seems to expect an int32 as the first argument to adoptHandle:
6fb88f7#diff-96cdcd203b073e1035b76e07abe84c54a4ab1d42ce62cc59716d35a3e50a67a9R722

(I may not be interpreting this correctly, unfamiliar with the native module API)

However, I can't find any way (publicly exposed or otherwise) to actually get the underlying file descriptor for the socket. The thing that gets passed between Node processes is an instance of Socket (I haven't tried with TLS yet, this was just a basic attempt).

If I'm understanding the expectations correctly, there doesn't appear to be any way to acquire the argument that the library expects, so it can never work :\

I'm not sure what would be required to get at the actual underlying fd from the Socket instance, though presumably it's achievable on the C++ side somehow(?)

@uNetworkingAB
Contributor

I don't follow. What do you mean by "the Socket instance".

@myndzi

myndzi commented Sep 20, 2024

In Node.js, a socket is an instance of a Javascript "class": https://nodejs.org/api/net.html#class-netsocket

This is what authors are working with in code, not file descriptors for sockets at the OS level. This object is also what gets transferred by the IPC mechanism. Obviously it must do the kernel magic to hand over the FD too, but inspecting the contents of the Socket, I do not see a direct reference to the FD, so I'm guessing it's held on the C++ side of Node's stdlib.

@uNetworkingAB
Contributor

This is kind of why adoptSocket is a bad idea. You can get the FD from net.Socket (at least you could 8 years ago last time I checked). But then you need to dup2 it and destroy the Node.js counterpart. This is what early versions of uWS did when there was no HTTP server in uWS.

So if you think this is messy, I would go with the childApp approach instead.

@myndzi

myndzi commented Sep 20, 2024

I cannot see any way to get the FD from the net.Socket period. I inspected the net.Socket object, including via Object.getOwnPropertyDescriptors and Object.getOwnPropertySymbols and briefly peeked at the node.js source code for the net module to see if I could identify what property contained it.

I was informing you so you can either find a way to support it or revert the change, since the change as-is seems to not be usable

@uNetworkingAB
Contributor

uWS has no compatibility with Node.js types other than FD. adoptSocket obviously will only work if you have an FD.

You have, in last commit, addChildAppDescriptor and removeChildAppDescriptor. Those are all you need to migrate players

@uasan

uasan commented Sep 21, 2024

> I cannot see any way to get the FD from the net.Socket period. I inspected the net.Socket object, including via Object.getOwnPropertyDescriptors and Object.getOwnPropertySymbols and briefly peeked at the node.js source code for the net module to see if I could identify what property contained it.

You can get it with

socket._handle.fd
// or for TLS sockets.
socket._parent._handle.fd
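
A defensive wrapper around those two lookups might look like the sketch below. `_handle` and `_parent` are undocumented Node.js internals, so this is an assumption-laden helper (the name `descriptorOf` is made up) that may break between Node.js versions or yield nothing on some platforms:

```javascript
// Relies on undocumented Node.js internals (`_handle`, `_parent`), as quoted
// above. Returns null when no usable descriptor is found.
function descriptorOf(socket) {
  const candidates = [socket._handle?.fd, socket._parent?._handle?.fd];
  // Treat anything that is not a non-negative integer as "no descriptor".
  const fd = candidates.find((value) => Number.isInteger(value) && value >= 0);
  return fd ?? null;
}

// Stand-in objects shaped like the two cases above (not real sockets).
const plainLike = { _handle: { fd: 21 } };
const tlsLike = { _handle: {}, _parent: { _handle: { fd: 22 } } };
const closedLike = { _handle: null };

console.log(descriptorOf(plainLike));  // 21
console.log(descriptorOf(tlsLike));    // 22
console.log(descriptorOf(closedLike)); // null
```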

@myndzi

myndzi commented Sep 21, 2024

@uasan thanks. Don't know why it didn't seem to show up when I logged the data. Maybe non-enumerable property and it was nested inside.

Doesn't seem to work anyway though. I get a connection reset by peer instead of a stalled socket when passing that value to adoptSocket. So I guess exposing this method is not useful*

* at least for porting a socket from one process to another

@uNetworkingAB
Contributor

> Doesn't seem to work anyway though

adoptSocket is the base for addChildApp, removeChildApp. So it's known to work. When you close your Node.js socket it will close the FD, so you need to dup2 it. But like I said, this is an unnecessarily complex path to go.

Latest commit as of today should have both addChildAppDescriptor, removeChildAppDescriptor working so using that approach is way simpler.

@myndzi

myndzi commented Sep 24, 2024

By "doesn't work", I meant only that passing the FD obtained from the Socket instance does not succeed. I assume that the underlying C++ code in uWS works, just the interop between it and Node doesn't work in this way. addChildAppDescriptor solves half of what I want to do, but not the other half (the ability to be selective in which app receives a connection), which is why I was exploring if adoptSocket could be made to work.

@uNetworkingAB
Contributor

Yes addChildAppDescriptor is half. The other half is removeChildAppDescriptor which was added just now in latest release today.

Whatever child apps you have will be used round robin for new connections. So by changing this set with the two functions, you have full control over how new connections migrate.
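
As a rough illustration of that swap, here is a sketch that runs against a stand-in acceptor rather than a real uWS app. It assumes removeChildAppDescriptor takes the same descriptor value that was passed to addChildAppDescriptor; the string descriptors, `makeFakeAcceptor`, `route`, and `swapWorker` are all made up for the example (real descriptors would come from App.getDescriptor() in each worker):

```javascript
// Stand-in for the acceptor app, for illustration only. It mimics the
// behaviour described above: new connections go round robin over the
// currently registered child descriptors.
function makeFakeAcceptor() {
  const children = [];
  let next = 0;
  return {
    addChildAppDescriptor(descriptor) {
      children.push(descriptor);
    },
    removeChildAppDescriptor(descriptor) {
      const index = children.indexOf(descriptor);
      if (index !== -1) children.splice(index, 1);
    },
    // Not a uWS method: simulates where the next connection would land.
    route() {
      if (children.length === 0) return null;
      const chosen = children[next % children.length];
      next += 1;
      return chosen;
    },
  };
}

// Deployment step: register the new worker first, then unregister the old
// one, so there is always at least one child to receive connections.
function swapWorker(acceptor, oldDescriptor, newDescriptor) {
  acceptor.addChildAppDescriptor(newDescriptor);
  acceptor.removeChildAppDescriptor(oldDescriptor);
}

const acceptor = makeFakeAcceptor();
acceptor.addChildAppDescriptor('worker-v1');
console.log(acceptor.route()); // worker-v1

swapWorker(acceptor, 'worker-v1', 'worker-v2');
console.log(acceptor.route()); // worker-v2
console.log(acceptor.route()); // worker-v2
```

Connections already held by the old worker are untouched by the swap; only the destination of new connections changes.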

@uasan

uasan commented Sep 24, 2024

That is, the concept is simple: we create workers, and in each worker an instance of a child uWS application. When we want to update the code, we delete the child application (call removeChildAppDescriptor), shut down the worker, and start a new worker with new code that creates a new child application (call addChildAppDescriptor). Did I understand correctly?

Then the question is, how will the new application receive sockets from the old application and how will these sockets correspond to their user sessions?

@uNetworkingAB
Contributor

uNetworkingAB commented Sep 24, 2024 via email

@uasan

uasan commented Sep 24, 2024

Okay, these are good functions for multithreading, but to migrate sockets you still need adoptSocket.

@uNetworkingAB
Contributor

uNetworkingAB commented Sep 24, 2024 via email
