Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate means of authenticating a node ID when adding to NodeGraph #322

Closed
9 tasks done
joshuakarp opened this issue Feb 3, 2022 · 33 comments · Fixed by #378
Closed
9 tasks done

Investigate means of authenticating a node ID when adding to NodeGraph #322

joshuakarp opened this issue Feb 3, 2022 · 33 comments · Fixed by #378
Assignees
Labels
development Standard development discussion Requires discussion r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices

Comments

@joshuakarp
Copy link
Contributor

joshuakarp commented Feb 3, 2022

Specification

With #344 more or less solved we need a way of authenticating when adding forward connections. this includes authenticating the old connections when a bucket fills up. In both cases we can so a pingNode to check if a node is alive. However the current implementation of pingNode is insufficient for the job as it only checks if a node is alive.

NodeManager.pingNode needs to be updated to match the following criteria;

  1. It needs to establish a proxy connection to the target. if a connection can be established then the target is considered alive. if the connection is established then we can consider it authenticated as well.
  2. It needs to take an optional address information for checking if direct connections without the nat bashing proceedure can be done. This will be useful for ICE

The NodeManager.setNode needs to be updated to do the following.

  1. when a bucket is full the oldest node needs to be verified with pingNode. This is already done, just need to update ping for this. This will need to concurrently check alpha of the oldest connections.
  2. Needs an optional force parameter to garentee adding the node.

NodeConnectionManager needs to be updated to;

  1. takes NodeManager as part of start.
  2. establishing a connection should trigger NM.qeueSetNode

NodeManager needs to be updated to;

  1. startStop asyinc init pattern.
  2. implement a async queue for adding nodes to the nodeGraph.

After that we need to make sure all instance where we need to add a node just calls NodeManager.setNode. Where this is done still needs to be speced out but this may be outside the scope of this issue. Create a new issue for this?

Additional context

Tasks

  • update NodeManager.pingNode to match the above specification.
  • update NodeManager.setNode to match the above specification.
  • implement setnode queue. this is similar to the queue used by discovery.
  • seperate pingNode out of setNode
  • concurrently check alpha old nodes if a bucket is full when setting node.
  • convert NCM and NM to startstop, they rely on each other now with NCM calling NM.qeueSetNode.
  • establishing a NodeConnection should trigger setting a node in the nodeGraph by callong NM.qeueSetNode.
  • Creating a new connection should not be blocked by setting node.
  • Integrate into pk nodes set (with --no-ping and --force) and pk nodes ping
@tegefaulkes
Copy link
Contributor

There are 4 cases where we add a node to the node graph.

  1. manually
  2. ping a node
  3. connect to a node in any way
  4. receive a connection from a node

Manually adding a node should ping the node before adding it to the NodeGraph. In fact I think this is functionally identical to pinging a node in outcome except that manually adding the node should guarantee that the node is added to the graph by dropping an older node.

Pinging a node will add the node to the graph but this should be done in the CLI ping command's GRPC handler. having the NodeManager.pingNode is called to verify a node's 'aliveness' during an setNode so we can't have that attempting to set a node.

Adding a node after connecting to it can be handled by the nodeConnectionManager in the with context functions? or possible triggered by the forward proxy? need to explore this more. it would be nice to have this handled neatly in one place.

receiving the connection will be verified by the reverse proxy and then added to the nodeGraph. this is addressed in #344

@tegefaulkes tegefaulkes self-assigned this Mar 17, 2022
@tegefaulkes
Copy link
Contributor

tegefaulkes commented Mar 23, 2022

WIth the changes in commit 7cc74d0 I added a check for if a node was alive with pingNode. currently this is blocking and will timeout based on the proxies timeout which is usually 20 seconds. This will need to be fixed by letting us set a time out in the ping command. I think there was an issue planed for this? #353? but that's for the node connections, we will need a similar solution for the proxies stuff. New issue?

In any case we ill need to connect to a node and check it's nodeId and certificates data. maybe we can create a new agent-agent RPC for getting a remote node's certificates and nodeId? isn't this already what GRPCClientAgent.nodesChainDataGet does? we can verify the node id and MTLS on the client side as well.

@CMCDragonkai
Copy link
Member

Yea the closest issue is #353, by creating a timeout for NodeConnectionManager.

I think however the proxies openConnectionForward and openConnectionReverse should also have timeouts that override the default timeout. This is already supported with the timer parameter.

Also relevant is #297 as background theory into the future.

@CMCDragonkai
Copy link
Member

Furthermore solving #353 would just reduce the amount of time waiting for a ping. To really solve the problem of blocking, is to remove blocking entirely. That is we wouldn't want to block a request to contact a Node or do operations against another Node due to pinging old nodes in order to maintain bucket limit.

I suggest that this logic should be non-blocking, and to do so, we can either do non-blocking in-memory by queuing up a job using JS's own event loop, or by queuing it up into the DB and run this as part of a background work as suggested by the longer-kademlia trie paper at the end.

@CMCDragonkai
Copy link
Member

Suggest creating a new issue for this non-blocking problem as it relates to #353.

@tegefaulkes
Copy link
Contributor

Ah I just remembered that the feature added in 7cc74d0 just checks that the old node is still alive. as mentioned above authenticating it still needs to be done at this stage and that's only really needed to make sure that the node hasn't changed identities. it should've been authenticated before it was added to the graph.

For adding a new node I think the expectation is that the authentication is done when connecting to it just before adding it to the nodeGraph. We could use the same method we use to authenticate the old nodes but we would be doubling up the action of connecting to the node since the action of adding a node is triggered by a connection to a node in the first place.

As to how the node is authenticated we would need to obtain the connection info from the newly combined Proxy class. Based on this comment #344 (comment) we will add a callback to the Proxy class that will cause a connection event in the EventBus system. This will trigger adding the node to the node graph. Assuming that the proxy authenticates the connection before triggering the event then the only information the event needs is a tuple of (remoteNodeId, remoteHost, remotePort).

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Mar 24, 2022

Ignore this, moved old spec back.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Mar 24, 2022

I think the proxy open connection already does authentication by checking the certificate. Therefore the needing to authenticate a node would only be when one is explicitly adding a node to the node graph.

Any authentication would need to be done prior to adding the node to the node graph.

One question is whether we should allow users to explicitly add nodes to the node manager/graph straight from the command line and therefore not have to be authenticated ahead of time. I think we should allow this as this is would primarily be used for special situations and debugging.

The node manager's add node method can schedule an authentication by scheduling a ping node.

I think the ping node should not just establish a proxy connection but also check the certificates which I believe the proxy does already. But you should check @tegefaulkes

@CMCDragonkai
Copy link
Member

We should re-use the ping node as the basis of "authenticating a node". It would then be used when refreshing buckets and dealing with filled buckets.

We could add a boolean flag during the add node method which controls whether to authenticate the node prior to adding.

@tegefaulkes
Copy link
Contributor

Right now the setNode assumes the node was already authenticated prior to calling setNode since this will be triggered by an already authenticated connection by the Proxy.

@tegefaulkes
Copy link
Contributor

Hmm.. the spec I created here relates more to issue #344 but it still relates this one partly. The scope this issue relates more to authenticating and adding nodes when connecting TO a node. Whereas 344 relates to authenticating and adding a node when receiving a connection FROM a node.

I hoped that I could handle adding nodes for ForwardConnections the same way as the ReverseConnections. But if we automatically try to add the old node when we establish the connection for a ping then that could lead to issues or it's too side effect-ful.

So adding nodes based on reverse connections can be automatic but we may have to handle forward connections separately.

In any case, this Issue should focus more on the authenticating the forward connections and pinging the old nodes when we need to add a new node to a full bucket.

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Mar 24, 2022

I reverted to the old spec. This needs to be spec-ed out focusing on authenticating and adding nodes on forward connections. the pingNode should also authenticate the node it's pinging.

So maybe like mentioned above, we can just use the pingNode to authenticate the node as well. we can ping the node when adding the node to the node with NodeManager.setNode. add a flag to NodeManager.setNode to skip the 'authenticating` ping for the new node, this will be needed for manually adding nodes and the adding nodes on reverse connection.

The pingNode function I think will need to become a agent-agent RPC, utilize the NodeConnectionManager's connections. it will need to return true if a NodeConnection can be established, was authenticated and was the expected NodeId. We also don't want to attempt a kademlia FIND_NODE operation when trying to ping since we are trying to verify the address. Is that possible with the NodeConnectionManager?

I'll have to review code and spec this futher.

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Mar 24, 2022

Old spec

Specification

We currently have some instances when adding nodes to our NodeGraph where we attempt to perform a connection to the node before adding it to the NodeGraph buckets database.

Previously, this was intended as a primitive means of ensuring that only node IDs that corresponded with valid keynodes were added to one's own NodeGraph database (i.e. keeping the state "clean"). We could assume that a node ID was valid if we connected to it and performed the certificate validation with respect to its expected node ID.

This problem relates to mitigating the impact malicious actors/keynodes have on the wider Polykey network, and keeping the network as "clean" as possible. There needs to be further discussion about how this will be considered with Polykey, with this NodeGraph issue being one part of that discussion.

Additional context

Tasks

  1. Find all instances where we add nodes to our own NodeGraph. (There may not be any more.)
  2. Remove these in favour of some other mechanism.

@tegefaulkes
Copy link
Contributor

I feel like I keep editing over issues that describe problems to turn them into design specifications. I feel that we should keep the issues that describe problems as they are. When we find a solution to the problem we should create a new design issue specifying the solution and design spec. Basically I think we should have separation between problem definitions and problem solutions.

problems    solutions       PRs
A -------------->D --------> F
B ---------------^           ^
C -------------->E ----------|

It's just a though I had.
@CMCDragonkai

@tegefaulkes
Copy link
Contributor

tests relating to new pingNode spec

  • Can find the node to ping with kademlia find_node.
  • pings the node directly with the optional host/port parameter
  • ping fails if the node fails to authenticate.
  • ping fails if target authenticates but doesn't match the expected NodeId.
  • ping attempts a NAT punchthrough.
  • ping fails if we fail to connect to the target.
  • ping success case.

@CMCDragonkai
Copy link
Member

I feel like I keep editing over issues that describe problems to turn them into design specifications. I feel that we should keep the issues that describe problems as they are. When we find a solution to the problem we should create a new design issue specifying the solution and design spec. Basically I think we should have separation between problem definitions and problem solutions.

problems    solutions       PRs
A -------------->D --------> F
B ---------------^           ^
C -------------->E ----------|

It's just a though I had. @CMCDragonkai

It's fine to edit issues, in fact they should be edited. We don't want to create too many issues, as old information becomes out-dated. So editing is just a natural process of pruning information and updating information. Makes it easier to do the R&D report later.

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Mar 25, 2022

Looking into how to implement the pingNode properly. It seems that establishing a NodeConnection in the NodeConnectionManager is functionally identical to our needs for pinging a node. All we really need to do is to expand the NodeConnectionManager to allow us to specify the host and Port when creating a connection.

So we can add a pingNode method to the NodeConnectionManager and have it use a modified createConnection to allow us to specifiy the target host and port. We will need to make sure that the NodeConnection authenticates the connection which it likely already does. and make sure the the authentecated Node matches the target node.

May as well add optional connection timeouts here as well.

Do you think this is a good course of action @CMCDragonkai ?

@tegefaulkes
Copy link
Contributor

Wild question, How do we actually get the host and port of the client connecting to the forward proxy? nodeConnection.port and client.port seems to return the proxy port.

tegefaulkes added a commit that referenced this issue Mar 25, 2022
Added `nodePing` command to `NodeConnectionManager` and `NodeManager.nodePing` calls that now. the new `nodePing` command essentially attempts to create a connection to the target node. In this process we authenticate the connection and check that the nodeIds match. If no address is provided it will default to trying to find the node through kademlia.

Related #322
@tegefaulkes
Copy link
Contributor

pingNode implementation has been fixed up. now to focus on the setNode part of this.

As for making all of this non-blocking. I think that should be speced out in a separate issue.

CMCDragonkai pushed a commit that referenced this issue Mar 26, 2022
Added `nodePing` command to `NodeConnectionManager` and `NodeManager.nodePing` calls that now. the new `nodePing` command essentially attempts to create a connection to the target node. In this process we authenticate the connection and check that the nodeIds match. If no address is provided it will default to trying to find the node through kademlia.

Related #322
CMCDragonkai pushed a commit that referenced this issue Mar 26, 2022
Added `nodePing` command to `NodeConnectionManager` and `NodeManager.nodePing` calls that now. the new `nodePing` command essentially attempts to create a connection to the target node. In this process we authenticate the connection and check that the nodeIds match. If no address is provided it will default to trying to find the node through kademlia.

Related #322
tegefaulkes added a commit that referenced this issue Jun 10, 2022
Need to ensure validity of nodes by pinging them before adding them to the node graph.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
Added `nodePing` command to `NodeConnectionManager` and `NodeManager.nodePing` calls that now. the new `nodePing` command essentially attempts to create a connection to the target node. In this process we authenticate the connection and check that the nodeIds match. If no address is provided it will default to trying to find the node through kademlia.

Related #322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`setNode` now authenticates the node you are trying to add. Added a flag for skipping this authentication as well as a timeout timer for the authentication. this is shared between authentication new node and the old node if the bucket is full.

Related #322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
Updated `NodeConnectionManager.pingNode` to just use the proxy connection.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`setNode` now pings 3 nodes concurrently, updating ones that respond and removing ones that don't. If there is room in the bucket afterwards then we add the new node.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`setNode` now has a `blocking` flag that defaults to false. If it encounters a full bucket when adding a node then it will add the operation to the queue and asynchronously trys a blocking `setNode` in the background. `setNode`s will only be added to the queue if the bucket was full.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`NodeConnectionManager` now takes `NodeGraph` in  the `nodeConnectionManager.start` method. It has to be part of the start method since they are co-dependent. `NodeConnectionManager` cals `NodeManager.setNode()` when a connection is established. This fulfills the condition of adding a node to the graph during a forward connection.

Fixed up tests that were failing in relation to the `NodeManager` `StartStop` conversion.

#322
emmacasolin added a commit that referenced this issue Jun 14, 2022
`NodeManager.setNode` and `NodeConnectionManager.syncNodeGraph` now utilise a single, shared queue to asynchronously add nodes to the node graph without blocking the main loop. These methods are both blocking by default but can be made non-blocking by setting the `block` parameter to false.
#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
Need to ensure validity of nodes by pinging them before adding them to the node graph.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
Added `nodePing` command to `NodeConnectionManager` and `NodeManager.nodePing` calls that now. the new `nodePing` command essentially attempts to create a connection to the target node. In this process we authenticate the connection and check that the nodeIds match. If no address is provided it will default to trying to find the node through kademlia.

Related #322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`setNode` now authenticates the node you are trying to add. Added a flag for skipping this authentication as well as a timeout timer for the authentication. this is shared between authentication new node and the old node if the bucket is full.

Related #322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
Updated `NodeConnectionManager.pingNode` to just use the proxy connection.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`setNode` now pings 3 nodes concurrently, updating ones that respond and removing ones that don't. If there is room in the bucket afterwards then we add the new node.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`setNode` now has a `blocking` flag that defaults to false. If it encounters a full bucket when adding a node then it will add the operation to the queue and asynchronously trys a blocking `setNode` in the background. `setNode`s will only be added to the queue if the bucket was full.

#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
`NodeConnectionManager` now takes `NodeGraph` in  the `nodeConnectionManager.start` method. It has to be part of the start method since they are co-dependent. `NodeConnectionManager` cals `NodeManager.setNode()` when a connection is established. This fulfills the condition of adding a node to the graph during a forward connection.

Fixed up tests that were failing in relation to the `NodeManager` `StartStop` conversion.

#322
emmacasolin added a commit that referenced this issue Jun 14, 2022
`NodeManager.setNode` and `NodeConnectionManager.syncNodeGraph` now utilise a single, shared queue to asynchronously add nodes to the node graph without blocking the main loop. These methods are both blocking by default but can be made non-blocking by setting the `block` parameter to false.
#322
emmacasolin pushed a commit that referenced this issue Jun 14, 2022
Need to ensure validity of nodes by pinging them before adding them to the node graph.

#322
@teebirdy teebirdy added the r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy label Jul 24, 2022
@CMCDragonkai CMCDragonkai added r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices and removed r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy labels Jul 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development discussion Requires discussion r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices
Development

Successfully merging a pull request may close this issue.

5 participants