Integrating `TaskManager` into `Discovery` #451

tegefaulkes · 2022-09-26T06:56:22Z

Description

This PR focuses on updating the Discovery domain to make use of the TaskManager.

Issues Fixed

Related Integrate TaskManager into NodeGraph and Discovery #445

Tasks

1. Update the Discovery queue to make use of the TaskManager queuing.
2. Remove any tests that are no longer required, I.E. anything testing the async queue aspect of the Discovery since it's handled via TaskManager now.
~ 5. re-enable Discovery tests in the generated CI tests ~ - The problem is ultimately a connection timeout, not addressed here.
3. Update any tests that need it.
4. Add any new tests as needed.

Final checklist

ghost · 2022-09-26T06:57:02Z

👇 Click on the image for a new way to code review

Make big changes easier — review code in small groups of related files
Know where to start — see the whole change at a glance
Take a code tour — explore the change with an interactive tour
Make comments and review — all fully sync’ed with github

Try it now!

Legend

CMCDragonkai · 2022-09-26T09:41:34Z

We can limit the scope of discovery operations because we only care about:

Our own gestalt
Any gestalt we trust
Any gestalt that the user triggers manually

We don't need to crawl transitive gestalts (friends of friends...).

CMCDragonkai · 2022-09-26T09:50:24Z

Discovery is likely to be push and pull at the same time. But right now everything in PK is pull-based cause it's easier to implement.

So we can do something simple.

At most we might have 1000 gestalts trusted, and each gestalt is likely a handful of nodes, say 5. Then at the most have 5000 nodes to check in with.

We can say 1 day TTL to revisit each node.

When the push-based gossip protocol is available, we can actually increase the TTL, and that TTL acts like the last-resort in the sense that if no update has occurred from there we can check in.

It could be even simpler of we can expect that any given node of a gestalt should be up to date with their own gestalt's information.

At this point in time, let's make it configurable and see how it goes.

The goal here for this PR is not to make our discovery perfect UI/UX-wise, but to simply integrate the task manager system, simplify the code in discovery so it becomes stateless (with respect to task management), and also incorporate timedCancellable where possible.

CMCDragonkai · 2022-09-26T09:51:01Z

So timedCancellable for #450 will be in another PR, but this starts it out for the discovery domain at least.

tegefaulkes · 2022-09-27T08:13:13Z

General changes so far:

The parts of discovery that handles queuing has been removed. This includes the 'plug' promises, database levels and the discovery ID generator.
The logic of the queue where it handled the vertex was separated out into a new method. This logic included connecting to a vertex, getting the claims' data, iterating over the claims, verifying the claims and adding them to the gestaltGraph, and adding the new vertices to the list.
Adding a vertex to the list now consists of creating a singleton task. each task focuses on processing a single vertex.
I tried to cancel all tasks on discovery.destroy() but it would require the taskManager to be started at the time since we need to iterate over active tasks at the time. For now cancel the tasks when starting with fresh.
To avoid re-trying tasks, a visitedVertices set is used, this hasn't been updated. A more complex mechanism is need for this that has persistence. I think this will be left with the more complex discovery changes later.
Discovery tests have been updated. none removed or added so far. The change was just using taskManager.

tegefaulkes · 2022-09-28T05:08:03Z

Re-based on staging and a quick fix. Should pass CI now.

tegefaulkes · 2022-09-28T05:43:27Z

The Discovery failures is due to proxy.stop() taking a very long time to complete. I could dig into it, but ultimately the problem will be addressed by the cancel-ability changes PR after this one. For now i'll keep the discovery tests disabled.

console.time
    proxy.stop(): 19142 ms
      at Object.<anonymous> (tests/discovery/Discovery.test.ts:240:13)

tegefaulkes

Some small changes and discussion needed.

src/discovery/Discovery.ts

src/nodes/NodeManager.ts

tests/client/service/gestaltsDiscoveryByNode.test.ts

tests/client/service/gestaltsGestaltTrustByNode.test.ts

tegefaulkes · 2022-09-28T07:22:31Z

This should be ready to merge pending CI. @CMCDragonkai did you still want to review?

CMCDragonkai · 2022-09-28T07:48:21Z

Yes give me a moment.

CMCDragonkai

Initial review. Haven't gone into the discovery code yet.

src/PolykeyAgent.ts

src/discovery/types.ts

tests/discovery/Discovery.test.ts

src/discovery/Discovery.ts

CMCDragonkai · 2022-09-28T08:10:47Z

The Discovery failures is due to proxy.stop() taking a very long time to complete. I could dig into it, but ultimately the problem will be addressed by the cancel-ability changes PR after this one. For now i'll keep the discovery tests disabled.
console.time
    proxy.stop(): 19142 ms
      at Object.<anonymous> (tests/discovery/Discovery.test.ts:240:13)

Please link the next PR to here. Applying timedCancellable across the entire codebase.

src/discovery/Discovery.ts

CMCDragonkai · 2022-09-29T03:19:52Z

As per: https://github.com/MatrixAI/Polykey/pull/451/files/2d004eeb5bc7270320ecef7fb0cb3ebfe43877c8#r983028305

It is possible to remove these mocks from our tests:

const mockedRefreshBucket = jest.spyOn(
  NodeManager.prototype,
  'refreshBucket',
);

If we realise that what we need is to have 2-stage start for some of our objects.

The NodeManager.start is similar to TaskManager.start in the sense that there are 2 stages here.

Making the object ready to be interacted with
Actually starting the background tasks (live side-effects)

We can reuse the pattern of lazy parameter, by putting a lazy parameter into NodeManager.start.

Then we also need to move NodeConnectionManager.start's ops in setting the seed node to node manager into NodeManager.syncNodeGraph:

    // Adding seed nodes
    for (const nodeIdEncoded in this.seedNodes) {
      const nodeId = nodesUtils.decodeNodeId(nodeIdEncoded);
      if (nodeId == null) never();
      await this.nodeManager.setNode(
        nodeId,
        this.seedNodes[nodeIdEncoded],
        true,
      );
    }

The above code seems that it can just be done as part of NodeManager.syncSeedNodes.

Then in NodeManager.start, we enclose the background operations as part of:

    if (!lazy) {
      await this.startSync();
    }

So this startSync would be similar to TaskManager.startProcessing, in that it would perform these 2 operations:

    async startSync() {
      await this.syncSeedNodes(); // Replaced name for `syncNodeGraph`
      await this.setupRefreshBucketTasks();
    }

Can we then have an equivalent stopSync? Similar to stopProcessing? Or does this not matter.

I guess it would whatever NodeManager.stop is currently doing, but just factored out to stopSync.

Atm, it looks like syncSeedNodeGraph doesn't actually get cancelled if we call NodeManager.stop, therefore there is a place to create a stopSync method that NodeManager.stop ends up calling as well.

This would basically make NodeManager very similar to TaskManager.

Then in the PolykeyAgent, under createPolykeyAgent, it would need the lazy: true, and then under start, it would do something like:

      await this.taskManager.start({ fresh, lazy: true });
      await this.nodeManager.start({ lazy: true });
      await this.nodeConnectionManager.start({ 
        nodeManager: this.nodeManager 
      });
      await this.nodeGraph.start({ fresh });
      await this.discovery.start({ fresh });
      await this.vaultManager.start({ fresh });
      await this.notificationsManager.start({ fresh });
      await this.sessionManager.start({ fresh });
      if (lazy) {
        await this.nodeManager.startSync();
        await this.taskManager.startProcessing();
      }

Notice the lazy boolean here, this propagates the pattern to PolykeyAgent. Which would mean upon testing PolykeyAgent such as in the discovery tests, rather than mocking out specific side-effects, we can just start the PolykeyAgent lazily, meaning none of the background tasks are started. If we want to do that, we would need another method PolykeyAgent.startX (not sure what X should be called). But that would then start the background tasks for PolykeyAgent.

What does it mean to do tests on PK agent without having it run background tasks? It seems that this means that it would avoid interference if we are trying to do a deterministic test.

So imagine you want to test the PK agent to do a specific direct connection or a specific call, and you would prefer that these background tasks aren't running. Then it does make sense to have this ability of staging the second stage of start.

So now we have 3/4 stages:

Async creation
Sync construction
Async start
Async start background stuff

CMCDragonkai · 2022-09-29T03:26:14Z

The fwxProxy logger name should just be Proxy.

tests/discovery/Discovery.test.ts

src/discovery/Discovery.ts

CMCDragonkai · 2022-09-29T04:02:02Z

@tegefaulkes can you create a new issue for discovery refactoring: #451 (comment) (needs TTL) - this can be done in separate PR

And also a new issue for this: #451 (comment) - this can be done separate PR

tegefaulkes · 2022-09-29T06:44:57Z

All review addressed now. I'll clean up the commit history again, merge, and create the new issues/PRs.

- The parts of discovery that handles queuing has been removed. This includes the 'plug' promises, database levels and the discovery ID generator. - The logic of the queue where it handled the vertex was separated out into a new method. This logic included connecting to a vertex, getting the claims' data, iterating over the claims, verifying the claims and adding them to the gestaltGraph, and adding the new vertices to the list. - Adding a vertex to the list now consists of creating a singleton task. each task focuses on processing a single vertex. - I tried to cancel all tasks on `discovery.destroy()` but it would require the `taskManager` to be started at the time since we need to iterate over active tasks at the time. For now cancel the tasks when starting with fresh. - To avoid re-trying tasks, a `visitedVertices` set is used, this hasn't been updated. A more complex mechanism is need for this that has persistence. I think this will be left with the more complex discovery changes later. - Discovery tests have been updated. none removed or added so far. The change was just using `taskManager`. - Removed discovery utils and types. These are not needed anymore since we don't make use of a discovery ID generator. - Added locking on the vertex to avoid duplicating a task due to racing. Also catching and throwing the singleton reason, so we don't get any errors on tasks that were canceled for this reason. - Discovery tasks are canceled during fresh start and destroy. -`Discovery` stops any active tasks when stopping - Added a check for if `TaskManager` is still processing when calling stop on `Discovery` and `NodeManager`.

tegefaulkes · 2022-09-29T06:48:47Z

Squashed and re-based on staging.

CMCDragonkai · 2022-09-29T07:19:52Z

Please link the new issues you've created here too.

tegefaulkes · 2022-09-29T08:16:59Z

Issue #461 created to address #451 (comment).

tegefaulkes · 2022-09-29T08:33:05Z

Issue #462 created to address further discovery refactoring.

tegefaulkes self-assigned this Sep 26, 2022

tegefaulkes changed the title ~~wip: Discovery depends on TaskManager~~ Integrating TaskManager into Discovery Sep 26, 2022

CMCDragonkai mentioned this pull request Sep 26, 2022

ci: merge staging to master #411

Merged

tegefaulkes force-pushed the feature-Discovery_TaskManager_integration branch from ab75115 to 00ce3e1 Compare September 27, 2022 08:12

tegefaulkes force-pushed the feature-Discovery_TaskManager_integration branch 2 times, most recently from b5d7f5f to 02912dd Compare September 28, 2022 05:07

tegefaulkes force-pushed the feature-Discovery_TaskManager_integration branch from 20c1eab to 8408a8f Compare September 28, 2022 05:49

tegefaulkes commented Sep 28, 2022

View reviewed changes

tegefaulkes force-pushed the feature-Discovery_TaskManager_integration branch from b4842f4 to 2d004ee Compare September 28, 2022 07:04

tegefaulkes marked this pull request as ready for review September 28, 2022 07:56

tegefaulkes requested a review from CMCDragonkai September 28, 2022 07:56

CMCDragonkai requested changes Sep 28, 2022

View reviewed changes

CMCDragonkai reviewed Sep 28, 2022

View reviewed changes

src/discovery/Discovery.ts Show resolved Hide resolved

CMCDragonkai reviewed Sep 29, 2022

View reviewed changes

tests/discovery/Discovery.test.ts Show resolved Hide resolved

CMCDragonkai reviewed Sep 29, 2022

View reviewed changes

src/discovery/Discovery.ts Outdated Show resolved Hide resolved

CMCDragonkai reviewed Sep 29, 2022

View reviewed changes

src/discovery/Discovery.ts Show resolved Hide resolved

tegefaulkes force-pushed the feature-Discovery_TaskManager_integration branch from 25f4ad3 to 6d8f606 Compare September 29, 2022 06:48

fix: updating badly worded comment

6d54b89

tegefaulkes merged commit c602b92 into staging Sep 29, 2022

tegefaulkes mentioned this pull request Sep 29, 2022

Allow NodeManager to start lazily without network entry procedure #461

Open

tegefaulkes mentioned this pull request Sep 29, 2022

Add priority and TTL to Discovery process #462

Open

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integrating `TaskManager` into `Discovery` #451

Integrating `TaskManager` into `Discovery` #451

tegefaulkes commented Sep 26, 2022 •

edited

Loading

ghost commented Sep 26, 2022 •

edited by ghost

Loading

CMCDragonkai commented Sep 26, 2022

CMCDragonkai commented Sep 26, 2022

CMCDragonkai commented Sep 26, 2022

tegefaulkes commented Sep 27, 2022

tegefaulkes commented Sep 28, 2022

tegefaulkes commented Sep 28, 2022

tegefaulkes left a comment

tegefaulkes commented Sep 28, 2022

CMCDragonkai commented Sep 28, 2022

CMCDragonkai left a comment

CMCDragonkai commented Sep 28, 2022

CMCDragonkai commented Sep 29, 2022 •

edited

Loading

CMCDragonkai commented Sep 29, 2022

CMCDragonkai commented Sep 29, 2022 •

edited

Loading

tegefaulkes commented Sep 29, 2022 •

edited

Loading

tegefaulkes commented Sep 29, 2022

CMCDragonkai commented Sep 29, 2022

tegefaulkes commented Sep 29, 2022

tegefaulkes commented Sep 29, 2022

Integrating TaskManager into Discovery #451

Integrating TaskManager into Discovery #451

Conversation

tegefaulkes commented Sep 26, 2022 • edited Loading

Description

Issues Fixed

Tasks

Final checklist

ghost commented Sep 26, 2022 • edited by ghost Loading

Legend

CMCDragonkai commented Sep 26, 2022

CMCDragonkai commented Sep 26, 2022

CMCDragonkai commented Sep 26, 2022

tegefaulkes commented Sep 27, 2022

tegefaulkes commented Sep 28, 2022

tegefaulkes commented Sep 28, 2022

tegefaulkes left a comment

Choose a reason for hiding this comment

tegefaulkes commented Sep 28, 2022

CMCDragonkai commented Sep 28, 2022

CMCDragonkai left a comment

Choose a reason for hiding this comment

CMCDragonkai commented Sep 28, 2022

CMCDragonkai commented Sep 29, 2022 • edited Loading

CMCDragonkai commented Sep 29, 2022

CMCDragonkai commented Sep 29, 2022 • edited Loading

tegefaulkes commented Sep 29, 2022 • edited Loading

tegefaulkes commented Sep 29, 2022

CMCDragonkai commented Sep 29, 2022

tegefaulkes commented Sep 29, 2022

tegefaulkes commented Sep 29, 2022

Integrating `TaskManager` into `Discovery` #451

Integrating `TaskManager` into `Discovery` #451

tegefaulkes commented Sep 26, 2022 •

edited

Loading

ghost commented Sep 26, 2022 •

edited by ghost

Loading

CMCDragonkai commented Sep 29, 2022 •

edited

Loading

CMCDragonkai commented Sep 29, 2022 •

edited

Loading

tegefaulkes commented Sep 29, 2022 •

edited

Loading