Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: update aggregation capabilitites to use height instead of size together with client and api #831

Merged
merged 4 commits into from
Jul 25, 2023

Conversation

vasco-santos
Copy link
Contributor

@vasco-santos vasco-santos commented Jul 17, 2023

Renames size to height for piece/aggregate based on described in storacha/data-segment#13

Note that:

  • took the opportunity to update this to finally rely on data-segment lib as it is now ready to be used (both in tests, and to validate aggregate)
  • given piece and aggregate are both perfect binary trees, leaf count of can be derived as 2 ** height, and the size of the tree can be derived by leafCount * Node.Size (32).

Needs storacha/specs#66

import * as Aggregate from '@web3-storage/capabilities/aggregate'
import * as Offer from '@web3-storage/capabilities/offer'
import * as API from '../types.js'

export const MIN_SIZE = 1 + 127 * (1 << 27)
export const MAX_SIZE = 127 * (1 << 28)
// export const MAX_SIZE = 127 * (1 << 28)
export const MAX_SIZE = AggregateBuilder.DEFAULT_DEAL_SIZE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Gozala interesting that these values are different than ones previously obtained with riba. Clues?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The former value is (unpadded) payload size, if you pad it you'll get a deal (aggregate) size which is the new value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't know if that 32GiB is the max deal size for spade, so I'm not sure we do need to impose it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am playing with values that Riba provided us back in the days. Maybe not relevant anymore, but I will confirm and iterate

@vasco-santos vasco-santos requested a review from Gozala July 17, 2023 14:50
code: SHA2_256_TRUNC254_PADDED,
},
}),
link: /** @type {import('./types').PieceLinkSchema} */ (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Gozala we can't add first unknown in template config https://github.com/web3-storage/ucanto/blob/main/packages/core/src/schema/link.js#L89 so that we can specify Link<MerkleTreeNode, 61697, 4114, 1> instead of Link<unknown, 61697, 4114, 1>

Should we add a config param in ucanto?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can not because there is no way to verify it, so I was expecting that ppl would write manually type cast like Schema<Link<T, ...>> that way you you imply to know what you're doing.

I'm not sure what config param in ucanto would look like. That said I have in-progress PR that would allow us to infer types like Schema.struct({ ... }).link({ codec, hasher, version }) which than would be able to infer the type from the schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will keep as is for the time being

Copy link
Contributor

@Gozala Gozala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Let's just make sure that MIN_SIZE and MAX_SIZE are what we expect them to be. I also proposed few minor changes to avoid repeating functionality provided by data-segment lib.

import * as Aggregate from '@web3-storage/capabilities/aggregate'
import * as Offer from '@web3-storage/capabilities/offer'
import * as API from '../types.js'

export const MIN_SIZE = 1 + 127 * (1 << 27)
export const MAX_SIZE = 127 * (1 << 28)
// export const MAX_SIZE = 127 * (1 << 28)
export const MAX_SIZE = AggregateBuilder.DEFAULT_DEAL_SIZE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The former value is (unpadded) payload size, if you pad it you'll get a deal (aggregate) size which is the new value.

import * as Aggregate from '@web3-storage/capabilities/aggregate'
import * as Offer from '@web3-storage/capabilities/offer'
import * as API from '../types.js'

export const MIN_SIZE = 1 + 127 * (1 << 27)
export const MAX_SIZE = 127 * (1 << 28)
// export const MAX_SIZE = 127 * (1 << 28)
export const MAX_SIZE = AggregateBuilder.DEFAULT_DEAL_SIZE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't know if that 32GiB is the max deal size for spade, so I'm not sure we do need to impose it.

@@ -1,11 +1,13 @@
import * as Server from '@ucanto/server'
import { CBOR } from '@ucanto/core'
import { Node, Aggregate as AggregateBuilder } from '@web3-storage/data-segment'
import * as Aggregate from '@web3-storage/capabilities/aggregate'
import * as Offer from '@web3-storage/capabilities/offer'
import * as API from '../types.js'

export const MIN_SIZE = 1 + 127 * (1 << 27)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this number is, and I wonder if it was mistaken for

Suggested change
export const MIN_SIZE = 1 + 127 * (1 << 27)
export const MIN_SIZE = 127 * (1 << 27)

Which would be unpadded size corresponding to 16GiB deal, that is after padding we'd get 16GiB which is probably what was the intention was.

SIZE = 127 * (1 << 27) // 17045651456
DEAL_SIZE = SIZE + SIZE / 127 // 17179869184 => 16GiB

Comment on lines 63 to 66
pieces: offers.map((o) => ({
link: o.link,
size: 2n ** BigInt(o.height) * BigInt(Node.Size),
})),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should pass a size as it it may be different from default, note that builder is not deriving size from pieces it just uses 32GiB default.

You also do not need to do the math there, there is Piece.fromJSON that you could use instead

Suggested change
pieces: offers.map((o) => ({
link: o.link,
size: 2n ** BigInt(o.height) * BigInt(Node.Size),
})),
size: aggregateSize
pieces: offers.map(Piece.fromJSON),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we expect pieces to be sorted, we should make note of that in the spec, if that is not the case we probably should sort them here.

@@ -1,11 +1,13 @@
import * as Server from '@ucanto/server'
import { CBOR } from '@ucanto/core'
import { Node, Aggregate as AggregateBuilder } from '@web3-storage/data-segment'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import { Node, Aggregate as AggregateBuilder } from '@web3-storage/data-segment'
import { Node, Piece, Aggregate as AggregateBuilder } from '@web3-storage/data-segment'

You'll need this to use Piece.fromJSON directly.

// Generate Pieces for offer
const { pieces, aggregate } = await randomAggregate(100, 128)
const badHeight = 31
const size = 2n ** BigInt(badHeight) * BigInt(Node.Size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have PaddedSize.toHeight we probably should just add PaddedSize.fromHeight so we don't have to do this math all over the place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link: PieceCID
size: number
link: PieceLink
height: number
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we need another type along with Piece and PieceInfo that we can re-export here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code: SHA2_256_TRUNC254_PADDED,
},
}),
link: /** @type {import('./types').PieceLinkSchema} */ (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can not because there is no way to verify it, so I was expecting that ppl would write manually type cast like Schema<Link<T, ...>> that way you you imply to know what you're doing.

I'm not sure what config param in ucanto would look like. That said I have in-progress PR that would allow us to infer types like Schema.struct({ ... }).link({ codec, hasher, version }) which than would be able to infer the type from the schema.

@vasco-santos vasco-santos force-pushed the fix/aggregation-spec-to-use-height-instead-of-size branch from 8103681 to 11e68f0 Compare July 25, 2023 16:45
@vasco-santos vasco-santos merged commit 31730f0 into main Jul 25, 2023
14 checks passed
@vasco-santos vasco-santos deleted the fix/aggregation-spec-to-use-height-instead-of-size branch July 25, 2023 17:35
vasco-santos pushed a commit that referenced this pull request Aug 9, 2023
🤖 I have created a release *beep* *boop*
---


##
[8.0.0](capabilities-v7.0.0...capabilities-v8.0.0)
(2023-08-09)


### ⚠ BREAKING CHANGES

* update aggregation capabilitites to use height instead of size
together with client and api
([#831](#831))

### Features

* w3filecoin new client and api
([#848](#848))
([7a58fbe](7a58fbe))


### Bug Fixes

* update aggregation capabilitites to use height instead of size
together with client and api
([#831](#831))
([31730f0](31730f0))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants