chore: element ratio parameter for seeder #3844

Open
wants to merge 1 commit into
base: main


@dranikpg dranikpg commented Oct 1, 2024

Fixes #3840

Added an element_size_ratio parameter to both the static and dynamic seeders.

data size ~ data volume, element size = data size ^ ratio (^ is exponentiation), element count = data volume / element size
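
As a rough illustration of the relation above, here is a sketch in Python. It is not the seeder's actual Lua code: the function name `split_entry` is made up for this example, and rounding the element count up with `math.ceil` is an assumption.

```python
import math


def split_entry(data_size: int, element_size_ratio: float) -> tuple[int, int]:
    """Return (element_size, element_count) for one entry of roughly data_size in size."""
    # element size = data size ^ ratio (exponentiation), rounded up
    element_size = math.ceil(data_size ** element_size_ratio)
    # element count = data volume / element size; rounding up is an assumption
    element_count = math.ceil(data_size / element_size)
    return element_size, element_count


# ratio 1: a single element holding the whole entry
print(split_entry(10_000, 1.0))
# ratio 0: as many minimal elements as possible
print(split_entry(10_000, 0.0))
```

With a ratio of 1 this yields one element of size 10_000, which is the shape the test in this PR checks via llen and lpop.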

@dranikpg dranikpg marked this pull request as ready for review October 1, 2024 17:09

keys = await async_client.keys()
assert (await async_client.llen(keys[0])) == 1
assert len(await async_client.lpop(keys[0])) == 10_000
Contributor

A question:

If element_size_ratio=1/2, then data_size is 10k ** (1/2) == 100 and variance is 1, so dsize=100. My question is: why do we xor the dsize? That is, LG_funcs.esize = math.ceil(dsize ^ delement_ratio), which is how many elements a given type should contain (that is, llen(keys[0])).

So to summarize:

  1. Why do we xor this? LG_funcs.esize = math.ceil(dsize ^ delement_ratio)
  2. Why do we express the number of elements per set via all of this? Can't we just specify how many elements we want and the size of each?

There is something I am missing so I am asking here 😄

Contributor Author

  1. It's not xor, it's power 🙃
  2. If it's too difficult and fragile, no one will use it properly. It's just a slider between 0 and 1: 0 means the smallest possible elements, 1 means the biggest possible

Contributor

For 1: power is **, not the caret ^, which is xor?

Contributor Author

Contributor

I brainfarted. It's Lua, not Python 😮‍💨 🤦

Now it all makes sense. Ignore my blindness....
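
For reference, the mix-up resolved above comes from the two languages treating the caret differently: in Lua `^` is exponentiation, while in Python it is bitwise XOR and `**` is the power operator. A small Python illustration (the helper name `caret_on_float` is made up for this example):

```python
# In Python, ^ is bitwise XOR and ** is exponentiation.
xor_result = 10 ^ 2   # 0b1010 XOR 0b0010
pow_result = 10 ** 2  # ten squared


def caret_on_float(base: int, ratio: float) -> str:
    """Show that Python's ^ is not defined between an int and a float."""
    try:
        base ^ ratio
    except TypeError:
        return "TypeError"
    return "ok"


print(xor_result, pow_result, caret_on_float(100, 0.5))
```

So an expression like dsize ^ delement_ratio with a fractional ratio would not even run as Python, which is one more hint that the line is Lua.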

Contributor

Lol @dranikpg, we commented at the exact same time. I did not read your link, but somehow I noticed, and then you replied at the exact same moment I figured it out and replied 🤣

Contributor

@kostasrim kostasrim left a comment

LGTM, maybe wait for Adi?

Development

Successfully merging this pull request may close these issues.

pytests : Improve seeder adding element count for entries like set/hash/list
2 participants