benchmark runner using AWS CLI #12

Merged
merged 8 commits into from
Oct 3, 2023
Conversation

graebm
Contributor

@graebm graebm commented Sep 28, 2023

This runner skips any benchmark it can't do in a single AWS CLI command. If we ran multiple commands, one after another, it wouldn't be a fair comparison to the other runners, which do everything in parallel. And users probably aren't running multiple CLI commands in parallel, so it doesn't seem worth doing that comparison either.

Here are examples showing how this works:

  • Uploading or downloading a single file is simple:
    • benchmark: upload-5GiB
    • cmd: aws s3 cp upload/5GiB/1 s3://graebm-s3-benchmarks/upload/5GiB/1
  • A benchmark with multiple files only works if they're in the same directory:
    • benchmark: upload-5GiB-20x
    • cmd: aws s3 cp upload/5GiB s3://graebm-s3-benchmarks/upload/5GiB --recursive
  • If the benchmark doesn't use every file in the directory, then we --include the ones we want:
    • benchmark: upload-5GiB-10x
    • cmd: aws s3 cp upload/5GiB s3://graebm-s3-benchmarks/upload/5GiB --recursive --exclude "*" --include 1 --include 2 --include 3 --include 4 --include 5 --include 6 --include 7 --include 8 --include 9 --include 10
  • If the benchmark has "filesOnDisk": false then we upload from stdin, or download to stdout. This only works if the benchmark has 1 file.
    • benchmark: upload-5GiB-ram
    • cmd: <5GiB_random_data> | aws s3 cp - s3://graebm-s3-benchmarks/upload/5GiB/1
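
A rough sketch of the selection rules above, for uploads only. This is not the actual code in `benchrunner.py`: the function name `build_upload_command`, its parameters, and the `dir_listing` argument (names of all files in the local directory) are hypothetical, chosen just to illustrate when a benchmark fits in one `aws s3 cp` command and when it gets skipped.

```python
import posixpath
from typing import List, Optional


def build_upload_command(bucket: str, keys: List[str], files_on_disk: bool,
                         dir_listing: List[str]) -> Optional[List[str]]:
    """Return argv for one `aws s3 cp` upload, or None if the benchmark is skipped.

    keys: relative paths such as "upload/5GiB/1", used as local path and S3 key.
    dir_listing: names of all files in the local directory (hypothetical input).
    """
    if not files_on_disk:
        # Data is piped through stdin, which can only feed a single object.
        if len(keys) != 1:
            return None
        return ["aws", "s3", "cp", "-", f"s3://{bucket}/{keys[0]}"]

    if len(keys) == 1:
        return ["aws", "s3", "cp", keys[0], f"s3://{bucket}/{keys[0]}"]

    # Multiple files only fit in one command if they share a directory.
    dirs = {posixpath.dirname(k) for k in keys}
    if len(dirs) != 1:
        return None
    directory = dirs.pop()
    cmd = ["aws", "s3", "cp", directory, f"s3://{bucket}/{directory}", "--recursive"]

    names = [posixpath.basename(k) for k in keys]
    if set(names) != set(dir_listing):
        # Only a subset is wanted: exclude everything, then include each file.
        cmd += ["--exclude", "*"]
        for name in names:
            cmd += ["--include", name]
    return cmd
```

Under these assumptions, a benchmark that uses files `1` through `10` out of a directory holding `1` through `20` would produce the `--exclude "*" --include ...` form, while a `"filesOnDisk": false` benchmark with more than one file would return `None` and be skipped.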

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@graebm graebm enabled auto-merge (squash) October 3, 2023 00:34
@graebm graebm merged commit e40fadf into main Oct 3, 2023
1 check passed
@graebm graebm deleted the cli-runner branch October 3, 2023 00:37
@graebm graebm mentioned this pull request Oct 18, 2023
graebm added a commit that referenced this pull request Oct 18, 2023
Cut down on boilerplate by just having one python runner, instead of separate folders, each with their own build scripts, main functions, and bytes_to_GiB() helper functions, etc, etc, etc

This looks like a crap ton of new code, but it's just moving/splitting/combining code from the following PRs:
- #12
- #14
- #15