Parallel build does not make full use of available CPU resources #169

Open
stweil opened this issue Aug 20, 2020 · 4 comments
Labels
enhancement New feature or request

Comments

@stweil
Collaborator

stweil commented Aug 20, 2020

make all -j6 starts with downloading all submodules, 6 at a time, but because of the semaphore required for git, all downloads happen sequentially. Builds will only start once fewer than 6 submodules remain to download.

@stweil
Collaborator Author

stweil commented Aug 20, 2020

The first build for a freshly cloned ocrd_all requires about 16 minutes of CPU time. With 6 CPUs, it should run in less than 3 minutes, but it takes more than twice as long:

# time nohup make all -j6
real	6m24.406s
user	15m47.172s
sys	1m8.953s

In this test, the pip cache was filled from previous runs of the same user.
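The gap can be put into a number. The following is only a rough sketch: it assumes that all 6 CPUs were available for the whole run and that user plus sys time is a fair measure of the work done. The helper names (`to_seconds`, `parallel_efficiency`) are made up for illustration.

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a bash `time` value such as '6m24.406s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)

def parallel_efficiency(real: str, user: str, system: str, jobs: int) -> float:
    """CPU time actually used, divided by the CPU time available during the run."""
    cpu = to_seconds(user) + to_seconds(system)
    return cpu / (jobs * to_seconds(real))

# Values copied from the `time` output above.
cpu_seconds = to_seconds("15m47.172s") + to_seconds("1m8.953s")
ideal_wall = cpu_seconds / 6  # wall time if all 6 CPUs were busy throughout
efficiency = parallel_efficiency("6m24.406s", "15m47.172s", "1m8.953s", jobs=6)

print(f"CPU time: {cpu_seconds:.1f} s, ideal wall time on 6 CPUs: {ideal_wall:.1f} s")
print(f"parallel efficiency: {efficiency:.0%}")
```

Under these assumptions the run used roughly 44% of the CPU time that 6 cores could have provided.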

@bertsky
Collaborator

bertsky commented Aug 20, 2020

> make all -j6 starts with downloading all submodules, 6 at a time, but because of the semaphore required for git, all downloads happen sequentially.

Yes, that is unfortunate. But maybe we overreacted when fixing #123: there are lock files involved in both sync and update, but only the former uses a lock shared by all submodules (.git/config.lock), while the latter uses submodule-local locks (.git/modules/MODULE/{index,HEAD,...}.lock). So we could actually remove the second SEM, which accounts for most of the download time!
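To illustrate why lock granularity matters here, a toy model may help. It does not call git or make: threads stand in for make jobs, `time.sleep` stands in for the download, and the module names are hypothetical. It only compares one lock shared by everyone with one lock per module.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MODULES = [f"module{i}" for i in range(6)]  # hypothetical submodule names

def peak_concurrency(lock_for, jobs=6, work=0.05):
    """Run one simulated update per module; return the most updates in flight at once."""
    active = 0
    peak = 0
    counter_guard = threading.Lock()  # protects the two counters only

    def update(module):
        nonlocal active, peak
        with lock_for(module):
            with counter_guard:
                active += 1
                peak = max(peak, active)
            time.sleep(work)  # stands in for the network download
            with counter_guard:
                active -= 1

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        list(pool.map(update, MODULES))
    return peak

shared = threading.Lock()                              # one lock for everyone
per_module = {m: threading.Lock() for m in MODULES}    # one lock per module

peak_shared = peak_concurrency(lambda m: shared)
peak_local = peak_concurrency(lambda m: per_module[m])
print("one shared lock, peak parallel updates:", peak_shared)
print("per-module locks, peak parallel updates:", peak_local)
```

With the shared lock, at most one update runs at a time no matter how many jobs are allowed; with per-module locks, the updates can overlap.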

> Builds will only start once fewer than 6 submodules remain to download.

If you don't restrict make to a fixed number of jobs, but limit it by load level instead (e.g. -j -l 6), this behaviour should improve!

@stweil
Collaborator Author

stweil commented Aug 21, 2020

Using make all -j -l 6 seems to create a large overhead for an only slightly faster build:

# time nohup make all -j -l6
real	6m3.837s
user	19m42.264s
sys	1m40.864s

I had removed the second git SEM for this test.
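For a rough comparison of the two runs, the `time` figures can be set side by side. This is a sketch only: the runs differ in two ways at once (load-limited jobs and the removed SEM), so the numbers cannot be attributed to either change alone. The helper names are made up for illustration.

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a bash `time` value such as '6m3.837s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)

def summarize(real: str, user: str, system: str) -> dict:
    """Return wall-clock and CPU seconds (user + sys) for one run."""
    return {"wall": to_seconds(real),
            "cpu": to_seconds(user) + to_seconds(system)}

# Values copied from the two `time` outputs in this thread.
fixed_jobs = summarize("6m24.406s", "15m47.172s", "1m8.953s")   # make all -j6
load_limit = summarize("6m3.837s", "19m42.264s", "1m40.864s")   # make all -j -l6

wall_change = load_limit["wall"] / fixed_jobs["wall"] - 1
cpu_change = load_limit["cpu"] / fixed_jobs["cpu"] - 1

print(f"wall time change: {wall_change:+.1%}")
print(f"CPU time change:  {cpu_change:+.1%}")
```

On these figures the wall time drops by about 5% while the CPU time rises by about 26%.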

@jbarth-ubhd

My recent experience is that make all -j 8 does not finish everything: running make all afterwards still builds a lot.
