
[feature improvement request (srl_zoo)] Around 30% speed-up by changing a few lines of code #46

Open
ncble opened this issue May 24, 2019 · 4 comments
Labels
enhancement New feature or request

Comments

@ncble
Collaborator

ncble commented May 24, 2019

Problem description

In the current (origin/master) version, there are two preprocessing modes: 1. 'tf' and 2. 'image_net'. I have noticed that, with the same device/Python environment/encoder model/robot environment etc., the option 'image_net' provides a ~30% speed-up compared to 'tf'.

Reproduce the problem

  • Under robotics-rl-srl/
    $ python -m environments.dataset_generator --env MobileRobotGymEnv-v0 --name mobile2D_fixed_tar_seed_0 --seed 0 --num-cpu 8
  • Modify the script srl_zoo/preprocessing/utils.py so that the preprocessing mode is 'tf'
  • Under srl_zoo/
    $ python train.py --data-folder mobile2D_fixed_tar_seed_0 --losses autoencoder

With the original version of srl_zoo, the autoencoder's training time per epoch (under 'tf' mode) is about 43s on my computer; with the following modification, it drops to about 31s.

Solution

I propose to change the script srl_zoo/preprocessing/utils.py (both functions, preprocessInput and deNormalize).

  def preprocessInput(x, mode="tf"):
      ....
      assert x.shape[-1] == 3, "Color channel must be at the end of the tensor {}".format(x.shape)
      x /= 255.
      if mode == "tf":
          # x -= 0.5
          # x *= 2.
          # The per-channel version below is ~33% faster than the
          # whole-array version commented out above.
          x[..., 0] -= 0.5
          x[..., 1] -= 0.5
          x[..., 2] -= 0.5
          x[..., 0] *= 2.
          x[..., 1] *= 2.
          x[..., 2] *= 2.

The change to deNormalize is similar.
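For illustration, a per-channel rewrite of deNormalize might look like the sketch below. This is a hypothetical example of the same pattern, not the actual srl_zoo code; the real function's signature, extra modes, and output range may differ.

```python
import numpy as np

def deNormalize(x, mode="tf"):
    # Hypothetical sketch: per-channel inverse of preprocessInput.
    # The real srl_zoo deNormalize may handle more modes and details.
    assert x.shape[-1] == 3, "Color channel must be at the end of the tensor {}".format(x.shape)
    if mode == "tf":
        # Invert the (x - 0.5) * 2 normalization channel by channel
        for c in range(3):
            x[..., c] /= 2.
            x[..., c] += 0.5
    # Map [0, 1] back to [0, 255]
    x *= 255.
    return x
```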

@araffin araffin added the enhancement New feature or request label May 24, 2019
@araffin
Owner

araffin commented May 24, 2019

This is unexpected, but why not, if the result is the same.

@ncble
Collaborator Author

ncble commented May 28, 2019

Agreed! I was shocked when I discovered this. Do you have any idea how this happens? I guess it's related to multiprocessing?

@araffin
Owner

araffin commented May 28, 2019

I couldn't reproduce your results...

minimal code:

import numpy as np

def prepro(x, mode='one'):
    x /= 255.
    if mode == 'one':
        x -= 0.5
        x *= 2.
    else:
        x[..., 0] -= 0.5
        x[..., 1] -= 0.5
        x[..., 2] -= 0.5
        x[..., 0] *= 2.
        x[..., 1] *= 2.
        x[..., 2] *= 2.
    return x


image = 255. * np.random.random((224, 224, 3))

a = prepro(image.copy())
b = prepro(image.copy(), mode='test')

assert np.allclose(a, b)

and in an IPython console:

In [1]: from test import prepro, image

In [2]: %timeit prepro(image.copy(), mode='test')
782 µs ± 9.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [3]: %timeit prepro(image.copy())
535 µs ± 7.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: %timeit prepro(image.copy())
534 µs ± 5.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
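For readers without IPython, the same comparison can be run as a plain script using the timeit module (a sketch; absolute timings are machine-dependent, only the relative gap matters):

```python
# Standalone version of the benchmark above, without IPython magics.
import timeit
import numpy as np

def prepro(x, mode='one'):
    # Same function as in the minimal code above
    x /= 255.
    if mode == 'one':
        x -= 0.5
        x *= 2.
    else:
        x[..., 0] -= 0.5
        x[..., 1] -= 0.5
        x[..., 2] -= 0.5
        x[..., 0] *= 2.
        x[..., 1] *= 2.
        x[..., 2] *= 2.
    return x

image = 255. * np.random.random((224, 224, 3))

# Time 1000 runs of each variant; image.copy() is included in both,
# so it contributes equally to both measurements.
t_one = timeit.timeit(lambda: prepro(image.copy()), number=1000)
t_chan = timeit.timeit(lambda: prepro(image.copy(), mode='test'), number=1000)
print("whole-array: {:.1f} ms, per-channel: {:.1f} ms (1000 loops)".format(
    1000 * t_one, 1000 * t_chan))
```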

@ncble
Collaborator Author

ncble commented May 28, 2019

Yes, I agree that these two methods alone are equivalent (in terms of results and timing), but when you call them via data_loader.py (please change the scripts as I described above), the elapsed time is significantly different! That's why I guess the problem is related to multiprocessing.
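One way to probe that hypothesis, without pulling in srl_zoo itself, would be to time both variants inside worker processes, roughly mimicking how data_loader.py preprocesses images in parallel. This is a hypothetical experiment sketch: run, n_images, and n_workers are made-up names, and the real data loader's process setup may interact differently.

```python
# Hypothetical experiment: compare whole-array vs per-channel preprocessing
# when executed inside multiprocessing workers. Not srl_zoo code; it only
# isolates the suspected interaction with multiprocessing.
import time
from multiprocessing import Pool

import numpy as np

def prepro_whole(x):
    # Whole-array variant
    x /= 255.
    x -= 0.5
    x *= 2.
    return x

def prepro_channel(x):
    # Per-channel variant proposed in this issue
    x /= 255.
    for c in range(3):
        x[..., c] -= 0.5
        x[..., c] *= 2.
    return x

def run(fn, n_images=64, n_workers=8):
    # Time preprocessing a batch of images across a worker pool
    images = [255. * np.random.random((224, 224, 3)) for _ in range(n_images)]
    start = time.perf_counter()
    with Pool(n_workers) as pool:
        pool.map(fn, images)
    return time.perf_counter() - start

if __name__ == "__main__":
    print("whole-array : {:.3f}s".format(run(prepro_whole)))
    print("per-channel : {:.3f}s".format(run(prepro_channel)))
```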
