
Gpu selection #145

Open · wants to merge 4 commits into base: master
Conversation

OfekShochat

GPUs can now be selected for training. The syntax is

gpu: 1, 2, ..., n

and

gpu: 1,2,...,n
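
For example, in the YAML training config a line such as gpu: 0,2 would select the first and third GPUs (the exact config file layout is an assumption here; it is not shown in this thread).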

Tilps previously requested changes Mar 14, 2021
@OfekShochat
Author

@Tilps how can I push from the gpu-selection branch on my fork to here now? I made the changes.

@Tilps
Contributor

Tilps commented Mar 14, 2021

Nothing to do.

@Tilps
Contributor

Tilps commented Mar 14, 2021

Needs a test and a formatting fix, and then it's probably fine to go.

Tilps dismissed their stale review March 14, 2021 09:12

addressed

@dje-dev

dje-dev commented May 23, 2021

This enhancement will be very useful. However, it seems to break backwards compatibility: for example, a value of 0 now throws

TypeError: argument of type 'int' is not iterable

Instead, perhaps an isinstance clause could be added as follows:

elif isinstance(self.cfg['gpu'], str) and "," in self.cfg['gpu']:
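
To illustrate the idea, here is a minimal standalone sketch of such a guard; make_strategy, its gpu_cfg argument and the OneDeviceStrategy fallback are illustrative assumptions, not the PR's actual diff:

import tensorflow as tf

def make_strategy(gpu_cfg):
    # Hypothetical helper (not from the PR): pick a distribution strategy from
    # the config's 'gpu' value, which may be an int (old behaviour) or a
    # comma-separated string of GPU indices (this PR's new syntax).
    if isinstance(gpu_cfg, str) and "," in gpu_cfg:
        # Multi-GPU: MirroredStrategy accepts device strings such as "GPU:0".
        devices = ["GPU:" + i.strip() for i in gpu_cfg.split(",")]
        return tf.distribute.MirroredStrategy(devices)
    # Backwards-compatible single-GPU path: an int such as 0 keeps working.
    return tf.distribute.OneDeviceStrategy("/gpu:" + str(gpu_cfg))

With that guard, make_strategy(0) preserves the old single-GPU behaviour while make_strategy("0,1") mirrors training across two GPUs.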

@OfekShochat
Author

Yeah, it's known. I just didn't know whether this was even something worth working on. If you're interested, sure, I'll happily fix it!

@OfekShochat
Author

Well, now it's giving some git error, but this should work now. Can you check, @dje-dev? I'm not home...

@dje-dev

dje-dev commented May 30, 2021

It turns out MirroredStrategy wants a list of strings such as ["GPU:0", "GPU:1"].

So I was able to get the code shown below to work.

Note that set_memory_growth is called here; I'm not sure whether that is required or desirable. If it is called, it seems necessary to call it for all GPUs, not just the selected ones; otherwise TensorFlow throws an error.

        elif "," in str(self.cfg['gpu']):
            gpus = tf.config.experimental.list_physical_devices('GPU')
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
            for i in self.cfg['gpu'].split(","):
                active_gpus.append("GPU:" + i)
            self.strategy = tf.distribute.MirroredStrategy(active_gpus)
            tf.distribute.experimental_set_strategy(self.strategy)
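
As a usage note, this is a generic TensorFlow pattern rather than code from tfprocess.py (which sets the strategy globally via experimental_set_strategy): the resulting strategy is what replicates variables and the model across the selected devices when construction happens under its scope.

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
with strategy.scope():
    # Variables created under the scope are mirrored on both selected GPUs.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")
print("replicas:", strategy.num_replicas_in_sync)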

@OfekShochat
Author

OK.
Does it work now? If it does, I'll change my PR.
