
model loading inference API #141

Open
clmnt opened this issue Nov 16, 2022 · 2 comments
Assignees
Labels
bug Something isn't working

Comments

clmnt (Member) commented Nov 16, 2022

Describe the bug

It gets stuck at model loading.

Reproduction

Go to https://huggingface.co/nitrosocke/classic-anim-diffusion and submit a prompt for the first time.

https://www.loom.com/share/10fdb5920e0248cc8162e145f8957d77
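
For what it's worth, the same loading state can be observed outside the browser by calling the Inference API directly. A minimal sketch, assuming a personal token in an HF_API_TOKEN environment variable and a hypothetical prompt (the endpoint is the public api-inference URL for the model above):

```python
import os

import requests

# Public Inference API endpoint for the model from the reproduction step.
API_URL = "https://api-inference.huggingface.co/models/nitrosocke/classic-anim-diffusion"
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}  # assumed env var

# On a cold start, the first request typically comes back as HTTP 503 with a
# JSON body containing "error" and "estimated_time" while the model loads.
response = requests.post(
    API_URL,
    headers=headers,
    json={"inputs": "classic disney style magical princess"},  # hypothetical prompt
)
print(response.status_code)
if response.status_code == 503:
    print(response.json())  # e.g. {"error": "... is currently loading", "estimated_time": ...}
```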

Logs

No response

System info

Chrome
clmnt added the bug label on Nov 16, 2022
osanseviero (Member) commented:

The first time, it says that the model is loading. When you refresh, it turns out the model is now loaded, so inference is fast this time. Moving to the community repo.
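
To double-check this behaviour programmatically, one can ask the API to block until the model is warm rather than refreshing the page. A hedged sketch, assuming the documented wait_for_model option, an HF_API_TOKEN environment variable, and a hypothetical prompt:

```python
import os

import requests

API_URL = "https://api-inference.huggingface.co/models/nitrosocke/classic-anim-diffusion"
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

payload = {
    "inputs": "classic disney style magical princess",  # hypothetical prompt
    # Instead of returning 503 while the model loads, wait until it is ready.
    "options": {"wait_for_model": True},
}
response = requests.post(API_URL, headers=headers, json=payload)
response.raise_for_status()

# Once the model is warm (as after the refresh described above), the response
# body is the generated image bytes and arrives quickly.
with open("out.png", "wb") as f:
    f.write(response.content)
```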

osanseviero transferred this issue from huggingface/huggingface_hub on Nov 16, 2022
Narsil (Contributor) commented Nov 17, 2022

Multiple things are at play here that we already know about:

  1. Model loading is not really using correct information. api-inference doesn't know how to properly "guess" the model size, so the loading bar is not accurate. It will never be perfectly accurate, but even a simple rule of thumb would make the loading bar larger and more representative.
  2. First loads are always much longer because the weights have to be downloaded.
  3. Sometimes, depending on cluster conditions, creating the Docker container is slower than usual (it depends on how many GPUs are in use, how many nodes are available, etc.; creating a new node on demand is much slower than just launching a pod).
  4. Inference still takes 5-6s, which feels very "slow" to us humans. Using xformers and fast attention should help a bit (expected to go down to ~3s); see the sketch after this comment.

Here I'm thinking 1 and 4 are the things we can most effectively do something about.
We're also working on adding tracing to the cluster so we have a better picture of 2 and 3.

@NouamaneTazi
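
For point 4 above, this is roughly what enabling xformers memory-efficient attention looks like when running the same model locally with diffusers; a sketch only, assuming a CUDA GPU and an installed xformers package, and not necessarily what api-inference runs internally:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the model from the report in half precision on the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "nitrosocke/classic-anim-diffusion", torch_dtype=torch.float16
).to("cuda")

# Memory-efficient attention from xformers; the kind of change expected to
# bring a 5-6s generation closer to ~3s.
pipe.enable_xformers_memory_efficient_attention()

image = pipe("classic disney style magical princess").images[0]  # hypothetical prompt
image.save("out.png")
```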


3 participants