How to load into Colab

#1
by ghpkishore - opened

I cannot seem to load the model locally in Colab using git; it shows that setup.py is missing. Also, when I try the normal method of "from transformers import", I am not able to load it because the RAM runs out. I am using a Google Colab Pro account. Is there a way for me to resolve this?

Hi,

Looking at the docs, the weights are in float16 format, meaning that 16 bits or 2 bytes are used to store each parameter.

That means that, for a 20-billion-parameter model, you need 20 billion parameters × 2 bytes per parameter = 40 billion bytes, i.e. 40 GB. That's the amount of RAM required just to load the weights.
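The same back-of-the-envelope calculation in code:

```python
# Back-of-the-envelope estimate: parameters x bytes per parameter.
n_params = 20e9               # ~20 billion parameters
bytes_per_param = 2           # float16 = 16 bits = 2 bytes
print(n_params * bytes_per_param / 1e9, "GB")  # -> 40.0 GB
```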

EleutherAI org

That’s not quite correct. GPT-NeoX-20B was trained using mixed precision (fp32/fp16). These weights are in fp32, which is why the docs mention using .half() before loading the model onto the GPU.

I’m not sure what GPUs you are able to get via Colab, but inference with this model typically requires more than 40 GB of VRAM.
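For reference, the pattern the docs describe looks roughly like this; the fp32 checkpoint is loaded into CPU RAM first, which is what drives the memory requirement:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading the fp32 checkpoint takes roughly 80 GB of CPU RAM;
# after .half(), the weights alone occupy ~40 GB of GPU VRAM.
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")
model = model.half().to("cuda")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```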

Oh, my apologies. I read in the docs that "GPT-NeoX-20B was trained with fp16"; I guess that should be corrected.

Also, I think it may be beneficial to add the RAM requirements to the docs as well, similar to the "tips" section of GPT-J.

Do you think it would be beneficial to have a separate branch on this repo with float16 weights?
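For context, GPT-J serves its fp16 weights from a float16 branch selected via the revision argument of from_pretrained; a NeoX equivalent would look roughly like the sketch below (the branch name is hypothetical here, since no such branch exists on this repo):

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical: mirrors GPT-J's setup; this repo has no float16 branch (yet).
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",
    revision="float16",         # hypothetical branch holding fp16 weights
    torch_dtype=torch.float16,  # load directly in fp16, skipping the fp32 copy
)
```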

cc'ing @sgugger regarding whether or not this model can be loaded into Google Colab using Accelerate's big model inference feature.

Not on Colab free, no; they don't provide enough disk space to even download the weights.
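For anyone running in an environment with enough disk, big model inference is enabled by passing device_map="auto" to from_pretrained; a minimal sketch (the offload folder is just an example path):

```python
import torch
from transformers import AutoModelForCausalLM

# Requires `pip install accelerate`. device_map="auto" spreads the weights
# across available GPU(s), CPU RAM, and, if needed, disk offload.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",
    device_map="auto",
    torch_dtype=torch.float16,
    offload_folder="offload",  # example path, used only if disk offload kicks in
)
```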

@stellaathena I'm surprised to learn the model was trained in fp16 (not bfloat16?), as we get crappy generations in fp16 but decent ones in bfloat16 in our tests.

Edit: Looks like it was only a bug in the Transformers implementation. https://github.com/huggingface/transformers/pull/17811 should fix the float16 generations.

Thanks for letting me know and for fixing the issue, @sgugger. I will upgrade to Colab Pro and see if it can run there.

stellaathena changed discussion status to closed

I have tried running the model in Colab Pro but failed, as it only has 39-40 GB of GPU RAM.

GitHub Codespaces
