Any reason why the fine-tuning Llama notebook runs only on Colab? #7

yairVanti · 2023-09-21T08:43:50Z

I tried running the same notebook on a GCP A100 machine, and it failed with:

```
File ~/.local/lib/python3.9/site-packages/transformers/utils/bitsandbytes.py:109, in set_module_quantized_tensor_to_device(module, tensor_name, device, value, fp16_statistics)
    107     new_value = old_value.to(device)
    108 elif isinstance(value, torch.Tensor):
--> 109     new_value = value.to(device)
    110 else:
    111     new_value = torch.tensor(value, device=device)

RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```

On Colab it works perfectly.
Any idea?


mlabonne · 2023-09-21T13:44:57Z

It could be related to the CUDA version this machine is using. I'd also recommend updating the libraries (especially transformers, accelerate, and bitsandbytes); this might resolve a compatibility issue.
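
One way to compare the two environments is to print the same diagnostics on Colab and on the GCP machine. The following is a minimal sketch, not part of the notebook: the helper names `installed_versions` and `cuda_summary` are hypothetical, and the free-memory check rests on the assumption that another process may already be holding GPU memory on the A100, which the thread does not confirm.

```python
from importlib import metadata

# Libraries named in the suggestion above, plus torch itself.
PACKAGES = ("torch", "transformers", "accelerate", "bitsandbytes")


def installed_versions(packages=PACKAGES):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_summary():
    """Return CUDA details as seen by PyTorch, or None if torch cannot be imported."""
    try:
        import torch
    except (ImportError, OSError):
        return None

    summary = {
        # CUDA version PyTorch was built against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if summary["cuda_available"]:
        # mem_get_info() returns (free_bytes, total_bytes) for the current device.
        free, total = torch.cuda.mem_get_info()
        summary["device"] = torch.cuda.get_device_name(0)
        summary["free_gib"] = round(free / 2**30, 2)
        summary["total_gib"] = round(total / 2**30, 2)
    return summary


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
    print(cuda_summary())
```

If the library versions or the CUDA build differ between the two machines, aligning them with the Colab environment is a reasonable first step. If `free_gib` is far below `total_gib` before the model is loaded, it may be worth checking `nvidia-smi` for other processes using the GPU.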

mlabonne closed this as completed Oct 17, 2023
