This repository was archived by the owner on May 11, 2025. It is now read-only.
conversion error big time #33
I keep getting this error even though the model I'm trying to convert should have the proper context size; not sure what else to do.
The model I'm trying to convert is Yi-1.5-9B-Chat.
Token indices sequence length is longer than the specified maximum sequence length for this model (10073 > 4096). Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "D:\Scripts\bench_chat\convert_awq2.py", line 50, in <module>
    model.quantize(
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\models\base.py", line 211, in quantize
    self.quantizer = AwqQuantizer(
                     ^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\quantize\quantizer.py", line 69, in __init__
    self.modules, self.module_kwargs, self.inps = self.init_quant(
                                                  ^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\quantize\quantizer.py", line 570, in init_quant
    self.model(samples.to(next(self.model.parameters()).device))
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 1139, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 918, in forward
    position_embeddings = self.rotary_emb(hidden_states, position_ids)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 153, in forward
    freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
            ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
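For illustration, the failing line boils down to a matmul between tensors on different devices. This minimal sketch (plain PyTorch, not the AWQ code itself) reproduces the error class and the usual fix of moving every operand to one device first:

```python
import torch

# A matmul between a CPU tensor and a CUDA tensor raises the same
# "Expected all tensors to be on the same device" RuntimeError as above.
a = torch.ones(2, 3)  # on CPU by default
b = torch.ones(3, 2)

# Fix: pick one device and move both operands there before multiplying.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
result = a.to(device) @ b.to(device)
print(tuple(result.shape))  # (2, 2)
```

In this traceback the mismatch plausibly comes from the model's weights being split across cpu and cuda:0 (e.g. when loaded with `device_map="auto"` and the model doesn't fully fit on the GPU), while the calibration samples are sent to only one of those devices; loading the model entirely onto a single device before calling `model.quantize(...)` is one thing worth trying. That is an assumption from the error text, not a verified diagnosis of this AWQ version.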