This repository was archived by the owner on May 11, 2025. It is now read-only.

conversion error big time #33

@BBC-Esq

Description


I keep getting this error even though I'm trying to convert a model that should have the proper context size... not sure what else to do:

The model I'm trying to convert is Yi-1.5-9B-Chat

Token indices sequence length is longer than the specified maximum sequence length for this model (10073 > 4096). Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "D:\Scripts\bench_chat\convert_awq2.py", line 50, in <module>
    model.quantize(
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\models\base.py", line 211, in quantize
    self.quantizer = AwqQuantizer(
                     ^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\quantize\quantizer.py", line 69, in __init__
    self.modules, self.module_kwargs, self.inps = self.init_quant(
                                                  ^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\quantize\quantizer.py", line 570, in init_quant
    self.model(samples.to(next(self.model.parameters()).device))
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 1139, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 918, in forward
    position_embeddings = self.rotary_emb(hidden_states, position_ids)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 153, in forward
    freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
             ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
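The fatal error is the last frame: in `modeling_llama.py` the rotary-embedding matmul gets one operand on `cpu` and the other on `cuda:0`, which PyTorch refuses. A minimal sketch of the invariant and the usual fix (this is not the reporter's script; the shapes mirror the `inv_freq_expanded @ position_ids_expanded` product in the traceback but are otherwise illustrative, and `same_device_matmul` is a hypothetical helper):

```python
import torch

def same_device_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Move both operands to one common device before multiplying.

    PyTorch raises "Expected all tensors to be on the same device" the
    moment a matmul mixes cpu and cuda tensors; normalizing the device
    first is the generic cure.
    """
    device = a.device if a.device.type != "cpu" else b.device
    return a.to(device) @ b.to(device)

# Shapes modeled on the failing line: a (1, dim, 1) frequency tensor times
# a (1, 1, seq_len) position tensor, then transposed, as in modeling_llama.
inv_freq = torch.randn(1, 64, 1)        # imagine this one stayed on cpu
position_ids = torch.randn(1, 1, 4096)  # while this one ended up on cuda:0
freqs = same_device_matmul(inv_freq, position_ids).transpose(1, 2)
print(freqs.shape)  # torch.Size([1, 4096, 64])
```

In the AutoAWQ context the analogous fix is presumably to make sure the whole model (including rotary-embedding buffers) sits on a single device before calling `model.quantize(...)`, rather than being split between cpu and cuda:0 by however the model was loaded.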
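Separately, the "Token indices sequence length" warning at the top is about calibration data, not the crash: at least one calibration sample tokenizes to 10,073 tokens while the model's maximum is 4,096. A trivial illustration of what truncating such a sample looks like (the two lengths come from the log; everything else is a stand-in, and whether your AutoAWQ version exposes a knob such as a maximum calibration sequence length for `quantize()` depends on the release you have installed):

```python
# Stand-in for one over-long tokenized calibration sample from the warning:
# 10,073 token ids against a 4,096-token model limit.
MAX_SEQ_LEN = 4096
token_ids = list(range(10073))

# Truncating each sample to the model limit silences the warning and avoids
# out-of-range position indices during calibration.
truncated = token_ids[:MAX_SEQ_LEN]
print(len(truncated))  # 4096
```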
