This repository was archived by the owner on May 11, 2025. It is now read-only.

conversion error big time #33

@BBC-Esq

Description


I keep getting this error even though I'm trying to convert a model that should have the proper context size... not sure what else to do:

The model I'm trying to convert is Yi-1.5-9B-Chat

Token indices sequence length is longer than the specified maximum sequence length for this model (10073 > 4096). Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "D:\Scripts\bench_chat\convert_awq2.py", line 50, in <module>
    model.quantize(
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\models\base.py", line 211, in quantize
    self.quantizer = AwqQuantizer(
                     ^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\quantize\quantizer.py", line 69, in __init__
    self.modules, self.module_kwargs, self.inps = self.init_quant(
                                                  ^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\awq\quantize\quantizer.py", line 570, in init_quant
    self.model(samples.to(next(self.model.parameters()).device))
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 1139, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 918, in forward
    position_embeddings = self.rotary_emb(hidden_states, position_ids)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 153, in forward
    freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
             ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
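The fatal error is the last frame: in `modeling_llama.py` the rotary-embedding matmul gets one operand on `cpu` and the other on `cuda:0`, which PyTorch refuses. A minimal sketch of the invariant and the usual fix (this is not the reporter's script; the shapes mirror the `inv_freq_expanded @ position_ids_expanded` product in the traceback but are otherwise illustrative, and `same_device_matmul` is a hypothetical helper):

```python
import torch

def same_device_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Move both operands to one common device before multiplying.

    PyTorch raises "Expected all tensors to be on the same device" the
    moment a matmul mixes cpu and cuda tensors; normalizing the device
    first is the generic cure.
    """
    device = a.device if a.device.type != "cpu" else b.device
    return a.to(device) @ b.to(device)

# Shapes modeled on the failing line: a (1, dim, 1) frequency tensor times
# a (1, 1, seq_len) position tensor, then transposed, as in modeling_llama.
inv_freq = torch.randn(1, 64, 1)        # imagine this one stayed on cpu
position_ids = torch.randn(1, 1, 4096)  # while this one ended up on cuda:0
freqs = same_device_matmul(inv_freq, position_ids).transpose(1, 2)
print(freqs.shape)  # torch.Size([1, 4096, 64])
```

In the AutoAWQ context the analogous fix is presumably to make sure the whole model (including rotary-embedding buffers) sits on a single device before calling `model.quantize(...)`, rather than being split between cpu and cuda:0 by however the model was loaded.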
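Separately, the "Token indices sequence length" warning at the top is about calibration data, not the crash: at least one calibration sample tokenizes to 10,073 tokens while the model's maximum is 4,096. A trivial illustration of what truncating such a sample looks like (the two lengths come from the log; everything else is a stand-in, and whether your AutoAWQ version exposes a knob such as a maximum calibration sequence length for `quantize()` depends on the release you have installed):

```python
# Stand-in for one over-long tokenized calibration sample from the warning:
# 10,073 token ids against a 4,096-token model limit.
MAX_SEQ_LEN = 4096
token_ids = list(range(10073))

# Truncating each sample to the model limit silences the warning and avoids
# out-of-range position indices during calibration.
truncated = token_ids[:MAX_SEQ_LEN]
print(len(truncated))  # 4096
```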
