Hey,

I'm trying to run the Gradio app on Windows. The app loads fine, and so does the model, but as soon as it's about to run inference it throws this error:
Exception in thread Thread-6 (generate):
Traceback (most recent call last):
  File "C:\Users\jtabox\envs\p312\Lib\threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "C:\Users\jtabox\envs\p312\Lib\threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\transformers\generation\utils.py", line 2564, in generate
    result = decoding_method(
             ^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\transformers\generation\utils.py", line 2784, in _sample
    outputs = self(**model_inputs, return_dict=True)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\transformers\utils\generic.py", line 918, in wrapper
    output = func(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\transformers\models\llava\modeling_llava.py", line 419, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\transformers\utils\generic.py", line 918, in wrapper
    output = func(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\transformers\models\llava\modeling_llava.py", line 285, in forward
    outputs = self.language_model(
              ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\liger_kernel\transformers\model\llama.py", line 81, in lce_forward
    outputs = self.model(
              ^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\torch\nn\modules\module.py", line 1964, in __getattr__
    raise AttributeError(
AttributeError: 'LlamaModel' object has no attribute 'model'
Error during generation:
Traceback (most recent call last):
  File "G:\progs\_JoyModels\joycaption__fpgaminer_gh\gradio-app\app.py", line 555, in chat_joycaption
    for text in streamer:
                ^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\site-packages\transformers\generation\streamers.py", line 226, in __next__
    value = self.text_queue.get(timeout=self.timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jtabox\envs\p312\Lib\queue.py", line 179, in get
    raise Empty
_queue.Empty
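From reading the first traceback, my guess is that liger_kernel's patched `lce_forward` expects to be bound to a wrapper object that exposes the backbone as `.model`, but here it ends up being invoked with the bare `LlamaModel`, which has no such attribute. A minimal standalone sketch of that mismatch (the class and function below are stand-ins I wrote, not the actual library code):

```python
# Stand-in for transformers' bare LlamaModel (the backbone itself, no wrapper).
class LlamaModel:
    def __call__(self, *args, **kwargs):
        return "hidden states"

# Stand-in for a liger_kernel-style patched forward: it assumes `self` is a
# *ForCausalLM-style wrapper that exposes the backbone as `self.model`.
def lce_forward(self, *args, **kwargs):
    return self.model(*args, **kwargs)  # fails if `self` IS the backbone

backbone = LlamaModel()
try:
    lce_forward(backbone)
except AttributeError as e:
    print(e)  # 'LlamaModel' object has no attribute 'model'
```

If that really is what's happening, it would point to a version mismatch between transformers 4.57.3 and liger_kernel 0.7.0 rather than anything Windows-specific, but I may be misreading it.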
This is the system configuration it outputs when the app starts:

🛠️ System configuration:
Python : 3.12.10 (C:\Users\jtabox\envs\p312\python.exe)
PyTorch : 2.9.1+cu128
‣ CUDA build : 12.8
transformers : 4.57.3
bitsandbytes : 0.49.0
liger_kernel : 0.7.0
GPUs (total 1):
• [0] NVIDIA GeForce RTX 3090 | compute 8.6 | 23.7 GiB
Any ideas?