Running translategemma-4b-it (F16/q8_0) on a relatively long text with a context size greater than 1024 crashes chatllm with an error on the CUDA / Vulkan backends, but works fine on CPU. #122
Description
Chatllm.cpp: v0.22
Command line: -c 2048 -i -ngl all -m C:\model\translategemma-4b-it-f16.bin
Vulkan error: C:\projects\chatllm.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:6342: GGML_ASSERT(allow_misalign || misalign_bytes == 0) failed
Sample output:
You are served by translategemma-4b-it, with 3880263168 (3.9B) parameters.
You > Translate to spanish: Application performance of the 9950X3D is fantastic, it even beats the 9950X, which is somewhat unexpected. While it's no surprise that some applications will benefit from the extra cache, we're seeing higher performance across the board in virtually every single test. Taking a closer look at the typical clock frequencies of both processors reveals that on average, the 9950X3D does clock higher than the 9950X. Partly responsible for that is the fact that energy efficiency is a little bit better on the X3D version, or it could simply be that these cores are binned better. Still, this means you won't have to decide "do I focus on applications and prefer the 9950X, or focus on gaming, so I should pick the 9950X3D?"-No, the 9950X3D will give you the best performance, period-as long as you can afford it. Compared to the Intel Core Ultra 9 285K, the performance uplift is around 7%, and against the 14900K the improvement is 13%. This confirms once again, that AMD is the leader in the processor space, not only for gaming, but for applications, too. While the 9800X3D is fantastic for gaming, it does fall back in application workloads, especially tasks that scale to a lot of cores run A LOT faster on the 9950X3D, but the processor is more expensive, too, of course. AMD also has support for the AVX512 instruction set, which is beneficial for a handful of specialized applications, but a total non-issue for the general consumer. Intel on the other hand has introduced an NPU for AI acceleration with Arrow Lake, which is missing on Zen 5 desktop processors. That's not a problem either, because at the moment most AI experiences are cloud-based or run on the CPU/GPU.Intel's Arrow Lake CPUs have been launched, both K and non-K. Maybe they'll release a KS model, but I doubt that it will bring any significant gains over their existing offerings.
AMD's Zen 6 seems to be one more year away. Intel Panther Lake, aka Core Ultra 300 Series is expected to launch at the end of 2025, but that will probably be mobile only. There have been whispers about an Arrow Lake refresh, which could allow Intel to tackle the performance shortcomings observed in certain cases. Although Arrow Lake possesses impressive raw power, it hasn't been effectively converting that into optimal FPS figures for gaming and related workloads-or they canceled it and are focusing on their next-gen, which would be "Beast Lake," expected for 2026.
A.I. > La aplicación del 9950X3D es fantástica, incluso supera al 9950X, lo cual es algo inesperado. Si bien no es sorprendente que algunas aplicaciones se beneficien de la caché adicional, estamos viendo un mejor rendimiento en prácticamente todas las pruebas. Al examinar de cerca las frecuencias de reloj típicas de ambos procesadores, se revela que, en promedio, el 9950X3D funciona a una frecuencia más alta que el 9950X. Una de las razones de esto es que la eficiencia energética es un poco mejor en la versión X3D, o simplemente que estos núcleos están mejorificados. Sin embargo, esto significa que no tendrás que decidir "¿Debería enfocarme en aplicaciones y elegir el 9950X, o enfocarme en juegos, por lo que debería elegir el 9950X3D?" - No, el 9950X3D te dará el mejor rendimiento, sin importar nada, siempre y cuando puedas pagarlo. En comparación con el Intel Core Ultra 9 285K, el aumento de rendimiento es de alrededor del 7%, y en comparación con el 14900K, la mejora es del 13%. Esto confirma una vez más que AMD es el líder en el espacio de procesadores, no solo para juegos, sino también para aplicaciones. Si bien el 9800X3D es fantástico para juegos, tiene un rendimiento inferior en tareas de aplicaciones, especialmente aquellas que escalan a muchos núcleos, las cuales se ejecutan MUCHO más rápido en el 9950X3D, pero el procesador también es más caro.
AMD también tiene soporte para el conjunto de instrucciones AVX512, lo cual es beneficioso para algunas aplicaciones especializadas, pero no es un problema para el consumidor general. Intel, por otro lado, ha introducido una NPU para la aceleración de IA con Arrow Lake, lo cual no está presente en los
C:\projects\chatllm.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:6342: GGML_ASSERT(allow_misalign || misalign_bytes == 0) failed
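
Note on the assertion: read literally, the failing condition says that a Vulkan operation was handed a buffer whose byte offset is not a multiple of a required alignment, on a code path that does not tolerate misalignment. The sketch below only illustrates what a check of that shape computes. The helper names `misalign_bytes` and `passes_check` and the alignment value 64 are hypothetical; they are not taken from ggml-vulkan.cpp, and the real backend may derive these values differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from ggml-vulkan.cpp): number of bytes by which
// `offset` exceeds the previous multiple of `alignment`.
// `alignment` must be a non-zero power of two.
inline std::uint64_t misalign_bytes(std::uint64_t offset, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return offset & (alignment - 1);
}

// Same shape as the condition in the reported assertion:
// the check passes if misalignment is allowed, or if there is none.
inline bool passes_check(bool allow_misalign, std::uint64_t misalign) {
    return allow_misalign || misalign == 0;
}
```

With an assumed alignment of 64 bytes, an offset of 256 passes the check, while an offset of 260 leaves 4 misaligned bytes and fails it unless misalignment is explicitly allowed. Since the crash appears only once generation has run for a while with `-c 2048`, one possible (unverified) explanation is that some tensor view ends up at such an offset when the context grows past a certain length.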