The LLM ("Mind" in the context of the game) is executed only on the CPU.
Of course, the time varies with the length of the response, but I would say it averages about 8 seconds. If your computer has better hardware, the time you mention seems strange to me.
Please check whether other software (for example, an antimalware tool) is consuming CPU time. Also, make sure you have free space on your hard drive so that Windows can do memory paging.
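If it helps to put numbers on this, here is a minimal Python sketch (not part of the game) that reports free disk space and averages the response time over a few prompts. The `generate` callable is hypothetical: it stands in for whatever actually runs the LLM on your machine, and the placeholder below only sleeps.

```python
import shutil
import time
from statistics import mean


def free_disk_gb(path="."):
    """Return the free space, in GiB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def average_latency(generate, prompts):
    """Return the average wall-clock seconds `generate` takes per prompt."""
    timings = []
    for prompt in prompts:
        start = time.perf_counter()
        generate(prompt)  # hypothetical model call supplied by the caller
        timings.append(time.perf_counter() - start)
    return mean(timings)


if __name__ == "__main__":
    # Placeholder for the real model call (assumption): replace it with your own.
    def fake_generate(prompt):
        time.sleep(0.01)

    print(f"Free disk space: {free_disk_gb():.1f} GiB")
    print(f"Average latency: {average_latency(fake_generate, ['hi', 'hello']):.2f} s")
```

Running it once while idle and once with your usual background programs open should show whether something else is eating the CPU.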
You can also try this LLM: Llama-3.2-3B-Instruct-GGUF (https://www.youtube.com/watch?v=bM5TRBu5GGE)
https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/blob/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf
It is the most modern and compact version of Llama. I have already tried it and it is faster. It is a strong candidate to be the new default; the downside is that it is not uncensored.
Hey everyone, please try it out and let me know your thoughts. It seems like a pretty good option for the new default.
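A small aside for anyone scripting the download: the Hugging Face link above is a `/blob/` page (the web view), while the raw file is served from the matching `/resolve/` path. Below is a rough sketch of that conversion; `blob_to_resolve` is just an illustrative helper name, not something the game or Hugging Face provides.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_resolve(url):
    """Turn a Hugging Face web-view (/blob/) URL into a direct-download (/resolve/) URL."""
    parts = urlsplit(url)
    # Expected path layout: /<user>/<repo>/blob/<revision>/<file>
    segments = parts.path.split("/")
    if len(segments) > 3 and segments[3] == "blob":
        segments[3] = "resolve"
    return urlunsplit(parts._replace(path="/".join(segments)))


if __name__ == "__main__":
    page = (
        "https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF"
        "/blob/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf"
    )
    print(blob_to_resolve(page))
```

The printed URL can then be passed to a browser or any download tool.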
This script could have some weird spy reflexes. ^^
I'm on it.
I can't tell whether they are equally quick or not. I have no delay even with 3.1-F32 on my PC.
(That reminds me how badly written GPT-Neo was compared to them.)
As for quality, I didn't run a long test, but the 3.2 version does seem good. In French there are few errors (even fewer than 3.1, I would say) and only rarely some weird English.
All Llama models seem to do that from time to time anyway.
So 3.2 already has a good understanding, but it has made mistakes. (The disclaimer warns us about that.)
Strangely, it talks more efficiently but changes or misuses the story from the prompt more often than 3.1.
Only a few rare details are sometimes changed or misused. But my short test didn't tell me whether it will remember what we said over a long conversation. I will see. 3.1 is not perfect either and also has its limits.
Thanks a lot!
I've tried 3.2-3B-instruc-uncensored.Q4_K_M, and it seems to work fine in English. The responses are also slightly faster than 3.1. The fact that it's only 2GB in size is also very appealing.
However, I've tried it mostly with Spanish, and it does make more grammatical mistakes than 3.1, which is the part I don't like. I'll keep trying to be sure.
Hi, thanks for commenting, and I apologize for the inconvenience.
Assuming you have a similar or better CPU than mine (Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz with 16.0 GB RAM), 5 minutes is too long and suggests that the resources are being consumed by another application. The text response depends solely on the CPU (the graphics card is used for Unreal Engine). If you open the system monitor, is there another process or application consuming the CPU?