More than words

Super Slow response Time
Is there a way of speeding up how fast she responds? It takes around 20 seconds per line for me. Thanks.
Did you change her Mind, or are you using the default one?
Jorge  [Developer] Oct 4, 2024, 10:35 
Hi, my computer is a bit old, from 2019. It has an Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz, 16.0 GB of RAM, Windows 10, and a GeForce RTX 9080.

The LLM (the "Mind" in the context of the game) runs only on the CPU.
Of course, the time varies with the length of the response, but I would say it averages about 8 seconds. If your computer has better hardware than mine, the time you mention would be strange.

Please check that no other software (for example, an antimalware tool) is consuming CPU time. Also, make sure you have free space on your hard drive so that Windows can do memory paging.
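As a rough way to check the free-space part of that advice, here is a minimal Python sketch using only the standard library. The 10 GB threshold and the function names are assumptions made for illustration; they do not come from the game or from Windows.

```python
import shutil

# Hypothetical threshold: how much free space we assume is comfortable for paging.
MIN_FREE_GB = 10.0


def free_space_gb(path: str) -> float:
    """Return the free space, in GiB, on the drive that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def has_room_for_paging(path: str, min_free_gb: float = MIN_FREE_GB) -> bool:
    """Return True if the drive holding `path` has at least `min_free_gb` GiB free."""
    return free_space_gb(path) >= min_free_gb
```

For example, `has_room_for_paging("C:\\")` checks the system drive on a typical Windows install. Adjust the threshold to taste.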

You can also try this LLM: Llama-3.2-3B-Instruct-GGUF (https://www.youtube.com/watch?v=bM5TRBu5GGE)

https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/blob/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf

It is the most modern and compact version of Llama. I have already tried it and it is faster. It is a strong candidate to be the new default; the downside is that it is not uncensored.
Jorge  [Developer] Oct 4, 2024, 12:15 
Oh, nice: https://huggingface.co/QuantFactory/Llama-3.2-3B-Instruct-uncensored-GGUF

Hey everyone, if you can try it out and let me know your thoughts, it seems like it would be a pretty good option to make it the new default.
You really want to turn Marian into a spy, don't you? lol
This script could give her some weird spy reflexes. ^^
I'm on it.
I tried both 3.2-3B-instruc-uncensored.Q4_K_M and Q8_0.
I can't tell whether they are equally quick or not. I have no delay even with 3.1-F32 on my PC.
(That reminds me how badly written GPT-Neo was compared to them.)

As for quality, I didn't run a long test, but the 3.2 version does seem good. In French there are few errors (even fewer than 3.1, I would say) and only rarely some odd English.
All the Llama models seem to do that from time to time anyway.
So 3.2 already has a good understanding, but it has made mistakes. (The disclaimer warns us about that.)

Strangely, they talk more efficiently but change or misuse the story from the prompt more often than 3.1.
Only a few rare details are sometimes changed or misused. But my short test didn't tell me whether they will remember what we said over a long conversation. I will see. 3.1 is not perfect either and also has its limits.
Thanks for the help. Gonna try the other Llama and see if it helps :D
Jorge  [Developer] Oct 6, 2024, 19:33 
Originally posted by Dr. Davey 'One-eye':
Thanks for the help. Gonna try the other Llama and see if it helps :D
Excellent, please let me know the quality of the answers!
Jorge  [Developer] Oct 6, 2024, 19:40 
Originally posted by Lunrei:
I tried both 3.2-3B-instruc-uncensored.Q4_K_M and Q8_0.
[...] I will see. 3.1 is not perfect either and also has its limits.

Thanks a lot!

I've tried 3.2-3B-instruc-uncensored.Q4_K_M, and it seems to work fine in English. The responses are also slightly faster than with 3.1. The fact that it's only 2GB in size is very appealing too.

However, I've tried it mostly with Spanish, and it does make more grammatical mistakes than 3.1, which is the part I don't like. I'll keep trying to be sure.
I can't talk to her because the text chat is missing.
Jorge  [Developer] Jan 6, 19:28 
Hi, hmmm, when you send a message the text chat is automatically hidden. Then, when Marian finishes speaking, the text chat comes back. Do you mean the chat never comes back?
I have this same issue. Great game, but the response time is extremely slow; it takes around 5 minutes for the text box to reappear.
Jorge  [Developer] Apr 22, 12:23 
Originally posted by rsdw:
I have this same issue. Great game, but the response time is extremely slow; it takes around 5 minutes for the text box to reappear.

Hi, thanks for commenting, and I apologize for the inconvenience.

Assuming you have a similar or better CPU than mine (Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz and 16.0 GB RAM), 5 minutes is too long and suggests that the resources are being consumed by another application. The text response depends solely on the CPU (the graphics card is used only by Unreal Engine). If you open the system monitor, is there another process or application consuming the CPU?
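If you prefer a script to the system monitor, the sketch below ranks processes by the "CPU Time" column reported by the Windows `tasklist /V /FO CSV` command. It assumes an English-language Windows (the column names are localized), and the helper names are made up for this example. Note that this column is the CPU time accumulated since each process started, not its current load, so it is only a rough hint.

```python
import csv
import io
import subprocess


def cpu_time_seconds(text: str) -> int:
    """Convert a tasklist CPU Time value such as '1:02:03' into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def top_cpu_processes(tasklist_csv: str, limit: int = 5) -> list[tuple[str, int]]:
    """Return the `limit` processes with the most accumulated CPU time."""
    rows = csv.DictReader(io.StringIO(tasklist_csv))
    totals = [(row["Image Name"], cpu_time_seconds(row["CPU Time"])) for row in rows]
    return sorted(totals, key=lambda item: item[1], reverse=True)[:limit]


def read_tasklist() -> str:
    """Run `tasklist /V /FO CSV` (Windows only) and return its CSV output."""
    result = subprocess.run(
        ["tasklist", "/V", "/FO", "CSV"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On Windows, `top_cpu_processes(read_tasklist())` returns a short list of (process name, seconds) pairs. If an antimalware scanner or similar tool sits at the top next to the game, that is a candidate to look at.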