Insurgency: Sandstorm

schroeder_lvb Dec 12, 2018 @ 10:41am
Observation: RCON vs Dedicated Server CPU Usage
I'm seeing this on Beta, as well as on the Official Release 2018.12.12

Summary: Issuing multiple RCON commands periodically increases CPU usage on Dedicated Servers until the CPU pegs. Stopping the RCON commands does not bring CPU usage back to nominal, but it does stop the increase.

=======

Environment: Debian9/64 - using 1-Core VPS for ease of testing (I've seen similar problem w/ multi-core also - just using single core for ease of measurement).

Start a Checkpoint Dedi server, password-locked to ensure 0 players in game.

Baseline: 8% nominal CPU (Sandstorm server idle + Kernel)

Start issuing RCON command loop at approx 1.0 Hz (one message per second): "listplayers"

Observation of CPU usage vs elapsed time:

05 minutes: 20% nominal
15 minutes: 55% nominal
20 minutes: 70% nominal
(I discontinued the test at this point)

Stop issuing RCON command:

No change - CPU % usage is 'stuck' at the last peak level, only server restart will restore nominal CPU usage. HOWEVER, it also stops the CPU usage growth.
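
For a rough sense of scale, the readings above can be turned into a growth rate. This is a small worked calculation, not a measurement tool: it assumes the 8% baseline counts as the reading at minute 0, and the linear extrapolation to 100% is only a guess, since the test was stopped at 70%.

```python
# (elapsed minutes, CPU %) taken from the observations above.
# Assumption: the 8% idle baseline is treated as the t=0 sample.
samples = [(0, 8), (5, 20), (15, 55), (20, 70)]


def growth_rates(points):
    """Percentage points of CPU gained per minute between consecutive samples."""
    return [(c1 - c0) / (t1 - t0) for (t0, c0), (t1, c1) in zip(points, points[1:])]


def average_rate(points):
    """Overall percentage points per minute from the first to the last sample."""
    (t0, c0), (t1, c1) = points[0], points[-1]
    return (c1 - c0) / (t1 - t0)


def minutes_until(points, target=100):
    """Linear extrapolation from the last sample to the target CPU %."""
    t_last, c_last = points[-1]
    return t_last + (target - c_last) / average_rate(points)


rates = growth_rates(samples)      # per-interval slopes
avg = average_rate(samples)        # overall slope
per_command = avg / 60             # at 1.0 Hz there are 60 commands per minute
eta = minutes_until(samples)       # rough time at which the single core would peg
```

With these numbers the average works out to about 3.1 percentage points per minute, or roughly 0.05 points per command at 1.0 Hz.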


=======

I'm trying to work around the "shimmy/rubber-band" bug by implementing a graceful server restart, and I would like to issue a "say" command to announce the imminent "kick" of all players. I would also like to poll the server status periodically via RCON, but I will look into log file "tail" or query port alternatives.
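
A minimal sketch of the announce-then-restart idea, under stated assumptions: `send` is any callable that delivers one RCON command string to the server, and `restart` is whatever the operator uses to restart the process. Both are hypothetical hooks, as are the function name, the message wording, and the default warning times. Only the "say" command itself comes from this thread.

```python
import time


def announce_restart(send, restart, warnings=(300, 60, 10), sleep=time.sleep):
    """Broadcast countdown warnings, then trigger the restart.

    send     -- callable taking one RCON command string (hypothetical hook)
    restart  -- callable that restarts the server (hypothetical hook)
    warnings -- seconds before the restart at which to warn players
    sleep    -- injectable so the schedule can be tested without waiting
    """
    marks = sorted(set(warnings), reverse=True)
    for i, remaining in enumerate(marks):
        send(f"say Server restarting in {remaining} seconds")
        # Wait until the next warning mark, or until zero after the last one.
        next_mark = marks[i + 1] if i + 1 < len(marks) else 0
        sleep(remaining - next_mark)
    restart()
```

Because every "say" goes through one `send` callable, the whole countdown can share a single RCON connection rather than opening a new one per message.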










Last edited by schroeder_lvb; Dec 12, 2018 @ 12:26pm
tarAn-OD Feb 2, 2019 @ 2:20am 
Interesting. I have two Windows servers that have been running for 2 months and are full most evenings. I have 10 scheduled RCON say messages that run every minute and repeat throughout the 24 hours, plus a scheduled restart and Steam update, and I have not seen this CPU issue.
schroeder_lvb Feb 3, 2019 @ 10:12am 
Thanks TarAn - I'm not at all surprised if Windows binary runs differently from the Linux counterpart.

One additional piece of info: the last time I tested the RCON "say" command, I observed micro-vibrations ("shimmies") on my client after sending just a few messages. My Linux server had been "up" for only a few minutes. I stopped investigating at about this point and went on with other things in life.

I'll re-run the test on Linux after the next release, and I will post my findings here. I hope to close this topic soon, and start using RCON commands! For Plan-Z (when all else fails) I'll give WINE a shot.



=]RC[= Hunter Feb 3, 2019 @ 10:38pm 
Micro-vibrations/lags/stuttering are happening on windows servers as well, sometimes after a few rounds, sometimes after a few minutes. Regardless coop or pvp, regardless using rcon or not.

We have also played on a full server for many hours in a row with zero issues, while recently it always seems to "stutter" after a while. The stutters last less than a second, and in between there may be up to a minute of stutter-free gaming.

I doubt this is hardware related. One of our test grounds, for example, is a dedicated Xeon E3 server at 3.5GHz with 32GB RAM, on Windows Server 2012.

I suspect a code difference between the community server binaries and the matchmaking servers (matchmaking demands), perhaps some background or coding issue that affects community servers because they do not restart every round (backend connectivity, EAC connectivity etc), while matchmaking servers seem to be created on demand and per round. So practically they are restarted/newly created all the time.
tarAn-OD Feb 4, 2019 @ 12:58am 
I have turned off EAC, which has stopped the shimmering and lag problems people were having. If you have time, look at the server logs; EAC is not working well at the moment.

I still get map change crashes, server lock up under certain conditions and the dreaded 95 black screen, but none are related to rcon.
Destinate Feb 4, 2019 @ 4:53am 
Originally posted by =RC= Hunter:
Micro-vibrations/lags/stuttering are happening on windows servers as well, sometimes after a few rounds, sometimes after a few minutes. Regardless coop or pvp, regardless using rcon or not.

We have also played on a full server for many hours in a row with zero issues, while recently it always seems to "stutter" after a while. The stutters last less than a second, and in between there may be up to a minute of stutter-free gaming.

I doubt this is hardware related. One of our test grounds, for example, is a dedicated Xeon E3 server at 3.5GHz with 32GB RAM, on Windows Server 2012.

I suspect a code difference between the community server binaries and the matchmaking servers (matchmaking demands), perhaps some background or coding issue that affects community servers because they do not restart every round (backend connectivity, EAC connectivity etc), while matchmaking servers seem to be created on demand and per round. So practically they are restarted/newly created all the time.

The community server is different from the matchmaking server, but the matchmaking server also avoids stutter because it is killed automatically after every match. When you finish a match you go back to the menu; at that point the server is killed and restarted, and you are matched onto a fresh server if you keep playing. That is why matchmaking servers don't stutter and are really stable: they keep getting killed. On our community servers, players keep playing on the same server over and over without a restart, and that causes the stutter in the long run.
[57th] Ferret Feb 26, 2019 @ 8:48pm 
Just want to chime in and say that I've also noticed this behavior on Linux (VM) and Windows (VM and bare-metal). More info here[forums.focus-home.com].
schroeder_lvb Feb 26, 2019 @ 10:54pm 
I am hopeful they have this fixed for the upcoming release, if not, more operators will notice this issue as NWI adds more rcon gamemodeproperty items (aka cvars) for customization. Current mitigation is to reboot the community server every few hours.
schroeder_lvb Mar 1, 2019 @ 2:54pm 
I just re-ran a test on the patch release 2019.02.28 (aka rev 1.1). I regret to report I am still experiencing this problem.

For the QC department this is what I did to test it:

1) Debian9/64 Linux Community Dedicated Server (2-core Xeon, 4GB RAM)

2) Install htop: sudo apt-get install htop

3) Install mcrcon:

Download from: https://github.com/Tiiffi/mcrcon
Compile as: gcc -std=gnu11 -pedantic -Wall -Wextra -O2 -s -o mcrcon mcrcon.c

4) BASH Test script loop.sh as follows:

#!/bin/bash
count=0
for (( ; ; ))
do
    ./mcrcon -c -H 127.0.0.1 -P 27035 -p rconpassword "say hello there this is a test"
    sleep 1
    count=$((count+1))
    echo $count
done

5) Run it for about 10 minutes and observe in "htop" that CPU usage creeps up after 5-10 minutes.

Mitigation:

Reduce use of RCON to absolute minimum between server reboots. Occasional manual remote kick/ban of players is ok.

Last edited by schroeder_lvb; Mar 1, 2019 @ 5:47pm
[57th] Ferret Mar 1, 2019 @ 6:32pm 
I've also seen no improvement on the issue. Confirmed on Debian and Windows using a different RCON client (rcon-cli[github.com]). The problem can be reproduced much faster (in less than a minute in my case) without the sleep, e.g.:
while true; do ./rcon-cli listplayers; done
schroeder_lvb Mar 1, 2019 @ 9:00pm 
@Ferret -

I used MCRCON for test because this tool was mentioned in the Sandstorm Admin Guide -- hoping that the NWI QC team was familiar with it. I used my own RCON driver also, and got the same result.

I chose the 1.0Hz rate because I wasn't sure if UE4 has a built-in counter-spoofing algorithm. I didn't want to make it angry!

Very good to know you can offer independent confirmation with a different tool & execution timing, thank you!

schroeder_lvb Mar 9, 2019 @ 11:21am 
I re-ran the same RCON vs CPU load test (see my Mar-1 Post) on release 1.1.2.

I'm observing things got WORSE than the previous versions.

CPU % now goes up at a much faster rate, even on an idle (zero player) server.
[57th] Ferret Apr 8, 2019 @ 1:31pm 
Here[asciinema.org] is a recording demonstrating the issue and proving a thread leak. As can be seen, each RCON command increases the thread count (NLWP) by one, and the count never decreases.

Still reproducible on version 1.1.4.
schroeder_lvb Apr 8, 2019 @ 7:09pm 
Thanks Ferret - to save my time, I stopped testing on every patch, but will do so if I see something on the future release notes.

Meanwhile your post explains this:

https://steamcommunity.com/app/581320/discussions/1/1813170373219376670/

...since I wrote my own MOTD and connection/disconnection announcements via RCON.
schroeder_lvb May 18, 2019 @ 10:10am 
Ok I have a possible workaround!

I concur with Ferret’s finding of a growing thread count. I found it is independent of the number of RCON commands sent to the server; instead, it is due to repeated opening and closing of the RCON connection. You can watch the thread count like so:

$ ps -eLf | grep "[I]nsurgencyServer-Linux" | wc -l

NWI's Dedi Server software appears to spawn a thread for each new RCON connection and fails to terminate it when the client disconnects. (mcrcon talks to the server over TCP, and each invocation opens a fresh, password-authenticated connection.) Therefore, if you use tools like mcrcon to periodically broadcast server policies via the “say” RCON command, you will end up with 500-700 zombie threads after a few hours, consuming CPU cycles under the name of kernel services.
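
To log the leak over time instead of re-running ps by hand, the thread count can be read from /proc. This is a minimal, Linux-only sketch: the `Threads:` field of /proc/<pid>/status is standard, but the function names and the idea of matching the process by a command-line fragment are illustrative choices, not anything from NWI.

```python
import os


def parse_thread_count(status_text):
    """Extract the integer from the 'Threads:' line of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no Threads: field found")


def find_pids(name_fragment):
    """Return PIDs whose command line contains name_fragment."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\x00", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if name_fragment in cmdline:
            pids.append(int(entry))
    return pids


def thread_count(pid):
    """Current number of threads in the given process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_thread_count(f.read())
```

Calling `thread_count(pid)` for each PID returned by `find_pids("InsurgencyServer-Linux")` once a minute and printing the result should show the count climbing by one per new RCON connection if the leak is present.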

A client-side workaround is to write your own app (C/C++, Python, etc.) that opens/connects to the server once, and use this connection for all your RCON command/status interactions. I found thread count to stay fixed even after many hours of up-time, and that red bar on htop (kernel services) pretty much stays fixed. As with any network software, your client should detect a closed or unresponsive connection with the server, and automatically re-connect when prudent.
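
A minimal Python sketch of such a client, under stated assumptions: the server speaks the Source-style RCON wire format that mcrcon implements (over TCP), and the auth handling mirrors mcrcon by reading a single reply and treating an id of -1 as a rejected password. The class and function names are made up for illustration, and multi-packet responses are not handled.

```python
import socket
import struct

SERVERDATA_AUTH = 3
SERVERDATA_EXECCOMMAND = 2


class AuthError(Exception):
    """Raised when the server rejects the RCON password."""


def encode_packet(req_id, ptype, body):
    # Layout: int32 size, int32 id, int32 type, body, two NUL terminators.
    payload = struct.pack("<ii", req_id, ptype) + body.encode("utf-8") + b"\x00\x00"
    return struct.pack("<i", len(payload)) + payload


def decode_packet(data):
    size, req_id, ptype = struct.unpack("<iii", data[:12])
    body = data[12:4 + size - 2].decode("utf-8", errors="replace")
    return req_id, ptype, body


def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
    return buf


def read_packet(sock):
    header = recv_exact(sock, 4)
    (size,) = struct.unpack("<i", header)
    return decode_packet(header + recv_exact(sock, size))


class PersistentRcon:
    """Keeps one connection open and reuses it for every command."""

    def __init__(self, host, port, password, timeout=5.0):
        self.addr = (host, port)
        self.password = password
        self.timeout = timeout
        self.sock = None
        self.next_id = 1

    def connect(self):
        self.sock = socket.create_connection(self.addr, timeout=self.timeout)
        self.sock.sendall(encode_packet(1, SERVERDATA_AUTH, self.password))
        # Assumption: one auth reply, as mcrcon expects; id -1 means rejected.
        reply_id, _, _ = read_packet(self.sock)
        if reply_id == -1:
            self.close()
            raise AuthError("RCON password rejected")

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def command(self, text):
        # Reconnect once if the link turns out to be dead or unresponsive.
        for attempt in (1, 2):
            try:
                if self.sock is None:
                    self.connect()
                self.next_id += 1
                self.sock.sendall(
                    encode_packet(self.next_id, SERVERDATA_EXECCOMMAND, text))
                return read_packet(self.sock)[2]
            except OSError:
                self.close()
                if attempt == 2:
                    raise
```

Usage would look like `rcon = PersistentRcon("127.0.0.1", 27035, "rconpassword")` followed by repeated `rcon.command("listplayers")` calls, with the host, port and password taken from the mcrcon test script earlier in the thread. The point of the design is that only a failed send or receive triggers a new connection, so a healthy link should cost the server a single thread.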

For NWI, if you decide to fix this in the future:

* Detecting client disconnect and killing the thread is the ideal fix IMHO.
* I’m aware of at least one game whose server implements a timeout algorithm to close the client connection and kill inactive threads. If you do this, please tell the community the timeout value, as we may need to send “keep-alive” pings to keep the connection open.
* You could add a new rcon command “close”. I think mcrcon can handle multiple rcon commands, so some server operators might append it to their command list (I did not verify this mcrcon behaviour/feature!)
* If you don’t plan to fix this issue, perhaps update your documentation to make clear that client RCON app should not repeatedly connect/disconnect. If you are recommending use of mcrcon, you might make a note that using this in a bash loop or cron may degrade your server performance over time.
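
The timeout idea in the list above can be sketched in a few lines. This is purely illustrative bookkeeping and not NWI's server code: the class name, the connection ids, and any timeout value passed in are hypothetical. The server would call `touch` whenever a client sends something and periodically close whatever `expired` returns.

```python
import time


class IdleReaper:
    """Tracks last activity per connection and reports the ones gone idle."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock        # injectable so the logic can be tested
        self.last_seen = {}       # connection id -> time of last activity

    def touch(self, conn_id):
        """Record activity (a command or a keep-alive) on a connection."""
        self.last_seen[conn_id] = self.clock()

    def forget(self, conn_id):
        """Stop tracking a connection that was closed cleanly."""
        self.last_seen.pop(conn_id, None)

    def expired(self):
        """Connection ids idle for longer than the timeout."""
        now = self.clock()
        return [c for c, seen in self.last_seen.items()
                if now - seen > self.timeout_s]
```

From the client side, this is why the timeout value matters: a keep-alive has to arrive more often than `timeout_s`, otherwise a long-lived connection would be reaped along with the abandoned ones.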

Last tested on Dedi Server version 1.1.4, using Debian 9 64-bit Stretch
Last edited by schroeder_lvb; May 18, 2019 @ 10:11am
[PBS] Powerbits Jun 25, 2019 @ 12:04am 
Very interesting, I get this on Windows also.

Date Posted: Dec 12, 2018 @ 10:41am
Posts: 19