Routinator resource usage #333

Open
mxsasha opened this issue May 14, 2020 · 8 comments

@mxsasha

mxsasha commented May 14, 2020

Based on this tweet I was asked to open a GitHub issue about my routinator resource usage. This isn't an operational issue for me (the server can handle it), and I don't know whether there is an actual bug, but since I was asked to open an issue, here it is.

Here you can see when I started running routinator:

After running for about 38 hours, the routinator server is at 1h18m of CPU time, so that seems to match those graphs. It's running on a single core virtual machine.

I installed it following the quick start exactly, and I am running routinator server with those parameters. There are three BIRD instances using it for RPKI validation, IPv6 only.

Here are my full logs from the last start, nothing in there seems unusual.

@partim
Member

partim commented May 14, 2020

When you zoom in a bit, do you see short bursts of CPU usage? Typically, things should be a bit hectic for a minute or two and then calm for ten minutes. Given that each validation run includes validating signatures on around 100,000 objects, quite a bit of CPU usage is to be expected.

Similarly, these 100,000 objects (each is its own file) are read every 10 minutes, which explains the disk usage. I’d expect them to be mostly cached if the machine has enough memory, but if it is a VM on a busy host, ten minutes may be too long for them to stay in the cache.
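
As a rough sanity check, the figures reported at the top of this thread (1h18m of CPU time after about 38 hours) can be turned into an average CPU share and a per-run cost. This is only back-of-the-envelope arithmetic; it assumes the refresh interval was left at ten minutes and that nearly all CPU time goes into validation runs. The function names are made up for illustration.

```python
# Figures quoted earlier in this thread.
cpu_time_s = (1 * 60 + 18) * 60   # 1h18m of accumulated CPU time
uptime_s = 38 * 60 * 60           # about 38 hours of wall-clock time
refresh_s = 10 * 60               # assumption: ten-minute refresh interval


def average_cpu_share(cpu_s, wall_s):
    """Fraction of one core that was busy on average."""
    return cpu_s / wall_s


def cpu_seconds_per_run(cpu_s, wall_s, interval_s):
    """CPU seconds per validation run, assuming one run per interval."""
    runs = wall_s / interval_s
    return cpu_s / runs


share = average_cpu_share(cpu_time_s, uptime_s)
per_run = cpu_seconds_per_run(cpu_time_s, uptime_s, refresh_s)

print(f"average CPU share: {share:.1%}")                  # average CPU share: 3.4%
print(f"CPU seconds per validation run: {per_run:.1f}")   # CPU seconds per validation run: 20.5
```

Under these assumptions, roughly 20 CPU seconds every ten minutes on a single core would show up as a short burst followed by a long quiet period.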

@mxsasha
Author

mxsasha commented May 14, 2020

Here's the closest zoom I can get, last 24 hours:
Screenshot 2020-05-14 at 13 22 13

@alarig

alarig commented Jun 22, 2021

I have an operational issue with the RAM usage of the latest release (0.9.0): it jumped from some megs to more than 1G. It’s OOM-killed by the kernel every now and then. As a quick fix, I’m back on 0.8.3.
The upgrade has been done at the end of week 22.
graphs

@partim
Member

partim commented Jun 22, 2021

Thanks for the report and the graphs! While we expected higher RAM usage due to the new database in 0.9, it certainly is too much now and consumption also seems to be growing over time. We are investigating both right now and hopefully will have a fix soon.

@AlexanderBand
Member

Please note that RAM usage in 0.10.0 is now significantly lower than in 0.9.0:

$ sudo systemctl status routinator
● routinator.service - Routinator 3000
   Loaded: loaded (/lib/systemd/system/routinator.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2021-08-23 12:22:18 UTC; 1 weeks 3 days ago
     Docs: man:routinator(1)
 Main PID: 7389 (routinator)
    Tasks: 5 (limit: 2330)
   Memory: 1.4G
   CGroup: /system.slice/routinator.service
           └─7389 /usr/bin/routinator --config=/etc/routinator/routinator.conf --syslog server

$ cat /proc/7389/status | grep 'VmHWM\|VmRSS'
VmHWM:	  428368 kB
VmRSS:	  369752 kB
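
For anyone who wants to track these two values over time instead of checking them by hand: in /proc/<pid>/status, VmHWM is the peak resident set size and VmRSS is the current one, both in kB. Below is a minimal sketch of a parser for that file format. The function name `parse_vm_fields` is hypothetical, and the sample text is simply the output shown above.

```python
def parse_vm_fields(status_text, fields=("VmHWM", "VmRSS")):
    """Return {field: kilobytes} for the requested fields of a
    /proc/<pid>/status dump (hypothetical helper, not part of Routinator)."""
    result = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in fields:
            # The value looks like "  428368 kB"; keep the number only.
            result[name] = int(rest.split()[0])
    return result


# Sample input copied from the output quoted above.
sample = "VmHWM:\t  428368 kB\nVmRSS:\t  369752 kB\n"
usage_kb = parse_vm_fields(sample)
usage_mib = {name: round(kb / 1024) for name, kb in usage_kb.items()}

print(usage_mib)  # {'VmHWM': 418, 'VmRSS': 361}
```

On a Linux host, the same function could be fed the contents of the status file for the Routinator PID, read with a plain `open(...).read()`.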

@alarig

alarig commented Sep 3, 2021

Yeah, I have upgraded last week and I confirm. Thanks a lot for the work done. It even seems to be lower than 0.8.0 ;)

@partim partim added this to the 0.12.0 milestone Jul 20, 2022
@partim partim modified the milestones: 0.12.0, 0.13.0 Oct 17, 2022
@partim partim modified the milestones: 0.13.0, 0.14.0 May 19, 2023
@maxadamo

maxadamo commented Aug 8, 2023

I am running the container version 0.12.0 in Nomad, and the CPU is spiking to unthinkable levels: 1500% (one thousand five hundred percent):
image

or to 30 GHz (thirty GHz):

image

@partim
Member

partim commented Aug 9, 2023

It is a bit odd that this happens on every third validation run (assuming you’ve kept the refresh time at ten minutes), but otherwise not entirely unexpected if you have a lot of cores. Routinator uses a thread pool during validation that is by default configured to be the number of cores. Each thread processes one repository including updating the repository. If there is nothing to update and you have most of the files buffered (if you have enough memory), you can end up with all threads basically just validating signatures and using a lot of CPU. This would just be a short spike since most repositories are quite small and eventually there will be only two or three working threads left.

So, while seeing this every third run is certainly strange, I think this is normal. You can, however, limit the number of threads (and thus cores used) via the validation-threads configuration option.
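
To illustrate the pattern described here (one task per repository, a pool sized to the core count by default, and an optional cap), here is a small Python sketch. It is not Routinator's actual code, which is written in Rust; `validate_repository`, `run_validation` and `max_threads` are hypothetical names, and the per-repository work is a placeholder.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def validate_repository(name):
    """Placeholder for updating one repository and validating its objects."""
    return name, "ok"


def run_validation(repositories, max_threads=None):
    """Process each repository as one task on a bounded thread pool.

    With max_threads=None the pool is sized to the number of cores,
    mirroring the default described above; a smaller value caps how
    many repositories are processed at the same time.
    """
    workers = max_threads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(validate_repository, repositories))


repos = [f"repo-{i}" for i in range(8)]       # made-up repository names
results = run_validation(repos, max_threads=2)
print(len(results), "repositories processed")
```

The sketch only shows how the cap bounds concurrency. With real signature validation in each task, a pool of N threads can keep up to N cores busy at once, which would be consistent with a reading of about 1500% on a machine with many cores.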


5 participants