Some system load graphs of last 24h

You work at Grafana?

jamesorlakin

It's been very snappy today, nice work! Is it all under Docker Compose with the node handling Nginx and Postgres as well?

Ruud reply

Yes.

MrPoopyButthole reply

Why did you guys roll back the UI to .7 from .10? I enjoyed some of the UI improvements, but I guess there were some bugs?

Edit: I see it's back to .10. Maybe I had a browser tab open from before that I never refreshed.

UndulyUnruly reply

I’m really grateful for your and your colleagues’ work. Thank you for letting us lemmy around here!!!

mruczek

Dang that's a lot of RAM

Ruud reply

mastodon.world has the same server but with twice the RAM :-)

acupofcoffee reply

What chassis? I’ve got 256GB in an R720 but only 32 cores here!

Ruud reply

It's an AX161 server at Hetzner

acupofcoffee reply

€142 is more reasonable than I expected! I’ll toss some cash to help!

tool reply

r.rosettast0ned.com

You should see some of our VM hosts at work...

Tygr

I can’t believe how fast you’ve managed to crowdsource and fix things on this instance. I haven’t seen many problems at all sharing comments and things.

victron

UndulyUnruly reply

From the lemmy.world front page:

Donations

If you would like to make a donation to support the cost of running this platform, please do so at the mastodon.world donation URLs:

    https://opencollective.com/mastodonworld
    https://patreon.com/mastodonworld

Sir_Simon_Spamalot reply

Where in the frontpage can we see this?

Edit: thank you all!

Billiam reply

It's on the right-hand sidebar of lemmy.world:

Sir_Simon_Spamalot reply

Awesome! I'm on mobile, so I cannot see it. Will check it out when I get to my computer.

BrerChicken reply

You can view sidebar on mobile. I think it's in the three dots, but it's somewhere!

EDIT: On Jerboa it's under Community Info, under the three dots. On the mobile web app for L.W. there's a sidebar button.

AFK BRB Chocolate reply

Just go to lemmy.world and click sidebar.

Gubb

This is awesome! As a systems engineer for my day job, I love seeing stuff like this!

BigFig

Kaliax

Some of my usage is in this data and I like that.

pretty gauges. the instance seems to be more stable/responsive today

UnfortunateShort

How much is that in beans?

PMmesexypajamas reply

At least 1

peril33 reply

Possibly 2

asterfield reply

Let’s not go crazy

I Cast Fist reply

programming.dev

About tree fiddy

FrostyCaveman

Damn that’s a huge chunk of (what looks like) a 64 core CPU there. Impressive!

It’s cool it can aggressively cache that much. Although I am perplexed why one would have a swap file configured in this case? What does it give you here? Sorry not trying to be an elitist or anything just have no idea what advantage you get!

Ruud reply

To be honest I tend to use swap less and less. But this was in the build that Hetzner does and I didn't remove it.

DoomBot5 reply

If your application goes wild with RAM usage, a properly configured swap will make sure the underlying OS remains responsive enough to deal with it.

steventhedev reply

The OOM killer is usually triggered after it starts hitting the disk. Which means your system is unresponsive for a long time until it finally kills something.

Using something like oomd can help trigger before it hits swap but then why are you using swap in the first place?

The bigger issue is that the kernel sometimes ignores the swappiness and will evict code/data pages long before file cache even when set to 0 or 1. I'm still not sure if that was because of an Ubuntu patch or if it was an issue that's been resolved in the years since I last saw this

remkit

How far do you see lemmy.world capable of scaling to? One thing I've been noticing is the centralisation of Lemmy users on a few top servers, surely that cannot be healthy for federation? What are your thoughts on this?

astral_avocado reply

remkit reply

Not entirely sure of what you are asking, but the only reason they need a clustered setup is simply because of their scale. Making the details of their setup public does not help with the issue I addressed, since in an ideal scenario, communities and users would be evenly distributed amongst the many Lemmy instances in the fediverse, making the need to do any sort of clustering for performance reasons unnecessary.

astral_avocado reply

Ruud reply

We do run on 1 server, but we’ve now seen that Lemmy scales horizontally so the k8s path forward is open 😊 With all these latest improvements we can have a bit more users on the current box.

astral_avocado reply

Oh, I could have sworn I read somewhere you went multi. Maybe I'm confusing it with another instance.

remkit reply

Not trying to be pedantic, but why do they have to do so? Why can't people figure it out themselves? Also, why can't Lemmy instances run on single non-redundant boxes? Most instance operators don't have the budget of enterprises, so why would they have to run their Lemmy's like enterprises?

astral_avocado reply

remkit reply

Er, because we should all be working together to try to help Lemmy grow and be stable…?

I agree with this point, but I disagree with the context in which you said "They should post their clustered setup so others can replicate more easily", right in reply to my original comment asking how Ruud felt about the centralisation of users in a federated application. This should've been an entirely separate reply, or perhaps an issue on GitHub for the Lemmy authors.

You can run on a single box, but a single problem will bring down your single box. This is a basic problem commonly discussed in DevOps circles.

Again, I agree, but the context in which you mentioned it basically suggests that everyone who runs a single-instance Lemmy is doing it wrong, which I disagree with.

Lowering the entry requirements is part of how we can get wide-spread adoption of federated software. Not telling people that they have to have at least 2 instances with redundancies or they are doing it entirely wrong.

The bare minimum I would ask anyone running their own instance, is to have backups. They don't need fancy load-balancers, or slaved Postgres database setups, or even multi-node redis caches for their instances of sub-thousand users.

For example, one reasonably priced server on most providers is like $20-40/month. Say a load balancer as a service is another $10-20, and a database server or database as a service is also like $20-$40. A distributed, redundant setup would be like 2 webservers, a database, and a load balancer so like, $70?

Seriously? That may be an acceptable price tag for an extremely public Lemmy host, like lemmy.world or lemmy.ml, but in no way should it be a reasonable price tag for the vast majority of Lemmy instances set up out there. Especially when most of them have sub-thousand users. $70/mo? That has to be a joke. You can easily host a Lemmy on a $5-$10 droplet for ~100 users.

I’ve deployed clustered applications myself, I just haven’t looked into doing it with Lemmy and was curious if they had a run book or documentation.

No offense, but you definitely seem like the kind of person to shill for cloud-scaling and disregard cost-savings.
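For reference, the quoted estimate above (two webservers, a load balancer, a database) can be tallied as a range rather than a single figure. This is only a sketch of the arithmetic using the per-item prices from the quote; the function and variable names are made up for illustration, and real provider prices will differ.

```python
def monthly_range(components):
    """Return (low, high) monthly cost for a list of (count, low, high) items."""
    low = sum(count * lo for count, lo, hi in components)
    high = sum(count * hi for count, lo, hi in components)
    return low, high


# Per-item prices in USD/month, taken from the quoted comment.
QUOTED_SETUP = [
    (2, 20, 40),  # two webservers at $20-40 each
    (1, 10, 20),  # load balancer as a service at $10-20
    (1, 20, 40),  # database server or service at $20-40
]

low, high = monthly_range(QUOTED_SETUP)
print(f"${low}-${high} per month")
```

So the "$70" in the quote is the bottom of the range; with the upper prices the same setup comes to $140 per month.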

astral_avocado reply

Djangofett

How much is this costing you? Also who is your host? Is it on a virtual machine?

SuperIce reply

They have a dedicated server: https://lemmy.world/post/75556

samus12345 reply

It's actually pretty funny to see him mention the growth (almost 12k users!) considering they've added, what, 50k or so users recently?

Muddybulldog reply

mylemmy.win

I signed up three days before that post. They were the largest instance with open signups. Almost 1000 users.

Djangofett reply

Whoa, cool. Thanks. Only a matter of time until it gets overloaded though. Can't Lemmy run in a container service like Cloud Run or AWS App Runner?

tool reply

r.rosettast0ned.com

Yeah, you could do it in AWS with ECS or Fargate.

Djangofett reply

https://github.com/jetbridge/lemmy-cdk

Indeed you can, very cool.

kratoz29 reply

Dedicated means local?

degrix reply

lemmy.hqueue.dev

Dedicated usually means it’s not splitting cpu time with another instance. It could mean a local machine but it does not have to be one.

kratoz29 reply

Tbh I'd find it hard to believe it's local, so maybe it is cloud computing but a standalone instance as you just said.

Perhyte reply

No, it means it's got the physical machine all to itself. It's a rented server located in a Hetzner data center.

Jimbo reply

yiffit.net

My homies love dedicated servers

earthquake

I know that the RAM cache is just taking advantage of otherwise free RAM and will be dropped in favor of anything else, but it does stress me out a bit to see it "full" like that.

Ruud reply

It would stress me even more to see a lot of RAM doing nothing, that would be a shame! ;-)

CashewNut 🏴󠁢󠁥󠁧󠁿 reply

Difference between Windows and Linux. Windows would only use what it needs. Linux pre-empts more and fills the RAM with what could be needed.

It used to stress the shit out of me when I switched to Linux as I'd gotten used to opening task manager and seeing 90% free RAM. On Linux I'd be seeing 10% free and panicking thinking it was a resource hog.

The Linux-way is the best way.

I use Arch btw ;)

Gecko reply

Both OSes do pre-caching and for both the standard tools to check usage nowadays ignore pre-cached elements when counting RAM usage.

CashewNut 🏴󠁢󠁥󠁧󠁿 reply

I had a feeling that 'factoid' may be out of date! Since I learnt it about the time of Windows XP when we were shown examples of how Linux and Windows memory management differed. It all made sense why Linux seemed to have full RAM even after a big upgrade but WinXP gave the 'illusion' of having lots of free RAM to use. ~ 20yrs ago!

I think we used SuSE Linux 7.3!

I still hold a savage hatred of all RPM-based distros after dealing with the hell of early 2000's editions (Redhat, Mandrake & Suse). Though I did like SuSE KDE's colours when it worked!

AlexisFR reply

jlai.lu

But Windows also does pre caching?

Perhyte reply

It probably just didn't mark that memory as "used" in the task manager.

CashewNut 🏴󠁢󠁥󠁧󠁿 reply

I discovered this about 20yrs ago and there's been a lot of drugs & drink since then.

I do remember I could open my shit-hot 256Mb RAM desktop with Windows XP taskmanager and it shows a whopping 128Mb free RAM. 😎

Then I'd boot into my '733T H4X0r' Suse Linux 7.3 and top would show 5Mb free RAM. 😱

This caused much upset until I found out the two OS's have (had?) fundamentally different memory utilisation philosophies.

May not be the case anymore but it was late 90s/early 00s.

wounn reply

lemmy.pt

That's how it's supposed to work, free RAM does nothing :)

FrostyCaveman reply

It’s free real estate!

If you had this much buffer memory what are the reasons to have swap space as well?

With my servers I’m paranoid having swap enabled will inadvertently slow stuff down. Perhaps there’s a reason to have it that I’m unaware of?

digilec reply

If you had this much buffer memory what are the reasons to have swap space as well?

Many programs do stuff once during startup that they never do again, sometimes creating redundant data objects that will never get accessed in the configuration it's being run in. Eventually the kernel memory manager figures out that some pages are never used, but it can't just delete them. If swap is enabled it can swap them to disk instead. That frees up the RAM for something more important. It's usually minor but every few MB helps.

Illecors reply

lemmy.cafe

I personally like having some swap as during low memory situations (which lemmy gets at least once a day on my small instance) everything slows down rather than getting culled by the oom killer. It's not a replacement for monitoring, but it does extend the timeframe to react to things.

patsharpesmullet reply

feedly.j-cloud.uk

Memcache usually takes all the assigned memory regardless of usage so seeing high usage isn't always unusual. That's assuming the lemmy servers are using some kind of session caching solution.

netwren

I hate that radial graphs are so popular with Grafana dashboards. Radial/pie charts are terrible representations for humans to interpret. I tend to try and convert them either to a stat with the line/time display or a bar chart. Humans are better at judging linear relationships than radial ones.

Ruud reply

Who says I'm human?

netwren reply

Or are you dancer?

jelloeater85 reply

Killers ❤️

TrainsAreCool reply

lemmy.one

Radial graphs are a bit of a meme where I work as one of the C-suite managers despises them for precisely that reason.

Dusk

Now that’s hot

xavier666 reply

As a server admin, I really hope it's not hot

Dusk reply

2hot

Robonps

Looks Awesome! Glad to see the patches seem to be working.

nadedefeat

lemmy.thenullcore.com

Awesome. Gotta love Grafana!

itadakimasu

This is so cool to see. Thanks for posting! Lemmy.world has been super smooth today

gardylou

Love me some grafana.

sunnyxiongster

Every time I open a post and go back to the previous page, it scrolls back to the top. Is this fixable? I'm on Windows 11, Chrome.

henfredemars

infosec.pub

I was hoping to see some uptime, but thanks for the window into your server! Are you still having to kill the instance every half hour?

Undearius reply

It says uptime is 3.3 weeks in the top right.

henfredemars reply

infosec.pub

Hmmm... maybe the instance uptime is different from the server uptime.

DrManhattan

lemmy.design

Great stats. Thanks for posting!

slazer2au

Always fun to see system dashboards.

carroarmato0

Quite a beefy setup 😄

WhiteOakBayou

Thanks for all the hard work. It has been running so well all day!

The D Quuuuuill

slrpnk.net

I notice your defederation list is completely depopulated today. Is that intentional?

Ruud reply

No it's just moved to the bottom of the page apparently. I preferred it on the side. Maybe a tab would be better.

Lodion 🇦🇺 reply

aussie.zone

On 0.18.1-rc.10 the defederated instances are at the very bottom, not on the right hand side.

The D Quuuuuill reply

slrpnk.net

OOoooooh! Thanks for the info.

Also infinitely less clear and helpful...

Lodion 🇦🇺 reply

aussie.zone

It may be a bug, I'm not sure.

Bappity

THE DROP???!! >:O

Molecular0079 reply

Seriously! Talk about amazing optimization and debugging of the network service.

SomeOtherUsername

Is the memory leak still there?

Ruud reply

No! Restarts are disabled and it's OK now!

SomeOtherUsername reply

Great! 😁 I was just wondering because the memory graph showed sharp falls in memory usage every ~30 mins.

ccunix reply

That is probably the Garbage Collector running.

SomeOtherUsername reply

Rust has no garbage collector though. Memory is freed up as soon as the variable leaves the current scope.

I’m guessing the server was still set up to restart every 30 mins at the time this pic was taken. Then they tried disabling that and it was fine.

ccunix reply

It was a mildly educated guess, I know very little about Rust.

CashewNut 🏴󠁢󠁥󠁧󠁿

I fucking love a sexy Grafana dashboard.

saga

So I’m currently planning to host an instance myself. This graph helped me quite a lot to get an idea of what system resources are required.

Do you use any reverse proxy in front of it?

Ruud reply

Nginx runs on the server, proxying to the lemmy docker containers

saga reply

That’s what I had in mind. To run nginx on a separate VPS, so I can scale it more easily. Run fediverse instances in the back, either all on one VPS or on different ones. This way I could provide a hub while increasing performance (due to compression and caching) and provide redundancy/load balancing if necessary.

What’s the typical traffic you experience? Peak (Gbit/s) and average/daily traffic (GB)

Ruud reply

saga reply

Thanks, that’s super helpful!

xavier666 reply

Lemmy world has a lot of users. So your instance initially will require a lot less resources ✌️

saga reply

Yeah I saw that. I’m a big fan of minimalistic, yet super performant architectures and I’m just trying to get a feeling for how I could solve this problem. I try to avoid any downtime whenever possible.

Snow-Foxx

Ahh look at all those nice charts and diagrams, that's true server porn lol.

Again thank you very much for your awesome job. We all really appreciate that <3

TheObserver

Can someone give me a hand? I see tons of posts of people talking about a picture in the OP but I see nothing. Am I doing something wrong? Is my connection bad? This seems to be happening quite a lot. For example, the meme instance has almost zero pictures, but I know just about every post should have one.

Ruud reply

hmm yeah it was gone.. need to investigate..

bankercat3

The entire team is doing an amazing job. Lemmy is getting smoother with each passing day. I hope it keeps growing (and none of you get too burnt out in the process)!

zikk_transport2

I think you can export the dashboard the way it looks to you - into Grafana cloud. Like a snapshot. Click "Share" then "upload" and share the link.

We won't be able to see historical data as it takes only dashboard snapshot with visible data.

Would be cool, wouldn't it?

jelloeater85 reply

Used some provisioning templates to get started 😁

throwsbooks

🤤

Marxine

This is indeed interesting, thanks again for the service!

SomeoneElse

Such pretty gobbledygook!

namelivia

Nice! That's a nice-looking dashboard, would you mind sharing its JSON config? Thanks!

Mulch5516 reply

it's the popular one on grafana.com - https://grafana.com/grafana/dashboards/1860-node-exporter-full/

namelivia reply

Thanks!

jelloeater85 reply

I could share the template, if ya like.

namelivia reply

Thanks for the offer, but no worries, some user posted it and I found it already

hawkwind

lemmy.management

Sexy loads.

kmartburrito reply

I imagine sexy sax man playing in the background while watching these graphs.

fiat_lux

kbin.social

You just can't beat the dopamine hit from "pointy chaos graph go smooth". Delicious. Great work!

FermatsLastAccount

Does Lemmy have a memory leak?

Ruud reply

Lemmory meak?

Yes at least until yesterday's version...

Panja reply

Lemmory meak?

heh

aussiematt reply

kbin.social

From those graphs, memory usage is very low. Most of it is being used for disk caching, which is what linux does with memory it has no other use for (may as well use it for something).
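To make the point about cache concrete, here is a minimal sketch of how "used" memory can be separated from page cache when reading `/proc/meminfo`-style output. It follows one common convention (total minus free minus buffers and cache); current versions of tools such as `free` may compute the figure differently. The function names and the sample numbers are made up for illustration and are not taken from the graphs in the post.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_breakdown(info):
    """Split total memory into application use, reclaimable cache and free."""
    cache = info["Buffers"] + info["Cached"] + info.get("SReclaimable", 0)
    used = info["MemTotal"] - info["MemFree"] - cache
    return {"used_kib": used, "cache_kib": cache, "free_kib": info["MemFree"]}


# Hypothetical sample resembling a large box with most RAM used as cache.
SAMPLE = """\
MemTotal:       131072000 kB
MemFree:          2048000 kB
Buffers:          1024000 kB
Cached:         100000000 kB
SReclaimable:     3000000 kB
"""

breakdown = memory_breakdown(parse_meminfo(SAMPLE))
print(breakdown)
```

On a real Linux host the same functions could be fed the contents of `/proc/meminfo`; a small "free" figure alongside a large "cache" figure is the normal picture described in the comment above.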

Ruud reply

Yes, but we still restart the containers every 30 min. I'm gonna see if that's still needed after the recent changes.

Perhyte reply

Ah, so that's the reason for the regular dips in the memory graph I assume? They do indeed seem to be spaced every 30 minutes.

FermatsLastAccount reply

The consistent, sharp dips every 15 minutes made me assume that the container was being restarted.

toki

Kushan reply

These graphs were generated from https://github.com/prometheus/node_exporter (I believe, not my graphs). They're showing system-level data, not lemmy specific data.

jelloeater85 reply

Correct!

NobleFenrir

I have a love and hate relationship with Grafana but it probably feels the same

copylefty

lemmy.fosshost.com

Who you hosting with?

SuperIce reply

They have a dedicated server: https://lemmy.world/post/75556

copylefty reply

lemmy.fosshost.com

I figured haha. I was wondering which company they used

Whois shows Hetzner which answers my question :D

djgenesis

Is that kibana or graphana?

AaronIsFab reply

feddit.uk

That's the grafana icon and it says grafana top right

apotheotic reply

kbin.social

grafana, judging by the logo

jelloeater85 reply

Graphana

Slashzero

hakbox.social

Is there anything Grafana can't do?

I have so many things pumping data “into” Grafana these days I’m surprised they haven’t tried to force me to pay for an enterprise license.

Anyway, thanks for sharing these, @[email protected]. As a performance engineer, I love to see this level of detail and commitment on your part to keep the user experience for lemmy.world at acceptable levels.

remkit reply

It can't make me pancakes.

ccunix reply

Wrong tool for the job, but if you want to order pizza, you can use Terraform:

https://registry.terraform.io/providers/MNThomson/dominos/latest/docs

I suppose you could then feed your Terraform runs into Grafana and use it to track your pizza consumption.

JasonDJ reply

vlemmy.net

In the early days of the pandemic…and the early days of my Ansible learning…I set up a playbook to scrape several websites for hand sanitizer and Clorox wipes.

If it found one in stock, it would email my cell phone carrier's SMS gateway. Tasker would then make a loud audible alert.

Ran for weeks before it found some in stock. And then it did. At 2am. And again at 2:05, and 2:10, and 2:15…

And it was an error on the shop's webpage. It wasn’t actually orderable…once it got in your cart, it wouldn’t let you check out.

Dave reply

Bwahaha:

4) Even if you do want a pizza, you should probably be careful with this provider. In testing, I once nearly ordered every item on the Domino's menu, which would probably have been expensive and embarrassing.

Reminds me of the old adage:

A computer lets you make more mistakes faster than any invention in human history -- with the possible exceptions of hand guns and tequila.

chrundle

I don't know what any of that means, but graag gedaan!

sonovebitch reply

DataIsBeautiful vibe

Steve Anonymous

I am not seeing them. Are they gone?

JATth

That's ~19 cores pegged at 100%, eating 128GiB of RAM (OS disk cache included) and bleeding onto swap. 🤯

Molecular0079 reply

I think you're misreading it. The olive green in the CPU chart is idle. RAM cache taking up most of system memory is also normal on most Linux systems, even on desktop. That cache is freed for applications to use as needed.

JATth reply

Welp, my only calculation was "64 CPU threads * 30% load -> ~19 cores busy"; I may be guilty of rounding up too much... The RAM usage is interesting, however, since the kernel seems to be caching all it can, to the point of ejecting unneeded data into swap in order to retain the disk cache. If more RAM is reserved by running processes, disk access times (likely for pict-rs and the database services) will begin to degrade.
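The back-of-the-envelope calculation in that comment can be written out as below. This is just a sketch: the 64 threads and 30% figures are the commenter's own reading of the graph, and the function name is hypothetical. An average utilisation like this describes load spread across all threads, not a specific set of cores running flat out.

```python
def busy_thread_equivalents(threads, utilisation):
    """Average number of hardware threads' worth of work at a given utilisation."""
    return threads * utilisation


THREADS = 64        # as read off the dashboard by the commenter
UTILISATION = 0.30  # roughly 30% average CPU load, also the commenter's estimate

busy = busy_thread_equivalents(THREADS, UTILISATION)
idle = THREADS - busy
print(f"{busy:.1f} thread-equivalents busy, {idle:.1f} idle")
```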

Molecular0079 reply