Comment on
Slrpnk.net is down
Reply in thread
poVoq here. After a string of bad luck it looks like I will not be able to get it back up before Tuesday evening. Super annoying.
Comment on
*Permanently Deleted*
Reply in thread
Slrpnk.net admin here.
The failure seems to have been in the main firewall. If it had been the server itself, we could have easily restored it on another server from the backups on another machine. But as it stands, remote access is entirely cut off.
There usually is another person with hardware access, but they are on summer holidays. This seemed like an acceptable risk at the time...
An off-site backup would have been nice of course, but due to the costs involved in running a Lemmy instance of that size on a rented server, it would not have been a great option either.
I have plans to add a KVM to the main firewall via a secondary connection, but even that might not have helped in this case. I'll know more when I have physical access again.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Yeah, but at least I have not disappeared yet 😅
Comment on
Slrpnk.net outage (resolved)
Reply in thread
The feddit.de admin disappeared on a work trip to Japan. At least that is the official story.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Sort of, yes. It is a homelab a few people have access to, but currently they are all either on holiday or on extended work-related travel. It seemed like an acceptable risk, as everything had worked smoothly for years.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Thanks, I am trying, but it really is quite annoying to lose access like that.
Comment on
*Permanently Deleted*
Reply in thread
Yeah, the XMPP server is down too. That is something that bugs me quite a bit, and I will probably move that one to a small external VPS to retain a more secure backup communication channel.
Comment on
Slrpnk.net is down
Reply in thread
Yeah, I need to work on the bus factor.
The last time it went down was a similar situation, and I procrastinated on all the stuff I had planned to prevent it next time because the server was stable 🤦
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Glad that at least someone sees a positive side of it, because I sure don't... well at least it gives me a bunch of ideas how to avoid such situations in the future 🤷
Comment on
Slrpnk.net outage (resolved)
Reply in thread
We have a small write-up about the hardware on our wiki, but it is also down right now.
I think we will share a post-mortem write-up of the actual improvements we will make to avoid this in the future.
One thing I will definitely do is add a KVM remote management console to one of our server boards and move the main firewall into a VM with hardware passthrough of the NICs (this was planned anyway for a 10 Gbit network upgrade in the second half of 2025). This way I should be able to reboot and even reinstall the main ingress point remotely, so that only the fiber gateway remains as a failure point that requires physical access.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Well, on the plus side, lots of lessons learned, and I think I might move at least the XMPP server to an external VPS to have a backup communication channel.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Yeah, I know. Complete Murphy's law situation.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Lemmy has a lot of individual parts that don't interact very well with each other, especially the image host part. Furthermore, the main UI is quite a mess and we were already thinking of switching to an alternative, but this would further increase the "too many moving parts" issue. Piefed, on the other hand, has an integrated and very lightweight UI, which also has some nice additional filtering and moderation features Lemmy currently lacks.
And I personally feel more at home with the Python codebase, as it allows better troubleshooting and more standardized (Flask) tooling. The Rust codebase of Lemmy has a lot of obscure custom stuff and the error messages are extremely abstruse from a sysadmin perspective.
And looking at the performance metrics of Lemmy, the main limiting factor seems to be the Postgres database anyway, so the theoretically slower Python codebase of Piefed should not have much impact.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
We have backup admins, but this seems to be an unexpected hardware issue, and physical access to the servers is more restricted.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Already thinking about how to make the best of it 🤷 Maybe we can use this opportunity to try and migrate to Piefed? I had this in the back of my mind for some time already and Rimu seems optimistic that it is possible.
Comment on
*Permanently Deleted*
Reply in thread
I think now that Piefed has an API for apps, we will see some of them adding support soon. Overall I think the benefits of a Piefed migration outweigh the disadvantages, but it remains to be seen if it is doable.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Not necessarily, no. We aim to preserve users, communities and posts/comments. Image uploads might get lost though.
Such an in-place migration will need extensive database operations and likely some support from the Piefed developer (to add support for bcrypt hashed passwords), but we are hopeful we can make it happen, and maybe this will result in a database migration script other Lemmy instances could also use.
If this turns out to be infeasible, we will stay with Lemmy rather than reset everything.
Comment on
*Permanently Deleted*
Reply in thread
Let's see. I think a relaunch on Piefed might entice some people to come back, and most slrpnk communities are rather niche and will probably stay. /c/climate might move though.
Comment on
Slrpnk.net is down
Reply in thread
A KVM is basically a second computer that acts as a virtual screen and keyboard. It can be connected to the server to remotely control it as if you were physically in front of it, including hardware resets, access to the BIOS, etc.
Comment on
Slrpnk.net outage (resolved)
Reply in thread
Yeah, I had plans to set up something like that, but there were always other priorities. In this specific case I could maybe access other internal servers, but I would need KVM access to reboot the firewall, or some other way to cut physical power. And exfiltrating hundreds of GBs of Lemmy database wouldn't work over such a small pipe either.
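To put rough numbers on the "small pipe" point, here is a back-of-the-envelope sketch. The 300 GB database size and the link speeds are purely illustrative assumptions, not measurements from our setup, and protocol overhead is ignored.

```python
def transfer_hours(size_gb: float, link_mbit_per_s: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes) over a link
    with the given sustained throughput in megabits per second."""
    size_megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    return size_megabits / link_mbit_per_s / 3600


# Hypothetical figures: 300 GB of data over three assumed link speeds.
for mbit in (5, 20, 100):
    print(f"{mbit:>4} Mbit/s: {transfer_hours(300, mbit):6.1f} h")
```

Under these assumptions, a secondary link in the single-digit Mbit/s range would need several days of uninterrupted transfer, which is why it is fine for a remote console but not for pulling out the database.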