Spyke

Comment on

How is lemmyworld so stable?

I'm not an admin, but have followed the sizing discussions around the lemmyverse as closely as I can from my position of lacking first-hand knowledge:

  • lemmy.ml is the biggest instance by user count, but runs on incredibly modest 8-cpu hardware. Their cloud provider doesn't provide any easy scale up options for them, so they can't trivially restart on a bigger VM with their db and disk in place. I suspect this means that instance is going to suffer for a bit as they figure out what to do next.
  • lemmy.world on the other hand was running on a box at least twice as big as lemmy.ml at last count, and I believe they can go quite a bit bigger if they need to.
  • The lemmy.world admins also run mastodon.world and lived through the twitterpocalypse, seeing peak user registration rates of 4k per hour. So this is not their first rodeo in terms of explosive growth, and I'm sure that experience gives them some tricks up their sleeve.
  • The admin team is pretty clearly technically strong. If I recall correctly, ruud is a professional database admin. One of the spooky parts of Lemmy performance-wise is the db. If ruud or others on the admin team custom-tuned their pg setup based on their own analysis of how/why it's slow, they may be getting more performance per CPU cycle than other instances running more stock configs or that are cargo-culting tweaks that aren't optimal for their setup without understanding what makes them work.

I'm surprised that sh.itjust.works isn't growing faster. They also have a hefty hardware setup and seemingly the technical admins to handle big user counts. I wonder if it's a branding problem, where lemmy.world sounds inviting and plausibly serious where sh.itjust.works sounds like clowntown even though it's run by a capable and serious team.

Comment on

How is lemmyworld so stable?

Reply in thread

Fwiw, he has been providing quite a lot of transparency in his posts to this community. He's shared his hardware config in detail, posted maintenance posts with brief descriptions of what he's doing, and replied to comments around specific config tweaks. I haven't catalogued a list of links, but I've seen him do all of these things in the last 48h. It's easy to imagine that all these things could be compiled in real time into a how-to, but it's a pretty big deal just to keep the lights on right now, and pretty difficult to understand whether tweaks that helped your setup are generally applicable or only situationally useful and happen to perform well for your specific setup.

I'm sure we will see more high-performance Lemmy guides in the future, but at this point no one has more than 36h of experience with high-performance Lemmy. Give them a minute to catch up.

Comment on

So how does lemmy make money?

Others have answered the crux of your questions, which is that it's basically donations... either from the admins by providing free access to their server, or by the community through Patreon or whatever.

But to put into context how much money we're talking about...

  • For a server hosting 1k active users and 5k-10k registered users, you're talking about a 4cpu-8cpu box costing less than $20/mo. Plenty of nerds with decent jobs in wealthy countries are willing to write that off as a donation. This covers 99% of the <1k Lemmy servers in the world.
  • The 10 biggest Lemmy servers still only have hosting costs of $50-$300/mo. That's not nothing, but there are probably 10 wealthy nerds in the world willing to write that much off each month. And those costs can be offset through community donations. These servers support 10k-40k registered users, so it doesn't take a ton of donations to cover that modest expense serving that many people.

Now, if you count admin/mod time and expertise, of course... those costs would be huge. But those people either volunteer or get a bit of money from non-profits. But the hardware costs are modest.
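To put the numbers above on a per-user basis, here is a rough back-of-the-envelope calculation. It only uses the dollar and user ranges quoted above; pairing the highest cost with the lowest user count (and vice versa) is my own assumption to get worst-case and best-case bounds, not data from any real instance.

```python
def cost_per_user(monthly_cost_usd: float, registered_users: int) -> float:
    """Monthly hosting cost per registered user, in USD."""
    if registered_users <= 0:
        raise ValueError("registered_users must be positive")
    return monthly_cost_usd / registered_users


# Small instance: at most $20/mo, at least 5k registered users.
small_worst = cost_per_user(20, 5_000)
# Big instance, pessimistic pairing: highest cost with the fewest users.
big_worst = cost_per_user(300, 10_000)
# Big instance, optimistic pairing: lowest cost with the most users.
big_best = cost_per_user(50, 40_000)

for label, value in [
    ("small instance, worst case", small_worst),
    ("big instance, worst case", big_worst),
    ("big instance, best case", big_best),
]:
    print(f"{label}: ${value:.4f} per registered user per month")
```

Even the pessimistic pairing comes out to about three cents per registered user per month, which is why a handful of donors can cover the hardware.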

Comment on

How do we deal with similar communities on different Lemmy instances?

Say what you will about reddit, at least an established subreddit was the place to gather on the topic, ie r/technology etc.

The premise your question is based on isn't actually true, though. There's /r/technology and also /r/tech. There's /r/DnD and also /r/dndnext. As of recently, for some reason there are like 35 nearly identical amitheasshole subreddits with different names.

I feel like what you're observing is just that reddit communities are mature, people have had time to gravitate to whichever community is more active or has better quality moderation and so there is generally a "winner" sub with more participation because... unless there's a major problem with the bigger sub it tends to be more interesting than a less well-trafficked sub.

Lemmy, in contrast, is still fairly wild-west. Most communities are not very active and have only a few subscribers. If a competing community with an overlapping topic appears, folks are willing to subscribe to it just in case it takes off. If Lemmy continues to retain a healthy number of users, I expect in most cases that consolidation would set in unless there were major differences in moderation policy or something else that splits the community into factions that align across server or community boundaries... and over time you'll see a similar layout of one or two dominant communities and a long tail of tiny ones that few pay attention to.

Comment on

Advantages to selfhosting a Lemmy instance?

Reply in thread

Folks should not use lemmony to bootstrap their subscription count. It's not that hard to hit lemmyverse.net and just manually sub a bunch of stuff you're actually interested in, or to visit a big instance and browse their all feed unauthenticated.

But if you really want to automate community bootstrapping, lemmony is the worst of the scripts that do it because it defaults to subscribing to EVERYTHING, including all the porn, piracy, and hate communities on the most absent-admin'ed under-modded instances in the lemmyverse. Then your instance will mirror all those questionably legal communities and re-serve them to the public unauthenticated internet, creating hosting liability for you. Not to mention being a bad fediverse citizen and creating massive amounts of federation load on the instances forwarding you posts and comments from 20k communities that you don't read.

These two subscription bootstrapping scripts limit you to top subs by default... So you're more likely to be in well-modded territory, and the number of subs is small enough that you can review them and back out of anything sketchy. Subscriber-bot's docs do a good job of explaining the risks and problems of mass-subscription so you know what you're getting into.

Comment on

I understand close to none of the comics in here.

Reply in thread

I think a couple things are in play:

  • Very few people consumed these comics as we are... reading each one in sequence. You'd more likely sporadically encounter them in the funnies section of a physical newspaper. Which was a pretty hit/miss proposition to begin with. No one expected every one to be a winner, and people would routinely skip over stuff that didn't interest them without thinking about it too hard. You're operating under the assumption that Far Side is a classic, but at the time people would just cruise by and think "that comic is stupid, just like 60% of the other stupid comics on this page". And folks were pretty happy to have 40% of comics be a bit funny.
  • What made Far Side a classic was not its consistency. Rather, there were a few strips that became cultural phenomena. Basically a handful of hits that were breakout memes of the 80s and 90s. Colleges used to sell t-shirts of the school for the gifted strip with the kid pushing on the door that says pull, which is pretty accessible and one of those breakout hits.
  • Because of those breakout hit strips, some folks got into Larson's style of humor enough that fewer of his strips were inscrutable to them and he had a lasting market.
  • Other comments point out topical references and those are also a big deal. If someone sees a beans meme with no context 30y from now, it ain't gonna be funny. But a few weeks ago on lemmy, it was part of a contextual zeitgeist that was more or less about "these idiots will upvote anything, I'm one of the idiots... I'll upvote this!" and it kind of captured the exuberant excitement of not knowing what lemmy was but wanting it to be something. Similarly, these strips often weren't intended to last multiple generations. They assumed you were reading the newspaper RIGHT NOW... and so could reference current events very obliquely and still be accessible.

TLDR: Like a stupid meme, many Larson comics require shared transient context we're missing now. Some are also just fukin weird, like cow tools. But some were very accessible and became hugely popular. These mega-star strips cemented Far Side's popularity, which gave Larson the autonomy to stay weird when he chose. Now we waste time trying to figure out what they meant.

Comment on

I just learned how to collapse threads in Jerboa.

That behavior is about to change. There's already a jerboa update available that changes the thread-hiding trigger from long-tap on the comment header to regular-tap on the comment body (away from links and other tappable bits of the comment).

This update is available on f-droid now, but only came out yesterday I think. Might not be on the play store yet, but get ready for this to change on you.

Comment on

How is lemmyworld so stable?

Reply in thread

It's important to recall that last week the biggest lemmy server in the world ran on a 4-core VM. Anybody that says you can scale from this to reddit overnight with "horizontal scaling" is selling some snake oil. Scaling is hard work and there aren't really any shortcuts. Lemmy is doing pretty well on the curve of how systems tend to handle major waves of adoption.

But that's not your question, you asked if Lemmy can horizontally scale. The answer is yes, but in a limited/finite way. The production docker-compose file that many lemmy installs are based on has 5 components. From the inside out, they are:

  • Postgres: The database, stores most of the data for the other components. Exposes a protocol to accept and return SQL queries and responses.
  • Lemmy: The application server, exposes websockets and http protocols for lemmy clients... also talks to the db.
  • Lemmy-ui: Talks to Lemmy over websockets (for now, they're working to deprecate that soon) and does some fancy dynamic webpage construction.
  • Nginx: Acts as a web proxy. Does https encryption, compression over the wire, could potentially do some static asset caching of images but I didn't see that configured in my skim of the config.
  • Pict-rs: Some kind of image-hosting server.

So... first off... there's 5 layers there that talk to each other over the docker network. So you can definitely use 5 computers to run a lemmy instance. That's a non-zero amount of horizontal scaling. Of those layers, I'm told that lemmy and lemmy-ui are stateless and you can run an arbitrary number of them today. There are ways of scaling nginx using round-robin DNS and other load-balancing mechanisms. So 3 out of the 5 layers scale horizontally.

Pict-rs does not. It can be backed by object storage like S3, and there are lots of object storage systems that scale horizontally. But pict-rs itself seems to still need to be a single instance. But still, that's just one part of lemmy and you can throw it on a giant multicore box backed by scalable object storage. Should take you pretty far.

Which leaves postgres. Right now I believe everyone is running a single postgres instance and scaling it bigger, which is common. But postgres has ways to scale across boxes as well. It supports "read-replicas", where the "main" postgres copies data to the replicas and they serve reads so the main instance can focus on handling just the writes. Lemmy doesn't support this kind of advanced request routing today, but Postgres is ready when Lemmy is. In the far future, there's also sharding writes across multiple leaders, which is complex and has its downsides but can scale writes quite a lot.
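For anyone wondering what "request routing" for read-replicas means in practice, here is a toy sketch of the idea. To be clear, this is not how Lemmy works today and not a Postgres API; the `ReadWriteRouter` class and the host names are hypothetical, and classifying a query by its first keyword is a deliberate oversimplification (real routers also have to deal with transactions, `SELECT ... FOR UPDATE`, and replication lag).

```python
from itertools import cycle

# Statements that modify data and therefore must go to the main instance.
WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")


class ReadWriteRouter:
    """Send writes to the main database and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the main instance.
        self._readers = cycle(list(replicas) or [primary])

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in WRITE_VERBS:
            return self.primary
        # Reads are handed out round-robin across the replicas.
        return next(self._readers)


# Hypothetical host names, for illustration only.
router = ReadWriteRouter("pg-main", ["pg-replica-1", "pg-replica-2"])
routes = [
    router.route("SELECT * FROM post"),
    router.route("INSERT INTO comment VALUES (1)"),
    router.route("select * from comment"),
    router.route("SELECT 1"),
]
print(routes)
```

The same round-robin trick is what a load balancer would do in front of several stateless lemmy or lemmy-ui containers; the database is the only place where the read/write distinction matters.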

All of which is to say... lemmy isn't built on purely distributed primitives that can each scale horizontally to arbitrary numbers of machines. But there is quite a lot of opportunity to scale out in the current architecture. Why don't people do it more? Because buying a bigger box is 10x-100x easier until it stops being possible, and we haven't hit that point yet.

Comment on

2023 British Grand Prix - Qualifying Results

Sadly the formatting in this post gave me terminal cancer. As my final act, I've fixed the formatting. OP, please, only you can save the others. Fix the post formatting.

  1. Max VERSTAPPEN - Red Bull Racing 1:26.720
  2. Lando NORRIS - McLaren +0.241
  3. Oscar PIASTRI - McLaren +0.372
  4. Charles LECLERC - Ferrari +0.416
  5. Carlos SAINZ - Ferrari +0.428
  6. George RUSSELL - Mercedes +0.435
  7. Lewis HAMILTON - Mercedes +0.491
  8. Alexander ALBON - Williams +0.810
  9. Fernando ALONSO - Aston Martin +0.939
  10. Pierre GASLY - Alpine +0.969
  11. Nico HULKENBERG - Haas F1 Team 1:28.896
  12. Lance STROLL - Aston Martin 1:28.935
  13. Esteban OCON - Alpine 1:28.956
  14. Logan SARGEANT - Williams 1:29.031
  15. Valtteri BOTTAS - Alfa Romeo no time DQ'ed for failing to provide a sufficient fuel sample. See https://www.formula1.com/en/latest/article.bottas-disqualified-from-silverstone-qualifying-with-finn-set-to-start.5smsKl0raawLdfivHEeQgq.html
  16. Sergio PEREZ - Red Bull Racing 1:29.968
  17. Yuki TSUNODA - AlphaTauri 1:30.025
  18. Guanyu ZHOU - Alfa Romeo 1:30.123
  19. Nyck DE VRIES - AlphaTauri 1:30.513
  20. Kevin MAGNUSSEN - Haas F1 Team 1:32.378

Comment on

How are Lemmy.world posts and comments licensed?

The terms-of-use on every lemmy server I've seen would be considered underdeveloped by any lawyer I've ever met. Pragmatically:

  • In the absence of a TOU that requires licensing content to participate, content posted directly to a lemmy server would probably get whatever the default treatment is either in the jurisdiction where the post was made or where the server is hosted (or maybe even that depends on the jurisdiction of each in complex ways). In the US that would mean all content is all-rights-reserved by default.
  • But the poster/commenter isn't going to try to enforce their rights against lemmy. If they didn't want the content there, they wouldn't have put it there. And if they changed their mind they can delete it. And if they refuse to delete it themselves but contact an admin/mod... probably the admin/mod will just delete it for them.
  • If the jurisdiction where the instance is hosted has a safe-harbor framework of some kind (like the US does), that would provide some protection from copyright claims on user-generated content provided the admins followed the requirements to be eligible (which I think most admins do even if they don't know it).
  • Images and media hosted elsewhere but hotlinked from Lemmy may have their own TOU's (like imgur or whatever).

Overall, I'd say most of the lemmyverse has underbaked policy frameworks. The de-facto results function ok pragmatically anyway for what lemmy does on its own. Any scraping/reuse of content from Lemmy would have to navigate a very complex, confusing, and ambiguous licensing landscape. Probably 10y from now, if the Lemmyverse continues to grow, TOU's will be more common and more clear about open-licensing content or leaving it all-rights-reserved but giving lemmy a perpetual irrevocable non-exclusive right to distribute whatever you post here (the latter of which is more or less what's implicitly happening today).

Comment on

*Permanently Deleted*

I see very little discussion about the implications of this for moderation, and it feels to me like they get very sticky. With traditional human-curated multi-reddits, you as a subscriber must engage with the idea that you are choosing to aggregate multiple communities into a single feed, which is intuitive enough, the subscribed feed already works that way.

But by making it automatic, the software hides the fact that it's pulling together discrete feeds from communities with different rules and different moderators. This feels very awkward to me. I'm all in favor of traditional multi-reddits, which can be used to create this sort of feed for yourself. I'm still on the train of "duplicate communities will sort themselves out if community discovery is made much easier and popular communities reliably show up at the top community searches, mostly irrespective of what instance the search was performed on" (obviously defederation takes precedence here).

Comment on

lemmony: A better "All" browsing experience for small Lemmy instances

This is a terrible idea, and borderline irresponsible. One of the key reasons that Lemmy doesn't subscribe by default is to avoid forcing servers with many communities to waste time/CPU delivering messages to servers where no one will read those messages. By subscribing to everything, you're telling all those overloaded servers to waste time sending content to your server that you'll never even see.

  • It also will massively inflate your db by multiple GB/day.
  • It will maximize the chances of you downloading and hosting copyright infringing content and content that may be illegal in your jurisdiction but not in the jurisdiction where it's hosted (loli, etc).

It is much MUCH better to just hit lemmyverse.net and subscribe to 10-100 communities you care about. If the script accepted a list of community-urls and automated subscribing to those, that would be super nice. Subscribing to the entire lemmyverse is terrible for your server, for your hosting liability, and for the lemmyverse's performance.
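As a rough idea of what the front half of such an allowlist-based script could look like, here is a minimal sketch that only turns a hand-picked list of community URLs into `name@instance` handles. The function names are made up for illustration, and the actual "follow" step is deliberately left out since it would need an authenticated call to your own instance.

```python
from urllib.parse import urlparse


def community_handle(url: str) -> str:
    """Turn 'https://lemmy.ml/c/asklemmy' into 'asklemmy@lemmy.ml'."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"not a community URL: {url!r}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 2 or parts[0] != "c":
        raise ValueError(f"not a community URL: {url!r}")
    name = parts[1]
    # URLs copied from a remote instance already carry the home instance,
    # e.g. https://sh.itjust.works/c/asklemmy@lemmy.ml
    if "@" in name:
        return name
    return f"{name}@{parsed.hostname}"


def load_allowlist(lines):
    """Parse a list of URLs, skipping blanks, '#' comments and duplicates."""
    handles = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        handle = community_handle(line)
        if handle not in handles:
            handles.append(handle)
    return handles


wanted = load_allowlist([
    "# communities I actually read",
    "https://lemmy.ml/c/asklemmy/",
    "https://sh.itjust.works/c/asklemmy@lemmy.ml",
    "https://lemmy.world/c/selfhosted",
])
print(wanted)
```

The point of the design is that nothing gets subscribed unless a human put it on the list, which keeps the federation load and the hosting liability bounded.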

Comment on

Victory 🙌

This, but desktop linux users are on the step for 193rd place while excitedly screaming and holding a third-place sign. Steamdeck users are on the 3rd-place step while calmly playing their deck.

Comment on

Some Lemmy Technical Questions

What’s the network flow like? I’m posting this to the lemmy.ml /asklemmy community, but I’m composing it on the sh.itjust.works interface. I’m assuming sh.itjust.works hands this over to lemmy.ml. How does my browsing work? Is all of my traffic routed through sh.itjust.works?

  • You register your account on sh.itjust.works, that's where all the info you care about resides. Your list of subscribed communities resides there. When you read a post, it gets fetched out of the db on sh.itjust.works (irrespective of where the home instance for that post's community is... when you read it it comes out of the database on your home instance), and when you comment on a post, that gets written to the db on your home instance. Your home instance is a standalone fully functioning thing.
  • When you subscribe to a remote community like this one, you tell your home instance "keep up to date with posts and comments for this community and let me know about them." Your home instance asynchronously gets all those updates while you're asleep or whatever so it can show them to you out of its local database when you come back. If more users on sh.itjust.works subscribe to the same community... there's no incremental overhead. All y'all's instance is ALREADY subscribed to that sub. So other users on your instance can sub to it for free, it's already in the instance's database.
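If it helps, the flow described above can be modeled as a toy simulation. This is a heavily simplified sketch of the idea, not Lemmy's actual code or the ActivityPub protocol; the `Instance` class, its methods, and the user names are all hypothetical.

```python
class Instance:
    """Toy model of a Lemmy instance with its own local database."""

    def __init__(self, domain):
        self.domain = domain
        self.online = True
        self.local_db = {}    # community name -> posts stored on this instance
        self.followers = {}   # community homed here -> subscribed instances
        self.user_subs = {}   # local user -> communities they follow

    def subscribe(self, user, community, home):
        """Subscribe a local user; returns True only for the first local sub."""
        self.user_subs.setdefault(user, set()).add(community)
        followers = home.followers.setdefault(community, set())
        first_for_instance = self not in followers
        # The instance-level subscription happens once, not once per user.
        followers.add(self)
        self.local_db.setdefault(community, [])
        return first_for_instance

    def publish(self, community, post):
        """Run on the community's home instance: store, then push out copies."""
        if not self.online:
            return 0  # an overloaded or broken home instance delivers nothing
        self.local_db.setdefault(community, []).append(post)
        delivered = 0
        for instance in self.followers.get(community, ()):
            if instance is not self:
                instance.local_db.setdefault(community, []).append(post)
                delivered += 1
        return delivered

    def read(self, community):
        """Reads always come out of this instance's own database."""
        return list(self.local_db.get(community, []))


home = Instance("lemmy.ml")
mine = Instance("sh.itjust.works")
first = mine.subscribe("alice", "asklemmy", home)   # instance starts following
second = mine.subscribe("bob", "asklemmy", home)    # free: already following
sent = home.publish("asklemmy", "post 1")
home.online = False                                 # home instance falls over
lost = home.publish("asklemmy", "post 2")
print(first, second, sent, lost, mine.read("asklemmy"))
```

In this model the second subscriber costs the home instance nothing extra, and when the home instance goes down the local copy of older posts stays readable while new posts stop arriving.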

Assuming there’s a mass influx of redditors, what does it look like as things fail?

  • If lemmy.ml (where this community is homed) falls over from being overloaded or just is broken for whatever reason, your instance is unaffected. You can still read posts and make comments. This community however... is affected. New posts and comments for this community might come through intermittently or not at all for you (and everyone in the lemmyverse) because the community's home server isn't working well enough to reliably deliver them over federated replication. You can still read older posts and comments that have already been synced to your home instance, but new ones might not arrive. You might also see weird stuff like being able to see new comments from other sh.itjust.works users on this community, since those get written to your db before getting federated back to the community's home server. But mostly updates from other instances stop or get unreliable.
  • If sh.itjust.works falls over for some reason... well... that sucks for you. You can't log in or browse anything on it. You can still visit this sub at https://lemmy.ml/c/asklemmy/ as long as lemmy.ml is working and you'll be able to see the posts and comments that other accounts make. But you'll be an anonymous read-only browser, you won't be able to post or comment until sh.itjust.works comes back online (or you make a new account elsewhere and lose all your comment history and subscription list).

Are there easy mechanisms to allow me to grab my post history?

There's a github issue for this, but it's not done yet: https://github.com/LemmyNet/lemmy/issues/506.

I’m assuming most (all?) Lemmy servers are hosted in home labs?

I don't think that's a good assumption. lemmy.ml is hosted on OVH, a cloud provider. My home instance on lemmy.world is hosted by admins that run something like a 32 CPU mastodon instance. Most instances with over 100 users are running on some kind of probably modest but "real" cloud instance. The admins are volunteers, but often smart technical folks paying for small but real compute infrastructure.

The idea of Lemmy excites me, but the growth pain that could be coming scares me. Anybody using a CDN in front of their servers? That could be good, but with unconstrained growth, that could be costly, which is very bad.

Anticipating growing pains isn't wrong, it's probably gonna happen. But the devs are gonna find and work on the biggest performance problems so that people can viably run bigger instances, and instance admins are gonna run bigger hardware and ask for donations or run patreons to cover the cost. In my opinion, the bigger worry is that Lemmy will fizzle... not that it will spectacularly explode. As long as people join and contribute and are interested, we'll find a way to improve scalability and performance. The death knell would be if people get bored and leave, but compute capacity won't be the problem in that scenario.