Spyke
selfhosted · Selfhosted by Averrin

Are all these thousands of lemmy servers useless?

Correct me if I'm wrong. I read the ActivityPub standards and dug a little into the Lemmy sources to understand how federation works, and I'm a bit disappointed. Every server just has a cache and the ability to fetch something from another known server. So if you start your own instance, there is no profit for the whole network until you have a significant piece of auditory (e.g. private instances or servers with no users). Are there any "balancers" to utilize these empty instances? Should we promote (or create in the first place) a way to passively help Lemmy cope with such fast growth?

lemmy.world

You are right. On the one hand, it's kind of bad, naive distributed architecture (my day job), it could have been done much better. On the other hand, the more important point is that it demonstrates an alternative to centralized. We'll learn a lot about usage patterns here, get new ideas, and either improve Lemmy or build something better from the ground up. Big thanks to Reddit for driving users this way to test scalability and get much better knowledge of usage.

83
lemmy.nine-hells.net

It's not a distributed architecture as you'd normally think of it - it's a decentralised federation. That's an important distinction from your typical distributed architecture app.

34
NX2reply
feddit.de

Can you explain what's the difference?

5

A distributed architecture generally refers to a single application or service designed to be resilient to individual data center failures. For example, Reddit, a centralized application controlled by Reddit itself, operates data centers around the world to process user transactions. In the event of an outage in a specific location, such as California, Reddit would still be able to function because its infrastructure for handling user requests and serving data would automatically switch to other functioning data centers elsewhere, like Nevada, Arizona, or Washington. This is an example of a distributed architecture.

On the other hand, a decentralized federation does not consist of a single application. Instead, it involves a software platform like Lemmy, which is hosted on multiple individual hosts. When a user signs up with one host, they can interact with users from other hosts, but each host manages its own infrastructure. For instance, someone could host a Lemmy instance on an old laptop they found in their closet and name it ballsuckers.com, while another person could host a Lemmy instance in the cloud with a properly designed distributed architecture and name it bingbong.com. Each host is responsible for managing its own instance. Users from both instances can interact with each other, but if, for example, the hard drive of ballsuckers.com were to fail, the entire ballsuckers.com instance would go down. However, this would not affect bingbong.com because its infrastructure is separate and managed independently.
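To make the contrast concrete, here is a toy sketch in Python. It is not how Reddit or Lemmy are actually deployed; the function names and the `laptop.example` / `cloud.example` hosts are made up for illustration. In the distributed case any healthy data center can answer for the one service, while in the federated case only the user's own host can answer for that user.

```python
def serve_distributed(datacenters: dict) -> str:
    """One service, many data centers: any healthy one can answer."""
    for name, healthy in datacenters.items():
        if healthy:
            return f"served by {name}"
    raise RuntimeError("the whole service is down")


def serve_federated(instances: dict, home: str) -> str:
    """Many independent services: only the user's home instance can answer."""
    if instances[home]:
        return f"served by {home}"
    raise RuntimeError(f"{home} is down; other instances are unaffected")


# California is out, but the single service fails over to Nevada.
print(serve_distributed({"california": False, "nevada": True}))

# The laptop-hosted instance is down; users of the cloud-hosted one don't notice.
hosts = {"laptop.example": False, "cloud.example": True}
print(serve_federated(hosts, "cloud.example"))
```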

I hope this helps!

8
ultraHQreply
beehaw.org

it could have been done much better.

Care to expand on this point?

20
Terrasquereply
infosec.pub

Disclaimer: I've only looked a bit at the protocols and high-level descriptions of how it works, and this is just my understanding of it. But it seems to track.

Let's take [email protected] for example. Right now lemmy.world is the Source of Truth for it, which means that if you sign up for it on a different host, let's say myawersomeinstance.com, that host first contacts lemmy.world, copies over posts, and then subscribes to new posts. I'm actually not 100% sure whether lemmy.world contacts myawersomeinstance.com when there's a new post, or whether myawersomeinstance.com polls lemmy.world. But anyway, the point is that lemmy.world is the authority on it. myawersomeinstance.com also has the [email protected] data, but only as a copy; lemmy.world is the only authority. So if you post something, your server sends it to lemmy.world and waits for a reply. Then lemmy.world contacts all instances that have at least one user following the community to tell them about the new post. And that new post now exists in a few hundred databases.

The problem is the scaling is whack. Okay, you can have 5000 federated servers with users subscribing to [email protected], but that means lemmy.world needs to update 5000 servers per post, there'll be 5000x storage used for that post, and ALL 5000 servers contact lemmy.world to get the new good stuff.
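The fan-out described here can be sketched as a toy model. This is only an illustration of the description above, assuming the community's host pushes each new post to every subscribed instance (the push-vs-poll question is left open above); `CommunityHost` and the `instance-N.example` names are invented for the example and are not Lemmy's API.

```python
from collections import defaultdict


class CommunityHost:
    """Toy model of the instance that owns a community (the source of truth)."""

    def __init__(self):
        self.subscribers = set()  # instances with at least one follower
        self.deliveries = 0       # outbound messages sent so far

    def subscribe(self, instance):
        self.subscribers.add(instance)

    def publish(self, post, inboxes):
        # One delivery per subscribed instance, and one stored copy on each.
        for instance in self.subscribers:
            inboxes[instance].append(post)
            self.deliveries += 1


host = CommunityHost()
inboxes = defaultdict(list)
for i in range(5000):
    host.subscribe(f"instance-{i}.example")

host.publish("new post", inboxes)
print(host.deliveries, "deliveries,", len(inboxes), "stored copies")
```

Both numbers grow linearly with the number of subscribed instances, and all of the delivery work lands on the one host that owns the community.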

Frankly, it's a scaling nightmare. As for a different approach, you could have private / public keys and sign updates from lemmy.world and allow the other instances to fetch the new data from each other. That would also allow more relaxed caching, since it would be generally lower cost to re-fetch the data. Now you need aggressive caching because you don't want lemmy.world to keel over and die from every server on the planet wanting to hear the latest and greatest posts all the time.
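A minimal sketch of the signed-update idea, under loud assumptions: this uses textbook RSA with a toy key built from two Mersenne primes purely so the example runs with the standard library. It is far too weak for real use, it is not how Lemmy or ActivityPub sign anything, and the `sign` / `verify` helpers are hypothetical. The point is only that a receiver can check an update relayed by a peer without contacting the origin.

```python
import hashlib

# Toy RSA key from two Mersenne primes. Illustration only, NOT secure.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, held only by the origin


def digest(payload: bytes) -> int:
    """Hash the payload and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def sign(payload: bytes) -> int:
    """Run by the origin instance, which is the only holder of D."""
    return pow(digest(payload), D, N)


def verify(payload: bytes, signature: int) -> bool:
    """Run by any instance; needs only the public values N and E."""
    return pow(signature, E, N) == digest(payload)


post = b"new post in the community"
sig = sign(post)

# A peer relays (post, sig); the receiver checks it without asking the origin.
print(verify(post, sig))
print(verify(b"tampered post", sig))
```

With something like this, the origin would only need to hand each update to a few peers, and the rest could fetch it from each other while still being able to tell whether it was altered along the way.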

5

Thanks for the in depth write up! I haven't looked too far into the docs or the subscription model, but is this a fault on Lemmy's end, or is this a function of how activity pub handles federated communication? (I'm very new to activity pub/federation, just now reading through the activity pub docs)

I do like your idea of distributed replication via keys, much better than what I had brainstormed

Edit: yeah it does look like it's a function of activity pub, wonder if there's a more scalable federation protocol out there

3
Fizzreply
lemmy.nz

Could lemmy.world put a load balancer in front and use that to direct requests to different instances of lemmy.world? Not sure if that question is dumb I'm not a technical guy.

2

It's not dumb at all, and it's a common scaling technique. But the software needs to support it, and I have no idea if lemmy has support for running multiple instances for one server.
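For reference, the simplest form of that technique is round-robin routing, sketched below. The backend names are hypothetical, and whether Lemmy can actually run several backend processes behind one domain is exactly the open question above.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next backend in a fixed rotation."""

    def __init__(self, backends):
        self._backends = cycle(backends)

    def route(self, request):
        # The request content is ignored here; real balancers may inspect it.
        return next(self._backends)


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
targets = [lb.route(f"GET /post/{i}") for i in range(6)]
print(targets)
```

This only spreads the web/application work; if every backend still talks to the same database, the database can remain the bottleneck.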

3

Seeing Lemmy groan under the influx of new users, still a much smaller number than established centralized apps handle, made me start wondering how it would scale to numbers a couple of orders of magnitude larger. I’ve only started diving into code and architecture, but I’m worried that as the number of instances grows they’ve got an N! connection problem going. This is not a simple problem to fix for a federated system, but it’s got to be addressed eventually.
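For a sense of scale, a quick check of the worst case, assuming every instance ends up exchanging activities with every other one (a full mesh, which is an assumption and not a measured property of the network). The number of pairwise links is then N(N-1)/2, which grows quadratically; that is steep, but far gentler than N!.

```python
import math


def full_mesh_links(n: int) -> int:
    """Undirected links if every instance talks to every other instance."""
    return math.comb(n, 2)  # equals n * (n - 1) // 2


for n in (10, 100, 1000):
    print(n, "instances ->", full_mesh_links(n), "links")

# For comparison, the factorial already dwarfs this at n = 10.
print(math.factorial(10))
```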

1

What makes a distributed system good that Lemmy hasn't done? Seems like a pretty robust system to me, seems like scaling issues are on the instance host themself. With Reddit's experience, I don't see how there are issues

17

If there was an easy solution that balanced decent UX and performance, we'd have it by now!

-2

I’m a Software Architect who has worked on some very large data sets and distributed systems. I’ve never used the title “Data Architect” but I meet the definition. So, yes.

1
feddit.de

A network of (“thousands of”) servers has — like most things — pros and cons.

Some of the pros are:

  • The network is more resilient against outages. If lemmy.ml is down, all other users can still access the network.
  • It's hard to take legal action against the network or to buy it out (like Big Players™ like to do to get rid of potential competitors).
  • It allows various similar or even conflicting moderation policies. The network, i.e. the infrastructure doesn't allow or prohibit any specific opinion (the communities do).
  • It allows for different ways to pay the bills: goodwill of the admin, donations, ads, fees or selfhosting. The latter also allows great control over the data so you control your privacy.

Some of the cons are:

  • Content is replicated across servers, which increases the total amount of data stored.
  • Latency and speed suffer.
  • Interoperability with the wider Fediverse is less than 100%, which can create confusion and frustration.
  • Discovery is more difficult.
61

Yeah, and this post is about how to use some (a lot of) servers that are doing nothing to participate in the "pros" while the top 20 servers are suffering from these cons.

3
lemmy.world

I just commented on this in another thread: https://lemmy.world/comment/76011

TL;DR: The server-to-client interactions on Lemmy are a lot heavier than the server-to-server interactions, so even if you're just using your own server to interact with communities on other servers, it should still take load off of the servers you would have been using directly.

58

That's news to me. I thought server-to-server interactions would be heavier since other instances will keep fetching content from your instance once they start federating. I guess it's better to join less populated instances instead of crowding onto a single instance.

6
Averrinreply
lemmy.world

Huh, so the problem is about just serving static assets? TBH, I think this problem should be mostly solved, especially for such a minimalistic UI. Maybe some (free) CDN? Also the UI can use any lemmy server for most requests (e.g. fetched federated_servers from a bootstrap node) and use the "logged one" only for user actions. I think it isn't a terribly difficult task for the current UI (it has its own backend).

4
KelsonVreply
lemmy.world

It's not the static assets, it's the database queries

16

To expand, every user has their custom set of content that they want to see, which needs to be queried from the database. Mainly their subscriptions, those will rarely be the same between any two users, and they need to be aggregated according to the sorting method the user wants. Or other personal things, like every user's messages/replies/notifications/settings.
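A rough sketch of why that work is per-user: the feed is a filter over the user's own subscription set followed by a sort in the user's chosen order, so the result can rarely be shared between two users. The `Post` fields and `build_feed` are invented for illustration; a real instance would do this as a database query, not in Python lists.

```python
from dataclasses import dataclass


@dataclass
class Post:
    community: str
    title: str
    score: int
    published: int  # hypothetical timestamp, larger means newer


def build_feed(posts, subscriptions, sort="new", limit=20):
    """Keep only posts from subscribed communities, then sort as requested."""
    mine = [p for p in posts if p.community in subscriptions]
    if sort == "new":
        key = lambda p: p.published
    else:  # treat anything else as "top"
        key = lambda p: p.score
    return sorted(mine, key=key, reverse=True)[:limit]


posts = [
    Post("selfhosted", "x", score=5, published=100),
    Post("memes", "y", score=9, published=50),
    Post("news", "z", score=1, published=200),
]
print([p.title for p in build_feed(posts, {"selfhosted", "memes"}, sort="new")])
print([p.title for p in build_feed(posts, {"selfhosted", "memes"}, sort="top")])
```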

10
lemmy.world

I have my own Lemmy instance running on my home server, but I'm here. "But Bizzle," you may be asking yourself, "why go through all the trouble of configuring your own instance just to wind up on Lemmy.World anyway?"

I'm glad you asked! And the answer is that federation only fetches parent comments. I'm glad Lemmy exists, and I'm going to keep using it, but we need federated sibling comments for this to actually be good, in my opinion.

EDIT: I actually couldn't have been more wrong.

44

I would be happy if my locally posted comment showed up on lemmy.world at all :-) (If this one does, maybe I was just impatient with the initial sync or so)

EDIT: Nevermind, I was too impatient :-)

20

I can attest sibling comments are meant to be federated (example thread from another instance: https://lemmy.world/comment/97687), but I was here at the very start of this instance and there were federation issues. Posts not showing, comments not showing on other instances. It's all very new so there may be technical issues.

14
enzyeshareply
lemmy.sdf.org

I'm not sure I understand what you're saying. Did you mean that child comments are not federated?

13

Seeing as we're viewing this from many different instances, it certainly sounds like a configuration issue on their side

13
Bizzlereply
lemmy.world

I think you're right, I may have misinterpreted the documentation. I attached a screenshot below.

4
LookTherereply
fedia.io

So that's why there were no comments when I searched a post from a remote instance. You need to search the comments individually. It's kinda weird design imo.

3

once you subscribe to a community all new comments and posts get federated, but yeah if you just search for the post then you can't see the comments unless you search for them

3

Every server just has a cache … there is no profit for the whole network …

I wouldn't say that caching is no profit. Yesterday there were several times when lemmy.ml was struggling or effectively down for some people, but despite complaints over there I could read lemmy.ml communities just fine through my instance. Caching meant that I was isolated from the service interruption, and the lemmy.ml server was isolated from my contribution to its load.

41
gtsreply
lemmy.world

this is an interesting concept - is Lemmy.blahaj.zone your personal instance, closed to registration from the public? I can totally see the draw in doing this and self hosting my own, then I don’t need to worry about performance of someone else’s instances

4

I wish, but lemmy.blahaj.zone is not mine, just one of the larger medium-sized instances that have been around for a while. But some people do do as you describe.

3
lemmy.world

Well, lemmy.ml still needs to serve you the content the first time in order to cache it. And since you're the only person in your instance, you're the only person benefiting from that cache. So you're still exerting at least the same load as if you were browsing lemmy.ml directly.

4

So you’re still exerting at least the same load as if you were browsing lemmy.ml directly.

Not quite accurate... although probably reasonably close.

The activitypub transaction is just a small amount of text. The formatting and display of the page and tracking of user sessions and other transactional data that you would need to handle for the user itself...

Ultimately server->server transactions are much simpler and easier than server->user transactions.

Edit: one user instances are not helping much... But the moment you get 2 or more eyeballs on the same content on a remote instance... it starts to matter. Start a local instance with 10-100 users? You're making a large dent in traffic on the origin (in relation to the content origin) server's usage.
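A back-of-the-envelope model of that edit, with made-up numbers. It assumes the origin serves one request per page view when people browse it directly, and delivers each new post once to a remote instance regardless of how many local users read it. It ignores that the origin sends every change in a subscribed community whether or not anyone reads it, and that a page view costs more than a delivery.

```python
def origin_load(readers: int, posts_read_each: int, posts_published: int):
    """Return (requests if browsing the origin directly, deliveries if federated)."""
    direct = readers * posts_read_each  # origin renders every page view itself
    federated = posts_published         # origin pushes each post once per instance
    return direct, federated


print(origin_load(readers=1, posts_read_each=50, posts_published=50))
print(origin_load(readers=100, posts_read_each=50, posts_published=50))
```

With a single reader the two counts are the same, which matches "not helping much"; with a hundred readers on one instance the origin does a hundredth of the request count under these assumptions.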

19

But only once, even if you open the content several times. And without transferring all the Web UI with it. And on the sending servers' own terms related to when to send or if at all. On the other hand the server has to send any changes in a subscribed topic, regardless of you being interested in it.

Overall I would still think it is a benefit to run your own instance.

11
Averrinreply
lemmy.world

As I said, there is no profit from empty instances. Of course, the federation itself is good and fail-proof in this way. But if nobody asks for this cache, it's just an Internet Archive of a sort.

-12

It only takes one user for an instance to not be empty. Every bit of decentralisation adds resilience to the whole. But more decentralisation adds more resilience, so let's try to spread out the communities and users.

12
lemmy.blahaj.zone

I see—you're talking about instances with no users? Yes, those don't help much. Maybe edit the typo "a significant piece of auditory" in the original post, since I guessed that you were talking about instances with users but no communities.

1
Averrinreply
lemmy.world

Yeah, it's nothing about communities. Technically speaking, only the amount of direct HTTP requests matters. If nobody opens your domain, your instance is just spending your money for nothing.

-1

Instance A blocks instance B, which also blocks back. I create a single-user instance and subscribe to communities on both.

5
fedia.io

I'm quite worried about how well this federation system will work in the long run, especially as more people come over from Rexxit. As people make more posts and comments, every federated instance will have to cache more redundant content from the others, which uses more storage and so raises costs for every instance host. There's also a problem with visibility in search engines. Because Lemmy/Kbin can be hosted by anyone, searching on a specific domain is impossible, unlike how I can just add "reddit" to the search query. Also, since there are multiple Lemmy/Kbin instances, there's a chance similar communities will be spread across them, fragmenting the communities even further. Until they can find a way to fix those problems, I don't think federation is suited for large-scale communities.

As for the fragmentation problem, maybe adding a global search for communities like this would help reduce fragmentation. Users can still make their own community on their instance, while other people who don't need to can easily find the community they want.

40
altarireply
lemmy.world

After a day of use, I'm incredibly disappointed.

The fragmentation problems, and lack of cohesive community discovery (or even apparently any agreed standards for sharing communities etc. across instances in a way the most popular app can reliably recognise as being a community and not an external link or mailto address) will make Lemmy an absolute non-starter for 99% of potential users.

I'm sure there are solutions, but as it stands I can't see Lemmy gaining any widespread adoption without a significant leap in user friendliness in regard to how federated instances are implemented and managed.

25
lemmy.world

I don't see fragmentation as a problem at all. The number of total subscribers is published when doing a search and is the ultimate primary consolidator just like reddit. There were many redundant subs on reddit for any given subject, they just had no patronage. The process of establishing primacy takes time. Three days ago .world had fewer than 1k users and all of Lemmy had fewer active users than half those present on any one of several instances right now. The .world instance is 10 days old.

The priorities of decentralized service will not align with the antithesis model. I see a minor complexity barrier to entry as a positive filter for some of the worst quality users.

48

Agreed. If decentralized doesn't appeal to certain people then this isn't for them. I came from reddit. I'm not trying to make this into reddit.

20

I haven't even been using it for a day, and I share your disappointment. However, I understand that Lemmy is in its infancy. There are huge UX hurdles to overcome, and it's a lot for two developers to carry. The hope is that more devs will join, and make a good UX -- For what it's worth, the UI is quite neat IMHO, it's just the UX with regards to federation and discoverability.

Having a way to add instances and then replicate community lists would be a start. Having to manually fiddle with URLs of other communities is weird.

19
perkelereply
sopuli.xyz

What there needs to be is concerted development focus on fixing these quality of life issues. Unfortunately, there was not much time allowed for this to happen seeing as it was about a week or two from the announcement to the start of the blackout. These things take time and development time isn't always available.

17

True, but the big surge in popularity should help with getting more positive pressure on the community devs to improve things rapidly.

2

I initially felt the same way after a day or so of use, however once I got the app and figured out the clunkiness and rough edges it's really grown on me.

You're definitely right about discoverability but you're probably comparing this to Reddit that's had like 15 years to mature and sort it all out. Lemmy is made by like 2 developers for free and it's pretty impressive already what they've achieved.

I think if you give it more time and lower your expectations a little you'll appreciate it more. And you don't have to leave Reddit or whatever either, you can just use both and see what happens too.

9

I don't think having a federated r/all would properly work in a federated network, where popular posts come at the top of the community.

6

Redundancy is also happening on centralized servers. Also it is text so I would not be too concerned.

10
suguha.net

I've created my own instance in order to not create more load on others and it took a minute to realise I needed to populate it myself, would be nice to have a default view aggregating popular posts etc. across instances. But maybe I'm just asking for too much hehe

37
claymediareply
lemmy.world

That's an interesting idea. Maybe you could even choose the "default subs" for your instance from across lemmy.

15

That would be awesome I think! I am toying with the idea of building a proper instance, put my devops skills to use, but at the same time, few features missing!

9
Deboreply
lemmy.world

Seems to me that this is the ONLY way that a user (let's call them creators for the sake of this convo) can guarantee that their efforts are always both protected AND remain available as the creator sees fit.

If I start my own lemmy server and I'm the ONLY user, it would stand to reason that if I deprecated that server that all of my posts EVERYWHERE would instantly cease to exist (with exception of quoted posts in other's comments). That gives me 100% of control over MY specific content contribution to this platform. So, if in the future lemmy goes the way of reddit, it's as simple as us staging a walk-out just like we did to reddit, except NOTHING would show up on reddit anymore.

Am I missing something here? For true creators, spinning up a cheap server to host is acceptable if not expected if you want any type of control after the "Post" button is pressed.

8
Deboreply
lemmy.world

Right? Imagine having EVERYTHING you EVER contribute online be 100% controlled by YOU and YOU alone. There’s a framework here for a whole host of businesses…

Imagine just one where a subject matter expert can sell their ‘membership’ to a platform based on the strength of their history with other platforms…. Almost like a paid Wikipedia where the value of the platform is conveyed to the users that created the content. huh, imagine that.

8

If I have understood how lemmy works, the post and comment would be on the instance hosting the community. Your server would just post it to the community's server on your behalf.

2

This seems like a great solution, does it work this way? I admit I've not delved too much into how federation works, but I assumed that when content gets pulled from one instance to another it gets replicated to that other instance, so deletion becomes problematic.

In any case, being in complete control of one's own online presence seems like a great way forward.

1

I did the exact same thing. Ended up looking up the more popular communities on the bigger instances and searched for them on mine to index them.

I wish there was an easier way, but for now there isn't.

4
lemmy.jtmn.dev

What's the alternative? You go full-banana decentralised or mega-site Reddit. I think Lemmy is a nice middle ground

34
lemmy.zip

This has definitely been a problem with communities being created on the bigger instances and not utilising smaller instances. Happy for someone to say I'm wrong etc, but I think there would be merit in capping instances to x number of users or communities, to force the user base to spread out.

Also, the way signups work, (ie you find a community you like then click sign up but that signs you up to that instance), further exacerbates the issue and the confusion around how federation works. The sign up links on each instance should lead either to a page with an instance finder, or to a random instance that matches the profile of, and is already federated with, the instance you were on. Otherwise the larger instances have a monopoly and are just going to lead to a bad user experience when they can't cope with the traffic.

It's a self defeating prophecy if users only want to sign up to the instances with the big communities, because then everyone is going to keep creating communities there and nobody is going to want to join a smaller instance.

I might be talking nonsense and am happy to be told why that is all wrong :)

32

Yes, there should be instance caps, and they should be visible to users.

That way users can scale, choose, without much thinking.

This same technique works everywhere, for example MMO games. You have availability visible and choose servers according to it.

This would fix scaling partially without much technical changes.
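A minimal sketch of how visible caps could drive a signup picker, in the spirit of the MMO server list. The instance names, user counts, and caps are invented, and nothing like this is claimed to exist in Lemmy today.

```python
def pick_instance(instances: dict):
    """Return the open instance with the most free slots, or None if all are full.

    `instances` maps a name to a (current_users, user_cap) pair.
    """
    free = {
        name: cap - users
        for name, (users, cap) in instances.items()
        if users < cap
    }
    return max(free, key=free.get) if free else None


instances = {
    "big.example": (10000, 10000),   # full, hidden from the signup list
    "mid.example": (4000, 5000),     # 1000 slots left
    "small.example": (200, 2000),    # 1800 slots left
}
print(pick_instance(instances))
```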

13

If that cap idea were to exist, it would make sense to base it on the balance of users across the federated servers, so if there are enough with a similar amount it raises the cap

3

I'm happy to use one of my other accounts but those instances aren't federated as largely as world. If I can look at all from any instance and choose what communities to avoid I would do that.

2
lemmy.world

It's a bit worse than that actually. I'm now seeing several communities with exactly the same name that originate on different servers - so clearly Lemmy doesn't have a rule about duplication once you cross a server boundary. That's going to get unwieldy quite fast particularly if, I dunno, "Aww" gets popular on two separate servers at the same time - I guess I'll have to subscribe to both...

28
Ataraxiareply
lemmy.world

Well one instance shouldn't monopolize a community. If it takes a dump on one instance at least it exists elsewhere. If I want to start up my own cat community I don't see why that's an issue.

61
Saik0reply
lemmy.saik0.com

I agree, I don't particularly see this as an actual issue... Nothing stops you from subscribing to both.

Just like there could be a [email protected] and a [email protected]. Nobody is confused by emails when it comes to this... The difference is that it's slightly more work than reddit because r/aww is one particular thing and it's assumed we're talking about Reddit because of its unique format. Here it's just c/aww on lemmy.ml, but that's a bit of the point of the ![email protected] structure of naming.

I LOVE that there's ![email protected] and ![email protected]. Different communities run by different groups will end up with different content. Then I can shop for the content I want myself.

Nobody can singularly own the name. I always found that to be a big problem on reddit. r/trees comes to mind, if there was an actual arborist community that wanted r/trees, well they were fucked. And that's kind of jacked. This way it doesn't matter. Just pick a different instance that doesn't already have c/trees and post there... or better, start your own instance to host it.

I don't know... in the future people could even start up instances of lemmy on domains like lemmy.jobs, lemmy.help or lemmy.hobby to aggregate major communities based on topics. lemmy.jobs for instance could be an instance that houses the professional arborists, and the domain would make the intent clear. Or even better... drop the lemmy altogether and register jobs.social or similarly descriptive domain names.

I know we're all a hodge-podge of domains now because a lot of us are just spinning up instances on domains we already have... but the potential is there.
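The email analogy in this comment can be sketched as a small parser: a community is identified by the pair (name, instance), so the same name on two hosts is simply two different communities. The `!name@host` shape follows the naming structure mentioned above; the `.example` hosts and `parse_community` are made up for illustration.

```python
def parse_community(identifier: str):
    """Split '!name@host' into (name, host). The leading '!' is optional."""
    name, sep, host = identifier.lstrip("!").partition("@")
    if not (name and sep and host):
        raise ValueError(f"not a community identifier: {identifier!r}")
    return name, host.lower()  # host names are case-insensitive


a = parse_community("!aww@one.example")
b = parse_community("!aww@two.example")
print(a, b, a == b)
```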

51
Masterreply
lemmy.world

This problem existed on reddit too still. You have r/games r/game r/gamers r/gamenews r/gamernews etc. All trying to do the exact same thing.

38
coalbusreply
lemmy.world

I think this comment convinced me. Because you're right, on Reddit there were always offshoot communities that were essentially the same exact thing just of different sizes and run by different people. There'll probably always be the "most popular" one, and then several offshoots for the same topic but perhaps a better sense of community because it's hundreds or thousands of users vs millions or tens of millions of users.

Remembering the exact instance and community name combinations will take a little extra effort, but not significantly and subscribing negates that mostly.

15
haxasaurreply
lemmy.world

The one that pissed me off a lot is the misspelling of r/politcs trying to mimic r/politics. I messaged the mods asking why they existed, and they were either oblivious or trolling with their answer of "to talk politics".

4

Took me forever to realize I was subscribed to an r/mildlyinteresting and an r/mildyinteresting. Just figured they were the same thing and didn't affect me much.

2
SickIcarusreply
sh.itjust.works

r/trees comes to mind, if there was an actual arborist community that want r/trees, well they were fucked.

There was. They ended up with, I think, /r/marijuana_enthusiasts or something like that. It was quite funny to both sides, at least it was like 15 years ago.

19
slrpnk.net

As a former marijuanaenthusiast subscriber it was funny for about 5 minutes and then annoying for the next 10 years. I am considering making a similar community here but /c/trees has already been claimed by the other community. So I have to decide if I want to deal with the confusion or name it something else.

2

Yeah I can see that.

Maybe name it /c/arborists? 🤷‍♂️

Shame that that baggage would have to carry over, but the flip side is can you imagine the confusion if it didn’t.

1

Yup. I think it's fun and it makes me explore more. Makes me check out different instances and actually actively look for things I like instead of passive doomscrolling.

6

I'm not sure how this would work, but what about the concept of cross-instance communities? For users it would be a bit like a multi-reddit where you group various communities together into one aggregate list but when posting content you'd have to choose which instance it lands on. Mods would have to agree on a set of rules (and you'd have some communities split off due to differences), but otherwise it seems somewhat plausible.

That would be one way to solve the problem of every instance having a version of one specific type of community.

1

Wouldn't @ solve this? Or some form of unique designator.

6
aussie.zone

This is a feature, not a bug. But we definitely need a solution to make subscribing/coalescing them easier for users. Mastodon allows subscribing to topics (hashtags) - I think something similar is needed here, but that will evolve naturally over time.

56

Yes. This is 100% necessary. Otherwise giant communities would be built and probably all on lemmy.ml

9
lemmy.world

Maybe have them coalesce based on channel name, but have local mods on each server. It'd be great if you could share moderation between trusted servers or trusted mods on different servers as well (this could be on a per-community or per-server basis).

8

you don't have to have an account on the same lemmy server to mod a community on it. The creator of the community can add anybody as a mod.

7
lemmy.world

I don't get the argument about duplicates. The same situation existed on reddit - you had a few, sometimes more, subs about the same topic. You could subscribe to whichever you wanted. Why is this suddenly a problem on Lemmy?

54
kadureply
lemmy.world

I think users are still having trouble with the mental model for browsing Lemmy.

The first interaction with the service is already fragmented - you need to choose where to create an account and start browsing. Even though you can browse communities from other servers, people are now seeing them through the lens of "fragmented" "my server vs other server" and that creates the illusion that these duplicates are somehow a huge issue.

But duplicates can actually be quite useful - a community called "memes" on Lemmy.world could cater to a different audience than a community also called "Memes" but made in an instance entirely in French.

Also, if two instances have two communities you enjoy, with the same name... Subscribe to both? Nothing stops you from doing that. It's okay. Reddit had "me_irl" and "meirl" which were the exact same, but with different mods, a relatively similar number of subscribers and quite honestly the same content. I didn't know the actual difference between the two, and I still do not know - I just subscribed to both and kept getting depressing memes to cry before going to sleep. No issues.

59
chiisanareply
lemmy.world

Are there ways to manage lists of such? For example, on the former platform that doesn't deserve a call out, you can do "me_irl+meirl" and aggregate both into one feed. This makes reading the (albeit potentially cross posted) content in a unified feed much easier.

Another similar point I'm having a hard time getting over is that with a centralized platform, it is easy to go to "Subject A", and see everything on that subject. However, now I need to see "Subject [email protected]", "Subject [email protected]", "Subject [email protected]"... Yes, I could subscribe to them all, but this ultimately ends up creating a noisy home feed with also "Subject [email protected]", "Subject [email protected]", "Subject [email protected]", "Subject [email protected]", ... etc. all baked into one feed, as opposed to just something focused on "Subject A".

Lastly, discoverability leaves a lot to be desired. Today, I'm fairly new to Lemmy, I am actively seeking out communities that I might be interested in, across multiple popular instances, and hoping that federation is enabled between the two instances. Tomorrow, I'd find that I'm subscribed to too many (see the noisy main feed issue above), and I'd remove a bunch. Next week, am I likely to go to the Join Lemmy directory to find new instances, and add "duplicate" communities from newly popular instances? I think not.

I think the long term survival of the platform (to expand beyond just us tech nerds that hate the former platform) will depend a lot on streamlining this workflow to make content discovery much more consistent. Even a simple option where a pseudo "!Community@" (with no instance) feed that aggregates all the "!Community" regardless of instance that you've subscribed to, might go a long way.
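To make the pseudo "!Community@" idea concrete, here is a minimal sketch of how a client could build that feed locally. Everything in it is an assumption for illustration: the `Post` shape, the `merged_feed` name, and the example instance names are hypothetical, and this is not something Lemmy is confirmed to offer.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    community: str  # full address, e.g. "gaming@instance-a.example" (hypothetical)
    title: str
    published: int  # Unix timestamp


def merged_feed(name, subscriptions, posts_by_community):
    """Merge every subscribed community called `name` into one feed, newest first."""
    matching = [
        c for c in subscriptions
        if c.split("@", 1)[0].lower() == name.lower()
    ]
    # Assumption: each per-community list is already sorted newest first.
    feeds = [posts_by_community.get(c, []) for c in matching]
    return list(heapq.merge(*feeds, key=lambda p: p.published, reverse=True))
```

Only communities you already subscribe to are merged, so the result stays focused on one subject instead of being mixed into the home feed.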

14
beehaw.org

Discovery really has been the biggest drawback for me. The r/system combined with wikis and sidebars made it very easy to find interesting things.

That's lacking in lemmy so far. Which, it isn't a bad thing, barriers to entry have benefits. But from a user perspective, trying to replace reddit, the difficulty in navigating and finding things is frustrating.

But I'm coming from reddit, and they aren't meant to be the same. The issues are part of what makes it next to impossible for what happened there to happen in a federated system. And I'm so fucking sick of corporate bullshit ruining good things. I figure that lemmy will catch up in feature parity soon enough, and there's bound to be apps that make it easier to use at some point.

I just wish I had the resources to run a server myself.

13
beehaw.org

Well, by just searching topics in the search bar you can typically find instances related to the search. You need to click the "chain" icon rather than the "federated star" icon to view the post "from your instance" and stay on your personal account.

4
SickIcarusreply
sh.itjust.works

You need to click the “chain” icon rather than the “federated star” icon to view the post “from your instance” and stay on your personal account.

Woah. I’ve been clicking the star the whole time. This may make things a looot easier.

6

It does! That way, you can immediately subscribe to the community regardless of what instance it's on.

5

The feature I'll miss the most from Reddit is multireddits. I wish there were a way to create multilemmys.

Even a simple option where a pseudo “!Community@” (with no instance) feed that aggregates all the “!Community” regardless of instance that you’ve subscribed to, might go a long way.

I think we should have both this and multilemmys. For example, I would group all !gaming@... communities in a pseudo-community, then put it in a multilemmy with other gaming communities (Linux gaming, PC gaming, etc).

8

Yeah, I really do think we need both:

!gaming@... or !gaming@ which aggregates [email protected], [email protected], ... etc. that I've subscribed to into a single feed; and

#gaming which I can put !gaming@..., !pcgaming@..., and !consolegaming@... into a single collection.

This way we'd get the flexibility to pick and choose what we'd want to see more easily.
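A rough sketch of the "#gaming" collection idea, assuming a collection is just a list of wildcard patterns matched against your subscriptions. The `resolve_collection` name, the use of `*` in place of the "..." above, and the instance names are all made up for the example.

```python
from fnmatch import fnmatchcase


def resolve_collection(patterns, subscriptions):
    """Expand a collection's patterns into the subscribed communities they match."""
    resolved = []
    for community in subscriptions:
        matches = any(fnmatchcase(community, p) for p in patterns)
        if matches and community not in resolved:
            resolved.append(community)
    return resolved


# Hypothetical collection: "*" stands in for "any instance".
collections = {"#gaming": ["gaming@*", "pcgaming@*", "consolegaming@*"]}
```

Because the patterns are matched against the whole address, "gaming@*" does not accidentally pull in "pcgaming@...", so each pseudo-community can be added or dropped from the collection on its own.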

2
LookTherereply
fedia.io

That's a really good analogy. Still, there needs to be an easier way to search remote communities. Copy pasting community links in search bar is really clunky.

14

It would be really nice if the search would show all communities in federated servers, and maybe communities in servers federated with those servers, etc.

1

> I just subscribed to both and kept getting depressing memes to cry before going to sleep. No issues.

Hahahaha I’m sorry but the way you snuck this in at the end just killed me. But you have a valid point. Every platform like Reddit/Lemmy has duplicates. That is kind of the point.

8

That's true, and the point I guess. You sub to all relevant communities and the overlap isn't an issue because it's different communities with different instances making content with others interacting through federation. The "subreddit" is diversified to the top communities in all of the highest subscribed instances. It's just the nature of the beast, but once you find all the top comms it probably doesn't seem so bad.

8

I don't think that there are thousands. The fediverse stats show about 300 servers, 200 or so made in the last week.

At that rate, it is not too bad. I expect there will be a plateau at some point, relatively soon, where the need for new ones stops, and the experimental ones disappear.

28

They are not useless, if the users would actually spread out among them. Each server has its limits.

22
Averrinreply
lemmy.world

I think a small bit of p2p can be useful. Maybe as an additional layer. I worked a lot with tendermint nodes (cryptocurrency) and I saw pretty effective solutions.

4
Saik0reply
lemmy.saik0.com

I worked a lot with tendermint nodes (cryptocurrency)

If you worked with crypto, I think you would understand that ALL crypto is federated, not P2P. You need full nodes to communicate with in order to validate transactions. This is fundamentally federation, anyone can spin up their own full node and participate however they want.

Same is happening here with ActivityPub instead of block chain transactions.

9
Averrinreply
lemmy.world

I know what you mean, but all nodes are equal, they are fully participating (validators aside). I mean every node handles every transaction and can be faster than another (it doesn't matter due to the validation scheme, but technically speaking all nodes handle every user action).

1
Saik0reply
lemmy.saik0.com

You clearly don't know what I mean. While each node is storing the same block chain... not every node is handling the same amount of traffic. Nodes can choose which nodes to peer with. I, for example, have a full node that only accepts traffic from a handful of trusted nodes and from my self-hosted btcpay instance.

Users can choose what nodes they wish to validate their wallets to. Many companies and exchanges only validate against their own full nodes. This is not equal. There is no sense of "equal" here, aside from the fact that the "correct" block chain is eventually agreed upon by the majority of the network.

This is exactly the same in this case... lemmy instances that want to peer, will peer. The activitypub standard will broadcast updates just like a block solve notification to the blockchain network. It's up to each node/peer what they do with it.

7

Ok, you are right about peering, I tried to get more peers to be faster, but it isn't necessary. I didn't find anything about ActivityPub broadcasting, but if it's true... so, yeah, having an rpc p2p connection doesn't make the whole system less federated. But still, crypto clients usually have lists of nodes (or API balancers) for faster handling.

4
omg.qa

Privacy wise for me it is more convenient to run my own instance and have my own private communities.

13

Yep and you're not subject to the whims of your home instance admins deciding what they will and won't federate with, and how you must behave across the fediverse.

1

Are all these thousands of lemmy servers useless?

almost. It's actually worse than that - when you subscribe to a community from your server it will fetch like 20 posts and that's it, you'll get only new stuff after that, so there's no possibility to do a full mirror of selfhosted, for example, if you started your instance today and didn't fetch posts and comments manually.

ActivityPub per se is just a spec on s2s/s2c communication, which is not a great thing since in many cases it assumes single source of truth, which potentially puts huge load on more popular instances.

I think a quick and dirty hack to this could be the following - each linked instance may maintain cache of announces (so there would be benefit of just forwarding original http signed requests w/o being afraid of malicious actor), which your instance could pull, this way you could populate your mirror without overloading the original source.
Distributed activities propagation though... Let's say there are some design steps involved to make this truly distributed, however I feel like it's possible.
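A minimal sketch of that announce cache, under stated assumptions: `AnnounceCache`, `SignedAnnounce`, the `seq` counter and `pull_from_peer` are hypothetical names, and the `verify` callable is a stand-in for real HTTP signature verification against the origin's public key, which is not implemented here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedAnnounce:
    seq: int        # position in this cache's log (hypothetical field)
    origin: str     # instance that originally signed the request
    body: bytes     # original request body, forwarded byte for byte
    signature: str  # original signature header value, left untouched


class AnnounceCache:
    """Bounded log of announces a linked instance has already received."""

    def __init__(self, max_items=1000):
        self._items = deque(maxlen=max_items)  # oldest entries fall off
        self._next_seq = 0

    def record(self, origin, body, signature):
        self._items.append(SignedAnnounce(self._next_seq, origin, body, signature))
        self._next_seq += 1

    def since(self, seq):
        return [a for a in self._items if a.seq >= seq]


def pull_from_peer(cache, since_seq, verify):
    """Keep only announces whose original signature passes `verify`."""
    return [a for a in cache.since(since_seq) if verify(a)]
```

The point of forwarding the body and signature unchanged is that the puller never has to trust the peer doing the caching, only the origin's key.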

11
lemmyfly.org

every instance is sharing in the traffic to browse the fediverse. Not one service is responsible for serving content, you (the instance admin) are only serving for your members.

The downside of this is there is a huge amount of replicated data stored everywhere. Content of popular communities will be scraped by and stored on many many servers, filling up servers and increasing storage and bandwidth bills for all those servers.

11
deadcycloreply
lemmy.world

I'm not sure your second paragraph is correct. First of all, it's "just in time" so will only be replicated if somebody on that instance is following it. But more importantly, I read a statement from a server owner somewhere that the software purges older content regularly (and refetches it "just in time" when somebody tries to view the old content) to keep storage size down.

6
feddit.nl

If this is the case, then wouldn't his first paragraph be incorrect also? Because if it is "just in time" with quick purging, the main server still has to constantly serve the instance server the content. It would only be beneficial if many instance users are trying to view the exact same content at around the same time (so for the "massive" communities maybe?)
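A quick way to check that intuition is to count how often a caching instance has to go back to the origin. This is only a toy model: the `origin_fetches` name and the single purge interval `ttl` are assumptions, not how Lemmy is confirmed to purge content.

```python
def origin_fetches(request_times, ttl):
    """Count origin requests for one post, given local view times and a purge TTL."""
    fetches = 0
    cached_at = None
    for t in sorted(request_times):
        if cached_at is None or t - cached_at >= ttl:
            fetches += 1  # cache empty or purged: ask the origin again
            cached_at = t
    return fetches


clustered = origin_fetches([0, 1, 2, 3, 4], ttl=10)       # views close together
spread = origin_fetches([0, 100, 200, 300, 400], ttl=10)  # views far apart
```

With the clustered views the origin is asked once, with the spread-out views it is asked every time, which is the case where the small instance saves the main server nothing.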

3

From my understanding you are correct. Each instance is responsible for serving all of the content of the communities created on it. So many small instances with a smaller amount of communities = good, a few huge instances with lots of communities = bad.

4
Averrinreply
lemmy.world

Please elaborate on how "every instance is sharing in the traffic to browse the fediverse". I didn't find it in the AP standards or in the activitypub_federation lib docs. If there are some balancing mechanisms inside lemmy's code, would you mind pointing me to them?

5
lemmyfly.org

Looking into the database, it contains many thousands of posts. I’m assuming this is stored in the local db for serving it to instance members. So when you open a post from instance B on instance A, A fetches the post data from B, stores it in A's database, then serves the content from A's db to the browser.
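That flow is basically a read-through cache. A minimal sketch, assuming a hypothetical `InstanceCache` class, a one-table schema, and a `fetch_remote` callable standing in for the request to instance B (none of this is taken from Lemmy's code):

```python
import sqlite3


class InstanceCache:
    """Instance A: serve posts from the local db, fetching from B on a miss."""

    def __init__(self, fetch_remote):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE post (url TEXT PRIMARY KEY, body TEXT)")
        self.fetch_remote = fetch_remote  # stand-in for the request to instance B
        self.remote_calls = 0

    def get_post(self, url):
        row = self.db.execute(
            "SELECT body FROM post WHERE url = ?", (url,)
        ).fetchone()
        if row is not None:
            return row[0]  # served from A's own database
        body = self.fetch_remote(url)
        self.remote_calls += 1
        self.db.execute("INSERT INTO post (url, body) VALUES (?, ?)", (url, body))
        return body
```

Every later view of the same post by a member of A is answered locally, so B only sees the first request.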

3

Yes, you are right, if the instance has members. A server will actively fetch "foreign" content and cache it when one of its users asks. But aside from the top 10 servers, there is no benefit in having more until they have a couple of dozen users. If any server were able to "delegate" request handling to less busy servers, that would be a solution for this uneven load.
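To show what I mean by "delegate", here is a tiny sketch that picks the server with the most spare capacity. It is purely hypothetical: `pick_server`, the load numbers and the server names are invented, and as far as I can tell nothing like this exists in Lemmy today.

```python
def pick_server(loads, capacity):
    """Return the server with the most spare capacity, or None if all are full."""
    if not capacity:
        return None
    spare = {server: capacity[server] - loads.get(server, 0) for server in capacity}
    best = max(spare, key=spare.get)
    return best if spare[best] > 0 else None


capacity = {"big": 100, "small-1": 10, "small-2": 10}  # requests/s each can take
loads = {"big": 100, "small-1": 3, "small-2": 7}       # current requests/s
target = pick_server(loads, capacity)
```

The hard part is not this selection step but getting trustworthy load numbers and making sure the delegated server has the content, which the sketch ignores.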

1

The replication isn't all that bad. Images stick around in their local instance, the federated data is all JSON payloads and metadata. Yes it will pile up over time, but only instances with hundreds of users and thousands of indexed communities are at risk of massive storage needs.

3

Amen on matrix. Federating with most popular rooms on matrix.org basically brings my server to its knees for a week trying to play catch up between federating users and their profile pictures and decrypting years of chat history. On my first go I made the mistake of trying to join #matrix:matrix.org and I had to wipe the entire server clean to get it back.

4

Check out matrix 2.0 on youtube. It looks promising.

1
Averrinreply
lemmy.world

> Because you can still access all content no matter where you are.

If you know how and want to do it. Unfortunately, it isn't the way how most people think.

3

And fortunately, there's still time for people's minds to change. Federation and decentralization are things that aren't really advertised or mainstream yet so people still don't have a clue what it is. However, we do know how those things work, so I guess it's kind of up to us to help people know about how said things work.

4
Averrinreply
lemmy.world

I'd like to help with this improvement. Do you know of any plans for it? Honestly, it looks like there is no "lemmy committee" and even lemmy's developers cannot organize something like this. Any ideas?

0
lemmy.perthchat.org

Nothing good can come out of a federation committee. Invite whoever you want wherever you want and give a little bias to smaller instances, and it should balance itself out.

4
Averrinreply
lemmy.world

I don't suggest adding centralization =) I see two possible and actionable directions:

  1. Create tech solution to balance load through available resources
  2. Spread the word that there are better ways to spend your money and passion helping lemmy. I know, my "engineering manager" bias tends to see process problems in places where there are no problems. But I don't want to see the awesome idea dying because of a lack of basic management and foresight.

-1
ericjmoreyreply
lemmy.world

I'm confused about what you want. Why should I care about lemmy.ml being over run because they didn't put enough resources into their instance?

3

Because we are here because of content, made by users. I'm thinking about the whole "lemmy-verse". If users encounter issues, they just stop using the service. You as an instance owner can choose to not participate. But if somebody already thinks that they help, why not use it?

-2
lemmy.perthchat.org

> And all major chat rooms are hosted there too.

That's not how matrix works. Rooms are fully replicated to all participating servers, no single server acts as a point of failure. It also has a resolution system for divergent states; no server has authority over any other. You can add local aliases for any room you're in on your local server without permission from room admins. (... Actually I'd have to doublecheck that. It might be one of the permissions.) You don't even need an alias, the room ID (eg. !BZVTUuEiNmRcbFeLeI:matrix.org) and a server that's currently participating in the room (ie. has that room ID) is enough to join. That's really what aliases are; mapping a name to a room ID on a particular server that is known to be participating. Room IDs include a server name as an anti-collision mechanism, it has no other significance.
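As a toy model of the alias idea, an alias directory is just a per-server map from a name to a room ID plus servers known to be participating. `AliasDirectory`, `parse_room_id` and the server name `my.example` are invented for this sketch, and it does not talk to a real homeserver or use any Matrix library.

```python
def parse_room_id(room_id):
    """Split "!opaque:server" into its two parts; the server part is only a namespace."""
    if not room_id.startswith("!") or ":" not in room_id:
        raise ValueError(f"not a room ID: {room_id!r}")
    opaque, server = room_id[1:].split(":", 1)
    return opaque, server


class AliasDirectory:
    """Local aliases on one server: name -> (room ID, participating servers)."""

    def __init__(self, server_name):
        self.server_name = server_name
        self._aliases = {}

    def add(self, localpart, room_id, via):
        parse_room_id(room_id)  # reject malformed IDs early
        alias = f"#{localpart}:{self.server_name}"
        self._aliases[alias] = (room_id, list(via))
        return alias

    def resolve(self, alias):
        return self._aliases[alias]
```

Resolving an alias only hands back the room ID and somewhere to join through, which is why the room itself does not depend on the server named in either.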

> It feels like 95% or more are on matrix.org instance.

I think it's a bit lower than that. midov is actually pretty big. So's T-AC. (And then there's joe's room directory, which I think is bigger than morg's.) But it's well over 50%, which is disgustingly high. morg should close registrations, and purge dormant accounts/aliases.

> a weekly cap for new registrations as an option

Extremely good idea.

> like MMOs.

Funny you mention that, I had a laugh on Witch Weapon when I discovered that my server was frequently getting 10-100x more points in ranking events than other servers (and I was frequently placing top 50 without any of the busted characters). I'd simply chosen the recommended server, which I had assumed was the smallest at the time. Turns out it was just a static value so it had like 50x the population of the other two.

I still miss that game.

1
taladarreply
sh.itjust.works

Trying to host my own Synapse server once for my own use and seeing how it was chewing through every bit of resources on my server while providing an unusable slow experience has pretty much ruined Matrix as a whole for me as well as contributed significantly to my dislike for Python.

2
Samreply
lemmy.ca

How long ago was this? It's in a much better state now.

1
taladarreply
sh.itjust.works

A few years ago (3-4 maybe). It wasn't just a bit slow either, more like the server using the full 16 gigabytes of RAM and constantly at 100% CPU and channels not even being usable to read them 20 minutes after joining.

1
Samreply
lemmy.ca

That was my experience as well, and I completely wrote it off. After having gone back to it, and after watching the matrix 2.0 preview on youtube, things are a lot better than they were, and looking a lot better in the future.

1

It’s kinda the same issue that some games have, like MMOs. People tend to make new accounts on the biggest and overloaded servers because there is the most activity even though stability could be an issue, or login queues.

It's funny you mention MMOs, because FFXIV has a system that i'm now realizing feels like federated websites.

You have your home world(server) where your character was created and is stored server side, but you can matchmake within your data center as well as visit other servers in your data center.

And then you can also temporarily transfer to different data centers (though the implementation is clunky and has a few restrictions)

1

Lemmy is very similar to Matrix. By the way, there are channels in Matrix. There are communities on Telegram....

7
lemmy.sbs

I just spun up my own instance as well and it does feel a bit like I'm just pulling from the biggest instances and feeding my own without really being able to give much back.

2

You're reducing load on the bigger instances by not using them directly, which is giving something back

2

A very valid question I am also interested in knowing. I’m wondering how much management it will be for me - who created his own instance and am having to find all the other communities myself. Or if my instance is doing anything but providing me a unique instance address and name.

2
lemmy.nz

Does anyone have any idea what specs are required to run a lemmy server, how about the "big" ones at the moment, just to get an idea of the scale of the challenge?

1

Here is a post from the admin of sh.itjust.works talking about his setup that is currently handling 2k+ users pretty well, and according to the latest update 10 minutes ago still has a lot more spare capacity to go.

I've seen some posts about self-hosted Lemmy servers that run very well on the cheapest VPS tiers that hosting services can provide (the 2-5 USD/month range).

3

Based on the bit of research I have done, along with creating https://lemmyonline.com/

It seems you are correct. A small handful of servers contains roughly 95% of the user-base.

I think the intended way for this to work is that certain communities can be hosted on their own servers. However, it appears most of the popular communities migrating away from reddit flocked to lemmy.world, which is likely contributing to it being overloaded.

1

I've suggested a routing protocol to the lemmy devs - to use federated instances to route all the messages to other federated instances. The idea was received with some interest, but it seems that people believe that there's still a ton of performance that can be squeezed out from the current architecture through optimisations.
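Roughly, the idea is that the origin delivers to a few instances and those forward to the rest. A sketch under my own assumptions: `relay_plan`, the fixed `fanout` limit and the tree shape are illustrative only, and this is not the protocol as it was proposed to the devs.

```python
def relay_plan(origin, instances, fanout):
    """Assign each instance a sender so nobody delivers to more than `fanout` peers."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    plan = {origin: []}
    senders = [origin]  # nodes in the order they become able to forward
    i = 0
    for inst in instances:
        while len(plan[senders[i]]) >= fanout:
            i += 1  # current sender is full, move to the next one
        plan[senders[i]].append(inst)
        plan[inst] = []
        senders.append(inst)
    return plan


plan = relay_plan("origin", ["a", "b", "c", "d", "e", "f"], fanout=2)
```

Here the origin makes two deliveries instead of six; the cost is extra hops, and a dead relay cuts off everything behind it unless there is a fallback.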

1
lemmy.world

Since Lemmy instances are not backed by commercial interests, but rather by nice volunteers and donors that have money and time to spare, they will be heavily affected by economic downturns (though we can still see commercial interests affecting users negatively with reddit). Here are my thoughts on the matter:

  • as far as I understand the owner of the domain: https://lemmy.world even has to pay for this fancy domain name in the DNS system ... every month subscription service style
    • (and tbh I hate the Domain name system) why should I fund it with my own money?
    • if you hosted with an onion site over tor that expenditure would not exist, but how would users discover your site then? Let me know if you know something about this
  • in times of deflation (meaning money becomes worth more), spending some money on a self hosted lemmy instance becomes nonsensical
  • tbh if I hosted a lemmy instance and the users of my instance posted high quality content in quantity I would use it to train my own LLM, that would at least create some economic incentive for me to host such a page ... but managing spam and bots will be HARD

That is why you should always back up your comments on your personal device, would be nice if lemmy had an automated way of doing this (I should look into this more)

-1
cstinereply

The thing you’re overlooking is that for a lot of the people hosting small instances, this is a hobby.

Speaking for myself, the cost of a domain is basically nothing, and adding Lemmy to my hosting setup was zero - I already have more ram, cpu, and disk space than I’d ever need for this instance.

Financial incentives are not the only thing people care about, and until relatively recently weren’t the general default purpose of online social spaces.

13
claymediareply
lemmy.world

Ok, sure. But what if your instance became popular and started costing you hundreds per month? Or in a couple of years you lose interest; do you keep paying for it? What happens to all of the content that users created on your instance?

0

Well, I'm very explicitly not running it with the intent of it getting larger than maybe a couple dozen people I know who are interested. I'm not really interested in content moderation at any scale, let alone with random people I don't know from the internet. (My most recent job was dealing with content & abuse for a large cloud provider, and I have zero interest in picking up shitpost babysitting if it's avoidable.)

I'm otherwise going with the Mastodon Server Covenant as the basic guidelines: I've got a trusted friend I'm going to add as a 2nd admin, doing backups nightly, and at least a 90 day notice if/when I decide to stop hosting this.

I'd happily transfer the domain & data to anyone who wants to continue to admin it, or ask the community members what they want done.

I'll admit that makes WAY less sense if I wanted to run an instance with thousands of users, but that's very much not the goal.

8