Reddit: 'We Are in the Early Stages of Monetizing Our User Base'
Reddit said in a filing to the Securities and Exchange Commission that its users’ posts are “a valuable source of conversation data and knowledge” that has been and will continue to be an important mechanism for training AI and large language models. The filing also states that the company believes “we are in the early stages of monetizing our user base,” and proceeds to say that it will continue to sell users’ content to companies that want to train LLMs and that it will also begin “increased use of artificial intelligence in our advertising solutions.”
The long-awaited S-1 filing reveals much of what Reddit users knew and feared: That many of the changes the company has made over the last year in the leadup to an IPO are focused on exerting control over the site, sanitizing parts of the platform, and monetizing user data.
Posting here because of the privacy implications of all this, but I wonder if at some point there should be an "Enshittification" community :-)
https://www.404media.co/reddit-we-are-in-the-early-stages-of-monetizing-our-user-base-2/
Reddit has long had an issue with confidently providing false statements as fact. Sometimes I would come along a question that I was well educated on, and the top voted responses were all very clearly wrong, but sounded correct to someone who didn't know better. This made me question all the other posts that I had believed without knowing enough to tell otherwise.
LLMs also have the same issue of confidently telling lies that sound true. Training on Reddit will only make this worse.
Yeah all of my most down voted reddit comments were the ones where I replied about something I'm an actual expert in. Scary stuff
The voting system lets people push comments to the top that they want to be true, not necessarily things that are true.
There's also the issue of reddit comment sorting being entirely dominated by time. In something like 90% of posts, the top comment is one of the first five. Literally all you have to do is just comment first, and it'll likely be the top.
I noticed from the beginning that Lemmy's default comment sorting improves visibility of a variety of comments including newer ones. Gee, I wonder who could have helped make it that way ;)
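To make the time effect discussed above concrete, here is a minimal sketch comparing a sort by raw score with a time-decayed rank. The decay formula, the `gravity` value, and the sample comments are illustrative assumptions, not Reddit's or Lemmy's actual ranking code.

```python
import math


def raw_score(upvotes, downvotes):
    """Plain net score: early comments keep accumulating votes and stay on top."""
    return upvotes - downvotes


def decayed_rank(upvotes, downvotes, age_hours, gravity=1.8):
    """Hypothetical time-decayed rank: log-damped score divided by an age penalty.

    `gravity` is an assumed tuning knob; a higher value punishes old comments more.
    """
    score = upvotes - downvotes
    return math.log10(max(1, 3 + score)) / (age_hours + 2) ** gravity


# Made-up sample data: one early comment with many votes, one newer comment.
comments = [
    {"id": "early", "up": 500, "down": 20, "age_hours": 10.0},
    {"id": "late_expert", "up": 40, "down": 1, "age_hours": 1.0},
]

by_score = sorted(
    comments,
    key=lambda c: raw_score(c["up"], c["down"]),
    reverse=True,
)
by_decay = sorted(
    comments,
    key=lambda c: decayed_rank(c["up"], c["down"], c["age_hours"]),
    reverse=True,
)

print([c["id"] for c in by_score])
print([c["id"] for c in by_decay])
```

With these numbers the raw-score sort puts the early comment first, while the decayed rank lifts the newer comment above it. Whether that is desirable depends on how aggressive the decay is.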
Over the years I ended up getting a Reddit habit of replying to one of the top comments so that it could attain some visibility. I still do sometimes but less often on Lemmy.
Because it's like old forums where the first person to comment gets engagement
Some of the better subreddits tried to mix it up and change how this affected upvotes. There was muxing, etc. But then Spez came in (back) and didn’t give af about anything at all except money.
First time I'm hearing about this, can you give any links? Maybe we could use something similar in lemmy
Muxing upvotes, “balances”, etc.
Even hiding all upvotes of every comment thread until ~12 hrs after posting.
This tends to give more influence to people who spend more time on it and write more. And they are less likely to be subject matter experts.
I strongly agree with this comment. To show my appreciation, you have my upvote. Had I only agreed a little bit, I might have not voted at all. If that comment had made me angry, I might have downvoted.
Actually calling these things votes instead of likes makes a lot of sense. I might not like a comment, but I might want it to be higher. I might not hate another comment, but I might want it to be lower because of other reasons.
Downvoting was always just fast food validation that you're better than someone else without having to actually back it up.
I spent 20 years as a producer, developer, and project manager in the lottery and games industry.
Trying to explain how lottery and games work to people and have them hear me makes me want to cry.
Fascinating! I’d love to hear a little about it, if you don’t mind.
Certainly, I'm always happy to share with inquisitive minds.
Is there any particular question you'd like me to address?
Not really, I never paid much mind to it. I’m curious about the whole industry I guess, or anything you’d like to share or set the record straight about.
Oh there's lots I have to set the record straight about and there's lots I could talk about, but without being asked a specific question that would just leave me to write an open-ended essay and I'm not up for it right now
@Fubarberry yes I saw this a lot too. Highly upvoted confidently incorrect comments, with the real answer or an answer debunking them with links to factual sources less upvoted.
Happened to me as well.
I am a lawyer and I would get down voted for posts explaining the law that contained citations to the actual applicable statute if people didn't like the statute. Using reddit up votes as a measure of correctness is fundamentally a dumb idea.
@collapse_already yeah Reddit also tended to mistake explanation for agreement and savagely downvote it.
specious /spē′shəs/ adjective
and then the real answer will be hidden or something silly, or in some cases where money is involved the correct answer might have been removed
The same can be said of https://news.ycombinator.com/ as well. I wonder how much of this is due to sock puppets and bots.
I'm still happy that I went through the effort to delete all my old posts when I left Reddit a while back. I periodically check if they've restored them and luckily it hasn't happened so far. I do miss some of the bigger communities but overall I'm having a good time on Lemmy.
I'm sure they have a backup somewhere that they will use to train the AI, but agreed, it is time to leave reddit for good.
Unless you are in the EU Reddit absolutely did not delete your data.
Reddit is dumb enough that they probably have a backup they kept of EU users.
I can vouch for that.
Well, if you want to be sure that Reddit deleted your data, the time to bring it up is now. Ask questions, contact journalists, demand answers.
Your PII isn't being sold here and you gave Reddit an irrevocable license to your content, so being in the EU doesn't matter.
No, GDPR applies to all data, not just PII.
The GDPR explicitly only applies to "personal data"
which it defines as follows:
Please provide a quote where the GDPR says that it applies to anything but "personal data".
I wonder what the risks are of including deleted and pre-edited content in training data. Most of the edits are going to be typos and formatting; do you want 2-3 copies of the same message with typos in them for training data? Similarly, deleted comments are mostly nonsense, unhelpful, duplicate, or highly controversial things.
If someone wants to dig through and find individual users to restore that's one thing, but I don't think I'd immediately choose to train off of that other data unless I had to.
It should be very easy to distinguish edits and deletes which were made within a few minutes or hours after writing a comment, from those made months or years later right around the reddit blackout.
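As a rough illustration of that point, a filter like this only needs the creation time and the time of the edit or delete. The 24-hour window, the label names, and the sample timestamps are assumptions made up for the sketch; nothing here reflects how Reddit actually stores or processes edit history.

```python
from datetime import datetime, timedelta

# Assumed threshold: changes within a day of posting count as routine fixes.
QUICK_FIX_WINDOW = timedelta(hours=24)


def classify_change(created_at, changed_at, window=QUICK_FIX_WINDOW):
    """Label an edit or delete by how long after posting it happened."""
    if changed_at - created_at <= window:
        return "quick_fix"    # typo, formatting, second thoughts
    return "late_change"      # e.g. a mass edit or delete years later


# Hypothetical timestamps for one comment.
created = datetime(2021, 3, 1, 12, 0)
typo_fix = datetime(2021, 3, 1, 12, 5)
protest_delete = datetime(2023, 6, 12, 9, 0)

print(classify_change(created, typo_fix))
print(classify_change(created, protest_delete))
```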
Lol YoU ShOuLd HaVe ThOuGhT oF ThAt SoOnEr
LaNgUaGe FoR tHe MaChInE!!?:/;1
After deleting all of my posts and comments Reddit decided to undelete them three days later and then proceeded to lock me out of my own account. Fucking bastards.
I just left my comments on. I still use reddit when searching actual human responses from Google. Maybe one day someone might find my archived comments useful in the future.
I am glad it makes you feel better but the reality is they still have your data. Just because you don’t see it on the front end doesn’t mean it isn’t still in the database with a “deleted” flag set. They aren’t hard deleting your comments.
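For anyone unfamiliar with the pattern, this is roughly what a soft delete looks like. The table layout and function names are hypothetical and use an in-memory SQLite database; this is a generic sketch of the technique, not a claim about Reddit's actual schema.

```python
import sqlite3

# Hypothetical schema: a "deleted" flag instead of removing the row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, "
    "body TEXT NOT NULL, "
    "deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO comments (body) VALUES (?)", ("my old comment",))


def soft_delete(conn, comment_id):
    """Mark the row as deleted; the body stays in the table."""
    conn.execute("UPDATE comments SET deleted = 1 WHERE id = ?", (comment_id,))


def visible_comments(conn):
    """What a front end would show: only rows not flagged as deleted."""
    return conn.execute(
        "SELECT id, body FROM comments WHERE deleted = 0"
    ).fetchall()


def all_rows(conn):
    """What anyone with database access can still read."""
    return conn.execute("SELECT id, body, deleted FROM comments").fetchall()


soft_delete(conn, 1)
print(visible_comments(conn))
print(all_rows(conn))
```

After `soft_delete`, the comment no longer appears in `visible_comments`, but `all_rows` still returns its full text with the flag set.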
Deleting your messages is just another data point for them. Reddit can train an AI on the originals and categorize you as a "comment deleter" to give them more information.
Aye, and that’s why I left. As an author, fuck you trying to monetise my writing when I can’t even do that myself.
Hey another author?! How you doin? Lol
Same as you fuck them.
Yeah, hi!
Can I have a link to your work?
May I see both of your works?? I'd love to give em a read!
Here’s mine:
Blue Are the Hills, Lilly Piper.
None of my other writing is public at the moment.
Gotta buy me dinner first! Lol
Jokes aside I'm fairly private when I'm not so I tend to not openly share my writing. I'm building up for when I retire from corporate IT to unleash a lifetime of it.
I did that, too. I published my first novel in 2019 after leaving my career as a UX designer/softwaredev/db admin/etc.
Hit me when you’re ready, no matter how many years that is – I’d love to read your stuff.
basically every technology one
The start of the bubble popping was the increases in interest rates. We've seen several online companies shut down already because the free money isn't there any more and there is no path to monetization.
The problem with the Fediverse right now is that it is all run on volunteer labor and donations, similar to an early Reddit. It will be interesting to see how a distributed system solves this problem.
I think the volunteer labor and donations strategy works much, much, better on a distributed platform like the fediverse.
Sure, but what happens if the population explodes? Primarily server costs will go through the roof, and then you're still relying on volunteer moderation. It works now because the fediverse is reasonably small, but a true user exodus from any major platform could overload existing instance resources. I think the saving grace here is that there is a bit of a learning curve with Lemmy that fends off the less tech savvy, but that could change in future updates.
Maybe I’m wrong but I think the fediverse isn’t quite that fragile. Instances can always close new sign ups if they’re overwhelmed. More users means more donations and more people likely to self host, too.
I guess we could run into real issues if fediverse infrastructure doesn’t scale well (example: required server resources scale exponentially with more users instead of linearly)
In extreme circumstances instances can defederate from larger ones if their mod teams are overwhelmed (obviously this isn’t a good solution but it is something beehaw.org is doing/did with lemmy.world)
The issue really comes down to the infrastructure costs. The fediverse is by design significantly less efficient with hardware than a centralized system. It isn't that it's difficult to scale, it's just that it's expensive to scale. And since the hardware is maintained by generosity of donation...
This is offset by the higher interest in volunteer labour, though.
I think the "solution" is just to accept that instances will burst in and out of existence (and favour) based on time and generosity.
As long as user profiles and contributions can transfer between instances, especially if the process is easy, then instances coming and going won’t be that much of a problem.
I do hope that current and future open source tech moves towards monetization resistance if monetization can’t be done ethically. Donation and volunteers seem to be the working formula so far
I think the bubble is coming too. The question is how much it will take for normal users to be done with them. The current Lemmy user base is more focused on tech, open source, and/or privacy than the average Internet user, which is why we already abandoned Reddit.
I think having to pay for access to these sites might be the biggest issue, as many people see the Internet as something that should be free.
There is such a thing as good technology. It would be nice if one of the tech comms would ban posts about shit tech
You know the phrase "If you aren't paying, you're the product".
It doesn't hit as hard as a CEO using the phrase "Monetizing Our User Base".
You know what the world doesn't need?
an AI model trained on the old Reddit Hive Mind.
Some AI models already argue when people point out inaccuracies, just like on Reddit.
As someone with expertise in some niche fields:
They're almost always wrong about everything, and when someone tries to correct them, with sources, they get downvoted.
Guess what data they're trained on...
This is a human thing and not so much a reddit thing. People been arguing on the internet since the inception of message boards.
I disagree. A reddit bot would be really funny as it would constantly talk about incest and spez
That and the feeling of pride and accomplishment.
A lot of AI models are probably already trained on Reddit data. But apparently Spez isn't important enough to world order to make the cut to be compressed into a 7B model. I asked my Mistral-7B-Instruct (4-bit quantised) local LLM:
"Early Stages?" You've got AI mining your data. The Lions have already come and gone. The hyenas and other scavengers are picking over the scraps, now.
They mean that they haven't made money on it (yet).
They have probably only provided a small amount of available data, and have much more data, of different type.
Yes we've got the data, but now we need it from different angles!
When I go to some reddit posts on Mobile now (like from a Google search, that's the only way I end up at reddit anymore), it tells me "this content is unmoderated" and gives me a choice to either navigate away or install the Reddit app. Fuck that noise.
Try this, in either Bing/Copilot AI or Google Gemini: Start your prompt with "According to Reddit", then do your search like you would by using search alone.
The AI of your choice will scrape the posts and give you a nice summary of whatever you were searching for - no need to ever touch Reddit directly.
For me, this works better with Copilot, YMMV.
Example: "According to Reddit, what is the best mechanical keyboard brand to use for touch typing?"
or i can just add "site:reddit.com" to a normal search. meh.
Absolutely! What I am suggesting here is: since Reddit is so gung ho on AI, use the AI to bring them to their knees, and have some fun while doing it. 😬
how exactly do you think that would bring reddit to their knees?
Does that allow you to bypass the "open in app or navigate away" wall?
I never see that because all my devices are set up to redirect to old.reddit.com
Change the URL to old.reddit.com as the domain
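If you want to script that swap, a minimal sketch could look like the following. The function name and the list of hostnames it rewrites are assumptions for illustration, and the sample URL is made up.

```python
from urllib.parse import urlsplit, urlunsplit


def to_old_reddit(url):
    """Rewrite a reddit.com link to old.reddit.com, leaving other URLs alone."""
    parts = urlsplit(url)
    # Assumed set of hosts worth rewriting; anything else passes through.
    if parts.netloc.lower() in ("reddit.com", "www.reddit.com"):
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)


print(to_old_reddit("https://www.reddit.com/r/privacy/comments/abc/"))
```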
Fuck u/spez
They've finally gone full /HailCorporate, become the thing some of the original people of the site would probably not have agreed with in many ways
That is a story as old as time. Greed is strong.
If anyone on Reddit reads that and stays there willingly they are an idiot. Not that they weren't idiots for staying after the API changes, but now they are even bigger idiots.
Aaron Swartz is rolling in his grave
I know it's only token resistance at this point because others have found their comments from Google searches even after their accounts have been deleted, but Power Delete Suite is busy churning away on mine right now.
I wish I had known about Power Delete Suite. I nuked my posts / comments by hand :-(
In case it's useful to more people: https://github.com/j0be/PowerDeleteSuite
Lol
My account was four years old. There was no way I was going to do it by hand. It took PDS 8 hours to churn through all that crap.
I had been meaning to delete my account earlier for opsec reasons, but just hadn't gotten around to it.
I wonder if constantly cycling through it could eat up bandwidth, storage, etc. might be a good way to fuck with them.
I remember people uploading 10 GB files of noise in order to fuck their storage
They permabanned my 14yo account because my anti-nazi rhetoric was "encouraging violence." I guess Nazis are a class of humans dumb enough to give them money so they don't want to scare them off. The post that got me banned had more than 60 up votes when it was deleted and I was permabanned. A reply post in the same vein was not deleted.
Remember that video where Ron Perlman talked about there's a lot of ways to lose a house?
I lost my 11-year account because I said something to the effect of 'If Ron Perlman pulled up and said get in the fucking car we're going to go burn down Bob Iger's house I wouldn't hesitate.'
They had been getting very weird near the end there anyways? I kept getting these stupid warnings over the most petty shit. At one point somebody said respond to this comment and I'll gild you. I simply responded fuck you because I thought it would be funny to see that have gold, which it got. Got an official warning for harassment.
I had said a lot worse over the years.
Same. This.
Edited to add: fuck redtit
Honest question: deleted comments might be just hidden and still up for sale, do people know if GDPR can come to the rescue here?
To be fair, advocating violence on any platform will not get you very far even if the idea is justified, e.g. Nazis
Curiously, Nazis seem to get away with doing just that, under their clear name even! Reported a few of those on Twitter a while ago before Elon's takeover. Got a message that the reports are unwarranted and if I continued to make them they'd disable my ability to report.
I asked what Eisenhower would do if he saw the Nazi marchers in Wisconsin and had ready access to a machine gun. I don't think that is advocating violence. I intended the comment to illustrate how far some Republicans have moved to the right since Ike was president.
Eisenhower is dead. Advocating for his attendance at a Nazi march is nothing more than a thought experiment.
“Pay-Per-Click”, is all this is when you break it down to its basest.
Narwhal developers have come out and said that they have to pay beforehand for clicks to the API. What absolute bullshit Reddit and Spez are bringing to the trough. Spez killed Reddit. Calling it now: a slow, painful, lingering, shitty death.
People will not put up with it once they know what is really going on.
Let em know. “Pay-Per-Click” will not stand.
People will not know what is really going on as they do not care. Reddit will continue to exist.
Ah
Yes
I know. Fark and /. and MySpace all still exist
With all the changes that Reddit has made recently, esp. the API changes, it definitely did leave salt in my mouth, alongside how increasingly toxic the Reddit community had become in comparison to when I joined. But the small niche communities that existed on Reddit did honestly make it harder to quit due to the lack of communities outside, which is another big problem with centralisation, esp. in the modern internet, as it makes you rely on platforms you may not necessarily like due to big issues like social isolation etc.
When I found out about this, it simply wasn’t excusable anymore, and I would rather delete my account than have my personal data sold for profit (which goes completely against the early ethos of Reddit as a whole, but with it being semi-owned by Conde Nast, this would have been inevitable), despite the fact that I had been thinking about deleting my Reddit profile way before this issue.
Surprisingly, I honestly have had no regrets deleting Reddit out of my life and honestly I do wish I would have done it sooner, I’m far less frustrated, I’m starting to think more constructively again and I feel way way less dependent on it.
Can say, I made a good choice there tbh.
Ditto for me, as well. It's just a matter of establishing those 'niche' communities on the Fediverse. The Fediverse has broken thru 10M users. We're getting there. Onward!
Is this a long term source of revenue for Reddit? Or will it lose value at some point, simply because LLMs are all trained sufficiently on user generated content? Is there more to learn at some point?
Also it seems that a lot of content on Reddit is already AI generated, so it would train on data from other LLMs, which I'm sure doesn't improve quality.
It’s the reason I can’t see this stock maintaining or improving its price after the IPO. I mean, sure, there will probably be a short term gain for a few stock holders. But, I just don’t see how it doesn’t tank afterwards. I mean, in the end, Reddit is Reddit. It’s just an aggregation site, how can it grow in value? The fediverse is slowly but surely gaining popularity. And even though Reddit calls itself the front page of the internet, it really isn’t.
*Not investment advice. Good god please don’t take investment advice from me. Knowing my luck that fucking stock will soar to Wall Street record highs, beating out Bitcoin by a large margin.
Supposedly in Reddit finance there's something called the "Anarchy Chess/Ewan gambit". If you post one grain of rice, and double it each time you reach a threshold you can farm near-infinite updoots! Probably works the same with money, idk.
Well, eventually LLMs will need to be fed new misinformation at some point, such as which minority was responsible for their own genocide
If you are planning to kill your reddit account, there is an app, Redact, which is available on the Apple and Play stores, that will allow you to nuke all your posts before you close it completely. Deny them your data.
Surely that just removes the public data.
They will have backups that will retain it all
My thoughts exactly
For better or for worse, Reddit has a super valuable archive that has basically replaced Google search for me, it's insane how many times it has helped me solve small and big issues. I understand the logic, but it would still be a big blow for the internet if many people did that.
I'm in the early stages of becoming a billionaire. Now I just need approximately a smidge less than a billion dollars.
that's great. most of us are more than a billion dollars short of a billion.
Read: that means things are gonna get much worse around here
When Reddit go public we gonna see some serious shit.
Yeah as I have already written the site off, at this point I just kinda wanna see how bad it gets how fast.
I do think it's interesting that a lot of people seem to think AI is going to take away jobs, but understanding AI just a tiny fraction, it seems like the things that are threatened are ones that were already micro serviced away, like internet search.
We use search every day and having the best search engine means being the best tech company. These companies are in a race to topple Google's search dominance through providing AI as a service. There's money in them hills if you can train an AI to recommend when and where to go buy the newest shiny thing that solves all your problems.
monetizing the most racist community outside of twitter what could go wrong?
????
Something something sweet summer childrens
I wonder if they would use the data on all my old accounts that got banned for promoting violence against the billionaire class.
Don't forget kids -- all rights are won through violence.
The forgotten truth nobody wants to remember.
Only people would lose money when they use Reddit.
It took them how many years to monetize their user base? This company is run by complete idiots.
Given that Spez managed to write himself a $193M cheque, I’d say it’s idiots all the way down.
See, most companies would do this before they went public.
yea make the community; somewhere to post the offenders lol
There is, its called [email protected]
Just subscribed, thanks a lot.
Looks like the enshittification of Reddit is about to accelerate. I barely use it anymore, but I kept my two ten year + old accounts intact (one for porn one for legit posts). I’ll probably nuke my non-porn account soon.