Study of 8k Posts Suggests 40+% of Facebook Posts are AI-Generated

It's incredible, for months now I see some suggested groups, with an AI generated picture of a pet/animal, and the text is always "Great photography". I block them, but still see new groups every day with things like this, incredible...

will_a113 reply

I have a hard time understanding facebook’s end game plan here - if they just have a bunch of AI readers reading AI posts, how do they monetize that? Why on earth is the stock market so bullish on them?

WalrusDragonOnABike [they/them] reply

reddthat.com

As long as they can convince advertisers that enough of the activity is real, or enough of the manipulation of public opinion via bots is in Facebook's interest, bots aren't a problem at all in the short term.

acosmichippo reply

surely at some point advertisers will put 2 and 2 together when they stop seeing results from targeted advertising.

SolarMonkey reply

slrpnk.net

I think you give them too much credit. As long as it doesn’t actively hurt their numbers, like x, it’s just part of the budget.

lepinkainen reply

Engagement.

It’s all they measure, what makes people reply to and react to posts.

People in general are stupid and can’t see or don’t care if something is AI generated

acosmichippo reply

they measure engagement, but they sell human eyeballs for ads.

andallthat reply

But if half of the engagement is from AI, isn't that a grift on advertisers? Why should I pay for an ad on Facebook that is going to be "seen" by AI agents? AI don't buy products (yet?)

acosmichippo reply

yes, exactly.

lepinkainen reply

Engagement is eyeballs looking at ads

acosmichippo reply

... unless it's AI masquerading as eyeballs looking at ads.

anomnom reply

They want dumb users consuming ai content, they need LLM content because the remaining users are too stupid to generate the free content that people actually want to click.

Then they pump ads to you based on increasingly targeted AI slop selling more slop.

1984 reply

AI can put together all that personal data and create very detailed profiles on everyone, automatically. From that data, an AI can add a bunch of attributes that are very likely to be true as well, based on what the person is doing every day: work, education, gender, social life, mobile data location, bills, etc.

This is like having a person follow every user around 24 hours per day, combined with a psychologist to interpret and predict the future.

It's worth a lot of money to advertisers of course.

spongebue reply

For me it's some kind of cartoon with the caption "Best comic funny 🤣" and sometimes "funny short film" (even though it's a picture)

Like, Meta has to know this is happening. Do they really think this is what will keep their userbase? And nobody would think it's just a little weird?

Petter1 reply

Well, maybe it is the taste of people still being there.. I mean, you have to be at least a little bit strange, if you are still on facebook…

brucethemoose reply

Engagement is engagement, sustainability be damned.

yarr

feddit.nl

This is a pretty sweet ad for https://originality.ai/ai-checker

They don't talk much about their secret sauce. That 40% figure is based on "trust me bro, our tool is really good". Would have been nice to be able to verify this figure / use the technique elsewhere.

It's pretty tiring to keep seeing ads masquerading as research.

hedhoncho reply

damn no wonder i feel so cheap after scrolling a fb feed for an hour

morrowind

Keep in mind this is for AI generated TEXT, not the images everyone is talking about in this thread.

Also they used an automated tool, all of which have very high error rates, because detecting AI text is a fundamentally impossible task

addie reply

AI does give itself away over "longer" posts, and if the tool makes about an equal number of false positives to false negatives then it should even itself out in the long run. (I'd have liked more than 9K "tests" for it to average out, but even so.) If they had the edit history for the post, which they didn't, then it's more obvious. AI will either copy-paste the whole thing in in one go, or will generate a word at a time at a fairly constant rate. Humans will stop and think, go back and edit things, all of that.

I was asked to do some job interviews recently; the tech test had such an "animated playback", and the difference between a human doing it legitimately and someone using AI to copy-paste the answer was surprisingly obvious. The tech test questions were nothing to do with the job role at hand and were causing us to select for the wrong candidates completely, but that's more a problem with our HR being blindly in love with AI and "technical solutions to human problems".

"Absolute certainty" is impossible, but balance of probabilities will do if you're just wanting an estimate like they have here.

morrowind reply

I have no idea whether the probabilities are balanced. They claim 5% was AI even before chatgpt was released, which seems pretty off. No one was using LLMs before chatgpt went viral except for researchers.
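Whether the errors balance can be checked with a bit of arithmetic. A minimal sketch, assuming a detector with known sensitivity and specificity (the rates below are made up for illustration, not taken from the study): the Rogan-Gladen correction recovers the true share from the flagged share. With equal error rates, the two kinds of mistakes only cancel out when the true share is 50%.

```python
def corrected_prevalence(observed_rate, sensitivity, specificity):
    """Rogan-Gladen estimate of the true share from a noisy detector.

    observed_rate: fraction of posts the detector flagged as AI
    sensitivity:   fraction of real AI posts it flags (1 - false-negative rate)
    specificity:   fraction of human posts it clears (1 - false-positive rate)
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("detector is no better than chance")
    estimate = (observed_rate + specificity - 1.0) / denom
    # Clamp to a valid proportion; noisy inputs can push it out of range.
    return min(1.0, max(0.0, estimate))


# Hypothetical error rates, for illustration only.
for sens, spec in [(0.95, 0.95), (0.80, 0.95), (0.95, 0.80)]:
    print(sens, spec, round(corrected_prevalence(0.40, sens, spec), 3))
```

With these made-up rates, a 40% flagged share corresponds to anywhere from roughly 27% to 47% actually AI-written, so the detector's real error rates matter a lot.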

szczuroarturo reply

programming.dev

I'm pretty sure chatbots were a thing before AI. They certainly weren't as smart, but they did exist.

GenosseFlosse reply

feddit.org

Being a chatbot doesn't mean that they have a real conversation. Some just spammed links from a list of canned responses, or just upvoted the other chatbots to get more visibility, or they just reposted a comment from another user.

ubergeek reply

chat bots have been a thing, for a long time. I mean, a half decently trained Markov can handle social media postings and replies
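A rough idea of what that looks like, as a minimal sketch: a first-order Markov chain that picks each next word based only on the current one. The function names and the tiny corpus are made up for illustration; a real bot would train on far more text.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng):
    """Walk the chain from `start`, choosing each next word at random."""
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word never had a successor
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Illustrative corpus only.
corpus = "great photo of a great dog and a great photo of a cat"
chain = build_chain(corpus)
print(generate(chain, "great", 8, random.Random(0)))
```

Every pair of adjacent words in the output appeared somewhere in the training text, which is why short posts from a bot like this can pass at a glance.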

PumpkinSkink reply

Yeah. This is a way bigger problem with this article than anything else. The entire thing hinges on their AI-detecting AI working. I have looked into how effective these kinds of tools are because it has come up at my work, and independent reviews of them suggest they're, like, 3-5 times worse than the (already pretty bad) accuracy rates they claim, and that they disproportionately flag non-native English speakers as AI-generated. So, I'm highly skeptical of this claim as well.

GuitarSon2024

The other 60% are old people re-sharing it.

WhatYouNeed reply

6% old people re-sharing. The other 54% were bot accounts.

billwashere reply

Ok this made me laugh.

FlashMobOfOne

FB has been junk for more than a decade now, AI or no.

I check mine every few weeks because I'm a sports announcer and it's one way people get in contact with me, but it's clear that FB designs its feed to piss me off and try to keep me doomscrolling, and I'm not a fan of having my day derailed.

Jericho_Kane reply

lemmy.org

I deleted Facebook in like 2010 or so because I hardly ever used it anyway. It wasn't really bad back then, just not for me. Six or so years later a friend of mine wanted to show me something on FB but couldn't find it, so he was just scrolling, and I was blown away by how bad it was: just ads and auto-played videos and absolute garbage. And from what I understand, it just got worse and worse. Everyone I know who uses Facebook now is there for the Marketplace.

ChickenLadyLovesLife reply

My brother gave me his Facebook credentials so I could use marketplace without bothering him all the time. He's been a liberal left-winger all his life but for the past few years he's taken to ranting about how awful Democrats are ("Genocide Joe" etc.) while mocking people who believe that there's a connection between Trump and Putin. Sure enough, his Facebook is filled with posts about how awful Democrats are and how there's no connection between Trump and Putin - like, that's literally all that's on there. I've tried to get him to see that his worldview is entirely created by Facebook but he just won't accept it. He thinks that FB is some sort of objective collator of news.

In my mind, this is really what sets social media apart from past mechanisms of social control. In the days of mass media, the propaganda was necessarily a one-size-fits-all sort of thing. Now, the pipeline of bullshit can be custom-tailored for each individual. So my brother, who would never support Trump and the Republicans, can nevertheless be fed a line of bullshit that he will accept and help Trump by not voting (he actually voted Green).

notgold reply

aussie.zone

Good on him for not falling for the MAGA bulldust and trying for the third option

FlashMobOfOne reply

It's such a cesspit.

I'm glad we have the Fediverse.

Fandangalo

I’ve posted a notice to leave next week. I need to scrape my photos off, get any remaining contacts, and turn off any integrations. I was only there to connect with family. I can email or text.

FB is a dead husk fake feeding some rich assholes. If it’s coin flip AI, what’s the point?

EveningPancakes reply

Back when I got off in 2019, there was a tool (Facebook sponsored somewhere in the settings) that allowed you to save everything in an offline HTML file that you could host locally and get access to things like picture albums, complete with descriptions and comments. Not sure if it still exists, but it made the process incredibly painless getting off while still retaining things like pictures.

Fandangalo reply

Thank you real internet person. You make the internet great.

From Another Real Internet Person

UltraGiGaGigantic reply

Wait, you're not a dog using the internet while the humans are at work?

bassomitron reply

It still existed when I did the same thing a year ago or so. They implemented it awhile back to try and avoid antitrust lawsuits around the world. Though, now that Zuckerberg has formally started sucking this regime's dick, I wouldn't be surprised if it goes away.

adarza reply

Download a copy of your information on Facebook

LanguageIsCool

billwashere reply

These people should be shot. With large spoons. Because it’ll hurt more.

brucethemoose

The bigger problem is AI “ignorance,” and it’s not just Facebook. I’ve reported more than one Lemmy post the user naively sourced from ChatGPT or Gemini and took as fact.

No one understands how LLMs work, not even on a basic level. Can’t blame them, seeing how they’re shoved down everyone’s throats as opaque products, or straight up social experiments like Facebook.

…Are we all screwed? Is the future a trippy information wasteland? All this seems to be getting worse and worse, and everyone in charge is pouring gasoline on it.

Petter1 reply

*where you think they sourced from AI

you have no proof other than seeing ghosts everywhere.

Don't get me wrong, fact-checking posts is important, but you have no evidence whether it is AI, a human brain fart, or targeted disinformation 🤷🏻‍♀️

brucethemoose reply

No I mean they literally label the post as “Gemini said this”

I see family do it too, type something into Gemini and just assume it looked it up or something.

Petter1 reply

I see no problem if the poster discloses that the source is AI. This automatically devalues the content of the post/comment and should trigger the reaction that the information is to be taken with a grain of salt and needs to be fact-checked to improve the likelihood that what was written is fact.

An AI output is most of the time a good indicator about what the truth is, and can give new talking points to a discussion. But it is of course not a “killer-argument”.

brucethemoose reply

The context is bad though.

The post I'm referencing is removed, but there was a tiny “from gemini” footnote in the bottom that most upvoters clearly missed, and the whole thing is presented like a quote from a news article and taken as fact by OP in their own commentary.

And the larger point I’m making is this poor soul had no idea Gemini is basically an improv actor compelled to continue whatever it writes, not a research agent.

My sister, ridiculously smart, professional and more put together than I am, didn’t either. She just searched for factual stuff from the Gemini app and assumed it’s directly searching the internet.

AI is a good thinker, analyzer, spitballer, initial source and stuff yes, but it’s being marketed like an oracle and that is going to screw the world up.

Petter1 reply

I agree 😇

brucethemoose reply

👍

Pennomi reply

> No one understands how LLMs work, not even on a basic level.

Well that’s just false.

Traister101 reply

Educate my family on how they work then please and thanks. I've tried and they refuse to listen, they'd prefer to trust the lying corpos trying to sell it to us

Pennomi reply

“Your family” isn’t who I was talking about. Researchers and people in the space understand how LLMs work in intricate detail.

Unless your “no one” was colloquial, then yes, I totally agree with you! Practically no one understands how they work.

noodle (he/him) reply

colloquially, no one enjoys a pedant

Pennomi reply

Except bureaucrats, of course.

brucethemoose reply

You know what I meant, by no one I mean “a large majority of users.”

Pennomi reply

I did not know that. There’s a bunch of news articles going around claiming that even the creators of the models don’t understand them and that they are some sort of unfathomable magic black box. I assumed you were propagating that myth, but I was clearly mistaken.

ZILtoid1991

> uses ai slop to illustrate it

harmsy reply

The most annoying part of that is the shitty render. I actually have an account on one of those AI image generating sites, and I enjoy using it. If you're not satisfied with the image, just roll a few more times, maybe tweak the prompt or the starter image, and try again. You can get some very cool-looking renders if you give a damn. Case in point:

Petter1 reply

😍this is awesome!

A friend of mine has made this with your described method:

PS: 😆 the laptop in the illustration in the article! Someone did not want to pay for a high-end model and did not want to take any extra time either…

Draces reply

Seems like an appropriate use of the tech

ChickenLadyLovesLife reply

That laptop lol.

surph_ninja

I wouldn’t be surprised, but I’d be interested to see what they used to make that determination. All of the AI detectors I know of are prone to a lot of false positives.

FundMECFS

This kind of just looks like an ad for that company's AI detection software NGL.

Initiateofthevoid reply

lemmy.dbzer0.com

billwashere reply

Thank you. I’ve wondered the same thing. I mean the whole goal of the LLMs is to be indistinguishable from normal human-created text. I have a hard time telling most of the time. Now the images I can spot in a heartbeat. But I imagine that will change too.

WhatSay

slrpnk.net

I was wondering who Facebook was for, good to know AI has low standards

Eezyville reply

Dead internet theory

SocialMediaRefugee

In the last month it has become a barrage. The algorithms also seem to be in overdrive. If I like something I get bombarded with more stuff like that within a day. I'd say 90% of my feed is shit that has nothing to do with anyone I know.

If it wasn't a way to stay in touch with family and friends I'd bail.

ThomasCrappersGhost reply

A friend told me he saw 16 posts before he saw a post from a friend or page he’d liked.

billwashere reply

I’m not surprised. And of those 16 posts how many of them made him mad? Since that seems like the entire purpose of FB anymore. Anger drives engagement. It’s why rage bait works so well. I highly recommend everyone disconnect from Facebook for this reason. Hell Reddit was even going down that path before we all left.

crozilla reply

SocialFixer would help.

UnderpantsWeevil reply

I'm a big fan of a particular virtual table-top tool called Foundry, which I use to host D&D games.

The Instagram algorithm picked this out of my cookies and fed it to Temu, which determined I must really like... lathing and spot-welding and shit. So I keep getting ads for miniature industrial equipment. At-home tools for die casting and alloying and the like. From Temu! Absolutely crazy.

SocialMediaRefugee reply

I made the mistake of clicking like on an Indian machine shop (I admired how they made do with crude conditions). Well now I get bombarded with not just those videos but Mexican welding shops, Pakistani auto repair places...

TheBrideWoreCrimson

sopuli.xyz

Thanks.
Now do Reddit comments.

ThomasCrappersGhost reply

There’s an AI reply option now. Interested to know how far that is off just being part of the regular comments.

Treczoks

And 58.82% are likely generated by human junk then.

Lexam

If you want to visit your old friends in the dying mall, go to Feeds, then Friends. That should filter everything else out.

brucethemoose

Also… the tremendous irony here is Meta is screwing themselves over.

They've hedged their future on AI, and are smart enough to release the weights and fund open research, yet their advantage (a big captive dataset, aka Facebook/Instagram/WhatsApp users) is completely overrun with slop that poisons it. It’s as laughable as Grok (X’s AI) being trained on Twitter.

SlopppyEngineer reply

Meta is probably screwed already. Their user base is not growing as before, maybe shrinking in some markets, and they need the padding to cover it up.

brucethemoose reply

Very true.

But also so stupid because their user base is, what, a good fraction of the planet? How can they grow?

SlopppyEngineer reply

38% of the population as users. 20% daily active users. The classic way to grow is to squeeze the users and advertisers more and more with fees, subscriptions, tiers, ... I guess the exodus at X has them spooked about what could happen if they continue with that plan, so they're trying this AI thing.

Shardikprime

8k posts sounds like 0.00014 percent of Facebook posts

billwashere reply

It probably is but it’s a large sample size and if the selection is random enough, it’s likely sufficient to extrapolate some numbers. This is basically how drug testing works.

Suburbanl3g3nd reply

lemmings.world

And statistical analysis. The larger the universe, the smaller the fraction of it a true random sample needs to cover.

Opinionhaver

The title says 40% of posts, but the article says 40% of long-form posts, yet doesn't in any way specify what counts as a long-form post. My understanding is that the vast majority of Facebook posts are about the length of a tweet, so I doubt that the title is even remotely accurate.

will_a113 reply

Yeah, the company that made the article is plugging their own AI-detection service, which I'm sure needs a couple of paragraphs to be at all accurate. For something in the range of just a sentence or two it's usually not going to be possible to detect an LLM.

venusaur

Probably on par with the junk human users are posting

Treczoks reply

Hmm, "the junk human users are posting", or "the human junk users are posting"? We are talking about Facebook here, after all.

ILikeBoobies

That’s an extremely low sample size for this

thisbenzingring reply

lemmy.sdf.org

8,855 long-form Facebook posts from various users using a 3rd party. The dataset spans from 2018 to November 2024, with a minimum of 100 posts per month, each containing at least 100 words.

seems like thats a good baseline rule and that was about the total number that matched it

ILikeBoobies reply

With apparently 3 billion active users

Only summing up 9k posts over a 6 year stretch with over 100 words feels like an outreach problem. Conclusion could be drawn that bots have better reach

thisbenzingring reply

lemmy.sdf.org

each post has to be 100 words with at least 100 posts a month

how many actual users do that?

ILikeBoobies reply

I have no idea because I don’t use the site

But to say less than 0.0001% just seems hard to believe

thisbenzingring reply

lemmy.sdf.org

I don't use the site either but 100 words is a lot for a facebook post

lipilee

feddit.nl

and, is the jury already in on which ai is most fuckable?

UnderpantsWeevil reply

I'd tell you, but my area network appears to have already started blocking DeepSeek.

WhatYouNeed reply

https://www.theregister.com/2025/01/30/deepseek_database_left_open/

DeepSeek, which was not encrypting data

dan reply

This doesn't have anything to do with encryption. They had a public database (anyone on the internet could query it) and forgot to put a password on it. It really shouldn't even be public.

UnderpantsWeevil reply

According to Wiz, DeepSeek promptly fixed the issue when informed about it.

:-/

werefreeatlast

Not my Annie! No! Not my Annie!

fwdbias

Deleted my account a little while ago but for my feed I think it was higher. You couldn't block them fast enough, and mostly obviously AI pictures that if the comments are to be believed as being actual humans...people believed were real. It was a total nightmare land. I'm sad that I have now lost contact with the few distant friends I had on there but otherwise NOTHING lost.

ToiletFlushShowerScream

Take note this does not appear to be an independent study. Tell me I'm wrong?

transfluxus

leminal.space

Considering that they do automated analysis, 8k posts does not seem like a lot. But still very interesting.

If you could reliably detect "AI" using an "AI" you could also use an "AI" to make posts that the other "AI" couldn't detect.

xor reply

Sure, but then the generator AI is no longer optimised to generate whatever you wanted initially, but to generate text that fools the detector network, thus making the original generator worse at its intended job.

Don_alForno reply

feddit.org

I see no reason why "post right wing propaganda" and "write so you don't sound like "AI" " should be conflicting goals.

The actual argument why I don't find such results credible is that the "creator" is trained to sound like humans, so the "detector" has to be trained to find stuff that does not sound like humans. This means, both basically have to solve the same task: Decide if something sounds like a human.

To be able to find the "AI" content, the "detector" would have to be better at deciding what sounds like a human than the "creator". So for the results to have any kind of accuracy, you're already banking on the "detector" company having more processing power / better training data / more money than, say, OpenAI or google.

But also, if the "detector" was better at the job, it could be used as a better "creator" itself. Then, how would we distinguish the content it created?

xor reply

I'm not necessarily saying they're conflicting goals, merely that they're not the same goal.

The incentive for the generator becomes "generate propaganda that doesn't have the language characteristics of typical LLMs", so the incentive is split between those goals. As a simplified example, if the additional incentive were "include the word bamboo in every response", I think we would both agree that it would do a worse job at its original goal, since the constraint means that outputs that would have been optimal previously are now considered poor responses.

Meanwhile, the detector network has a far simpler task - given some input string, give back a value representing the confidence it was output by a system rather than a person.

I think it's also worth considering that LLMs don't "think" in the same way people do - where people construct an abstract thought, then find the best combinations of words to express that thought, an LLM generates words that are likely to follow the preceding ones (including prompts). This does leave some space for detecting these different approaches better than at random, even though it's impossible to do so reliably.

But I guess really the important thing is that people running these bots don't really care if it's possible to find that the content is likely generated, just so long as it's not so obvious that the content gets removed. This means they're not really incentivised to spend money training models to avoid detection.

quenemm

You know what they say about Al...

not_IO

how tf did it take 6 years to analyze 8000 posts

Hildegarde reply

I'm pretty sure they selected posts from a 6 year period, not that they spent six years on the analysis.

billwashere reply

I can’t even fathom how they would go about testing if it’s an AI or not. I can’t imagine that’s an exact science either.

dan reply

In that case, how/why did they only choose 8000 posts over 6 years? Facebook probably gets more than 8000 new posts per second.

Hildegarde reply

Every study uses sampling. They don't have the resources to check everything. I have to imagine it took a lot of work to verify conclusively whether something was or was not generated. It's a much larger sample size than a lot of studies.

dan reply

> I have to imagine it took a lot of work to verify conclusively whether something was or was not generated

The study is by a company that creates software to detect AI content, so it's literally their whole job

(it also means there's a conflict of interest, since they want to show how much content their detector can detect)

> It’s a much larger sample size than a lot of studies.

It's an extremely small proportion of the total number of Facebook posts though. Nowhere near enough for statistical significance.

tal reply

https://en.wikipedia.org/wiki/Sampling_(statistics)

> It's an extremely small proportion of the total number of Facebook posts though. Nowhere near enough for statistical significance.

The proportion of the total population size is almost irrelevant when you use random sampling. Random sampling doesn't rely on examining a large portion of the population; rather, it becomes increasingly unlikely for the sample to deviate dramatically from the population as the number of samples rises. This is a function of the number of samples you take, decoupled from the population size.

Usually if you see a major poll in a population, it'll be something like 1k to 2k people who get polled, regardless of the population size.
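The standard margin-of-error formula makes this concrete. A minimal sketch, assuming a simple random sample (which the study's sample may not be) and using the thread's figures of 8,855 posts and a 40% share; the function name and the population sizes are just illustrative:

```python
import math


def margin_of_error(p, n, population=None, z=1.96):
    """Approximate 95% margin of error for a sampled proportion.

    p: observed proportion in the sample
    n: number of samples
    population: total population size, or None to treat it as infinite
    z: z-score for the confidence level (1.96 is roughly 95%)
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: only matters when n is a
        # noticeable fraction of the population.
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


# Same sample, very different (hypothetical) population sizes.
for population in (100_000, 3_000_000_000, None):
    print(population, round(margin_of_error(0.40, 8855, population), 4))
```

The result is about one percentage point either way, and it is effectively identical whether the population is in the billions or infinite. This only covers sampling error, not whether the detector itself is right.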

prole reply

I was wondering how far I'd have to scroll before getting to someone who doesn't understand statistics complaining about the sample size...

dan reply

There's likely been trillions of posts on Facebook during that time frame. Is a sample size of 8000 really sufficient for a corpus that large?

prole reply

Have you ever heard of "margin of error"?

Learn statistics, it's actually super informative.

Jack-A-Noodle

Anyone on Facebook deserves to be shit on by slop. They also deserve to be scammed out of all of their money and anything else.

If you’re on Facebook, you deserve this. Get the hell off Facebook.

Edit: itt: brain, dead, and fascist apologist Facebook Earth, who just refuse to accept that their platform is one of the biggest advent of Nazi fascism in this country, and they are all 100% complicit.

Rekall Incorporated reply

While I agree with your message at a high level (I quit FB several years ago), I don't think it's productive to be so abrasive.

It's generally better to be respectful and convincing if you want to change minds.

Flying Squid reply

Have you ever successfully berated a stranger into doing what you wanted them to do?

Opinionhaver reply

Edit: itt: brain, dead, and fascist apologist Facebook Earth, who just refuse to accept that their platform is one of the biggest advent of Nazi fascism in this country, and they are all 100% complicit.

This is some Facebook-quality content you're bringing to us here. It's so great seeing this kind of post on my feed first thing in the morning. Shows that it's not just AI poisoning our social media platforms.

Jack-A-Noodle reply

You’ve made an excellent case and argument for both ditching all traditional, social media, but also that they are all intrinsically shitty and evil.

If you can’t bring yourself to break away from techno fascism, why should I have any pity for you?

I am not responsible for your apathy nor your weakness. When you gargle the balls of fascism, don’t be surprised when others come and point out how shitty that is.

Ilovethebomb reply