AI Models Show Signs of Falling Apart as They Ingest More AI-Generated Data | Spyke

fuck_ai · Fuck AI · by ThefuzzyFurryComrade

AI Models Show Signs of Falling Apart as They Ingest More AI-Generated Data

https://futurism.com/ai-models-falling-apart

435

Another problem I realized today is the proliferation of data that was originally hallucinated by AI.

I was discussing an issue with a piece of software with a coworker, and he asked an AI for help configuring around it. He then sent me "apparently we can try changing this setting to this value". I told him to first validate that the setting really existed, because AI tends to make up things like that when it's what you would want to hear, and running a test would take us 20~30 minutes.

He found some discussions about that setting not working as people expected. "ok at least it exists then" and we tried it. It didn't work. I later cloned the source of that software and checked, the setting didn't exist - ever.
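For anyone wanting to do the same check, here is a minimal sketch of searching a cloned source tree for a setting name before spending time on a test. The function name `find_setting`, the extension list, and the example setting names are all made up for illustration; a plain `grep -rn` over the checkout does the same job.

```python
import os

def find_setting(root, setting_name, extensions=(".py", ".c", ".h", ".conf", ".md")):
    """Return (path, line_number, line) for every line that mentions setting_name."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # str.endswith accepts a tuple, so this checks all extensions at once.
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if setting_name in line:
                            hits.append((path, number, line.strip()))
            except OSError:
                continue  # unreadable file: skip it rather than abort the search
    return hits
```

An empty result is not absolute proof (the name could be built dynamically or live in a file type not listed), but it is a much stronger signal than forum posts complaining that the setting "doesn't work".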

100

alaphic reply

I love that you even specifically said, "Yea, let's check to make sure that setting exists to begin with." To which instead of actually fucking checking, they proceed to google more about the setting and use someone else's 'discussion' online of it not working as proof that it does exist, even though they were likely having that discussion because the setting didn't exist.

This is also how I can tell this story is 100% true.

I don't miss working support at all and am reminded of it like this daily

92

Clent reply

lemmy.dbzer0.com

The benefit of working with open source code bases or being able to check the source for existing features.

It's very common for there to be hidden settings. With open source one can look at the code base, but with closed source a search may be one's only hope.

2

Garbage in, garbage out.

Who could have possibly predicted that?

68

Kornblumenratte reply

The recycling industry begs to differ. Well, exceptions prove the rule.

11

jagged_circle reply

Depends what materials you're recycling. Glass and plastic both require virgin material, else you'd get garbage out.

9

dutchkimble reply

Paper works up to 6 times at best, if someone is able to track the same batch. But recycling paper uses more energy than using virgin, and if virgin paper is sourced from a sustainable place like Canada, then recycled paper is actually worse for the environment because of the energy thing and de-inking water waste. Also the timber is actually cut for housing, and only the edges of the logs are used to make chips for making paper. So trees aren’t being cut solely for paper (from sustainable countries). Until we meet again (insert Skeletor running away)

16

The_Decryptor reply

Glass and plastic both require virgin material, else you’d get garbage out.

Everything I'm reading suggests the problem with glass recycling is contamination, and that once that's accounted for what's left over can be infinitely recycled without quality loss.

5

jagged_circle reply

Source? I've never found a factory that does this in practice. They all add virgin material

2

The_Decryptor reply

Unfortunately I'm not finding much explicit information on the specific processes, just that it's possible.

Wikipedia says it's infinitely recyclable
So does this waste disposal place
The Australian CSIRO also says it's "endless"
Alternatively, this PDF from a state government here says it's often mixed in with virgin materials, but doesn't mention why.

Now of course just because it can be recycled indefinitely, doesn't mean it is in practice. Could be contamination, colouring, or just plain cost.

2

jagged_circle reply

I don't think "infinitely recyclable" means it doesn't need virgin material. I don't know what that would be called

2

dalekcaan reply

This has always been true, but LLMs have expedited the process by taking the garbage out and sticking it right back into the input.

3

It's very tempting to have schadenfreude about this failure but also disgusting that so much has been invested in it that should have been put to better use.

It's just another example of a system whose narrow definition of success is taking human and environmental value and using it to extract more. It's not aimed at solving worthwhile problems or making things better, which is why people are becoming more miserable and the planet is getting wrecked.

You could say that it's the system we live in which is the AI, feeding on itself and becoming more sick.

46

chonglibloodsport reply

The schadenfreude is what we’re here for! We can’t do anything about the waste of investors’ money. They could’ve spent it all on fireworks instead. That probably would’ve been more fun!

As for the system? I prefer not to think about it. Too much systemic thinking is bad for mental health. Much better to enjoy some schadenfreude and save your serious thinking energy for things you have the power to change, especially where they can make life better for you and those around you.

10

Churbleyimyam reply

I agree with all your points! What I will add though is that what we think of as 'investors money' is actually value that has been extracted from the environment and from workers.

8

leadore reply

In my case it's not so much schadenfreude as just wanting this nightmare era to end as quickly as possible. The sooner this LLM shit dies the sooner we can start to recover and move on, in terms of stopping the senseless waste of water and energy and maybe starting to rebuild some kind of useful internet.

6

avattar reply

3

So, reading this article, it's not about model collapse, but about RAG - letting the AI model google the question essentially. The problem is, the first 10 pages of google search results are all low effort adfarming slop sites, because of course it is, which is making the answers from the AI worse, as these slop sites often have incorrect or otherwise unproofed articles, which biases the AI to fork out the wrong answer.

I'm sure the major AI services will try and fix this with some slop site detection routines.

40

frunch reply

I'm sure the major AI services will try and fix this with some slop site detection routines.

Which will be run by AI 🙃

20

melechric reply

Don't forget! A lot of the slop on those first few pages of results is AI-generated.

Ouroboros is a very apt moniker for this phenomenon.

14

avattar reply

We need a new, stronger name for this. Like shit ouroboros, or shouroboros. Yes, AI eating its own shit and then regurgitating it is shouroboros.

1

ℍ𝕂-𝟞𝟝 reply

some slop site detection routines.

Why would they? I mean how are their incentives different from that of the search engine operators themselves?

I can see a future when the internet is degraded to a point where if you try to find out how to peel an apple, you will get back word salad and 25 different porn ads.

5

postmateDumbass reply

Mad AI Disease.

4

JeremyHuntQW12 reply

2

You would probably get better results from literally any other AI; Gemini is routinely the worst. I don't know what Google are playing at. Surely they could actually put some real effort into this, but they just seem to be doing it in the most naive way possible.

It comes to something when the Chinese are being the most innovative.

1

MrSilkworm reply

I'm sure the major AI services will try and fix this with some slop site detection routines.

No they will not, because this will harm their short term bottom line, which is always, "add short term value for the shareholder"

2

Blaster M reply

Unless the shareholder also owns the slop site, it's competition for revenue, and filtering it out is in their interest.

4

avattar reply

Plus, it's not an easy task.

2

Good. Eat yourself you technological prion disease.

35

masta_chief reply

sh.itjust.works

This is a rare insult and I like it

7

Good. Poison the AI well. Rot this shit to the ground.

35

sh.itjust.works

Aww boo hoo, did someone generate a degenerative feedback loop? Yeah? Did someone make a big ol' oppsy whoopsy that's gunna accelerate in hallucinations and slop as it collapses in on itself? How's the coded version of a microphone whine going to go, you silly buttholes?

33

Wilco reply

People are putting AI generated pitfalls to guard their content.

They reference nonsense links that usually cannot even be seen by normal users, the AI reads the pages and finds more garbage links even as more are generated by the site.
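A rough sketch of how such a link maze can work: every page is generated from a hash of its own path, so the site never stores anything yet always has more links to offer. The `/maze/` prefix, the function name `maze_page`, and the `display:none` hiding are assumptions made for illustration, not how any particular tool does it.

```python
import hashlib

def maze_page(path, links_per_page=5):
    """Build an HTML page whose hidden links are derived from a hash of `path`.

    A SHA-256 hex digest has 64 characters, so with 8-character chunks
    this sketch supports at most 8 links per page.
    """
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    # Slice the digest into fixed-width chunks; each chunk names another maze page.
    chunks = [digest[i * 8:(i + 1) * 8] for i in range(links_per_page)]
    links = "\n".join(
        # display:none keeps the links invisible to ordinary visitors.
        f'<a href="/maze/{chunk}" style="display:none">{chunk}</a>'
        for chunk in chunks
    )
    return f"<html><body>\n{links}\n</body></html>"
```

Because the output depends only on the path, the same URL always returns the same page, and every link on it leads to another page built the same way.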

5

_druid reply

sh.itjust.works

It's just so unfortunate that, in causing AI to delve down these winding paths, to propagate these slopfest feedback loops, the computers that are running the AI are burning real resources, polluting our atmosphere.

Unfortunate is not the right word to describe the deep lament I feel, to cause such destruction for so little, if any, gain at all. My heart is heavy with regret for us all. Not just you and I, but for beast, bird, plant as well. Such a shame.

5

Ffs, neural networks and LLMs have their place and can be useful, but setting up datacentres that snort up the entire internet indiscriminately to create a glorified chatbot that spews data that may or may not be correct is insane.

26

Oh no! I HOPE us Taxpayers can Bail Out these AI Companies when they go Under! AFTER ALL we CUT my Child's LIFESAVING MEDICATION so I KNOW we have the Funds to Help these Poor Billionaire CEOS!

25

Etterra reply

I can't afford groceries now! I'm sure all those billionaires will help us out now that they've got a little bit more though.

7

corsicanguppy reply

volume::normalize(that)

5

utopiah reply

Help these Poor Billionaire CEOS!

Right, self-made billionaires for whom the way to success was already paved by subsidies. Yes, those surely need help to "build" absolutely pointless non-working projects that are supposed to "save humanity". That's great. /$

3

SocialMediaRefugee

I predicted this. It is similar to a photocopy of a photocopy that eventually ends up a mess of garbage noise.
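The photocopy analogy can be shown with a toy simulation. This is only a sketch of diversity loss under repeated resampling, not a model of how any real system is trained: each "generation" is built purely by sampling from the previous one, and the count of distinct items can only stay the same or shrink. The function name and parameters are made up for the example.

```python
import random

def photocopy_generations(originals, generations, seed=0):
    """Resample a collection from itself repeatedly and track how many
    distinct items survive in each generation."""
    rng = random.Random(seed)
    current = list(originals)
    distinct_counts = [len(set(current))]
    for _ in range(generations):
        # Each generation only ever sees copies of the previous generation.
        current = [rng.choice(current) for _ in current]
        distinct_counts.append(len(set(current)))
    return distinct_counts
```

Calling `photocopy_generations(range(100), 20)` returns 21 counts starting at 100. Nothing new can enter, so once an item is lost it never comes back.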

25

19

The silver lining of AI slop filling the WWW

18

AngryCommieKender

Cue Price is Right failure trombone.

18

shalafi reply

Hah! I heard that in my head!

5

I've been predicting this for a while now and people kept telling me I was wrong. Prepare for dot com burst two, electric boogaloo.

15

bthest reply

I hope it crashes but what if the market completely embraces feels-based economics and just says that incomprehensible AI slop noise is what customers crave? Maybe CEOs will interpret AI gibberish output in much the same way as ancient high priests made calls by sifting through the entrails of sacrificed animals. Tesla meme stock is evidence that you can defy all known laws of economic theory and still just coast by.

3

liang reply

2

There is a solution to this. Make a **perfect** AI detecting tool. The only way I can think of is through adding a tag to every bit of AI-generated data.

Though it could easily be removed from text, I guess. And no, training AI to recognize AI will never work. Also every model would have to join this, or it won't work.
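To make the weakness concrete, here is a minimal sketch of a plain-text tag and how little it takes to strip it. The marker string `[ai-generated]` and all three function names are hypothetical; this is not any real labeling standard.

```python
AI_TAG = "[ai-generated]"  # hypothetical marker, not a real standard

def tag_output(text):
    """Append the marker to a piece of generated text."""
    return f"{text} {AI_TAG}"

def is_tagged(text):
    """Detection is trivial while the marker is still present."""
    return AI_TAG in text

def strip_tag(text):
    """One string replace defeats the whole scheme."""
    return text.replace(AI_TAG, "").rstrip()
```

Detection works only as long as nobody runs the equivalent of `strip_tag`, which is the problem with any tag that lives in the text itself.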

15

Etterra reply

LOL you're suggesting people already doing something unbelievably stupid should do something smart to compensate.

10

mutual_ayed reply

sh.itjust.works

4

bthest reply

Also people won't be able to pass AI work off as their own if it is labeled as such. Cheating and selling slop is the chief use for AI so any tag or watermark will be removed on the vast majority of stuff.

There's also liability. If your AI generates code that's used to program something important and a lot of people are injured or die, do you really want a tag that can be traced back to the company to be on the evidence? Or slapped all over the child sex abuse images that their wonderful invention is churning out?

1

Him: Ugh I don't feel so good after all this data.

Her: Data is a nutritious source of information. However, ingesting too much data can trigger some unpleasant side effects. Here's what you can do to alleviate some of the symptoms:

Drink water
Lie down and rest
Listen to the sounds of nature

Is there anything else I can do for you?

13

AI ingesting AI slop and falling apart is not dissimilar to boomers ingesting rightwing slop and falling apart.

12

Good.

11

How much money was invested in reminding us that if the snake starts eating its tail it's eating itself?

10

Fill up your free cloud services with AI-generated info. I mean thousands of text files, like "how to make homemade butterfly". All of them will get scraped by AI.

9

makes me think about the human centipede

8

HootinNHollerin

lemmy.dbzer0.com

Human cent-ipad.png

8

AbnormalHumanBeing reply

lemmy.abnormalbeings.space

That is so much better than their attempt (the "Lord of the Flies for AI" byline). Captures the essence of the problem better than the ~~capitalism~~ cannibalism metaphor does, as well.

EDIT: That has to have been one of my favourite Freudian ADHD word-confusion typos I accidentally made there

4

discuss.tchncs.de

Cuttlefish or asparagus?

2

AI dementia

7

Dizzy Devil Ducky

If I had the money and a computer able to handle the amount of stuff I'd be throwing at it with a local model, I would have a giant website full of AI generated nonsense purely for the purpose of letting AI gobble it up to help the AI incest problem.

Imagine if a whole metric ton of "websites" did this. The thieving AI companies would either have to start blocking all of these sites or deal with an issue they don't wanna because they're too stingy and will probably just have their AI try ( and fail ) to fix the problem.

7

ThefuzzyFurryComrade reply

Better yet, feed the nonsense to the crawlers that ignore robots.txt.
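A minimal sketch of that idea using Python's standard `urllib.robotparser`: any request for a path that robots.txt disallows must come from a client ignoring the rules, so it gets the decoy. The `/trap/` path, the example bot name, and the helper names are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Example rules: every crawler is asked to stay out of /trap/ (hypothetical path).
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""

def build_parser(robots_text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

def choose_response(parser, user_agent, url):
    """Return 'decoy' if robots.txt forbids this URL for this agent, else 'real'."""
    if parser.can_fetch(user_agent, url):
        return "real"
    return "decoy"
```

Well-behaved crawlers never request `/trap/`, so only the ones that ignore robots.txt ever see the nonsense.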

5

Dizzy Devil Ducky reply

That was certainly cool. Now I wish I could use tools like that nepenthes on my neocities page.

1

Tragic and funny at the same time. As if consuming all of Reddit hadn’t already irreparably skewed things and that was still real people doing Reddit things. Now, released, it’s eating itself. This self-poisoning model seemed inevitable.

6

The great news is that these Ponzi schemes will either collapse or spend the next decade trying to fix it by creating algorithms which detect AI content so as to filter it out.

5

queermunist she/her

Snake eating its own ass.

5

Peppycito reply

sh.itjust.works

Ouroboros can have a little piece of ouroboros, as a treat.

5

Match!! reply

much hotter of a mental image than this deserves

-1

Human society does the same thing.

We're ok when we talk about what we saw.

Less so when we talk about what somebody else saw.

Crazier and crazier when we talk about what somebody said about what somebody said about what somebody saw. Which is arguably the internet.

4

guyoverthere123

lemmy.dbzer0.com

good news everybody!

2

you realize what this means, right?

who is causing all the backwashed data? the peasants.

who is training the models? the peasants.

who benefits the most from AI? the oligarchy.

I bet in a year or two, access to AI will be cost prohibitive and will be illegal to host without an expensive license.

how does this benefit the oligarchy you ask?

because the oligarchy is the government now, and AI needed the support of the peasants to get infrastructure up and running well enough to run on its own.

they're just going to use AI to oppress the peasants and ensure they know their place as slave labor.

congrats everyone who supported AI by praising and promoting it as a solution, you fucked yourself.

-3

Michael reply

We have accessible, open-source AI models - your predictions won't come to pass.

4

GreenKnight23 reply

-3

Michael reply

Fortunately, they can't arrest everybody using open-source AI models. There are clear efforts to stop momentum with geo-tracking high-end GPUs and indirect efforts like the EU plan trying to backdoor everything.

Personally, I see it all as ineffective.

0

GreenKnight23 reply

what about this current administration is effective?

I think you're under the misconception that standard legal rules apply with the current government.

1

Michael reply

There's a whole world out there - if anybody can effectively run these models, how will they know to stop everyone?

The current US administration and sphere of influence/power may be tyrannical, but they aren't omnipresent or omniscient - even if they try to be.

For example, I highly doubt China will be able to be stopped before they burst the AI dam. Honestly, they already have - these AI companies are just in denial because they need more capital for their proprietary, inefficient, and centralized models.

0

ℍ𝕂-𝟞𝟝 reply

who is causing all the backwashed data? the peasants.

No, actually, it's the shitty slop sites. I mean they are usually not made by Big Tech, but it is also not your rando Twitter posts either.

I bet in a year or two, access to AI will be cost prohibitive and will be illegal to host without an expensive license.

I can run a Chinese model on my sub-1000 EUR GPU right now and generate all the word salad I want. I know, I know, they will make better models. But that's the point, if they lock away better models, all the slop will be made with the worse models.

The point is, all this means is that you can't infinitely train AI on random internet content, and the value of social media as an AI training data source is going down since they are also getting infected with slop. This is actually a good thing, because one way SaaS models could have gotten better than freely hostable ones is by having access to data that is not openly accessible.

This news means that data they could have used as a differentiator is a pile of hot shit.

2

GreenKnight23 reply

you're a peasant and don't even realize it because you're not a part of the "club". same as all those slop sites. they aren't part of the club and so they're lowly peasants.

there were talks of making those Chinese models illegal. not much harder to just say anyone that's not in the club can't have one either, and if you're caught you go to jail.

-4

ℍ𝕂-𝟞𝟝 reply

Yeah, but how do you even make them illegal? Most of them are fly-by-night places, you can use a 600 EUR GPU to generate slop with a 4 gig model, the worse it is the more it hurts data collection.

They couldn't even get rid of phone farms. Cat's out of the bag.

-2

GreenKnight23 reply

did you know that most Texas Instruments software and hardware is illegal to use if you're not using it for the further advancement of American interests?

and if you're caught you can face prison time and possibly even visit a black site if you're charged under the espionage act.

does it happen? sure. will you get caught? maybe not... but they don't go looking for people unless they're bad people.

this current administration will target every average citizen that isn't affiliated with one of the oligarchs before they target the actual bad guys.

0

RandomVideos reply

programming.dev

How would they stop people going to a different country where the AI license doesn't exist?

0

GreenKnight23 reply

ever heard of the pirate bay? they certainly got fisted by the long arm of American Jurisprudence even though they weren't in the US...

1

RandomVideos reply

programming.dev

Piracy is still extremely popular in countries where it isn't enforced

0

GreenKnight23 reply

yes, keep moving those goal posts.

0

RandomVideos reply

programming.dev

I am not moving the goal post

The people in countries where it isn't illegal would post them to social media and hurt AI training

1

AI, the one currently used for actual productive work by scientific researchers, healthcare specialists, energy development, manufacturing, agriculture and such, is poised to be able to handle about 20% of all human related work by 2040.

By 2043, it will be able to handle 100% of any human related work in the fields. The takeoff is merely 3 years

It's fine if you guys want to live in a little mental bubble where this doesn't happen

But I'd suggest you start getting ready for what comes next.

-19

Stern reply

Oh boy I can't wait for our currently robust social safety net and already existent universal basic income to allow us to live a life pursuing the things that make us happy, rather than multi-billionaires firing everyone and the world becoming a plutocracy where the average person struggles to get even the bare minimum.

9

ProgrammingSocks reply

You sound exactly like Christian doomsday cultists screaming about the end times. I'll believe it when I see it.

8

Seefoo reply

Should cite sources for this if you want it taken seriously.

7

The source is a research paper that the AI community has been going on about for a few days now. I can't link to it right now because I'm at work but I'll update when I can.

But if you Google for it you will find it as it's been a fairly hot topic the last few days.

1

postmateDumbass reply

And what comes next is?

Death?

6

discuss.tchncs.de

you were promised it will. You paid out the ass for it. Money's gone, that shit ain't happening :)

6

NιƙƙιDιɱҽʂ reply

Oddly specific but okay

5

dan00 reply

I think you should post sources for your claims. This sounds stupidly wrong. Are you American?

4

wondrous_strange reply

How exactly?

1

orcrist reply

1