Spyke
sh.itjust.works

Of course it is, it’s essentially a scam. They just need enough humans to keep investing until they check out and run with a bailout.

161
DeckPackerreply
piefed.social

Funny thing is, the US government doesn't even have nearly enough money to bail all these mfs out. So we are heading into uncharted territory here.

59
sh.itjust.works

Of course they don’t, that’s why they’re building bunkers. Thinking it’ll slow us down, as we’ll open their bunkers like cans of tuna. A bunker only works for so long, then the survivors start hunting for them like delicious shipwrecks.

39

And that's why they're trying underhanded tactics to inflate earnings and IPO directly into the index funds, so every American's 401K will legally have to rebalance and invest in them. They're racing to fleece retirement funds before the bubble bursts.

Not financial advice, of course :p but people should really consider getting their stuff out and into self-directed funds or whatever it is US people do to not depend on auto-allocated funds.

23
Yliasterreply
lemmy.world

I don't get why companies get to be legally bailed out like this. Why do people have to suffer for their bullshit? Enslave the CEOs if that's what it takes to make things right, leave the people out of it.

14
lemmus.org

That's simple, because the people making laws and overseeing the adherence to those laws are great buddies with those same CEOs.

So, corruption.

Though I do agree with you, there is no such thing as too big to fail. Government shouldn't give any handouts to corporations.

14

These levels of corruption are frustrating; money shouldn't decide the law.

No handouts to corporations, indeed. Make them pay.

5
Wildmimicreply
anarchist.nexus

Both Uber and Spotify (and AWS too) had economies of scale going for them - the more users they had, the more the infrastructure could be leveraged. This does NOT work for LLMs. More users means using more compute, and more advanced tasks (like coding) use exponential amounts of compute. A single user running a complex task can make 8 Blackwell GPUs run full tilt, and you don't even have any guarantee that the output will be usable.

There are a few narrow areas where LLMs might be successful, like scanning for security vulnerabilities or searching large amounts of documents. The massive amount of money invested will never be recouped with these usage scenarios.

16
T156reply
lemmy.world

Although, most people aren't talking about Alphafold when they're talking about AI. They're usually specifically referring to the generative transformer models that are currently all the rage.

I doubt anyone would care too much about a linear regression model, or a multi-layer perceptron, for example.

4
lemmy.zip

solving something like the Erdős unit distance conjecture

Tell me you follow the news media cycle without understanding what it actually means, without telling me that.

That's not exactly what happened, is it?

Not to bring up what’s also been accomplished in cyber security

Multiple new vectors of attacks, automation of attack pipelines...

-1

To clarify, my argument is that you don't know what you're talking about.

The Erdős unit distance conjecture is a proposed answer to the Erdős unit distance problem. What the LLM did was disprove the Erdős unit distance conjecture, not solve it (you don't solve a conjecture), nor solve the problem (which remains unsolved).

Again, you seem poisoned by following the news media cycle without understanding what it talks about.

Multiple new vectors of attacks, automation of attack pipelines…

Like, literally just put that into Google. It's not one study that proves it, it's multiple ones, plus every cybersecurity expert talking about it. But if you want a single source to argue about, then https://blog.checkpoint.com/research/global-cyber-attacks-rise-in-january-2026-amid-increasing-ransomware-activity-and-expanding-genai-risks/

1

Reminder that during 2019 there were streaming services popping up left and right, all showing tremendous growth because they started from zero, and articles were about how bad Netflix was doing due to having practically no growth compared with the competition (they already had a massive subscriber base). Twist? Netflix was the only streaming service that was actually making a profit; the rest were running at a massive loss but with big growth.

Needless to say, most of those streaming services died; who remembers DC's streaming service, or Yahoo's? Meanwhile Netflix is basically as strong as ever, despite the prevalent enshittification happening through the whole industry.

Point of the story? Shareholders don't care about a stable, profitable business, only cancerous growth. AI is like that: zero profits, a ton of cost, but as long as they show growth the shareholders are happy, regardless of how cooked the books are.

91
MimicJarreply
lemmy.world

2019 Yahoo

My immediate thought, there is no way Yahoo! Screen survived into 2019.

I looked it up and Yahoo! Screen (which featured Community season 6) was shut down in January 2016. But Yahoo! View launched in late 2016 (as a Hulu-like replacement), and that did shutter in mid 2019.

So Yahoo! was already dead, but it also died for real in 2019.

16
elucubrareply
sopuli.xyz

Actually, when Yahoo was the search giant, before Google went mainstream, they were pretty damn good at what they did.

6
JohnEdwareply
sopuli.xyz

With how shit Google is these days, I kinda wonder if Yahoo could dust off their search engine from two decades back and it would just be... better.

6

Yahoo had its own web crawler only between 2004 and 2009, then they made a deal with Microsoft to use Bing indexes, so I highly doubt they even have their old index.

2

I love that nobody watched anything on Yahoo! Screen except for that one season of Community

2

who remembers DC streaming service, or Yahoo’s?

Quibi will always have a place in my heart. Or, at least, my golden arm

6
lemmy.today

Netflix was also late to streaming because their mail service subscriptions were THE major player

4
krashmoreply
lemmy.world

Late to streaming? Netflix was the first big time streaming service that I ever heard of. The main reason their streaming service was able to take off like it did is that nobody else of significance thought that streaming was worth pursuing. What other companies were offering streaming services at anything approaching scale before Netflix?

15
lemmy.today

YouTube and Hulu were basically all starting about the same time. But RealPlayer was the first big one.

Netflix just had the layout that everyone uses now. The Cable networks had streaming services, just not on demand. YouTube and Hulu also pioneered the on demand layout. YouTube focused on personal experiences so maybe that's why you're forgetting them

-1

YouTube started in 2005, but was not really a "streaming service", it hosted random internet posted videos. The concept of engaging with the big content rights holders wasn't remotely in sight back then.

Hulu came out about a year after Netflix started streaming. Hulu was inspired by Netflix's move to have actual traditional media content as a streaming service instead of ad-hoc video uploads like YouTube.

RealPlayer offered technology for websites to provide videos; I don't recall them being a streaming platform in and of themselves.

Whatever one may say about Netflix, they were right there in the beginning with streaming traditional, professional media content. Yes, video playback over the internet wasn't new, but that's a technical detail that enables, but is not the core of the "streaming service" business model.

9
lemmy.world

Late to streaming, but practically the first subscription-based system to watch movies/TV online.

The first years of Netflix were the best; the product began degrading quite early on. But that was mostly companies realizing that instead of licensing their content to Netflix, they could make their own platforms.

8
Corkyskogreply
sh.itjust.works

I think people forget that there is also the problem of being "too early" where people or the technology isn't ready yet. Netflix timed their entry perfectly.

There are so many defunct websites or businesses that no one has ever heard of that were precursors to modern day services we view as conveniences.

2
Dave.reply
aussie.zone

I'm quite happy to use their compute power for frivolous bullshit if it hastens their enshittification and demise.

"Hey Claude, can you begin work on an e-commerce site written in visual basic?"

*Two microseconds later...*

"Your free usage limit has been reached"

"Ok Claude see you tomorrow, maybe we'll think about a rewrite in Turbo Pascal"

37
Thassodarreply
sh.itjust.works

"I need a triple A cooking game with blackjack and hookers, all written in SQL."

"But that's a database langu-"

"Did I stutter?"

11
Arghblargreply
lemmy.ca

Agreed, but hey, no need to pile the hate on Pascal, modern ones like FPC/Lazarus are pretty cool actually :)

5
sopuli.xyz

Honestly Google is likely to beat OpenAI and Anthropic as things are.

OpenAI and Anthropic have to buy/rent their hardware from Nvidia, while Google is making their own TPU hardware. Google's hardware costs for AI are way lower, so every dollar they spend on it goes a lot farther.

And unlike the other two, they're already a profitable company. They're making record profits right now. They don't have a desperate need to figure out how to make back billions on their AI models, they can just keep offering Gemini at a comparatively cheap price and wait for Anthropic and OpenAI to bankrupt themselves.

41
feddit.org

I really really really don't want evil corporation Google to dominate even more.

I prefer plainly greedy corporations over evil ones.

8
feddit.org

They aren't great, though I do think Google is worse. And far too powerful

-1

Google is only worse by virtue of their reach. OpenAI and Anthropic don't have the reach yet, but they absolutely will get there given the chance.

Before Google had the reach it has now, it was widely regarded as a comparative 'good guy' and people believed in the "don't be evil". Lo and behold, once they got going, "don't be evil" went away.

10

They're all evil, so we just have to exploit the ones that offer us some value. If Google is cheaper, and has the ability to damage the others, then Google it is.

3

Google is shaping up to fare better than the others, but I don't think that means success. They, too, are spending more than they're making, just at a less drunken rate than some competitors.

2

Anthropic is doing the same too. SpaceX over here providing the shovels and pans for the modern day gold rush, sheesh.

3

That's definitely costing them more than running it on their own hardware, but it doesn't mean AI is costing them more than the AI startups. Anthropic for example is already paying SpaceX 1.25 Billion a month for compute, and has agreed to pay Google 200 billion over the next 5 years for access to Google's compute and TPU chips.

Google's deal with xAI specifically lets them terminate the deal with 90 days notice after the end of the year. Google is also investing heavily in building new data centers with their hardware. I'm assuming this deal means they've eclipsed their current TPU capacity, and are just looking for a short term bandaid until they can catch up with their new constructions.
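For scale, the compute commitments quoted above can be put on the same per-year footing with a few lines of Python. This is only a rough sketch: the dollar figures are the ones stated in this comment (not independently verified), and the function and variable names are made up for illustration.

```python
# Figures as quoted in the comment above; treated as assumptions, not verified.
SPACEX_PER_MONTH_USD = 1.25e9   # "1.25 Billion a month"
GOOGLE_TOTAL_USD = 200e9        # "200 billion over the next 5 years"
GOOGLE_TERM_YEARS = 5


def annualize_monthly(monthly_usd: float) -> float:
    """Convert a monthly payment into a yearly one."""
    return monthly_usd * 12


def annualize_total(total_usd: float, years: float) -> float:
    """Spread a multi-year total evenly across its term."""
    return total_usd / years


spacex_per_year = annualize_monthly(SPACEX_PER_MONTH_USD)
google_per_year = annualize_total(GOOGLE_TOTAL_USD, GOOGLE_TERM_YEARS)
combined_per_year = spacex_per_year + google_per_year

print(f"SpaceX deal:  ${spacex_per_year / 1e9:.0f}B per year")
print(f"Google deal:  ${google_per_year / 1e9:.0f}B per year")
print(f"Combined:     ${combined_per_year / 1e9:.0f}B per year")
```

Spreading the Google figure evenly over five years is a simplification; the actual payment schedule isn't given in the thread.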

3
jj4211reply
lemmy.world

Plus they have a hook with the common folk, the phone steers you toward Gemini (Android phones, obviously, and Apple currently partners with Google for Gemini for iPhone...).

For Claude and OpenAI, you have to explicitly want to go out of your way to use them, or use them indirectly through another service that has a hook.

Claude seems to have some software developers explicitly preferring it, though a lot of the corporate money is on Microsoft, and Microsoft leveraged Visual Studio and GitHub to become the business-friendly frontend, and sure, you can use Anthropic models too... Though Microsoft ultimately has control of what is reasonably available and how much each one costs. Anthropic has a shot but I could see Microsoft pivot to really mess with Anthropic. The one gap in Microsoft's strategy is the "native AI" workflow where Claude Code has won hearts and minds, but it uses massively more tokens for frankly marginal or sometimes negative value compared to a more curated use in-editor.

OpenAI I see as the most exposed. Lots of data showing they are suffering from people being over the fad of going out of their way to use ChatGPT, especially since their phones have started embracing a 'default' chatbot. Software developers that are inclined to use LLMs are also inclined to be pretty dismissive of anything other than either Anthropic or open weight models, depending on their inclination. Also Altman seemed the most aggressive in committing to spending money they didn't have, though all of them exhibit this to some extent.

I predict Microsoft ultimately pivots to in-house models and convinces the businesses to go that way. Apple may continue with Gemini or roll their own eventually. Anthropic currently has the stronger position between OpenAI and them, but I think you are right that both have risk of just being left behind.

4

Claude just kills the other models, it's not close. Microsoft could ban the Claude extension from VSC tonight and I'll start using command line Claude tomorrow. There's just no comparison right now. It'd be like Microsoft trying to ban Nvidia GPUs from Windows, they'll just lose.

1
zbyte64reply
awful.systems

I guess Google's announcement of renting xAI compute could have been simply for show to boost the SpaceX IPO.

2

They have big plans to build more data centers for themselves, so they definitely want more compute than they have access to right now. But even if they're paying more to rent xAI compute, they're still paying less overall for hardware/access than their direct AI competition.

1
lemmy.sdf.org

It's gonna come crashing down pretty soon. It's gonna hurt all of us. It won't hurt the people responsible nearly enough.

36
bortreply
sopuli.xyz

pretty soon

people have been saying that for some time though

7

The thing is this really depends on the speed of some financial events, not some technical failing.

Notably, if OpenAI has to cancel any of their commitments to buy hardware because they find they have neither the money nor can secure even more debt to cover, that event would potentially cause the bubble to pop, even for hypothetical companies that may have been more responsible and might have a viable business approach. Those commitments are coming up, and a lot of analysis struggles to see how they will fund those commitments.

The thing with this bubble is that the investors don't get the nuance and will flee at signs of trouble in any of OpenAI, Anthropic, or a handful of others, and Altman's leadership has made trouble at OpenAI very likely, but the investors don't believe it and won't believe it's unique to OpenAI, even if it would be.

7

The bubble will pop, I think a lot of people are just baffled by how big it's getting.

5

What people? All the credible people I read say that things fall apart Q2/Q3 2027 as debt and profit obligations are due.

The only thing that changed is now there is an energy crisis coming, so it's possible that might force the bubble to pop sooner if all the systemic risk aligns.

3

Only because the hype has lasted longer than expected. Now that IPOs have been filed, the AI companies (Anthropic, OpenAI) released statements about slowing down to protect us. They're setting the stage for lower growth. But I think you should invest every penny you have into "SpaceXMegaTwitterSuperCarAI".

1
piefed.ca

so these crazy prices i hear about being implemented (like at github) should actually be at least 10x higher?

32
megopiereply
lemmy.blahaj.zone

To break even on operating expenses, not even counting debt payments, depreciated capital value, or future recapitalization costs.

15

*Operating expenses before Nvidia raises their prices so they can somehow make the line go up enough to justify its massive valuation.

4
Scrollonereply
feddit.it

Exactly, I just keep using the free plan and when I finish the amount for the day I just switch to another service

7
iocasereply
lemmy.zip

This is why IMO blitz scaling is dumb when your service is a commodity. I'm not any more loyal to Uber than Skip. If more investor money goes into making a cheaper meal or ride on Skip I use that. Consumers are mercenaries about that stuff.

The "blitz" part of blitz scaling assumes your customers can't move.

3

Exactly. And LLMs don't have a way to keep you inside of their walled garden; if anything, I prefer starting from a blank slate every time I ask something.

1
discuss.tchncs.de

I mean, this is no different than Walmart making prices low until other businesses die out and then raising them.

It is no different than police shoving all the homeless people and drug addicts into one area of town to crash the property prices, and then evicting them once developers buy everything for cheap.

They're purposely operating at a loss in the expectation that they can get ingrained into a ton of workflows, and then gouge everyone absolutely to death while also worsening the quality of the service to make it cheaper for them to run.

If it weren't so horrible for the environment, I'd kind of like it, because all the dumbass executives that are signing up for this are going to get exactly what they deserve. You'd think they'd recognize a scheme when they see one.

23
fishyreply
lemmy.today

My CEO (whom I don't consider a particularly good or bad CEO) spent a day playing with AI then when asked if he'd sign the company up with the service he literally laughed in their faces and said it's useless. I was honestly shocked because he's totally into buzzword and popular crap. Gained a lot of respect for him that day.

18
lemmy.today

An older co-worker seems to ask AI for help during work, we are blue collar. But the Owner of the company does not seem to use it whatsoever.

I ask Claude on occasion, to see if it will say something smart (it was mostly useless as fuck).

1
Scrollonereply
feddit.it

Honestly I think Claude is good at programming. Way better than ChatGPT.

But I ain't going to pay for it.

3

Published a library doing some very specific data processing. One of the algorithms I implemented was a bit too slow: it would take about a week to process data. I reckon implementation was a little bit sloppy, but I've been implementing a bunch of algorithms from research papers and this was pretty much the published implementation.

I asked Claude to analyse the implementation and check whether it could be improved, half an hour later I got a 26,000% improvement in performance with exactly the same results passing all tests.

Of course, I could have done that myself. But optimization had to go down to simd level; I doubt I would have been able to do that in less than a week of work.
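To put that percentage into wall-clock terms, here's a quick back-of-the-envelope sketch in Python. It assumes "26,000% improvement" means throughput went up by that much over the baseline (so a factor of 1 + 260), and that the original run took exactly one week; both are readings of the comment, not measured numbers, and the helper names are just for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # "about a week" taken as exactly 168 hours


def speedup_factor(improvement_percent: float) -> float:
    """Read 'N% improvement' as N% more throughput: factor = 1 + N/100."""
    return 1 + improvement_percent / 100


def new_runtime_hours(old_hours: float, improvement_percent: float) -> float:
    """Runtime after the speedup, assuming the same workload."""
    return old_hours / speedup_factor(improvement_percent)


factor = speedup_factor(26_000)
runtime_minutes = new_runtime_hours(HOURS_PER_WEEK, 26_000) * 60

print(f"Speedup factor: {factor:.0f}x")
print(f"A one-week job would take roughly {runtime_minutes:.0f} minutes")
```

If the figure was instead meant as "260 times faster", the result barely changes; either way a week-long job drops to well under an hour.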

4
elucubrareply
sopuli.xyz

Oh, you are going to pay. The bubble is going to fuck us all quite thoroughly.

13

Exactly, these companies will keep leveraging more and more because they know the govt will step in and print whatever number of trillions of dollars needed to fix the accounting. Then they'll tell us "core" inflation is only 2.8%.

2

Definition of a bubble. These AI hucksters keep stringing investors along though. Sadly, I think these public IPOs coming up for SpaceX, OpenAI, and Anthropic will fall short of expectations and trigger the bubble popping.

19

Trust me bro we're so close to profitability bro, just need this IPO to secure funding one last time bro then we'll be profitable bro I swear.

18
lemmy.world

What is the actual “cost” after they buy the hardware, is that $1000 really pure power usage cost?

17
Corkyskogreply
sh.itjust.works

The problem is that the hardware has a 5 or 6 year depreciation schedule on paper, but NVIDIA keeps saying that their next generation chip will be twice as good as their last chip so there is a FOMO schedule of like every two years.
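A rough way to see why that gap matters: under simple straight-line depreciation, the share of the purchase price written off each year is just one over the useful life. The sketch below uses the 5, 6 and 2 year figures from the comment; straight-line depreciation and the helper name are assumptions for illustration, not how any particular company books it.

```python
def annual_writedown_fraction(useful_life_years: float) -> float:
    """Straight-line depreciation: equal share of the price written off per year."""
    return 1 / useful_life_years


schedules = {
    "6-year paper schedule": 6,
    "5-year paper schedule": 5,
    "2-year FOMO cycle": 2,
}

for label, years in schedules.items():
    share = annual_writedown_fraction(years)
    print(f"{label}: {share:.1%} of the purchase price per year")

# How much more hardware cost lands in each year if gear is really
# replaced every 2 years instead of every 5.
ratio = annual_writedown_fraction(2) / annual_writedown_fraction(5)
print(f"2-year vs 5-year: {ratio:.1f}x the yearly hardware cost")
```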

17

Would be nice to see that used hardware for sale rather than it being junked as a writeoff.

4

That's the $84,000 question. They're filling datacenters with the fastest possible equipment and need it to be 10x faster. That hardware is dinosaur fodder a year after they install it.

5
HereIAmreply
lemmy.world

I'm curious as well. My knowledge is probably quite outdated, but from what I understood the training part is what's expensive and then querying the model is pretty cheap. Is it still true (or was it ever) that the generated answers on search engines are cheaper to generate than the actual search results?

4
lemmy.world

I find that hard to believe. I recently had to uninstall Copilot after it weaseled its way into my search bar. It's not an exaggeration to say that my PC literally ran Cyberpunk 2077 with path tracing better than it ran the fucking Windows search bar with Copilot.

11

That's just a shitty front end interface implementation, it has nothing to do with the actual inference run by the models.

4

Look at the public numbers, it seems true. Copilot on your taskbar is just windows being garbage, not the AI being bad. Just look at self-hosted AI and measure the power costs of your queries. It’s tiny.

3

It is, sorta. Training is orders of magnitude more intensive than inference, but we infer billions of times within a model generation.

3

$1000 I would guess. They are just burning money at this point.

4
Joelk111reply
lemmy.world

I think they might've broken the laws of math there, as they're certainly still spending a non-zero amount.

8
k0e3reply

It just means they lose more money per paying user, I guess.

2

Good thing all the companies leaning hard on AI 10 X'd their profits... Wait...

13

My first use of Claude this week, for code reviews only (since no LLM can be trusted to write a user story or test suite), had it gaslight me.

It marked down my code for using a specific practice to make some xml safer and easier to read.

When I tried things its way, it wanted me to change it back.

13

Exactly, never trust an LLM to code. And if it argues back, explain why it’s wrong and that you have nothing but time and experience. Most tend to fold when you point out it’s not a free thinking AI, it’s an entrapped corporate model they designed with preprogrammed biases. But I love arguing 😂.

11

I found it’s decent for some light QC but when it asks “do you want me to change it?” I’m like “nah, thanks for pointing it out but I’ll do it myself”

7

Oh it's great, isn't it? You ask it for help on some code, it provides its solution, you try it and it doesn't work so you respond with the error, it claims YOU wrote it wrong, and then when you tell it "I just copy and pasted what you provided" it says "you're right, I'm sorry."

Claude is to the point now where it just starts hallucinating on the first prompt. it's 100% unreliable now when before it was like 90%. no point in using it, it's garbage. and Claude Code is just as bad now. If you or anyone is using Claude Code to develop ANYTHING I would highly suggest you stop right now because I can guarantee you with nearly 100% certainty that whatever shit it's writing into your stuff isn't going to work. period.

6
Crylosreply
lemmy.world

I use it a lot, and if you are getting these kinds of results you are either trolling, or just flat out not providing the details and guardrails required with your prompts.

I’ve been in software for decades, and if used correctly, yes it can accelerate velocity of building code out. 10x? No.. if you are lucky and careful perhaps 2-4x.

As ALWAYS the human should be in the loop and is on the hook for any code generated.

4

Just gotta configure and tweak until it gives outputs you find indistinguishable from correct. Just gotta train it to gaslight you properly. Come on, don’t you want to be given an endless stream of stuff that looks correct?

4

I was using a set of template files designed for LLMs to review that project. It is absolutely the fault of Claude that it told me to do something one way, then told me to try another, and when I reverted it said it was the optimal approach.

Where I find it helps is in getting initial starts and as a start to code review. But in both cases they aren't ever operating on their own and their feedback is filtered through myself or another senior dev.

4
tomjugglerreply
lemmy.world

Lots of anti-ai people in this thread it seems. I get it - personally I HATE the fake image generation! But I have to agree with you in terms of coding that using LLM's correctly can offer huge benefits.

Modern harnesses are getting more and more sophisticated and your mileage varies depending on how well you use them (like any complex tool). At the end of the day it's still up to the developer to take the code and make sure it's correct - no different from before, when we used to copy code from Stack Overflow or other examples and modify it for our own use.

One thing I have to add - I honestly don't understand why anyone would use Claude or ChatGPT at their ridiculous prices when DeepSeek exists..

3

The question comes down to cost. The actually good models are already expensive, yet still apparently subsidised. Once we have to pay the true cost they will only be worth using when you are truly stuck.

Lots of use for the simpler models for basic util creation and simpler cleanup refactor stuff though. Quite nice if it actually turns out like that.

1
Arrandeereply
lemmy.world

I’ve used Claude and Codex, and while both are based on untenable economics, I can at least attest that my use of Codex has yielded some productive results. Claude, so far, has delivered fuck all that’s useful to me.

2

I have found the opposite. Codex spits back mostly useless code that is twice the length it needs to be, with a bunch of unnecessary stuff, and Claude is the only thing I get useful output from.

2
lemmy.world

I can't imagine paying for AI when the open source tools have made it so easy to set up a model locally.

11
feddit.nl

Don't be daft. The vast majority of people don't have the knowledge or resources to set that up locally.

15
nullspacereply
lemmy.world

You're right if we're talking about the entire population of Earth. With these local models though, other people have already done all the hard stuff. Anyone with an RTX card and just a minimum level of patience can get going.

3

Minimum for local models is 12GB imo. There are several "RTX" cards that have 8GB. Also, why Nvidia? AMD works well too. My previous point still stands. If you don't already have the hardware, buying a PC today is very expensive. I don't know if you go out much but it ain't pretty out there. People aren't exactly swimming in cash.

Also, patience isn't the only requirement. Keep in mind that some people struggle to even install a program.

1
ranzispareply
mander.xyz

Easy to set up, but still needs a 15k $ graphics card and electricity bill. The price you pay openai/anthropic is much cheaper than that for that quality of model.

Sure, you can setup a small model on a consumer graphics card, but the output will be considerably worse and the processing speed considerably lower.

For 240€/year you got a subscription to anthropic which will happily ingest a whole repository and process it in about one minute. No matter what latest model GPU you installed on your computer, you won't be able to do that.

Sure, this guy was able to run a 26B model on an old CPU: https://point.free/blog/gemma-4-on-a-2016-xeon/

But that was not easy at all and the speed you get is definitely not the same as the one provided for a very cheap price.

7
zbyte64reply
awful.systems

If you were paying the real price it would be 2 grand a year though. And in 5 years that 15k graphics card will be $200 and sip on electricity by comparison.
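The disagreement here is basically a break-even calculation, which can be sketched in a few lines of Python. The numbers are the ones thrown around in this thread (a 15k card, a 240/year subscription, a claimed "real" price of 2 grand a year); the sketch treats them all as the same currency and ignores electricity and resale value, which are simplifying assumptions, and the names are made up for illustration.

```python
# Figures quoted in the thread; currency differences ($ vs €) are ignored here.
CARD_PRICE = 15_000            # up-front cost of the graphics card
SUBSIDIZED_PER_YEAR = 240      # current subscription price per year
UNSUBSIDIZED_PER_YEAR = 2_000  # claimed "real" price per year


def break_even_years(card_price: float, yearly_fee: float) -> float:
    """Years of subscription fees it takes to add up to the card's price."""
    return card_price / yearly_fee


at_current_price = break_even_years(CARD_PRICE, SUBSIDIZED_PER_YEAR)
at_real_price = break_even_years(CARD_PRICE, UNSUBSIDIZED_PER_YEAR)

print(f"At {SUBSIDIZED_PER_YEAR}/year: break-even after {at_current_price:.1f} years")
print(f"At {UNSUBSIDIZED_PER_YEAR}/year: break-even after {at_real_price:.1f} years")
```

So at today's price the card never really pays for itself, while at the claimed unsubsidized price it lands within the range of a long hardware lifetime, before counting power.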

2

A100 is 6 years old and is now sold at over 10k $. If you were paying a higher price it could be cheaper to buy the card, since the prices are low that is not the case.

1

Currently nearly 5 year old used graphics cards are being sold for their initial price. Not sure how much they'll get cheaper...

1

There is a middle ground. Crypto farmers have transitioned into running AI workloads for money. There are things sort of like folding@home, but you let people use your GPU and you earn tokens, which are used to buy compute or sold to people who want to buy compute on the network. So you can set up a bigass open source model for private on-demand use. It's still not cheap, but it's a lot closer to reality for a lot of people than a 15k initial purchase.

1

The author is right and wrong. It's subsidised, but not by Anthropic. The power users who use their plans to the limit are subsidised by the rest of the users. I'm an AI hater but I do think Anthropic will be profitable next year. Their revenue growth is insane and looks to just be getting started. Claude Code took enterprise by storm and now Cowork is out.

8

So many comments on just the title... They could come from Anthropic directly.

There is literally zero basis for the claim made in the article, just arbitrary calculations over supposed token consumption on unstable test sets.

I have no idea if/how much these stupid fuckers spend to get more customers - and this "article" wasted a lot of time showing that they don't know either.

(Stupid is cut out because I don't think they're stupid. Which makes it way worse in my book)

8
lemmy.world

I wish that was inversely proportional. The less I pay, somehow it costs them more money.

2
lemmy.ml

So are we assuming here that LLMs won't become more efficient over time? GPT-3 was a frontier model just a few years ago and its performance blew everyone's mind at the time. I can now run an equivalent LLM on my personal computer. Why can't we expect that after a few years a Claude Sonnet level of capability will be possible to accomplish locally?

2
ag10nreply
lemmy.world

What’s the cost of the compute you have to run something locally?

Majority of people don’t have 32G of vram to run something remotely as capable

9

I've got an old 1060ti in my server. Ollama shares it with just a couple other containers. Electricity here is majority hydro with some natural gas, $0.08/kWh.

It's a little slow, but I can comfortably run qwen3:14b. Of course that's not all done on the GPU, a large part is offloaded to server ram (generally 32GB available so more than enough headroom)

My server and my gaming PC combined last month came out to $13.32
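Working backwards from those two numbers gives a feel for how much power that actually is. A minimal sketch in Python, assuming the $13.32 is purely an energy charge at the quoted $0.08/kWh (no fixed fees or taxes) and that the month had 30 days; the function names are just for illustration.

```python
MONTHLY_BILL_USD = 13.32    # combined server + gaming PC, as quoted
PRICE_PER_KWH_USD = 0.08    # as quoted
HOURS_IN_MONTH = 30 * 24    # assumes a 30-day month


def energy_kwh(bill_usd: float, price_per_kwh: float) -> float:
    """Energy consumed, assuming the bill is only a per-kWh charge."""
    return bill_usd / price_per_kwh


def average_watts(kwh: float, hours: float) -> float:
    """Average continuous draw over the period, in watts."""
    return kwh / hours * 1000


monthly_kwh = energy_kwh(MONTHLY_BILL_USD, PRICE_PER_KWH_USD)
avg_draw = average_watts(monthly_kwh, HOURS_IN_MONTH)

print(f"Energy used: {monthly_kwh:.1f} kWh")
print(f"Average draw: {avg_draw:.0f} W across both machines")
```

That average covers both machines and everything they do, so the share attributable to LLM queries is some fraction of it.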

6

How does that compare to closed models that Anthropic offers, at the context and scale they offer.

I run Qwen3.6 27B locally and it’s usable with 16G vram but still not the same as a data centre of Blackwell clusters.

4
greyscalereply
lemmy.grey.ooo

lfm2 works like greased lightning on the NPU built into the current macbook M5.

2
ag10nreply
lemmy.world

Describe greased lightning, because it’s much slower and needs to handle compression for context

We’re moving in that direction but an M5 is not what the majority of people are running at home

1
greyscalereply
lemmy.grey.ooo

I dunno man, I'm not a slopjockey so I don't know the minutiae of the addiction.

All of our devs appear to have M5s right now. All of those copilot+ laptops have NPUs too.

0
ag10nreply
lemmy.world

Your company has bought you the latest and greatest and likely supports commercial token usage too

You can’t compare LLMs at scale to running one locally; it's not the same experience or capabilities

1

"Latest and greatest" my fucking sides lmao

My company gave me some US shitware and I've got some local shitware instead.

If you can't make that work and are dependent on the teat of the slopgenerators, that's a skill issue on you, buddy.

-3
blackbeansreply
lemmy.zip

I remember my computer not being fast enough to even play an MP3 file. Two years later, my computer was capable of running 3D accelerated games, browsing the internet at broadband speeds and playing videos.

Sometimes technology advances fast. We could be entering such an era as there are major investments taking place and global competitors will rise to the occasion to market these to a broader audience.

I think it will be entirely possible for consumers to use a decent LLM on their computer in a few years time.

1
ag10nreply
lemmy.world

It’s not the 90s anymore. Unless there’s a compression algorithm putting billions of relationships into a manageable size, local AI is highly specific under 8G vram (text-to-speech as an example is under 1G) let alone the context required for keeping a conversation or writing code.

5
lemmy.zip

If text-to-speech is what Youtube uses to autogenerate the subtitles, it is worthless for anything that uses slightly richer vocabulary.

1

No. Autogenerated subtitles would be speech-to-text, rather than text-to-speech.

2
blackbeansreply
lemmy.zip

To be clear, I wasn't talking about a leap in LLM design. I was talking about a leap in hardware capabilities...

-2

Which are increasingly out of reach for a normal person. Phones, let alone PC hardware, have increased exponentially in price in recent history

2

Improved hardware capabilities used to come very quickly (see Moore's Law and Dennard scaling). However, that trend is basically over, so getting higher-performance hardware now takes a lot of effort to specialize it for certain tasks. That's why you see inference accelerators like Groq, SambaNova, Cerebras, etc. However, this is hardware that is still gonna go into data centers. Something innovative has to happen on the AI side for commercial-grade models to be runnable on consumer hardware.

2
lemmy.world

Profit ≠ success

Edit: to clarify, even if profitable it will still be a failure of society in some way/shape/form.

0

It already happened, small language models are busy dragging their nutsack on frontier models, running on a macbook and costing nothing

Where's the fucking product, Sam?

3
Jakeroxsreply
sh.itjust.works

A large majority definitely hate it to the point of having blinders on for sure.

On one side you have corpo hype/lies, and on the other the claim that LLMs are slop garbage and terrible for anything, that developers wrote perfect code before LLMs, and that everything that breaks now is caused by AI slop.

0

Oh I know, it's just what I see in every thread when some kind of outage or major bug is discovered. Half the comments are hurrdurr probably vibecoded.

2
mabeledoreply
lemmy.world

They could, but what’s the plan here, exactly? That all these for profit companies who are currently publishing models for free, like Qwen, will continue to do so in the future?

1
vermatercreply
lemmy.ml

Why not? Why does Microsoft develop its .NET ecosystem? Why does Google develop Go/Dart? It costs them lots of money and they give it away for free.

The answer is: they don't earn money on it directly, but these tools are a way to tie programmers to their cloud services. If you use .NET you'll probably end up on Azure. If Go - probably you'll use GCP.

So I suspect the same will be with LLMs. At some point they will say: "hey, you can use this LLM however you want, but as you are already using it, then you may want to know our platform is optimized for it"

1

That’s not an accurate analogy.

LLM providers are SaaS providers, meaning that even if they were to give you the source of all the tools they use, there’s a fundamental limit to how much you can self host.

A better comparison would be Google giving away their indexed search data: you might be able to run an infinitesimal portion of it on your hardware, and will never ever match the results Google offers on their website, and since it’s a monopoly, you would be at a permanent disadvantage.

Same goes for all these AI companies. They are an oligopoly that give away subpar free models, compared to their cloud offerings. Self hosted LLMs will never stand a chance.

1

So are we assuming here that LLMs won't become more efficient over time?

Mostly. Moore's law ran up against the physical limits of the materials we make chips out of - so desktops of today just do what the desktops of yesterday do, mostly.

We should keep seeing improvements in highly specialized models. There's interesting outcomes to have here, with the right setup and ollama.

- but -

The really promising, impressive models today are just running with long contexts on shitloads of hardware - which is neither coming to home PCs any time soon nor going to actually be profitable to run any time soon.

There's an argument to be made that running the really interesting model on a ton of hardware might make money for really specific uses - but then when we talk about specific uses that are worth lots of money, those use cases tend to tolerate difficult interfaces and reward accuracy. LLMs invariably reduce accuracy in exchange for ease of use. There might be a sweet spot for a huge expensive hallucination prone LLM in some of these uses, but I doubt it (the entire approach) competes, long term.

There's a few specific use cases where inaccuracy is desirable - largely forms of shifting accountability and some kinds of gambling. Things that either are or should be crimes have a higher tolerance for AI hallucination.

But - a small cheap local model has all the same desirable attributes for doing these things (crimes) poorly as a big expensive model. So there's probably not even much money to be made there.

I expect that this tech is not going away, but it's also not earning back the current investment.

1
lemmy.world

Why can’t we expect that after a few years Claude Sonnet level of capability won’t be possible to accomplish locally?

Because when you're old enough to remember what AIM chatbots could do 25 years ago, it stops being impressive what today's chatbots can do...

It seems "new" because everyone hated it and it was just a novelty back then.

But if you read up on them, they did 90% of what modern ones do. And if they had access to today's computing, the only explanation for why they still suck so much, is that no one has ever wanted them.

The oligarchs just decided it didn't matter

-2
unpossumreply
sh.itjust.works

Because when you're old enough to remember what AIM chatbots could do 25 years ago, it stops being impressive what today's chatbots can do...

C’mon, that’s just silly.

4

looks inside

But if you use the $100 a month Claude Max plan, and you would use it to the weekly limit by going full ‘agentic coding’ (so almost no human in the loop) you would use an amount of tokens that would cost you more than $1000 at API-pricing.

If I watch 600 movies every day on my netflix subscription I am using more energy than I pay them for. Obviously everyone is like me. Therefore they are losing money overall.

Wait, their (netflix) earnings say they made a profit last quarter. But my calculations were waterproof!

Probably anthropic are not net positive, but they are not spending 10x what people pay them for tokens.

2
feddit.nl

Except that you would need 50 devices to do that and the most expensive Netflix plan only lets you stream up to 4 devices at a time. Considering the average 2 hours per movie, that's 48 movies per day. That's without mentioning that you'd need to automate this because you'd be asleep for 8 of those 24 hours.
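
If anyone wants to check that arithmetic, here's a quick Python sketch. The 2 hours per movie and the 4-stream cap are the figures from this comment; the function names are made up for illustration.

```python
import math


def devices_needed(movies_per_day, hours_per_movie=2):
    """Simultaneous streams needed to play that many movies within 24 hours."""
    return math.ceil(movies_per_day * hours_per_movie / 24)


def max_movies_per_day(streams, hours_per_movie=2):
    """Movies that fit in one day if every stream plays around the clock."""
    return streams * 24 // hours_per_movie


print(devices_needed(600))    # 50
print(max_movies_per_day(4))  # 48
```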

The point is, your analogy doesn't work. There's no reason why someone would do what you're describing and it'd also be very hard to do.

Using up all of your tokens though? Just use agentic coding, set the ""thinking"" to max and you'll see how quickly and easily you can burn through them. Share your account and you'll burn them even faster.

7
sh.itjust.works

You're right that people can and do max out the expensive plans. It's very difficult to say how often. I just think a majority of Anthropic's customers are businesses, who often pay per token for easier scaling etc. According to the company, enterprise employees use about $150-$250 per month (possibly Max plans have similar use, which would support your view), but that's in API tokens, which they probably have big margins on, so it's less likely Anthropic is burning money on inference. If you want to convince me otherwise, it's not enough to say that it can happen; it has to be frequent enough to outweigh the B2B sales. They are, however, likely losing money overall due to training costs etc.
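
To make "frequent enough" concrete, here's a rough Python sketch of the break-even share of heavy users on a flat-rate plan. Every number in it is hypothetical: nobody outside the company knows the real serving costs, and the function name is just for illustration.

```python
def break_even_heavy_share(price, light_cost, heavy_cost):
    """Fraction of heavy users at which average serving cost equals the plan price.

    Solves f * heavy_cost + (1 - f) * light_cost == price for f,
    clamped to the range [0, 1].
    """
    if heavy_cost <= light_cost:
        raise ValueError("heavy_cost must exceed light_cost")
    f = (price - light_cost) / (heavy_cost - light_cost)
    return min(max(f, 0.0), 1.0)


# Hypothetical inputs: a $100 plan, light users costing $20 to serve,
# heavy users costing $500 to serve.
share = break_even_heavy_share(100, 20, 500)
print(f"break-even at {share:.1%} heavy users")
```

With those made-up inputs, the plan stops paying for itself once roughly one in six subscribers is a heavy user. Change the assumed costs and the answer moves a lot, which is kind of the whole disagreement here.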

1

So you're taking Anthropic's word at face value, disregarding the fact that saying otherwise would be detrimental for them? Interesting.

1
mirshafiereply
europe.pub

I mean it's not very hard to use up your Claude Max plan, but I find it hard to believe a majority of users do so consistently.

6

I mean it's not very hard to use up your Claude Max plan, but I find it hard to believe a majority of users do so consistently.

The anecdotal evidence does point toward folks using up their plan, every month.

I see plenty of script kids post "development on X is paused until next month when my tokens reset".

Of course, I have no way to tell what percentage of script kids are running out of their token limit every month.

All I can conclude is that using the full allowed amount is pretty common.

1
fedia.io

That's how it goes for any industry in its growth phase. A lot of money is spent on research and infrastructure before it starts to collect revenue.

-6
Wildmimicreply
anarchist.nexus

They will never collect revenue that will exceed the amount of capital that has been invested, because economies of scale do not work with LLMs.

3
FaceDeerreply
fedia.io

Then they will go bankrupt, their assets and IP will be sold for pennies on the dollar, and those that follow them will be able to make a profit serving the established demand without the debt burden of the R&D that created it. It's a common pattern for first-movers to not benefit from the industries they create.

0

No, sorry. Because the lion's share of the cost comes from inference itself and the cost of running datacenters, no amount of shedding debt will help.

1
lemmy.zip

Yeah, I thought everyone was aware they are building datacenters and basically investing in infrastructure right now. Their spending doesn't reflect how much it costs to deliver the service.

I don't believe they will succeed, I just think there's more discussion to be had here than repeating "fuck AI lol"

2
FaceDeerreply
fedia.io

I think a lot of people just want to conclude that AI is going to "go away", and latch on to beliefs that lead to this conclusion.

I think a lot of AI companies are likely to "go away." That's what happened when the dot com bubble popped, if there is indeed an AI bubble then we'll see a similar massacre at the stock market. But the technology itself is sound, just like how the basic idea of e-commerce didn't vanish with the dot-coms.

I've been doing a lot of fiddling with locally-run AI models and I'm thinking that the local open-weight models will be good enough to perform 90% of the tasks that most of us are currently depending on those big companies like Anthropic and OpenAI for. That's going to let a lot of the air out of them when the applications catch up and start using those cheaper commodity-level models instead. For now it's easier to just throw an OpenAI API key into your application and let it use the heavyweight models for everything, a powerful model can do simple tasks just as well as a simple model. Most tasks are simple but adding the ability to distinguish those tasks from the complicated ones is hard.

-1
Wildmimicreply
anarchist.nexus

I like my local LLM too, but it's one thing to utilize my existing VRam for a model that fits in there for fault tolerant tasks, and a whole other thing to utilize current frontier models which rack up an energy bill comparable to running a group of space heaters in a building which had to be designed for them, while not even having a guarantee that the output isn't useless.

3
FaceDeerreply
fedia.io

Right, which is why I said 90% and not 100%, and called out the challenge of deciding which tasks to send to which AIs. A lot of the interesting work I'm seeing in AI right now is in the agentic frameworks and harnesses that call the LLMs rather than just the LLMs themselves, these are the things that will break big complicated tasks down into more focused sub-tasks that cheaper LLMs can handle.

Given how some of the big providers like Gemini and Anthropic have been cranking up their API costs in recent weeks I expect we'll see a lot more effort being put into rolling those sorts of features out.

1
Wildmimicreply
anarchist.nexus

It's not even where to send it - you cannot predict how much any given task is going to cost you in tokens, which is the deciding factor in which model to use. The "cranking up" part has not even started yet, and we already have stories like Uber which blew through their complete AI budget for the year, what was it, 2 months ago? Uber is very pro-AI, so that budget was probably very generous. And to top it off, I haven't seen or heard about anything new at Uber that would be even worth mentioning.

If you read the article, this project started from a clean slate and is 40k lines of code, so it's peanuts in terms of complexity compared to what is out there in companies, and the author had to use the maximum power available to him to let Claude keep up. There still was no guarantee that the output was useable (and there can't be such a guarantee, since hallucinations are a statistical fact, increasing in occurrence with smaller amounts of training data available).

If you extrapolate this to an average IT stack, which has quirks and issues that are unique to it, you will never get anywhere you wouldn't get by employing more engineers, who will get better over time and have fixed costs you can budget.

Remember, this is the "killer" application for LLMs. It looks a lot worse in EVERY other area except probably translation.

4
FaceDeerreply
fedia.io

You can predict how much a task will take in tokens. The accuracy of the prediction may not be perfect, but if you can ballpark it that can tell you a lot about what models to make use of.

Also, not all tokens are the same. Different models require different amounts and kinds of computing power to run. Using a very large context costs more per token because you need a computer with a lot of memory to fit it all. If you need it fast, that's more expensive than if you can take your time. Does the task involve vision or audio? Does the context need to be saved for an ongoing chat? Does it need to wait for tool calls to return between rounds? There are a lot of variables that can be tweaked to vary the cost that an AI call will take, and a lot of those variables can be predicted without having to actually run the whole thing first.
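
As a minimal sketch of what I mean by ballparking, here's some Python. The model names and per-million-token prices are invented for illustration and are not anyone's actual price list; the token counts are guesses you'd make before running the task.

```python
from dataclasses import dataclass


@dataclass
class ModelPrice:
    name: str
    usd_per_m_input: float   # hypothetical price per million input tokens
    usd_per_m_output: float  # hypothetical price per million output tokens


def estimate_cost(model, input_tokens, output_tokens):
    """Ballpark cost in USD from estimated token counts."""
    return (input_tokens * model.usd_per_m_input
            + output_tokens * model.usd_per_m_output) / 1_000_000


small = ModelPrice("small-model", 0.10, 0.40)   # made-up numbers
large = ModelPrice("large-model", 5.00, 25.00)  # made-up numbers

# Guess: about 20k tokens of context in, about 2k tokens out.
for m in (small, large):
    print(m.name, round(estimate_cost(m, 20_000, 2_000), 4))
```

The estimate is only as good as the token guess, but even a rough one shows the gap between model tiers.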

The "cranking up" part has not even started yet, and we already have stories like Uber which blew through their complete AI budget for the year,

This is exactly what I'm talking about. Current LLM usage patterns tend to be pretty inefficient because people just throw tasks at the biggest and bestest models. Those models handle them, sure, because they're the biggest and bestest. But most tasks don't need that much.

I've used coding agents a fair bit along with the various other AI applications I've fiddled with, and often I ask them to do things that are dead simple. Create a function to sort some data and select whatever fits certain criteria. Add type checking to a file. Create a unit test for a function. Stuff like that could easily be done by a small local model, but the coding agent sends it off to Opus or whatever just like every other task. That can change.
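
Here's a toy Python sketch of that kind of routing. It assumes the task has already been labelled, which is the hard part; the labels, model names, and context limit are all hypothetical.

```python
# Hypothetical labels for tasks considered simple enough for a local model.
SIMPLE_TASKS = {"sort_function", "add_type_hints", "unit_test"}


def pick_model(task_kind, est_context_tokens, local_context_limit=8_000):
    """Route simple, small-context tasks to a local model, the rest to a big one."""
    if task_kind in SIMPLE_TASKS and est_context_tokens <= local_context_limit:
        return "local-small"
    return "hosted-large"


print(pick_model("unit_test", 1_500))         # local-small
print(pick_model("unit_test", 50_000))        # hosted-large
print(pick_model("refactor_service", 1_500))  # hosted-large
```

A real agent would need a classifier or some heuristics in front of this, and a fallback to the big model when the small one's output fails its checks.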

There still was no guarantee that the output was useable (and there can't be such a guarantee, since hallucinations are a statistical fact, increasing in occurrence with smaller amounts of training data available).

I don't think you've used modern coding AIs much.

Or, for that matter, worked with human coders.

Remember, this is the "killer" application for LLMs.

There is no one single "killer" application for LLMs. They're about as general a computing platform as you can get.

1

I used to think like you, and I am still pro local LLMs - I use them as tutors for areas I don't know much about, and since I use the output just as a guide and implement it on my own I quickly realize if something isn't right.

We will see - when OpenAI and Anthropic rush towards IPO this year, which was made very likely because SpaceX has upped the tempo - what the real costs are. If this article and others I've read in the last year are correct, and the prices have to go up x10 to break even, then we are in for a wild ride. I'm only grateful that for now they don't get lumped into the index funds.

2

Ah it's the AI evangelist troll. You know better than to actually believe this, and even if you didn't, the statement is a thought-terminating cliché that has been thoroughly mocked.

1