Can we trust LLM calculations?

OK, say you have a moderately complex math problem you need to solve. You give the problem to 6 LLMs, all paid versions. All 6 get the same numbers. Would you trust the answer?

View original on lemmy.world

Comments (74)

SomeRandomNoob

discuss.tchncs.de

short answer: no.

Long answer: They are still (mostly) statistics-based and can't do real math. You can use the answers from LLMs as a starting point, but you have to rigorously verify the answers they give.
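One cheap form of verification is to plug the claimed answer back into the original problem using exact arithmetic. The sketch below assumes the problem happens to be a polynomial equation, which the thread doesn't specify; the function name check_root and the example coefficients are made up for illustration.

```python
from fractions import Fraction

def check_root(coeffs, candidate):
    """Return True if candidate is an exact root of the polynomial.

    coeffs lists the coefficients from the highest degree down,
    e.g. [2, -7, 6] means 2x^2 - 7x + 6.
    """
    x = Fraction(candidate)
    value = Fraction(0)
    for c in coeffs:
        # Horner's rule: fold in one coefficient per step, no rounding.
        value = value * x + Fraction(c)
    return value == 0

# Hypothetical case: an LLM claims x = 3/2 solves 2x^2 - 7x + 6 = 0.
print(check_root([2, -7, 6], Fraction(3, 2)))
# A wrong claim fails the same check.
print(check_root([2, -7, 6], 4))
```

Substituting back only confirms that a given answer satisfies the equation. It doesn't tell you whether the LLM set the problem up correctly in the first place.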

unexposedhazard reply

discuss.tchncs.de

The whole "two r's in strawberry" thing is enough of an argument for me. If things like that happen at such a low level, it's completely impossible that it won't make mistakes with problems that are exponentially more complicated than that.

otp reply

sh.itjust.works

The problem with that is that it isn't actually counting the R's.

You'd probably have better luck asking it to write a script for you that returns the number of instances of a letter in a string of text, then getting it to explain to you how to get it running and how it works. You'd get the answer that way, and also then have a script that could count almost any character and text of almost any size.

That's much more complicated, impressive, and useful, imo.
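For what it's worth, the kind of script described above could look something like this. It's a minimal sketch, not anything an LLM actually produced in this thread; the name count_letter and the case_sensitive option are just assumptions for the example.

```python
def count_letter(text: str, letter: str, case_sensitive: bool = False) -> int:
    """Count how many times a single character occurs in text."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    if not case_sensitive:
        # Compare in lowercase so 'R' and 'r' count as the same letter.
        text, letter = text.lower(), letter.lower()
    return text.count(letter)

print(count_letter("strawberry", "r"))  # prints 3
```

Save it as a .py file and run it with python, or paste it into a Python prompt. The same function works on text of any length.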

confuser reply

lemmy.zip

A calculator as a tool for an LLM, though: that works, at least mostly, and could get better once the kinks are worked out.
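As a rough idea of what the calculator side of that could be: a small function that evaluates plain arithmetic deterministically, so the model only has to produce the expression, not the result. This is a sketch under assumptions. The name calculate is hypothetical, and how a model would actually call it depends on the vendor's tool-calling interface, which isn't shown here.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def calculate(expression: str):
    """Evaluate +, -, *, / and parentheses on numbers, without eval()."""
    def ev(node):
        if isinstance(node, ast.Constant):
            # Accept plain ints and floats only (bool is excluded).
            if type(node.value) in (int, float):
                return node.value
            raise ValueError("only numbers are allowed")
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")
    return ev(tree.body)

print(calculate("(3 + 4) * 5 / 7"))
```

Walking the parsed syntax tree instead of calling eval() means that function calls, names, and attribute access in the model's output are refused rather than executed.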

Mark with a Z

suppo.fi

LLMs don't and can't do math. They don't calculate anything, that's just not how they work. Instead, they do this:

2 + 2 = ? What comes after that? Oh, I remember! It's '4'!

It could be right, it could be wrong. If there's enough pattern in the training data, it could remember the correct answer. Otherwise it'll just place a plausible-looking value there (behavior known as AI hallucination). So, you cannot "trust" it.

msage reply

programming.dev

Every LLM answer is a hallucination.

CanadaPlus reply

lemmy.sdf.org

Some are just realistic to the point of being correct. It frightens me how many users have no idea about any of that.

NewNewAugustEast reply

lemmy.zip

A good one will interpret what you are asking and then write code, often Python, I notice, and then let that do the math and return the answer. A math problem should use a math engine, and that's how it gets around the limitation.

But really, why bother? Go ask Wolfram Alpha or just write the math problem in code yourself.

Greg Clarke reply

lemmy.ca

"They don’t calculate anything"

They calculate the statistical probability of the next token given the sequence of previous tokens.
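To make that concrete: in the usual setup the model outputs one raw score per candidate token, and a softmax turns those scores into probabilities. The sketch below uses invented scores for the prompt "2 + 2 =". The numbers in toy_logits are purely hypothetical and don't come from any real model.

```python
import math

def softmax(logits):
    """Turn a dict of raw token scores into probabilities summing to 1."""
    m = max(logits.values())
    # Subtracting the max keeps exp() from overflowing on large scores.
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores for what might follow "2 + 2 =".
toy_logits = {"4": 6.0, "5": 2.5, "22": 1.0, "fish": -3.0}

probs = softmax(toy_logits)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

Nothing in this step adds 2 and 2. "4" wins only because it has the highest score, which is why a wrong but plausible token can win whenever the scores are off.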

Zos_Kia reply

lemmynsfw.com

Actually no, they have some sort of "circuits" that approximate math, which is even more interesting imo. Still not reliable in the slightest, of course.

supersquirrel

sopuli.xyz

Why would I bother?

Calculators exist, logic exists, so no... LLMs are a laughably bad fit for directly doing math. They are bullshit engines: they cannot "store" a value without fundamentally exposing it to hallucinating tendencies, which is the worst property a calculator could possibly have.

tal reply

olio.cafe

"Why would I bother?"

Because you want to have a single interface that accepts natural-language input and gives answers.

That doesn't mean that using an LLM as a calculator is a reasonable approach, though a larger system that incorporates an LLM might be. But I think that the goal is very understandable. I have Maxima, a symbolic math package, on my smartphone and computers. It's quite competent at just about any sort of mathematical problem that a typical person might want to solve. It costs nothing. But you do need to learn something about the package to be able to use it, whereas a prompt that accepts natural-language input requires little that a typical member of the public doesn't already know. And that barrier is enough that most people won't use a package like Maxima.