Spyke

Comment on

Has anyone or anything ever passed the Turing Test? If so, how and why?

Reply in thread

I trust AI far more than I do a random person. They have access to far more information, and are more likely to be correct about any particular question asked.

That is a terrifying stance. And, frankly, embarrassing.

"OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws": https://www.computerworld.com/article/4059383/openai-admits-ai-hallucinations-are-mathematically-inevitable-not-just-engineering-flaws.html

OpenAI, the creator of ChatGPT, acknowledged in its own research that large language models will always produce hallucinations due to fundamental mathematical constraints that cannot be solved through better engineering, marking a significant admission from one of the AI industry’s leading companies. [...] The research proposed “explicit confidence targets” as a solution, but acknowledged that fundamental mathematical constraints meant complete elimination of hallucinations remained impossible.

View in thread

nostupidquestions

Comment on

Has anyone or anything ever passed the Turing Test? If so, how and why?

Reply in thread

I think the Wikipedia definition of thought is quite good.

However, I have a feeling whatever definition I came up with, you'd just claim LLMs fit into it because their output is sometimes somewhat coherent.

You can claim that technically LLMs "think" because the output text sometimes contains conclusions, and sometimes they're even rational, even though the LLMs still struggle with counting Rs in "strawberry".

I find that disingenuous because it implies that the LLM is in any way aware of anything, that it can passively form ideas.

Most importantly, it implies that you can trust it for even basic reasoning. That you can trust the plagiarism machine that tells you that you should put glue on your pizza, eat rocks and walk to the car wash instead of driving, or that you will be able to trust it at some point in the future.

Whatever definition of thinking we use, it should include a simple rule - that the allegedly thinking entity should demonstrate that intelligence by being able to reliably answer simple queries correctly. Humans, by and large, can do that. LLMs fail at it miserably. If the LLMs were truly thinking, that should be shocking. Understanding the underlying technology - and that it is not truly reasoning - makes it obvious and expected.

Even OpenAI admitted hallucinations are an unfixable mathematical inevitability - something you handwaved as a matter of time to fix. No, the fact that humans can have hallucinations is not comparable.

View in thread

nostupidquestions

Comment on

Has anyone or anything ever passed the Turing Test? If so, how and why?

Reply in thread

Well, I suppose we can at least agree to disagree.

I have seen so much incoherent but confident nonsense produced by LLMs (mainly by frontier models trying to do even basic software development) that I would not be able to say in good conscience that thought was involved. Junior developers would have done better. The experience definitely fits the behavior of a word predictor, though.

Having seen what LLMs claim about software development, my stance is that absolutely no one should trust at face value what these models output. They're Dunning-Kruger machines.

As for producing new ideas, these models are as creative as a random number generator. Coincidentally, that's what is responsible for faking their creativity (the "temperature" parameter).
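To make the "temperature" point concrete, here is a minimal sketch of temperature sampling as it is commonly described. The token list and scores are made up for illustration, and the function names are hypothetical; this is not any vendor's actual implementation.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature.

    Low temperature sharpens the distribution (near-deterministic),
    high temperature flattens it (more random picks).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Pick one token at random, weighted by the scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical next-word candidates with invented scores.
tokens = ["cat", "dog", "pizza"]
logits = [2.0, 1.0, 0.1]

rng = random.Random(0)  # seeded random number generator
cold = [sample_token(tokens, logits, 0.1, rng) for _ in range(10)]
hot = [sample_token(tokens, logits, 2.0, rng) for _ in range(10)]
print("temperature 0.1:", cold)
print("temperature 2.0:", hot)
```

The only source of variety here is the call to the random number generator; the temperature just controls how much weight the less likely candidates get.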

I guess that's all I feel like saying in this particular thread.

View in thread

nostupidquestions

Comment on

Has anyone or anything ever passed the Turing Test? If so, how and why?

Reply in thread

Word predictors don't think any more than a magic 8-ball does.

View in thread

nostupidquestions

Comment on

Has anyone or anything ever passed the Turing Test? If so, how and why?

Reply in thread

Except LLM output is largely gibberish. Just confident gibberish. There's a reason we call it "AI slop".

LLM responses are only ever "sound" when they're regurgitating existing information they were trained on. Beyond some simple transformations, they are unable to create original ideas. They very frequently break down on somewhat unique tasks, as evidenced by the ever-prevalent code-slop which is eroding our software.

They don't have a memory of previous conversations (unless you literally copy-paste it into the prompt), they don't learn (Claude "memories" is literally just copy-pasting a summary into the prompt, only automatically). They don't have any "thoughts" of their own between prompts (OpenClaw just keeps prompting them to pretend they are autonomous).

The underlying implementation of "reasoning" in LLMs is literally "hallucinate some more text which vaguely looks like thoughts and hope that influences the answer". LLMs are probabilistic models that we have figured out how to make produce somewhat correct-looking answers at a rate a little higher than chance.

Magic 8-balls sometimes give sound responses. Do they think? Where do we draw the line with this interpretation of "thinking"?

View in thread

nostupidquestions

Comment on

Has anyone or anything ever passed the Turing Test? If so, how and why?

Reply in thread

No, humans are not word predictors, and my claim is absolutely not an oversimplification.

LLMs are word predictors. No amount of attention heads and backpropagation is going to change that. Scientific researchers agree.

The human brain works in a completely different way from LLMs, and conflating the two like you did is disingenuous.

View in thread

nostupidquestions

Comment on

Has anyone or anything ever passed the Turing Test? If so, how and why?

Reply in thread

I hold a MSc from what is arguably the most prestigious University in Europe

Good for you. Have a cookie, I guess?

LLM do not simply regurgitate existing content, and are in fact capable of creating wholly new content not seen before.

Citation needed.

Hallucinations occur when their context buffer is too small, and as time goes on, it will largely be a thing of the past.

A whole book of citations needed. That claim is wildly inconsistent with the consensus about AI hallucinations.

Magic Eight Balls, as I'm sure you're aware, have a limited, predetermined number of responses.

You mean like how LLMs keep hallucinating the same passwords and nonexistent dependencies, to the point that bad actors are using that fact to compromise vibe-coded systems via techniques like slopsquatting?

I would disagree with you, and would suspect you are basing your assessment of their abilities on dated usage.

In fact, I keep experimenting with frontier models (including Fable when it was available) just so that the "but we've made so much progress in the past few months" argument can't be used against me. You're wildly overselling their capabilities.

View in thread

world

Comment on

Putin Says NATO Is Preparing for War—and Compares Western Countries to Nazi Germany in 1941

And we're reporting this why...?

View in thread

asklemmy

Comment on

Are you gonna be buying a Steam machine?

Reply in thread

The PS5 is subsidized to get you into the ecosystem. Valve lets you play anything you like on the Steam Machine, so there is no incentive for them to sell at a loss.

I imagine they went for the less powerful GPU because hardware prices are insane across the board right now and Valve has no negotiating power with the manufacturers, as they said.

View in thread

hackernews

Comment on

SpaceX sheds $400B in market value as debut rally hits reverse

Reply in thread

Doesn't work.

View in thread

technology

Comment on

*deleted by creator*

There are good reasons to dislike Telegram, but having "just" 30 engineers is not one of them. Software development is not a chair factory: more people does not mean more or better work, just as nine women can't deliver a baby in one month.

Edit:

Galperin told TechCrunch. “‘Thirty engineers’ means that there is no one to fight legal requests, there is no infrastructure for dealing with abuse and content moderation issues.”

I don't think fighting legal requests and moderating content are an engineer's job. However, the article can't seem to get straight whether it's 30 engineers or 30 staff overall. In the latter case, the context changes dramatically, and I don't have the knowledge to tell whether 30 staff is enough to deal with legal issues. I would imagine that Telegram would need a small army of lawyers and content moderators for that. Again, not engineers, though.

196

View in thread

programming

Comment on

80% of programmers are NOT happy… why? - YouTube

A part of it is horrible practices and a work culture which incentivizes them.

Who can be happy when the code doesn't work half the time, deployments are manual and happen after work hours, and devs are forced to be "on-call"?

Introduce Test-Driven Development, Domain-Driven Design, Continuous Deployment with Feature Flags, Mutation Testing and actual agile practices (as described in the Agile Manifesto, not the pathetic attempt to rebrand waterfall we have in most companies) to the project and see how happiness rises, along with the project's reliability and maintainability.

Oh, and throw in a 4-day work week, because no one can be mentally productive for five days straight.

IMO the biggest problem in the industry is that most developers have never seen a project actually following best practices and middle management is invested in making sure it never happens.

View in thread

gaming

Comment on

"We made a series of mistakes": GOG apologise for emailing Nazi symbols to people in newsletter about Slavic fantasy game

The bigger issue for me is that the apology was issued on Twitter. What is GOG still doing on Twitter?

View in thread

memes

Comment on

If you beat *THIS* mission, you probably are a gaming God

I just beat this level yesterday!

It becomes easy... once you know what the tricks are supposed to be, which the game doesn't tell you at all.

For me, these were the tips I needed:

There's a dedicated button for burnout, which makes it super easy to do the 360.
The slalom only counts if you do the pillars on one side of the garage BOTH WAYS.
To do a backwards 180, drive backwards, then push one direction, then halfway through push the other direction.

Supposedly the PSX version also has a video in the options menu which shows you a dev completing the course, with button prompts on screen.

Oh, and there's a cheat code in-game to skip this level entirely.

View in thread

technology

Comment on

Las Vegas' dystopia-sphere, powered by 150 Nvidia GPUs and drawing up to 28,000,000 watts, is both a testament to the hubris of humanity and an admittedly impressive technical feat | PC Gamer

Reply in thread

Does this really make it any less worthy of criticism, though...?

View in thread

games

Comment on

Ubisoft has cancelled 6 games, including the Prince of Persia: The Sands of Time remake | VGC

Reply in thread

Given the original announcement footage, it might be for the best...

View in thread

technology

Comment on

Visa plugs its payment network into ChatGPT

“I think we’re generally at a place where most people are very comfortable with the shopping aspects of it [ChatGPT] and have discovered this as a superior discovery experience,” Forestell said in an interview.

Gaslighting at every step.

View in thread

technology

Comment on

Elon Musk's X pushes Trump tags on all US users

Hopefully this will enrage the users enough to go and actually vote against Trump.

View in thread

games

Comment on

GOG is Getting Acquired By Its Original Co-Founder: What It Means For You

Reply in thread

I really want them to bring back self-hosting. Multiplayer games don't need to have a limited lifespan.

View in thread

technology

Comment on

Nvidia CEO: Everybody should use AI. Society has no choice but to change. I used to play in the streets. When cars came along, you obviously can’t play in the streets now

Why are we sharing this? Why is this important? CEO of Oreo says you should eat 10 Oreos a day to stay healthy. Shocker.

View in thread
