Spyke
kbin.run

The thing I find funniest about this post is that you call this Italian

231
lseif
sopuli.xyz

how am i supposed to know how italians speak. i've never seen one

204

That’s right! None of us knows how Italians can speak in the dark 🤌

2
IronKrill
lemmy.ca

Blud I'm gonna be fr no cap rn but wtf does blud mean I've been meaning to ask for months and I still don't get it

9

Typical 'muricans being unable to comprehend anything besides English.

::: spoiler /s i don't mean to be racist yes i was a r/2we4u user, how'd you know? :::

2
lemm.ee

Typical AI behavior

Edit: and then it will gaslight you if you say the answer is the same.

54

Fucking hate when they do that.

You are repeating the same mistake.

I'm sorry for repeating the same mistake, here's a new solution with corrections *proceeds to write the exact thing I already told it was wrong*

17
Wappen
lemmy.world

Nope, they replaced an asterisk with an arrow!

14

Gotta remember they were trained off of the internet. Which is to say, the largest body of people loudly professing that their opinions are fact and refusing to say otherwise.

3
lemmy.world

This might be happening because of the 'elegant' (incredibly hacky) way openai encodes multiple languages into their models. Instead of using all character sets, they use a modulo operator on each character, to make all Unicode characters represented by a small range of values. On the back end, it somehow detects which language is being spoken, and uses that character set for the response. Seeing as the last line seems to be the same mathematical expression as what you asked, my guess is that your equation just happened to perfectly match some sentence that would make sense in the weird language.

68
PlexSheep
infosec.pub

Do you have a source for that? Seems like an internal detail a corpo wouldn't publish

31
feddit.de

Seriously? Python for massive amounts of data? It's a nice scripting language, but it's excruciatingly slow

3

There are bindings in Java and C++, but Python is the industry standard for AI. The libraries for machine learning are actually written in C++, but use Python language bindings. Python doesn't tend to slow things down since machine learning is GPU-bound anyway. There are also library-specific programming languages that urge the user to write pythonic code that can be compiled into C++.

6
NeatNit
discuss.tchncs.de

I suppose it's conceivable that there's a bug in converting between different representations of Unicode, but I'm not buying any of this "detected which language is being spoken" nonsense or the use of character sets. It would just use Unicode.

The modulo idea makes absolutely no sense, as LLMs use tokens, not characters, and there's soooooo many tokens. It would make no sense to make those tokens ambiguous.

18

I completely agree that it's a stupid way of doing things, but it is how OpenAI reduced the vocab size of GPT-2 & GPT-3. As far as I know (I have only read the comments in the source code), the conversion is done as a preprocessing step. Here's the code for GPT-2: https://github.com/openai/gpt-2/blob/master/src/encoder.py I did apparently make a mistake, as the vocab reduction is done through a LUT instead of a simple mod.
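For anyone who doesn't want to click through, here is a rough sketch of what a byte-level lookup table of this kind can look like. It is loosely modelled on the `bytes_to_unicode` helper in the linked file, written from memory, so treat it as an illustration and not as the actual source. `LUT`, `encode_text` and `decode_text` are names made up for this example.

```python
def bytes_to_unicode():
    """Build a 256-entry lookup table from byte values to printable characters."""
    # Bytes that are already printable, non-space characters map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every other byte (control characters, space, ...) is shifted to
    # code points 256 and up, so each byte gets a distinct visible stand-in.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


LUT = bytes_to_unicode()  # hypothetical name for this sketch


def encode_text(text):
    """Turn any string into stand-in characters, one per UTF-8 byte."""
    return "".join(LUT[b] for b in text.encode("utf-8"))


def decode_text(mapped):
    """Invert encode_text by mapping the stand-ins back to raw bytes."""
    inverse = {c: b for b, c in LUT.items()}
    return bytes(inverse[c] for c in mapped).decode("utf-8")
```

The point of the sketch: the table is a one-to-one mapping over all 256 byte values, so any Unicode text (Glagolitic included) survives a round trip through `encode_text` and `decode_text`, and nothing is made ambiguous the way a modulo would.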

9

Damn, wild Glagolitic script found. I didn't even realise it was in the Unicode standard.

63

Well, it certainly doesn't overflow on 32 bit systems

58
feddit.uk

It looks so badass. I could have been using that script now because I'm Ukrainian, but instead I have Cyrillic script, which is so boring

33
Match!!
pawb.social

rebel against Russian imperialism, return to glagolitic

5
Vitaly
feddit.uk

It's not Russian. If my Bulgarian friend is right, then it was created by a Bulgarian guy

3

There is no single person responsible for Cyrillic script. It is generally believed to have been created by the scholars of the Preslav Literary School, which was indeed in Bulgaria, by mixing and adapting the Greek and Glagolitic scripts. After a while, Peter the Great changed it a lot. And then Stalin stomped out almost all the deviations in the usage of the script.

The last part is mostly why it is considered Russian. A lot of languages suffered because of Moscow just forcing them to use the version of Cyrillic that Russians were using.

3

Cyrillic is literally Greek + Glagolitic, and it was partly a diplomatic creation of the Eastern Roman Empire (aka Byzantine Empire), in order to bring the Slavs culturally closer to them.

Russians have nothing to do with it, other than them claiming they are the continuation of the Eastern Roman Empire, something which is kinda laughable, but whatever, don't let your dreams be dreams.

2
lemm.ee

Ah, I see you're using FartGPT instead of ChatGPT

31
lemy.lol

You may not understand, but we do.
This secret will remain jealously guarded by the Italic people. ◉‿◉

14

There is no choice: if the last Italian were to leave us, then this information will have to leave humanity too

4
lemmy.zip

Taken literally, that implies you do care.

(To mitigate the pedantry: Given it's a rather dispassionate response in the context of a provocation, it is probably a very weak "care" though. Just because it's nonzero doesn't mean it's significant.)

2

Remember, whenever you break one spaghetto you break one heart 💔

2

Wow, an alien ion drive formula! Try to get warp drive out of it too!

9
feddit.uk

Kind of looks like the Georgian writing system, but I'm not sure

6
Allero
lemmy.today

No, this is Glagolitic script, an alternative to Cyrillic. Mostly used in old Slavic scriptures, was later replaced by Cyrillic and Latin.

Most Slavs themselves don't know how to read this

20
programming.dev

It's a dead script that was not that common in the first place. In Kievan Rus' it was so little known that it was even used as a form of encryption in the XI-XVI centuries. It is also very different from modern Cyrillic. So, saying "most Slavs don't know how to read it" is a bit of an understatement. No one knows how to read it, apart from some linguists and overzealous Witcher fans.

4
opfar.v30
lemmy.ml

It was widespread in Croatia until the late middle ages, about XIV-XV century.

No one knows how to read it, apart from some linguists and overzealous Witcher fans.

I could fluently read and write it in high school. Was bored.

3
programming.dev

Yea, Croatia is the only place it got widely used. Is it some kind of historical elective course in Croatian schools? Been a couple of times in Croatia, never seen Glagolitic in the wild, though. Maybe I wasn't looking hard enough.

2
opfar.v30
lemmy.ml

Is it some kind of historical elective course

No, there was a poster showing correspondence with Latin on the wall, somewhere. The symbols are almost 1-1 with modern orthography, so it takes only about a week of practice. And I was really bored.

never seen Glagolitic in the wild

It's about as distant from modern use as runes are for Germanic speakers, but maybe with different connotations. Decorative nonsense.

But I did submit essays written with that when I wanted to fail with style. :)

I also met a guy in college who used it to keep notes. That guy was also bored.

3
programming.dev

I guess I'll just add you guys to the "overzealous Witcher fans" and consider my point valid.

1

I mean regular people don't know how to read it, except if you randomly decided you wanted to. It's pretty big culturally, e.g. the Baška tablet is a very important piece of history written in glagolitic that everyone knows about, and I've seen the alphabet randomly displayed in a few places, but nobody actually uses it today.

1

Nah, Georgian is arcs and circles everywhere, like this: ეს ქართული დამწერლობაა.

16