Spyke
mander.xyz

A lot of people are replying as if OP asked a question. It's a link to a blog post explaining why a kilobyte is 1000 and not 1024 bytes (exactly as the title says!). OP knows the answer, in fact they know it so well they wrote an extensive post about it.

Thank you for the write up! You should re-check the spelling and grammar as some sections had some troubles. I have a sentence I need to go to the post to get, so let me edit this later!

Edit: the second half of this sentence is a mess: "The factors don’t solely consist of twos, but ten are certainly lot of them." Otherwise nothing jumped out at me but I would reread it just in case!

107
LOGIC💣
lemmy.world

I also assume that people are answering that way because they thought it was a question.

However, it's also possible that they saw it described as a 20 minute read, and knew that the answer actually takes about 10 seconds to read, and figured that they'd save people 19 minutes and 50 seconds.

46
kbin.social

However, it’s also possible that they saw it described as a 20 minute read

Bit of a tangent and anecdotal, but I went back into higher education a few years ago. I'm middle-aged, I was surrounded by younger people. We're asked to read an article, everyone starts reading. I read it through, underline the important bits, I'm done reading. I look around. Everyone's still reading. Oh well, they'll be done soon. Nope. I think it took most of them 15 minutes to read an article I'd read in under 5. I was a bit perplexed. This is higher education, these aren't idiots, these are people who should be able to read articles quickly.

There are plenty of reports of functional literacy decreasing. That children are slower at reading and are less able to understand what they've read. Anecdotally, it seems like younger generations really aren't used to reading longer articles anymore. I grew up reading books as a kid. That's what we did before phones and the internet. I wonder if younger generations simply don't have that much experience reading, which is why it takes them so long to read, which is why they read even less.

In the case of this article, they see 20 minutes, they're scared off. So they simply guess what was in the article. That's pretty worrying if that's what people do. If you're unable or unwilling to read longer stuff, you're likely to make ill informed choices or be more easily influenced.

-2
lemmy.world

I read slowly. It sucks, but it's not from lack of experience or lack of education. Reading speed seems a weird metric to start wondering if people lack intelligence.

Being able to read quickly is a valuable skill. I don't think I could handle jobs like editing, policy making, or lawyering simply because there are not enough hours in the day to make up for my reading deficit.

Of course, your anecdote is about a group, and mine is about one person. But the sweeping conclusion (even if it isn't a firm one) on generations irks me. Every generation has its outliers. There will never be a generation without hardworking geniuses in every active field. As far as I know, you are an outlier in your generation, and the comparison simply fails. Maybe peers you knew personally didn't get the cold judgment of intelligence by reading speed that you are applying to kids you don't have a relationship with.

I don't know. I will never dismiss the importance of reading. But you sound like Lucy here.

13
LOGIC💣
lemmy.world

I read relatively slowly, but I have the ability to read much faster. I simply like reading more slowly. I have this weird suspicion that people who read very quickly are getting information more quickly, but that they're either not absorbing it fully, or they're not enjoying it as much as I do. But that's obviously a biased perspective.

6
lemmy.world

TLDR: old person went back to school and reads faster than younger people, thinks younger people don't know how to read quickly.

12
lemmy.world

they see 20 minutes, they’re scared off

I'm not "scared off". I'm on Lemmy to have discussions, not to read articles. If I want to read articles I'll get a magazine.

-1
wischi
programming.dev

It's true that the actual "story" is very short. 1 kB is 1000 bytes and 1 KiB is 1024 bytes. But the post is not about this, but about why calling 1024 a kilobyte always was wrong even in a historical context and even though almost everybody did that.

-11
LOGIC💣
lemmy.world

It’s true that the actual “story” is very short. 1 kB is 1000 bytes and 1 KiB is 1024 bytes. But the post is not about this, but about why calling 1024 a kilobyte always was wrong even in a historical context and even though almost everybody did that.

Yes. But it does raise the question of why you didn't say that in either your title:

Why a kilobyte is 1000 and not 1024 bytes

or your description:

I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.

This one is about why it was a mistake to call 1024 bytes a kilobyte. It’s about a 20min read so thank you very much in advance if you find the time to read it.

Feedback is very much welcome. Thank you.

The title and description were your two chances to convince people to read your article. But what they say is that it's a 20 minute read for 10 seconds of information. There is nothing that says there will be historical context.

I get that you might want to make the title more clickbaitey, but why write a description out if you're not going to tell what's actually in the article?

So, that's my feedback. I hope this helps.

One other bit of closely-related feedback, for your writing, in general. Always start with the most important part. Assume that people will stop reading unless you convince them otherwise. Your title should convince people to read the article, or at least to read the description. The very first part of your description is your chance to convince people to click through to the article, but you used it to tell an anecdote about why you wrote the article.

I'm the kind of person who often reads articles all the way through, but I have discovered that most people lose interest quickly and will stop reading.

22
wischi
programming.dev

I tried to make the title the exact opposite of clickbait. There are no unanswered questions on purpose. No "Find out if a kilobyte is 1024 bytes or 1000 bytes". I think people are smart enough to expect that I don't just reiterate for 20min why a kilobyte is 1000 bytes but instead go into more detail.

The main problem is probably that people won't sacrifice 20min of their time on something when they are not sure it's a good read, but the only thing I can do is try to encourage them to read it anyway.

There are no ads, no tracking, no cookies, no login, no newsletter, no paywall. I don't benefit if you read it. I'd like to clear up misconceptions but I can't force people to read it.

-11
LOGIC💣
lemmy.world

I don’t benefit if you read it.

You don't benefit financially, but there are other benefits. For example, you specifically asked for feedback, and you have received some.

10
wischi
programming.dev

I don't get feedback just because you read it. I'm thankful for feedback but my sentence was accurate. I don't benefit if you read it.

-14

Every part of your comment has something factually wrong or fallacious.

I don’t get feedback just because you read it.

My reading the part I am giving feedback on is a prerequisite for actually giving feedback. I am obviously a person who graciously responded to your request, not somebody that you somehow ordered to give feedback. I don't know what you think you gain from viewing it this way.

I’m thankful for feedback but my sentence was accurate.

I didn't say it was inaccurate, but that it didn't tell people why to read the article. You didn't ask me to tell you inaccuracies. You asked for "feedback". You also don't seem to be thankful, because if you were thankful, you'd simply accept the feedback instead of throwing up straw-man arguments.

I don’t benefit if you read it.

You have exactly repeated your previous statement that I already proved wrong.

I will offer you one last piece of feedback. Just stop arguing. You can never look gracious pursuing an argument where you ask for advice and then argue with people who took time out of their day to help you.

Upvotes and downvotes don't determine whether people are factually right, but they do help you gauge what people think when they read your comments, and what I'm seeing is that you're not ingratiating yourself to the people who you are asking to read your article. Even if you could win this argument, and you can't, you wouldn't want to, because you'd look bad in doing so. When you ask for feedback, and feedback is given, just graciously accept it. If it's bad feedback, then just ignore it.

7

But that's also a simple answer: kilo is a metric prefix that means 1000, so kilobyte means 1000 bytes. The historical context is the history of the metric system, which is much older than modern computers.

4

This is a great example of how a lot of people don't read the posts they are replying to.

This is even more prevalent when arguments break out in the comments where people misunderstand each other or argue about things that one side said that they qualified later in the original comment but the other side didn't read the whole comment and instead hyperfocused on that one sentence that really garbled their goolies.

I trust that none of these people would have read the article even if they had realised it was there.

P.S. I fully agree with you. It's a great blog post. Good write-up. Very informative. The only quibble I have is that I've always loved the words mebibyte, gibibyte, etc.

7

Thank you very much. I'll try to fix that sentence later. I'm not a native speaker so it's not always obvious to me when a sentence doesn't sound right even though I pass sentences I'm not sure about through spell checks, MS Word grammar check and ChatGPT 🤣

7

A lot of people are replying as if OP asked a question.

I think part of that is because outgoing links without a preview image are really easy to confuse with text-only posts, particularly because Reddit didn't allow adding both a text and a link simultaneously. Though in this case the text should've tipped people off that there's a link as well.

As for the actual topic, I agree with OP. I often forget to do it right when speaking, but I try to at least get it right when writing.

2
lemmy.world

Well it’s because computer science has been around for 60+ years and computers are binary machines. It was natural for everything to be base 2. The most infuriating part is that drive manufacturers arbitrarily started calling 1000 bytes a kilobyte, 1000 kilobytes a megabyte, 1000 megabytes a gigabyte, and 1000 gigabytes a terabyte, when until then 1 TB was 1099511627776 bytes. They did this simply because it made their drives appear 10% bigger. So good ol’ shrinkflation. You could make drives 10% smaller and sell them for the same price.
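To put rough numbers on the "10% bigger" claim, here is a minimal sketch that compares 1024^n with 1000^n at each prefix level. The function name `binary_excess_percent` is made up for this illustration.

```python
# Compare binary multiples (1024**n) with decimal multiples (1000**n).
# binary_excess_percent is a hypothetical helper name used only here.
PREFIXES = ["kilo", "mega", "giga", "tera"]


def binary_excess_percent(n: int) -> float:
    """Return how much larger 1024**n is than 1000**n, in percent."""
    return (1024**n / 1000**n - 1) * 100


for n, name in enumerate(PREFIXES, start=1):
    print(f"{name}: 1024^{n} is {binary_excess_percent(n):.2f}% larger than 1000^{n}")
```

The gap starts at 2.4% for kilo and only approaches 10% at the tera level, so the size of the discrepancy depends on which prefix is being compared.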

70
wischi
programming.dev

If a hard drive has exactly 8'269'642'989'568 bytes what's the benefit of using binary prefixes instead of decimal prefixes?

There is a reason for memory like caches, buffer sizes and RAM. But we don't count printer paper with binary prefixes because the printer communication uses binary.

There is no(!) reason to label hard drive sizes with binary prefixes.

-30

It more accurately describes how much space you have and how you can expect to see it shown in your software when you actually install it somewhere.

8

So here’s the thing. I don’t necessarily disagree with you. And if this had been done from the start it would never have been a problem. But it wasn’t and THAT is what caused the confusion. You put a lot of thought and research into your post and I can very much respect that. It’s something you feel strongly about and you took the time to write about your beef with this. IEC changed the nomenclature in the late 90s. But the REASON they changed it was to avoid the confusion caused by the drive manufacturers (I bet you can guess who was in the committee that proposed the change).

But I can tell you as a professional IT person we never really expect any drive (solid state or otherwise) to be any specific size. RAID, file system overhead, block size fragmentation, etc all take a cut. It’s basically just bistromathics (that’s a Hitchhiker’s reference) and the overall size of any storage system is only vaguely related to actual drive size.

So I just want to basically apologize for being so flippant before. It’s important enough to you that you took the time to write this. It’s just that I’m getting rather cynical as I get older and just expect the enshittification of everything digital to continue ad infinitum.

6
wischi
programming.dev

Pretty obvious that you didn't read the article. If you find the time I'd like to encourage you to read it. I hope it clears up some misconceptions and makes it clearer why even in those 60+ years it was always intellectually dishonest to call 1024 bytes a kilobyte.

You should at least read "(Un)lucky coincidence"

-70
lemmy.world

Ok so I did read the article. For one I can’t take an article seriously that is using memes. Thing the second yes drive manufacturers are at fault because I’ve been in IT a very very long time and I remember when HD manufacturers actually changed. And the reason was greed (shrinkflation). I mean why change, why inject confusion where there wasn’t any before. Find the simplest least complex reason and that is likely true (Occam's razor). Or follow the money usually works too.

It was never intellectually dishonest to call it a kilobyte, it was convenient and was close enough. It’s what I would have done and it was obviously accepted by lots of really smart people back then so it stuck. If there was ever any confusion it’s by people who created the confusion by creating the alternative (see above).

If you wanna be upset you should be upset at the gibi, kibi, tebi nonsense that we have to deal with now because of said confusion (see above). I can tell you for a fact that no one in my professional IT career of over 30 years has ever used any of the **bi words.

You can be upset if you want but it is never really a problem for folks like me.

Hopefully this helps…

20
grayman
lemmy.world

Pushing 30 years myself and I confirm literally not a single person I've worked with has ever used **bi... terms. Also, I recall the switch where drive manufacturers went from 1024 to 1000. I recall the poor attempt from shill writers in tech saying it better represents the number of bits as the format parameters applied to a drive changes the space available for files. I recall exactly zero people buying that excuse.

3
lemmy.ml

I just think that kilobyte should have been 1000 (in binary, so 8 in decimal) bytes and so on. Just keep everything relating to the binary storage in binary. That couldn't ever become confusing, right?

1

Because your byte is 10 decimal bits, right? EDIT: Bit is actually an abbreviation, BIT, initially, so it would be what, DIT?.. Dits?..

1
λλλ
programming.dev

kilobit = 1000 bits. Kilobyte = 1000 bytes.

How is anything about that intellectually dishonest??

The only ones being dishonest are the drive manufacturers, like the person above said. They sell storage drives by advertising them in the byte quantity but they're actually in the bit quantity.

-21

They sell storage drives by advertising them in the byte quantity but they're actually in the bit quantity.

No, they absolutely don’t. That’d be off by 8x.

The subject at hand has nothing to do with bits. Please, read what OP posted. It’s about 1024 vs 1000

30

Calling 1024 a kilo is intellectually dishonest. Your conversion is perfectly fine.

-26
lemmy.world

I genuinely don't understand your disdain for using base 2 on something that calculates in base 2. Do you know how counting works in binary? Every byte is made up of 8 bits, and goes from 0000 0000 to 1111 1111, or 0-255. When converted to larger scales, 1024 bytes is a clean mathematical derivation in base 2, 1000 is a fractional number. Your pedantry seems to hinge on the use of the prefix, right? I think 1024 is a better representation of kilo- in base 2, because a kilo- can be directly translated up to exabytes and down to nybbles while "1000" in base 2 is extremely difficult. The point of metric is specifically to facilitate easy measuring, right? So measuring in the units that the computer uses makes perfect sense. It's like me saying that a kilogram should be measured in base 60, because that was the original number system.
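A quick way to see the binary side of this argument, assuming 8-bit bytes: the sketch below prints the largest value one byte can hold and the binary representations of 1000 and 1024. The helper `is_power_of_two` is a name chosen for this example.

```python
# Assumes 8-bit bytes; is_power_of_two is a hypothetical helper for illustration.
BITS_PER_BYTE = 8
max_byte_value = 2**BITS_PER_BYTE - 1  # all eight bits set


def is_power_of_two(n: int) -> bool:
    """True when n has exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


print(f"one byte holds 0..{max_byte_value}")
for n in (1000, 1024):
    print(f"{n} = {bin(n)}, power of two: {is_power_of_two(n)}")
```

1024 is a single set bit followed by ten zeros, while 1000 needs six set bits, which is the sense in which 1024 is the "clean" value in base 2.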

55
psud
lemmy.world

TLDR: the problem isn't using base 2 multipliers. The problem is doing so then saying it's a base 10 number

In 1998 when the problem was solved it wasn't a big deal, but now the difference between a gigabyte and a gibibyte is large enough to cause problems

7

Using kilo- in base 2 for something that calculates in base 2 simply makes sense to me. However, like I said to OP, ultimately this debate amounts to rage bait for nerds. All I ask is that I'm not pedantically corrected if the conversation isn't directly related to kibi- vs kilo-

3
wischi
programming.dev

Did you read the post? The problem I have is redefining the kilo because of a mathematical fluke.

You certainly can write a mass in base 60 and kg, there is nothing wrong with that. But calling 3600 grams a "kilogram" because you think it's convenient that 3600 (60^2) is "close to" 1000 would be wrong, and that's exactly what's happening with binary and 1024.

If you find the time you should read the post and if not at least the section "(Un)lucky coincidence".

-44
rockSlayer
lemmy.world

I started reading it, but the disdain towards measuring in base 2 turned me off. Ultimately though this is all nerd rage bait. I'm annoyed that kilobytes aren't measured as 1024 anymore, but it's also not a big deal because we still have standardized units in base 2. Those alternative units are also fun to say, which immediately removes any annoyance as soon as I say gibibyte. All I ask is that I'm not pedantically corrected if the discussion is about something else involving amounts of data.

I do think there is a problem with marketing, because even the most know-nothing users are primed to know that a kilobyte is measured differently from a kilogram, so people feel a little screwed when their drive reads 931GiB instead of 1TB.

28
lemmy.world

Yeah I’m with you, I read most of it but I just don’t know where the disdain comes from. At most scales of infrastructure anymore you can use them interchangeably because the difference is immaterial in practical applications.

Like if I am going to provision 2TB I don’t really care if it’s 2000 or 2048GB, I’ll be resizing it when it gets to 1800 either way, and if I needed to actually store 2TB I would create a 3TB volume, storage is cheap and my time calculating the difference is not.

Wait until you learn about how different fields use different precision levels of pi.

10

It's not 2000 Vs 2048. It's 1,862 Vs 2048

The GB get smaller too.
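For anyone who wants to check the 1,862 figure, here is a small sketch. It assumes "2TB" means 2 × 1000^4 bytes in the decimal case and 2 × 1024^4 bytes in the binary case, and expresses both in units of 1024^3 bytes.

```python
# Unit sizes in bytes; the variable names are chosen for this example.
TB = 1000**4
TiB = 1024**4
GiB = 1024**3

decimal_2tb_in_gib = 2 * TB / GiB   # a "2 TB" volume counted in GiB
binary_2tib_in_gib = 2 * TiB / GiB  # a "2 TiB" volume counted in GiB

print(f"2 TB  = {decimal_2tb_in_gib:.1f} GiB")
print(f"2 TiB = {binary_2tib_in_gib:.1f} GiB")
```

The gap between the two readings is a bit under 10% of the volume, which matches the tera-level difference between the two conventions.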

7
kbin.social

I was confused when I just read the headline. Should be "Why I (that would be you not me) think a kilobyte should be 1000 instead of 1024". Unpopular opinion would be a better sub for it.

43
wischi
programming.dev

You should read the blog post. It's not a matter of opinion.

-76
lemmy.world

It totally is a matter of opinion. These are arbitrary rules, made up by us. We can make up whatever rules we want to.

I agree that it's weird that only in CS kilo means 1024. It would be logical to change that, to keep consistency across different fields of science. But that does not make it any less a matter of opinion.

45
lemmy.world

You can't store data in base 10, nor address memory or storage in base 10 given present computers. It's a bit more than a matter of opinion that computers are base 2

-4

Yes computers are base 2 but we can still make up whatever rules we want about them. We could even make up rules that say that we are to consider everything a computer does to be in base 10 but it can only use the lowest 2 values of any given digit. It would be a total mess and it would make no sense whatsoever but we could define those rules.

4
kbin.social

I know there is no option as 1024 is what the standard is now. I'm not reading that any more than someone saying how a red light really means go.

14
lemmy.world

1024 is not the standard. The standard term for 1024 is "kibi" or "Ki" and the standard term for 1000 is "kilo" and has been since the year 1795.

There was a convention to use kilo for 1024 in the early days of computing since the "kibi" term didn't exist until 1998 (and took a while to become commonly used) — but that convention was always recognised as an incorrect use of the term. People just didn't care much especially since kilobytes were commonly rounded anyway. A 30,424 byte file is 29.7109375 kibibytes or 30.424 kilobytes... both will likely be rounded to 30 either way, so who cares if it's slightly wrong? Just use bytes if you need to know the exact size.
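The file-size example above is easy to reproduce. This sketch just divides the same byte count by 1024 and by 1000 and rounds both results.

```python
# The 30,424-byte file from the comment above.
size_bytes = 30_424

kib = size_bytes / 1024  # kibibytes (binary convention)
kb = size_bytes / 1000   # kilobytes (SI convention)

print(f"{size_bytes} bytes = {kib} KiB = {kb} kB")
print(f"rounded: {round(kib)} KiB vs {round(kb)} kB")
```

Both values round to 30 at this scale, which is why the difference was easy to ignore for small files.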

Also - hard drives, floppy disks, etc have always referred to their size in base 1000 numbers so if you were working with 30KB in the early days of computers it was very rarely RAM. A PDP-11 computer, for example, might have only had 8192 bytes of RAM (that's 8 kibibytes).

There are some places where the convention is still used and it can be pretty misleading as you work with larger numbers. For example 128 gigs equals 128,000,000,000 bytes (if using the correct 1000 unit) or 137,438,953,472 bytes (if kilo/mega/giga = 1024).
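The two readings of "128 gigs" can be checked directly. A minimal sketch, with variable names chosen for this example:

```python
# Bytes per unit under each convention.
GB = 1000**3   # SI gigabyte
GiB = 1024**3  # binary gibibyte

decimal_bytes = 128 * GB
binary_bytes = 128 * GiB
difference = binary_bytes - decimal_bytes

print(f"128 GB  = {decimal_bytes:,} bytes")
print(f"128 GiB = {binary_bytes:,} bytes")
print(f"difference = {difference:,} bytes ({difference / decimal_bytes:.1%})")
```

At this size the two conventions differ by more than 9 billion bytes, so the choice of convention is no longer lost in rounding.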

The "wrong" convention is commonly still used for RAM chips. So a 128GB RAM chip is significantly larger than a 128GB SSD.

6

I've never met anyone that actually uses the new prefixes for 1024 and the old prefixes to mean 1000

4

Also - hard drives, floppy disks, etc have always referred to their size in base 1000 numbers

That is not true. For a long time everything (computer related) was in the base 2 variants. Then the HD manufacturers changed so their drives would appear larger than they actually were (according to everyone's notions of what KB/MB/GB meant). It was a marketing shrinkflation stunt.

3

Here’s my favorite part.

“In addition, the conversions were sometimes not even self-consistent and applied completely arbitrarily. The 3½-inch floppy disk for example, which was marketed as “1.44 MB”, was actually not 1.44 MB and also not 1.44 MiB. The size of the double-sided, high-density 3½-inch floppy was 512 bytes per sector, 18 sectors per track, 160 tracks, that’s 512×18×160 = 1’474’560 bytes. To get to “1.44” you must first divide 1’474’560 by 1024 (“bEcAuSE BiNaRY obviously”) to get 1440 and then divide by 1000 for perfect inconsistency, because dividing by 1024 again would get you an ugly number and we definitely don’t want that. We finally end up with “1.44”. Now let’s add “MB” because why the heck not. We already abused those units so much it’s not like they still mean anything and it’s “close enough” anyways. By the way, that “close enough” excuse never worked when I was in school but what would I know compared to the computer “scientists” back then.”

When things get that messy, numbers don’t even mean anything any more. Might as well just label the products using entirely qualitative terms like “big” or “bigger”.
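The floppy arithmetic in the quote can be replayed step by step. This sketch uses the geometry given there (512 bytes per sector, 18 sectors per track, 160 tracks) and shows the marketed figure next to the consistent SI and binary values.

```python
# Geometry as stated in the quoted passage.
BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 18
TRACKS = 160

total_bytes = BYTES_PER_SECTOR * SECTORS_PER_TRACK * TRACKS

marketed = total_bytes / 1024 / 1000  # mixed convention behind "1.44"
true_mb = total_bytes / 1000**2       # consistent SI megabytes
true_mib = total_bytes / 1024**2      # consistent binary mebibytes

print(f"total: {total_bytes} bytes")
print(f"marketed: {marketed}, MB: {true_mb}, MiB: {true_mib}")
```

Only the mixed 1024-then-1000 division lands on 1.44. The consistent conversions give about 1.47 MB or about 1.41 MiB.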

39
lemmy.world

Thanks for this article. Unfortunately, you used the word “prefix” when you really meant “unit symbol”. So, “kilo” and “mega” are prefixes, kB and MB are unit symbols. You repeatedly called the latter “prefixes”.

37
wischi
programming.dev

Thank you for the feedback. I know that only the "first" part is the prefix and I tried to be careful to not use it wrong. I just checked all 53 instances of "prefix" and I don't see a wrong one, but to be fair there are situations that could be misunderstood easily like here:

Today the only correct conversions are to either use SI prefixes (like 1 MB = 1000² bytes) or binary prefixes (1 MiB = 1024² bytes).

But with prefix I only meant the "M" and "Mi" part and they are both prefixes.

I'll try to clarify that later so the difference is clear to all readers. Thank you.

1
Tatters
lemmy.world

Ok, I understand what you are trying to do, but that is not how I read it at the time. Prefix to me in this context means e.g., “kilo” in “kilobyte”, and not the “k” in “kB”. I am not sure it is helpful to split the unit symbol up like that.

9

In terms of language you are correct. But in terms of SI usage it seems to me OP is expressing it correctly. The SI unit prefixes have a name, a symbol and a multiplier. The prefix is a concept that encompasses all three of those attributes. So "kilo" is one way of identifying the 10^3 unit prefix, but the name kilo is not the prefix itself. It's just the name we use to refer to it. And the symbol k in km is certainly the unit prefix portion of that unit of measure.

1

But the first part is called a prefix even in the standard itself. I wanted to make that distinction because it's not important what the base unit is. By speaking about prefixes instead of the unit as a whole I wanted to make it clear that you can (at least in theory) use any base unit. So everything I said about KiB and kB is also true for Kib and kb and even for kK (kilokelvin) and KiK (kibikelvin) 🤣

-6
lemm.ee

While we're nitpicking, the post says multiple times that SI prefix symbols are "all uppercase except for kilo (k)".

That's just factually wrong. More than half of them are lowercase! There's centi- (c), micro- (µ), nano- (n), etc. On the positive side there's even deca- (da) and hecto- (h), though they aren't particularly common or useful. I did at least see milli- (m) and bit (b) mentioned in a brief note though.

Obviously context matters and only the positive powers from kilo upward are relevant in computer science. But I studied chemistry and physics so I guess it irked me to see the statement repeatedly ignore all the negative powers of ten.

Overall, good rant though 😅 I'll be more careful to use KiB and MiB from here out when appropriate.

1

❤️ Thank you for taking the time to read it. And thank you so much for pointing that out, you are completely right and I totally didn't think about that while writing the article, probably because negative exponents are pretty rare in computer science (as in milli-bytes, etc.). I'll fix that in a few days. Thanks again for pointing that out.

2

when you format a 256GB drive and find out that you don't actually have 256GB

Most of the time you have at least 256GB. It's just that 256GB = 238.4GiB, and Windows reports GiB but calls them GB. You wouldn't have that problem on macOS, which counts GB properly, or GNOME, which counts GiB and calls them GiB.

(This is ignoring the few MB that it takes to format a drive, but that's also space on the disk and you're the one choosing to partition and format the drive. If you dumped a file straight into the drive you'd get that back, but it would be kind of inconvenient)
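To illustrate how the same drive ends up with two different labels, here is a rough sketch of a size formatter that can use either convention. `format_size` is a hypothetical helper written for this comment, not how any particular operating system does it.

```python
# format_size is an illustrative helper, not an OS or library API.
def format_size(n_bytes: int, binary: bool = False) -> str:
    """Format a byte count with SI (1000) or IEC (1024) prefixes."""
    base = 1024 if binary else 1000
    units = (["B", "KiB", "MiB", "GiB", "TiB"] if binary
             else ["B", "kB", "MB", "GB", "TB"])
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < base or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= base
    return f"{value:.1f} {units[-1]}"  # not reached; keeps type checkers happy


drive = 256 * 1000**3  # a drive sold as "256 GB"
print(format_size(drive))               # SI prefixes
print(format_size(drive, binary=True))  # IEC prefixes
```

The byte count is identical in both calls. Only the divisor and the unit label change, and the confusion starts when the binary number is shown with the "GB" label.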

7

Here's the summary for the wikipedia article you mentioned in your comment:

Both the British imperial measurement system and United States customary systems of measurement derive from earlier English unit systems used prior to 1824 that were the result of a combination of the local Anglo-Saxon units inherited from Germanic tribes and Roman units. Having this shared heritage, the two systems are quite similar, but there are differences. The US customary system is based on English systems of the 18th century, while the imperial system was defined in 1824, almost a half-century after American independence.


-2
wischi
programming.dev

So why don't they just label drives in Terabit instead of terabyte. The number would be even bigger. Why don't Europeans also use Fahrenheit, with the bigger numbers the temperature for sure would instantly feel warmer 🤣

Jokes aside. Even if HDD manufacturers benefit from "the bigger numbers", using the 1000 conversion is objectively the only correct answer here, because there is nothing intrinsically base 2 about hard drives. You should give the blog post a read 😉

-26
lemmy.world

there is nothing intrinsically base 2 about hard drives

did you miss the part where those devices store binary data?

20
wischi
programming.dev

Binary prefixes (the ones with 1024 conversions) are used to simplify numbers that are exact powers of two - for example RAM and similar types of memory. Hard drive sizes are never exact powers of two. The fact that disks store bits has nothing to do with the size of the disk.

-13
lemmy.world

sure, but one of the intrinsic properties of binary data is that it is in binary sized chunks. you won't find a hard drive that stores 1000 bits of data per chunk.

8

The "chunk" is often 32,768 bits these days and it never matches the actual size of the drive.

A 120 GB drive might actually be closer to 180 GB when it's brand new (if it's a good drive - cheap ones might be more like 130 GB)... and will get smaller as the drive wears out with normal use. I once had a HDD go from 500 GB down to about 50 GB before I stopped using it - it was a work computer and only used for email so 50 GB was when it actually started running out of space.

HDD / SSD sellers are often accused of being stingy - but the reality is they're selling a bigger drive than what you're told you're getting.

2

Look up the exact number of bytes and then explain to me what the benefits are of using 1024 conversions instead of 1000 for a hard drive?

-7

Not even SSDs are. Do you have an SSD? You should look up the exact drive size in bytes, it's very likely not an exact power of two.

1
wewbull
feddit.uk

there is nothing intrinsically base 2 about hard drives

Yes there is. The addressing protocol. Sectors are 512 (2⁹) bytes, and there's an integer number of them on a drive.
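Both halves of this argument can be checked against the example drive size quoted earlier in the thread (8'269'642'989'568 bytes). The sketch below tests whether that size is a whole number of 512-byte sectors and whether it is itself a power of two. The helper name is chosen for illustration.

```python
# Example drive size quoted earlier in the thread; 512-byte sectors assumed.
DRIVE_BYTES = 8_269_642_989_568
SECTOR_BYTES = 512


def is_power_of_two(n: int) -> bool:
    """True when n has exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


sectors, remainder = divmod(DRIVE_BYTES, SECTOR_BYTES)

print(f"sectors: {sectors:,}, leftover bytes: {remainder}")
print(f"drive size is a power of two: {is_power_of_two(DRIVE_BYTES)}")
print(f"{DRIVE_BYTES / 1000**4:.2f} TB vs {DRIVE_BYTES / 1024**4:.2f} TiB")
```

For this drive the sector count is an integer, but the total is not a power of two, so neither 8.27 TB nor 7.52 TiB comes out as a round number.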

17

That's true, but the entire disk size is not an exact power of two. That's why binary prefixes (1024 conversion) don't have any benefit whatsoever when it comes to hard drives. With memory it's a bit different because, unlike with storage devices, RAM size is always exactly a power of two.

-16
lemmy.world

Kilo = 1000

Byte = Byte

Kilobyte = 1000 bytes

Kibibyte = 1024 bytes

36

This is why I only use nibbles. At least it's not spelled funny. But, unfortunately, it sounds like dogfood... Kibinibbles.

2
lemmy.ml

I know it's already been explained but here is a visualization of why.

1 2 4 8 16 32 64 128 256 512 1024

26
wischi
programming.dev

Did you read the blog post? If you don't find the time you should at least read "(Un)lucky coincidence" to see why it's not (and never was) a bright idea to call 1024 "a kilo".

-48
Crashumbc
lemmy.world

No we didn't read your click bait and have no interest in doing so.

19

Dude you're pretty condescending for a new author on an old topic.

Yeah I read it and it's very over worded.

1024 was the closest binary approximation of 1000 so that became the standard measurement. Then drive manufacturers decided to start using decimal for capacity because it was a great way to make numbers look better.

Then the IEC decided "enough of this confusion" and created binary naming standards (kibi gibi etc...) and enforced the standard decimal quantity values for standard names like kilo-.

It's not ground breaking news and your constant arguing with people in the thread paints you as quite immature. Especially when plenty of us remember the whole story BECAUSE WE LIVED IT AS IT PROFESSIONALS.

We lacked a standard, a system was created. It was later changed to match global standard values.

You portray it with emotive language making decisions out to be stupid, or malicious. A decision was made that was perfectly sensible at the time. It was then improved. Some people have trouble with change.

Your writing and engagement styles scream of someone raised on clickbait news. Focus on facts, not emotion and sensationalism if you want to be taken seriously in tech writing.

Focus on emotion and bullshit if you want to work for BuzzFeed.

And if you just want an argument go use bloody twitter.

15
lemmy.world

You asked for feedback, so here is my feedback:

The article is okay. I read most of it, but not all of it, because it seemed overly worded for the sentiment. It could have been condensed quite a bit. I would argue the focus should be more on the fact that there should be a standard in technical documentation, OS's, specification sheets, etc. That's the part that impacts most people, and the reason they should care. But that kind of gets lost in all the text.

Your replies here come off as pretty condescending. You should anticipate most people not reading the article before commenting. Just pay them no attention, or reiterate what you already stated in the article. You shouldn't just say "did you read the article" and then "it's in this section of the article". Just like how people comment on youtube before watching the video, people will comment on the topic without reading the article.

Maybe they didn't realize it was an article, maybe they knew it was an article and chose not to read it, or maybe they read it and disagree with some of the things you said. It's okay for people to disagree with something you said, even if you sincerely believe something you said isn't a matter of opinion (even though it probably is). You can agree to disagree and move on with your life.

25
wischireply
programming.dev

Thank you for taking the time to read it and your feedback.

Your replies here come off as pretty condescending.

That was definitely never my intention but a lot of people here said something similar. I should probably work on my English (I'm not a native speaker) to phrase things more carefully.

You shouldn't just say "did you read the article" and then "it's in this section of the article"

It never crossed my mind this could be interpreted in a negative way. I tried to gauge if someone read it and still disagreed or if someone didn't read it and disagrees, because those situations are two different things, at least for me. The hint with the sections was also meant as a pointer because I know that most people won't read the entire thing but maybe have 5 minutes on their hands to read the relevant section.

4
Corrreply
lemm.ee

Most native English speakers tend to take blunt statements/questions negatively due to the culture (especially true in North America).

I enjoyed reading the article but I would agree with the above commenter that it may be a bit lengthy. Generally speaking writing tends to be more engaging in this format if it's a bit more concise, both as a whole and on a per sentence basis.
There was also a typo somewhere, I think "the" instead of another word, I read the article a few hours ago now so I can't remember, sorry. I don't think I would have guessed you were not a native English speaker from the article. Overall, I liked it and congratulations on putting something out there!

3

Thank you for taking the time to read it ❤️. I'm currently out of office. I'll try to find and fix the typo you mentioned once I'm back, thanks for pointing it out.

2
lemmy.world

I feel bad for you OP, I get this a lot and I'm totally gonna go there because I feel your pain and your article was fantastic! I read almost every word ;p

This phenomenon stems from an aversion to high-confidence people who make highly logical arguments from low self-confidence people who basically make themselves feel unworthy/inadequate when justly critiqued/busted. It makes sense for them to feel that way too, I empathize. It's hard to overcome the vapid rewarding and inflation in school. They should feel cheated and insolent at this whole situation.

I'll be honest in front of the internet; people (in majority mind you, say 70-80% of Americans, I'm American) do not read every word of the article with full attention because of ever present and prevalent distractions, attention deficit, and motivation. They skip sentences or even paragraphs of things they are expecting they already know, apply bias before the conclusion, do not suspend their own perspective to understand yours for only a brief time, and come from a skeptical position no matter if they agreed with it or not!

In general, people also want to feel they have some valid perspective "truth" (as it's all relative to them...) of their own to add and they want to be validated and acknowledged for it, as in school.

Guess what though, Corporations, Schools, Market Analysis, Novelists, PR people, Video Game Makers, Communications Managers and Small and Medium Business already know this! They even take a much more, ehh, progressive? approach about it, let's say. That is, to really not let them speak/feedback, at all. Nearly all comment sections are gone from websites, comment boxes are gone from retail shops, customer service is a bot, technical writers make videos now to go over what they just wrote, Newspapers write for 4th graders, etc., etc.

Nothing you said is even remotely condescending and nothing you said was out of order. Don't defend yourself in these situations because it's just encouragement for them to do it again. Don't take it personally yourself, that is just the state of things.

Improvise, Adapt, Re-engineer, Re-deploy, Overcome, repeat until done.

-4

"I am smart."... "Most people have an attention span the length of a yo mama joke."... "Ramble ramble yada yada yada."

1
lemmy.world

you can't ask for feedback, then attack everyone who doesn't share your opinion with "did you read it?", that's not cool...

22
wischireply
programming.dev

I still don't get how "did you read it?" is attacking anyone? It's true I asked for feedback but I'm a bit overwhelmed that I had to clarify that I'm interested in feedback about the post from people who actually read it.

-6

"the tone makes the music" as the Germans would say. you're asking for volunteer help and are rude to the ones replying

6
lemmy.world

A kilobyte (kB) is 1000 bytes, that's what the prefix kilo means. A kibibyte (KiB) is 1024 bytes (the "bi" in the prefix means base 2 or binary). People often confuse them, but they're similar enough for smaller units, 10^3 ~ 2^10.

Oh and at first, kilobyte was used for both amounts, which is why kibibytes were introduced to fix the confusion, which perhaps was a bit late anyway.
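To put a number on "similar enough for smaller units", here is a minimal sketch that compares 1000^n with 1024^n for the first few prefixes. The function name `divergence` and the prefix list are made up for illustration.

```python
# SI prefixes are powers of 1000; IEC binary prefixes are powers of 1024.
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi", "peta/pebi"]

def divergence(n: int) -> float:
    """Return how much larger 1024**n is than 1000**n, in percent."""
    return (1024**n / 1000**n - 1) * 100

for n, name in enumerate(PREFIXES, start=1):
    print(f"{name:10s} 1000^{n} vs 1024^{n}: {divergence(n):5.2f}% apart")
```

The gap starts at 2.4% for kilo/kibi and grows with every step, which is why the mismatch is easy to ignore for kilobytes and hard to ignore for terabytes.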

22

True and that's what the article is about. You should check out the interactive diagram in the "(Un)lucky coincidence" section.

-34
programming.dev

The mistake is thinking that a 1000 byte file takes up 1000 bytes on any storage medium. The mistake is thinking that it even matters if a kB means 1000 or 1024 bytes. It only matters for some programmers, and to those 1024 is the number that matters.

Disregarding reality in favor of pedantics is the real mistake.

20

I suggest considering this from a linguistic perspective rather than a technical perspective.

For years (decades, even), KB, MB, GB, etc. were broadly used to mean 2^10, 2^20, 2^30, etc. Throughout the 80s and 90s, the only place you would likely see base-10 units was in marketing materials, such as those for storage media and modems. Mac OS exclusively used base-2 definitions well into the 21st century. Windows, as noted in the article, still does. Many Unix/POSIX tools do, as well, and this is unlikely to change.

I will spare you my full rant on the evils of linguistic prescriptivism. Suffice it to say that I am a born-again descriptivist, fully recovered from my past affliction.

From a descriptivist perspective, the only accurate way to define kilobyte, megabyte, etc. is to say that there are two common usages. This is what you will see if you look up the words in any decent dictionary. e.g.:

I don't recall ever seeing KiB/MiB/etc. in the 90s, although Wikipedia tells me they "were defined in 1999 by the International Electrotechnical Commission (IEC), in the IEC 60027-2 standard".

While I wholeheartedly agree with the goal of eliminating ambiguity, I am frustrated with the half-measure of introducing unambiguous terms on one side (KiB, MiB, etc.) while failing to do the same on the other. The introduction of new terms has no bearing on the common usage of old terms. The correct thing to have done would have been to introduce two new unambiguous terms, with the goal of retiring KB/MB/etc. from common usage entirely. If we had KiB and KeB, there'd be no ambiguity. KB will always have ambiguity because that's language, baby! regardless of any prescriptivist's opinion on the matter.

Sadly, even that would do nothing to solve the use of common single-letter abbreviations. For example, Linux's ls -l -h command will return sizes like 1K, 1M, 1G, referring to the base-2 definitions. Only if you specify the non-default --si flag will you receive base-10 values (again with just the first letter!). Many other standard tools have no such options and will exclusively use base-2 numbers.
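As a rough illustration of the two conventions behind single-letter suffixes, here is a sketch of a formatter that can step by 1024 or by 1000. `human_size` is a hypothetical helper written for this example; its rounding and suffix rules are assumptions and are not those of `ls`.

```python
def human_size(num_bytes: int, si: bool = False) -> str:
    """Format a byte count with a single-letter suffix.

    si=False divides by 1024 per step, si=True divides by 1000.
    Hypothetical helper for illustration, not a reimplementation of ls.
    """
    base = 1000 if si else 1024
    value = float(num_bytes)
    for suffix in ["B", "K", "M", "G", "T"]:
        # Stop at the first suffix where the value fits, or at the largest one.
        if value < base or suffix == "T":
            return f"{value:.1f}{suffix}"
        value /= base

print(human_size(1_000_000))           # base-2 steps
print(human_size(1_000_000, si=True))  # base-10 steps
```

The same one million bytes comes out with a "K" suffix in one mode and an "M" suffix in the other, and nothing in the output tells the reader which base was used.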

19

Here's the summary for the wikipedia article you mentioned in your comment:

In the study of language, description or descriptive linguistics is the work of objectively analyzing and describing how language is actually used (or how it was used in the past) by a speech community. All academic research in linguistics is descriptive; like all other scientific disciplines, it seeks to describe reality, without the bias of preconceived ideas about how it ought to be. Modern descriptive linguistics is based on a structural approach to language, as exemplified in the work of Leonard Bloomfield and others. This type of linguistics utilizes different methods in order to describe a language such as basic data collection, and different types of elicitation methods.


4
startrek.website

Because SI prefixes are always powers of the base. Base 10 is the most common, but that's more human psychology than math.

16
wischireply
programming.dev

SI prefixes are literally just base ten and not really about human psychology.

-6

I think they mean it's easier to refer to powers of 1000 with the SI units, rather than of 1024 as with Kibi and the lot. Especially higher up in the prefixes, because it starts to diverge more and more from the expected value.

7
lemmy.world

Because a kilo is 1000. That's why you have kibi, mebi, gibi binary prefixes for those times where 1024 (power of 2's) matter.

16

To me the bigger problem is the fact we don't have a written standard. Idc what people say, but if you buy a 10TB hard drive, then plug it in and the OS doesn't show 10TB, then it can be easy to blame the drive manufacturer when the OS is just using a different prefix quantity, but calling it the same. There should be some way to know exactly how many bytes there are on a drive before you buy it, and it should match when you plug it into your computer. I don't think that's crazy, but the article is a little overboard for that sentiment

1
lemmy.ml

It's a scam by HDD makers to sell less storage for more money.

15

that's what it was initially, reporting decimal 'megabytes' for hdd capacity. lawsuits and settlements followed.

the dust settled and what we have now is disclaimers on storage products (from the legal settlements) and they continue to use 'decimal' measurements...

and we also have a different set of prefixes for 'binary' units of measurements (standards body trying to address the problem of confusion): kibi, mebi, gibi, tebi, pebi, exbi; which are not widely used yet.. the 'old' ones are for decimal but still commonly used for binary.

4
wischireply
programming.dev

Did you read the blog post? It's not a scam. HDD vendors might profit from "bigger numbers" but using the units they do is objectively the only sensible and correct option. It's like saying that the weather report is in Fahrenheit because in Celsius the numbers would be lower and feel somehow colder 🤣

If it would be about bigger numbers why don't HDD manufacturers just use Terabit instead of terabyte? The "bigger number" argument is not a good one.

-17

Because it's much easier to mistake a number for a somewhat close number than one that is orders of magnitude different...

I'll try to read the article later but the reality is that HDD manufacturers could help customers disambiguate but that would hurt their bottom line so they don't.

9

Videogame companies literally did use "megabit" when the truth was "128KiB", because it sounded better. Actual computer companies were still listing binary power numbers, because buyers had more to invest and care about accuracy.

You say "sensible", but it's lying for profit.

2
smo
lemmy.sdf.org

This has been my pet rant for a long time, but I usually explain it .. almost exactly the other way around to you.

You can essentially start off with nothing using binary prefixes. IBM's first magnetic harddrive (the IBM 350 - you've probably seen it in the famous "forklifting it into a plane" photo) stored 5 million characters. Not 5*1024*1024 characters, 5,000,000 characters. This isn't some consumer-era marketing trick - this is 1956, when companies were paying half a million dollars a year (2023 inflation-adjusted) to lease a computer. I keep getting told this is some modern trick - doesn't it blow your mind to realise hdd manufacturers have been using base10 for nearly 70 years? Line-speed was always base 10, where 1200 baud laughs at your 2^n fetish (and for that matter, baud comes from telegraphs, and was defined before computers existed), 100Mbit ethernet runs on a 25MHz clock, and speaking of clocks - kHz, MHz, MT/s, GT/s etc are always specified in base 10. For some reason no-one asks how we got 3GHz in between 2 & 4GHz CPUs.

As you say, memory is the trouble-maker. RAM has two interesting properties for this discussion. One is that it heavily favours binary-prefixed "round numbers", traditionally because no-one wanted RAM with un-used addresses because it made address decoding nightmarish (tl;dr; when 8k of RAM was usually 8x1k chips, you'd use the first 3 bits of the address to select the chip, and the other 10 bits as the address on the chip - if chips didn't use their entire address space you'd need to actually calculate the address map, and this calculation would have to run multiple times faster than the cpu itself). The second is that RAM was the first place non-CSy types saw numbers big enough for k to start becoming useful. So for the entire generation that started on microcomputers rather than big iron, memory-flavoured-k were the first k they ever tasted.
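The 8x1k case can be sketched in a few lines: with power-of-two chip sizes, "decoding" is nothing more than slicing the address into two bit fields. The names `CHIP_BITS`, `OFFSET_BITS` and `decode` are made up for this example, and real hardware does this with wiring rather than code.

```python
CHIP_BITS = 3     # 2**3 = 8 chips
OFFSET_BITS = 10  # 2**10 = 1024 bytes per chip

def decode(address: int) -> tuple[int, int]:
    """Split a 13-bit address into (chip index, offset within the chip)."""
    if not 0 <= address < 1 << (CHIP_BITS + OFFSET_BITS):
        raise ValueError("address outside the 8k range")
    chip = address >> OFFSET_BITS                 # top 3 bits select the chip
    offset = address & ((1 << OFFSET_BITS) - 1)   # low 10 bits go to the chip
    return chip, offset

print(decode(0x0000))  # first byte of chip 0
print(decode(0x1FFF))  # last byte of chip 7
print(decode(5000))    # somewhere inside chip 4
```

No arithmetic beyond a shift and a mask is needed. If each chip held, say, 1000 bytes instead of 1024, the split would need a division and a remainder for every access.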

I mean, hands up who had a computer with 8-64k of RAM and a cassette deck. You didn't measure the size of your stored program in kB, but in seconds of tape.

This shortcut then leaked into filesystems purely as an implementation detail - reading disk blocks into memory is much easier if you're putting square pegs into square holes. So disk sectors are specified in binary sizes to enable them to fit efficiently into memory regions/pages. For example, CP/M has a 128-byte disk buffer between 0x080 and 0x100 - and its filesystem uses 128-byte sectors. Not a coincidence.

This is where we start getting into stuff like floppy disk sizes being utter madness. 360k & 720k were 720 and 1440 512-byte sectors. When they doubled up again, 2880 512-byte sectors gave us 1440k - and because nothing is ever allowed to make sense (or because 1.40625M looks stupid), we used base10 to call this 1.44M.
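The floppy arithmetic is easy to check. This sketch assumes 512-byte sectors and sector counts of 720, 1440 and 2880; `describe` and the `mixed_M` label are made-up names for the example.

```python
SECTOR = 512  # bytes per sector

def describe(sectors: int) -> dict:
    """Express a sector count in several competing 'units'."""
    total = sectors * SECTOR
    return {
        "bytes": total,
        "KiB": total / 1024,
        "MiB": total / 1024**2,
        "MB": total / 1000**2,
        "mixed_M": total / (1000 * 1024),  # 1000 x 1024 bytes, the "1.44M" unit
    }

for sectors in (720, 1440, 2880):
    print(sectors, describe(sectors))
```

For 2880 sectors this gives 1,474,560 bytes, which is 1440 KiB, 1.40625 MiB, 1.47456 MB, and 1.44 only when one factor is 1000 and the other is 1024.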

So it's never been that computers used 1024-shaped-k's. It should be a simple story of "everything uses 1,000s, except memory because reasons". But once we started dividing base10-flavoured storage devices into base2-flavoured sectors, we lost any hope of this ever looking logical.

10

aside: the little-k thing. SI has a beautifully simple rule, capital letters for prefixes >1, small letters for prefixes <1. So this disambiguates between millivolts (mV) and megavolts (MV).

But, and there's always a but. The kilogram was the first SI unit, before they'd really thought it through. So we got both a lower-case k breaking such a beautifully simple rule, and the kilogram as a base unit instead of a gram. The Kilogram is metric's "screw it, we'll do it live".

Luckily this is almost a non-issue in computing as a fraction of a bit never shows up in practice. But! If you had a system that took 1000 seconds to transfer one bit, you could call that a millibit per second, or mbps, and really mess things up.

8
lemmy.world

2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. It's pretty fucking logical m8. You know what's not logical? Base 10

1
fedia.io

This whole mess regularly frustrates me... why can't the units be used consistently?!

The other peeve of mine with this debacle is that drive capacities using SI units do not use the full available address space (since it's binary). Is the difference between 250GB and 256GiB really used effectively for wear-levelling (which only applies to SSDs) or spare sectors?

10
Lmaydevreply
programming.dev

Power of 2 makes more sense to the computer. 1000 makes more sense to people.

11
fedia.io

Of course. The thing is, though, that if the units had been consistent to begin with, there wouldn't be anywhere near as much confusion. Most people would just accept MiB, GiB, etc. as the units on their storage devices. People already accept weird values for DVDs (~4.37GiB / 4.7GB), so if we had to use SI units then a 256GiB drive could be marketed as a ~275GB drive (obviously with the non-rounded value in the fine print, e.g. "Usable space approx. 274.8GB").

6

They were consistent until around 2005 (it's an estimate) when drives got large enough where the absolute difference between the two forms became significant. Before that everyone in computing used base 2 prefixes.

I bet OP does too when talking about RAM.

4
wischireply
programming.dev

It's not as simple as that. A lot of "computer things" are not exact powers of two. A prominent example would be HDDs.

-9
Lmaydevreply
programming.dev

In terms of storage 1000 and 1024 take the same amount of bits to represent. So from a computer point of view 1024 makes a lot more sense.

It's just a binary Vs decimal thing. 1000 is not nicely represented in binary the same as 1024 isn't in decimal.

Edit: was talking about storing the actual number.
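A quick check of what "storing the actual number" costs, assuming that means holding the value in an integer field. This only uses Python built-ins; the 16-bit field width is an assumption picked for the example.

```python
for n in (1000, 1024):
    # bit_length() is the minimum number of bits needed for the value.
    print(n, bin(n), n.bit_length(), "bits minimum")

# Both values fit in one 16-bit unsigned integer (maximum 65535),
# so in a fixed-width field they occupy the same space.
print(1000 < 2**16 and 1024 < 2**16)
```

Strictly, 1000 needs 10 bits and 1024 needs 11, so the "same amount" only holds once the value is padded out to a common field width such as 16 or 32 bits.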

6
lemmy.world

In terms of storage 1000 and 1024 take the same amount of bytes.

What? No. A terabyte in 1024 units is 8,796,093,022,208 bits. In 1000 units it's 8,000,000,000,000 bits.

The difference is substantial with larger numbers.
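The two figures can be reproduced directly, assuming 8 bits per byte:

```python
BITS_PER_BYTE = 8

tebibyte_bits = 1024**4 * BITS_PER_BYTE  # "terabyte" in 1024 units
terabyte_bits = 1000**4 * BITS_PER_BYTE  # terabyte in 1000 units

print(f"{tebibyte_bits:,}")
print(f"{terabyte_bits:,}")
print(f"difference: {tebibyte_bits - terabyte_bits:,} bits")
```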

4

Both require the same amount of bits again. So the second one makes more sense for a computer.

-4
nousreply
programming.dev

Huh? What does how a drive size is measured affect the available address space used at all? Drives are broken up into blocks, and each block is addressable. This is irrespective of whether you measure it in GB or GiB and does not change the address or block size. Hell, you can have a block size in binary units and the overall capacity in SI units and it does not matter - that is how it is typically done with typical block sizes being 512 bytes, or 4096 (4KiB).

Or have anything to do with wear leveling at all? If you buy a 250GB SSD then you will be able to write 250GB to it - it will have some hidden capacity for wear-leveling, but that could be 10GB, 20GB, 50GB or any number they want. No relation to unit conversions at all.
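A small sketch of the block view, assuming 512-byte blocks and that every block gets one address. `blocks_and_bits` is a made-up helper; the bit counts are the bare minimum needed to number the blocks, not what any real interface uses.

```python
BLOCK = 512  # bytes per block (a binary size)

def blocks_and_bits(capacity_bytes: int) -> tuple[int, int]:
    """Return (number of blocks, minimum bits needed to address them all)."""
    blocks = -(-capacity_bytes // BLOCK)      # ceiling division
    bits = max(1, (blocks - 1).bit_length())  # highest address is blocks - 1
    return blocks, bits

print(blocks_and_bits(250 * 1000**3))  # 250 GB
print(blocks_and_bits(256 * 1024**3))  # 256 GiB
print(blocks_and_bits(20 * 1000**4))   # 20 TB
```

A capacity stated in SI units still divides into whole binary-sized blocks here. The 250 GB and 256 GiB cases need the same minimum number of address bits, and the 20 TB case needs more, so the address width has to be sized for the largest drive rather than for any one capacity.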

9
fedia.io

Huh? What does how a drive size is measured affect the available address space used at all? Drives are broken up into blocks, and each block is addressable.

Sorry, I probably wasn't clear. You're right that the units don't affect how the address space is used. My peeve is that because of marketing targeting nice round numbers, you end up with (for example) a 250GB drive that does not use the full address space available (since you necessarily have to address up to 256GiB). If the units had been consistent from the get-go, then I suspect the average drive would have just a bit more usable space available by default.

My comment re wear-levelling was more to suggest that I didn't think the unused address space (in my example of 250GB vs 256GiB) could be excused by saying it was taken up by spare sectors.

0

(for example) a 250GB drive that does not use the full address space available

Current drives do not have different sized addressable spaces and a 256GiB drive does not use the full address space available. If it did then that would be the maximum size a drive could be. Yet we have 20TB+ drives and even those are nowhere near the address size limit of storage media.

then I suspect the average drive would have just a bit more usable space available by default.

The platter size might differ to get the same density and the costs would also likely be different. Likely resulting in a similar cost per GB, which is the number that generally matters more.

My comment re wear-levelling was more to suggest that I didn’t think the unused address space (in my example of 250GB vs 256GiB) could be excused by saying it was taken up by spare sectors.

There is a lot of unused address space - there is no need to come up with an excuse for it. It does not matter what size the drive is they all use the same number of bits for addressing the data.

Address space is basically free, so not using it all does not matter. Putting in extra storage that can use the space does cost however. So there is no real relation between the address spaces and what space is on a drive and what space is accessible to the end user. So it makes no difference in what units you use to market the drives on.

Instead the marketing has been incredibly consistent - way back to the early days. Physical storage has essentially always been labeled in SI units. There really is no marketing conspiracy here. It's just the way it was always done. And why was it picked that way to begin with? Well, that was back in the day when binary units were not as common and physical storage never really fit the doubling pattern like other components such as RAM. You see all sorts of random sizes in early storage media, so SI units, I guess, did not feel out of place.

1

You know what else is frustrating? Time units. It’s like we’re back in the pre-SI days again. Try to compare the flow rates of two pumps when one is 123 m^3/h and the other is 1800 l/min. The French tried to fix this mess too while they were at it, but somehow we’re still stuck with this archaic mess.
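For the two pumps, the comparison only works after converting to a common unit. A minimal sketch, with a made-up function name:

```python
def m3_per_hour_from_l_per_min(l_per_min: float) -> float:
    """Convert litres per minute to cubic metres per hour."""
    litres_per_hour = l_per_min * 60  # 60 minutes per hour
    return litres_per_hour / 1000     # 1 m^3 = 1000 litres

pump_a = 123.0                               # already in m^3/h
pump_b = m3_per_hour_from_l_per_min(1800.0)  # converted to m^3/h
print(pump_a, pump_b, "A is larger" if pump_a > pump_b else "B is larger")
```

The factor of 60 is the non-decimal part: 1800 l/min works out to 108 m^3/h, so the 123 m^3/h pump is the bigger one even though its number looks smaller.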

1

The other peeve of mine with this debacle is that drive capacities using SI units do not use the full available address space (since it’s binary).

The "full available address space" goes down as the drive gets older and bad sectors are removed.

With a good drive, it might take ten or more years before you actually see the "size" of the drive shrink, but that's mostly because your 500GB drive actually had something like 650GB of storage when it was brand new.

1
kbin.run

Nice to learn about the SI standard notation KiB, MiB, etc. I had no idea.

7

KiB and MiB are not SI prefixes but IEC binary prefixes but the names are derived from the SI names for simplicity.

-1
lemmy.world

I was taught 1024 in my tech school. So I won’t ever refer to it as 1000 instead of 1024. Not that it seems even remotely relevant though.

5
PupBirureply
kbin.social

kilobyte (KB) is 1000, kibibyte (KiB) is 1024

at least according to the IEC, and i'd tend to go with them… SI units say that kilo means 1000

3
PsychedSyreply
sh.itjust.works

That was a retcon, though. Initially the SI prefixes were used and used 1024 instead of 1000. I feel like people started getting more fussy about it as hard drives started hitting hundreds of gb.

23
lemmy.world

Initially the SI prefixes were used and used 1024 instead of 1000

Only CPUs and RAM use 1024. Floppy disks and hard drives going way back to the 1970's used 1000. In software, both are used depending on the context (and also obviously depending on the software). Most modern operating systems use 1024 for RAM and 1000 for file sizes (in the early days of computing, that agreed upon approach didn't exist, and it varied from one computer to the next).

@smokin_shinoby's tech school was shit. There has never been consistency on this issue and it's really sad that they failed to teach both numbering systems as they are (and always were) widely used.

0
Eyronreply
lemmy.world

How do you define a retcon? Were kilograms 1024 grams, too? When did that change? It seems it's meant 1000 since metric was created in the 1700s, along with a binary prefix.

From the looks of it, software vendors were trying to retcon the definition of "kilo" to be 1024.

-1
PsychedSyreply
sh.itjust.works

Kilo was used outside of decimal power rules for data storage/memory because it could only use binary powers at smaller scales. Well, that's the standard we went with anyway.

They didn't 'retcon' the use of kilo as applicable to other units, they went with the closest power of two. When hard drive manufacturers decided to use powers of ten it confused people and eventually got standardized by making kb power of ten and kib power of two.

From the looks of it you aren't familiar with the situation.

7
Eyronreply
lemmy.world

This is all explained in the post we're commenting on. The standard "kilo" prefix, from the metric system, predates modern computing and even the definition of a byte: 1700s vs 1900s. It seems very odd to argue that the older definition is the one trying to retcon.

The binary usage in software was/is common, but it's definitely more recent, and causes a lot of confusion because it doesn't match the older and bigger standard. Computers are very good at numbers, they never should have tried to hijack the existing prefix, especially when it was already defined by existing International standards. One might be able to argue that the US hadn't really adopted the metric system at the point of development, but the usage of 1000 to define the kilo is clearly older than the usage of 1024 to define the kilobyte. The main new (last 100 years) thing here is 1024 bytes is a kibibyte.

Kibi is the retcon. Not kilo.

-2
wewbullreply
feddit.uk

Kilo meaning 1,000 inside computer science is the retcon.

Tell me, how much RAM do you have in your PC. 16 gig? 32 gig?

Surely you mean 17.18 gig? 34.36 gig?

5

abhibeckert in this thread had a good point. Floppies used the power of ten prefixes, so it wasn't particularly consistent.

1

209GB? That probably doesn't include all of the RAM: like in the SSD, GPU, NIC, and similar. Ironically, I'd probably approximate it to 200GB if that was the standard, but it isn't. It wouldn't be that much of a downgrade to go to 200GB from 192GiB. Is 192 and 209 that different? It's not much different from remembering the numbers for a 1.44MiB floppy, 1.544Mbps T1 lines, or ~3.14159 pi approximation. Numbers generally end up getting weird: trying to keep it in binary prefixes doesn't really change that.

The definition of kilo being "1000" was standard before computer science existed. If they used it in a non-standard way: it may have been common or a decent approximation at the time, but not standard. Does that justify the situation today, where many vendors show both definitions on the same page, like buying a computer or a server? Does that justify the development time/confusion from people still not understanding the difference? Was it worth the PR reaction from Samsung, to: yet again, point out the difference?

It'd be one thing if this confusion had stopped years ago, and everyone understood the difference today, but we're not: and we're probably not going to get there. We have binary prefixes, it's long past time to use them when appropriate-- but even appropriate uses are far fewer than they appear: it's not like you have a practical 640KiB/2GiB limit per program anymore. Even in the cases you do: is it worth confusing millions/billions on consumer spec sheets?

1
PsychedSyreply
sh.itjust.works

I'm not sure if you just didn't read or what. It seems like you understand the history but are insistent on awkward characterizations of the situation.

Kibi is the retcon. Not kilo.

I mean kibi is the retcon because it made all previous software wrong.

They didn't modify the use of kilo for other units - they used it as an awkward approximation with bytes. No other units were harmed in the making of these units.

And they didn't hijack it - they used the closest approximation and it stuck. Nobody gave a fuck until they bought a 300gb hd with 277gb of free space.

4

Nobody gave a fuck until they bought a 300gb hd with 277gb of free space

The difference was a lot smaller when you were dealing with 700 byte files - it was often a rounding error. Also - you needed two sectors (1024 bytes at the time) to store your 700 byte file, so what did it matter anyway? If you want to get really specific, you actually needed three sectors - because there's metadata on the file... however the metadata will share space with other files so does that count?
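The "two sectors for a 700 byte file" point is just rounding up to whole sectors. A sketch, assuming 512-byte sectors by default and ignoring metadata entirely; `allocated_bytes` is a made-up name.

```python
def allocated_bytes(file_size: int, sector_size: int = 512) -> int:
    """Space taken by file data when rounded up to whole sectors.

    Ignores metadata, which a real filesystem also has to store somewhere.
    """
    sectors = -(-file_size // sector_size)  # ceiling division
    return sectors * sector_size

print(allocated_bytes(700))         # two 512-byte sectors
print(allocated_bytes(1000))        # also two 512-byte sectors
print(allocated_bytes(1000, 4096))  # one 4096-byte block
```

A 700 byte file and a 1000 byte file both occupy 1024 bytes with 512-byte sectors, so at this scale the allocation granularity matters more than whether "k" means 1000 or 1024.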

Filesystems are incredibly complex and there's no way they can be explained to a lay person. Storage is and always has been an approximation.

It's even worse with RAM these days - my Mac has 298TB of memory address space currently allocated... but only between 6GB and 7GB of "app memory" in use (literally fluctuating between those two from one second to the next when I'm not even doing anything but watching the memory usage).

1

To me, your attempt at defending it or calling it a retcon is an awkward characterization. Even in your last reply: now you're calling it an approximation. Dividing by 1024 is an approximation? Did computers have trouble dividing by 1000? Did it lead to a benefit of the 640KB/320KB memory split in the conventional memory model? Does it lead to a benefit today?

Somehow, every other computer measurement avoids this binary prefix problem. Some, like you, seem to try to defend it as the more practical choice compared to the "standard" choice every other unit uses (e.g: 1.536 Mbps T1 or "54" Mbps 802.11g).

The confusion this continues to cause does waste quite a bit of time and money today. Vendors continue to show both units on the same specs sheets (open up a page to buy a computer/server). News still reports differences as bloat. Customers still complain to customer support, which goes up to management, and down to project management and development. It'd be one thing if this didn't waste time or cause confusion, but we're still doing it today. It's long past time to move on.

The standard for "kilo" was 1000 centuries before computer science existed. Things that need binary units have an option to use, but it's probably not needed: even in computer science. Trying to call kilo/kibi a retcon just seems to be trying to defend the use of the 1024 usage today: despite the fact that nearly nothing else (even in computers) uses the binary prefixes.

1
kbin.social

That's a relatively recent change though. AFAIK KB=1024 and MB=1024^2 was more common. As the article mentions, it's still commonly used in some sectors:

https://www.jedec.org/standards-documents/dictionary/terms/mega-m-prefix-units-semiconductor-storage-capacity

If you ask someone in their twenties, they're going to say 1000. If you ask someone who's older, or someone who knows a lot about disk storage they're likely to say 1024. Hell, as the article mentions, Windows uses the 1024 definition, which is one of the reasons why drives always seem smaller than their advertised size. The box says 250 GB, but when you install it Windows says it's less than that. It's not actually less than 250 GB. It's just that Windows is using GiB/gibibytes but calling them GB/gigabytes.
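The size of that apparent shrinkage is easy to work out. This sketch only does the unit conversion; it assumes the drive holds exactly the advertised number of bytes and ignores any filesystem overhead. The function name is made up.

```python
def advertised_gb_to_gib(gb: float) -> float:
    """Convert decimal gigabytes (10**9 bytes) to gibibytes (2**30 bytes)."""
    return gb * 1000**3 / 1024**3

print(f"{advertised_gb_to_gib(250):.2f}")
```

So a 250 GB drive holds about 232.83 GiB, and an OS that labels GiB as "GB" will show roughly 232 or 233 for the same number of bytes.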

TLDR: no wonder people are confused.

15

Only recent in some computers: which used a non-standard definition. The kilo prefix has meant 1000 since at least 1795-- which predates just about any kilobyte.

-5
lemmy.world

I went to school before that took effect. But go ahead and downvote me for chiming in I guess.

6
PupBirureply
kbin.social

i didn’t downvote you, and i went to school before a bunch of things but technology evolves and either we evolve with it or we end up being just straight up wrong in a modern context

1

Yes. When the standards were changed, and they were, the old word should no longer have been used. Setting the definition to something only makes things more confusing.

1
kbin.social

It is only a mistake from a Human PoV. It is more efficient for the chip since 1000 bytes and 1024 bytes take up the same space. But Humans find anything not base 10 difficult.

3
wischireply
programming.dev

It's not really about the space numbers need inside the computer but about unit prefixes.

-17
56!
lemmy.ml

Unlike many comments here, I enjoyed reading the article, especially the parts in the "I don’t want to use gibibyte!" chapter, where you explain that this (the pedantry) is important in technical and formal situations (such as documentation). Seeing some of the comments here, I think it would have helped to focus on this aspect a bit more.

I also liked the extra part explaining the reasoning for using the Nokia E60.

I don't quite agree with the recommendation to use base 10 SI units where neither KiB nor kB would result in nice numbers. I don't see why base 10 should have an influence on computers, and I think it makes more sense to stick to a single unit, such as KiB.

The reasons I have this opinion are probably to do with:

  • My computer has shown me values using KiB, GiB, etc. for years - I think it's a KDE default - so I'm already used to the concept of KiB being different from kB.
  • I dislike the concept of base 10 in general. I like the idea of using base 16 universally (because computers. Base 12 is also valid in a less computer-dominant society). I therefore also think 1024 is a silly number to use, and we should measure memory in multiples of 2^8 or 2^16...

p.s, I agree with other commenters that your comments starting with "Pretty obvious that you didn’t read the article." or similar are probably not helping your case... I understand that some comments here have been quite frustrating though.

3

❤️ Thank you for taking the time to read it and thank you for your feedback, I really appreciate it.

2
chitak166reply
lemmy.world

I dislike the concept of base 10 in general.

You're not human.

2

i mean, you can't get to 1000 by doubling twos, so, no?

Reality doesn't care what you prefer my dude

2

The only place where kilobyte is 1000 bytes has been Google and everywhere else it's 1024 so even if it's precise I don't see the advantage of changing usage. It would just cause more confusion at my work than make anything clearer.

2
  • Kilobyte is 2^10 bytes or about a thousand bytes within a few reasonably significant digits.
  • Megabyte is 2^20 bytes or about a thousand kilobytes within a few reasonably significant digits.
  • Gigabyte is 2^30 bytes or about a thousand megabytes within a few reasonably significant digits.
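How close "about a thousand" is depends on the prefix. A rough Python sketch, assuming the usual pairing of 2^(10n) with 10^(3n) (the helper name `binary_excess_percent` is hypothetical):

```python
# Compare each binary prefix (2^(10n)) with its decimal namesake (10^(3n)).
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]

def binary_excess_percent(n: int) -> float:
    """How much larger 2^(10n) is than 10^(3n), in percent."""
    return (2 ** (10 * n) / 10 ** (3 * n) - 1) * 100

for n, name in enumerate(PREFIXES, start=1):
    print(f"{name}: {binary_excess_percent(n):.1f}%")
```

The gap is roughly 2.4% at the kilo level and grows to about 10% at the tera level.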

The binary storage is always going to be a translation from a binary base to a decimal equivalent. So the shorthand terms used to refer to a specific and long integer number should come as absolutely no surprise. And that's just it; they're just a shorthand, slang jargon that caught on because it made sense to anyone that was using it.

Your whole article just makes it sound like you don't actually understand the math, the way computers actually work, linguistics, or etymology very well. But you're not really here for feedback are you. The whole rant sounds like a reaction to a bad grade in a computer science 101 course.

2
psudreply
lemmy.world

But on packaging of a disc it's misleading when they say gigabytes but mean gibibytes. These are technical terms with specific meaning. Kilo- means a factor of 1000, not "1000 within a couple of sig figs"

2
sh.itjust.works

They don't advertise gigabytes or terabytes on the packaging though. They advertise gigabits and terabits, a made up marketing term that sounds technical and means almost nothing. If you want to rant against something, get angry with marketers using intentionally misleading terminology like this.

0

I don't think I have seen anything advertised with bits other than network speed.

Though some mistakenly use "b" to mean bytes where the correct symbol is "B"

MB, GB, TB are in millions of, thousands of millions of, and millions of millions of bytes respectively

If you buy RAM though, you'll buy a package that says 32GB but it will not have exactly 32 thousand million bytes.

1

Based on your other replies, no, I absolutely will not waste my time reading your opinion piece.

And, a blog post is just another way of saying this is your opinion. That's all it is.

0
Humaniusreply
lemmy.world

Short answer: It's because of binary.
Computers are very good at calculating with powers of two, and because of that a lot of computer concepts use powers of two to make calculations easier.

1024 = 2^10^

Edit: Oops.. It's 2^10^, not 2^7^
Sorry y'all.. 😅

23
irdcreply

Yeah, I deserve that. I’m just gonna leave my typo. Thanks for the laugh!

6

1024 = 2^7^

I'm confused, why this quotation? 1024 is 2^10^, not 2^7^

6

Just to add, I would argue that by definition of prefixes it is 1000.

However, there are other terms to use, in this case Kibibyte (kilo binary byte, KiB instead of just KB). That way you are being clear on what you actually mean (particularly a big difference with modern storage/file sizes)

EDIT: Of course the link in the post goes over this, I admit my brain initially glossed over that and I thought it was a question thread

3
TheMurphyreply
lemmy.world

I believe it's because you always use bytes in pairs in a computer. If you always pair the pairs, you would eventually get the number 1024, which is the closest such number to 1000.

The logic is like this:

2+2 = 4

4+4 = 8

8+8 = 16

16+16 = 32

32+32 = 64

64+64 = 128

128+128 = 256

256+256 = 512

512+512 = 1024

3

not exactly because of pairs unless you’re talking about 1 and 0 being a pair… it’s because the maximum number you can count in binary doubles with each additional bit you add:

with 1 bit, you can either have 0 or 1… which is, unsurprisingly perhaps, 0 and 1 respectively - 2 numbers

with 2 bits you can have 00, 01, 10, 11… which is 0, 1, 2, 3 - 4 numbers

with 3 bits you can have 000, 001, 010, 011, 100, 101, 110, 111… which is 0 to 7- 8 numbers

so you see the pattern: add a bit, double the number you can count to… this is the “2 to the power of” that you might see: with 8 bits (a byte) you can count from 0 to 255 - that’s 2 (because binary has 2 possible states per digit) to the power of 8 (because 8 digits); 2^8

the same is true of decimal, but instead of 2 to the power, it’s 10 to the power: with each additional digit, you can count 10 x as many numbers - 0-9 for 1 digit, 00-99 for 2 digits, 000-999 for 3 digits - 10^1, 10^2, 10^3 respectively

and that’s the reason we use hexadecimal sometimes too! we group bits into groups of 8 and call it a byte… hexadecimal is base 16, so nicely lets us represent a byte with just 2 characters - 16^2 = 256 = 2^8
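a quick way to check the pattern yourself, as a small Python sketch (the function name `values_representable` is just made up for this example):

```python
def values_representable(bits: int) -> int:
    """Number of distinct values an unsigned field of `bits` bits can hold."""
    return 2 ** bits

# Each extra bit doubles the count; the largest value is always count - 1.
for bits in (1, 2, 3, 8, 10):
    count = values_representable(bits)
    print(bits, "bits:", count, "values, max", count - 1)

# One byte always fits in two hex digits, since 16**2 == 2**8 == 256.
assert 16 ** 2 == 2 ** 8 == 256
print(format(255, "08b"), format(255, "02x"))  # 11111111 ff
```

10 bits is where 1024 comes from: it's the first power of two past 1000.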

5
Kalkalinereply
leminal.space

Harvard's CS50 has a great explanation on it. Makes a ton of sense. In fact CS50 should be required for high school, people would have a much better understanding of how software works in general.

2
Lmaydevreply
programming.dev

Understanding that has very little advantage for the average person.

-10
vithigarreply
lemmy.ca

So teaching it alongside things like the quadratic equation makes perfect sense then.

6

Exploring concepts that aren't familiar to you can help you with other issues in your daily life. It helps you problem solve from a new perspective.

4
lemmy.world

"Kilo" means 1000 under the official International System of Units.

With some computer hardware, it's more convenient to use 1024 for a kilobyte and in the early days nobody really cared that it was slightly wrong. It has to do with the way memory is physically laid out in a memory chip.

These days, people do care and the correct term for 1024 is "Kibi" (kilo-binary). For example Kibibyte. There's also Gibi, Tebi, Exbi, etc.

It's mostly CPUs that use 1024 - and also RAM because it's tightly coupled to the CPU. The internet, hard drives, etc, usually use 1000 because they don't have any reason to use a weird numbering system.
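One way to picture the memory layout point, as a simplified Python sketch. It assumes a chip whose cells are selected by plain binary address lines, which glosses over how real chips split addresses into rows and columns; both function names are hypothetical.

```python
def address_lines_needed(locations: int) -> int:
    """Minimum number of binary address lines to select `locations` distinct cells."""
    return (locations - 1).bit_length()

def locations_addressable(lines: int) -> int:
    """Each line carries one bit, so n lines distinguish 2**n cells."""
    return 2 ** lines

for size in (1000, 1024):
    lines = address_lines_needed(size)
    unused = locations_addressable(lines) - size
    print(f"{size} cells: {lines} address lines, {unused} addresses unused")
```

Under that assumption a 1000-cell chip needs the same 10 address lines as a 1024-cell one and leaves 24 addresses pointing at nothing, so building the full 1024 is the natural choice.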

0
kbin.social

As the article mentions, Windows also uses KB/MB/GB to refer to powers of 2 when calculating disk space. AFAIK Linux sometimes does too, although the article says otherwise. Apparently OSX uses the KB=1000 definition.

It may be outdated, but it's still incredibly common for people to use KB/MB/GB to refer to powers of 2 in computing. Best not to assume KB is always 1000.

3

Windows and Mac both use KB = 1000. With Linux I think it depends on the distro.

You're thinking of very old versions of Windows... old versions of MacOS were also 1024.

It's honestly irrelevant anyway - if you want to actually know how much space a file is using on disk, you should look up how many pages / sectors are being used.

A page (on an SSD) or sector (on a HDD) is 32768 bits on most modern drives. They can't store a file smaller than that and all of your files take up a multiple of that. A lot of modern filesystems quietly use zip compression though. Also they have snapshots, files that exist in multiple locations, and other shit going on which will really mess with your actual usage.
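As a rough illustration of the "multiple of a page/sector" point, here is a Python sketch that rounds a file size up to whole allocation units. It takes the 32768-bit figure above at face value and ignores compression, snapshots, and anything else a real filesystem does; `allocated_bytes` is a hypothetical helper, not a real API.

```python
SECTOR_BITS = 32768               # figure quoted above
SECTOR_BYTES = SECTOR_BITS // 8   # 4096 bytes

def allocated_bytes(file_size: int, unit: int = SECTOR_BYTES) -> int:
    """Round a file size in bytes up to a whole number of allocation units."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no data blocks
    units = -(-file_size // unit)  # ceiling division without floats
    return units * unit

for size in (1, 4096, 4097, 1_000_000):
    print(size, "->", allocated_bytes(size))
```

So a 1-byte file and a 4096-byte file take the same space under this model, and a 4097-byte file takes twice as much.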

I'm not going to run du -h / on my laptop, because it'd take forever, but I'm pretty sure it would be a number significantly larger than my actual disk. Wouldn't surprise me if it's 10x the size of my disk. Macs do some particularly interesting stuff in the filesystem layer - to the point where it's hard to even figure out how much free space you have... my Home directory has 50 GB of available space on my laptop. Open the Desktop directory (which is in the Home directory...) and the file browser shows 1.9 TB of available space.

1
mb_reply

Weird numbering system? Things are still stored in blocks of 8 bits at the end, it doesn't matter.

When it gets down to what matters on hard drives, every byte still uses 8 bits, and all other numbers that matter for people actually working in computer science are multiples of 8, not 10.

And because all internal systems use base 8, base 10 is "slower" (not that it matters any longer).

1
lemmy.world

This is such a strange post and comment section to me. Computers work because of binary.

6

Which nobody uses in the industry because we all know that storage uses base2 prefixes.

1
Lmaydevreply
programming.dev

It's actually a decimal Vs binary thing.

1000 and 1024 take the same number of bytes so 1024 makes more sense to a computer.

Nothing to do with metric as computers don't use that. Also not really to do with units.

4

It wasn't/isn't. It's nothing to do with Americans. It was (and often still is) because of binary, as the article mentions.

2 4 8 16 32 64 128 256 512 1024.

So no, kilo is not always a thousand when dealing with computers.

5

I've honestly just come to the conclusion that being an asshole about the fact that other countries exist is just the continental pastime of Europe.

Like, Americans get the most of it but they're like this toward people from other European countries too.

2