Saturday, January 25, 2025

 KB KiB MB MiB GB GiB TB TiB

What are the differences between KB and KiB, MB and MiB, GB and GiB, TB and TiB? KB and its kin are decimal units: each step up is another factor of 1000, which is a power of 10. Let's call this notation "Decimal Thousands." For example, 1 KB = 1000 bytes, 1 MB = 1000 * 1000 = 1,000,000 bytes, 1 GB = 1000 * 1000 * 1000 = 1,000,000,000 bytes, etc.

In contrast, KiB and its kin are "binary" units: each step up is another factor of 1024, which is a power of 2. Let's call these "Binary 1024's." For example, 1 KiB = 1024 bytes, 1 MiB = 1024 * 1024 = 1,048,576 bytes, etc. It's fairly easy to calculate.
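To put the two notations side by side, here is a minimal Python sketch that computes the byte count for each unit and how far apart the two drift as the units grow. The function name unit_table is my own invention for illustration, nothing standard.

```python
# Byte counts for each prefix step: decimal (powers of 1000) vs. binary (powers of 1024).
DECIMAL = ["KB", "MB", "GB", "TB"]
BINARY = ["KiB", "MiB", "GiB", "TiB"]


def unit_table():
    """Return rows of (decimal unit, bytes, binary unit, bytes, percent gap)."""
    rows = []
    for power, (dec, bin_) in enumerate(zip(DECIMAL, BINARY), start=1):
        dec_bytes = 1000 ** power
        bin_bytes = 1024 ** power
        # How much larger the binary unit is than its decimal namesake.
        gap = (bin_bytes - dec_bytes) / dec_bytes * 100
        rows.append((dec, dec_bytes, bin_, bin_bytes, gap))
    return rows


for dec, dec_bytes, bin_, bin_bytes, gap in unit_table():
    print(f"1 {dec} = {dec_bytes:>17,} bytes   1 {bin_} = {bin_bytes:>17,} bytes   gap {gap:.1f}%")
```

The gap starts at 2.4% for KB versus KiB and grows with every step, which is why the difference is hard to ignore at TB sizes.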

But why should we have to calculate it? I admit I got caught up in the hype and began using Binary 1024's for a while in my own work, but for what benefit? I haven't found one. Both representations can express exactly the same values, of course, so the question is convenience. Is there an application in the computing world where Binary 1024's are actually more convenient? Let's examine:

DISK DRIVE CAPACITY:

My Windows desktop computer has several rotating disks and NVMe SSDs. The capacity of every single one of those drives, if expressed in BYTES (the true measure), is very close to a round decimal number of 1000's and nowhere near a round number of Binary 1024's. Three examples:

  • WD Gold 12 TB SATA internal HDD = 12,000,119,746,560 bytes, within a tiny fraction of a percent of the round decimal number of 12,000,000,000,000 bytes (12 TB), which is the nominal size that Western Digital claims. Expressed in Binary 1024's, that drive has a capacity of 10.9 TiB and change. Decidedly not a round number of 1024's or even 1000's.
  • WD Black 2 TB "gaming" external USB drive = 2,000,362,139,648 bytes, which is also very close to its 2 TB nominal capacity. 
  • WD Black 1 TB M.2 NVMe SSD = 1,000,067,821,588 bytes, just above its nominal 1,000,000,000,000. 
  • All of our other disk drives (on the desktop and several laptops) are like that too, very close to a nice even decimal number of thousands and close to their nominal capacities. 
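The arithmetic behind those three examples can be checked with a short Python sketch. The byte counts are the ones quoted above; the names DRIVES and describe are just labels I made up for the illustration.

```python
TB = 1000 ** 4    # decimal terabyte
TIB = 1024 ** 4   # binary tebibyte

# Byte counts as quoted in the list above, paired with the nominal size in TB.
DRIVES = {
    "WD Gold 12 TB": (12_000_119_746_560, 12),
    "WD Black 2 TB": (2_000_362_139_648, 2),
    "WD Black 1 TB": (1_000_067_821_588, 1),
}


def describe(size_bytes, nominal_tb):
    """Return (size in TB, size in TiB, percent above the nominal decimal capacity)."""
    in_tb = size_bytes / TB
    in_tib = size_bytes / TIB
    nominal_bytes = nominal_tb * TB
    excess_pct = (size_bytes - nominal_bytes) / nominal_bytes * 100
    return in_tb, in_tib, excess_pct


for name, (size, nominal) in DRIVES.items():
    in_tb, in_tib, excess = describe(size, nominal)
    print(f"{name}: {in_tb:.3f} TB, {in_tib:.3f} TiB, {excess:+.4f}% vs. nominal")
```

Each drive lands within a few hundredths of a percent of its nominal decimal size, while the TiB figures are not round at all.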

I don't happen to have any Seagate drives (my last one just expired, so sad) but Seagate points out that "The storage industry standard is to display drive capacity in decimal." And Seagate does. Obviously, so does Western Digital. I didn't check Toshiba, Hitachi, or others, but I'm sure Seagate is correct. So what would be the point of converting those numbers to GiB or TiB? None that I can see.

FILE or FOLDER SIZE:

These don't tend to be such nice round numbers. Is it somehow easier to describe the size of a file in Binary 1024's instead of Decimal Thousands? Two real examples:

  • Macrium Reflect disk image: 57,891,323,904 bytes. Either 57.9 GB or 53.9 GiB, but I had to get out the calculator to get 53.9 GiB. Binary 1024's certainly don't seem easier. How about a smaller file?
  • Macrium Reflect log file: 28,203 bytes. Either 28.2 KB or 27.5 KiB. Again the calculator, again decimal is easier.
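For anyone who would rather not reach for the calculator, the conversions above can be sketched in a few lines of Python. The function human_size is a hypothetical helper of my own, not part of any library, and it rounds to one decimal place, which may differ from how a particular operating system rounds or truncates.

```python
def human_size(n_bytes, binary=False):
    """Format a byte count in Decimal Thousands (default) or Binary 1024's."""
    base = 1024 if binary else 1000
    if binary:
        units = ["bytes", "KiB", "MiB", "GiB", "TiB"]
    else:
        units = ["bytes", "KB", "MB", "GB", "TB"]
    value = float(n_bytes)
    index = 0
    # Step up one unit at a time until the value drops below the base.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    if index == 0:
        return f"{n_bytes} bytes"
    return f"{value:.1f} {units[index]}"


for size in (57_891_323_904, 28_203):
    print(f"{size:,} bytes = {human_size(size)} = {human_size(size, binary=True)}")
```

Note that the decimal result can be read straight off the leading digits of the byte count, while the binary result cannot.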

To be truthful, Windows actually did calculate those Binary 1024 values for me (though I checked them), but named them GB and KB instead of GiB and KiB. DUH! Only Microsoft would use KB where the value is actually in KiB.

CALCULATIONS:

Suppose you are using a computer or a language that is limited to 32-bit calculations, such as the Command Line Script language (CMD.exe) on Windows. You will not be able to handle numbers over 2,147,483,647, the largest signed 32-bit integer, so only values of up to 9 decimal digits are always safe. Yet if you need to write code that determines whether a drive has sufficient remaining space for a file, for example, DIR gives you both the size of the file in bytes and the drive's free space in bytes as decimal character strings.

Both of those strings could well be 10 digits or longer. So divide by a million or a billion by trimming off the least-significant 6 or 9 digits from both values. Now they are in MB or GB. Round them if you like, and do the mathematical comparison. Simple. But try doing it in Binary 1024's. I wouldn't know how to start. 
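Here is a rough sketch of that digit-trimming trick. It is written in Python rather than CMD purely for readability, and it assumes the byte counts arrive as text that may contain thousands separators. The names trim_to_unit and fits are hypothetical helpers, and the 32-bit limit is only simulated with a check, since Python integers do not overflow.

```python
INT32_MAX = 2_147_483_647  # largest value a signed 32-bit integer can hold


def trim_to_unit(byte_string, digits):
    """Drop the last `digits` decimal digits of a byte count given as text.

    This truncates, i.e. it is integer division by 10**digits done on the string.
    """
    if digits <= 0:
        raise ValueError("digits must be positive")
    cleaned = byte_string.replace(",", "").strip()
    if not cleaned.isdigit():
        raise ValueError(f"not a non-negative integer: {byte_string!r}")
    head = cleaned[:-digits] if len(cleaned) > digits else "0"
    value = int(head)
    if value > INT32_MAX:
        raise OverflowError("still too large for 32-bit arithmetic; trim more digits")
    return value


def fits(file_bytes, free_bytes, digits=6):
    """True if the file, rounded up, is no larger than the free space, rounded down."""
    file_units = trim_to_unit(file_bytes, digits) + 1  # round the file size up
    free_units = trim_to_unit(free_bytes, digits)      # truncation rounds free space down
    return file_units <= free_units


print(trim_to_unit("57,891,323,904", 6))           # file size in whole MB
print(fits("57,891,323,904", "2,000,362,139,648"))
```

Rounding the file up and the free space down errs on the safe side: the check may refuse a file that would barely fit, but it should never approve one that will not.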

64-bit precision or floating point mathematics will make the calculations in Binary 1024's much easier, of course, but regardless of the precision, you will still have to convert from decimal bytes to Binary 1024's. When you start with values in decimal, calculations will always be easier in decimal. Either way, the computer actually does its work in binary, but we don't. We read, write, and understand decimal, so the computer transforms binary to decimal for us.

WHICH IS MORE ACCURATE?

Some blogs say that the binary representation is more accurate, but that's incorrect. They are just different ways to express exactly the same numbers. The most accurate (and precise) representation will always be bytes, expressed in decimal right down to the least significant digit. 

Notice that ALL of the numbers are eventually described in decimal. Even the disk image size of 53.9 GiB (above) is a decimal representation of a value calculated in Binary 1024's. We actually convert a decimal byte count to a Binary 1024 value, then express that in decimal so that we can understand it! WHAT IS THE POINT OF BINARY 1024'S? We live in a decimal world.

MICROSOFT WINDOWS:

Another note about Windows: The capacities of the disk drives listed above are as reported by Windows, in bytes. How much do we trust Microsoft to correctly report those drive capacities? Well, the desktop computer can dual-boot Linux Mint, which can then examine the same drives. For each of the three drives, Linux fdisk found the drive size to be exactly the same as the size reported by Windows PLUS 4096 BYTES. 

I assume those 4096 bytes are a non-negotiable part of Windows' NTFS file system, so the Windows developers have chosen not to show that space as if it might be available. In any case, 4096 bytes are insignificant compared with the 1,000,000,000,000 bytes in a small 1 TB drive.



