Snow Leopard changes the way we look at Gigabytes (and megabytes, and kilobytes, as well).

[Image: gigabyte_difference]

For a long time, there’s been an interesting discrepancy between the capacity listed on a hard drive’s label and the capacity reported by the computer. For example, a 250GB hard drive would show up in the system as having 232.74GB available. Many would chalk it up to “formatting.” While the formatting information takes up some space, 17GB is a little excessive for formatting data. So where did this other space go?

The real culprit here is the discrepancy between base-10 mathematics (how most of us count) and binary (aka “base-2”) counting. To drive manufacturers, a kilobyte was 1000 bytes, a megabyte was 1000 kilobytes and a gigabyte was 1000 megabytes.

However, computers don’t natively use base-10; they use a base-2 system. To them, a kilobyte is defined as 1024 (which is 2^10) bytes, a megabyte is 1024 kilobytes, and a gigabyte is 1024 megabytes.

This methodology worked fine for many years; after all, 1024 isn’t TOO far off from 1000. As drive capacities increased, however, this became more and more pronounced. Drive manufacturers were defining “gigabyte” as 1,000,000,000 bytes (1000 x 1000 x 1000), while computers recognized a gigabyte as 1,073,741,824 bytes (1024 x 1024 x 1024). Every gigabyte added to a drive exacerbated the problem, adding 73,741,824 bytes to the discrepancy.
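The size of the gap is easy to check with a little arithmetic. The sketch below is illustrative only: the function name `reported_gb` is made up for this example, and it does a raw unit conversion that ignores formatting overhead and the exact byte count of any real drive, so its output will not match the figures a particular Mac displayed to the last decimal.

```python
DECIMAL_GB = 1000 ** 3   # 1,000,000,000 bytes: the drive manufacturers' gigabyte
BINARY_GB = 1024 ** 3    # 1,073,741,824 bytes: the gigabyte the computer reports

def reported_gb(label_gb):
    """Convert a label capacity (base-10 GB) into base-2 gigabytes."""
    return label_gb * DECIMAL_GB / BINARY_GB

# Bytes of discrepancy added by every gigabyte on the label
print(f"per-gigabyte gap: {BINARY_GB - DECIMAL_GB:,} bytes")

for label in (250, 500, 1000):
    print(f"{label} GB on the label -> {reported_gb(label):.2f} GB in base-2 units")
```

The larger the drive, the more "missing" space the same 7% or so difference represents.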

Snow Leopard, though, changes this. Instead of simply reporting the base-2 number for a unit of drive space, it converts it to an easier-to-understand base-10 number – the same way it is measured by drive manufacturers. In easier terms: a 500GB drive shows up as 500GB in the Finder, rather than 463.13GB.

Of course, this doesn’t mean that you magically get more drive space. You still have the same number of bytes (the base unit) to deal with; only the number of bytes that make up the larger units has changed. The change in measurement is applied across the board in the Finder, so all your files will seem “larger,” even though they all have the same number of bytes in them. For example, here’s a pair of screen shots of a folder in my music library.

[Screenshots: sizes of the same folder as reported in 10.5 and 10.6]

These shots are of the same files, in the same folder, on the same drive. In 10.6, though, they’re reported as being “larger.” But are they? The main folder shows up as having 308,937,619 bytes in both systems. The only difference is that 10.5 uses base-2 for its measurement, and 10.6 uses base-10. In 10.5, a megabyte is 1,048,576 bytes. In 10.6, it’s an even 1,000,000. Divide 308,937,619 by both of those, and you can see how the Finder in each OS arrived at its figure.
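Here is that division written out as a short sketch. The byte count and the two megabyte definitions come from the paragraph above; the variable names are just for illustration, and the Finder’s own rounding may differ slightly from the one-decimal formatting used here.

```python
TOTAL_BYTES = 308_937_619   # folder size in bytes, identical in 10.5 and 10.6

mb_10_5 = TOTAL_BYTES / 1_048_576   # base-2 megabyte (1024 x 1024), as in 10.5
mb_10_6 = TOTAL_BYTES / 1_000_000   # base-10 megabyte, as in 10.6

print(f"10.5 (base-2):  {mb_10_5:.1f} MB")   # 294.6 MB
print(f"10.6 (base-10): {mb_10_6:.1f} MB")   # 308.9 MB
```

Same bytes, two different answers, depending only on the divisor.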

This may be a bit confusing for a while – after all, we’ve kind of gotten used to things the way they were. There is a bright point, though: now you don’t have to ask where all that space went when you install or attach your new hard drive.

For more information, you can check out this Apple KnowledgeBase article.




  • Programmers defending base-1024 is strong evidence that “computer scientist” is an oxymoron.




  • You know, up until the late 80s, everyone doing anything related to the computer industry followed the computer industry’s established power-of-2 standard, because computers worked in powers of 2 and powers of 10 were not useful. Contrary to some of the comments above, it made perfectly good sense. Memory capacities were dictated by address lines, which meant memory was sized in powers of 2, and it was only reasonable to expect a chunk of data that occupied 4k in memory to also occupy 4k on disk or tape – not 4.096kb. Everyone was fine with this. It was logical, functional, and internally consistent.

    Then, around about the late 1980s or early 1990s if I remember rightly, some twit associated with SI (Systeme International, not Sports Illustrated) noticed that the computer industry was measuring things in multiples of 1024 rather than multiples of 1000, and had an absolute conniption fit about it. This Just Would Not Do. It was utterly déclassé, quite beyond the pale, completely /nekulturny/. And he got on his high horse and opined that the computer industry should be reasonable and use powers of 10, regardless of whether or not powers of 10 were actually useful in the computer industry context.

    And there it would probably have ended, had not his little snit come to the attention of the storage industry (not that I’d, like, you know, mention Seagate by name or anything …. wait, was that my out-loud voice?), which did the math and realized that this would mean they could take the exact same disk they were selling today, relabel it, and sell it as a bigger hard disk tomorrow. They promptly swarmed all over this idea like a pack of tiger sharks mobbing a wounded whale. It was like a license to print money. Today’s 100MB disk was tomorrow’s 105MB disk, and all they had to do was change the label. A 400MB disk was now a 420MB disk, with a corresponding price increase, for exactly the same hardware.

    And we’ve been stuck with it ever since. With the enthusiastic backing of the computer storage industry, and over the objections of the rest of the industry, we got saddled with the silliness of the Gibi-rish units, and hard disks whose capacity was measured in different units than the memory that went into the same computer. And it all started because of one person who was offended that the practical usage that made perfect working sense in binary computer architectures was untidily inconsistent with the Great SI Scheme of Things.

    “A foolish consistency is the hobgoblin of little minds.” — Ralph Waldo Emerson




  • You are all forgetting about the lawsuit brought against Seagate over the measurement of hard drive space. The rest of the industry will be following suit.

    The lawsuit is Cho v. Seagate Technology (US) Holdings, Inc., San Francisco Superior Court, Case No. 453195. In the suit, the plaintiff makes allegations about Seagate’s sale and marketing of hard disc drives. Seagate has denied and continues to deny each and all of plaintiff’s claims, and denies that anyone has been harmed or deserves compensation. The Court has not made a decision on the merits.

    Plaintiffs and their counsel have previously been awarded attorneys’ fees, expenses and incentive awards in the amount of $1,792,000, to be paid separately and in addition to the benefits available to settlement class members. Awarded amounts will be paid only if the settlement is approved.

    All claims of settlement class members which were or could have been asserted in the litigation, based upon the facts alleged in the litigation (as well as in a related case entitled Lazar v. Seagate Technology LLC, et al., San Francisco Superior Court, Case No. 439700; and California Court of Appeal, Case No. A116350) will be released.




  • What’s really important is consistency. With this SL change, more confusion is caused by the way Apple introduced the change.

    I can just see how some script somewhere breaks because of this dumb move. Is Apple going to change the definition of something like “du -h”, as well? Or are they even going to just update the man page for that? And, think about it. This is just a small and simple example.

    I love using my Macs, but this move by Apple is not something I’d applaud them for. I’ve been using Apple computers since the Apple ][. Apple was OK with the definition of what a “kilobyte” is up to SL and now they think they should just suddenly make this change and hope to make it be less confusing for consumers with this change adopted in SL? Yes, there are some smart people at Apple but what they did with SL (whether Engineering or Marketing or Steve Jobs driven) was not a smart move.

    I agree that using different “unit symbols” is a way to clearly differentiate the two and avoid confusion and, as other posters mentioned, Apple should at least have made it an option, as it’s definitely confusing to look at the file sizes for a shared drive from different platforms.

    It’s not about old habits. It’s about how to properly approach introducing a change like this.

    As a side note, quoting from http://en.wikipedia.org/wiki/Kibibyte : “In The Art of Computer Programming, Donald Knuth proposed that this unit be called a large kilobyte (abbreviated KKB).” It’d be amusing if a “large gigabyte” is not referred to as GGB but as KGB. ;^)




  • Do you know the percentage difference between binary prefixes and SI prefixes?

    kibibyte +2.4% or −2.3% kilobyte
    mebibyte +4.9% or −4.6% megabyte
    gibibyte +7.4% or −6.9% gigabyte
    tebibyte +10.0% or −9.1% terabyte
    pebibyte +12.6% or −11.2% petabyte
    exbibyte +15.3% or −13.3% exabyte
    zebibyte +18.1% or −15.3% zettabyte
    yobibyte +20.9% or −17.3% yottabyte
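The percentages in the list above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not anything the commenter posted; the helper name `percent_differences` is made up for the example. The first figure is how much larger the binary unit is than the decimal one, the second how much smaller the decimal unit is than the binary one.

```python
PREFIXES = [
    ("kibibyte", "kilobyte"), ("mebibyte", "megabyte"),
    ("gibibyte", "gigabyte"), ("tebibyte", "terabyte"),
    ("pebibyte", "petabyte"), ("exbibyte", "exabyte"),
    ("zebibyte", "zettabyte"), ("yobibyte", "yottabyte"),
]

def percent_differences(power):
    """Return (binary vs. decimal, decimal vs. binary) differences in percent."""
    binary = 1024 ** power
    decimal = 1000 ** power
    return (binary / decimal - 1) * 100, (decimal / binary - 1) * 100

for power, (binary_name, decimal_name) in enumerate(PREFIXES, start=1):
    up, down = percent_differences(power)
    print(f"{binary_name} {up:+.1f}% or {down:+.1f}% {decimal_name}")
```

The gap compounds with each step up the prefix ladder, which is why it went largely unnoticed at the kilobyte scale and became hard to ignore at the gigabyte and terabyte scale.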




  • - kilometer = 1000 meter
    - kilogram = 1000 gram
    - kiloampere = 1000 ampere
    - kilomole = 1000 mole

    Knowing this, now, how many bytes is a kilobyte?

    The only right answer would be 1000, and answering 1024 is just plain dumb.




  • I suppose you idiots think that optical disks like DVD and Blu-Ray are marketed in decimal to inflate their capacity, too, right? Even though the size is defined in the specs and disks made by different manufacturers are all the same size? And I suppose processors and network cards and memory buses are all measured in decimal megahertz to inflate their speed?

    Please, use your brain for once. Apple is doing it right. Using “k-” to mean 1024 was wrong in the 1960s, and it’s even more wrong now.




  • I think adding an “i” to GB to make it GiB, would have been easier to make everything technically proper. But instead Apple kept the familiar GB and changed ALL the numbers! (except for bytes)




  • I’ve been waiting for about 7 years for this fix. Now a Gigabyte is a GB. But I would have rather Apple started using GiB (GibiByte) instead of changing the numbers to the proper GB values.




  • I’ve been a programmer since mainframes had memory listed in “Kwords” (“words” varied in the number of bits). I continue to write software where binary sizes matter (eg: with 14 bits you can have up to 16,384 values, and it’s easier to say 16K) so I intimately understand the usefulness of binary prefixes in programming.

    Nevertheless, the decimal prefixes (M = 1,000,000) are way more logical for everything but some programming tools. Apple has switched to decimal prefixes, Linux is in the process thereof, and hopefully Windows will switch soon.

    Actually, in Windows XP for example, if you right click and choose properties for a selected drive or group of files, Windows will already report exact bytes in decimal as well as an approximation in binary units. If they will switch to using decimal approximations (3.1 GB = 3,100,000,000 B) then this little aspect of the world will become a bit saner.

    In the relatively rare cases where one needs binary prefixes, use the new unambiguous ones: KiB, MiB, GiB etc. Use KB, MB, GB etc only for power of ten.

    This is not a marketing ploy, it is common sense once you get over old habits. Suppose disk drive manufacturers used binary prefixes. Then 1.5 Terabytes would be 1536 Gigabytes, or 1500 Gigabytes would be about 1.465 Terabytes. Every time you cross a 1000 boundary in adding or multiplying, you would need to adjust your figures. Quick: 400 files averaging 22 MBytes each would take up how many gigabytes? Decimal usage answer: 8800 MB or 8.800 GB. Binary usage answer: 8800 MB or 8.594 GB Or round to 8.6 GB. Even as a programmer, the former makes a lot more sense for normal usage.

    Currently, a 700 MB CD holds 700 x 1024 x 1024, while a 4.7 GB DVD holds 4.7 * 1000 * 1000 * 1000 and a 1.44 MB floppy held 1.44 * 1000 * 1024. USB 2.0 rates are 480 x 1000 x 1000 bits/sec, a 500 GB drive holds around 500 x 1000 x 1000 x 1000 bytes. The only way this is going to get consistent is to adopt decimal throughout.

    It’s time we cleaned up our acts. While I mostly work with Windows and Linux, I applaud Apple for leading the way and having the courage to stand up to the conservatives who have trouble switching. It will not be long before Apple folks are somewhat justifiably ridiculing Windows for its backwardness in this regard (rather than complaining to Apple). Hopefully Microsoft will see the light.
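The arithmetic in the comment above can be checked with a short sketch. It is illustrative only: the variable names are invented here, and the media capacities simply restate the multiplications quoted in the comment using integer arithmetic rather than measuring any real disc.

```python
# "400 files averaging 22 MBytes each" from the comment above
total_mb = 400 * 22                 # 8800 MB either way
decimal_gb = total_mb / 1000        # decimal usage: 8.8 GB
binary_gb = total_mb / 1024         # binary usage: 8.59375 GB
print(f"{total_mb} MB = {decimal_gb:.3f} GB (decimal) or {binary_gb:.3f} GB (binary)")

# Nominal capacities in bytes, using the mixed conventions quoted in the comment
media_bytes = {
    "700 MB CD": 700 * 1024 * 1024,      # binary megabytes
    "4.7 GB DVD": 47 * 10 ** 8,          # 4.7 x 1000 x 1000 x 1000
    "1.44 MB floppy": 1440 * 1024,       # 1.44 x 1000 x 1024, a hybrid unit
    "500 GB drive": 500 * 1000 ** 3,     # decimal gigabytes
}
for name, size in media_bytes.items():
    print(f"{name}: {size:,} bytes")
```

Three different meanings of "mega" appear in that one small table, which is the inconsistency the comment is objecting to.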




  • John F,

    Actually, displaying storage as multiples of 1024 was an early fudging of the numbers to make it easier for us humans to read the numbers. A computer doesn’t think “this file is 1024 bytes,” it thinks, “this file is 10000000000 bytes”.

    1024 was “close enough” to 1000 for early programmers to consider it a “kilo” (aka 1000) byte, to make larger numbers of bytes easier to read. But as storage sizes have increased, this arbitrary way of counting has gotten outmoded and confusing, since it’s consistently borrowed base-10 abbreviations, kilo, mega, giga, and used them in a base-2 environment.




  • John F speaks the truth. Apple shouldn’t be helping the drive manufacturers cover up their dishonest representations of space in this way. Instead, the drive manufacturers should be legally required to advertise the base-2 representations, which have been accepted standards in computing since its inception.

    I hope we can disable this misrepresentation in 10.6.1. I couldn’t find a way in 10.6.




  • The drive manufacturers seem to have snowed everyone, including both Apple and the writer of this blog

    DECIMAL kilobytes, megabytes, gigabytes, and terabytes were wrong in the 1980s (when the drive manufacturers started lying about drive capacity), and they’re even more wrong now.

    Computers simply do not allocate and use memory (including block-structured storage such as disk space, flash memory cards, or USB sticks) in *ANY* unit that is a multiple of 10.

    Therefore, to be accurate, the storage on any block-structured storage medium MUST be marketed using multiples of 1024 bytes — otherwise it’s a lie, plain and simple.

    Until the drive manufacturers stop using “marketing” gigabytes, they will be as wrong as TV manufacturers who used to lie routinely about the diagonal measures of their picture tubes. They’re stating that the device has more capacity than it really does. Period.




  • Ah, finally someone gets it. G (giga) is exactly 10 to the power of 9, not something 7.3 % larger (that programmers think so doesn’t overrule IEC or SI). Like the base-2 prefixes? Use kibi, mebi and pals, don’t overload the well-defined SI base-10 prefixes.

    Besides, a computer doesn’t think in base-2 gigabytes, it thinks in bytes. Kilos, megas, gigas, teras and so on are only to make the representation more pleasing and easier for our human brains to visualise.

    Sometimes, correcting a fault is worth the hassle. Now only if certain other major vendors followed suit…




  • There are 10 types of people – those who understand binary and those who don’t…
    I’m guessing Apple users don’t.




  • I hope there’s some way to switch this off?




  • It’s funny how people who are supposed to be experts don’t understand how computers store decimal numbers in memory.

    A kilobyte is 1000 bytes, not 1024. That’s 1111101000 in binary, and no, computers have no trouble storing this in memory or representing it as “1000 bytes”.

    Dividing by 1024 makes absolutely no ******* sense when reporting numbers like this to the user. It adds an extra layer of complication and calculation with no benefit. I suppose when Google Earth says “256 km”, you think it means 262,144 meters? Because computers work in base 2, right? Idiots.




    • “It’s funny how people who are supposed to be experts don’t understand how computers store decimal numbers in memory.

      A kilobyte is 1000 bytes, not 1024. That’s 1111101000 in binary, and no, computers have no trouble storing this in memory or representing it as “1000 bytes”.

      Dividing by 1024 makes absolutely no ******* sense when reporting numbers like this to the user. It adds an extra layer of complication and calculation with no benefit. I suppose when Google Earth says “256 km”, you think it means 262,144 meters? Because computers work in base 2, right? Idiots.”

      I’m afraid your argument is flawed. Firstly, computers use the base-2 (binary) numbering system for capacities and sizes and always have done; Windows and Unix even now still use the base-2 system when displaying file or disc sizes on the command line or in certain system tools. Secondly, your comparison of memory sizes to distances involves two totally different units of measurement that are not even remotely related.

      and for your information 256 km is 256,000 meters or 256,000 m




  • Excellent piece of info, nobody else has picked up on this.

    And it’s **** retarded. This issue is, as pointed out, the HD capacity, because users don’t understand the computer science business…drive manufacturers should be keel-hauled for their marketing guys pulling this stunt. What Apple SHOULD have done is report the salesguy number FOR THE DRIVES ONLY, so the Genius Bar guys don’t have to sigh heavily and roll their eyes when the 200th guy comes up to them to ask why the 250GB drive in their Macbook is 237Gb in the Finder and What Kind Of Fool Does Apple Think They Are.

    The problem is, now 200 guys will be coming up to the Genius Bar asking why their files are always bigger on their Mac than on their Windows server at work. I predict a LOT of eye rolling at the Genius Bar.




  • The reason computer folks are averse to this is that data is stored on your COMPUTER and not in your pin head.

    When I burn a DVD the measurements are STILL on base2 but when I look in the finder it says my data is 4.6GB when it’s *actually closer to 3.8GB? WTF?

    I really want a setting like:

    defaults write GB.retard.setting.for.lusers -OFF

    Take. me. BACK. to 10.5




  • “Please clarify your statement”

    I’m not the original commenter, but I concur with their judgment.

    With 10.6, if I want to transfer 150GB of video from my TiVo to my Mac Mini (with its lovely OWC FW800 enclosure), I’m going to need to fire up a damn spreadsheet to do the conversion math.

    Well, you may say, most folks don’t have TiVos. Fine. But the point is that if you want to transfer files between a 10.6 machine and anything else in the digital universe that isn’t running OS X, you’ve got conversion math to do if you want to keep track of sizing issues.

    All of this is to save Apple an almost invisible bit of hassle in new customers being confused as to why their drives aren’t quite as large as they thought they’d be. Some days, it seems as if Apple cares more about the new car smell than how the car runs a week later…




  • This is the way it should have been since the beginning of time. I’ve been arguing this for years. So many computer programmers are angrily averse to this it’s ridiculous.

    But now that Apple’s done it, everyone will suddenly think it’s a great idea.

    *sigh*




  • Thus Apple continues to distance itself from other vendors by exacerbating cross-platform compatibility problems.




    • Please clarify your statement; I’m not quite sure what you mean there. The change in size reporting is only cosmetic, allowing the end user to think in a more “natural” base-10, rather than in base-2, which many people find a little obtuse to work with.

      Despite the change in the definition of the larger units, the actual sizes – in bytes – of the drives and files remain the same. As computers only read in bytes (the metric prefixes are there for our benefit, not the computer’s), this redefinition of terms would not cause any compatibility problems.