0.30000000000000004 (0.30000000000000004.com)
768 points by beznet 6 months ago | 402 comments

The big issue here is what you're going to use your numbers for. If you're going to do a lot of fast floating point operations for something like graphics or neural networks, these errors are fine. Speed is more important than exact accuracy.

If you're handling money, or numbers representing some other real, important concern where accuracy matters, most likely any number you intend to show to the user as a number, floats are not what you need.

Back when I started using Groovy, I was very pleased to discover that Groovy's default decimal number literal was translated to a BigDecimal rather than a float. For any sort of website, 9 times out of 10, that's what you need.

I'd really appreciate it if Javascript had a native decimal number type like that.

Decimal numbers are not conceptually any more or less exact than binary numbers. For example, you can't represent 1/3 exactly in decimal, just like you can't represent 1/5 exactly in binary.

When handling money, we care about faithfully reproducing the human-centric quirks of decimal numbers, not "being more accurate". There's no reason in principle to regard a system that can't represent 1/3 as being fundamentally more accurate because it happens to be able to represent 1/5.

Money is really best dealt with as integers. Any time you'd use a non-integer number, use some fixed multiple that makes it an integer, then divide by the excess factor at the end of the calculation. For instance, computing 2.15% yearly interest on a bank account might be done as follows:

  DaysInYear = 366
  InterestRate = 215
  DayBalanceSum = 0
  for each Day in Year
    DayBalanceSum += Day.Balance
  InterestRaw = DayBalanceSum * InterestRate
  InterestRaw += DaysInYear * 5000
  Interest = InterestRaw / (DaysInYear * 10000)
  Balance += Interest

Balance should always be expressed in the smallest fraction of currency that we conventionally round to, like 1 yen or 1/100 dollar. Adding in half of the divisor before dividing effectively turns floor division into correctly rounded division.
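For anyone who wants to run it, here is a minimal Python sketch of the same calculation. The function and parameter names are my own, balances are assumed to be non-negative integers in the smallest currency unit, and the number of days is taken from the input rather than hard-coded.

```python
def yearly_interest(daily_balances, rate_bp=215):
    """Interest for one year, in the smallest currency unit (e.g. cents).

    daily_balances: one integer balance per day of the year.
    rate_bp: yearly rate in hundredths of a percent (215 means 2.15%).
    These names are illustrative, not taken from any real banking system.
    """
    days_in_year = len(daily_balances)      # 365, or 366 in a leap year
    divisor = days_in_year * 10_000
    raw = sum(daily_balances) * rate_bp
    # Adding half the divisor before floor division rounds half up
    # (valid for non-negative balances).
    return (raw + divisor // 2) // divisor


balances = [100_000] * 366                  # 1000.00 held every day of a leap year
print(yearly_interest(balances))            # 2150, i.e. 21.50
```

Every intermediate value is an integer, so the result does not depend on any floating point behaviour.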

This is called fixed-point arithmetic:


> In computing, a fixed-point number representation is a real data type for a number that has a fixed number of digits after (and sometimes also before) the radix point.

> A value of a fixed-point data type is essentially an integer that is scaled by an implicit specific factor determined by the type.

Yeah, though that notion tends to come with some conceptual shortcomings, like presuming a power-of-10 radix. In the above code the radix is implicitly different in leap years; applying such tricks is usually not possible with a fixed-point library or language construct.

Sounds like fractions cleanly describe what you're saying?

But that practically holds only for a reasonable amount of simple arithmetic. Fractional components tend to grow exponentially for many numerical methods when they are repeated multiple times. This can happen if you're describing money and want to apply a complex numerical method from an economics article for whatever purpose. It might be worth it, but be careful not to carry ever-expanding fractions in your system.
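To make the growth concrete, here is a small Python sketch using the standard `fractions` module. The choice of Newton's iteration for the square root of 2 is mine, picked only because it is short; the denominators roughly double in length with every step.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Exact rational Newton iterates for sqrt(2), starting from 1."""
    x = Fraction(1)
    history = []
    for _ in range(steps):
        x = (x + 2 / x) / 2     # exact: no rounding ever happens
        history.append(x)
    return history


for x in newton_sqrt2(6):
    print(len(str(x.denominator)), "digits in the denominator")
```

The first iterates are 3/2, 17/12 and 577/408; a few more steps and each value carries dozens of digits.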

This is only for dealing with actual money; generally our banking systems have rounding rules that prevent the fractions from getting out of hand.

If you are running an economic simulation you generally don't have to worry about rounding, the whole thing is only approximate anyway.

Yup. Once worked on a big project with one of the largest US exchanges. We were migrating large OTC (over the counter) CDS (credit default swaps) contracts to standardized centralized contracts. We were testing with large contracts, millions of contracts worth trillions of dollars. I was off by a single penny and failed the test. Took a while to find, but it was due to a truncation toward zero instead of a proper round. I was using a floating point type instead of a proper decimal. Don't think the language I was using had a proper decimal type at the time, though it does now, 11 years later.

> Money is really best dealt with as integers

I wish I could up vote you more than once. You are bang on.

The real lesson is, no matter what base (radix) you use, floating point math is inexact.

The value of floating point is that it can represent extremely huge or extremely infinitesimal values.

If you're working with currency / money, floating point is the wrong thing to use. For the entire history of human civilization, currency has always been an integer type, possibly with a fixed decimal point. Money has always been integers for as long as commerce has existed, and long before computers.

If you're building games, or AI, or navigating to Pluto, then floating point is the tool to use.

> The real lesson is, no matter what base (radix) you use, floating point math is inexact.

This is just not true. If you add 1.5 + 4.25 with IEEE754, there is nothing inexact or rounded. That you cannot exactly represent 0.1 in base2 FP is a problem of base2, not FP.

You get inexact results with FP math for underflows, overflows, or if you don't have enough precision for the result (or an intermediate result). But the same is true for normal integer types.
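This is easy to check in Python, whose `float` is an IEEE-754 double on all common platforms. The sketch below (the helper name is mine) relies on `Fraction(float)` converting the stored binary value exactly, so the comparison itself involves no rounding.

```python
from fractions import Fraction


def is_exact_sum(a, b):
    """True if the floating point sum a + b needed no rounding."""
    return Fraction(a) + Fraction(b) == Fraction(a + b)


print(is_exact_sum(1.5, 4.25))              # True: both inputs and the sum fit in binary
print(is_exact_sum(0.1, 0.2))               # False: the sum had to be rounded
print(Fraction(0.1) == Fraction(1, 10))     # False: 0.1 itself is not stored exactly
```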

I think what that commentator meant is that floating-point math is not an accurate model of rational-number arithmetic, not that there aren't certain computations that are in fact exact. (As you point out, there are: 1.5 + 4.25 is indeed exact)

> is that floating-point math is not an accurate model of rational-number arithmetic

Well, this is true. But integer math is also not an accurate model of rational-number arithmetic, yet nobody would claim that integer math is inexact.

A 32 bit floating point number can only have around 4 billion unique values, yet must represent numbers from 10^38, to very small decimals. 99.99999% of numbers in this range cannot be accurately represented in floating point form.

Compare that to a 32 bit integer, which can have 4 billion unique values, and supports numbers from 0 to 4 billion. It's a 1:1 mapping.

To be mathematically pedantic, 100% of numbers in that range cannot be accurately represented in floating point form.

> yet must represent numbers from 10^38

No, they don't have to represent every number in the range. I don't know where you get the idea that they must. An integer also can't represent all real numbers in its range.

There's no such thing as a "problem of base2". Base 2 is an ineffable fact of the universe, and it is neither virtuous nor problematic. All the problems you are describing are problems of floating-point arithmetic.

> There's no such thing as a "problem of base2".

That you cannot represent 1/3 as a non-periodic decimal number is a problem of base 10.

That you cannot represent 1/10 as a non-periodic binary number is a problem of base 2.

These are just mathematical facts. Maybe you don't like the word "problem", but that does not change where we are.

The problem that you cannot represent 0.1 in base 2 FP, is a problem of base 2. You can represent it exactly in base 10 FP.
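A quick illustration in Python, whose standard `decimal` module is a base-10 floating point implementation (28 significant digits by default). The variable names are mine. The same module also shows the base-10 counterpart of the problem with 1/3.

```python
from decimal import Decimal

# Base-2 floating point: 0.1, 0.2 and 0.3 are all rounded on input.
binary_ok = (0.1 + 0.2 == 0.3)

# Base-10 floating point: the same literals are stored exactly.
decimal_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# But 1/3 is periodic in base 10, so it gets rounded here too.
third = Decimal(1) / Decimal(3)
thirds_ok = (third * 3 == Decimal(1))

print(binary_ok, decimal_ok, thirds_ok)     # False True False
```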

> Decimal numbers are not conceptually any more or less exact than binary numbers.

True but irrelevant. The problem isn't with the math fundamentals, it's the programmers.

The issue is that if you get your integer handling wrong, it usually stands out. Maybe that's because integers truncate rather than round; maybe it's because the programmer has to handle all those fractions of cents manually rather than letting the hardware do it, and so has to think about it.

In any case integer code that works in unit tests usually continues to work, but floating point code passing all unit tests will be broken on some floating point implementations and not others. The reason is pretty obvious: floating point is inexact, but the implementations contain a ton of optimisations to hide that inexactness so it rarely rears its ugly head.

When it does it's in the worst possible way. In a past day job I built cash registers and accounting systems. If you use floating point where exact results are required I can guarantee you your future self will be haunted by a never ending stream of phone calls from auditors telling you code that has worked solidly in thousands of installations over a decade cannot add up. And god help you if you ever made the mistake of writing "if a == b" because you forgot a and b are floating point. Compiler writers should do us all a favour and not define == and != for floating point.

Back when I was doing this no compiler implemented anything beyond 32-bit integer arithmetic; in fact there was no open source either. So you had to write a multi-precision library, and all expression evaluation had to be done using function calls. Despite floating point giving you hardware 56-bit arithmetic (which was enough), you were still better off using those clunky integers.

As others have said here: if you need exact results (and, yes currency is the most common use case), for the love of god do it using integers.

> If you're going to do a lot of fast floating point operations for something like graphics or neural networks, these errors are fine. Speed is more important than exact accuracy.

Um... that really depends. If you have an algorithm that is numerically unstable, these errors will quickly lead to a completely wrong result. Using a different type is not going to fix that, of course, and you need to fix the algorithm.

From your description, I fail to understand how it depends. You're saying that the algorithm is wrong, and changing the type doesn't help. If the type is not the issue, what difference does it make?

A single problem can be solved by using many different algorithms.

However, even though algorithm A and B are "correct" they can behave differently when rounding errors are introduced.

For example – if algorithm A uses


and B uses naive summation then you can expect the end result of A to be more precise than the end result of B – even though both algorithms are correct.
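As an illustration of two "correct" summation algorithms behaving differently, here is a Python sketch comparing naive summation with Kahan (compensated) summation. Picking Kahan as algorithm A is my assumption; the function names are mine, and `math.fsum` is used only as a correctly rounded reference.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation; rounding errors pile up."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Compensated summation: carry the rounding error of each add forward."""
    total = 0.0
    comp = 0.0                      # running estimate of the lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y      # what was lost when y was added to total
        total = t
    return total


values = [0.1] * 1_000_000
reference = math.fsum(values)       # correctly rounded sum of the stored values
print(abs(naive_sum(values) - reference))
print(abs(kahan_sum(values) - reference))
```

Both functions compute the same mathematical sum; only the error accumulation differs.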

> and B uses naive summation then you can expect the end result of A to be more precise than the end result of B – even though both algorithms are correct.

Formally speaking, no. The problem can be defined precisely. At least one of the algorithms fails to solve the problem.

In practice of course, some amount of error may be acceptable.

In the world of money, it is rare to have to work past 3 decimal places. Bond traders operate on 32nds, so that might present some difficulties, but they really just want rounding at the hundredths.

Now, when you’re talking about central bank accruals (or similar sized deposits) that’s a bit different. In these cases, you have a very specific accrual multiple, multiplied by a balance in the multiple hundreds of billions or trillions. In these cases, precision with regards to the interest accrual calculation is quite significant, as rounding can short the payor/payee by several millions of dollars.

Hence the reason bond traders have historically traded in fractions of 32.

A sample bond trade:

‘Twenty sticks at a buck two and five eights bid’ ‘Offer At 103 full’ ‘Don’t break my balls with this, I got last round at delmonicos last night’ ‘Offer 103 firm, what are we doing’ ‘102-7 for 50 sticks’ ‘Should have called me earlier and pulled the trigger, 50 sticks offer 103-2’ ‘Fuck you, I’m your daughter’s godfather’ ‘In that case, 40 sticks, 103-7 offer’ ‘Fuck you, 10 sticks, 102-7, and you buy me a steak, and my daughter a new dress’ ‘5 sticks at 104, 45 at 102-3 off tape, and you pick up bar tab and green fees’ ‘Done’ ‘You own it’

That’s kinda how bonds are traded.

Ref:
Stick: million
Bond pricing: dollar price + number divided by 32
Delmonicos: money bonfire with meals served

I'm curious about the "off tape" part. Presumably this means not on a ticker or not made public somehow - how are these transactions publicized and/or hidden?

Hear, hear! It would be great if javascript had any integral type that we could build decimals, rationals, arbitrarily-large integers and so on off. It’s technically doable with doubles if you really know what you’re doing, but it would be so much easier with an integral type.

ES does have an arbitrarily large integer type, BigInt.


It’s not supported everywhere though, so it’s not like you could use it to actually build a library, you would need to use something that fell back to Doubles anyway.

Because the double type can guarantee accurate reproduction of integer values up to the size of its significand (53 bits, 52 of them stored explicitly), you can effectively use doubles as integers up to that size. It would be nice to be able to just have an integer directly, though, as that would be more efficient.
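The limit is easy to see from Python, assuming its `float` is the same IEEE-754 double that JavaScript numbers use (true on all common platforms). The helper name is mine.

```python
LIMIT = 2 ** 53     # first point where consecutive integers stop being representable


def survives_round_trip(n):
    """True if the integer n can be stored in a double without change."""
    return int(float(n)) == n


print(survives_round_trip(LIMIT - 1))       # True
print(survives_round_trip(LIMIT))           # True
print(survives_round_trip(LIMIT + 1))       # False: rounds to 2**53
print(float(LIMIT) + 1.0 == float(LIMIT))   # True: the +1 is silently lost
```

Below that limit, addition, subtraction and multiplication of integer-valued doubles are exact as long as the result also stays below it.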

IIRC some JS engines are capable of detecting many circumstances where floating-point is not needed, particularly for simple cases like loop counters, and their JiT compilers will produce code that uses integer values instead of floats for those purposes - but how reliable that is for cases any more complex than that I don't know.

Though the lack of support in IE, current Edge, and Safari, blocks that from client-side use for many.

There are several BigInt libraries out there that you could use, though obviously this is not as convenient and even if they wrap BigInt when available will be less efficient.

Latest Edge dev preview has supported it since the switch to Chromium. The Chromium-based Edge launches on Jan 15th, at which point Edge will support it.

Safari (WebKit) actually has a fully working implementation, they just haven't shipped it yet. Search the release notes for "BigInt": https://developer.apple.com/safari/technology-preview/releas...

How is a true integer easier than just pretending a double is an integer? In both cases, you have to be aware of the range of values they can hold to prevent overflow (integers) or rounding (doubles), and you have to be careful not to perform operations that aren't valid for integers to avoid truncation (integers) or non-zero decimal places (doubles).

'Decimal' is a red herring. The number base doesn't matter. (And what are you going to do when you need currency conversions, anyways?)

Floats are a digital approximation of real numbers, because computers were originally designed for solving math problems - trigonometry and calculus, that is.

For money you want rational numbers, not reals. Unfortunately, computers never got a native rational number type, so you'll have to roll your own.

Historically, it's correct-but-too-vague to say computers were for "solving math problems". Historic computer problems should be divided into two types: business problems and scientific/engineering problems. Business problems include things like tabulation and accounting. Programmable digital computers go back at least as far as UNIVAC I, in 1951 (using programmable digital computers for science doesn't go back THAT MUCH farther).

Prior to the IBM/360 (1964), mainframes sold for business purposes generally had no support for floating point arithmetic. They used fixed-point arithmetic. At the hardware level I think this is just integer math (I think?), but at a compiler level you can have different data types which are seen to be fractions with fixed accuracy. I believe I've read that COBOL had this feature since I-don't-know-how-far-back.

This sort of software fixed-point is still standard in SQL and many other places. Some languages, and many application-specific frameworks, have pre-existing fixed-point support. So it's also not accurate to say that you necessarily need to roll your own, though certainly in some contexts you'll need to.

And for money, you very much do not want arbitrary rational numbers. The important thing with money is that results are predictable and not fudgable. The problem with .1 + .2 != .3 is not that anyone cares about 4E-17 dollars, it's that they freak out when the math isn't predictable. Using rationals might be more predictable than using floats, but fixed-point is better still. And that's fixed-point base-10, because it's what your customers use when they check your work.

Agree that rational isn't it. But "reproducing the existing quirks" seems like an accurate description. If you want to pay 7% APR on month-end balances, then that's a real-number calculation, but to match what customers expect you also need to specify when to round off to cents.
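A minimal sketch of that idea in Python's `decimal` module. The function name, the division of the APR by 12, and the choice of round-half-even are all assumptions of mine; a real product would take these from its contract terms.

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")


def monthly_interest(balance, apr=Decimal("0.07")):
    """Interest for one month on a month-end balance, rounded to cents.

    The rounding rule is stated explicitly, which is the point:
    the raw value 'balance * apr / 12' usually has no exact cent value.
    """
    raw = balance * apr / 12
    return raw.quantize(CENT, rounding=ROUND_HALF_EVEN)


print(monthly_interest(Decimal("1000.00")))     # 5.83
print(monthly_interest(Decimal("1234.56")))     # 7.20
```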

I enjoy Haskell's approach to numbers.

An integer literal can be any type of the `Num` class, and a literal with a decimal point can be any type of the `Fractional` class. That means that they can be floating point, fractional, or integers "for free" depending on where you use them in your programs.

`0.75 + pi` is of type `Floating a => a`, but `0.75 + 1%4` is of type `Integral a => Ratio a`, which defaults to `Rational`.

Hm... what happens if you've got a neural network trained to make decisions in the financial domain?

Is there a way to exploit the difference between numeric precision underlying the neural network and the precision used to represent the financial transactions?

Neural networks are by their very nature a bit vague, random and unpredictable. Their output is not suitable as a direct, real monetary value you can rely on. At best, they predict trends, approximations or classifications.

> I'd really appreciate it if Javascript had a native decimal number type like that.

Was proposed in the late 90's by Mike Cowlishaw, but the rest of the standards committee would have none of it.

A new proposal for adding arbitrary-precision Decimal support to JavaScript is being presented at TC39 this week.

Proposal: https://github.com/littledan/proposal-bigdecimal

Slides: https://docs.google.com/presentation/d/1qceGOynkiypIgvv0Ju8u...

I'd agree for saner defaults, especially in web development. I can understand that if you want to have strictly one number type it may make sense to opt for floating point to eke out the performance when you do need it, but I'd rather see high-precision as the default (as most expect that you'd be able to write an accurate calculator app in JavaScript without much work) and opt-in to the benefit of floating point operations.

MS Excel tries to be clever and disguise the most common places this is noticed.

Give it =0.1+0.2-0.3 and it will see what you are trying to do and return 0.

Give it anything slightly more complicated such as =(0.1+0.2-0.3) and this won't trip, in this example displaying 5.55112E-17 or similar.

Are you sure it is not showing the exact answer because the cell precision is set to a single decimal digit?

Yup: https://i.imgur.com/VuawaE1.png, on Excel v1911 (Build 12228.20332).

Kahan (architect of IEEE 754) has a nice rant on it:


(and plenty of other rants...:

https://people.eecs.berkeley.edu/~wkahan/ )

I remember in college when we learned about this and I had the thought, "Why don't we just store the numerator and denominator?", and threw together a little C++ class complete with (then novel, to me) operator-overloads, which implemented the concept. I felt very proud of myself. Then years later I learned that it's a thing people actually use: https://en.wikipedia.org/wiki/Rational_data_type

Another compromise is to use fixed point, which is effectively a rational with a fixed denominator. It is extremely popular on machines which can handle integer arithmetic but not floating point (since you can trivially do fixed-point arithmetic using integer operations; you just need to be very careful when you handle overflows). If you look at the code of old-school games (including classics like Doom, if memory serves), the game engine used fixed point to work on commodity hardware without an FPU.
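Here is a rough Python sketch of 16.16 binary fixed point built on plain integers. The names and the 16-bit fraction are my choices, not taken from any particular engine. Python integers never overflow, so the overflow care mentioned above does not show up here; in C you would need a 64-bit intermediate for the multiply.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS            # the fixed denominator: 65536


def to_fixed(x):
    """Convert a float to 16.16 fixed point (rounded to nearest)."""
    return int(round(x * ONE))


def to_float(f):
    return f / ONE


def fx_mul(a, b):
    # The raw product carries 32 fractional bits; shift back down to 16.
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    # Pre-shift the dividend so the quotient keeps 16 fractional bits.
    return (a << FRAC_BITS) // b


# Addition and subtraction are just integer + and -.
a, b = to_fixed(1.5), to_fixed(2.25)
print(to_float(a + b))                              # 3.75
print(to_float(fx_mul(a, b)))                       # 3.375
print(to_float(fx_div(to_fixed(1), to_fixed(4))))   # 0.25
```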

There's also BCD (binary coded decimal) that can solve some problems by avoiding the decimal-to-binary conversions if you're mainly dealing with decimal values. For instance, 0.2 can't be represented exactly in binary floating point, but of course it poses no problem in BCD.

Beware that BCD, and decimal in general, accumulates roundoff error at a much higher rate than binary, if you do any inexact operations.

It is more common these days to use base-1000, instead, when you need exact decimal representations. You can fit three base-1000 "digits" in a 32-bit word, with two bits left over for sign plus any other flag you find useful. (One such use could be to make a zero in the second place indicate that the rest of the word is actually binary; then regular arithmetic works on such words.) Calculations in base-1000 are quite a lot faster than BCD.

Almost always when people think they need decimal, binary -- even binary floating-point, if the numbers are small enough -- is much, much better. Just be sure to represent everything as an integer number of the smallest unit, say pennies; and scale (*100, /100) on I/O.

"Much, much better" in what sense? Just performance?

Performance, correctness, and maintainability. The amount of code needed is very small, and uses native instructions for the work, which are pretty well-tested.

Fixed/floating is an interesting tradeoff for many real-time strategy games too where changes in game state are a synchronized simulation. Fixed point math in software can give more reliable and cross-platform math operations, but with a performance cost (eg: Homeworld: Deserts of Kharak). Using the CPU's floating-point hardware is faster, but you often have to ensure the correct CPU registers are set before doing calculations and those registers can be changed by other software such as a DirectX driver or the operating system (eg: Age of Empires II, Rise of Nations, etc.).

I currently build deterministic multiplayer WebGL games in Unity, built via C#->IL2CPP->Emscripten->WASM. The server is the same code base running on Microsoft's .Net runtime.

The chances of being able to run deterministic floating point calculations across this stack are basically zero (even leaving aside that the games are often run on ARM chips), and so we use this library when floats are absolutely necessary (but more often just plain longs):


It is a little terrifying that e.g. normalizing a vector involves a while loop, but all things considered the whole thing runs surprisingly well.

(I agree with everything in your post, just thought I could add a real world field report)

We also built and shipped a deterministic multiplayer WebGL game[1], but using CoffeeScript[2] + C++ -> Emscripten/dylib/DLLs to run the game in the browser and on Windows and Mac.

Our game would snapshot the entire game state every few seconds and send that back to server to detect desyncs and cheaters. Floating point math, to our astonishment, was not the source of any non-determinism.

I'm 80% sure that the only source of non-determinism we encountered was the trig functions, so we just hard-coded lookup tables.

1: https://guardiansofatlas.com/

2: It was 2012 when we started.

You use that library when you want fractional values right? That is, numbers with a binary point but not floats.

For the most part, I use longs (for instance a FixedVec is a (long,long,long) struct where 1 = 1/1000 of a meter).

However, complicated calculations or anything involving angles or other math functions quickly becomes more convenient when expressed as a Fix64, which is more or less a drop in replacement for float.

I would ideally use Fix64 everywhere, but given the tortuous route the C# takes to be transformed into something that's executed on the client machines, my faith in the compiler's ability to generate good code for that is basically zero. I mentally treat long + long as a single instruction, but Fix64 + Fix64 as a function call.

> There's also BCD (binary coded decimal) that can solve some problems by avoiding the decimal-to-binary conversions if you're mainly dealing with decimal values. For instance 0.2 can't usually be represented in binary but of course it poses no problem in BCD.

BCD is/was super common in measurement equipment for internal calculations for this reason, and also because it is trivial to format for display (LED/LCD/VFDs) or text output (bus system, printer/plotter).

Many CPUs support BCD, at least in a limited number of ways compared to their normal binary representation.

The 8086 (and its descendants, of course) supports BCD by having instructions to adjust the result after the basic add/sub/mul/div instructions, though only one byte at a time.

The 6502's add and subtract instructions would operate on, and output, BCD values if the special purpose "decimal" flag was set. Again only in 8-bit (two digit) chunks but that is to be expected as it was an 8-bit chip generally.
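To show what adding one packed BCD byte (two decimal digits) involves, here is a digit-by-digit Python sketch. It only models the arithmetic; it is not the exact semantics of the 8086 adjust instructions or of the 6502 decimal mode, and the function name is mine.

```python
def bcd_add_byte(a, b, carry_in=0):
    """Add two packed-BCD bytes (e.g. 0x38 means decimal 38).

    Returns (result_byte, carry_out). Inputs are assumed to be valid BCD.
    """
    lo = (a & 0x0F) + (b & 0x0F) + carry_in
    carry = 0
    if lo > 9:                  # low digit overflowed past 9: wrap and carry
        lo -= 10
        carry = 1
    hi = (a >> 4) + (b >> 4) + carry
    carry_out = 0
    if hi > 9:                  # same correction for the high digit
        hi -= 10
        carry_out = 1
    return (hi << 4) | lo, carry_out


result, carry = bcd_add_byte(0x38, 0x45)
print(hex(result), carry)       # 0x83 0   (38 + 45 = 83)
result, carry = bcd_add_byte(0x99, 0x01)
print(hex(result), carry)       # 0x0 1    (99 + 1 = 100: carry into the next byte)
```

Longer numbers are handled by chaining `carry_out` into the next byte, which matches the one-byte-at-a-time limitation described above.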

It's actually in use in many places, for things like handling currency and money, and for when you get funny corner cases involving rounding such numbers and pooling the change.

Whenever I see someone handling currency in floats, something inside me withers and dies a small death.

> Whenever I see someone handling currency in floats, something inside me withers and dies a small death.

Meh. When used correctly in the right circumstances it is acceptable to use floats.

Here's an example. Suppose you are pricing bonds, annuities, or derivatives. All the intermediate calculations make essential use of floating point operation. The Black–Scholes model for example requires the logarithm, the exponential, the square root, and the CDF of the normal distribution. None of that is doable without floats.
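For reference, a minimal Python sketch of the Black–Scholes price of a European call, which uses exactly those functions. The parameter names are mine, the normal CDF is built from `math.erf`, and dividends are ignored.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(spot, strike, rate, vol, t):
    """Black-Scholes price of a European call (no dividends).

    rate and vol are annualised, t is the time to expiry in years.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * sqrt(t))
    d2 = d1 - vol * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-rate * t) * norm_cdf(d2)


price = bs_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, t=1.0)
print(round(price, 4))          # about 10.4506
```

The result is a model estimate; it would still be rounded under an explicit rule before it touches a ledger.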

Even for simpler examples it is sometimes okay to use floats. If you only ever need to store an exact number of cents, you can totally store the number of cents in a double. Integer operations are exact using IEEE-754 double operations when they are smaller than 2^53-1 or so. There's usually no benefit of doing so, but hey it's possible.

Currency handling is almost never done with rationals (numerator and denominator) and is frequently (and correctly so!) done with fixed or floating point decimal types.

I develop accounting software for banks, brokerage houses and the like.

Currency, taxes, rebates, etc. handling is NEVER done with floating point.

Whatever you do with money, you need predictable, reproducible results. It is the norm that calculations are checked by software at two companies, one on each side of the transaction. Any discrepancies mean alarms, bug reports, and unhappy customers.

Every significant operation is exactly specified with rounding rules, etc.

For card payments and especially on terminals usually BCD is used.

For everything else usually some kind of arbitrary length decimal library (BigInteger, BigDecimal).

> Currency, taxes, rebates, etc. handling is NEVER done with floating point.

Nonsense. I’ve seen real banking code at reputable banks that uses floats.

> Whatever you do with money you need predictable, reproducible results.

Floats aren’t random. They’re perfectly deterministic, predictable and reproducible. If you do the same operation in two places you get the same result.

I write real banking code. There is definitely banking code that uses floats, e.g. valuation of financial instruments. The parent comment talks about software that does transactions and “simpler” calculations, like taxes and fees etc.

When people talk about non-determinism of floating point, what they usually mean is non-associativity, that is (x+y)+z may not be exactly equal to x+(y+z).
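A three-number example in Python (IEEE-754 doubles) shows the effect. Each expression is fully deterministic, but the grouping changes which intermediate result gets rounded. The integer-cents lines are my addition for contrast.

```python
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left)             # 0.6000000000000001
print(right)            # 0.6
print(left == right)    # False

# The same amounts held as integer cents associate without surprises.
cents_left = (10 + 20) + 30
cents_right = 10 + (20 + 30)
print(cents_left == cents_right)    # True
```

This is why two systems that sum the same ledger in a different order can disagree in the last digit.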

> Floats aren’t random. They’re perfectly deterministic, predictable and reproducible. If you do the same operation in two places you get the same result.

That's not exactly true in real hardware, or at least it wasn't until ~10 years ago. With the x87 FPU, internal precision was 80 bits, while the x86 registers were at most 64 bits. So, depending on the way the program would transfer data between the CPU and FPU you could get different results. It is very likely that different compilers and different optimization decisions could change the way these operations were implemented, so you would get slight differences between different versions of the software.

There are/were also several global FP flags that could get changed by other programs running on the same CPU/FPU that could impact the result of calculations. So, if you want 100% reproducible FP, you would have to either audit all software running on the same machine to ensure it doesn't touch those flags, or set the flags yourself for every FP calculation in your program.

I did not say floats are random. But when you do accounting you need to be able to sum large sets of numbers and compare results with another sum of different numbers and the sum must match. This just does not work with FP.

Poor souls that use FP for accounting are the scourge of the industry and a source of jokes.

You're confusing foreign exchange conversion with accounting arithmetic. Two different things.

This is false. It's not correct to handle currency with floating point types.

I don't see any problem with it if it's decimal. Here's an accepted answer on stack overflow with hundreds of upvotes recommending the use of `decimal` to store currency amounts in C#. That's a decimal floating point type.


They said floating point decimal types which probably means BCD.

There are different implementations, and BCD is only one of them. Another popular one is a mantissa and exponent, but the exponent is for a 10-based shift rather than the typical floating point.

They mean radix-10 floating point, as compared to the radix-2 floating point you are thinking of. The packing of the decimal fractional digits in the significand of a radix-10 FP number need not be in BCD, it can use other encodings (e.g., DPD or something else).

0.3 is exactly representable in radix-10 floating point but not radix-2 FP (would be rounded to a maximum of 0.5 ulp error as seen in the title), for instance, just as 1/3 = 0.3333... is exactly representable in radix-3 floating point but neither radix-2 nor radix-10 FP, etc.

Right, it is not correct. But many programs do it wrong. If you just do a couple of additions the problem will never be noticed. It's easy to write a program that sums up 0.01 until the result is not equal to n * 0.01. I'm not at my computer now, so I can't do it again. I remember n was too big to be relevant for any supermarket cashier. But of course applications exist where it matters.

But it is correct.

> It's easy to write a program that sums up 0.01 until the result is not equal to n * 0.01.

It's not easy to do that if you use a floating point decimal type, like I recommended. For instance, using C#'s decimal, that will take you somewhere in the neighborhood of 10 to the 26 iterations. With a binary floating point number, it's less than 10.
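The experiment is short enough to sketch in Python, using `float` for binary floating point and the standard `decimal.Decimal` as a stand-in for a decimal floating point type (it is not C#'s `decimal`, so the iteration counts will not match exactly). The function name and the search limit are mine.

```python
from decimal import Decimal


def first_mismatch(step, limit):
    """Smallest n <= limit where adding step n times differs from n * step.

    Returns None if no mismatch shows up within the limit.
    """
    total = step * 0            # a zero of the same numeric type as step
    for n in range(1, limit + 1):
        total += step
        if total != n * step:
            return n
    return None


print(first_mismatch(0.01, 1000))               # a small n for binary doubles
print(first_mismatch(Decimal("0.01"), 1000))    # None: every sum is exact
```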

It's not correct, but it happens anyway, even in large ERP systems that really should know better but somehow don't.

It is correct! Using decimal types is the widely recommended way of solving this problem. That includes fixed and floating point types. The problem is using base-2 floating point types, since those are subject to the kinds of rounding errors in the OP. But decimal floating point types are not subject to these kinds of rounding errors.

But they still can't precisely represent quantities like 1/3 or pi.

It's not correct, but in many cases it's plenty accurate

If you are dealing with other people’s money, the only accurate is accurate. Close enough should not be in any financial engineer’s mindset, imho.

It's not always terrible. I've seen doubles appropriately used in cases where performance was paramount, and floating point error was either not relevant or less important.

That said, yeah, when working with money in situations where money matters, some sort of decimal or rational datatype should be the rule, not the exception.

Storing money in floating point is always terrible. If speed is an issue, store it in integer types representing the smallest unit in the currency, e.g. pennies.

Unless you’re doing, what, massively parallel GPU algos on batches of independent amounts? But even then you could use the float as an int in that way... Honestly when is float ever actually good for money? Not for speed, not for correctness, ...

I think you mean that storing money in floating point is always terrible for accounting. Not all of finance is accounting.

Imagine you work at a hedge fund, and you have a model that predicts the true value of some option. Assume the option is trading for $3.00. You do not really care if your model spits out $3.5 or $3.5000000001, you are going to buy either way. And your model probably involves a bunch of transcendental functions or maybe even non-deterministic machine learning, so it's not really meaningful to expect it to be “exact” to some decimal or even rational value.

Even more saliently, you probably don't care whether your model outputs 2.9999999 or 3.000000 or 3.000001, either, because in any of those cases the actual correct interpretation is “we’re just not sure whether to buy or not”.

I think a good first-order characterization of domains where floating point can safely be used is “when the difference between < and <= is not very meaningful” (in calculus terms: when “how meaningful is a difference of `x`” is a continuous function of `x`).

I think "floating point is bad for storing currencies" is one of the most common misconceptions about floating point.

Most people don't realize that IEEE-754 single precision carries 24 significant binary digits (23 of them stored explicitly), which is about 7 significant decimal digits; 9 decimal digits are enough to round-trip any single. The double, on the other hand, carries 53 significant bits, which is about 15 to 17 significant decimal digits.

This means that the UPPER BOUND on a double's relative rounding error is 2^-53, about 1.1e-16, per operation. In practice the error is usually smaller than that bound.
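For reference, the double-precision parameters can be read straight from the runtime. This is a small sketch using Python's `sys.float_info`, assuming the platform uses IEEE-754 doubles (true of essentially every mainstream platform); `unit_roundoff` is just a name chosen here.

```python
import sys

info = sys.float_info

print(info.mant_dig)  # significant bits in a double
print(info.dig)       # decimal digits that always survive a round trip through a double
print(info.epsilon)   # gap between 1.0 and the next larger double

# With round-to-nearest, one correctly rounded operation has a relative
# error of at most half of epsilon.
unit_roundoff = info.epsilon / 2
print(unit_roundoff)
```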

Also, it is possible to extend the range using denormals, but some compilers flush them to zero under optimization or fast-math settings to avoid performance degradation.

The overhead associated with non-float types might not be worth the cost and risk for most applications. Of course, if the language you are working with provides a currency type, go for it. But if it doesn't, there is no need to worry.

I agree with your overall point: it most likely does not matter when the values are close enough. However :)

There can be two companies with 100M market cap. Corp A has issued 10M shares @ 10 each, Corp B has 10B shares priced at 0.01

A +/-0.001 change in Corp A share price is just 0.01% and moves the market cap by +/-10k, so probably nothing significant. The same nominal change in Corp B amounts to 10%, or +/- 10M in the company value, which is quite a big deal.
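The arithmetic for the two companies can be checked exactly. A minimal sketch using `fractions.Fraction` so no rounding enters the picture; `cap_change` is a hypothetical helper, not anything from a real library.

```python
from fractions import Fraction

def cap_change(shares, price, delta):
    """Return (relative price change, change in market cap) for a price move of `delta`."""
    return delta / price, shares * delta

delta = Fraction(1, 1000)  # a 0.001 move in the share price

# Corp A: 10M shares at 10 each. Corp B: 10B shares at 0.01 each.
corp_a = cap_change(10_000_000, Fraction(10), delta)
corp_b = cap_change(10_000_000_000, Fraction(1, 100), delta)

print("Corp A:", float(corp_a[0]), int(corp_a[1]))
print("Corp B:", float(corp_b[0]), int(corp_b[1]))
```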

Also I think there may be some money to be made in changes at the 7th decimal place with large enough volume of high frequency transactions.

And that roughly captures the spot where I was seeing doubles used.

Yes, they could have used fixed point. I am guessing that what happened is that someone who had thought way more deeply about this than I ever needed to (I worked on the accounting side, where, yep, we always used decimals) either determined that, where the modeling was concerned, floating point errors were not worth worrying about, or estimated that the expected cost to the company stemming from bugs due to fixed point math being easier to goof up on would have been smaller than the expected cost to the company due to floating point error.

My day job is high performance financial model implementation. Floats storing dollar amounts are the norm for predictions. Operating on values that are linear combinations of integer fractions multiplied by irrational constants (such as Euler’s number) is perfectly possible, but it’s much more performant to be aware of floating point epsilon when writing modeling code.

Financial models are predictive, they don't have to be accurate to a penny, right? Unlike processing actual money people own.

(I do some work with predictive simulations about money, but outside finance, and there we care that the result has accurate order of magnitude. Floats were used extensively in the project; I actually upgraded them to doubles for the sake of handling larger order of magnitude spans.)

I stand corrected, thanks for this example.

> If speed is an issue, store it in integer types representing the smallest unit in the currency, e.g. pennies

More typically, mills[1] (tenth of a cent).

[1]: https://en.m.wikipedia.org/wiki/Mill_(currency)

Amazon's EC2 hourly prices are rounded to mills ($0.011/hour).


Azure has some hourly prices with ten-thousandths of a dollar, i.e. hundredths of a cent ($0.0102/hour):


Microsoft should use gas station 9/10 pricing conventions to just barely undercut Amazon's lowest price $0.011 with $0.0109.


>“They found out that if you priced your gas 1/10 of a cent below a break point, let’s say 40 cents a gallon, ‘.399’ just looked to the public like 39 cents…”

Storing money in floating point is fine. Just round to the nearest atomic unit when displaying. Sometimes this is a necessity when working with money in e.g. existing JSON APIs. You lose a few bits of range relative to fixed point storage but it's almost never a practical issue.

Performing arithmetic operations against money in floating point is the dangerous part, as error can accumulate beyond an atomic unit.

> Performing arithmetic operations against money in floating point is the dangerous part, as error can accumulate beyond an atomic unit.

A good example of this is trying to compute the sales tax on $21.15 given a tax rate of 10%. The exact answer would be $2.115, which should round to $2.12.

IEEE 64-bit floating point gives 2.1149999999999998, which is hard to get to round to 2.12 without breaking a bunch of other cases.

Here are three functions that try to compute tax in cents given an amount and a rate, in ways that seem quite plausible:

  def tax_f1(amt, rate):
    tax = round(amt * rate,2)
    return round(tax * 100)
  def tax_f2(amt, rate):
    return round(amt*rate*100)
  def tax_f3(amt, rate):
    return round(amt*rate*100+.5)
On these four problems:

   1% of $21.50
   3% of $21.50
   6% of $21.50
  10% of $21.15
the right answers are 22, 65, 129, and 212. Here are what those give:

  tax_f1:  21  65 129 211
  tax_f2:  22  64 129 211
  tax_f3:  22  65 130 212
Note that none of them gets all four right.
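For comparison, here is a sketch of the same four problems done in decimal arithmetic with explicit half-up rounding. It assumes the amount and rate arrive as decimal strings rather than floats; `tax_decimal` is a name invented for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

def tax_decimal(amt, rate):
    """Tax in cents, rounded half up; `amt` and `rate` are decimal strings."""
    cents = Decimal(amt) * Decimal(rate) * 100
    return int(cents.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

cases = [("21.50", "0.01"), ("21.50", "0.03"), ("21.50", "0.06"), ("21.15", "0.10")]
results = [tax_decimal(amt, rate) for amt, rate in cases]
print(results)
```

Because the inputs never pass through a binary float, the half-cent cases are exactly half a cent and the rounding mode decides them.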

I did some exhaustive testing and determined that storing a money amount in floating point is fine. Just convert to integer cents for computation. Even though the floating point representation in dollars is not exact, it is always close enough that multiplying by 100 and rounding works.

Similar for tax rates. Storing in floating point is fine, but convert to an integer by multiplying by an appropriate power of 10 first. In all the jurisdictions I have to deal with, tax rate x 10000 will always be an integer so I use that.

Given amt and rate, where amt is the integer cents and rate is the underlying rate x 10000, this works to get the tax in cents:

  def tax(amt, rate):
    tax = (amt * rate + 5000)//10000
    return tax
I'm not fully convinced that you cannot do all the calculations in floating point, but I am convinced that I can't figure it out.
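Putting the pieces above together, a sketch of the store-as-float, compute-as-integer approach on the same four problems. `to_cents` and `to_rate_units` are names made up here, and the rate scale of 10000 is the one mentioned above.

```python
def to_cents(dollars):
    """Convert a float dollar amount to integer cents."""
    return round(dollars * 100)

def to_rate_units(rate):
    """Convert a float tax rate to an integer number of ten-thousandths."""
    return round(rate * 10000)

def tax(amt, rate):
    """Tax in cents from integer cents and integer rate x 10000, rounding half up."""
    return (amt * rate + 5000) // 10000

cases = [(21.50, 0.01), (21.50, 0.03), (21.50, 0.06), (21.15, 0.10)]
results = [tax(to_cents(amt), to_rate_units(rate)) for amt, rate in cases]
print(results)
```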

> Storing money in floating point is fine. Just round to the nearest atomic unit when displaying.

Well, it's not just a display issue. In accounting, associativity and commutativity are important. People do care that `a + b + c - a == c + b` should evaluate to “true”.
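A small illustration of the associativity point, with values picked only for the demonstration: regrouping a float sum can change the result, while the same sum in integer cents cannot.

```python
# Dollar amounts held as binary floats.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)

# The same amounts held as integer cents.
ai, bi, ci = 10, 20, 30
left_cents = (ai + bi) + ci
right_cents = ai + (bi + ci)
print(left_cents == right_cents)
```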

There's very little point in storing money in floats if you're not going to do arithmetic in floats; about the only use case I can think of is JavaScript and JSON APIs.

Pennies (or any equivalents) are not the smallest unit in any currency. Fractions of it are perfectly acceptable and even common.

Even decimal floating point is a bad idea (for dealing with money) since you still can't represent a subset of rational numbers without approximation and without introducing rounding error during some calculations. It's just a different subset than what binary floating point can represent without approximation.

Well, this is one of those things where context matters.

In trading, it's super common to use floating point arithmetic for decision logic since it's very fast and straightforward to write. The actual trade execution, however, almost always relies on integer arithmetic because then money is actually being used (and hence must be tracked properly).

It's not therefore inherently incorrect to do currency conversions with floats in some situations provided that the actual transaction execution relies on fixed precision or decimal arithmetic.

When I was in college the professor of my software engineering class explicitly warned us to never use floating point numbers for money. He went on at length about the dangers of floating point for dealing with money and warned us that people can get really upset if they feel like they've been screwed out of money.

He had decades of experience in the software development industry and I got the feeling that he'd seen the effect of this issue personally.

I still remember that warning well.

I haven't worked in fintech but I've read that money is often represented (at least in storage) as plain integers, since for example US currency only ever goes to two decimal places. But I guess once you start operating on it you run into potential truncation unless you use rationals.

In finance, US dollars are generally stored to four decimal places, because you need to deal with stuff like compounding interest or stock splits.

COBOL has a built-in fixed-point decimal type, which makes defining a 4 digit decimal and doing math on it easy. (It was designed from the ground up to cater to people with a lot of money, who spend a lot of money, to work with lots of money, i.e. banks.) Java has the BigDecimal type, which is a class in the class library, which means you need to import it. And because Java lacks operator overloading, doing calculations is tedious.

In the 90s, there was a huge push to replace COBOL with <something else>, and Java was the Rust of its day, so that's what everyone got behind. However, 4 digit COBOL decimals apparently round differently than 4 digit Java BigDecimals, so all the tests failed. And all the stuff like a*x+b had to be written like a.multiply(x).add(b), so development was taking forever.

Eventually they said "fuck it" and 20 years later we're still stuck with COBOL and everyone who remembers the original death march says "never again".

I have a feeling a lot of the problems came down to computer science people thinking money has two decimal digits but domain knowledge people knowing it has four. We programmers, as a group, make a lot of assumptions about other peoples' domains and we're wrong a lot.

I've had the thought that programmers should note assumptions in flagged comments, and those comments should be automatically collected, and then reviewed occasionally. Assumptions might be sustainable, so to speak, but they can also create one kind of technical debt.

> make a lot of assumptions about other peoples' domains and we're wrong a lot

What do you mean this person has no surname? That's unpossible, surname is never null, error error.

US currency can go to more than two decimal places...


I guess it's time for someone to write an "Assumptions Programmers make about money" post.

Falsehoods programmers believe about prices: https://gist.github.com/rgs/6509585

Interesting list, though I'm not sure what they mean by no. 7.

1. Money in a brokerage account is not US currency.

While it isn't physical US currency, my brokerage account represents the value of the account in units of USD; therefore any rules about how US currency works should apply.

Additionally, fractional cents are often presented to the consumer when purchasing gas/fuel.

Money is not as weird as anyone might guess. I work on a financial application, and money is almost always just a BigDecimal with the scale set to 2 (and stored in a database as a bigint type or equivalent). When it's not, it's just a higher scale (for, say, compound daily interest on small amounts for a significant period of time).

How do you store Bitcoin?

No it can't. There are systems that track things worth less than a penny for later billing, but at the end of the month when they bill someone, they do some sort of rounding.

If you are earning interest at a bank, and you've earned a fraction of a penny, they will eventually pay it to you once you've earned enough for a whole penny.

i.e. they track your account balance to more than 2 digits, they just only show you 2 digits.

Someone should tell that to everyone who ever used a ½¢ coin in the US. Also, US law explicitly states (31 USC §5101) that the unit of 1/1000th of a dollar is a mill.

>>...since for example US currency only ever goes to two decimal places

That is not correct. Stock settlement transactions often list four decimal places.

> Stock settlement transactions often list four decimal places.

That's not a significant difference compared to two decimal places, so brundolf's point still stands. There's no need for arbitrary precision.

Just store all dollars in PIPs, so $5 will be stored as 50000.

You can still get reasonable enough approximations with more than two decimals if you do something like `int64 myWorkingMoneyVal = currentMoney * 100000`, do your work, then divide the final result by 100000. You still risk some potential truncation if your work involves division, but the larger your multiplier that you're working with, the larger divisor at the end, which will help minimize how much of an error this ends up being. The 64 bit integer space is pretty darn big, so you typically don't risk an overflow, and you will typically get better performance than using a regular "decimal" type, since on-chip integer operations are usually very fast.

EDIT: Just a note, there's nothing special about the number 100000; pick the largest power of 10 that you can get away with while keeping a reasonable assurance that no overflow is possible. For a vast majority of money applications, I seriously doubt you're going to be hitting the limits of int64, so you could probably even get away with something like 1000000000.
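To get a feel for the headroom, here is a rough sketch of how many whole dollars fit in a signed 64-bit integer for a given multiplier. Python integers do not overflow, so the int64 limit is modelled by hand; `max_whole_dollars` is a hypothetical helper.

```python
INT64_MAX = 2**63 - 1

def max_whole_dollars(multiplier):
    """Largest whole-dollar amount whose scaled value still fits in a signed int64."""
    return INT64_MAX // multiplier

for multiplier in (100, 100_000, 1_000_000, 1_000_000_000):
    print(multiplier, max_whole_dollars(multiplier))
```

With a multiplier of 1000000000 the ceiling is a little over 9.2 billion dollars, and any intermediate product (amount times rate, say) hits the limit sooner than the stored values do, so the multiplier is worth choosing against the largest intermediate value rather than the largest balance.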

Google uses 1000000 as multiplier in their APIs.

Edit: And they forbid equality comparisons for rationals. For some reason even >= is not allowed.

I didn't know that, but it doesn't surprise me (I suspected I wasn't the first person to come to the realization that there's no reason not to choose a giant number :) ).

I have developed a payment plan calculator for asset based finance and you would be amazed how many different rounding schemes and day counters (for fractional periods) exist and are actively used.

Counterexample: gas prices in the US are frequently displayed with 3 decimals (tenth of cents)

It isn't really a price though, it is a rate for an infinitely divisible good, i.e. $/L. You get the price when you multiply with the quantity purchased.

So how much do 2 CCs of gasoline cost?

The basic representation is usually good enough at 2 decimals (so a plain int), but it is often needed to have a transient representation during calculations.

For instance if one needs to apply discounts, add taxes, split in equal parts, all of the above one after the other, there will be a more precise intermediary representation before rounding everything in a way that keeps the total amount consistent with the original amount.

> It's actually in use in many places, for things like handling currency and money

Hm, are you sure? I don't believe "rational" types which encode numbers as a numerator and denominator are typically used for currency/money.

If they were, would the denominator always be 100 or 1000? I guess you could use a rational type that way, although it'd be a small subset of what rational data types are intended for. But I guess it'd be "safe"? Not totally sure actually, one question would be if rounding works how you want when you do things like apply an interest percentage to a monetary amount. (I am not very familiar with rational data types, and am not sure how rounding works -- or even if they do rounding at all, or just keep increasing the magnitude of the denominator for exact precision, which is probably _not_ what you'd want with currency, for reasons not only of performance but of desired semantics).

You are correct an IEEE-754 floating point type is inappropriate for currency. I believe for currency you would generally use a fixed-point type (rather than floating point type), or non-IEEE "arbitrary precision floating point" type like ruby's BigDecimal (ruby also offers a Rational type. https://ruby-doc.org/core-2.5.0/Rational.html . This is a different thing than the arbitrary-precision BigDecimal. I have never used Rational or seen it used. It is not generally used for money/currency.) Or maybe even a binary-coded decimal value? (Not sure if that's the same thing as "arbitrary-precision floating point" of ruby's BigDecimal or not).

There are several possible correct and appropriate data encodings/types for currency, that will have the desired precision and calculation semantics... I am not sure if rational data type is one of them, and I don't believe it is common (and it would probably be much less performant than the options that are common). Postgres, for instance, does not have a "rational" type built in, although there appears to be a third-party extension for it. Yet postgres is obviously frequently used to store currency values! I believe many other popular rdbms have no rational data type support at all.

I'm not actually sure what domains rational data types are good for. Probably not really anything scientific measurement based either (the IEEE-754 floating point types ARE usually good for that, that is their use case!) The wikipedia page sort of hand-wavily says "algebraic computation", and I don't know enough about math to really know what that means. I have never myself used rational data types, I don't think! Although I was aware of them; they are neat.

Good catch! I'm thinking of fixed-point number types. Ruby's Rational was/is cool, but looks like an inherently difficult number-type to work with and keep sanity high.

For currency, business side should decide the rules (* 100 or * 1000000), and where to funnel the pennies ;) Fixed-point has its own sort of gotchas, i.e. multiplication, power, division, sqrt, etc. So there are fancy techniques to work with the numbers, like https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm

When you start looking into all of this, it's interesting to see how many ways there are to represent numbers in a computer. It isn't actually obvious or trivial at all, there is no one "true" or "accurate" representation, and they all had to be invented by past computer scientists!

If you're not a bank or similar but just dealing with currency to buy and sell things like for ecommerce, the default semantics of a fixed point type (like postgres 'money' type) or "arbitrary precision floating point" type like ruby's BigDecimal are probably good enough, and just fine in a way that IEEE-754 floating point types definitely are NOT. And they probably don't require any additional business side decisions or involve any significant gotchas. Just using them instead of IEEE-754 floating point and not thinking too hard after that is probably just fine.



If you ARE a bank or something similar -- I wouldn't know, I haven't done that! A relevant question: Am I concerned with specifying exactly how fractional pennies get rounded?

Ask the COBOL guys and gals for the true answer ;)

There are accounting, balancing, laws, regulations and reconciliation issues where for the really serious stuff, you use whatever fits the spec and requirements, not the other way around. Ruby's BigDecimal will be fine, if you implement the detailed specification about how to calculate each operation every step of the way, with designated precision at various steps, together with truncations along the way that may not make much sense to the developer (or anyone else, but are required to get correct numbers).

Point is, sometimes other parties need to be able to replicate the exact numbers, unrelated to any internal library or coding standards. Code using just plain integers could be easier to certify than a library dependency.

In such cases, you don't just round to make numbers prettier, but may even keep the truncated part of the equation. It's then nice to use simple stuff that is proven to work and does not change over time.

>If they were, would the denominator always be 100 or 1000?

The numerator and denominator get automatically reduced to lowest terms (just like you learned in elementary school, so 15/100 becomes 3/20) internally by every implementation I know of. This comes at a performance cost for every operation, but it helps keeping the numerator and denominator from blowing up.

>not sure how rounding works -- or even if they do rounding at all

They do not. The point of a Rational type is to keep precise values, so it's up to the programmer to decide when and how values are rounded.

>"arbitrary precision floating point" type like ruby's BigDecimal

Not sure how Ruby implements BigDecimal, but Java internally represents it as a BigInteger of digits, and a second integer that represents where in the number the decimal point should go. This means that BigDecimal still can't truly represent a value such as 1/3, since you can't have an infinite amount of 3's, but a Rational can.
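The same contrast can be sketched in Python, with `fractions.Fraction` as the rational type and `decimal.Decimal` standing in for an arbitrary-precision decimal. These are Python's types, not Ruby's or Java's, so details such as default precision differ.

```python
from decimal import Decimal
from fractions import Fraction

# Fractions are reduced to lowest terms on construction.
reduced = Fraction(15, 100)
print(reduced)

# A rational holds 1/3 exactly, so multiplying back by 3 recovers 1.
third_exact = Fraction(1, 3)
print(third_exact * 3 == 1)

# A decimal has to cut the 3s off at the context precision (28 digits by default).
third_decimal = Decimal(1) / Decimal(3)
print(third_decimal * 3 == 1)
```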

>I'm not actually sure what domains rational data types are good for.

I'll be honest and say I've never had to use them either, but it's nice to know they exist. The intended use case is when you need to perform calculations and maintain as much precision and accuracy as possible in the intermediate values and such accuracy is more important than speed.

> If they were, would the denominator always be 100 or 1000?

Only if you never use anything but addition and subtraction. So, no currency conversions, interest rates, complex taxes or rebate schemes or the like.

Currency in banking is handled with bigints. Not rationals, just bigints of the smallest unit (i.e., 1 cent). This forces you to order operations so that divisions are done last or not at all.

The bigint you describe is just a poor man's rational, given that no computer architecture or mainstream language support them natively.

There is a slight difference where it forces decision-making about rounding at every step. Most importantly, it makes errors that print money (or burn money) impossible states. I'm aware of BigRational libraries or more clever currency abstractions, but this is the least magical way of doing it, kind of like representing datetimes as 64-bit UNIX epochs.

Well, you could look at it as x/100 rationals representing a dollar value. But you could also look at it as an integer amount of a smaller unit (cents). The difference is insignificant; computers support it natively.

> It's actually in use in many places, for things like handling currency and money

Which specific places have you seen it used in?

Reminds me of this Inigo Quilez article on experimenting with rendering using rational numbers: https://iquilezles.org/www/articles/floatingbar/floatingbar....

I actually ran into a bug recently while implementing my first raytracer, where the point calculated from the sphere-intersect test would just occasionally end up inside the sphere due to floating point imprecision, so the diffuse sample rays would have their origins completely in the dark, leading to randomly black pixels. Solved it by bumping every intersection out by 0.01 in the direction of its normal.
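The fix described here amounts to something like the following sketch. The tuple representation, the function name and the unit-sphere example are all assumptions for illustration, not the poster's actual code.

```python
def offset_along_normal(point, normal, eps=0.01):
    """Move a hit point a small distance out along the unit surface normal."""
    return tuple(p + eps * n for p, n in zip(point, normal))

# A hit on the "north pole" of a unit sphere centred at the origin.
hit = (0.0, 0.0, 1.0)
normal = (0.0, 0.0, 1.0)
bumped = offset_along_normal(hit, normal)

# The squared distance from the centre now clearly exceeds the squared radius,
# so secondary rays start outside the sphere.
dist_sq = sum(x * x for x in bumped)
print(bumped, dist_sq > 1.0)
```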

And then of course there have been several other "x.abs() < 0.01" cases for various purposes. So I could definitely see that being an interesting experiment.

Here's some good reading on robust ways to fix this without an arbitrary epsilon bump: http://www.pbr-book.org/3ed-2018/Shapes/Managing_Rounding_Er...

This phenomenon has a name, by the way: "shadow acne" (which is actually a more general phenomenon, but this is an example of it).

That's really interesting - hadn't thought of that before. To fix that, would you be able to do a square of the magnitude comparison with the radius and just bump the borderline cases, or is it more efficient without the extra branching?

I just did it across the board; since the error is in the floating-point noise I don't know if I'd even trust a comparison on that. Plus, the discrepancy between "bumped" and "unbumped" samples might cause some visible artifacts.

Direct3D used to have a Z-bias for a similar problem: rendering pictures hanging on walls at a far Z depth. Their whql tests even tested for it.

It was fun discovering all the corner cases while debugging drivers.

"Why don't we just" because it's harder than one thinks.


and gets harder when you want exact irrationals too https://www.google.com/search?q=exact+real+arithmetic

Although this does make me wonder what happens if you round the rational once the numerator/denominator becomes too big.

But maybe that just results in all the floating point weirdness again, just not for small rationals.

The result is that your number system (a) makes many common operations dramatically more computationally expensive, (b) has less predictable rounding which is very tricky to reason about, (c) generally gives worse results for the same bit budget. Instead of e.g. evenly spaced numbers, you get spacing like https://en.wikipedia.org/wiki/Minkowski%27s_question-mark_fu...

One thing you can try is storing a floating point numerator and a floating point denominator, and renormalizing them by bit shifts instead of finding GCDs. This lets you avoid rounding errors for small ratios. For general purposes this advantage isn’t really worth doubling the number of bits and complicating arithmetic for though.

See e.g. https://observablehq.com/@jrus/qang

Also, it's a lot less efficient, so it should only be used if absolutely necessary.

But rationals are more expensive to compute with (compared to floating-point; this is another example of the trade-off between performance and accuracy.)

it's also a range-storage trade-off. if you use two fixed width integers to represent a rational, the minimum and maximum values are the same as that of the integer type. floating point gives a far wider range for the same number of bits.

I'm sure there's some subtlety I'm missing, but isn't it actually the same trade-off? A 64-bit float can only represent 52-bit integers exactly. Anything above that, and you don't even have integer-level precision on the number anymore... This sliding scale of precision is exactly why floats are terrible at the kinds of operations that would cause you to use a rational instead.

> I'm sure there's some subtlety I'm missing, but isn't it actually the same trade-off?

not exactly, unless you consider space efficiency to be an aspect of performance (which is certainly reasonable). a naive implementation of rationals using two int32_t's only covers the range of a single int32_t, despite using as many bits as the double. it's also a trade-off between range and consistent precision, of course.

this certainly isn't some deep insight into number representation, just a quick point for the benefit of people who haven't thought much about rational data types before.

Once you care about that level of performance, you can surely optimise your representation to have a greater range (use more bits for the numerator) or greater precision (more bits for the denominator) or some boutique solution like using three integers to store the number a + b/c.

You can store slightly fewer numbers with rationals, because it's hard to avoid having a representation for both 2/4 and 3/6. But the loss of range or precision due to that is pretty small.

it's not just that they are expensive, it's that there is a nondeterministic compute time.

Let's say we need to do a comparison. Set

    a = 34241432415/344425151233 

    b = 45034983295/453218433828
Which is greater?

Or even more fiendish, set

    a = 14488683657616/14488641242046

    b = 10733594563328/10733563140768
which is greater?

By what algorithm would you do the computation, and could you guarantee me the same compute time as comparing 2/3 and 4/5?

I’m not sure I follow. Isn’t it just two integer multiplications followed by a comparison?

a/b > x/y is the same as ay > xb

Assuming you don’t overflow your integer type.

> Assuming you don’t overflow your integer type.

There's your answer :)

It's far too easy to overflow your integer type by simply adding a bunch of rationals whose denominators happen to be coprime, or just by multiplying rationals. For this reason, the vast majority of rational implementations use arbitrary precision integers, and of course arithmetic on those isn't constant time.
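A sketch of the cross-multiplication comparison using Python's arbitrary-precision integers, applied to the first pair of fractions from the question above. `greater` is a name chosen here, and the bit count is printed to show whether the intermediate product would fit in 64 bits.

```python
from fractions import Fraction

def greater(a_num, a_den, b_num, b_den):
    """True if a_num/a_den > b_num/b_den, assuming positive denominators."""
    return a_num * b_den > b_num * a_den

a = (34241432415, 344425151233)
b = (45034983295, 453218433828)

a_is_greater = greater(*a, *b)
product_bits = (a[0] * b[1]).bit_length()

print("a > b:", a_is_greater)
print("bits needed for one cross product:", product_bits)
```

The comparison is still just two multiplications, but once the operands outgrow a machine word their cost depends on the size of the numbers, which is the point about compute time not being constant.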

One approach would be to hold on to rationals for as long as possible, to eliminate drift, and then dump them out to the nearest floating-point at the very last moment

IEEE floats are pretty complicated, but today’s CPUs have dedicated support for those and not for rationals, so we use them where we probably shouldn’t.

IEEE floats are absolutely great for many applications where rationals would be overkill or even inappropriate. A videogame doesn't care if the result is 0.3 or 0.30000000000000004. Even some scientific applications can use floats if the coder knows what they're doing.

The problem is devs who don't understand what they're doing and just think that they can use floats in every situation and it'll just work out fine. This is not helped by many popular scripting languages that just default to floats when a result doesn't fit an integer (something more reasonable languages like Common Lisp don't do for instance).

For instance, to speak of videogames again, very tight precision isn't usually an issue but loss of granularity when numbers get very big can cause problems, especially if you have very large levels. That being said rationals wouldn't really help you here, you'd have the same problem except now you have to keep two numbers within bound instead of one. Imagine having a very small offset in a complex operation and ending up with a number like 100000000000000000000000000/100000000000000000000000001 !

I love this demonstration of that phenomenon: https://twitter.com/schteppe/status/1143111757751357440?s=20

Why 'even some scientific applications'? Don't nearly all scientific applications use floats?

Yeah, I assumed as much. I wasn't really thinking about that at the time, but knowing now that it exists in the wild, the only conceivable reason for it not to be used everywhere would be some kind of performance penalty.

When I was reading about this I thought why don't the print functions just by default round to the nearest 10 decimal places or similar so 0.30000000000000004 prints as 0.3 unless you specify you don't want that. And I wrote a function in javascript to round like that though it was surprisingly tricky and messy to do so.
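In Python, at least, this kind of display rounding is a one-liner with the `g` format, which keeps a fixed number of significant digits and drops trailing zeros. A minimal sketch; `short_repr` is a made-up name and 10 digits is the figure from the comment.

```python
def short_repr(x, digits=10):
    """Format a float with at most `digits` significant digits, dropping trailing zeros."""
    return format(x, f".{digits}g")

print(short_repr(0.1 + 0.2))
print(short_repr(1 / 3))
```

This only changes what is displayed; the stored value is still the binary approximation.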

Some langs have that in their standard included batteries.

(Shameless Common Lisp plug: http://clhs.lisp.se/Body/t_ration.htm)

Yes including C++, the language mentioned in the parent post (`std::ratio`)

You either feel smart by wondering why people don't use rationals, or feel smart by wondering why people use rationals.

what do you mean by "rationals"? infinite precision? because finite precision rationals are not associative and much worse than floats in many senses.

Handling quantities with varying Unit of Measures is made quite a bit easier by using numerator and denominator pairs.

Not all numbers are rational.

And not all numbers are real. And IEEE 754 floating point numbers do not even cover all real numbers.

All floating point numbers are rational numbers.

FWIW, both of those can be expressed exactly by floating-point numbers ;)

Whew. I almost put some non-zeros in there.

Right, all integers up to 2^53 (or something like that) can be (in double precision).
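The 2^53 boundary is easy to check, assuming IEEE-754 doubles (which is what Python's `float` is on mainstream platforms):

```python
limit = 2**53

# Every integer up to the limit converts to a double exactly...
below_is_exact = float(limit - 1) == limit - 1
at_is_exact = float(limit) == limit

# ...but limit + 1 has no double of its own and rounds back down to the limit.
above_collapses = float(limit + 1) == float(limit)

print(below_is_exact, at_is_exact, above_collapses)
```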

I assume that's the reason they made the mantissa linear, even though having the whole thing logarithmic makes more sense.

Addition/subtraction are also much simpler/cheaper than they would be in an entirely logarithmic model. If floats were just 2^x with some 64 bit fixed point x, it's not clear to me how to do addition efficiently.

What. Mantissa is already logarithmic, bit number n has value 2^(n - N-1) for an N-bit mantissa.

This is how positional number systems work at all.

The mantissa is linear. It's unrelated to how positional number systems work. A floating point value is split into two numbers - the exponent and the mantissa. Normally they are used to represent a final number like:

    x = 2^e * (1 + m)

Where e is the exponent and m is the mantissa (varying linearly from 0 to 1).

But you could have a fully exponential number format:

    x = 2^(m + o)

As pointed out though, it makes addition much more complicated, you can't exactly represent integers, and someone told me it makes quantisation noise worse too. Bad idea.
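The split into exponent and linear mantissa can be inspected directly. This is a rough Python sketch for positive, normal doubles only; `decompose` and `fully_log` are hypothetical helper names, and `fully_log` simply returns the single number a purely logarithmic format would have to store:

```python
import math


def decompose(x: float):
    """Split a positive, normal double into (e, m) with x == 2**e * (1 + m), 0 <= m < 1."""
    frac, exp = math.frexp(x)  # x == frac * 2**exp with 0.5 <= frac < 1
    return exp - 1, 2.0 * frac - 1.0


def fully_log(x: float) -> float:
    """What a fully logarithmic format would store for x: log2(x)."""
    return math.log2(x)


for value in (1.0, 1.5, 2.0, 3.0, 0.3):
    e, m = decompose(value)
    print(value, e, m, fully_log(value))
```

In the usual format 3.0 is e = 1, m = 0.5 exactly, whereas in the logarithmic form it would need log2(3), which is irrational. That is the "can't exactly represent integers" objection in concrete terms.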

I'd always wondered how automated these comments were

Also the subject of one of the most popular questions on StackOverflow: https://stackoverflow.com/q/588004/5987

While it's true that floating point has its limitations, this stuff about not using it for money seems overblown to me. I've worked in finance for many years, and it really doesn't matter that much. There are de minimis clauses in contracts that basically say "forget about the fractions of a cent". Of course it might still trip up your position checking code, but that's easily fixed with a tiny tolerance.

When the fractions actually don't matter... it's so painless to just store everything in pennies rather than dollars (multiply everything by 100).

It’s not painless. E.g. dividing $100.00 by 12 months in integer cents requires eleven payments of $8.33 and one of $8.37 (or better, 4x(2x8.33+8.34), depending on your definition of ‘better’). You can forget this $0.04, but it will jump around in reports until you get rid of it – it requires someone’s attention anyway, no matter how small it is. Otoh, in unrounded floating point that will lead to a mismatch between (integer) payments and calculations. In rounded fp it’s the same problem, except when you’re trying very hard for error bits to accumulate (like cross-multiplying dataset sums with no intermediate rounding, which is nonsense in financial calc and where regular fixpoint integers will overflow anyway).

What I’m trying to show here is that both integers and floating point are not suitable for doing ‘simple’ financial math. But we get used to this Bresenhamming in integers and do not perceive it as solving an error correction problem.
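The "Bresenhamming" of the leftover cents can be sketched in a few lines of Python. This is one possible allocation scheme under the assumption of non-negative integer cents; the function name `spread_cents` is made up here:

```python
def spread_cents(total_cents: int, parts: int) -> list:
    """Split an integer amount into `parts` integer payments that sum exactly.

    Payment i is the difference between consecutive floored running totals,
    which spreads the leftover cents evenly instead of dumping them on one payment.
    """
    return [
        (i + 1) * total_cents // parts - i * total_cents // parts
        for i in range(parts)
    ]


payments = spread_cents(10_000, 12)  # $100.00 over 12 months
print(payments)
print(sum(payments))
```

For $100.00 over 12 months this yields the 8.33, 8.33, 8.34 pattern repeated four times, and the payments always add up to the original total. Which payments carry the extra cent is still a business decision, not a mathematical one.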

This struck home with me when one day a friend and I bought the same thing and he paid a penny more.

I realized something I didn't ever notice or appreciate in 20+ years: oh yeah, they can't just round off the penny in their favour every time. And the code that handles tracking when to charge someone an extra penny must be irritating to have developed and manage. All of a sudden you've got state.

What kind of thing and store was it?

That's one of the worst domain names ever. When the topic comes along, I always remember about "that single-serving website with a domain name that looks like a number" and then take a surprisingly long time searching for it.

I have written a test framework and I am quite familiar with these problems, and comparing floating point numbers is a PITA. I had users complaining that 0.3 is not 0.3.

The code managing these comparisons turned out to be more complex than expected. The idea is that values are represented as ranges, so, for example, the IEEE-754 "0.3" is represented as ]0.299~, 0.300~[ which makes it equal to a true 0.3, because 0.3 is within that range.

> That's one of the worst domain names ever.
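One way such a range comparison might look, as a Python sketch and not the framework's actual code. It assumes Python 3.9+ for `math.nextafter`, ignores edge cases such as infinities and powers of two, and uses invented names (`rounding_interval`, `matches_literal`):

```python
import math
from fractions import Fraction


def rounding_interval(f: float):
    """Open interval of reals that lie closer to f than to either neighbouring double."""
    below = Fraction(math.nextafter(f, -math.inf))
    above = Fraction(math.nextafter(f, math.inf))
    exact = Fraction(f)
    return (below + exact) / 2, (exact + above) / 2


def matches_literal(f: float, literal: str) -> bool:
    """True if the exact decimal `literal` falls inside f's rounding interval."""
    low, high = rounding_interval(f)
    return low < Fraction(literal) < high


print(matches_literal(0.3, "0.3"))        # True
print(matches_literal(0.1 + 0.2, "0.3"))  # False: the sum is the next double up
```

Note that under this definition the computed sum 0.1 + 0.2 still does not match a true 0.3, so a framework would need a wider tolerance if it wants to accept results of arithmetic as well.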

Maybe the creator's theory is that people will search for 0.30000000000000004 when they run into it after running their code.

It may be the worst domain name ever, but the site only exists because I thought that using "0" as a subdomain was a neat trick, and worked back from there to figure out what to do with it.

FWIW - the only way I can ever find my own website is by searching for it in my github repositories. So I definitely agree, it's not a terribly memorable domain.

It's the first result for "floating point site" on Google. Sure the domain itself is impossible to remember, but you don't have to remember the actual number, just what it stands for.

Remember the filter bubble. My first result is not your first result. (although in this case it happens to be, but we both probably search a lot on programming)

Also did it in an InPrivate window to confirm, which is still somewhat targeted but far less so than on my actual account. It's still first.

And, at the end of the day, even if there's a filter bubble and it's the reason I see it first, then so what? The people looking for this site are likely going to fit into the same set of targeted demographics as you and me and most people on this site. So unless you also want to cater to 65-year old retirees that don't care about computer science and what floating numbers are, then why does the filter bubble even matter?

> My first result is not your first result.

It would be if you both used DuckDuckGo, though :)

> That's one of the worst domain names ever. When the topic comes along, I always remember about "that single-serving website with a domain name that looks like a number" and then take a surprisingly long time searching for it.

That's why we need regular expressions support in every search box, browser history, bookmarks and Google included.

just add 0.1 and 0.2 in fp32 (?) accuracy if you can't remember the name :)

This is the double-precision IEEE sum. A single-precision result would have (slightly less than) half as many digits.

This is a good thing to be aware of.

Also the "field" of floating point numbers is not commutative†, (can run on JS console:)

x=0;for (let i=0; i<10000; i++) { x+=0.0000000000000000001; }; x+=1

--> 1.000000000000001

x=1;for (let i=0; i<10000; i++) { x+=0.0000000000000000001; };

--> 1

Although most of the time a+b===b+a can be relied on. And for most of the stuff we do on the web it's fine!††

† edit: Please s/commutative/associative/, thanks for the comments below.

†† edit: that's wrong! Replace with (a+b)+c === a+(b+c)

Note that the addition is commutative [1], i.e. a+b==b+a always.

What is failing is associativity, i.e. (a+b)+c==a+(b+c)

For example

(.0000000000000001 + .0000000000000001 ) + 1.0

--> 1.0000000000000002

.0000000000000001 + (.0000000000000001 + 1.0)

--> 1.0

In your example, you are mixing both properties,

(.0000000000000001 + .0000000000000001) + 1.0

--> 1.0000000000000002

(1.0 + .0000000000000001) + .0000000000000001

--> 1.0

but the difference is caused by the lack of associativity, not by the lack of commutativity.

[1] Perhaps you must exclude -0.0. I think it is commutative even with -0.0, but I'm never 100% sure.
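The same distinction in Python, which uses the same IEEE 754 doubles as the JS console. The commutativity part is only a spot check over a handful of hand-picked values (including signed zeros), not a proof:

```python
import math

tiny = 1e-16

left = (tiny + tiny) + 1.0   # the two tiny values combine before meeting 1.0
right = tiny + (tiny + 1.0)  # each tiny value is rounded away separately

print(left, right)

# Commutativity spot check; signed zeros are compared by sign as well as value.
samples = [0.0, -0.0, 1.0, tiny, 0.1, 0.2, -3.5, 1e308]
commutative = all(
    a + b == b + a and math.copysign(1.0, a + b) == math.copysign(1.0, b + a)
    for a in samples
    for b in samples
)
print(commutative)
```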

I tried to determine how to perform IEEE 754 addition (in order to see whether it's commutative) by reading the standard: https://sci-hub.tw/10.1109/IEEESTD.2019.8766229

(Well, it's a big document. I searched for the string "addition", which occurs just 41 times.)

I failed, but I believe I can show that the standard requires addition to be commutative in all cases:

1. "Clause 5 of this standard specifies the result of a single arithmetic operation." (§10.1)

2. "All conforming implementations of this standard shall provide the operations listed in this clause for all supported arithmetic formats, except as stated below. Unless otherwise specified, each of the computational operations specified by this standard that returns a numeric result shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded that intermediate result, if necessary, to fit in the destination’s format" (§5.1)

Obviously, addition of real numbers is commutative, so the intermediate result produced for addition(a,b) must be equal to that produced for addition(b,a). I hope, but cannot guarantee, that the rounding applied to that intermediate result would not then depend on the order of operands provided to the addition operator.

3. "The operation addition(x, y) computes x+y. The preferred exponent is min(Q(x), Q(y))." (§5.4.1). This is the entire definition of addition, as far as I could find. (It's also defined, just above this statement, as being a general-computational operation. According to §5.1, a general-computational operation is one which produces floating-point or integer results, rounds all results according to §4, and might signal floating-point exceptions according to §7.)

4. The standard encourages programming language implementations to treat IEEE 754 addition as commutative (§10.4):

> A language implementation preserves the literal meaning of the source code by, for example:

> - Applying the properties of real numbers to floating-point expressions only when they preserve numerical results and flags raised:

> -- Applying the commutative law only to operations, such as addition and multiplication, for which neither the numerical values of the results, nor the representations of the results, depend on the order of the operands.

> -- Applying the associative or distributive laws only when they preserve numerical results and flags raised.

> -- Applying the identity laws (0 + x and 1 × x) only when they preserve numerical results and flags raised.

This looks like a guarantee that, in IEEE 754 addition, "the representation of the result" (i.e. the sign/exponent/significand triple, or a special infinite or NaN value - §3.2) does not "depend on the order of the operands". §3.2 specifically allows an implementation to map multiple bitstrings ("encodings") to a single "representation", so it's possible that the bit pattern of the result of an addition may differ depending on the order of the addends.

5. "Except for the quantize operation, the value of a floating-point result (and hence its cohort) is determined by the operation and the operands’ values; it is never dependent on the representation or encoding of an operand."

"The selection of a particular representation for a floating-point result is dependent on the operands’ representations, as described below, but is not affected by their encoding." (both from §5.2)


6. §6, dealing with infinite and NaN values, implicitly contemplates that there might be a distinction between addition(a,b) and addition(b,a):

> Operations on infinite operands are usually exact and therefore signal no exceptions, including, among others,

> - addition(∞, x), addition(x, ∞), subtraction(∞, x), or subtraction(x, ∞), for finite x (§6.1)

> Also the "field" of floating point numbers is not commutative, (can run on JS console:)


    >> x = 0;
    >> for (let i=0; i<10000; i++) { x+=0.0000000000000000001; };
    >> x + 1
    >> 1 + x

You've identified a problem, but it isn't that addition is noncommutative.

Yeah, what is demonstrated here is that floating point addition is nonassociative.

Your example shows that floating-point addition isn't associative, not that it isn't commutative.

Isn't that more of an associativity problem than a commutativity problem, though?

1.0 + 1e-16 == 1e-16 + 1.0 == 1.0 as well as 1.0 + 1e-15 == 1e-15 + 1.0 == 1.000000000000001

however (1.0 + (1e-16 + 1e-16)) == 1.0 + 2e-16 == 1.0000000000000002, whereas ((1.0 + 1e-16) + 1e-16) == 1.0 + 1e-16 == 1.0

Yep. The TL;DR of a numerical analysis class I took is that if you're going to sum a list of floats, sort it by increasing magnitude first so that the tiny values aren't rounded away every time.

Really? It wasn't to use Kahan summation?


Hah! Well, yeah, that too. But if there's a gun to your head, sorting the list before adding will get you most of the way there with the least amount of work.
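A small Python comparison of the approaches mentioned here: plain left-to-right summation, sorting first, and Kahan summation, with `math.fsum` as a correctly rounded reference. The test data is an assumption chosen to make the effect obvious, and the naive version is written as an explicit loop so that no library-level compensation can sneak in:

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan (compensated) summation."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - compensation
        t = total + y
        compensation = (t - total) - y  # what was rounded away in total + y
        total = t
    return total


values = [1.0] + [1e-16] * 10_000

print(naive_sum(values))          # tiny terms vanish one at a time
print(naive_sum(sorted(values)))  # tiny terms accumulate before meeting 1.0
print(kahan_sum(values))
print(math.fsum(values))          # correctly rounded reference
```

Sorting helps here because all the small terms are of similar size. With mixed signs or widely varying magnitudes, compensated summation is the more robust choice.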

I feel like it should really be emphasised that the reason this occurs is due to a mismatch between binary exponentiation and decimal exponentiation.

0.1 = 1 × 10^-1, but there is no integer significand s and integer exponent e such that 0.1 = s × 2^e.

When this issue comes up, people seem to often talk about fixing it by using decimal floats or fixed-point numbers (using some 10^x divisor). If you change the base, you solve the problem of representing 0.1, but whatever base you choose, you're going to have unrepresentable rationals. Base 2 fails to represent 1/10 just as base 10 fails to represent 1/3. All you're doing by using something based around the number 10 is supporting numbers that we expect to be able to write on paper, not solving some fundamental issue of number representation.

Also, binary-coded decimal is irrelevant. The thing you're wanting to change is which base is used, not how any integers are represented in memory.
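Both halves of this argument can be checked with Python's standard library: `fractions.Fraction` exposes the exact value a double holds, and `decimal.Decimal` is a base-10 floating point type (28 significant digits in its default context):

```python
from decimal import Decimal
from fractions import Fraction

stored = Fraction(0.1)  # exact value of the double nearest to 0.1
print(stored)
print(stored.denominator == 2 ** 55)  # the denominator is a power of two
print(stored == Fraction(1, 10))      # False: 1/10 is not of the form s × 2^e

third = Decimal(1) / Decimal(3)       # base 10 has the same trouble with 1/3
print(third * 3 == 1)                 # False
```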

Agree. All of these floating point quirks are not actually problems if you think of them as being finite precision approximations to real numbers, not in any particular base. Just like physical measurements of continuous quantities. You wouldn't be surprised to find an error in the 15th significant figure of some measurement or attempt to compare them for equality or whatever. So don't do it with floating point numbers either and everything will work perfectly.

Yes, there are some exceptions where you can reliably compare equality or get exact decimal values or whatever, but those are kind of hacks that you can only take advantage of by breaking the abstraction.

If you only use decimals in your application, it actually is a fix because you can store the numbers you care about in exact precision. Of course it's not really a fix if you're being pedantic but for a lot of simple UI stuff it's good enough.

One small tip about printf for floating point numbers. In addition to "%f", you can also print them using "%g". While the precision specifier in %f refers to digits after the decimal period, in %g the precision refers to the number of significant digits. The %g version is also allowed to use exponential notation, which often results in more pleasant-looking output than %f.

   printf("%.4g", 1.125e10) --> 1.125e+10
   printf("%.4f", 1.125e10) --> 11250000000.0000

And %e always uses exponential notation. Then there's %a, which can be exact for binary floats.
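Python's `%` string formatting follows the same conversion rules as C's printf for these specifiers, so the difference is easy to try out. `float.hex()` is used here as a stand-in for `%a`, which Python's `%` operator does not offer:

```python
value = 1.125e10

print("%.4g" % value)  # 4 significant digits, exponent form chosen automatically
print("%.4f" % value)  # 4 digits after the decimal point
print("%e" % value)    # always exponent form
print((0.1).hex())     # exact binary value, the counterpart of C's %a
```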

One of my favorite things about Perl 6 is that decimal-looking literals are stored as rationals. If you actually want a float, you have to use scientific notation.

Edit: Oh wait, it's listed in the main article under Raku. Forgot about the name change.

That’s only formatting.

The other (and more important) matter, — that is not even mentioned, — is comparison. E. g. in “rational by default in this specific case” languages (Perl 6),

  > 0.1+0.2==0.3

Or, APL (now they are floats there! But comparison is special)

      ⎕PP←20 ⋄ 0.1+0.2
      (0.1+0.2) ≡ 0.3

Please note that Perl 6 has been renamed to "Raku" (https://raku.org using #rakulang as a tag for social media).

In Raku, the comparison operator is basically a subroutine that uses multiple dispatch to select the correct candidate for handling comparisons between Rat's and other numeric objects.

Exactly what are the rules for the "special comparison" in APL? That sounds horrifying to me.

Treat the values as equal if their relative difference is no greater than a small predefined value (called “⎕ct”, comparison tolerance, and you can change it).

but this is not an equivalence relation. You may have a=b and b=c but a!=c

it's horrifying!
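The lack of transitivity is easy to reproduce. This Python sketch imitates a relative-tolerance comparison; the function name `tolerant_eq` and the tolerance of 1e-14 are assumptions for illustration, not a faithful copy of any APL implementation:

```python
def tolerant_eq(a: float, b: float, ct: float = 1e-14) -> bool:
    """Equal if the difference is within ct relative to the larger magnitude."""
    return abs(a - b) <= ct * max(abs(a), abs(b))


a = 1.0
b = 1.0 + 0.8e-14
c = 1.0 + 1.6e-14

print(tolerant_eq(a, b), tolerant_eq(b, c), tolerant_eq(a, c))
```

Here a matches b and b matches c, but a does not match c, so the relation cannot be used as a key for sets or dictionaries without surprises.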

The runner up for length is FORTRAN with: 0.300000000000000000000000000000000039

And the length (but not value) winner is GO with: 0.299999999999999988897769753748434595763683319091796875

Those look like the same length

Huh? The fortran one is 38 characters long with 33 0s after the 3. The go one is 56 characters long with 15 9s after the 2.

I’m on mobile. Must be the issue.

> It's actually pretty simple

The explanation then goes on to be very complex. e.g. "it can only express fractions that use a prime factor of the base".

Please don't say things like this when explaining things to people, it makes them feel stupid if it doesn't click with the first explanation.

I suggest instead "It's actually rather interesting".

For including words in a sales pitch, I'd agree.

But this isn't a sales pitch. Some people are just bad at things. The explanation on that page requires grade school levels of math. I think math that's taught in grade school can be objectively called simple. Some people suck at math. That's ok.

I'm very geeky. I get geeky things. Many times geeky things can be very simple to me.

I went to a dance lesson. I'm terribly uncoordinated physically. They taught me a very 'simple' dance step. The class got it right away. The more physically able got it in 3 minutes. It took me a long time to get, having to repeat the beginner class many times.

Instead of being self absorbed and expect the rest of the world to anticipate every one of my possible ego-dystonic sensibilities, I simply accepted I'm not good at that. It makes it easier for me and for the rest of the world.

The reality is, just like the explanation and the dance step, they are simple because they are relatively simple for the field.

I think such over-sensitivity is based on a combination of expecting never to encounter ego-dystonic events/words, which is unrealistic and removes many/most growth opportunities in life, and the idea that things we don't know can be simple (basically, reality is complicated). I think we've gotten so used to catering to the lowest common denominator, we've forgotten that it's ok for people to feel stupid/ugly/silly/embarrassed/etc. Those bad feelings are normal, feeling them is ok, and they should help guide us in life, not be something to run from or get upset if someone didn't anticipate your ego-dystonic reaction to objectively correct usage of words.

When faced with criticism about your lack of inclusivity, what's to gain by doubling down in order to intentionally exclude people? The argument you are presenting always feels disingenuous because you imply that there is something lost in the efforts to be more inclusive.

The idea that you care about the growth of people you are actively excluding doesn't make a whole lot of sense. In this example we're talking about word choice. The over-sensitivity from my point of view is in the person who takes offense that someone criticized their language and refuses to adapt out of some feigned interest for the disadvantaged party. The parent succinctly critiqued the word choice of the author and offered an alternative that doesn't detract from the message in the slightest.

The lowest common denominator is the person who throws their arms up when offered valid criticism.

> because you imply that there is something lost in the efforts to be more inclusive

Yes there is something lost. I included it in my post but I'll repeat it: People who aren't good at math are 'shielded from the truth' (they objectively suck at math because they can't grasp something that is objectively simple in the domain of math). Again, feeling bad about not grasping something simple is the necessary element for a humbling experience. Humbling experiences aren't meant to feel great. For me, I've learned the most with humbling experiences. I honestly believe most people in the first world need more of them.

The suggested language is more inclusive, that's an advantage to sales, but less clear, that's a disadvantage to communication/learning. Personally, I like learning and want to see things optimized for that.

BTW; I loved the sly way of you implying that: A- I took offense (I am not, nor do I see anything in my comment that says I'm offended) and B- That I'm the lowest common denominator because of A. It's a subtle way of attacking me and not my point. It says a lot about both the person doing the attack and the strength of their argument that they have to resort to ad-hominems. Though I will credit you with using a smartly disguised one.

Also, you are speaking to me as if I was the website author. I'm not the OP of the article, which if you read TFA you would see the actual author changed in favor of the suggestion.

The problem is that almost everything is simple once you understand it. Once you understand something, you think it's pretty simple to explain it.

On the other hand, people say "it's actually pretty simple" to encourage someone to listen to the explanation rather than to give up before they even heard anything, as we often do.

I understand prime factors just fine, but I'd never think it's "simple" to bring them up when I'm explaining how decimal points work.

Thanks for this.

Yep, I've thrown 10,000 round house kicks and can teach you to do one. It's so easy.

In reality, it will be super awkward, possibly hurt, and you'll fall on your ass one or more times trying to do it.

Ditto as I now feel stupid.

I read the rest of your reply but I also haven’t let go of the possibility that both of us (or precisely 100.000000001% of us collectively) are as thick as a stump.

To be fair, this is also done in every other STEM field, and CS is no exception.

We could all learn a lot more from each other if everything wasn't a contest all the time.

I learned it in college, but immediately forgot it after the exam. Why? It wasn't simple.

I had to use google translate for this one, because I didn't suspect the translation to my language to be so literal.

My take is that this sentence is badly worded. How do these fractions specifically use those prime factors?

Apparently the idea is that a fraction 1/N, where N is a prime factor of the base, has a finite (terminating) expansion in that base.

So for base 10, at least 1/2 and 1/5 have to terminate.

And given that a product of terminating numbers also terminates, no matter what combination of those two you multiply, you'll get a number with a finite expansion in base 10, so 1/2 * 1/2 = 1/4 terminates, (1/2)^3 = 1/8 terminates etc.

Same thing goes for the sum of course.

So apparently those fractions use those prime factors by being a product of their reciprocals, which isn't mentioned here but should have been.
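The prime-factor rule can be checked mechanically: a reduced fraction has a finite expansion in a base exactly when every prime factor of its denominator also divides the base. A short Python sketch, with `terminates` as an invented helper name that expects the denominator of an already reduced fraction:

```python
from math import gcd


def terminates(denominator: int, base: int) -> bool:
    """True if 1/denominator has a finite expansion in the given base."""
    g = gcd(denominator, base)
    while g > 1:
        # Strip every factor the denominator shares with the base.
        while denominator % g == 0:
            denominator //= g
        g = gcd(denominator, base)
    return denominator == 1  # anything left over is a prime the base lacks


print(terminates(8, 10))   # True: 1/8 = 0.125
print(terminates(3, 10))   # False: 1/3 repeats in decimal
print(terminates(10, 2))   # False: 1/10 repeats in binary
```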

Thanks for the suggestion. I've updated the text.

Awesome, thanks!

>Why does this happen? It's actually rather interesting.

Did the text change in the last 15 minutes?

It's nice but it's extremely overkill for understanding this particular problem.

there are only two kind of problems, trivial problems and those that you don't know how to solve (yet).

PostgreSQL figured this out many years ago with their Decimal/Numeric type. It can handle any size number and it performs fractional arithmetic perfectly accurately - how amazing for the 21st Century! It is comically tragic to me that all of the mainstream programming languages are still so far behind, so primitive that they do not have a native accurate number type that can handle fractions.

> how amazing for the 21st Century!

Most languages have classes for that, some had them for decades in fact. Hardware floating point numbers target performance and most likely beat any of those classes by orders of magnitude.

I still remember when I encountered this and nobody else in the office knew about it either. We speculated about broken CPUs and compilers until somebody found a newsgroup post that explained everything. Makes me wonder why we haven't switched to a better floating point model in the last decades. It will probably be slower but a lot of problems could be avoided.

Unless you have a floating point model that supports arbitrary bases, you're always going to have the issue. Binary floats are unable to represent 1/10 just as decimal floats are unable to represent 1/3.

And in case anyone's wondering about handling it by representing the repeating digits instead, here's the decimal representation of 1/12345 using repeating digits:


Nice example. For those who do not understand why it is so long, a denominator multiplied by a period must be all-nines. E.g. 1/7 = 0.(142857), because 142857x7 = 999999, so that 0.(142857)x7 = 0.(999999) = 1 back again. For some simple numbers N their nearest 999...999/N integer counterpart is enormously huge.
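The length of the repeating block can be computed without writing out any digits: it is the smallest k for which 10^k - 1 is divisible by the denominator, once the factors shared with the base have been removed. A Python sketch with an invented helper name, `period_length`:

```python
from math import gcd


def period_length(n: int, base: int = 10) -> int:
    """Length of the repeating block of 1/n in the given base (0 if it terminates)."""
    g = gcd(n, base)
    while g > 1:  # factors shared with the base only affect the non-repeating prefix
        n //= g
        g = gcd(n, base)
    if n == 1:
        return 0
    k, remainder = 1, base % n
    while remainder != 1:  # smallest k with base**k % n == 1
        remainder = (remainder * base) % n
        k += 1
    return k


print(period_length(7))       # 6, matching 0.(142857)
print(142857 * 7)             # 999999
print(period_length(12345))   # length of the block for the 1/12345 example
```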

> Unless you have a floating point model that supports arbitrary bases

See also binary coded decimals.


That's not a floating point.

From the article:

> Programmable calculators manufactured by Texas Instruments, Hewlett-Packard, and others typically employ a floating-point BCD format, typically with two or three digits for the (decimal) exponent.

Then that's how they're encoding the components of the float. BCD itself is not a floating-point, it's just a different way of encoding a fixed-point or integer. If all you want to do is use floating point but expand the logarithm and mantissa then that's completely tangential to whether or not they're stored as BCD or regular binary values.

> Binary floats are unable to represent 1/10 just as decimal floats are unable to represent 1/3.

That is true, but most humans in this world expect 0.1 to be represented exactly but would not require 1/3 to be represented exactly. Because they are used to the quirks of the decimal point (and not of the binary point).

This is a social problem, not a technical one.

Decimal floating point has been standardized since 2008:


But it's still not much used. E.g. for C++ it was proposed in 2012 for the first time


then revised in 2014:


...and... silence?


It's not important to most people, because decimal floating point only helps if your UI precision is exactly the same as your internal precision, which almost never happens.

Seeing the occasional 0.30000000000000004 is a good reminder that your 0.3858372895939229 isn't accurate either.

> It's not important to most people

One can argue that nothing is important to most people.

The correct calculations involving money, up to the last cent, are in fact important for people who do them or who are supposed to use them. I've implemented them in the software I've made to perform some financial stuff as far back as the eighties, in spite of all the software which used binary floating point routines. And, of course, from the computer manufacturers, at least IBM cares too:


Apparently, there are even processors which support these formats in hardware. It's just still not mainstream.

There is no "better floating point model" because floating point will always be floating point. Fixed point always has been and always will be an option if you don't like the exponential notation.

> Fixed point always has been and always will be an option

Not really. It would be really cool if fixed point number storage were an option... but I'm not aware of any popular language that provides it as a built-in primitive along with int and float, just as easy to use and choose as floats themselves.

Yes probably every language has libraries somewhere that let you do it where you have to learn a lot of function call names.

But it would be pretty cool to have a language with it built-in, e.g. for base-10 seven digits followed by two decimals:

  fixed(7,2) i;
  i = 395.25;
  i += 0.01;

And obviously supporting any desired base between 2 and 16. Someone please let me know if there is such primitive-level support in any mainstream language!
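For comparison, here is roughly what the library route looks like: a toy base-10 fixed-point type in Python that stores an integer count of hundredths. The class name `Fixed` and its interface are hypothetical, it only supports addition of values with the same number of places, and it truncates any extra input digits:

```python
from decimal import Decimal


class Fixed:
    """Toy base-10 fixed-point number stored as a scaled integer."""

    def __init__(self, text: str = "0", places: int = 2):
        self.places = places
        # Parse via Decimal so "395.25" becomes exactly 39525 units.
        self.units = int(Decimal(text) * 10 ** places)

    def __add__(self, other: "Fixed") -> "Fixed":
        result = Fixed(places=self.places)  # assumes other.places == self.places
        result.units = self.units + other.units
        return result

    def __str__(self) -> str:
        sign = "-" if self.units < 0 else ""
        whole, frac = divmod(abs(self.units), 10 ** self.places)
        return f"{sign}{whole}.{frac:0{self.places}d}"


i = Fixed("395.25")
i = i + Fixed("0.01")
print(i)
```

All arithmetic happens on integers, so 0.1 + 0.2 is exactly 0.3 here. What the sketch lacks is exactly what the comment asks for: literal syntax and checking of the declared width by the language itself.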

COBOL was created to serve the interests of the financial industry, therefore COBOL has fixed point as a first class data type.

Every programming language that has come since has been designed to be a general purpose programming language, therefore they don't include fixed point as a first class data type.

Therefore the financial industry continues to use COBOL.

Every time someone tries to rewrite some crusty COBOL thing in the language du jour, they'll inevitably fuck up the rounding somewhere. The financial industry has complicated rounding rules. Or better yet, the reference implementation is buggy and the new version is right, but since the answers are different it's not accepted.

You don't need special support from the language, fixed-point arithmetic is effectively the same as integer.

DOOM's implementation of fixed-point is a good example which should work on any platform as long as sizeof(int) >= 4.

https://github.com/id-Software/DOOM/blob/master/linuxdoom-1.... https://github.com/id-Software/DOOM/blob/master/linuxdoom-1....

Addition and subtraction will work normally. Multiplication also works normally except you need to right-shift by FRAC_BITS afterwards (and probably also cast to a larger integer type beforehand to protect against overflow).

Division is somewhat difficult since integer division is not what you want to do. DOOM's solution was to cast to double, perform the division, and then convert that back to fixed-point by multiplying by the fixed-point representation of 1.0 (FRACUNIT in the DOOM source) before casting back to integer. This seems like cheating since it's using floating-point as an intermediate type, but it is safe to do because 64-bit floating point can represent all 32-bit integers. As long as you're on a platform with an FPU it's probably also faster than trying to roll your own division implementation.
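The operations described above, sketched in Python for a 16.16 binary fixed-point format. This is an illustration of the technique, not a transcription of the DOOM source; the names `FRAC_BITS`, `FRAC_UNIT` and the helper functions are chosen for this example, and Python's unbounded integers mean the overflow concerns of C do not show up here:

```python
FRAC_BITS = 16
FRAC_UNIT = 1 << FRAC_BITS  # the fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to 16.16 fixed-point (rounded to the nearest unit)."""
    return int(round(x * FRAC_UNIT))


def to_float(a: int) -> float:
    return a / FRAC_UNIT


def fixed_mul(a: int, b: int) -> int:
    # The raw product carries 32 fractional bits; shift back down to 16.
    return (a * b) >> FRAC_BITS


def fixed_div(a: int, b: int) -> int:
    # Divide in floating point, then rescale by 1.0 in fixed-point.
    return int(a / b * FRAC_UNIT)


print(to_float(fixed_mul(to_fixed(1.5), to_fixed(2.25))))
print(to_float(fixed_div(to_fixed(7.0), to_fixed(2.0))))
print(to_float(to_fixed(1.5) + to_fixed(2.25)))  # addition is plain integer addition
```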

Writing a fixed point class in C++ is pretty trivial [1]. It would have semantics and performance comparable to integer.

Fixed point classes have been proposed many times as addition to the standard library, but have yet to be voted in.

[1] see my comment elsethread about trivial problems.

Floating point is fundamentally a trade off between enumerable numbers (precision) and range between minimum/maximum numbers, it exists because fast operations on numbers are not possible with arbitrary precision constructs (you can easily have CPU/GPU operations where floating point numbers fit in registers, arbitrary precision by its very nature is arbitrarily large).

With many operations this trade off makes sense, however it's critical to understand the limitations of the model.

> Makes me wonder why we haven't switched to a better floating point model in the last decades. It will probably be slower but a lot of problems could be avoided.

Pretty much all languages have some sort of decimal number. Few or none have made it the default because they're ignominiously slower than binary floating-point. To the extent that even languages which have made arbitrary precision integers their default firmly keep to binary floating-point.

> Few or none have made it the default because they're ignominiously slower than binary floating-point.

You can strike the "none". Perl 6 uses rationals (Rat) by default, someone once told me Haskell does the same, and Groovy uses BigDecimal.

Many languages have types for infinite-precision rational numbers, for example Rational in Haskell.

Wait, an entire office (presumably full of programmers) didn’t understand floating point representation? What office was this? Isn’t this topic covered first in every programming book or course where floating point math is covered?

This was in the 90s. We were all self taught and I had never met anyone who had formal CS education.

> Makes me wonder why we haven't switched to a better floating point model in the last decades.

The opposite.

Decimal floating points have been available in COBOL from the 1960s, but seem to have fallen out of favor in recent days. This might be a reason why bankers / financial data remains on ancient COBOL systems.

Fun fact: PowerPC systems still support decimal-floats natively (even the most recent POWER9). I presume IBM is selling many systems that natively need that decimal-float functionality.

Decimal floats are a lot older than COBOL. Many early relay computers (to the extent there were many such machines) used floating-point numbers with bi-quinary digits in the mantissa. https://en.wikipedia.org/wiki/Bi-quinary_coded_decimal

Being a lot slower is a worse problem than being off by an error of 2^60. And if it isn't, then you simply choose a different numeric type.

In JavaScript, you could use a library like decimal.js. For simple situations, could you not just convert the final result to a precision of 15 or less?

  > 0.1 + 0.2;
  < 0.30000000000000004

  > (0.1 + 0.2).toPrecision(15);
  < "0.300000000000000"

From Wikipedia: "If a decimal string with at most 15 significant digits is converted to IEEE 754 double-precision representation, and then converted back to a decimal string with the same number of digits, the final result should match the original string." --- https://en.wikipedia.org/wiki/Double-precision_floating-poin...
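
The same 15-significant-digit trick can be sketched in Python (used here only for illustration; the variable names are made up). It assumes the error is small enough to disappear at 15 digits, which holds for a single addition but not necessarily after many accumulated operations.

```python
# One addition in binary floating point.
total = 0.1 + 0.2
print(repr(total))                 # 0.30000000000000004

# Format to 15 significant digits; the 'g' format drops trailing zeros.
rounded = format(total, ".15g")
print(rounded)                     # 0.3

# Round trip of a 15-significant-digit decimal string through a double.
original = "0.123456789012345"
round_trip = format(float(original), ".15g")
print(round_trip == original)      # True
```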

That is why I only used base 2310 for my floating point numbers :-). FWIW there are some really interesting decimal format floating point libraries out there (see http://speleotrove.com/decimal/ and https://github.com/MARTIMM/Decimal) and the early computers had decimal as a native type (https://en.wikipedia.org/wiki/Decimal_computer#Early_compute...)

The product of the first 5 primes ;)

This is part of the reason Swift Numerics is helping to make it much nicer to do numerical computing in Swift.


Swift also has Decimal (so does Objective-C), which handles this properly. See https://lists.swift.org/pipermail/swift-users/Week-of-Mon-20... to see how Swift's implementation of Decimal differs from Objective-C's.

What is the number representation in Swift? Looking at your link it seems to be plain IEEE floats. In that case, wouldn't it have the same behavior?

This is a great shibboleth for identifying mature programmers who understand the complexity of computers, vs arrogant people who wonder aloud how systems developers and language designers could get such a "simple" thing wrong.

" vs arrogant people who wonder aloud how systems developers and language designers could get such a "simple" thing wrong."

I never heard anyone complain that it would be simple to fix. But complaining? Yes, and rightfully so. Not every web programmer needs to know the hardware details, or wants to, so it is understandable that this causes irritation.

Interesting, I searched for "1.2-1.0" on google. The calculator comes up and it briefly flashes 0.19999999999999996 (and no calculator buttons) before changing to 0.2. This happens inconsistently on reload.

SWI-Prolog (listed in the article) also supports rationals:

  ?- A is rationalize(0.1 + 0.2), format('~50f~n', [A]).
  A = 3 rdiv 10.

This specific issue nearly drove me insane trying to debug a SQL -> C++/Scala/OCaml transpiler years ago. We were using the TPC-H benchmark as part of our test suite, and (unbeknownst to me), the validation parameters for one of the queries (Q6) triggered this behavior (0.6+0.1 != 0.7), but only in the C/Scala targets. OCaml (around which we had built most of our debugging infrastructure) handled the math correctly...

Fun times.

When did RFC1035 get thrown under the bus? According to it, with respect to domain name labels, "They must start with a letter" (2.3.1).

Long, long ago. 3com.com wanted to exist.

Amazingly, 3.com apparently didn't want to exist.

All-digit host names have been allowed since 1989.


One aspect of host name syntax is hereby changed: the restriction on the first character is relaxed to allow either a letter or a digit. Host software MUST support this more liberal syntax.

Huh. Thanks! I really missed the memo there. I wonder why 1035 doesn’t mention that it is updated-by 1123.

The same document defines `in-addr.arpa` domains that have numeric labels.

The requirement of a leading letter was for backwards compatibility; the RFC mentions it in light of keeping names compatible with email servers and the HOSTS files it was replacing.

Taking a numeric label risks incompatibility with antiquated systems, but I doubt it will affect any modern browser.

Ages ago I guess. 1password doesn't start with a letter either.

I wish high level languages (specifically python) would default to using decimal, and only use a float when cast specifically. From what I understand that would make things slower, but as a higher level language you're already making the trade of running things slower to be easier to understand.

That said, it's one of my favorite trivia gotchas.
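
For reference, opting in to decimal arithmetic in Python today looks roughly like the sketch below, using the standard-library `decimal` module. The values are built from strings on purpose, so no binary float is involved at any point.

```python
from decimal import Decimal

# Build Decimals from strings so the literal is never rounded to binary first.
total = Decimal("0.1") + Decimal("0.2")
print(total)                      # 0.3
print(total == Decimal("0.3"))    # True

# Summing ten dimes: exact in decimal, slightly off in binary floating point.
dec_sum = sum(Decimal("0.1") for _ in range(10))
flt_sum = sum(0.1 for _ in range(10))
print(dec_sum == Decimal("1.0"))  # True
print(flt_sum == 1.0)             # False
```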

Fixed-point calculations seem to be somewhat of a lost art these days.

It used to be widespread because floating point processors were rare and any floating point computation was costly.

That's no longer the case and everyone seems to immediately use floating point arithmetic without being fully aware of the limitations and/or without considering the precision needed.

As soon as I started developing real-life business apps I started dreaming about a POWER machine, which is said to have hardware decimal type support. Java's BigDecimal solves the problem on x86, but it is at least an order of magnitude slower than FPU-accelerated types.

Well, if your decimals are fixed-point decimals, which is the case in finance, decimal calculations are very cheap integer calculations (with simple additional scaling in multiplication/division).
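
A minimal sketch of that idea, assuming two decimal places (integer cents) and non-negative amounts. The helper names `fx_mul`, `fx_div` and `fmt` are made up for this example and do not come from any library.

```python
SCALE = 100  # assumption: two decimal places, so 19.99 is stored as 1999

def fx_mul(a, b):
    # The raw product carries SCALE twice; divide once, rounding half up.
    return (a * b + SCALE // 2) // SCALE

def fx_div(a, b):
    # Pre-scale the dividend so the quotient keeps two decimal places.
    return (a * SCALE + b // 2) // b

def fmt(a):
    # Put the decimal point back only when rendering.
    return f"{a // SCALE}.{a % SCALE:02d}"

print(fmt(10 + 20))             # 0.30  (addition is plain integer addition)
print(fmt(fx_mul(1999, 300)))   # 59.97 (19.99 * 3.00)
print(fmt(fx_div(200, 300)))    # 0.67  (2.00 / 3.00, rounded)
```

Addition and subtraction need no scaling at all; only multiplication and division pay for one extra integer operation.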

I just use Zarith (bignum library) in OCaml for decimal calculation, and am pretty content with the performance.

I don't think many domains need decimal floating point that much, honestly, at least in finance and scientific calculations.

But I could be wrong, and would be interested in cases where decimal floating-point calculations are preferable over those done in decimal fixed-point or IEEE floating-point.

Why doesn't everybody do it this way then? We would probably have a transparent built-in decimal type in every major language by now if there were no problems with this.

> Why doesn't everybody do it this way then?

Why? Fintech uses decimal fixed-point all the way, there are libraries for them for any major language. Apps like GnuCash or ledger use them as well.

But Java has BigDecimal in its standard library and it's soooo slow I doubt it is implemented this way.

In the Go example, can someone explain the difference between the first and the last case?

There's a link right below. It seems like

1. Constants have arbitrary precision.
2. When you assign them, they lose precision (example 2).
3. You can format them with arbitrary precision in a string (example 3).

In that last example, they are getting 54 significant digits in base 10.

Thanks. What I didn’t realize is that although the sum is done precisely, the resulting 0.3 will be represented approximately once converted to float64. In the first case formatting hides that, in the last it doesn’t.

I think in the last example, it's going straight from arbitrary precision to 54 significant digits, bypassing float64 entirely, hence why it looks different from the middle example.

Mods: Can we have a top level menu option called "Floating point explained"?

Not surprisingly Common Lisp gets it right. I don’t mean this as snark (I don’t mean to imply you are a weenie if you don’t use lisp) but just to show that it picked a different kind of region in the language design domain.

Computer languages should default to fixed precision decimals and offer floats with special syntax (eg “0.1f32”).

The status quo is that even Excel defaults to floats and wrong calculations with dollars and cents are widespread.

The thing that surprised me the most (because I never learned any of this in school) was not just the lack of precision to represent some numbers, but that precision falls off a cliff for very large numbers.
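
A small Python sketch of that cliff for 64-bit doubles (it assumes Python 3.9 or later for `math.ulp`). A double has a 53-bit significand, so above 2^53 the gap between neighbouring values is larger than 1.

```python
import math

big = 2.0 ** 53

# Adding 1 no longer changes the value: the exact result falls halfway
# between two representable doubles and rounds back to 2**53.
print(big + 1.0 == big)        # True

# The spacing between adjacent doubles grows with magnitude.
print(math.ulp(1.0) == 2.0 ** -52)  # True
print(math.ulp(1e16))               # 2.0
```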

TL;DR - 0.1 in base 2 (binary) is the equivalent of 1/3 in base 10, meaning it’s a repeating fraction that has to be cut off and rounded (like 0.333333 repeating)

This is why you should never do “does X == 0.1” because it might not evaluate accurately
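
A short Python sketch of both points, using only the standard library. The hex output shows the repeating bit pattern being cut off, and `math.isclose` is one common alternative to `==` (its default relative tolerance is 1e-9, which is an arbitrary but usually reasonable choice).

```python
import math
from decimal import Decimal

# Decimal(float) converts the stored double exactly, exposing the rounding.
exact = Decimal(0.1)
print(exact > Decimal("0.1"))     # True: the stored value is slightly above 0.1

# In hex the repeating 1001 pattern (digit 9) is visible, rounded up at the end.
print((0.1).hex())                # 0x1.999999999999ap-4

total = 0.1 + 0.2
print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True
```

Comparisons against values near zero need an explicit `abs_tol`, since a relative tolerance alone shrinks to nothing there.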

Whoo go Ada, one of the few to get it right. Must be the go-to for secure programming for a reason.

Take that Rust and C ; )

Happy to see ColdFusion doing it right. Also, good for Julia for having the support for fractions.

I love how Awk, bc, and dc all DTRT. I wonder what postscript(/Ghostscript?) does.

For Smalltalk, the list is not complete: it has scaled decimals and fractions too: 0.1s + 0.2s = 0.3s . (1/10) + (2/10) = (3/10)

Those Babylonians were ahead of their time.

Use Int types for programming logic.

Except when, you know, you can’t.

Curious, when can't you?

My mental model of floating-point types is that they are useful for scientific/numeric computations where values are sampled from a probability distribution and there is inherently noise, and not really useful for discrete/exact logic.

Right; for the former integer arithmetic won't do.

Yep, absolutely (and increasingly often people are using 16-bit floats on GPUs to go even faster).

But the person you replied to said programming logic, not programming anything.

Honestly I think if you care about the difference between `<` and `<=`, or if you use `==` ever, it's a red flag that floating-point numbers might be the wrong data type.

Why is D different than the rest?!

bc actually computes this correctly, and returns 0.3 for 0.1 + 0.2

This has been posted here many times before. It even got mocked on n-gate in 2017 http://n-gate.com/hackernews/2017/04/07/

lol what

Please check some of the online papers on Posit numbers and Unum computing, especially by John Gustafson. In general, Unums can represent more numbers, with less rounding, and fewer exceptions than floating points. Many software and hardware vendors are starting to do interesting work with Posits.

Probably one of the more in depth technical discussions of the pros and cons of the various proposals that John Gustafson has made over the years:


IEEE floating-point is disgusting. The non-determinism and illusion of accuracy is just wrong.

I use integer or fixed-point decimal if at all possible. If the algorithm needs floats, I convert it to work with integer or fixed-point decimal instead. (Or if possible, I see the decimal point as a "rendering concern" and just do the math in integers and leave the view to put the decimal by whatever my selected precision is.)

Depends on the field. 99.9000000001% of the time, the stuff I do is entirely insensitive to anything after the third decimal point. And for my use cases, IEEE 754 is a beautiful stroke of genius that handles almost everything I ask from it. That's generally the case for most applications. If it wasn't, it wouldn't be so overwhelmingly universally used.

But again, there are clearly plenty of use cases where it's insufficient, as you can vouch. I still don't think you can call it "disgusting", though.

IEEE is deterministic and (IMO) quite well thought-out. What specifically do you not like about it?

The fact that the most trivial floating-point addition of 0.1 + 0.2 = 0.30000000000000004 was insufficient to make this seem HUMAN-nondeterministic to you? (I mean sure, if you thoroughly understood the entire spec, you might not be surprised by this result, but many people would be! Otherwise the original post and website would not exist, no?)

It’s kind of a hallmark of bad design when you have to go into a long-winded explanation of why even trivial use-case examples have “surprising” results.

⅓ to 3 decimal places is 0.333. 0.333 + 0.333 = 0.666, which is not ⅔ (to 3 decimal places, that is 0.667). That is all that is happening with the 0.1 + 0.2.
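
The three-digit analogy can be run directly with Python's `decimal` module by limiting the context to 3 significant digits. This is only a sketch of the analogy in decimal, not of how binary doubles are implemented.

```python
from decimal import Decimal, Context

# A toy number system that keeps only 3 significant decimal digits.
ctx = Context(prec=3)

third = ctx.divide(Decimal(1), Decimal(3))
two_thirds = ctx.divide(Decimal(2), Decimal(3))
total = ctx.add(third, third)

print(third)                 # 0.333
print(total)                 # 0.666
print(two_thirds)            # 0.667
print(total == two_thirds)   # False: each step rounded, deterministically
```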

The word you're looking for is "surprising," which is a far cry from non-deterministic. IEEE 754 is so thoroughly deterministic that there exist compiler flags whose sole purpose is to say "I don't care that my result is going to be off by a couple of bits from IEEE 754."

You don't need to thoroughly understand the entire spec, nor do you need to know that 0.1 + 0.2 = 0.30000000000000004. "Computers can't really represent floating point numbers exactly" is generally good enough. (Also: you added "human" as a qualifier; you didn't have that before so I responded to your statement as it was written.)

You may dislike IEEE floats for many reasons, but not for being non-deterministic. Their operations are described by completely deterministic rules.

Fixed point is perfectly OK, if all your numbers are within a few orders of magnitude (e.g. money)

I agree with this view, there's nothing more disgusting than non-determinism. The way computers rely on assumptions for the accuracy of a floating number is one that's contrary to the principles of logical thinking.

> The way computers rely on assumptions

The way people rely on assumptions.
