{"id":3839,"date":"2018-02-28T15:50:46","date_gmt":"2018-02-28T15:50:46","guid":{"rendered":"http:\/\/www.blopig.com\/blog\/?p=3839"},"modified":"2018-03-07T15:53:23","modified_gmt":"2018-03-07T15:53:23","slug":"a-short-intro-to-machine-precision-and-how-to-beat-it","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2018\/02\/a-short-intro-to-machine-precision-and-how-to-beat-it\/","title":{"rendered":"A short intro to machine precision and how to beat it"},"content":{"rendered":"<p>Most people who\u2019ve ever sat through a Rounding Error is Bad lecture will be familiar with the following example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">&gt; (0.1+0.1+0.1) == 0.3\r\nFALSE<\/pre>\n<p>The reason this is so unsettling is because most of the time we think about numbers in base-10. This means we use ten digits <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5C%7B0%2C+1%2C+%5Cdots%2C+9%5C%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;{0, 1, &#92;dots, 9&#92;}\" class=\"latex\" \/>, and we perform arithmetic based on this ten digit notation. This doesn\u2019t always matter much for pen and paper maths but it\u2019s an integral part of how we think about more complex operations and in particular how we think about accuracy. We see <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0.1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"0.1\" class=\"latex\" \/> as a finite decimal fraction, and so it\u2019s only natural that we should be able to do accurate sums with it. And if we can do simple arithmetic, then surely computers can too? In this blog post I\u2019m going to try and briefly explain what causes rounding errors such as the one above, and how we might get away with going beyond machine precision.<\/p>\n<p>Take a number <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x+%5Cin+%5B0%3B+1%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x &#92;in [0; 1)\" class=\"latex\" \/>, say <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%3D1%2F3&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x=1\/3\" class=\"latex\" \/>. The decimal representation of <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x\" class=\"latex\" \/> is of the form <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%3D%5Csum_%7Bi%3D1%7D%5E%7B%5Cinfty%7D+a_i+%5Ctimes+10%5E%7B-i%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x=&#92;sum_{i=1}^{&#92;infty} a_i &#92;times 10^{-i}\" class=\"latex\" \/>. The <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=a_i+%5Cin+%5C%7B0%2C+1%2C+%5Cdots%2C+9%5C%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"a_i &#92;in &#92;{0, 1, &#92;dots, 9&#92;}\" class=\"latex\" \/> here are the digits that go after the radix point. In the case of <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%3D1%2F3&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x=1\/3\" class=\"latex\" \/> these are all equal <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=a_i%3D3&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"a_i=3\" class=\"latex\" \/>, or <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%3D0.333%5Cdots+_%7B10%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x=0.333&#92;dots _{10}\" class=\"latex\" \/>. Some numbers, such as our favourite <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x\" class=\"latex\" \/>, don\u2019t have a finite decimal expansion. Others, such as <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0.3&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"0.3\" class=\"latex\" \/>, do, meaning that after some <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i+%5Cin+%5Cmathbb%7BN%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i &#92;in &#92;mathbb{N}\" class=\"latex\" \/>, all <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=a_%7Bi%2Bj%7D%3D0&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"a_{i+j}=0\" class=\"latex\" \/>. When we talk about rounding errors and accuracy, what we actually mean is that we only care about the first few digits, say <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i%5Cleq+5&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i&#92;leq 5\" class=\"latex\" \/>, and we\u2019re happy to approximate to <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%5Capprox+%5Csum_%7Bi%3D1%7D%5E%7B5%7D+a_i+%5Ctimes+10%5E%7B-i%7D%3D0.33333&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x&#92;approx &#92;sum_{i=1}^{5} a_i &#92;times 10^{-i}=0.33333\" class=\"latex\" \/>, potentially rounding up at the last digit.<\/p>\n<p>Computers, on the other hand, store numbers in base-2 rather than base-10, which means that they use a different series expansion <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%3D%5Csum_%7Bi%3D1%7D%5E%7B%5Cinfty%7D+b_i+%5Ctimes+2%5E%7B-i%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x=&#92;sum_{i=1}^{&#92;infty} b_i &#92;times 2^{-i}\" class=\"latex\" \/>, <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=b_i+%5Cin+%5C%7B0%2C+1%5C%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"b_i &#92;in &#92;{0, 1&#92;}\" class=\"latex\" \/> to represent the same number. Our favourite number <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x\" class=\"latex\" \/> is actually stored as <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0.1010101%5Cdots+_%7B2%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"0.1010101&#92;dots _{2}\" class=\"latex\" \/> rather than <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0.3333333%5Cdots+_%7B10%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"0.3333333&#92;dots _{10}\" class=\"latex\" \/>, despite the fact it appears as the latter on a computer screen. Crucially, arithmetic is done in base-2 and, since only a finite number of binary digits are stored (<img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i%5Cleq+52&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i&#92;leq 52\" class=\"latex\" \/> for most purposes these days), rounding errors also occur in base-2.<\/p>\n<p>All numbers with a finite binary expansion, such as <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0.25_%7B10%7D%3D0%5Ctimes+1%2F2%2B1%5Ctimes+1%2F4%3D0.01_%7B2%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"0.25_{10}=0&#92;times 1\/2+1&#92;times 1\/4=0.01_{2}\" class=\"latex\" \/> also have a finite decimal expansion, meaning we can do accurate arithmetic with them in both systems. However, the reverse isn\u2019t true, which is what causes the issue with <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0.1%2B0.1%2B0.1%5Cneq+0.3&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"0.1+0.1+0.1&#92;neq 0.3\" class=\"latex\" \/>. In binary, the nice and tidy <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=0.1_%7B10%7D+%3D+0.00011001100%5Cdots+_%7B2%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"0.1_{10} = 0.00011001100&#92;dots _{2}\" class=\"latex\" \/>. We observe the rounding error because unlike us, the computer is trying to sum over infinite series.<\/p>\n<p>While it\u2019s not possible to do infinite sums with finite resources, there is a way to go beyond machine precision if you wanted to, at least for rational <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%3Dp%2Fq&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x=p\/q\" class=\"latex\" \/>, where <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=p%2C+q+%5Cin+%5Cmathbb%7BN%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"p, q &#92;in &#92;mathbb{N}\" class=\"latex\" \/>. In the example above, the issue comes from dividing by 10 on each side of the (in)equality. Luckily for us, we can avoid doing so. Integer arithmetic is easy in any base, and so<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">&gt; (1+1+1) == 3 \r\nTRUE<\/pre>\n<p>Shocking, I know. On a more serious note, it is possible to write an algorithm which calculates the binary expansion of <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x%3Dp%2Fq&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x=p\/q\" class=\"latex\" \/> using only integer arithmetic. Usually binary expansion happens in this way:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-highlight=\"3,4,8\">set x, maxIter\r\ninitialise b, i=1\r\nwhile x&gt;0 AND i&lt;=maxIter {\r\n if 2*x&gt;=1\r\n    b[i]=1\r\n else\r\n    b[i]=0\r\n x = 2*x-b[i]\r\n i = i+1 \r\n}\r\nreturn b<\/pre>\n<p>Problems arise whenever we try to compute something non-integer (the highlighted lines 3, 4, and 8). However, we can rewrite these using <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=x+%3D+p%2Fq&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"x = p\/q\" class=\"latex\" \/> and\u00a0 shifting the division by <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=q&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"q\" class=\"latex\" \/> to the right-hand side of each inequality \/ assignment operator:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-highlight=\"3, 4, 8\">set p, q, maxIter\r\ninitialise b, i=1 \r\nwhile p&gt;0 AND i&lt;=maxIter { \r\n if 2*p&gt;=q \r\n    b[i]=1 \r\n else \r\n    b[i]=0 \r\n p = 2*p-b[i]*q \r\n i = i+1 \r\n} \r\nreturn b<\/pre>\n<p>Provided we&#8217;re not dealing with monstrously large integers (i.e. as long as we can safely double <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"p\" class=\"latex\" \/>), implementing the above lets us compute <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=p%2Fq&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"p\/q\" class=\"latex\" \/> with arbitrary precision given by <code class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">maxIter<\/code>. So we can beat machine precision for rationals! And the combination of arbitrarily accurate rationals and arbitrarily accurate series approximations (think Riemann zeta function, for example) means we can also get the occasional arbitrarily accurate irrational.<\/p>\n<p>To sum up, rounding errors are annoying, partly because it&#8217;s not always intuitive when and how they happen. As a general rule the best way to avoid them is to make your computer do as little work as possible, and to avoid non-integer calculations whenever you can. But you already knew that, didn&#8217;t you?<\/p>\n<p><em>This post was partially inspired by the undergraduate course on Simulation and Statistical Programming lectured by Prof Julien Berestycki and Prof Robin Evans. It was also inspired by my former maths teacher who used to mark us down for doing more work than necessary even when our solutions were correct. He had a point.<br \/>\n<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Most people who\u2019ve ever sat through a Rounding Error is Bad lecture will be familiar with the following example: &gt; (0.1+0.1+0.1) == 0.3 FALSE The reason this is so unsettling is because most of the time we think about numbers in base-10. This means we use ten digits , and we perform arithmetic based on [&hellip;]<\/p>\n","protected":false},"author":40,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[15],"tags":[],"ppma_author":[529],"class_list":["post-3839","post","type-post","status-publish","format-standard","hentry","category-technical"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":529,"user_id":40,"is_guest":0,"slug":"lyuba","display_name":"Lyuba","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/e10fff614041616d2611ef048154e92d6d6a208f36f8634b41a9abcec271c8c1?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/3839","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/40"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=3839"}],"version-history":[{"count":15,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/3839\/revisions"}],"predecessor-version":[{"id":3862,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/3839\/revisions\/3862"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=3839"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=3839"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=3839"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=3839"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}