Homework6_A

Create one or more simple sequences of numbers which clearly show the problem with the “naive” definition formula of the arithmetic mean, and explore possible ways to fix that.
Provide alternative algorithms to minimize problems with the floating point representation with simple demos with actual numbers.

Problems with floating representation

There are only a finite number of different floats and they are discrete but not equidistant.
The spacing between adjacent numbers jumps by a factor of two at every power of two larger than 0.25.
0.25 is the smallest normalised number in this representation (by modulus), and numbers between -0.25 and 0.25 are denormalised.

    'This Is one of the joke of numbers
    'The result of sum is 0.1 
    Public Function create_list() As List(Of Double)
        Dim possibleInput As New List(Of Double)
        possibleInput.Add(1E-350)
        possibleInput.Add(0.1)
        Return possibleInput
    End Function

    'This Is another one of the joke of numbers
    'The sum will be 0.19999
    Public Function create_list() As List(Of Double)
        Dim possibleInput As New List(Of Double)
        possibleInput.Add(10.2)
        possibleInput.Add(10.0)
        Return possibleInput
    End Function

Problems with naive alghoritm

vb.net

Public Class Form1

    Public CurrentArMean As Double = 0

    'not precision in naive method

    Public Function create_list() As List(Of Double)
        Dim possibleInput As New List(Of Double)
        possibleInput.Add(10000000000.0)
        possibleInput.Add(0.1)
        possibleInput.Add(-10000000000.0)
        possibleInput.Add(-0.0001)
        possibleInput.Add(0.1)
        possibleInput.Add(10000)
        Return possibleInput
    End Function

    'ok method also for naive

    'Public Function create_list() As List(Of Double)
    '    Dim possibleInput As New List(Of Double)
    '    possibleInput.Add(10000000000.0)
    '    possibleInput.Add(-10000000000.0)
    '    possibleInput.Add(0.1)
    '    Return possibleInput
    'End Function

    'This Is the joke of float numbers
    'Public Function create_list() As List(Of Double)
    '    Dim possibleInput As New List(Of Double)
    '    possibleInput.Add(10.2)
    '    possibleInput.Add(10.0)
    '    Return possibleInput
    'End Function

    Public Function do_data()
        Dim result As Double() = {
        naive_alg_mean(create_list),
        online_alg_mean(create_list),
        kahan_sum(create_list)
        }
        Return result
    End Function
    Public Sub print_data(result As Double())
        Me.RtxtB1.AppendText("Online Algo result: " & result(1) & Environment.NewLine & Environment.NewLine)
        Me.RtxtB1.AppendText("Kahan Algo result: " & result(2) & Environment.NewLine & Environment.NewLine)
        Me.RtxtB1.AppendText("Naive Algo result: " & result(0) & Environment.NewLine & Environment.NewLine)
    End Sub

    Private Sub BtnCalc_Click(sender As Object, e As EventArgs) Handles BtnCalc.Click

        Me.RtxtB1.Clear()
        Dim result As Double() = do_data()
        print_data(result)

    End Sub

    Private Function naive_alg_mean(Possibleinput As List(Of Double)) As Double
        Dim sum As Double = 0
        For Each i In Possibleinput
            Try
                sum += i
            Catch ex As OverflowException
                Me.RtxtB1.AppendText("Exception in naive method: " & ex.Message & Environment.NewLine)
            End Try
        Next
        Return sum / Possibleinput.Count
    End Function

    Private Function online_alg_mean(Possibleinput As List(Of Double)) As Double
        Dim count = 0
        For Each i In Possibleinput
            count += 1
            CurrentArMean = CurrentArMean + (i - CurrentArMean) / count
        Next
        Return CurrentArMean
    End Function


    Private Function better_alg_mean(Possibleinput As List(Of Double)) As Double
        Dim currentMean As Double = 0
        Dim c As Double = Possibleinput.Min
        For Each i In Possibleinput
            currentMean += i - c
        Next

        Return (currentMean / Possibleinput.Count) + c
    End Function

    Private Function kahan_sum(Possibleinput As List(Of Double)) As Double
        Dim sum As Double = 0
        Dim c As Double = 0
        For Each i In Possibleinput
            Dim y As Double = i - c
            Dim t As Double = sum + y
            c = (t - sum) - y
            sum = t
        Next
        Return sum / Possibleinput.Count
    End Function
End Class

The input in this case create problems in Naive Algoritm because two of the summands were much larger than the other.

Precision is lost most quickly when the magnitude of the two numbers to be added is most different. With summation, one of the operands is always the sum so far, and in general it is much larger than the numbers being added to it. This effect can be minimized by adding the numbers in order from the smallest to the largest.
(If the two numbers have sufficiently different exponents then some least significant digits will have to be discarded.)

Klein algorithm for sum is better of Kahan sum:

With this method, a correction is maintained along with the current running total, effectively increasing the precision. Each term is first corrected for the error that has accumulated so far. Then the new sum is calculated by adding this corrected term to the running total. Finally, a new correction term is calculated as the difference between the change in the sums and the corrected term.

 Public Function Klein_alg_mean(possibleInput As List(Of Double))
        Dim sum As Double = 0.0
        Dim cs As Double = 0.0
        Dim ccs As Double = 0.0
        Dim c As Double = 0.0
        Dim cc As Double = 0.0
        For Each i In possibleInput
            Dim t As Double = sum + i
            If Math.Abs(sum) >= Math.Abs(i) Then
                c = (sum - t) + i
            Else
                c = (i - t) + sum
            End If
            sum = t
            t = cs + c
            If Math.Abs(cs) >= Math.Abs(c) Then
                cc = (cs - t) + c
            Else
                cc = (c - t) + cs
            End If
            cs = t
            ccs = ccs + cc
        Next i
        Return (sum + cs + ccs) / possibleInput.Count
    End Function

There are also other methods for decrease loss of precision:

  1. sort your set
  2. split in groups of elements whose sum wouldn’t overflow – since they are sorted, this is fast and easy
  3. do the sum in each group – and divide by the group size
  4. do the sum of the group’s sum’s (possibly calling this same algorithm recursively) – be aware that if the groups will not be equally sized, you’ll have to weight them by their size

Moreover double can be divided by a power of 2 without loss of precision. So if your only problem if the absolute size of the sum you could try to pre-scale your numbers before summing them.

We can also use this trick:

mean(X) = sum(X[i] - c)  +  c
          -------------
                N

Such that rescale the value, c could be the min value in the input list.

https://www.volkerschatz.com/science/float.html
https://en.wikipedia.org/wiki/Kahan_summation_algorithm#Further_enhancements
https://www.drdobbs.com/floating-point-summation/184403224
https://www.xspdf.com/help/50835306.html

https://www.soa.org/news-and-publications/newsletters/compact/2014/may/com-2014-iss51/losing-my-precision-tips-for-handling-tricky-floating-point-arithmetic/

Lascia un commento