#r "nuget: Plotly.NET, 4.0.0"
#r "nuget: Plotly.NET.Interactive, 4.0.0"
#r "nuget: FSharp.Stats"
open Plotly.NET
Numerical integration comprises a broad family of algorithms for calculating the numerical value of a definite integral, typically by using values from the funcion range.
See also: https://en.wikipedia.org/wiki/Numerical_integration
The algorithms implemented in FSharp.Stats are:
Left endpoint rule (LeftEndpoint
)
Right endpoint rule (RightEndpoint
)
Midpoint rule (Midpoint
)
Trapezoidal rule (Trapezoidal
)
Simpson's rule (Simpson
)
You can either integrate a function (float -> float
) or observations. When estimating the integral of observations, Mid-values are calculated as the average of two observations
Any function with domain and range of float (float -> float
) can be numerically integrated for an interval $[a,b]$ with $n$ partitions, which will be evenly spaced in the interval (partition length = $\frac{(b-a)}n$)
Use the NumericalIntegration.definiteIntegral
function and pass the desired estimation method together with the integration interval start/endpoints and the amount of partitions(more on those methods in the chapters below).
the expected exact value for the definite integral of $f(x) = x^3$ is 0.25
open FSharp.Stats.Integration
let f (x: float) = x * x * x
// integrate f in the interval [0.,1.] with 100 partitions using the left endpoint method
f |> NumericalIntegration.definiteIntegral(LeftEndpoint, 0., 1., 100)
0.245025
// integrate f in the interval [0.,1.] with 100 partitions using the right endpoint method
f |> NumericalIntegration.definiteIntegral(RightEndpoint, 0., 1., 100)
0.255025
// integrate f in the interval [0.,1.] with 100 partitions using the midpoint method
f |> NumericalIntegration.definiteIntegral(Midpoint, 0., 1., 100)
0.2499875
// integrate f in the interval [0.,1.] with 100 partitions using the trapezoidal method
f |> NumericalIntegration.definiteIntegral(Trapezoidal, 0., 1., 100)
0.250025
// integrate f in the interval [0.,1.] with 100 partitions using the simpson method
f |> NumericalIntegration.definiteIntegral(Simpson, 0., 1., 100)
0.25
It should be noted that the accuracy of the estimation increases with the amount of partitions in the integration interval.
Instead of integrating a function by sampling the function values in a set interval, we can also calculate the definite integral of (x,y) pairs with these methods.
This may be of use for example for calculating the area under the curve for prediction metrics such as the ROC(Receiver operator characteristic), which yields a distinct set of (Specificity/Fallout) pairs.
Use the NumericalIntegration.definiteIntegral
function and pass the desired estimation method (more on those methods in the chapters below).
the expected exact value for the definite integral of $f(x) = x^2$ is $0.\overline3$
//x,y pairs of f(x) = x^2 in the interval of [0,1], with random values removed to show that this works with unevenly spaced data
let rnd = new System.Random(69)
let observations =
[|0. .. 0.01 .. 1.|]
|> Array.map(fun x -> x, x * x)
|> Array.filter (fun (x,y) -> rnd.NextDouble() < 0.85 )
observations.Length
87
// integrate observations using the left endpoint method
observations |> NumericalIntegration.definiteIntegral LeftEndpoint
No value returned by any evaluator
// integrate observations using the right endpoint method
observations |> NumericalIntegration.definiteIntegral RightEndpoint
No value returned by any evaluator
// integrate observations using the midpoint method
observations |> NumericalIntegration.definiteIntegral Midpoint
No value returned by any evaluator
// integrate observations using the trapezoidal method
observations |> NumericalIntegration.definiteIntegral Trapezoidal
No value returned by any evaluator
// integrate observations using the simpson method
observations |> NumericalIntegration.definiteIntegral Simpson
No value returned by any evaluator
In the following chapter, each estimation method is introduced briefly and visualized for the example of $f(x) = x^3$ in the interval $[0,1]$ using 5 partitions.
A large class of quadrature rules can be derived by constructing interpolating functions that are easy to integrate. Typically these interpolating functions are polynomials. In practice, since polynomials of very high degree tend to oscillate wildly, only polynomials of low degree are used, typically linear and quadratic.
The approximation of all these methods increase with the size of subintervals in the integration interval.
The interpolating function is a constant function (a polynomial of degree zero), passing the leftmost points of the partition boundaries of the interval to integrate.
For a single partition $[a,b]$ in the integration interval, the integral is estimated by
$$\int_a^b f(x)\,dx \approx (b-a) * f(a)$$The integral of the whole integration interval is obtained by summing the integral of n partitions.
leftEndpointChart
The interpolating function is a constant function (a polynomial of degree zero), passing the rightmost points of the partition boundaries of the interval to integrate.
For a single partition $[a,b]$ in the integration interval, the integral is estimated by
$$\int_a^b f(x)\,dx \approx (b-a) * f(b)$$The integral of the whole integration interval is obtained by summing the integral of n partitions.
rightEndpointChart
The interpolating function is a constant function (a polynomial of degree zero), passing the mid-points of the partition boundaries of the interval to integrate.
For a single partition $[a,b]$ in the integration interval, the integral is estimated by
$$\int_a^b f(x)\,dx \approx (b-a) * f(\frac{a+b}2))$$The integral of the whole integration interval is obtained by summing the integral of n partitions.
midpointChart
The interpolating function is a straight line (an affine function, i.e. a polynomial of degree 1) passing through the partition boundaries of the interval to integrate.
For a single partition $[a,b]$ in the integration interval, the integral is estimated by
$$\int_a^b f(x)\,dx \approx (b-a) (\frac{f(a) + f(b)}2)$$The integral of the whole integration interval is obtained by summing the integral of n partitions.
trapezoidalChart
Simpson
)¶For a single partition $[a,b]$ in the integration interval, the integral is estimated by
$$\int_a^b f(x)\,dx \approx \frac{b - a}6 [f(a) + 4f(\frac{a+b}2) + f(b)]$$The integral of the whole integration interval is obtained by summing the integral of n partitions.
This rule can be derived by constructing parabolas that have the value of $f(x)$ for the partition boundaries $a$ and $b$, and the midpoint $m = \frac{a+b}2$ and calculating their definite integral for $[a,b]$
Another possibility to derive this rule is the weighted average of the midpoint ($M$) and trapezoidal ($T$) rules $\frac{2M + T}3$