Let $ f(z) = 1 / \Gamma(z)$ where $ \Gamma(z) $ is the gamma function, and let $p(z)$ be the cubic polynomial that best approximates $f(z)$ on the unit disk in the supremum norm $\Vert{\cdot}\Vert_{\infty}$. What is $\Vert{f-p}\Vert_{\infty}$?
Recall that the supremum norm of a function $f$ on a compact domain $D$ is given by $\Vert{f}\Vert_\infty := \max_{z \in D} |f(z)|$.
If we let our optimal polynomial be $p(z) := az^3 + bz^2 + cz + d$, we have $$ \epsilon := \Vert f-p \Vert_\infty = \max_{z \in D} |f(z)-p(z)| $$
where $D$ is the unit disk.
Now, since $1/\Gamma(z)$ and $p(z)$ are entire, their difference is entire as well, and the maximum modulus principle tells us that the maximum of $|f(z)-p(z)|$ over $D$ is attained on the boundary of $D$, i.e. the unit circle.
We therefore have $$\epsilon = \min_{a,b,c,d \in \mathbb{R}} \max_{\theta \in [0,2 \pi)} \left|\frac{1}{\Gamma(e^{i\theta})}- \left(a e^{3i\theta} + b e^{2i\theta} + c e^{i \theta} + d \right) \right| $$
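As a quick numerical sanity check of this reduction (my addition, using the degree-1 truncation $p(z) = z$ as a stand-in for the cubic), the maximum of $|f - p|$ over samples of the disk should agree with the maximum over the circle alone:
from scipy.special import gamma
import numpy as np
p = lambda z: z  # any polynomial candidate works for this check
rng = np.random.default_rng(0)
r, t = np.sqrt(rng.random(10**5)), 2*np.pi*rng.random(10**5)
disk = r * np.exp(1j*t)  # (approximately) uniform samples of the unit disk
circle = np.exp(2j*np.pi*np.linspace(0, 1, 10**5))
err = lambda z: np.abs(1/gamma(z) - p(z))
print(err(disk).max(), err(circle).max())  # circle max should match and (weakly) dominate the disk samples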
First, we try out-of-the-box minimization routines. For an initial guess, we use the coefficients of the Maclaurin series of $f(z)$, namely:
$$ f(z) \approx f_0(z) = f(0) + f'(0) z + \frac{f''(0)}{2!} z^2 + \frac{f'''(0)}{3!} z^3$$
import sympy
from sympy import gamma as sp_gamma
from sympy.abc import z
sympy.series(1/sp_gamma(z), z, n=4)
We have that:
$$(a_0, b_0, c_0, d_0) = \left( \frac{\gamma^2}{2} - \frac{\pi^2}{12} , \gamma, 1, 0\right) \approx (-0.655878, 0.577216, 1, 0) $$
We first use differential evolution to find a rough global minimum of the error:
from scipy.special import gamma
from scipy.optimize import differential_evolution, minimize
from numpy import cos, sin, pi
from tqdm.notebook import tqdm
import numpy as np
cis = lambda t: cos(t) + 1j*sin(t)  # e^{it}
a0, b0, c0, d0 = (-0.655878, 0.577216, 1, 0)
# |p(e^{it}) - f(e^{it})| as a function of t, for given coefficients a, b, c, d
g = lambda a, b, c, d: lambda t: abs(a*cis(3*t) + b*cis(2*t) + c*cis(t) + d - 1/gamma(cis(t)))
def fn_to_minimize(args, N=1000):
    a, b, c, d = args
    # sample |f - p| at N points of the unit circle and take the maximum
    X = np.linspace(0, 2*pi, N)
    Y = g(a, b, c, d)(X)
    return Y.max()
rough_min = differential_evolution(fn_to_minimize,
                                   bounds=[(-1.1, 1.1) for i in range(4)],
                                   tol=1e-6,
                                   x0=(a0, b0, c0, d0),
                                   seed=42)
rough_min
     fun: 0.2143352172675715
     jac: array([0.86039537, 0.64650538, 0.09199849, 0.63088437])
 message: 'Optimization terminated successfully.'
    nfev: 7550
     nit: 121
 success: True
       x: array([-0.60337732,  0.62514893,  1.01969876,  0.0055078 ])
To refine this estimate, we then use local minimization techniques to get a better answer:
from tqdm.notebook import tqdm
def maximize_function(F, a, b):
    # find the maximum of F on [a, b]: repeatedly sample a grid and zoom in
    # around the best sample until the bracket stops shrinking
    F = np.vectorize(F)
    while b - a > 1e-16:
        X = np.linspace(a, b, 1000)
        Y = F(X)
        X_max = X[Y.argmax()]
        a, b = X_max - (b-a)/3, X_max + (b-a)/3
        if a == X_max - (b-a)/3:  # bracket no longer shrinks in floating point
            break
    return (F(a) + F(b)) / 2
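As a quick test of this helper (my addition, not in the original notebook), the maximum of $\sin$ over $[0, 2\pi]$ should come out as $1$:
print(maximize_function(np.sin, 0, 2*np.pi))  # should print ~1.0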
min_F_seen = 10000
iterations = tqdm()
def fn_to_minimize2(args):
    global min_F_seen, iterations
    a, b, c, d = args
    F = g(a, b, c, d)
    out = maximize_function(F, 0, 2*pi)
    if out < min_F_seen:
        print("minimum F seen: ", args, out)
        min_F_seen = out
    iterations.update()
    return out
minimize(fn_to_minimize2, x0 = rough_min.x, tol=1e-12) # 0.2143352345
minimum F seen:  [-0.60337732  0.62514893  1.01969876  0.0055078 ] 0.2143355714038389
minimum F seen:  [-0.60337732  0.62514893  1.01969878  0.0055078 ] 0.21433556169152873
minimum F seen:  [-0.60337732  0.62514893  1.01969876  0.00550781] 0.2143355586581085
minimum F seen:  [-0.60337752  0.62514878  1.01969892  0.00550799] 0.21433538060117682
minimum F seen:  [-0.60337752  0.62514878  1.01969892  0.00550799] 0.21433538060117652
minimum F seen:  [-0.60337752  0.62514878  1.01969892  0.00550799] 0.2143353806011763
      fun: 0.21433538060117682
 hess_inv: array([[ 1.06212648,  0.14313859,  0.29009927,  0.16611956],
       [ 0.14313859,  1.17835123,  0.14218146,  0.02721326],
       [ 0.29009927,  0.14218146,  0.52621387, -0.45964763],
       [ 0.16611956,  0.02721326, -0.45964763,  0.60953961]])
      jac: array([0.24456594, 0.4963477 , 0.90874575, 0.4963477 ])
  message: 'Desired error not necessarily achieved due to precision loss.'
     nfev: 272
      nit: 1
     njev: 52
   status: 2
  success: False
        x: array([-0.60337752,  0.62514878,  1.01969892,  0.00550799])
With these techniques, we obtain about 6 accurate digits. Note that this problem is especially difficult numerically because we are minimizing a maximum: any error in evaluating the inner maximum propagates directly into the outer minimization.
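To see the difficulty concretely, here is a small sketch (my addition, reusing `fn_to_minimize` and `rough_min` from above): since the objective is an inner maximum over $\theta$, it is only piecewise smooth in the coefficients, and the one-sided slopes at a minimum generally disagree.
ts = np.linspace(-1e-3, 1e-3, 201)
direction = np.array([1.0, 0.0, 0.0, 0.0])  # perturb only the z^3 coefficient
vals = np.array([fn_to_minimize(rough_min.x + t*direction) for t in ts])
i = vals.argmin()  # assumes the minimum falls strictly inside the sampled range
left = (vals[i] - vals[i-1]) / (ts[i] - ts[i-1])
right = (vals[i+1] - vals[i]) / (ts[i+1] - ts[i])
print(left, right)  # these typically differ: the objective has a kink at the optimum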
We next follow the solution submitted by Kim McInturff and Peter Simon, which eliminates the numerical instability of computing the inner maximum above.
Since $f$ is entire, we may truncate its Maclaurin series at some large degree $N$ and write $f(z) \approx \sum_{k=0}^N c_k z^k$; let $p(z) = \sum_{k=0}^3 a_k z^k$. The error $E(z) = f(z) - p(z) = \sum_{k=0}^N E_k z^k$ is then itself a polynomial in $z$, with $E_k = c_k - a_k$ for $k \le 3$ and $E_k = c_k$ otherwise.
We find the maximum of $|E(z)|^2$ analytically (and therefore that of $|E(z)|$, since squaring is monotone on non-negative quantities). We have:
$$ \begin{align*} |E(z)|^2 &= E(z)\overline{E(z)} \\ &= \left(\sum_{k=0}^N E_k z^k \right) \overline{\left( \sum_{k=0}^N E_k z^k \right)} \\ &= \left(\sum_{k=0}^N E_k z^k \right) \left( \sum_{k=0}^N E_k \left(\overline{z}\right)^k \right) & (\text{the } E_k \text{ are real}) \\ &= \sum_{0\le p,q \le N} E_p E_q z^p \left(\overline{z}\right)^q \\ &= \sum_{0\le p,q \le N} E_p E_q e^{(p-q)i\theta} & (z \mapsto e^{i\theta}) \end{align*} $$
Therefore, a maximum must occur where $\frac{d}{d\theta} |E(z)|^2 = 0$. Taking the derivative, we have:
$$ \begin{align*} \frac{d}{d\theta} |E(z)|^2 &= \frac{d}{d\theta} \left( \sum_{0\le p,q \le N} E_p E_q e^{(p-q)i\theta} \right) \\ &= \sum_{0\le p,q \le N} E_p E_q \left(\frac{d}{d\theta} e^{(p-q)i\theta} \right) \\ &= \sum_{0\le p,q \le N} E_p E_q \left(i (p-q) e^{(p-q)i\theta}\right) \\ &= \sum_{0\le p,q \le N} E_p E_q \left(i (p-q) z^{p-q}\right) & (e^{i\theta} \mapsto z) \end{align*} $$
We therefore have:
$$ \frac{d}{d\theta} |E(z)|^2 = 0 \iff \sum_{0\le p,q \le N} E_p E_q (p-q) z^{p-q} = 0$$
Note that this expression contains negative powers of $z$, so we multiply through by $z^N$ to obtain a polynomial $P(z) := \sum_{0\le p,q \le N} E_p E_q (p-q) z^{N+p-q}$. Its roots on the unit circle are the critical points of $|E(z)|^2$, among which we locate the maximum. This avoids the floating-point issues of the grid-based maximization we were dealing with earlier.
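Before plugging this in, here is a small numerical check of the derivation (my addition, with made-up sample coefficients $E_k$): the unit-modulus roots of $P$ should recover the brute-force maximum of $|E|$ on the circle.
import numpy as np
from numpy.polynomial import Polynomial
E_coeffs = np.array([0.0055, 1.0198, 0.6252, -0.6033])  # hypothetical sample E_k
M = len(E_coeffs) - 1
P_arr = np.zeros(2*M + 1)
for p in range(M + 1):
    for q in range(M + 1):
        P_arr[M + p - q] += E_coeffs[p]*E_coeffs[q]*(p - q)
roots = Polynomial(P_arr).roots()
E = Polynomial(E_coeffs)
analytic = max(abs(E(r)) for r in roots if abs(abs(r) - 1) < 1e-8)
theta = np.linspace(0, 2*np.pi, 10**5)
brute = np.abs(E(np.exp(1j*theta))).max()
print(brute, analytic)  # should agree to ~8 digits (the grid limits the brute-force value)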
We find the coefficients $c_k$ for $f(z)$ via the following recurrence:
$$\begin{cases} c_0, c_1, c_2 = 0, 1, \gamma \\ c_n = \frac{c_2 c_{n-1} + \sum_{j=2}^{n-1} (-1)^{j+1} \zeta(j) c_{n-j}}{n-1} \end{cases} $$
import mpmath
import numpy as np
from numpy.polynomial import Polynomial
from collections import defaultdict
N = 30
mpmath.mp.dps = 50
gamma = mpmath.mp.euler
coeffs = [0, 1, gamma]
for n in range(3, N+1):
    next_coeff = (coeffs[2]*coeffs[n-1] + sum(pow(-1, j+1)*mpmath.zeta(j)*coeffs[n-j] for j in range(2, n))) / (n-1)
    coeffs.append(next_coeff)
f_coeffs = [np.longdouble(str(i)) for i in coeffs]
a0, b0, c0, d0 = f_coeffs[:4][::-1]
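As a sanity check of this recurrence (my addition, not in the original notebook), the truncated series should reproduce $1/\Gamma$ inside the disk; note the name `gamma` was rebound to Euler's constant above, so we call `mpmath.gamma` directly:
z_test = 0.5
series_val = sum(c * z_test**k for k, c in enumerate(f_coeffs))
print(series_val, 1/mpmath.gamma(z_test))  # both should be ~0.5641895835 = 1/sqrt(pi)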
min_F_seen = 10000
DIGITS = 0
def fn_to_minimize3(args):
    global min_F_seen, DIGITS
    # coefficients of p in ascending order, padded with zeros up to degree N
    p_coeffs = list(args[::-1]) + [0 for i in range(N-3)]
    # coefficients of the error E = f - p
    E_coeffs = [f_i - p_i for f_i, p_i in zip(f_coeffs, p_coeffs)]
    # coefficients of P(z) = sum_{p,q} E_p E_q (p-q) z^{N+p-q}
    P_arr = np.zeros(2*N+1, dtype=float)
    for p in range(N+1):
        for q in range(N+1):
            P_arr[N+p-q] += E_coeffs[p]*E_coeffs[q]*(p-q)
    # roots of P that lie (numerically) on the unit circle are the critical
    # points of |E|^2; the maximum of |E| is attained at one of them
    roots = Polynomial(P_arr).roots()
    E = lambda z: sum(coeff*pow(z, n) for n, coeff in enumerate(E_coeffs))
    potential_maxs = [abs(E(r)) for r in roots if 0.95 <= abs(r) <= 1.05]
    out = max(potential_maxs)
    if out < min_F_seen:
        if abs(min_F_seen - out) <= pow(10, -(DIGITS+1)):
            print(f"new minimum: E({args}) = {min_F_seen}")
            DIGITS += 1
        min_F_seen = out
    return out
differential_evolution(fn_to_minimize3,
                       bounds=[(-1.1, 1.1) for i in range(4)],
                       x0=(a0, b0, c0, d0),
                       tol=1e-12,
                       seed=42)
new minimum: E([-0.61709254  0.63484243  1.04677988  0.01128302]) = 0.2354690226759797
new minimum: E([-0.61392552  0.60879034  1.02295489  0.01537967]) = 0.23177731011348132
new minimum: E([-0.6154001   0.61806856  1.00238181 -0.00354295]) = 0.22808068663267253
new minimum: E([-0.6059147   0.61938119  1.02002916  0.00268617]) = 0.2207281019321637
new minimum: E([-0.60481986  0.62270918  1.01769605  0.00498882]) = 0.2148275797479694
new minimum: E([-0.60376176  0.62427757  1.01880487  0.0051676 ]) = 0.21440289839988855
new minimum: E([-0.60325694  0.62537509  1.0199267   0.00562997]) = 0.21433660211622288
new minimum: E([-0.60328167  0.62532152  1.01987165  0.00560401]) = 0.21433556274375193
new minimum: E([-0.60334874  0.62520381  1.01975375  0.00553643]) = 0.2143352441053902
new minimum: E([-0.60334201  0.62521689  1.01976683  0.00554315]) = 0.21433523787921516
new minimum: E([-0.60334067  0.62521834  1.01976828  0.0055445 ]) = 0.2143352362675276
new minimum: E([-0.60334324  0.62521188  1.01976182  0.00554194]) = 0.2143352345948142
new minimum: E([-0.60334318  0.62521201  1.01976195  0.00554199]) = 0.21433523459104564
new minimum: E([-0.60334322  0.62521191  1.01976185  0.00554195]) = 0.21433523459059878
     fun: 0.21433523459048126
 message: 'Optimization terminated successfully.'
    nfev: 17145
     nit: 283
 success: True
       x: array([-0.60334323,  0.6252119 ,  1.01976184,  0.00554194])
With this approach, we obtain more than 10 correct digits of the answer.
Note that this problem is an instance of the general problem of finding the best approximation to a function in the supremum norm; in such cases the Remez (exchange) algorithm is usually used. However, I was unable to find a Python implementation of this algorithm, and my own implementation of it failed to work.
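For reference, here is a minimal sketch of the classical single-point Remez exchange in its standard setting, best real approximation on an interval. This is my own illustration under stated assumptions (a smooth $f$, a dense evaluation grid), not the failed implementation mentioned above nor a circle-domain method; adapting the exchange step to the unit circle is precisely the hard part. The name `remez` and all parameters are hypothetical.
import numpy as np

def remez(f, deg, a=-1.0, b=1.0, n_iter=50, grid_size=20000):
    # reference of deg+2 points at which the error should equioscillate
    m = deg + 2
    x = np.sort((a+b)/2 + (b-a)/2 * np.cos(np.pi*np.arange(m)/(m-1)))
    grid = np.linspace(a, b, grid_size)
    for _ in range(n_iter):
        # solve p(x_i) + (-1)^i h = f(x_i) for the coefficients of p and the level h
        A = np.column_stack([np.vander(x, deg+1, increasing=True),
                             (-1.0)**np.arange(m)])
        sol = np.linalg.solve(A, f(x))
        coef, h = sol[:deg+1], sol[deg+1]
        err = f(grid) - np.vander(grid, deg+1, increasing=True) @ coef
        i_worst = np.abs(err).argmax()
        if np.abs(err[i_worst]) - abs(h) < 1e-12:
            break  # equioscillation reached (up to the grid resolution)
        # single-point exchange: swap the worst grid point into the reference,
        # replacing a point so that the error signs still alternate
        x_new, s = grid[i_worst], np.sign(err[i_worst])
        ref_err = f(x) - np.vander(x, deg+1, increasing=True) @ coef
        j = np.searchsorted(x, x_new)
        if j == 0:
            if np.sign(ref_err[0]) == s:
                x[0] = x_new
            else:
                x = np.concatenate([[x_new], x[:-1]])
        elif j == m:
            if np.sign(ref_err[-1]) == s:
                x[-1] = x_new
            else:
                x = np.concatenate([x[1:], [x_new]])
        else:
            x[j-1 if np.sign(ref_err[j-1]) == s else j] = x_new
    return coef, abs(h)

coef, eps = remez(np.exp, 3)  # best cubic to exp on [-1, 1]
print(eps)  # the leveled error, i.e. the minimax error up to grid resolution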