#!/usr/bin/env python
# coding: utf-8

# ## Improved estimation of Signal Noise Ratio via moments
#
# Let $\mu$ and $\sigma$ be the mean and standard deviation of the returns of an asset. Then $\zeta = \mu / \sigma$ is the "Signal to Noise Ratio" (SNR). Typically the SNR is estimated with the _Sharpe Ratio_, defined as $\hat{\zeta} = \hat{\mu} / \hat{\sigma}$, where $\hat{\mu}$ and $\hat{\sigma}$ are the vanilla sample estimates. Can we gain efficiency in the case where the returns have significant skew and kurtosis?
#
# Here we consider an estimator of the form
# $$
# v = a_0 + \frac{a_1 + \left(1+a_2\right)\hat{\mu} + a_3 \hat{\mu}^2}{\hat{\sigma}} + a_4 \left(\frac{\hat{\mu}}{\hat{\sigma}}\right)^2.
# $$
# The Sharpe Ratio corresponds to $a_0 = a_1 = a_2 = a_3 = a_4 = 0$. Note that we were inspired by
# Norman Johnson's [work on t-tests under skewed distributions](http://www.jstor.org/stable/2286597). Johnson considered a similar setup, but with only $a_1, a_2,$ and $a_3$ free, and was concerned with the problem of hypothesis testing on $\mu$.
#
# Below, following Johnson, I will use the Cornish Fisher expansions of $\hat{\mu}$ and $\hat{\sigma}^2$ to approximate $v$ as a function of the first few cumulants of the distribution and some normal variates. I will then compute the mean square error, $E\left[\left(v - \zeta\right)^2\right],$ and take its derivative with respect to the $a_i$. Unfortunately, we will find that the first order conditions are solved by $a_i=0$, which is to say that the vanilla Sharpe has the lowest MSE of estimators of this kind. Our adventure will take us far, but we will return home empty handed.
#
# We proceed.

# In[1]:

# load what we need from sympy
from __future__ import division
from sympy import *
from sympy import Order
from sympy.assumptions.assume import global_assumptions
from sympy.stats import P, E, variance, Normal
init_printing()

nfactor = 4

# define some symbols.
a0, a1, a2, a3, a4 = symbols('a_0 a_1 a_2 a_3 a_4', real=True)
n, sigma = symbols('n \\sigma', real=True, positive=True)
zeta, mu3, mu4 = symbols('\\zeta \\mu_3 \\mu_4', real=True)
mu = zeta * sigma


# We now express $\hat{\mu}$ and $\hat{\sigma}^2$ by the [Cornish Fisher expansion](https://en.wikipedia.org/wiki/Cornish%E2%80%93Fisher_expansion). This is an expression of the distribution of a random variable in terms of its cumulants and a normal variate. The expansion is ordered such that, when applied to the _mean_ of independent draws of a distribution, the terms are clustered by their order in $n$. The Cornish Fisher expansion also involves the [Hermite polynomials](https://en.wikipedia.org/wiki/Hermite_polynomials). The expansions of $\hat{\mu}$ and $\hat{\sigma}^2$ are not independent.
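# (The correlation between the two can be anticipated with a quick leading-order computation: $\mathrm{cov}\left(\hat{\mu}, \hat{\sigma}^2\right) \approx \mu_3 / n$, $\mathrm{var}\left(\hat{\mu}\right) = \sigma^2 / n$, and $\mathrm{var}\left(\hat{\sigma}^2\right) \approx \left(\mu_4 - \sigma^4\right) / n$, so the correlation is approximately
# $$
# \rho \approx \frac{\mu_3}{\sqrt{\sigma^2 \left(\mu_4 - \sigma^4\right)}},
# $$
# which is the value quoted in the comments of the next cell.)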
# We follow Johnson in expressing the correlation of the normals and truncating:

# In[2]:

# probabilists' Hermite polynomials, via sympy's (physicists') hermite
def Hen(x, k):
    return 2**(-Rational(k, 2)) * hermite(k, x / sqrt(2))

# these adjustment terms come out of the wikipedia page:
h1 = lambda x: Hen(x, 2) / 6
h2 = lambda x: Hen(x, 3) / 24
h11 = lambda x: -(2 * Hen(x, 3) + Hen(x, 1)) / 36

# mu3 is the 3rd centered moment of x, so mu3 / sigma**3 is the skew;
# these are the skew and kurtosis terms for the mean of n draws:
gamma1 = (mu3 / (sigma**3)) / sqrt(n)
gamma2 = (mu4 / (sigma**4)) / n

# grab two normal variates with correlation rho,
# which happens to take value:
# rho = mu3 / sqrt(sigma**2 * (mu4 - sigma**4))
z1 = Normal('z_1', 0, 1)
z3 = Normal('z_3', 0, 1)
rho = symbols('\\rho', real=True)
z2 = rho * z1 + sqrt(1 - rho**2) * z3

# this is out of Johnson, but we call it mu hat instead of x bar:
muhat = mu + (sigma / sqrt(n)) * (z1 + gamma1 * h1(z1) + gamma2 * h2(z1) + gamma1**2 * h11(z1))
muhat


# In[3]:

addo = sqrt((mu4 - sigma**4) / (n * sigma**4)) * z2
# this is s^2 in Johnson:
sighat2 = (sigma**2) * (1 + addo)
# use Taylor's theorem to express sighat^-1:
# 1/s = (1/sigma) * (1 + addo)**(-1/2), approximately (1/sigma) * (1 - addo/2)
invs = (sigma**(-1)) * (1 - addo / 2)
invs


# In[4]:

# the new statistic; it is v = part1 + part2 + part3
part1 = a0
part2 = (a1 + (1 + a2) * muhat + a3 * muhat**2) * invs
part3 = a4 * (muhat * invs)**2
v = part1 + part2 + part3
v


# That's a bit hairy. Here I truncate that statistic in powers of $n$. This was hard for me to figure out in sympy, so I took a limit. (I like how 'oo' is infinity in sympy.)

# In[5]:

# keep the constant term and the n**(-1/2) term of v:
v_0 = limit(v, n, oo)
v_05 = v_0 + (limit(sqrt(n) * (v - v_0), n, oo) / sqrt(n))
v_05


# Now we define the error as $v - \zeta$ and compute the approximate bias and variance of the error. We sum the variance and squared bias to get the mean square error.

# In[6]:

staterr = v_05 - zeta
# the mean squared error of the statistic v is
#   MSE = E((v - zeta)**2)
# computing that directly is too slow, though, so evaluate bias and variance separately instead:
bias = E(staterr)
simplify(bias)


# In[7]:

# variance of the error:
varerr = variance(staterr)
MSE = (bias**2) + varerr
collect(MSE, n)


# That's really involved, and finding the derivative will be ugly. Instead we truncate in $n$: below we compute both the constant term, MSE_0, and the version carrying the $n^{-1}$ correction, MSE_1, but only minimize the former. Looking above, you will see that dropping the terms in $n^{-1}$ leaves some quantity squared, namely the squared leading-order bias, $\left(a_0 + a_1/\sigma + a_2 \zeta + a_3 \sigma \zeta^2 + a_4 \zeta^2\right)^2$. That is what we will minimize. The way forward is fairly clear from here.

# In[8]:

# truncate!
MSE_0 = limit(collect(MSE, n), n, oo)
MSE_1 = MSE_0 + (limit(n * (MSE - MSE_0), n, oo) / n)
MSE_0


# Now we take the derivative of the Mean Square Error with respect to the $a_i$. In each case we get an expression linear in all the $a_i$. The first order conditions, which correspond to minimizing the MSE, are satisfied at $a_i=0$.

# In[9]:

# a_0
simplify(diff(MSE_0, a0))


# In[10]:

# a_1
simplify(diff(MSE_0, a1))


# In[11]:

# a_2
simplify(diff(MSE_0, a2))


# In[12]:

# a_3
simplify(diff(MSE_0, a3))


# In[13]:

# a_4
simplify(diff(MSE_0, a4))


# To recap, the minimal MSE occurs for $a_0 = a_1 = a_2 = a_3 = a_4 = 0$. We must try another approach.
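# Before we do, a quick sanity check: substituting $a_i = 0$ into each of the partial derivatives above should give zero, confirming that the vanilla Sharpe Ratio satisfies the first order conditions. (This is only a minimal check of the algebra above.)

# In[14]:

# substitute a_0 = a_1 = a_2 = a_3 = a_4 = 0 into the gradient of the truncated MSE;
# each entry should simplify to zero
zero_as = {a0: 0, a1: 0, a2: 0, a3: 0, a4: 0}
[simplify(diff(MSE_0, ai).subs(zero_as)) for ai in (a0, a1, a2, a3, a4)]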