import numpy as np
import matplotlib.pyplot as plt
from astropy.modeling.functional_models import AiryDisk2D
What's the minimum distance between two objects which can be individually resolved, and how do optical telescopes work?
There are a few key phrases that get used when discussing telescope design that are useful to define. First, we have the focal length ($F$), which is the distance from a lens to where an image is created. Second, we have the diameter of a lens, $D$. Often photographers refer to the focal ratio, $f$, which is $F/D$.
Now imagine we have 2 stars that are separated by an angle of $\theta$ on the sky, and a distance $d$ at the focal plane.
If we define the plate scale to be $s$, with units of arcsec/mm, then the relation between $\theta$ and $d$ is $$ \theta[{\rm arcsec}] = s[{\rm arcsec}/ {\rm mm}] \: \: d[ {\rm mm}] $$ Using the small angle approximation, we have that $$ \theta[{\rm rad}] = \frac{d}{F} $$ Converting $\theta$ from radians to arcsec (1 rad = 206265 arcsec) and combining with the above gives $$ s[{\rm arcsec}/ {\rm mm}] = \frac{206.265}{F[{\rm m}]}. $$
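As a quick numerical check of the plate scale relation, the sketch below evaluates $s$ for a focal length given in metres. The function names and the 16 m focal length are illustrative assumptions, not values taken from these notes.

```python
ARCSEC_PER_RADIAN = 206264.806  # number of arcseconds in one radian


def plate_scale(focal_length_m):
    """Plate scale in arcsec/mm for a focal length given in metres."""
    focal_length_mm = focal_length_m * 1000.0
    # theta[rad] = d/F, so 1 mm at the focal plane subtends 1/F[mm] radians
    return ARCSEC_PER_RADIAN / focal_length_mm


def angular_separation(d_mm, focal_length_m):
    """Angle on the sky (arcsec) for a separation d_mm (mm) at the focal plane."""
    return plate_scale(focal_length_m) * d_mm


# Hypothetical example: a telescope with a 16 m focal length
s = plate_scale(16.0)
print(f"plate scale = {s:.2f} arcsec/mm")
print(f"0.1 mm at the focal plane = {angular_separation(0.1, 16.0):.2f} arcsec")
```

A longer focal length gives a smaller plate scale, so the same angle on the sky is spread over more millimetres at the focal plane.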
Recall from your optics lecture that a wave with wavelength $\lambda$ which passes through a slit of width $a$, where $a \gg \lambda$, will produce a pattern given by: $$ I = I_0 \frac{\sin^2(\frac{\pi a \sin(\theta)}{\lambda})}{\left[\frac{\pi a \sin(\theta)}{\lambda}\right]^2} $$ This is shown in the below figure. This pattern has minima at $\sin\theta=\frac{m\lambda}{a}$, where $m$ is a non-zero integer. The interesting thing here is to think about what the mathematical relationship between the slit (which is a boxcar function) and the intensity is. The Fourier transform of a boxcar is a sinc, and our function is a sinc$^2$. This is a branch of optics known as Fourier optics, which allows us to use Fourier methods to establish what various optical elements are doing.
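The positions of the minima can be checked directly by evaluating the single-slit formula. This is a minimal sketch; the function name, the 500 nm wavelength, and the 50 micron slit width are assumptions chosen only for illustration.

```python
import math


def single_slit_intensity(sin_theta, slit_width, wavelength, I0=1.0):
    """Single-slit diffraction intensity for a given value of sin(theta)."""
    beta = math.pi * slit_width * sin_theta / wavelength
    if beta == 0.0:
        return I0  # limit of sin^2(beta)/beta^2 as beta -> 0
    return I0 * (math.sin(beta) / beta) ** 2


# Illustrative values (assumed): 500 nm light through a 50 micron slit
wavelength = 500e-9
slit_width = 50e-6

# The intensity should vanish (to rounding error) at sin(theta) = m*lambda/a
for m in (1, 2, 3):
    sin_theta = m * wavelength / slit_width
    intensity = single_slit_intensity(sin_theta, slit_width, wavelength)
    print(f"m = {m}: sin(theta) = {sin_theta:.3f}, I/I0 = {intensity:.2e}")
```

Evaluating the same function halfway between two minima (for example $m=1.5$) gives a small but non-zero intensity, which is the first side lobe of the pattern.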
If the aperture is 2 dimensional and circular, like the aperture of a telescope, then the pattern produced is circularly symmetric (qualitatively, the slit pattern rotated around its axis) and is called an Airy Disc. In this case, the positions of the minima can still be written as $\sin\theta=\frac{m\lambda}{a}$, where $a$ is now the diameter of the aperture, but $m$ is no longer an integer. The first minimum occurs when $m=1.22$ (the derivation is tough and beyond the scope of this current module).
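Although the derivation is skipped here, the value $m=1.22$ can be checked numerically. The sketch below assumes the standard result (not derived in these notes) that the Airy pattern is $I \propto [2J_1(x)/x]^2$ with $x=\pi a\sin\theta/\lambda$, so the first minimum sits at the first zero of the Bessel function $J_1$. The helper names are hypothetical, and $J_1$ is evaluated from its integral representation so that only the standard library is needed.

```python
import math


def bessel_j1(x, steps=2000):
    """J1(x) from the integral (1/pi) * int_0^pi cos(tau - x*sin(tau)) dtau."""
    h = math.pi / steps
    # Midpoint rule; the integrand is smooth, so this converges quickly
    total = sum(math.cos(t - x * math.sin(t))
                for t in ((i + 0.5) * h for i in range(steps)))
    return total * h / math.pi


def first_zero_of_j1(lo=3.0, hi=4.5, tol=1e-10):
    """Bisect for the first positive zero of J1, bracketed by lo and hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0:
            hi = mid  # sign change lies in the lower half
        else:
            lo = mid  # sign change lies in the upper half
    return 0.5 * (lo + hi)


x1 = first_zero_of_j1()
m = x1 / math.pi  # since x = pi * m at the minima
print(f"first zero of J1 = {x1:.4f}, so m = {m:.4f}")
```

The zero lies near $x=3.83$, and dividing by $\pi$ recovers the factor of 1.22.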
So, if we have 2 sources in our image, they will both produce Airy Discs. For the two sources to be distinguishable, the peak of one must lie at or beyond the first minimum of the other, so we want their angular separation to be: $$ \sin\theta>\frac{1.22\lambda}{a} $$ or, for small angles $$ \theta>\frac{1.22\lambda}{a}. $$ This is the Rayleigh criterion, and is very important when considering what wavelength to observe your source at.
For a 10m aperture optical telescope (observing at $\lambda=5000$Å) we find that
$$ \theta=\frac{1.22\times0.5\,\mu{\rm m}}{10\,{\rm m}}=6.1\times10^{-8}\,{\rm rad}\approx0.01{\rm''} $$ Again, this is much smaller than the seeing we're used to because of Earth's atmosphere. As such, it makes sense to put optical telescopes into space, where they can make full use of their resolution.
For example, consider the HST. Its mirror is 2.4 m in diameter, so $\theta\approx0.05{\rm''}$ at the same wavelength.
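The Rayleigh estimates above can be reproduced with a few lines of Python. This is a sketch in the small-angle limit; the function name and the 0.2 m comparison aperture are assumptions added for illustration.

```python
ARCSEC_PER_RADIAN = 206264.806  # number of arcseconds in one radian


def rayleigh_limit_arcsec(wavelength_m, aperture_m):
    """Diffraction-limited resolution in arcsec, theta = 1.22 * lambda / a."""
    theta_rad = 1.22 * wavelength_m / aperture_m
    return theta_rad * ARCSEC_PER_RADIAN


wavelength = 5000e-10  # 5000 Angstrom expressed in metres

# 10 m aperture from the text, plus a hypothetical 0.2 m amateur telescope
for aperture in (10.0, 0.2):
    theta = rayleigh_limit_arcsec(wavelength, aperture)
    print(f"a = {aperture:5.1f} m: theta = {theta:.4f} arcsec")
```

Because $\theta$ scales as $\lambda/a$, halving the aperture or doubling the wavelength doubles the smallest resolvable separation.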
x = np.arange(-5,10,0.05)
y = np.arange(-5,5,0.05)
X, Y = np.meshgrid(x, y)
disc_func = AiryDisk2D()
z1 = disc_func.evaluate(X,Y,amplitude=1,x_0=0,y_0=0,radius=1.22)
z2 = disc_func.evaluate(X,Y,amplitude=1,x_0=5,y_0=0,radius=1.22)
z3 = disc_func.evaluate(X,Y,amplitude=1,x_0=1,y_0=0,radius=1.22)
fig,ax = plt.subplots(nrows=1,ncols=3,figsize=[20,5])
ax[0].pcolormesh(x,y,z1**0.4,cmap='magma')
ax[1].pcolormesh(x,y,(z1+z2)**0.4,cmap='magma')
ax[2].pcolormesh(x,y,(z1+z3)**0.4,cmap='magma')
for axs in ax:
    # Hide the tick marks on every panel
    axs.set_xticks([])
    axs.set_yticks([])
ax[0].set_title("Single Airy Disc")
ax[1].set_title("Resolved Airy discs")
ax[2].set_title("Unresolved Airy discs")
plt.subplots_adjust(wspace=0)
plt.savefig("Images/Airy_discs.png")
plt.show()
Now, consider the following three wavelength regimes:
So, single dish radio telescopes have very poor resolution (we'll get to this next lecture), while X-ray imaging should have outstanding resolution. From our viewpoint beneath the Earth's atmosphere, we only have access to limited windows: the optical window and the radio window.
So how do we design optical telescopes to maximise their effectiveness? There are three optical designs which telescopes follow: Prime focus, Newtonian, and Cassegrain. Each of them is capable of having active supporting units under their primary mirrors, which allows mirrors to keep their shape when at an angle to Earth's gravitational field.
In combination with the optical design, there is also the question of mount design. There are two primary designs: equatorial, and Altitude-Azimuth (Alt-Az). For equatorial telescopes, one of the axes is aligned parallel to the Earth's rotational axis. This means that when the telescope is pointed at a star, it then only requires movement around one axis in order to track it. Equatorial mounts are very common for small, commercial telescopes, but on large telescopes they require huge counterweights, and so become expensive. Take the Hale telescope, for example: https://sites.astro.caltech.edu/palomar/about/telescopes/hale.html. It's a 5 m telescope, but requires a dome which is 41 m tall and 42 m in diameter.
Alt-Az telescopes are significantly more compact in design. One axis is perpendicular to the zenith direction, the other is parallel to it. As such, they require motion around 2 axes in order to track stars. See, for example, the VLT (https://www.eso.org/public/images/potw1036a/). These 8.2 m telescopes only need a dome which is 30 m tall and 28 m in diameter.
The human eye has a very low quantum efficiency (QE) of about 1% (this means around 1 in every 100 photons is actually detected). At optical wavelengths, we typically use CCDs, or charge-coupled devices, for observations. This is due to their very high quantum efficiency and their nearly linear response - that is, the number of counts you observe is directly proportional to the intensity of light.
They also have very high dynamic ranges - each pixel in a CCD is capable of registering up to ~65,000 counts accurately. Typically, CCDs are Si based. As a photon interacts with a given pixel, it causes excitation of an electron into the conduction band. When finished observing, the accumulated charge within a pixel can be shifted to an adjacent pixel. As such, an image is read out pixel-by-pixel, as shown below.
Now, assume the mean number of photons reaching our detector over a fixed interval of time is $N$, but the arrival time of each photon is randomly distributed. The probability of detecting $k$ photons over a fixed time interval is then given by a Poisson distribution $$ P(k)=\frac{N^k e^{-N}}{k!} $$
It is easy enough to show that for such a distribution, the standard deviation is given by $\sigma=\sqrt N$. For large $N$ (>10), this is well approximated by a Gaussian with mean $N$ and $\sigma=\sqrt N$ (see below)
$$ P(k)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-(k-N)^2/2\sigma^2} $$ As such, any counting process which involves a large number of events has an ideal error on it of $\sqrt N$ - this is called shot noise.
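The claim that $\sigma=\sqrt N$ can be checked numerically by summing over the Poisson distribution. This is a minimal sketch with hypothetical helper names; the probabilities are evaluated in log space, an implementation choice made here only to avoid overflow in $N^k$ and $k!$.

```python
import math


def poisson_pmf(k, N):
    """P(k) = N^k e^{-N} / k!, computed via logarithms for numerical stability."""
    return math.exp(k * math.log(N) - N - math.lgamma(k + 1))


def poisson_mean_and_sigma(N, k_max=None):
    """Mean and standard deviation found by direct summation over k."""
    if k_max is None:
        k_max = int(N + 20 * math.sqrt(N) + 20)  # far enough into the tail
    probs = [poisson_pmf(k, N) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, math.sqrt(var)


for N in (3.0, 10.0, 100.0):
    mean, sigma = poisson_mean_and_sigma(N)
    print(f"N = {N:5.1f}: mean = {mean:.4f}, sigma = {sigma:.4f}, "
          f"sqrt(N) = {math.sqrt(N):.4f}")
```

Since the signal is $N$ and the noise is $\sqrt N$, the fractional error falls as $1/\sqrt N$, which is why collecting more photons improves a measurement.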
from math import factorial  # np.math is not available in recent NumPy releases
k = np.arange(0, 30)  # integer numbers of detected events
plt.figure(figsize=[4,3],dpi=150)
# Plot both distributions for N = 3 (solid), 10 (dashed) and 15 (dash-dotted)
for N, style in [(3.0, '-'), (10.0, '--'), (15.0, '-.')]:
    P_k = [N**int(k_i)*np.exp(-N)/factorial(int(k_i)) for k_i in k]  # Poisson Distribution
    G_k = 1/np.sqrt(2*np.pi*N)*np.exp(-(k-N)**2/(2*N))  # Gaussian Distribution, sigma = sqrt(N)
    # Label only the first pair of curves so the legend has one entry per distribution
    plt.plot(k, P_k, 'C1'+style, label='Poisson' if style == '-' else None)
    plt.plot(k, G_k, 'C0'+style, label='Gaussian' if style == '-' else None)
plt.xlabel("k")
plt.ylabel("P(k)")
plt.title("Probability of observing k events for N=3,10,15")
plt.legend()
plt.tight_layout()
plt.savefig("Figures/Poisson_vs_Gaussian.jpg")
plt.show()