NumPy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays.
To use NumPy, you first need to import the numpy package, optionally with as np to declare the alias.
import numpy as np
A numpy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
We can initialize numpy arrays from nested Python lists, and access elements using square brackets:
a = np.array([1, 2, 3]) # this creates a 1d array
b = np.array([[4, 5, 6], [7, 8, 9]]) # this creates a 2d array
type(a), type(b)
(numpy.ndarray, numpy.ndarray)
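To check the rank and shape described above, you can inspect the ndim and shape attributes of the arrays just defined:
print(a.ndim, a.shape) # rank 1
print(b.ndim, b.shape) # rank 2
1 (3,)
2 (2, 3)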
The usual slicing by : also applies to numpy ndarray elements.
print(a[0]) # accessing an element in a 1d array
print(a[1:]) # accessing elements in a 1d array
print(b[1,1]) # accessing an element in a 2d array
print(b[:,::2]) # accessing elements in a 2d array
1
[2 3]
8
[[4 6]
 [7 9]]
You can also change elements of an array.
b[1,1] = -8
print(b)
[[ 4  5  6]
 [ 7 -8  9]]
Be careful: the following three are all different, though they contain the same numbers. For representing vectors, for example $x\in\mathbb{R}^n$, 1d arrays are usually preferable.
a_1 = np.array([1, 2, 3]) # this creates a 1d array
a_2 = np.array([[1, 2, 3]]) # this creates a 2d array, which is different from the above
a_3 = np.array([[1], [2], [3]]) # this creates a 2d array, which is different from the above
print(a_1)
print(a_2)
print(a_3)
[1 2 3]
[[1 2 3]]
[[1]
 [2]
 [3]]
print(a_1.shape)
print(a_2.shape)
print(a_3.shape)
(3,)
(1, 3)
(3, 1)
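If you need to convert between these shapes, reshape does the job. A quick example using the arrays above:
print(a_1.reshape(3,1)) # same numbers arranged as a_3
print(a_3.reshape(-1)) # flattened back to a 1d array
[[1]
 [2]
 [3]]
[1 2 3]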
The numpy module also provides many functions to create arrays.
a = np.ones(5) # this creates a 1d array with all ones
print(a)
print(a.shape)
[1. 1. 1. 1. 1.]
(5,)
b = np.ones((5,3)) # this creates a 2d array with all ones
print(b)
print(b.shape)
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
(5, 3)
np.zeros((3,2)) # this creates a 2d array with all zeros
array([[0., 0.],
       [0., 0.],
       [0., 0.]])
np.eye(4) # this creates a 4x4 identity matrix
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
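A few other common constructors, such as np.full, np.arange, and np.linspace, follow the same pattern; for example:
print(np.full((2,3), 7)) # this creates a 2x3 array filled with 7
print(np.arange(0, 10, 2)) # like Python's range, as an array
print(np.linspace(0, 1, 5)) # 5 evenly spaced points from 0 to 1 inclusive
[[7 7 7]
 [7 7 7]]
[0 2 4 6 8]
[0.   0.25 0.5  0.75 1.  ]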
When creating arrays, you can specify data types.
np.eye(4, dtype=int) # this creates an integer identity matrix
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])
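You can also inspect the dtype attribute of an existing array, or convert with astype:
c = np.array([1, 2, 3], dtype=np.float64) # force floating point at creation
print(c.dtype)
print(c.astype(int)) # convert back to integers
float64
[1 2 3]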
Matrix transpose is available through the .T attribute.
x = np.random.randn(4,2)
print(x)
print(x.T)
[[-1.36184302  1.47140008]
 [-0.30969926  1.64302166]
 [ 0.76424765 -0.29306326]
 [ 0.23602831 -0.36898313]]
[[-1.36184302 -0.30969926  0.76424765  0.23602831]
 [ 1.47140008  1.64302166 -0.29306326 -0.36898313]]
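One thing to keep in mind: transposing a 1d array does nothing, since there is no second axis to swap. Reshape it into a 2d column first if you need a column vector:
v = np.array([1, 2, 3])
print(v.T.shape) # still (3,); .T is a no-op on 1d arrays
print(v.reshape(-1,1).shape) # an explicit column vector
(3,)
(3, 1)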
numpy.random
Arrays can be filled with random samples.
Specifying the seed is optional; setting it makes your results reproducible.
np.random.seed(0)
np.random.rand(1,2)
array([[0.5488135 , 0.71518937]])
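Resetting the seed reproduces exactly the same draws, which is the point:
np.random.seed(0)
np.random.rand(1,2) # same values as above, since the seed was reset
array([[0.5488135 , 0.71518937]])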
The following creates a 1d array filled with uniform random samples from $[0,1)$.
np.random.rand(5)
array([0.60276338, 0.54488318, 0.4236548 , 0.64589411, 0.43758721])
The following creates a $5\times 4$ 2d array filled with samples from the standard normal distribution.
np.random.randn(5,4)
array([[ 1.26611853, -0.50587654,  2.54520078,  1.08081191],
       [ 0.48431215,  0.57914048, -0.18158257,  1.41020463],
       [-0.37447169,  0.27519832, -0.96075461,  0.37692697],
       [ 0.03343893,  0.68056724, -1.56349669, -0.56669762],
       [-0.24214951,  1.51439128, -0.3330574 ,  0.04736482]])
You can check the sample mean and standard deviation for yourself if you don't believe me.
A = np.random.randn(1000000) # generates a million random samples from standard normal distribution
print(np.mean(A)) # computes mean of A
print(np.std(A)) # computes standard deviation of A
0.001303836981544996
1.0001491820084765
The following looks a bit more complicated, but is still pretty intuitive.
np.random.randint(12, size=(10,2)) # generates a 10x2 array of random nonnegative integers less than 12
array([[ 6,  4],
       [10,  6],
       [ 1, 11],
       [ 8, 10],
       [ 3,  1],
       [ 8,  1],
       [ 1, 11],
       [ 7, 11],
       [ 2,  7],
       [11,  6]])
This too.
np.random.randint(12, size=(4,6))
array([[ 8, 10,  7,  3,  3,  1],
       [ 2,  0,  8,  8,  4,  9],
       [11,  0, 11,  6,  0,  1],
       [ 4,  7,  5, 11,  3, 10]])
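np.random.randint also accepts an explicit lower bound; with two positional arguments, samples are drawn from [low, high). For instance, the following draws integers between 3 (inclusive) and 12 (exclusive):
np.random.randint(3, 12, size=(2,3)) # integers in [3, 12)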
numpy.linalg
A variety of linear algebra functions are provided in the numpy.linalg submodule.
The following computes the dot product of two vectors $a,b\in\mathbb{R}^{100}$, that is, $c=a^Tb$.
a = np.random.randn(100)
b = np.random.randn(100)
c = np.dot(a,b)
print(f'a: {a}')
print(f'b: {b}')
print(f'c: {c}')
a: [-1.9009238   0.28935549 -1.28437573 ...  1.6973941   1.83324467 -0.54164549]
b: [ 1.8365915   0.17627841 -0.85977547 ... -0.16223358 -0.86079762 -0.08157982]
c: -1.42294678374963
If $A$ and $B$ are matrices, np.dot performs matrix multiplication. In the example below, $A\in\mathbb{R}^{100\times 50}$, $B\in\mathbb{R}^{50\times 20}$, and $C = AB \in\mathbb{R}^{100\times 20}$.
A = np.random.randn(100,50)
B = np.random.randn(50,20)
C = np.dot(A,B)
print(C.shape)
(100, 20)
Matrix multiplication can also be done with the @ operator, available since Python 3.5.
C2 = A@B
print(np.max(abs(C-C2))) # check if C and C2 are equal
0.0
If the second argument is a vector, matrix-vector multiplication also makes sense. Say, $A\in\mathbb{R}^{100\times 50}$, $b\in\mathbb{R}^{50}$, and $C = Ab \in\mathbb{R}^{100}$.
A = np.random.randn(100,50)
b = np.random.randn(50)
C = np.dot(A,b)
print(C.shape)
(100,)
C2 = A@b
print(np.max(abs(C-C2))) # check if C and C2 are equal
0.0
You might remember that there is a standard way to measure the size of vectors (or matrices): norms. The numpy.linalg submodule provides functions for computing them.
First of all, numpy.linalg.norm(x) without any additional arguments computes the 2-norm of $x$.
x = np.arange(-6,5,2)
print(x)
[-6 -4 -2 0 2 4]
np.linalg.norm(x)
8.717797887081348
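which is nothing but the square root of the sum of squares:
np.sqrt(np.sum(x**2)) # the 2-norm computed by hand
8.717797887081348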
This is equal to
np.linalg.norm(x,2)
8.717797887081348
so the second argument specifies which norm you are interested in.
If you specify the order, say numpy.linalg.norm(x,1), you get the 1-norm of $x$, which is given by
np.linalg.norm(x,1)
18.0
which is nothing but
sum(abs(x))
18
and numpy.linalg.norm(x,np.inf) stands for the infinity norm,
np.linalg.norm(x,np.inf)
6.0
which is equal to
np.max(abs(x))
6
Things get a tiny bit more complicated when you are talking about the size of matrices.
An easy-to-understand matrix norm is probably the Frobenius norm, defined by
$$ \|A\|_F = \left( \sum_{i}\sum_{j} |A_{ij}|^2 \right)^{1/2} $$
A = np.array([[1,2],[3,4]])
norm_A_fro = np.linalg.norm(A,'fro')
print(norm_A_fro**2)
30.0
The 2-norm of a matrix $A$ is more complicated; it is defined by
$$ \|A\|_2 = \sup_{\|x\|\ne0}\frac{\|Ax\|_2}{\|x\|_2} = \sigma_1(A) $$
where $\sigma_1(A)$ stands for the largest singular value of $A$. This is nothing but the maximum gain achievable by the linear mapping $x \mapsto Ax$.
A = np.random.randn(10,5)
np.linalg.norm(A,2)
4.492111518378933
The singular value decomposition of $A$ is such that $$ A = U\Sigma V^T $$ with orthogonal $U$, $V$, and diagonal $\Sigma$. The diagonals of $\Sigma$ are called the singular values, and the orthogonal columns of $U$ and $V$ are corresponding output and input singular vectors.
U, S, VT = np.linalg.svd(A, full_matrices=False) # singular value decomposition of A
print(S) # print singular values of A
print(np.linalg.norm(U@np.diag(S)@VT-A)) # check if A=U*S*VT
[4.49211152 3.46800403 3.10044635 2.43457361 1.49252417]
4.8580632841451e-15
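Note that the first singular value above matches the 2-norm of $A$ computed earlier; a quick check:
print(abs(np.linalg.norm(A,2) - S[0])) # essentially zero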
There are lots of different, more complicated norms which we won't talk about here. However, one thing you will want to remember about norms is that $\|x\|$ (any norm) being small implies that $x$ is small in some sense.
Most of you will be more familiar with eigenvalues than with singular values. The eigenvalues and eigenvectors can be found by numpy.linalg.eig().
A = np.random.randn(3,3)
D, Q = np.linalg.eig(A) # eigenvalues and eigenvectors of A
print(D) # eigenvalues of A
print(Q) # eigenvectors of A
[0.96313763+0.j         0.32569092+0.49397679j 0.32569092-0.49397679j]
[[-0.97590063+0.j          0.66077361+0.j          0.66077361-0.j        ]
 [ 0.10413783+0.j          0.3245128 -0.16154894j  0.3245128 +0.16154894j]
 [ 0.19176359+0.j         -0.02675779+0.65670057j -0.02675779-0.65670057j]]
When $A$ is symmetric, these results correspond to the eigenvalue-eigenvector decomposition satisfying
$$ A = QDQ^T $$
A = np.random.randn(10,10)
A = A + A.T # make A symmetric
D, Q = np.linalg.eig(A) # eigenvalues and eigenvectors of A
print(D) # print eigenvalues of A
print(np.linalg.norm(Q@np.diag(D)@Q.T-A)) # check if A=Q*D*Q.T
[-6.65809472  8.42846474 -3.87162692  5.82154576  4.93306479  3.54683678
  2.30748122 -1.45080261 -0.78394975 -0.12886109]
1.8149887723345015e-14
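For symmetric matrices you may prefer np.linalg.eigh, which exploits the symmetry and always returns real eigenvalues (in ascending order):
D2, Q2 = np.linalg.eigh(A) # specialized routine for symmetric/Hermitian matrices
print(np.linalg.norm(Q2@np.diag(D2)@Q2.T - A)) # again essentially zero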
One of the features you will most frequently use from numpy.linalg is solving systems of linear equations.
Suppose you are given $Ax=y$, with $A\in\mathbb{R}^{n\times n}$ and $y\in\mathbb{R}^n$ known, and you are to find $x\in\mathbb{R}^n$.
n = 1000
x = np.random.randn(n)
A = np.random.randn(n,n)
y = np.dot(A,x)
x_hat = np.linalg.solve(A,y)
You can check that it's correct.
print(np.linalg.norm(x))
print(np.linalg.norm(x_hat-x))
31.908158783195507
1.6375161047167425e-10
x[0:10]
array([-0.51305379,  1.13880826,  1.31896878, -0.60254182,  1.08339105,
       -0.21921745, -1.4299178 , -0.97948736, -0.19321832,  2.65070217])
x_hat[0:10]
array([-0.51305379,  1.13880826,  1.31896878, -0.60254182,  1.08339105,
       -0.21921745, -1.4299178 , -0.97948736, -0.19321832,  2.65070217])
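In principle you could also recover x by explicitly inverting $A$, as in the sketch below, but np.linalg.solve is generally faster and more numerically accurate, so prefer it:
x_hat2 = np.linalg.inv(A) @ y # works, but slower and less accurate than solve
print(np.linalg.norm(x_hat2 - x)) # the error is typically larger than with solve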
Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.
For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like this:
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.random.randn(3) # the vector to add to each row
y = np.zeros(x.shape) # creates a zero matrix with the same shape as x
print(f'v: {v}')
for i in range(x.shape[0]):
y[i, :] = x[i, :] + v
print(y)
v: [ 1.35548733 -0.50326222 -0.51925648]
[[ 2.35548733  1.49673778  2.48074352]
 [ 5.35548733  4.49673778  5.48074352]
 [ 8.35548733  7.49673778  8.48074352]
 [11.35548733 10.49673778 11.48074352]]
This works; however, when the matrix x is very large, an explicit loop in Python could be slow. Note that adding the vector v to each row of the matrix x is equivalent to forming a matrix vv by stacking multiple copies of v vertically, then performing elementwise summation of x and vv. We could implement this approach like this:
vv = np.tile(v, (x.shape[0], 1)) # Stack 4 copies of v on top of each other
y = x + vv
print(y)
[[ 2.35548733  1.49673778  2.48074352]
 [ 5.35548733  4.49673778  5.48074352]
 [ 8.35548733  7.49673778  8.48074352]
 [11.35548733 10.49673778 11.48074352]]
The same result can be obtained by an even simpler statement, thanks to broadcasting.
y = x + v
print(y)
[[ 2.35548733  1.49673778  2.48074352]
 [ 5.35548733  4.49673778  5.48074352]
 [ 8.35548733  7.49673778  8.48074352]
 [11.35548733 10.49673778 11.48074352]]
We will look at a couple more examples like this. Though they may look mathematically incorrect, they all work in Python due to the broadcasting mechanism and will make your code more concise and faster.
# outer product
v = np.array([1,2,3]) # v has shape (3,)
w = np.array([4,5]) # w has shape (2,)
x = v.reshape(3,1)*w
print(x)
[[ 4  5]
 [ 8 10]
 [12 15]]
# adding a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
print(x+v)
[[2 4 6]
 [5 7 9]]
# adding a vector to each column of a matrix
print(x+w.reshape(2, 1))
[[ 5  6  7]
 [ 9 10 11]]
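As one more practical example, broadcasting lets you normalize each column of a matrix in a single line; a small sketch:
x = np.random.randn(4,3)
x_unit = x / np.linalg.norm(x, axis=0) # divide each column by its 2-norm
print(np.linalg.norm(x_unit, axis=0)) # prints [1. 1. 1.]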