# execute only if you're using Google Colab
!wget -q https://raw.githubusercontent.com/ahug/amld-pytorch-workshop/master/binder/requirements.txt -O requirements.txt
!pip install -qr requirements.txt
import torch
print("PyTorch Version:", torch.__version__)
import numpy as np
Very similar to the NumPy framework (if that helps!)
A matrix is a grid of numbers, say of shape (3 x 5). In simple terms, a tensor can be seen as a generalization of a matrix to higher dimensions. It can be of arbitrary shape, e.g. (3 x 6 x 2 x 10).
To start with, you can think of tensors as multidimensional arrays.
X = torch.tensor([1, 2, 3, 4, 5])
X
X.shape
X = torch.tensor([[1, 2, 3], [4, 5, 6]])
X
X.shape
# numpy
np.eye(3)
# torch
torch.eye(3)
# numpy
5 * np.eye(3)
# torch
5 * torch.eye(3)
# numpy
np.ones(5)
# torch
torch.ones(5)
# numpy
np.zeros(5)
# torch
torch.zeros(5)
# numpy
np.empty((3, 5))
# torch
torch.empty((3, 5))
# numpy
X = np.random.random((5, 3))
X
# torch
Y = torch.rand((5, 3))
Y
# numpy
X.shape
# torch
Y.shape
torch.tensor behaves like numpy arrays under mathematical operations. However, torch.tensor additionally keeps track of the gradients (see next notebook) and provides GPU support.
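As a tiny peek ahead (autograd is covered properly in the next notebook), here is a minimal added sketch of the gradient tracking, assuming only the requires_grad flag:
Z = torch.ones(3, requires_grad=True)  # ask PyTorch to record operations on Z
(Z ** 2).sum().backward()              # compute d/dZ of sum(Z^2)
Z.grad                                 # tensor([2., 2., 2.])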
X = np.random.rand(3, 5)
Y = torch.rand(3, 5)
# numpy (matrix multiplication)
X.T @ X
Y.shape
# torch (matrix multiplication)
Y.t() @ Y
Y.t().matmul(Y)
# CAUTION: Operator '*' does element-wise multiplication, just like in numpy!
# Y.t() * Y # error, dimensions do not match for element-wise multiplication
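To make the contrast concrete, a small added illustration (not part of the original cells): '*' needs matching shapes and multiplies entry by entry, while '@' performs a true matrix product.
Y * Y        # element-wise product, shape (3, 5)
Y.t() @ Y    # matrix product, shape (5, 5)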
np.linalg.inv(X.T @ X)
torch.inverse(Y.t() @ Y)
np.arange(2, 10, 2)
torch.arange(2, 10, 2)
np.linspace(0, 1, 10)
torch.linspace(0, 1, 10)
Create the tensor:
$ \begin{bmatrix} 5 & 7 & 9 & 11 & 13 & 15 & 17 & 19 \end{bmatrix} $
# YOUR TURN
Each operation is also available as a function.
X = torch.rand(3, 2)
torch.exp(X)
X.exp()
X.sqrt()
(X.exp() + 2).sqrt() - 2 * X.log().sigmoid() # be creative :-)
Many more functions available: sin, cos, tanh, log, etc.
A = torch.eye(3)
A
A.add(5)  # returns a new tensor; A itself is left unchanged
A
Functions that mutate the passed object in place end with an underscore, e.g. add_, div_, etc.
A.add_(5)
A
A.div_(3)
A
A.uniform_() # fills the tensor with random uniform numbers in [0, 1]
A
Again, it works just like in numpy.
A = torch.randint(100, (3, 3))
A
A[0, 0]
A[2, 1]
A[1]
A[:, 1]
A[1:2, :], A[1:2, :].shape
A[1:, 1:]
A[:2, :2]
X = torch.tensor([1, 2, 3, 4])
X
X = X.repeat(3, 1) # repeat it 3 times along the 0th dimension and once along the 1st dimension
X, X.shape
# equivalent of 'reshape' in numpy (view does not allocate new memory!)
Y = X.view(2, 6)
Y
Y = X.view(-1) # -1 tells PyTorch to infer the number of elements along that dimension
Y, Y.shape
Y = X.view(-1, 2)
Y, Y.shape
Y = X.view(-1, 4)
Y, Y.shape
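A quick added sketch (not in the original notebook) of what "no new memory" means: a view shares storage with the original tensor, so writing through the view is visible in X as well.
Z = X.view(-1)   # Z is a flat view of X, backed by the same storage
Z[0] = 100       # write through the view ...
X                # ... and X reflects the change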
Y = torch.ones(5)
Y, Y.shape
Y = Y.view(-1, 1)
Y, Y.shape
Y.expand(5, 5) # similar to repeat but does not actually allocate new memory
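A small added check (not in the original notebook): an expanded tensor points at the same storage as its base and simply uses a stride of 0 along the broadcast dimension.
expanded = Y.expand(5, 5)
Y.data_ptr() == expanded.data_ptr()   # True: same underlying memory
expanded.stride()                     # (1, 0): stride 0 along the expanded dimension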
X = torch.eye(4)
Y = X[3:, :]
Y, Y.shape
Y = Y.squeeze() # removes all dimensions of size '1'
Y, Y.shape
Y = Y.unsqueeze(1)
Y, Y.shape
Create the tensor:
$ \begin{bmatrix} 7 & 5 & 5 & 5 & 5 \\ 5 & 7 & 5 & 5 & 5 \\ 5 & 5 & 7 & 5 & 5 \\ 5 & 5 & 5 & 7 & 5 \\ 5 & 5 & 5 & 5 & 7 \end{bmatrix} $
Hint: You can use matrix sum and scalar multiplication
# YOUR TURN
Create the tensor:
$ \begin{bmatrix} 4 & 6 & 8 & 10 & 12 \\ 14 & 16 & 18 & 20 & 22 \\ 24 & 26 & 28 & 30 & 32 \end{bmatrix}$
# YOUR TURN
Create the tensor:
$ \begin{bmatrix} 2 & 2 & 2 & 2 & 2 \\ 4 & 4 & 4 & 4 & 4 \\ 6 & 6 & 6 & 6 & 6 \\ 8 & 8 & 8 & 8 & 8 \end{bmatrix} $
# YOUR TURN
X = torch.randint(10, (3, 4)).float()
X
X.sum()
X.sum().item()
X.sum(0) # column-wise sum
X.sum(dim=1) # row-wise sum
X.mean()
X.mean(dim=1)
X.norm(dim=0)
Compute the norms of the row-vectors in matrix X without using torch.norm().
Remember: $$||\vec{x}||_2 = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$$
Hint: X**2 computes the element-wise square.
X = torch.eye(4) + torch.arange(4).repeat(4, 1).float()
# YOUR TURN
# SOLUTION: tensor([3.8730, 4.1231, 4.3589, 4.5826])
X = torch.randint(100, (5, 3))
X
mask = (X > 25) & (X < 75)
mask
X[mask] # returns all elements matching the criteria in a 1D-tensor
mask.sum() # number of elements that fulfill the condition
(X == 25) | (X > 60)
Get the number of non-zeros in X
X = torch.tensor([[1, 0, 2], [0, 6, 0]])
# YOUR TURN
Compute the sum of all entries in X that are larger than the mean of all values in X.
# YOUR TURN
x = torch.Tensor([[0,1,2], [3,4,5]])
print("x.shape: \n%s\n" % (x.shape,))
print("x.size(): \n%s\n" % (x.size(),))
print("x.size(1): \n%s\n" % x.size(1))
print("x.dim(): \n%s\n" % x.dim())
print("x.dtype: \n%s\n" % x.dtype)
print("x.device: \n%s\n" % x.device)
The nonzero function returns the indices of the non-zero elements.
x = torch.Tensor([[0,1,2], [3,4,5]])
print("x.nonzero(): \n%s\n" % x.nonzero())
# press tab to autocomplete
# x.
X = np.random.random((5,3))
X
# numpy ---> torch
Y = torch.from_numpy(X) # Y is actually a DoubleTensor (i.e. 64-bit representation)
Y
Y = torch.rand((2,4))
Y
# torch ---> numpy
X = Y.numpy()
X
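One added caveat worth knowing (not spelled out in the original cells): torch.from_numpy() and .numpy() do not copy the data, they share memory with the underlying array.
X = np.zeros(3)
Y = torch.from_numpy(X)
Y.add_(1)        # modify the torch tensor in place ...
X                # ... and the numpy array sees the change: array([1., 1., 1.])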
Using a GPU in PyTorch is as simple as calling .cuda() on your tensor.
But first, you may want to check whether CUDA is available (torch.cuda.is_available()) and how many devices you have (torch.cuda.device_count()):
torch.cuda.is_available()
torch.cuda.device_count()
x = torch.Tensor([[1,2,3], [4,5,6]])
print(x)
Note: If you don't have CUDA on your machine, the following examples won't work.
x.cuda(0)  # returns a copy on GPU 0; x itself is not modified
print(x.device)
x = x.cuda(0)
print(x.device)
x = x.cuda(1)  # move to the second GPU (requires at least two GPUs)
print(x.device)
x = torch.Tensor([[1,2,3], [4,5,6]])
# This will generate an error since you cannot perform operations on tensors that are not on the same device
x + x.cuda()
# YOUR TURN
These kinds of if statements used to be all over the place in people's PyTorch code. Recently, a more flexible way was introduced:
A torch.device is an object representing the device on which a torch.tensor is or will be allocated. You can easily move a tensor from one device to another by using the tensor.to() function.
cpu = torch.device('cpu')
cuda_0 = torch.device('cuda:0')
x = x.to(cpu)
print(x.device)
x = x.to(cuda_0)
print(x.device)
This is more flexible, since you only need to check once in your code whether CUDA is available:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = x.to(device) # We don't need to care anymore about whether cuda is available or not
print(x.device)
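As an added aside, you can also pass the device directly when creating a tensor, which avoids allocating on the CPU first and copying afterwards:
z = torch.zeros(3, 3, device=device)   # allocated directly on 'device'
print(z.device)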
How much faster is the GPU? See for yourself...
A = torch.rand(100, 1000, 1000)
B = A.cuda(1)  # copy A to the second GPU (use A.cuda() for the default device)
A.size()
%timeit -n 3 torch.bmm(A, A)
%timeit -n 30 torch.bmm(B, B)
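One added caveat about the timing above (a sketch, assuming you want a strict benchmark): CUDA kernels are launched asynchronously, so a fairer GPU measurement synchronizes inside the timed statement.
torch.cuda.synchronize()                                  # finish any pending GPU work first
%timeit -n 30 torch.bmm(B, B); torch.cuda.synchronize()   # stop the clock only once the GPU is done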