In this first notebook, we will introduce tensors, the basic building blocks in PyTorch.
We will see different ways to create tensors and check their properties. Then, we will briefly go through the different kinds of operations they support, such as mathematical operations, indexing, reshaping, expansion, masking, type conversion, etc.
PyTorch is a Python-based scientific computing library, similar to NumPy.
It differs from NumPy in three main aspects: it can run computations on GPUs, it keeps track of the computation graph to enable automatic differentiation (see next notebook), and it provides building blocks for neural networks.
import torch
import numpy as np
print("PyTorch Version:", torch.__version__)
A matrix is a grid of numbers, let's say (3x5).
In simple terms, a tensor can be seen as a generalization of a matrix to higher dimensions.
It can be of arbitrary shape, e.g. (3 x 6 x 2 x 10).
You can think of tensors as multidimensional arrays.
X = torch.tensor([1, 2, 3, 4, 5])
X
X.shape
X = torch.tensor([[1, 2, 3], [4, 5, 6]])
X
X.shape
torch.tensor behaves like numpy.array under mathematical operations. The syntax is very similar between the two libraries.
If you are familiar with NumPy, you can browse here to check the PyTorch equivalents of the NumPy functions.
For example:
np.eye(2)
torch.eye(2)
np.arange(1,5)
torch.arange(1,5)
As we said, torch.tensor additionally keeps track of the computation graph (see next notebook) and provides GPU support.
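As a minimal sketch of these two extra features (automatic differentiation is covered in the next notebook, and the GPU part assumes CUDA is available):
a = torch.ones(3, requires_grad=True) # this tensor records the operations applied to it
b = (a * 2).sum() # b carries a grad_fn pointing back to a
if torch.cuda.is_available():
    a_gpu = a.to('cuda') # the same data, moved to the GPU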
torch.zeros(5)
torch.ones(5)
torch.eye(3)
torch.empty((3, 5))
torch.rand((5, 3))
torch.arange(3, 9, 2)
torch.linspace(0, 1, 11)
Most of these creation operations have a _like counterpart that creates a tensor with the same size, dtype and device as the given tensor.
A = torch.ones((2,3)) # A is a float tensor of size 2x3 located on cpu
B = torch.zeros_like(A) # So B is a float tensor of size 2x3 located on cpu
B
This list is not exhaustive but gives you an idea of the diversity of ways to create a tensor.
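For illustration, here are a few more of the available creation functions (just examples, still not an exhaustive list):
torch.full((2, 3), 7.0) # a 2x3 tensor filled with the value 7.0
torch.randn(2, 3) # samples from a standard normal distribution
torch.randint(0, 10, (2, 3)) # random integers drawn uniformly from [0, 10)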
Create the tensor:
x = torch.Tensor([[0,1,2], [3,4,5]])
print("x.shape: \n%s\n" % (x.shape,))
print("x.size(): \n%s\n" % (x.size(),))
print("x.size(1): \n%s\n" % x.size(1))
print("x.dim(): \n%s\n" % x.dim())
print("x.numel(): \n%s\n" % x.numel())
print("x.dtype: \n%s\n" % x.dtype)
print("x.device: \n%s\n" % x.device)
The nonzero function returns the indices of the non-zero elements.
x = torch.Tensor([[0,1,2], [3,4,5]])
print("x.nonzero(): \n%s\n" % x.nonzero())
Unlike in NumPy, there are two ways to perform most operations in PyTorch:
torch.op(tensor)
tensor.op()
X = torch.rand(3, 2)
torch.exp(X)
X.exp()
You can easily chain operations:
X.sqrt().std()
(X.exp() + 2).sqrt() - 2 * X.log().sigmoid() # be creative :-)
Many more functions are available: sin, cos, tanh, bmm, cumsum, dot, etc.
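For instance, a quick illustrative taste of two of them:
v = torch.tensor([1., 2., 3.])
v.cumsum(dim=0) # running sum: tensor([1., 3., 6.])
v.dot(v) # dot product: tensor(14.)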
Compute the norms of the row-vectors in matrix X without using torch.norm().
Remember: $$||\vec{v}||_2 = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}$$
Hint: X**2 computes the element-wise square.
X = torch.eye(4) + torch.arange(4).repeat(4, 1).float()
# YOUR TURN
# SOLUTION: tensor([3.8730, 4.1231, 4.3589, 4.5826])
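One possible solution, as a sketch (element-wise square, sum over each row, then square root):
(X ** 2).sum(dim=1).sqrt()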
X = torch.rand(3, 2)
X
X.sum()
X.max()
X.mean(dim=1)
X.norm(p=1)
For a matrix $X$, compute $Y$ such that $Y_i = \log \big[ \sum_j \exp (X_{i,j}) \big]$.
X = torch.eye(4) + torch.arange(4).repeat(4, 1).float()
# YOUR TURN
# SOLUTION: tensor([3.4938, 3.5797, 3.7817, 4.1852])
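One possible solution, as a sketch (PyTorch also provides torch.logsumexp, which computes the same quantity in a numerically more stable way):
X.exp().sum(dim=1).log()
# equivalently: torch.logsumexp(X, dim=1)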
Y = torch.rand(2, 3)
Y
# Matrix multiplication
Y.t() @ Y
Y.t().matmul(Y)
# CAUTION: Operator '*' does element-wise multiplication, just like in numpy!
# Y.t() * Y # error, dimensions do not match for element-wise multiplication
torch.inverse(Y @ Y.t()) # Y @ Y.t() is 2x2 and generically invertible (Y.t() @ Y would be singular, since Y has rank at most 2)
Y = torch.rand(3, 3)
Y.det()
torch.linalg.eig(Y) # Y.eig() is deprecated in recent PyTorch versions; torch.linalg.eig returns eigenvalues and eigenvectors
Functions that mutate the tensor in place end with an underscore, e.g. add_, div_, etc.
A = torch.eye(3)
A
A.add(5)
A
A.add_(5)
A
A.uniform_() # fills the tensor with random uniform numbers in [0, 1]
A
A = A + 1 # After this operation, A refers to a new tensor: memory has been copied
A += 1 # After this operation, A is still the same tensor: its memory has been modified in place
Compare the outputs:
A = torch.ones(1)
A_before = A
A = A + 1
print(A, A_before)
A = torch.ones(1)
A_before = A
A += 1
print(A, A_before)
Again, it works just like in NumPy.
A = torch.randint(100, (3, 3))
A
A[0,2]
A[:, 1:2], A[:, 1:2].shape
Note: You can use ... to mark any number of dimensions.
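For example, a small illustrative sketch:
T = torch.arange(24).view(2, 3, 4)
T[..., 0] # same as T[:, :, 0], shape (2, 3)
T[0, ...] # same as T[0], shape (3, 4)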
X = torch.randint(100, (5, 3))
X
mask = (X > 25) & (X < 75)
mask
X[mask] # returns all elements matching the criteria in a 1D-tensor
X[mask] = 0 # You can assign new values only to indices matching the condition:
X
X = torch.arange(40).view(5,8)
X
Extract this vector from the tensor X:
$ \begin{bmatrix} 17 & 19 & 21 & 23 \\ \end{bmatrix} $
# YOUR TURN
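One possible solution, as a sketch: the values sit in row 2, in every other column starting from column 1.
X[2, 1::2]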
X = torch.tensor([[1., 0., 2.], [4., 6., 0.]])
Get the number of non-zeros in X
# YOUR TURN
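One possible solution, as a sketch:
(X != 0).sum() # or: X.nonzero().shape[0]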
Compute the sum of all entries in X that are larger than the mean of all values in X.
# YOUR TURN
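One possible solution, as a sketch, using a boolean mask:
X[X > X.mean()].sum()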
View
view is the equivalent of reshape in NumPy, but view does not allocate new memory: the output tensor shares the same data!
The number of arguments of view will be the number of dimensions of the output tensor.
X = torch.tensor([1, 2, 3, 4, 5, 6])
X
Y = X.view(2, 3) # view tensor X as a 2-dimensional tensor, with size 2 along the first dimension and size 3 along the second
Y
Y = X.view(2, -1) ## -1 tells PyTorch to infer the number of elements along that dimension
Y, Y.shape
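A small sketch to illustrate that the view shares the same data as the original tensor:
Y[0, 0] = 100
X # X[0] has changed to 100 as well, since no memory was copied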
Expand
expand creates a new view of the tensor with dimensions of size 1 expanded to a larger size. This does not allocate new memory either!
Y = torch.ones(5)
Y
Y.expand(5, 5)
Note: There also exist reshape and repeat functions in PyTorch. They work similarly to view and expand, but they may copy memory: repeat always copies, and reshape copies whenever a view is not possible.
Squeeze and Unsqueeze
squeeze removes all dimensions of size 1, while unsqueeze adds a new dimension of size 1 at a given position.
X = torch.eye(4)
Y = X[:1]
Y, Y.shape
Y = Y.squeeze() # removes all dimensions of size '1'
Y, Y.shape
Y = Y.unsqueeze(1) # add a new dimension in position 1
Y, Y.shape
Flatten
We use flatten to collapse several dimensions of a tensor into a single one, producing a lower-dimensional tensor.
X = torch.rand(3,2,4)
print(X.shape)
X
Y = X.flatten()
Y.shape, Y
Y = X.flatten(start_dim=1)
Y.shape, Y
Create the tensor:
$ \begin{bmatrix} 7 & 5 & 5 & 5 & 5 \\ 5 & 7 & 5 & 5 & 5 \\ 5 & 5 & 7 & 5 & 5 \\ 5 & 5 & 5 & 7 & 5 \\ 5 & 5 & 5 & 5 & 7 \end{bmatrix} $
Hint: You can use matrix sum and scalar multiplication
# YOUR TURN
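One possible solution, as a sketch, following the hint:
5 * torch.ones(5, 5) + 2 * torch.eye(5)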
Create the tensor:
$ \begin{bmatrix} 2 & 2 & 2 & 2 & 2 \\ 4 & 4 & 4 & 4 & 4 \\ 6 & 6 & 6 & 6 & 6 \\ 8 & 8 & 8 & 8 & 8 \end{bmatrix} $
# YOUR TURN
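One possible solution, as a sketch, combining view and expand:
(2 * torch.arange(1, 5)).view(4, 1).expand(4, 5)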
# press tab to autocomplete
x.
.to()
To cast a tensor to a different type, we can use the Tensor.to(dtype) function.
Y = 4 * torch.rand((2,4))
Y
Y.dtype
Y.to(torch.float16)
Y.to(torch.int64)
Y.to(torch.bool)
Note the automatic type promotion:
torch.LongTensor([1, 2]) + torch.FloatTensor([1.1, 2.2])
You can use .bool(), .short(), .int(), .long(), .float(), .double() to convert the tensor to the required type.
Y = 4 * torch.rand((2,4))
Y
Y.int()
Y.float()
X = np.random.random((5,3))
X
# numpy ---> torch
Y = torch.from_numpy(X) # Y is actually a DoubleTensor (i.e. 64-bit representation)
Y
Y = torch.rand((2,4))
Y
# torch ---> numpy
X = Y.numpy()
X
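Note that torch.from_numpy (and Tensor.numpy) share memory with the underlying NumPy array; here is a small sketch to illustrate:
A = np.zeros(3)
B = torch.from_numpy(A)
A[0] = 1.0
B # the change made through the NumPy array is visible in the tensor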
First, you may want to check torch.cuda.is_available() and torch.cuda.device_count():
torch.cuda.is_available()
torch.cuda.device_count()
x = torch.Tensor([[1,2,3], [4,5,6]])
print(x)
torch.device
The best way to move a tensor from one device to another is again the Tensor.to(...) function. You need to pass a torch.device object as argument.
A torch.device is an object representing the device on which a torch.Tensor is or will be allocated.
Note: If you don't have CUDA on your machine, the following examples won't work.
cpu = torch.device('cpu')
cuda_0 = torch.device('cuda:0')
x = x.to(cpu)
print(x.device)
x = x.to(cuda_0)
print(x.device)
This is flexible, since you only need to check once in your code whether CUDA is available.
Your Turn
Define the device to be a GPU if available, falling back to the CPU otherwise.
device = None # YOUR TURN
x = x.to(device) # We don't need to worry anymore about whether cuda is available or not in the rest of the code
print(x.device)
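One common way to define it, as a sketch:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')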
Tensor.cuda and Tensor.cpu
You may also see the use of the Tensor.cuda() and Tensor.cpu() functions.
x = torch.Tensor([[1,2,3], [4,5,6]])
print(x)
x.cuda()
print(x.device) # Note: x is still on cpu, because x.cuda() returns a new tensor instead of modifying x in place
x = x.cuda()
print(x.device)
x = x.cuda(1) # This will fail if you have only one gpu
print(x.device)
x = x.cpu()
print(x.device)
x = torch.Tensor([[1,2,3], [4,5,6]])
# This will generate an error, since you cannot perform operations on tensors that are not on the same device
x + x.cuda()
In general, the more flexible Tensor.to(...) function should be preferred.