At the top of your Python code, load Numpy like this:

```
import numpy as np
```

Now you can use numpy throughout your file via the `np` object.

````{tip}
1. Jupyter Lab's "Help" menu has a link to the Numpy documentation.
2. You can access an element of a numpy array just like a list:

   ```py
   x = np.arange(1, 5, 1)
   x[1]
   ```

   If the array is a matrix, `x[row,col]` works.
3. Whirlwind has a more comprehensive dive into splitting, slicing, and other numpy operations.
````
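
As a small sketch of the indexing described in the tip above (the array contents and variable names here are made up for illustration), element access works on both 1-D arrays and matrices:

```python
import numpy as np

# A 1-D array holding 1, 2, 3, 4 (the stop value 5 is excluded).
x = np.arange(1, 5, 1)
second = x[1]          # zero-based indexing, so this is the second element
last = x[-1]           # negative indices count from the end, as with lists

# A 2-D array (matrix) with 2 rows and 3 columns.
m = np.array([[10, 20, 30],
              [40, 50, 60]])
element = m[1, 2]      # row 1, column 2
first_row = m[0, :]    # a slice: every column of row 0

print(second, last, element, first_row)
```

Note that `x[1]` is the second element, not the first, because indexing starts at zero.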

```{tip}
Copy this table into your cookbook notes folder.
```

| Function | Description |
|---|---|
| `np.array([user defined list, or lists of lists])` | creates an array or matrix |
| `np.ones(how many)` and `np.ones([rows,cols])` | same but all elements are 1 |
| `np.zeros(how many)` and `np.zeros([rows,cols])` | same but all elements are 0 |
| `np.arange(start,end,stepsize)` | creates array, note that the array will not include any elements `>=end` |
| `np.linspace(from,to,# of elements)` | creates array covering the range specified |
| `np.eye(#)` | creates an identity matrix of size `#` |
| `np.concatenate([x, y])` | combines arrays `x` and `y` |
| `np.nan` | is a NaN object (e.g. like a missing element in a data table). **We will definitely use this in pandas** |
| `np.ceil(#)`, `np.floor(#)` | if #=3.4, ceil will return 4, and floor will return 3. |
| `np.max(x)`, `np.min(x)`, `np.average(x)`, `np.median(x)` | **many statistical operations work as you would expect** |
| `np.reshape(x,[rows,cols])` | works like it looks |
| `np.random.<dist>` | can draw random numbers from many distributions. Use tab autocompletion to see all the options (type `np.random.` and then hit TAB). **YOU MUST NEVER EVER EVER EVER EVER DRAW RANDOM NUMBERS WITHOUT SETTING A SEED!!!** |

```{warning}
Let me repeat that: **YOU MUST NEVER EVER EVER EVER EVER DRAW RANDOM NUMBERS WITHOUT SETTING A SEED!!!**
If you don't, your code will produce different outputs every single time you run it. And other people will get different answers too!
And the point of code is that it is reproducible.
```

```
import numpy as np
np.set_printoptions(2) # just to control # of decimal places shown
np.random.seed(0) # this is how you set a seed
print("original random draw: ",np.random.rand(4))
print("now it's different: ",np.random.rand(4))
print("now it's different: ",np.random.rand(4))
np.random.seed(0)
print("now it's the same again: ",np.random.rand(4))
```
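
Beyond random draws, here is a quick sketch of several other functions from the table above. The input values are arbitrary, and `np.nanmean` is not in the table; it is included only to show one way of handling `np.nan` in a statistic.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])          # 2x2 matrix from a list of lists
ones = np.ones(3)                       # three elements, all 1.0
zeros = np.zeros([2, 3])                # 2 rows, 3 columns, all 0.0
steps = np.arange(0, 10, 2.5)           # 0, 2.5, 5, 7.5 (10 is excluded)
grid = np.linspace(0, 1, 5)             # 0, 0.25, 0.5, 0.75, 1 (endpoint included)
identity = np.eye(2)                    # 2x2 identity matrix
joined = np.concatenate([ones, steps])  # one array of length 7
reshaped = np.reshape(np.arange(6), [2, 3])  # [[0, 1, 2], [3, 4, 5]]

# Summary statistics and rounding.
x = np.array([3.4, 1.0, 2.6])
summary = (np.max(x), np.min(x), np.median(x))  # 3.4, 1.0, 2.6
rounded = (np.ceil(3.4), np.floor(3.4))         # 4.0 and 3.0 (returned as floats)

# NaN propagates through ordinary statistics; nan-aware variants skip it.
with_gap = np.array([1.0, np.nan, 3.0])
plain_mean = np.average(with_gap)       # nan
nan_mean = np.nanmean(with_gap)         # 2.0
```

The contrast between `np.arange` (end excluded) and `np.linspace` (end included) is the detail most likely to cause off-by-one surprises.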

Because pandas is built on top of numpy, most of these numpy functions also work on pandas objects.
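
A minimal sketch of that interplay, assuming pandas is installed and using a made-up Series (the name `hypothetical_column` is purely illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.2, 3.4, np.nan, 5.6], name="hypothetical_column")

# Elementwise numpy functions return a pandas Series with the same index.
rounded_up = np.ceil(s)

# Reductions hand off to the pandas method, which skips NaN by default.
largest = np.max(s)                    # 5.6

# On the raw numpy array, the same call lets NaN propagate instead.
largest_raw = np.max(s.to_numpy())     # nan
```

The different treatment of missing values is worth keeping in mind when switching between a Series and its underlying array.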

Numpy 🤝 Pandas

`numpy`

- You can't vectorize every operation :(
- Numpy is a great solution for the issue of speed, but not for the issue of memory.
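
As a rough sketch of the first bullet: the first computation below vectorizes cleanly, while the second is a made-up recurrence (not from this page) in which each value depends on the previous one, so it does not map onto a single array expression and falls back to a loop.

```python
import numpy as np

x = np.arange(5, dtype=float)

# Elementwise work vectorizes: one array expression replaces the loop.
squares_loop = np.array([v ** 2 for v in x])
squares_vec = x ** 2

# A recurrence where each step needs the previous result.
# (Illustrative example only; the formula is an assumption.)
y = np.empty(5)
y[0] = 0.5
for t in range(1, 5):
    y[t] = np.sin(y[t - 1]) + x[t]
```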

Numpy can be prohibitive, memory-wise: when you run an array operation, numpy first builds the entire array in memory and only then operates on it. A vector of length 1,000,000,000,000 is huge (about 8 TB if stored as 64-bit floats) and requires substantial memory to create. By contrast, you can execute `for i in range(1_000_000_000_000): pass` without running out of memory (although it will take a long time to finish), because Python **never created that vector, it just iterated over numbers**. This is because `range(#)` is lazy: much like a "generator", it produces each number on demand instead of storing them all as an explicit object.
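
To make the memory difference concrete, here is a small sketch using a much shorter length than the example above, so that it is safe to run. The variable names are illustrative, and the dtype is fixed to `np.int64` so the byte count is the same on every platform.

```python
import sys
import numpy as np

n = 1_000_000  # small enough to allocate safely

lazy = range(n)                        # stores only start, stop, and step
eager = np.arange(n, dtype=np.int64)   # stores all n values, 8 bytes each

range_bytes = sys.getsizeof(lazy)      # a few dozen bytes, regardless of n
array_bytes = eager.nbytes             # n * 8 = 8,000,000 bytes

# Both give the same total, but the range never holds every number at once.
loop_total = sum(lazy)
vector_total = int(eager.sum())

print(range_bytes, array_bytes, loop_total == vector_total)
```

The size of the `range` object stays constant as `n` grows, while the array's footprint grows linearly with `n`.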