2020 Data Labs REU
Written by Sage Lichtenwalner, Rutgers University, June 9, 2020
Welcome to Python! In this notebook, we will demonstrate how you can quickly get started programming in Python, using Google's cool Colaboratory platform. Colab is basically a free service that can run Python/Jupyter notebooks in the cloud.
In this notebook, we will demonstrate some of the basics of programming in Python. If you want to learn more, there are lots of other resources and training sessions out there, including the official Python Tutorial. But as an oceanographer, you don't really need to know all the ins-and-outs of programming (though it helps), especially when just starting out.
Over the next few sessions we will cover many of the basic recipes you need to:
Jupyter notebooks have two kinds of cells: "Markdown" cells, like this one, which can contain formatted text, and "Code" cells, which contain the code you will run.
To execute the code in a cell, you can either:
You can try all these options on our first very elaborate piece of code in the next cell.
After you execute the cell, the result will automatically display underneath the cell.
2+2
print("Hello, world!")
# This is a comment
As we go through the notebooks, you can add your own comments or text blocks to save your notes.
# Your Turn: Create your own print() command here with your name
print()
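A note about print()
By default, a notebook displays the result of the last line in a cell, so you don't always need the print() command.
To display the results from more than one line, use print() on each line.
To suppress the output from the last line, add a semicolon ; at the end.
3
4
5
print(3)
print(4)
print(5)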
Let's review a few basic features of programming.
First, it's great for math. You can use addition (+), subtraction (-), multiplication (*), division (/) and exponents (**).
# Your Turn: Try some math here
5*2
The order of operations is also important.
print(5 * 2 + 3)
print(5 * (2+3))
print((5 * 2) + 3)
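The same rules apply to exponents. As a small sketch (the numbers are arbitrary), exponentiation is evaluated before multiplication, and also before a leading minus sign, which can be surprising:

```python
# Exponents are evaluated before multiplication
a = 2 * 3 ** 2      # 3 ** 2 first, then multiplied by 2
b = (2 * 3) ** 2    # parentheses force the multiplication first

# A leading minus sign is applied after the exponent
c = -2 ** 2         # same as -(2 ** 2)
d = (-2) ** 2

print(a, b, c, d)
```

When in doubt, add parentheses. They cost nothing and make the intent obvious.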
# We can easily assign variables, just like in other languages
x = 4
y = 2.5
# And we can use them in our formulas
print(x + y)
print(x/y)
# What kind of objects are these?
print(type(x))
print(type(y))
# A string needs to be in quotes (single or double)
z = 'Python is great'
z
# You can't concatenate (add) strings and integers, so this line raises a TypeError
print( z + x )
# But you can multiply them!
print( z * x )
# If you convert an integer into a string, you can then concatenate them
print( z + ' ' + str(x) + ' you!' )
# A better way
print( 'Python is great %s you!' % x )
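Another option, if you are running Python 3.6 or later, is an f-string. This is only a sketch of an alternative to the % style above, and the variable names `message` and `ratio` are made up for the example:

```python
x = 4
y = 2.5
z = 'Python is great'

# f-strings embed variables directly inside the string
message = f'{z} {x} you!'
print(message)

# A format specifier controls how a number is shown, here 2 decimal places
ratio = f'{x / y:.2f}'
print(ratio)
```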
Remember, Python uses 0-based indexes, so to grab the first element in a list you actually use "0". The last element is n-1, or just "-1" for short. In Matlab this would be 1 to n, or 1:end.
my_list = [3, 4, 5, 9, 12, 13]
# The first item
my_list[0]
# The last item
my_list[-1]
# Extract a subset
my_list[2:5]
# A subset from the end
my_list[-3:]
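The slices above follow one rule that is easy to forget: the start index is included, but the stop index is not. Here is a short sketch of that rule using the same list (the variable names are just for illustration):

```python
my_list = [3, 4, 5, 9, 12, 13]

middle = my_list[2:5]       # indexes 2, 3 and 4 (index 5 is excluded)
first_two = my_list[:2]     # omitting the start means "from the beginning"
last_three = my_list[-3:]   # omitting the stop means "through the end"
every_other = my_list[::2]  # an optional third number sets the step

print(middle, first_two, last_three, every_other)
```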
# Update a value
my_list[3] = 99
my_list
# Warning, Python variables are object references and not copies by default
my_second_list = my_list
print( my_second_list )
my_second_list[0] = 66
print( my_second_list )
print( my_list ) # The first list has been overwritten
# To avoid this, create a copy of the list, which keeps the original intact
my_list = [3, 4, 5, 9, 12]
my_second_list = list(my_list) # You can also use copy.copy() or my_list[:]
my_second_list[0] = 66
print( my_second_list )
print( my_list )
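If you are ever unsure whether two names point to the same list, the `is` operator will tell you. The sketch below also shows one caveat: for lists that contain other lists, a plain copy still shares the inner lists, and `copy.deepcopy()` is one way around that. The variable names here are invented for the example.

```python
import copy

original = [3, 4, 5]
alias = original        # same object, just a second name
clone = list(original)  # a new list with the same items

print(alias is original)  # both names point to one list
print(clone is original)  # clone is a separate list

# For nested lists, a shallow copy still shares the inner lists
nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)
deep = copy.deepcopy(nested)
nested[0][0] = 99

print(shallow[0][0])  # changed, because the inner list is shared
print(deep[0][0])     # unchanged, because deepcopy duplicated the inner lists
```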
Note, a list is not an array by default. But we can turn it into an array using the NumPy library.
NumPy is an essential library for working with scientific data. It provides an array object that is very similar to Matlab's array functionality, allowing you to perform mathematical calculations or run linear algebra routines.
my_list * x
import numpy as np
a = np.array(my_list)
a * x
Note, we won't be explicitly creating NumPy arrays much in this course. But later on, when we load datasets using Pandas or Xarray, the actual arrays under the hood will be NumPy arrays.
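To make the list-versus-array difference concrete without any extra libraries, here is a plain-Python sketch. Multiplying a list repeats it, while a list comprehension does the element-by-element multiplication that a NumPy array performs for you automatically. The values are arbitrary.

```python
my_list = [3, 4, 5]
x = 2

repeated = my_list * x                     # repeats the whole list x times
scaled = [value * x for value in my_list]  # multiplies each element by x

print(repeated)
print(scaled)
```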
Dictionaries are a great way to store structured data of different types. You'll often find metadata inside dictionaries.
my_dict = {'temperature': 21, 'salinity':35, 'sensor':'CTD 23'}
my_dict
# Grab a list of dictionary keys
my_dict.keys()
# Accessing a key/value pair
my_dict['sensor']
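A few other dictionary operations come up often. The following is a minimal sketch, and the 'depth' and 'units' keys are made up for the example rather than taken from any real dataset:

```python
my_dict = {'temperature': 21, 'salinity': 35, 'sensor': 'CTD 23'}

# Add a new key/value pair
my_dict['depth'] = 10

# .get() returns a default value instead of raising a KeyError for a missing key
units = my_dict.get('units', 'unknown')
print(units)

# Loop over the keys and values together
for key, value in my_dict.items():
    print(key, '=', value)
```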
If you're familiar with how to write functions, conditionals, and loops in Matlab or R, it's all very similar in Python, just with a different syntax.
Remember, Python uses indentation (leading spaces) to group statements into blocks, rather than parentheses, curly braces, or end statements. By convention, you indent with 2 or 4 spaces, as long as you are consistent within a block.
def times_two(num):
    return num * 2
times_two(3)
def my_name(name='Sage'):
    return name
my_name()
Here is one quick example that demonstrates how to define a function, use a conditional, and iterate with a for loop all at once.
# A more complicated function
def my_func(number):
    print('Running my_func')
    if type(number) == int:
        for i in range(number):
            print(i)
    else:
        print("Not a number")
my_func('Test')
my_func(4)
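The same three pieces (a function, a loop, and a conditional) can be combined to summarize data. This is only a sketch: the function name `count_warm`, its `threshold` argument, and the temperature values are all invented for illustration.

```python
def count_warm(temperatures, threshold=20):
    """Count how many values are above the threshold."""
    count = 0
    for temp in temperatures:
        if temp > threshold:
            count += 1
    return count

# Made-up temperature values, for illustration only
sample = [18.5, 21.0, 22.3, 19.9]

print(count_warm(sample))                # uses the default threshold of 20
print(count_warm(sample, threshold=19))  # overrides the default
```

Returning a value, rather than printing inside the function, lets you reuse the result in later calculations.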
Now that we've covered some basics, let's start having some fun with actual ocean data.
The National Data Buoy Center (NDBC) provides a great dataset to start with. For this example, we'll use my favorite buoy, Station 44025.
To load datasets like this, there are 2 popular libraries we can use.
NDBC actually makes their data available in a variety of ways. Text files are often more intuitive. However, the NDBC text files require jumping through a few hoops to load and use (each file is a separate year, dates are in multiple columns, etc.).
Luckily, NDBC also provides a THREDDS (DODS) server, which we can use to quickly load some data to play with.
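To give a feel for the "dates in multiple columns" hoop, here is a rough sketch that combines separate date columns into a single timestamp using only the standard library. The two rows and the column layout below are made up and simplified, so this is not the exact NDBC file format.

```python
from datetime import datetime

# Made-up rows in a simplified, hypothetical layout (not the real NDBC format):
# year month day hour minute air_temperature
text = """2019 01 01 00 50 4.1
2019 01 01 01 50 3.9"""

times = []
temps = []
for line in text.splitlines():
    parts = line.split()
    # The first five columns together make up one timestamp
    year, month, day, hour, minute = (int(p) for p in parts[:5])
    times.append(datetime(year, month, day, hour, minute))
    temps.append(float(parts[5]))

print(times[0], temps[0])
```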
import xarray as xr
!pip install netcdf4
data = xr.open_dataset('https://dods.ndbc.noaa.gov/thredds/dodsC/data/stdmet/44025/44025.ncml')
# The Dataset
data
# Let's look at one variable
data.air_temperature
# And one piece of metadata
data.air_temperature.long_name
# Now let's make a quick plot
data.air_temperature.plot();
# Let's subset the data in time
data2 = data.sel(time=slice('2019-01-01','2020-01-01'))
# Let's make that quick plot again
data2.air_temperature.plot();
import matplotlib.pyplot as plt
# We can even plot 2 variables on one graph
data2.air_temperature.plot(label="Air Temperature")
data2.sea_surface_temperature.plot(label="Sea Surface Temperature")
plt.legend();
Tomorrow, we'll delve a lot more into data visualization and many of the other plotting commands you can use. But now, it's your turn to create your own plots.
Try plotting different:
As you create your graphs, try to write figure captions that describe what you think is going on.
# Your Turn: Create some plots
2019 Data Labs Quick Intro to Python