Material for a UC Irvine course offered by the Department of Physics and Astronomy.
Content is maintained on github and distributed under a BSD3 license.
Notes that are indented like this provide additional details on more advanced topics.
If you already have an older version of anaconda installed that you want to keep, the instructions below create a new environment that should not disrupt your previous work. You can probably continue with the instructions below after running:
conda update conda
This course assumes a basic familiarity with the core python language. If you are rusty or still learning, I recommend the free ebook A Whirlwind Tour of Python, which is "a fast-paced introduction to essential components of the Python language for researchers and developers who are already familiar with programming in another language".
If you are currently using python 2.x and reluctant to move to python 3, read this and this.
No previous experience with git or github is necessary for this course (but they are useful research tools so worth learning - here is a good starting point). If you are finding the git learning curve to be steep, you are not alone.
Clone the course material from github with the following command, which will create a subdirectory called MachineLearningStatistics:
git clone https://github.com/dkirkby/MachineLearningStatistics.git
This command may prompt you for your github username and password but you can streamline future github access using ssh.
You should now have a subdirectory called MachineLearningStatistics. Enter it for the remaining steps:
cd MachineLearningStatistics
Any instructions below that assume you are in this subdirectory will be prefaced with the shell comment:
# cd MachineLearningStatistics/
We will use the conda command to create a standard python environment for this course.
Create a new environment using the following command at a shell prompt:
# cd MachineLearningStatistics/
conda env create -f environment.yml
This command may run for several minutes.
We are using python version 3.8 for this environment which, as of Apr 2021, is the anaconda default.
Activate the new environment using the following command, which should add "(MLS)" to your command prompt as a reminder of your current environment:
conda activate MLS
To "deactivate" this environment and return to your default base environment, use:
conda deactivate
Older versions of conda used a different syntax to activate and deactivate an environment.
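To confirm from within python that the MLS environment is active, a minimal sketch like the following may help. It assumes that conda exports the name of the active environment in the CONDA_DEFAULT_ENV variable. The function name describe_environment is hypothetical and is not part of the course code.

```python
import os
import sys


def describe_environment(environ=None, version_info=None):
    """Summarize the active conda environment and python version.

    Both arguments default to the live process values. They are
    parameters only so that the function is easy to test.
    """
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info
    return {
        # Assumption: conda sets CONDA_DEFAULT_ENV when an environment is active.
        "env_name": environ.get("CONDA_DEFAULT_ENV"),
        "python": "{}.{}".format(version_info[0], version_info[1]),
    }


if __name__ == "__main__":
    info = describe_environment()
    print(info)
    if info["env_name"] != "MLS":
        print("Warning: the MLS environment does not appear to be active.")
```

Running this script after "conda activate MLS" should report the MLS environment name along with the python version in use.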
Install the tensorflow machine-learning framework:
pip install tensorflow
This pip command will only install tensorflow into your MLS environment (since it uses a version of pip that is local to the MLS environment).
If you are installing onto a linux or windows system with a CUDA-enabled GPU card, use instead:
pip install tensorflow-gpu
Install the pytorch machine-learning framework. The exact command is different for Mac and linux/windows. On a Mac, use:
conda install pytorch torchvision -c pytorch
On a windows or linux system, use:
conda install pytorch torchvision cpuonly -c pytorch
See here for alternate install commands for a linux or windows system with a CUDA-enabled GPU card.
Install the jax machine-learning framework:
pip install jax jaxlib
See here to install on a system with a CUDA-enabled GPU card.
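Once the three frameworks are installed, a quick way to check that python can find them is sketched below. It only looks up each module without importing it, so it runs quickly. The helper name missing_packages is hypothetical, and the list of module names is an assumption based on the install commands above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names for the packages installed in the steps above.
    frameworks = ["tensorflow", "torch", "torchvision", "jax", "jaxlib"]
    missing = missing_packages(frameworks)
    print("Missing:", missing if missing else "none")
```

Any names this reports as missing probably indicate an install step that was skipped or that was run outside the MLS environment.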
Enable a jupyter notebook extension we will use for in-class exercises:
jupyter nbextension enable exercise2/main
In case something goes wrong with your installation and you want to start again, shut down any jupyter sessions using the old environment, then use:
conda deactivate
conda remove --name MLS --all
Finally, install the course code and data into your new environment using:
# cd MachineLearningStatistics/
pip install .
To launch the notebook server at any time, you can now use:
# cd MachineLearningStatistics/
conda activate MLS
cd notebooks
jupyter notebook
We are not using the newer JupyterLab since it is not compatible with notebook extensions.
Click on Contents.ipynb if this is your first time doing this, to check that you can view a notebook.
These instructions allow you to modify and run python code on your local computer from within your browser. If you just want to view these notebooks online, try this nbviewer link.
You can skip this section if you are installing MLS for the first time.
These instructions are only needed in case you need to update your local version of the MLS files, to synchronize with a change on github.
For git experts: you will normally be working on the master branch to simplify the workflow. This means that your local work must be discarded or saved to another branch each time you update, using the instructions below.
The first step is to "factory reset" your installation before getting the updates. The simplest method is to throw away any changes you have made using:
# cd MachineLearningStatistics/
git checkout master
git reset --hard
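Since a hard reset discards uncommitted edits, it may be worth listing them first. The sketch below wraps "git status --porcelain" and splits its output into status codes and paths. The function names parse_porcelain and local_changes are hypothetical, and the parsing assumes the simple two-character status format, without special handling of renamed files.

```python
import subprocess


def parse_porcelain(output):
    """Split `git status --porcelain` output into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if line.strip():
            # Each line is two status characters, a space, then the path.
            entries.append((line[:2], line[3:]))
    return entries


def local_changes(repo_dir="."):
    """Return pending changes in repo_dir. Requires git on the PATH."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

Calling local_changes() from the MachineLearningStatistics directory returns an empty list when there is nothing to lose. Otherwise, each entry names a file that the reset would revert or that git is not yet tracking.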
Alternatively, you can keep a permanent record of your changes in a git branch with a name of your choice, for example "08-Jan-2021":
# cd MachineLearningStatistics/
git checkout -b "08-Jan-2021"
git commit -a -m "Save work in progress"
git checkout master
The second step is to download the changes from github:
# cd MachineLearningStatistics/
git pull
If this command reports "Already up-to-date." (or "Already up to date." with newer versions of git) then there are no updates to download.
The final step is to update your local python environment:
# cd MachineLearningStatistics/
conda activate MLS
pip install . --upgrade