#!/usr/bin/env python
# coding: utf-8

# # compositioncalc1.py
# >code by Steven H. D. Haddock and Casey W. Dunn as described in:
# >Practical Computing for Biologists
# >by Steven H. D. Haddock and Casey W. Dunn
# >published in 2011 by Sinauer Associates.
# >ISBN 978-0-87893-391-4
# >
# >[http://www.sinauer.com/practical-computing-for-biologists.html](http://www.sinauer.com/practical-computing-for-biologists.html)
# >see [practicalcomputing.org](practicalcomputing.org)
# >
# >scripts freely available by the original authors at practicalcomputing.org
# >DIRECT LINK: [http://practicalcomputing.org/files/pcfb_examples.zip](http://practicalcomputing.org/files/pcfb_examples.zip)
# >Updated to Python 3 by Wayne Decatur
# >#### posted as a Gist and IPython Notebook by Wayne (fomightez at GitHub) with full credit and reference to original code authors.
#
# #### compositioncalc1.py calculates percent of the four bases (A,C,G,&T) in a DNA sequence.
# The code:

# In[1]:


DNASeq = "ATGTCTCATTCAAAGCA"
SeqLength = float(len(DNASeq))
BaseList = "ACGT"
for Base in BaseList:
    Percent = 100 * DNASeq.count(Base) / SeqLength
    print("%s: %4.1f" % (Base, Percent))


# **See the code in action and explore it interactively [here](https://www.pythonanywhere.com/gists/6076502/compositioncalc1.py/ipython2/).**

# **Obtain a copy of this entire IPython Notebook [here](https://gist.github.com/fomightez/6077226) in order to explore it interactively.**

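# _One possible variation on the loop above, offered as a sketch and not as the original authors' method:_
# _`collections.Counter` from the standard library tallies every character of the sequence in a single pass,_
# _so the sequence is not rescanned once per base. The names `BaseCounts` and `Percents` are made up for this illustration._

```python
from collections import Counter

DNASeq = "ATGTCTCATTCAAAGCA"
BaseCounts = Counter(DNASeq)  # tallies every character in one pass
# Counter returns 0 for a base that never occurs, so missing bases are safe
Percents = {Base: 100 * BaseCounts[Base] / len(DNASeq) for Base in "ACGT"}
for Base in "ACGT":
    print("%s: %4.1f" % (Base, Percents[Base]))
```

# _For a short sequence like this one the two approaches should behave the same; the difference would only matter for long sequences._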
# ### Additional aid and exploration below:

# In[2]:


get_ipython().run_line_magic('whos', '')


# The above special command lets us see what is defined and can be used in an IPython Notebook.
# (For some reason it doesn't work for any of the initiating variables over in the interactive gist console.)
#
# _We can go ahead and define a function that will do this calculation:_

# In[3]:


def calc(MyDNASeq):
    SeqLength = float(len(MyDNASeq))
    BaseList = "ACGT"
    for Base in BaseList:
        Percent = 100 * MyDNASeq.count(Base) / SeqLength
        print("%s: %4.1f" % (Base, Percent))


# _Then define a variable:_

# In[4]:


MyDNASeq = "TTGGGGGGCGAAAA"


# _Then we feed that variable to the function:_

# In[5]:


calc(MyDNASeq)


# _In fact, we can even skip the variable and directly input the sequence into the function:_

# In[6]:


calc("TGTTTTTCTTTTTCCCCCCCAAAA")
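

# _The function above prints its results, and it divides by the sequence length, so an empty string would raise a `ZeroDivisionError`._
# _As a minimal sketch of one way around both points, the hypothetical function `calc_percents` below (not part of the original script)_
# _returns the percentages in a dictionary and rejects an empty sequence with a clear message. The `BaseList` parameter is likewise an assumption added for illustration._

```python
def calc_percents(MyDNASeq, BaseList="ACGT"):
    """Return a dict mapping each base in BaseList to its percent of MyDNASeq."""
    SeqLength = len(MyDNASeq)
    if SeqLength == 0:
        # guard against dividing by zero further down
        raise ValueError("sequence is empty")
    return {Base: 100 * MyDNASeq.count(Base) / SeqLength for Base in BaseList}


Result = calc_percents("TTGGGGGGCGAAAA")
for Base in "ACGT":
    print("%s: %4.1f" % (Base, Result[Base]))
```

# _Returning the values instead of printing them lets later cells reuse them, for example to compare two sequences._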