!date
Mon Apr 18 11:00:57 PDT 2016
%%bash
system_profiler SPSoftwareDataType
Software:
    System Software Overview:
      System Version: OS X 10.8.5 (12F45)
      Kernel Version: Darwin 12.6.0
      Boot Volume: Macintosh HD
      Boot Mode: Normal
      Computer Name: genefish
      User Name: Sam (Sam)
      Secure Virtual Memory: Enabled
      Time since boot: 68 days 1:00
%%bash
system_profiler SPHardwareDataType | grep -v '[SH][ea]'
Model Name: Mac mini
Model Identifier: Macmini6,2
Processor Name: Intel Core i7
Processor Speed: 2.3 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 256 KB
L3 Cache: 6 MB
Memory: 16 GB
Boot ROM Version: MM61.0106.B03
SMC Version (system): 2.8f0
%%bash
wget https://github.com/dereneaton/pyrad/archive/3.0.66.tar.gz
--2016-04-18 11:06:56-- https://github.com/dereneaton/pyrad/archive/3.0.66.tar.gz
Resolving github.com... 192.30.252.128
Connecting to github.com|192.30.252.128|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/dereneaton/pyrad/tar.gz/3.0.66 [following]
--2016-04-18 11:06:57-- https://codeload.github.com/dereneaton/pyrad/tar.gz/3.0.66
Resolving codeload.github.com... 192.30.252.160
Connecting to codeload.github.com|192.30.252.160|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: '3.0.66.tar.gz'
0K .......... .......... .......... .......... .......... 316K
50K .......... .......... .... 87.9M=0.2s
2016-04-18 11:06:57 (469 KB/s) - '3.0.66.tar.gz' saved [76100]
%%bash
tar -xzf 3.0.66.tar.gz
ls
3.0.66.tar.gz Documents/ Library/ Music/ Pictures/ Untitled0.ipynb Desktop/ Downloads/ Movies/ PE-GBS_empirical.ipynb Public/ pyrad-3.0.66/
cd pyrad-3.0.66/
/Users/Sam/pyrad-3.0.66
%%bash
python setup.py install
running install
error: can't create or remove files in install directory The following error occurred while trying to add or remove files in the installation directory: [Errno 13] Permission denied: '/usr/local/bioinformatics/anaconda/lib/python2.7/site-packages/test-easy-install-83059.write-test' The installation directory you specified (via --install-dir, --prefix, or the distutils default setting) was: /usr/local/bioinformatics/anaconda/lib/python2.7/site-packages/ Perhaps your account does not have write access to this directory? If the installation directory is a system-owned directory, you may need to sign in as the administrator or "root" account. If you do not have administrative access to this machine, you may wish to choose a different installation directory, preferably one that is listed in your PYTHONPATH environment variable. For information on other options, you may wish to consult the documentation at: https://pythonhosted.org/setuptools/easy_install.html Please make the appropriate changes for your system and try again.
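The error above says the account cannot write to Anaconda's site-packages directory. A minimal sketch of checking that before running the install is shown here; it assumes the check runs under the same interpreter that would do the install, and the helper name `install_target_is_writable` is made up for illustration.

```python
import os
import site
import sysconfig


def install_target_is_writable(path=None):
    """Return True if the current user may write to the install directory."""
    if path is None:
        # 'purelib' is the default destination for pure-Python packages.
        path = sysconfig.get_paths()["purelib"]
    return os.access(path, os.W_OK)


target = sysconfig.get_paths()["purelib"]
if install_target_is_writable(target):
    print("writable: %s" % target)
else:
    print("not writable: %s" % target)
    # A per-user install location that normally needs no admin rights.
    print("per-user site-packages: %s" % site.getusersitepackages())
```

If the directory is not writable, the usual choices are to run the install with administrator rights or to request a per-user install (for example `python setup.py install --user`). The notebook does not record which route was taken here.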
pyrad -h
--------------------------------------------------------------------------- NameError Traceback (most recent call last) <ipython-input-10-f75d644a6021> in <module>() ----> 1 pyrad -h NameError: name 'pyrad' is not defined
cd /usr/local/bin
/usr/local/bin
%%bash
wget http://www.drive5.com/muscle/downloads3.8.31/muscle3.8.31_i86darwin64.tar.gz
--2016-04-18 11:18:20-- http://www.drive5.com/muscle/downloads3.8.31/muscle3.8.31_i86darwin64.tar.gz
Resolving www.drive5.com... 205.196.220.130
Connecting to www.drive5.com|205.196.220.130|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 166130 (162K) [application/x-tar]
Saving to: 'muscle3.8.31_i86darwin64.tar.gz'
0K .......... .......... .......... .......... .......... 30% 391K 0s
50K .......... .......... .......... .......... .......... 61% 386K 0s
100K .......... .......... .......... .......... .......... 92% 777K 0s
150K .......... .. 100% 5.86M=0.3s
2016-04-18 11:18:22 (501 KB/s) - 'muscle3.8.31_i86darwin64.tar.gz' saved [166130/166130]
%%bash
tar -xzf muscle3.8.31_i86darwin64.tar.gz
%%bash
wget https://github.com/torognes/vsearch/releases/download/v1.11.1/vsearch-1.11.1-osx-x86_64.tar.gz
--2016-04-18 11:22:27-- https://github.com/torognes/vsearch/releases/download/v1.11.1/vsearch-1.11.1-osx-x86_64.tar.gz
Resolving github.com... 192.30.252.131
Connecting to github.com|192.30.252.131|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-cloud.s3.amazonaws.com/releases/19848353/5a974bd0-018b-11e6-810c-daa981b4e335.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAISTNZFOVBIJMK3TQ%2F20160418%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20160418T182228Z&X-Amz-Expires=300&X-Amz-Signature=85098bc4bc106568a61bbc8d43517659e99f54919d3ca6ac414e43d772434528&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3B%20filename%3Dvsearch-1.11.1-osx-x86_64.tar.gz&response-content-type=application%2Foctet-stream [following]
--2016-04-18 11:22:28-- https://github-cloud.s3.amazonaws.com/releases/19848353/5a974bd0-018b-11e6-810c-daa981b4e335.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAISTNZFOVBIJMK3TQ%2F20160418%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20160418T182228Z&X-Amz-Expires=300&X-Amz-Signature=85098bc4bc106568a61bbc8d43517659e99f54919d3ca6ac414e43d772434528&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3B%20filename%3Dvsearch-1.11.1-osx-x86_64.tar.gz&response-content-type=application%2Foctet-stream
Resolving github-cloud.s3.amazonaws.com... 54.231.81.72
Connecting to github-cloud.s3.amazonaws.com|54.231.81.72|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 274585 (268K) [application/octet-stream]
Saving to: 'vsearch-1.11.1-osx-x86_64.tar.gz'
0K .......... .......... .......... .......... .......... 18% 367K 1s
50K .......... .......... .......... .......... .......... 37% 729K 0s
100K .......... .......... .......... .......... .......... 55% 368K 0s
150K .......... .......... .......... .......... .......... 74% 54.4M 0s
200K .......... .......... .......... .......... .......... 93% 744K 0s
250K .......... ........ 100% 79.1M=0.4s
2016-04-18 11:22:29 (656 KB/s) - 'vsearch-1.11.1-osx-x86_64.tar.gz' saved [274585/274585]
%%bash
tar -xzf vsearch-1.11.1-osx-x86_64.tar.gz
cd ..
/usr/local/bin
ls
aclocal@ genOSAuth* imkdir* itrim* runQuota.r aclocal-1.15@ genomeCoverageBed* imv* iuserinfo* showCore.ir annotateBed* getOverlap* intersectBed* ixmsg* shuffleBed* autoconf@ groupBy* ipasswd* linksBed* slopBed* autoheader@ hg* ipcluster* lzcat@ sortBed* autom4te@ iadmin* ipcontroller* lzcmp@ subtractBed* automake@ ibun* ipengine* lzdiff@ tagBam* automake-1.15@ icd* iphybun* lzegrep@ tclsh8.6* autoreconf@ ichksum* iphymv* lzfgrep@ twdiff@ autoscan@ ichmod* iplogger* lzgrep@ twfind@ autoupdate@ icp* ips* lzless@ unionBedGraphs* bamToBed* idbo* iptest* lzma@ unlzma@ bamToFastq* idbug* iput* lzmadec@ unxz@ bed12ToBed6* ienv* ipwd* lzmainfo@ v1.11.1.tar.gz bedToBam* ierror* ipython* lzmore@ vsearch* bedToIgv* iexecmd* iqdel* makedepend@ vsearch-1.11.1/ bedpeToBam* iexit* iqmod* mapBed* wget@ bedtools* ifnames@ iqstat* maskFastaFromBed* windowBed* brew* ifsck* iquest* mergeBed* windowMaker* chgCoreToCore1.ir iget* iquota* multiBamCov* wish8.6* chgCoreToCore2.ir igetbyticket.pl* ireg* multiIntersectBed* xz@ chgCoreToOrig.ir igetwild.sh* irepl* muscle3.8.31_i86darwin64* xzcat@ closestBed* igroupadmin* irm* muscle3.8.31_i86darwin64.tar.gz xzcmp@ clusterBed* ihelp* irmtrash* nucBed* xzdec@ complementBed* iinit* irods3.2.icmds.mac.intel/ pairToBed* xzdiff@ coverageBed* ilocate* irsync* pairToPair* xzegrep@ delUnusedAVUs.ir ils* irule* pkg-config@ xzfgrep@ edit@ ilsresc* irunner* pycolor* xzgrep@ expandCols* imcoll* iscan* pyrad-3.0.66/ xzless@ fastaFromBed* imeta* isysmeta* randomBed* xzmore@ flankBed* imiscsvrinfo* iticket* runQuota.ir
%%bash
rm muscle3.8.31_i86darwin64.tar.gz v1.11.1.tar.gz
%%bash
git clone https://github.com/xflouris/PEAR.git
Cloning into 'PEAR'...
cd PEAR/
/usr/local/bin/PEAR
%%bash
./autogen.sh
configure.ac:15: installing './compile' configure.ac:6: installing './install-sh' configure.ac:6: installing './missing' src/Makefile.am: installing './depcomp'
%%bash
./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... ./install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for log in -lm... yes
checking for main in -lpthread... yes
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for stdint.h... (cached) yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for inline... inline
checking for size_t... yes
checking for uint32_t... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible malloc... yes
checking for working strtod... yes
checking for memmove... yes
checking for pow... yes
checking for strtol... yes
checking for library containing pthread_create... none required
checking for library containing BZ2_bzCompress... -lbz2
checking for library containing zlibVersion... -lz
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking bzlib.h usability... yes
checking bzlib.h presence... yes
checking for bzlib.h... yes
checking zlib.h usability... yes
checking zlib.h presence... yes
checking for zlib.h... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating man/Makefile
config.status: creating src/config.h
config.status: executing depfiles commands
%%bash
make
Making all in src make all-am gcc -DHAVE_CONFIG_H -I. -O3 -fomit-frame-pointer -funroll-loops -Wall -Wredundant-decls -Wreturn-type -Wswitch-default -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Wimport -Wunused-label -Wno-int-to-pointer-cast -Wbad-function-cast -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wstrict-prototypes -Wpointer-sign -Wextra -Wredundant-decls -Wunused -Wunused-function -Wunused-parameter -Wunused-value -Wunused-variable -Wformat -Wformat-nonliteral -Wparentheses -Wsequence-point -Wuninitialized -Wundef -O3 -MT pear-pear-pt.o -MD -MP -MF .deps/pear-pear-pt.Tpo -c -o pear-pear-pt.o `test -f 'pear-pt.c' || echo './'`pear-pt.c mv -f .deps/pear-pear-pt.Tpo .deps/pear-pear-pt.Po gcc -DHAVE_CONFIG_H -I. -O3 -fomit-frame-pointer -funroll-loops -Wall -Wredundant-decls -Wreturn-type -Wswitch-default -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Wimport -Wunused-label -Wno-int-to-pointer-cast -Wbad-function-cast -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wstrict-prototypes -Wpointer-sign -Wextra -Wredundant-decls -Wunused -Wunused-function -Wunused-parameter -Wunused-value -Wunused-variable -Wformat -Wformat-nonliteral -Wparentheses -Wsequence-point -Wuninitialized -Wundef -O3 -MT pear-args.o -MD -MP -MF .deps/pear-args.Tpo -c -o pear-args.o `test -f 'args.c' || echo './'`args.c mv -f .deps/pear-args.Tpo .deps/pear-args.Po gcc -DHAVE_CONFIG_H -I. 
-O3 -fomit-frame-pointer -funroll-loops -Wall -Wredundant-decls -Wreturn-type -Wswitch-default -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Wimport -Wunused-label -Wno-int-to-pointer-cast -Wbad-function-cast -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wstrict-prototypes -Wpointer-sign -Wextra -Wredundant-decls -Wunused -Wunused-function -Wunused-parameter -Wunused-value -Wunused-variable -Wformat -Wformat-nonliteral -Wparentheses -Wsequence-point -Wuninitialized -Wundef -O3 -MT pear-statistics.o -MD -MP -MF .deps/pear-statistics.Tpo -c -o pear-statistics.o `test -f 'statistics.c' || echo './'`statistics.c mv -f .deps/pear-statistics.Tpo .deps/pear-statistics.Po gcc -DHAVE_CONFIG_H -I. -O3 -fomit-frame-pointer -funroll-loops -Wall -Wredundant-decls -Wreturn-type -Wswitch-default -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Wimport -Wunused-label -Wno-int-to-pointer-cast -Wbad-function-cast -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wstrict-prototypes -Wpointer-sign -Wextra -Wredundant-decls -Wunused -Wunused-function -Wunused-parameter -Wunused-value -Wunused-variable -Wformat -Wformat-nonliteral -Wparentheses -Wsequence-point -Wuninitialized -Wundef -O3 -MT pear-reader.o -MD -MP -MF .deps/pear-reader.Tpo -c -o pear-reader.o `test -f 'reader.c' || echo './'`reader.c mv -f .deps/pear-reader.Tpo .deps/pear-reader.Po gcc -O3 -fomit-frame-pointer -funroll-loops -Wall -Wredundant-decls -Wreturn-type -Wswitch-default -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Wimport -Wunused-label -Wno-int-to-pointer-cast -Wbad-function-cast -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wstrict-prototypes -Wpointer-sign -Wextra -Wredundant-decls -Wunused -Wunused-function -Wunused-parameter -Wunused-value -Wunused-variable -Wformat -Wformat-nonliteral -Wparentheses -Wsequence-point -Wuninitialized -Wundef -O3 -o pear 
pear-pear-pt.o pear-args.o pear-statistics.o pear-reader.o -lz -lbz2 -lpthread -lm Making all in man make[1]: Nothing to be done for `all'. make[1]: Nothing to be done for `all-am'.
reader.c:967: warning: ‘do_cpuid’ defined but not used
%%bash
make install
Making install in src .././install-sh -c -d '/usr/local/bin' /usr/bin/install -c pear '/usr/local/bin' make[2]: Nothing to be done for `install-data-am'. Making install in man make[2]: Nothing to be done for `install-exec-am'. .././install-sh -c -d '/usr/local/share/man/man1' /usr/bin/install -c -m 644 pear.1 '/usr/local/share/man/man1' make[2]: Nothing to be done for `install-exec-am'. make[2]: Nothing to be done for `install-data-am'.
install: /usr/local/bin/pear: Inappropriate file type or format
cd
/Users/Sam
%%bash
mkdir -p analysis_pyrad
%%bash
pyrad -n
new params.txt file created
cat params.txt
==** parameter inputs for pyRAD version 3.0.66 **======================== affected step ==
./                        ## 1. Working directory (all)
./*.fastq.gz              ## 2. Loc. of non-demultiplexed files (if not line 18) (s1)
./*.barcodes              ## 3. Loc. of barcode file (if not line 18) (s1)
vsearch                   ## 4. command (or path) to call vsearch (or usearch) (s3,s6)
muscle                    ## 5. command (or path) to call muscle (s3,s7)
TGCAG                     ## 6. Restriction overhang (e.g., C|TGCAG -> TGCAG) (s1,s2)
2                         ## 7. N processors (parallel) (all)
6                         ## 8. Mindepth: min coverage for a cluster (s4,s5)
4                         ## 9. NQual: max # sites with qual < 20 (or see line 20)(s2)
.88                       ## 10. Wclust: clustering threshold as a decimal (s3,s6)
rad                       ## 11. Datatype: rad,gbs,pairgbs,pairddrad,(others:see docs)(all)
4                         ## 12. MinCov: min samples in a final locus (s7)
3                         ## 13. MaxSH: max inds with shared hetero site (s7)
c88d6m4p3                 ## 14. Prefix name for final output (no spaces) (s7)
==== optional params below this line =================================== affected step ==
                          ## 15.opt.: select subset (prefix* only selector) (s2-s7)
                          ## 16.opt.: add-on (outgroup) taxa (list or prefix*) (s6,s7)
                          ## 17.opt.: exclude taxa (list or prefix*) (s7)
                          ## 18.opt.: loc. of de-multiplexed data (s2)
                          ## 19.opt.: maxM: N mismatches in barcodes (def= 1) (s1)
                          ## 20.opt.: phred Qscore offset (def= 33) (s2)
                          ## 21.opt.: filter: def=0=NQual 1=NQual+adapters. 2=strict (s2)
                          ## 22.opt.: a priori E,H (def= 0.001,0.01, if not estimated) (s5)
                          ## 23.opt.: maxN: max Ns in a cons seq (def=5) (s5)
                          ## 24.opt.: maxH: max heterozyg. sites in cons seq (def=5) (s5)
                          ## 25.opt.: ploidy: max alleles in cons seq (def=2;see docs) (s4,s5)
                          ## 26.opt.: maxSNPs: (def=100). Paired (def=100,100) (s7)
                          ## 27.opt.: maxIndels: within-clust,across-clust (def. 3,99) (s3,s7)
                          ## 28.opt.: random number seed (def. 112233) (s3,s6,s7)
                          ## 29.opt.: trim overhang left,right on final loci, def(0,0) (s7)
                          ## 30.opt.: output formats: p,n,a,s,v,u,t,m,k,g,* (see docs) (s7)
                          ## 31.opt.: maj. base call at depth>x<mindepth (def.x=mindepth) (s5)
                          ## 32.opt.: keep trimmed reads (def=0). Enter min length. (s2)
                          ## 33.opt.: max stack size (int), def= max(500,mean+2*SD) (s3)
                          ## 34.opt.: minDerep: exclude dereps with <= N copies, def=1 (s3)
                          ## 35.opt.: use hierarchical clustering (def.=0, 1=yes) (s6)
                          ## 36.opt.: repeat masking (def.=1='dust' method, 0=no) (s3,s6)
                          ## 37.opt.: vsearch max threads per job (def.=6; see docs) (s3,s6)
==== optional: list group/clade assignments below this line (see docs) ==================
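Each parameter line in params.txt puts the value before `##` and a numbered description after it. A minimal sketch of reading such a file into a dictionary keyed by parameter number follows; `parse_params` is a made-up helper and not part of pyRAD, and the sample lines are abbreviated copies of the listing above.

```python
def parse_params(lines):
    """Map parameter number -> value string for pyRAD-style params lines."""
    params = {}
    for line in lines:
        if "##" not in line:
            continue  # banner and separator lines carry no parameter
        value, _, comment = line.partition("##")
        # The comment starts with the number: ' 7. N processors' or ' 18.opt.: ...'
        number = comment.strip().split(".", 1)[0]
        if number.isdigit():
            params[int(number)] = value.strip()
    return params


sample_lines = [
    "==** parameter inputs for pyRAD version 3.0.66 **==== affected step ==",
    "./            ## 1. Working directory (all)",
    "vsearch       ## 4. command (or path) to call vsearch (or usearch) (s3,s6)",
    ".88           ## 10. Wclust: clustering threshold as a decimal (s3,s6)",
    "              ## 18.opt.: loc. of de-multiplexed data (s2)",
]

for number, value in sorted(parse_params(sample_lines).items()):
    print(number, repr(value))
```

With a real file the call would be `parse_params(open("params.txt"))`. Unset optional parameters come back as empty strings.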
ls
3.0.66.tar.gz Documents/ Library/ Music/ Pictures/ Untitled0.ipynb params.txt Desktop/ Downloads/ Movies/ PE-GBS_empirical.ipynb Public/ analysis_pyrad/
%%bash
mkdir oly_pyrad
cp params.txt oly_pyrad/
cp -r analysis_pyrad oly_pyrad/
cd oly_pyrad/
/Users/Sam/oly_pyrad
%%bash
mkdir fastq
cd fastq/
/Users/Sam/oly_pyrad/fastq
%%bash
time cp /Volumes/web/nightingales/O_lurida/20160223_gbs/*_{1..10}A_*.fq.gz .
real 4m33.626s user 0m0.024s sys 0m14.946s
ls
1HL_10A_1.fq.gz* 1HL_3A_1.fq.gz* 1HL_6A_1.fq.gz* 1HL_9A_1.fq.gz* 1NF_2A_1.fq.gz* 1NF_6A_1.fq.gz* 1NF_9A_1.fq.gz* 1SN_2A_1.fq.gz* 1SN_5A_1.fq.gz* 1SN_8A_1.fq.gz* 1HL_10A_2.fq.gz* 1HL_3A_2.fq.gz* 1HL_6A_2.fq.gz* 1HL_9A_2.fq.gz* 1NF_2A_2.fq.gz* 1NF_6A_2.fq.gz* 1NF_9A_2.fq.gz* 1SN_2A_2.fq.gz* 1SN_5A_2.fq.gz* 1SN_8A_2.fq.gz* 1HL_1A_1.fq.gz* 1HL_4A_1.fq.gz* 1HL_7A_1.fq.gz* 1NF_10A_1.fq.gz* 1NF_4A_1.fq.gz* 1NF_7A_1.fq.gz* 1SN_10A_1.fq.gz* 1SN_3A_1.fq.gz* 1SN_6A_1.fq.gz* 1SN_9A_1.fq.gz* 1HL_1A_2.fq.gz* 1HL_4A_2.fq.gz* 1HL_7A_2.fq.gz* 1NF_10A_2.fq.gz* 1NF_4A_2.fq.gz* 1NF_7A_2.fq.gz* 1SN_10A_2.fq.gz* 1SN_3A_2.fq.gz* 1SN_6A_2.fq.gz* 1SN_9A_2.fq.gz* 1HL_2A_1.fq.gz* 1HL_5A_1.fq.gz* 1HL_8A_1.fq.gz* 1NF_1A_1.fq.gz* 1NF_5A_1.fq.gz* 1NF_8A_1.fq.gz* 1SN_1A_1.fq.gz* 1SN_4A_1.fq.gz* 1SN_7A_1.fq.gz* 1HL_2A_2.fq.gz* 1HL_5A_2.fq.gz* 1HL_8A_2.fq.gz* 1NF_1A_2.fq.gz* 1NF_5A_2.fq.gz* 1NF_8A_2.fq.gz* 1SN_1A_2.fq.gz* 1SN_4A_2.fq.gz* 1SN_7A_2.fq.gz*
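Paired-end merging needs both the `_1` and `_2` file for every sample, so it can be worth checking the listing before going further. The sketch below groups file names by sample and reports any sample lacking one of its two reads. The regular expression assumes the `<sample>_<1|2>.fq[.gz]` naming seen in the listing, and `unpaired_samples` is a hypothetical helper.

```python
import re
from collections import defaultdict

# <sample>_<read>.fq or <sample>_<read>.fq.gz, e.g. 1HL_10A_1.fq.gz
PAIR_RE = re.compile(r"^(?P<sample>.+)_(?P<read>[12])\.fq(\.gz)?$")


def unpaired_samples(filenames):
    """Return the sorted sample names that lack a _1 or a _2 file."""
    reads = defaultdict(set)
    for name in filenames:
        # The notebook's ls output marks executables with a trailing '*'.
        match = PAIR_RE.match(name.rstrip("*"))
        if match:
            reads[match.group("sample")].add(match.group("read"))
    return sorted(s for s, found in reads.items() if found != {"1", "2"})


listing = ["1HL_10A_1.fq.gz*", "1HL_10A_2.fq.gz*", "1NF_2A_1.fq.gz*"]
print(unpaired_samples(listing))
```

In practice the list would come from `os.listdir("fastq")` rather than being typed in by hand.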
cd ..
/Users/Sam/oly_pyrad
cp /Users/Sam/Downloads/params.txt .
cat params.txt
==** parameter inputs for pyRAD version 3.0.66 **======================== affected step ==
./                        ## 1. Working directory (all)
                          ## 2. Loc. of non-demultiplexed files (if not line 18) (s1)
                          ## 3. Loc. of barcode file (if not line 18) (s1)
/usr/local/bin/vsearch-1.11.1/bin/vsearch  ## 4. command (or path) to call vsearch (or usearch) (s3,s6)
/usr/local/bin/muscle3.8.31_i86darwin64  ## 5. command (or path) to call muscle (s3,s7)
CWGC                      ## 6. Restriction overhang (e.g., C|TGCAG -> TGCAG) (s1,s2)
16                        ## 7. N processors (parallel) (all)
6                         ## 8. Mindepth: min coverage for a cluster (s4,s5)
4                         ## 9. NQual: max # sites with qual < 20 (or see line 20)(s2)
.88                       ## 10. Wclust: clustering threshold as a decimal (s3,s6)
merged                    ## 11. Datatype: rad,gbs,pairgbs,pairddrad,(others:see docs)(all)
4                         ## 12. MinCov: min samples in a final locus (s7)
3                         ## 13. MaxSH: max inds with shared hetero site (s7)
oly_gbs_pyrad             ## 14. Prefix name for final output (no spaces) (s7)
==== optional params below this line =================================== affected step ==
                          ## 15.opt.: select subset (prefix* only selector) (s2-s7)
                          ## 16.opt.: add-on (outgroup) taxa (list or prefix*) (s6,s7)
                          ## 17.opt.: exclude taxa (list or prefix*) (s7)
/Users/Sam/oly_pyrad/analysis_pyrad/fastq/*.assembled.fastq  ## 18.opt.: loc. of de-multiplexed data (s2)
                          ## 19.opt.: maxM: N mismatches in barcodes (def= 1) (s1)
                          ## 20.opt.: phred Qscore offset (def= 33) (s2)
2                         ## 21.opt.: filter: def=0=NQual 1=NQual+adapters. 2=strict (s2)
                          ## 22.opt.: a priori E,H (def= 0.001,0.01, if not estimated) (s5)
4                         ## 23.opt.: maxN: max Ns in a cons seq (def=5) (s5)
8                         ## 24.opt.: maxH: max heterozyg. sites in cons seq (def=5) (s5)
                          ## 25.opt.: ploidy: max alleles in cons seq (def=2;see docs) (s4,s5)
                          ## 26.opt.: maxSNPs: (def=100). Paired (def=100,100) (s7)
                          ## 27.opt.: maxIndels: within-clust,across-clust (def. 3,99) (s3,s7)
                          ## 28.opt.: random number seed (def. 112233) (s3,s6,s7)
                          ## 29.opt.: trim overhang left,right on final loci, def(0,0) (s7)
*                         ## 30.opt.: output formats: p,n,a,s,v,u,t,m,k,g,* (see docs) (s7)
                          ## 31.opt.: maj. base call at depth>x<mindepth (def.x=mindepth) (s5)
50                        ## 32.opt.: keep trimmed reads (def=0). Enter min length. (s2)
                          ## 33.opt.: max stack size (int), def= max(500,mean+2*SD) (s3)
                          ## 34.opt.: minDerep: exclude dereps with <= N copies, def=1 (s3)
                          ## 35.opt.: use hierarchical clustering (def.=0, 1=yes) (s6)
                          ## 36.opt.: repeat masking (def.=1='dust' method, 0=no) (s3,s6)
16                        ## 37.opt.: vsearch max threads per job (def.=6; see docs) (s3,s6)
==== optional: list group/clade assignments below this line (see docs) ==================
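This params.txt asks for 16 processors (line 7) and up to 16 vsearch threads per job (line 37), while the hardware report near the top of the notebook lists 4 cores. The sketch below estimates how far such a request exceeds the available CPUs. It assumes the worst case is simply jobs times threads, which may overstate what pyRAD actually launches, and `oversubscription` is a hypothetical helper.

```python
import multiprocessing


def oversubscription(n_processors, threads_per_job, logical_cpus=None):
    """Ratio of worst-case requested threads to available logical CPUs."""
    if logical_cpus is None:
        logical_cpus = multiprocessing.cpu_count()
    return (n_processors * threads_per_job) / float(logical_cpus)


# Values from params.txt (lines 7 and 37) and the 4 cores in the hardware report.
ratio = oversubscription(16, 16, logical_cpus=4)
print("worst-case threads per core: %.1f" % ratio)
```

A ratio well above 1 does not have to break the run, but it is a reasonable first thing to lower if step 3 hits CPU or memory limits.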
%%bash
time gunzip analysis_pyrad/fastq/*.gz
real 21m41.187s user 4m30.902s sys 0m28.895s
ls analysis_pyrad/fastq/
1HL_10A_1.fq* 1HL_2A_2.fq* 1HL_5A_1.fq* 1HL_7A_2.fq* 1NF_10A_1.fq* 1NF_2A_2.fq* 1NF_6A_1.fq* 1NF_8A_2.fq* 1SN_1A_1.fq* 1SN_3A_2.fq* 1SN_6A_1.fq* 1SN_8A_2.fq* 1HL_10A_2.fq* 1HL_3A_1.fq* 1HL_5A_2.fq* 1HL_8A_1.fq* 1NF_10A_2.fq* 1NF_4A_1.fq* 1NF_6A_2.fq* 1NF_9A_1.fq* 1SN_1A_2.fq* 1SN_4A_1.fq* 1SN_6A_2.fq* 1SN_9A_1.fq* 1HL_1A_1.fq* 1HL_3A_2.fq* 1HL_6A_1.fq* 1HL_8A_2.fq* 1NF_1A_1.fq* 1NF_4A_2.fq* 1NF_7A_1.fq* 1NF_9A_2.fq* 1SN_2A_1.fq* 1SN_4A_2.fq* 1SN_7A_1.fq* 1SN_9A_2.fq* 1HL_1A_2.fq* 1HL_4A_1.fq* 1HL_6A_2.fq* 1HL_9A_1.fq* 1NF_1A_2.fq* 1NF_5A_1.fq* 1NF_7A_2.fq* 1SN_10A_1.fq* 1SN_2A_2.fq* 1SN_5A_1.fq* 1SN_7A_2.fq* 1HL_2A_1.fq* 1HL_4A_2.fq* 1HL_7A_1.fq* 1HL_9A_2.fq* 1NF_2A_1.fq* 1NF_5A_2.fq* 1NF_8A_1.fq* 1SN_10A_2.fq* 1SN_3A_1.fq* 1SN_5A_2.fq* 1SN_8A_1.fq*
cd ./oly_pyrad/
/Users/Sam/oly_pyrad
ls
analysis_pyrad/ params.txt pear.log
%%bash
rm pear.log
ls
analysis_pyrad/ params.txt
%%bash
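# Merge each read pair with PEAR; stdout and stderr are appended to pear.log.
#   -n  minimum length of an assembled read
#   -t  minimum read length after quality trimming
#   -q  quality score threshold for trimming
#   -j  number of threads
time for gfile in analysis_pyrad/fastq/*_1.fq
do
    pear -f "$gfile" \
         -r "${gfile/_1.fq/_2.fq}" \
         -o "${gfile/_1.fq/}" \
         -n 33 \
         -t 33 \
         -q 10 \
         -j 20 >> pear.log 2>&1
done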
real 51m28.614s user 310m59.181s sys 1m40.621s
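The loop above derives the reverse-read file and the output prefix from each forward-read file name. The same bookkeeping can be sketched in Python, which makes the derived names easy to inspect before anything is run. `pear_command` and `run_all` are hypothetical helpers; the PEAR flags are the ones used in the loop, and `run_all` is defined but not called here.

```python
import glob
import os
import subprocess

SUFFIX = "_1.fq"


def pear_command(forward, threads=20):
    """Build the PEAR argument list for one forward-read file."""
    if not forward.endswith(SUFFIX):
        raise ValueError("expected a file ending in %s: %s" % (SUFFIX, forward))
    stem = forward[: -len(SUFFIX)]  # e.g. analysis_pyrad/fastq/1HL_10A
    return ["pear", "-f", forward, "-r", stem + "_2.fq", "-o", stem,
            "-n", "33", "-t", "33", "-q", "10", "-j", str(threads)]


def run_all(fastq_dir, log_path="pear.log"):
    """Run PEAR on every *_1.fq file, appending all output to one log."""
    with open(log_path, "ab") as log:
        for forward in sorted(glob.glob(os.path.join(fastq_dir, "*" + SUFFIX))):
            subprocess.call(pear_command(forward), stdout=log,
                            stderr=subprocess.STDOUT)


print(" ".join(pear_command("analysis_pyrad/fastq/1HL_10A_1.fq")))
```

One difference from the shell version: this sketch only strips `_1.fq` at the end of the name, whereas `${gfile/_1.fq/}` replaces the first occurrence anywhere in the path.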
%%bash
time pyrad -p params.txt -s 2
------------------------------------------------------------
pyRAD : RADseq for phylogenetics & introgression analyses
------------------------------------------------------------
sorted .fastq from /Users/Sam/oly_pyrad/analysis_pyrad/fastq/*.assembled.fastq being used
step 2: editing raw reads
.............................
real 25m24.605s
user 194m20.913s
sys 0m48.644s
pwd
u'/Users/Sam/oly_pyrad'
%%bash
cat stats/s2.rawedit.txt
sample             Nreads   passed   passed.w.trim  passed.total
1HL_10A.assembled  1820770  1391336  199641         1590977
1HL_1A.assembled   1272730  983255   135282         1118537
1HL_2A.assembled   1601871  1230522  175952         1406474
1HL_3A.assembled   1249829  959096   136975         1096071
1HL_4A.assembled   1467079  1127114  160413         1287527
1HL_5A.assembled   1108322  854805   118424         973229
1HL_6A.assembled   1120919  862095   123224         985319
1HL_7A.assembled   1555328  1126212  243812         1370024
1HL_8A.assembled   1390494  1065503  151199         1216702
1HL_9A.assembled   1602480  1226829  176983         1403812
1NF_10A.assembled  1187014  907067   134206         1041273
1NF_1A.assembled   727077   554075   84411          638486
1NF_2A.assembled   1946720  1482437  222755         1705192
1NF_4A.assembled   2025316  1544728  231490         1776218
1NF_5A.assembled   1517299  1153759  177977         1331736
1NF_6A.assembled   1356495  1038280  152972         1191252
1NF_7A.assembled   1458739  1118102  163618         1281720
1NF_8A.assembled   1526495  1163697  175955         1339652
1NF_9A.assembled   1896089  1448815  216321         1665136
1SN_10A.assembled  1461469  1127065  157990         1285055
1SN_1A.assembled   1689935  1295083  190364         1485447
1SN_2A.assembled   1748906  1352061  188942         1541003
1SN_3A.assembled   1340465  1029190  148242         1177432
1SN_4A.assembled   1918135  1462860  217735         1680595
1SN_5A.assembled   1726528  1321611  191401         1513012
1SN_6A.assembled   1920845  1478193  211244         1689437
1SN_7A.assembled   1694825  1293688  189713         1483401
1SN_8A.assembled   1855879  1421110  206644         1627754
1SN_9A.assembled   1829438  1396438  206920         1603358

Nreads = total number of reads for a sample
passed = retained reads that passed quality filtering at full length
passed.w.trim= retained reads that were trimmed due to detection of adapters
passed.total = total kept reads of sufficient length
note: you can set the option in params file to include trimmed reads of xx length.
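A quick way to compare samples in the step 2 report is the fraction of reads retained, passed.total divided by Nreads. The sketch below computes it from whitespace-delimited rows and skips the header and legend lines. It assumes one sample per line, as in the file pyRAD writes, and `retention` is a made-up helper; the two rows are copied from the report above.

```python
def retention(lines):
    """Map sample name -> passed.total / Nreads for each data row."""
    result = {}
    for line in lines:
        fields = line.split()
        # Data rows have a sample name followed by exactly four integers.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[1:]):
            continue
        nreads, passed_total = int(fields[1]), int(fields[4])
        result[fields[0]] = passed_total / float(nreads)
    return result


rows = [
    "sample Nreads passed passed.w.trim passed.total",
    "1HL_10A.assembled 1820770 1391336 199641 1590977",
    "1NF_1A.assembled 727077 554075 84411 638486",
    "Nreads = total number of reads for a sample",
]

for sample, fraction in sorted(retention(rows).items()):
    print("%s %.3f" % (sample, fraction))
```

For the full report the call would be `retention(open("stats/s2.rawedit.txt"))`.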
cat params.txt
==** parameter inputs for pyRAD version 3.0.66 **======================== affected step ==
./                        ## 1. Working directory (all)
                          ## 2. Loc. of non-demultiplexed files (if not line 18) (s1)
                          ## 3. Loc. of barcode file (if not line 18) (s1)
/usr/local/bin/vsearch-1.11.1/bin/vsearch  ## 4. command (or path) to call vsearch (or usearch) (s3,s6)
/usr/local/bin/muscle3.8.31_i86darwin64  ## 5. command (or path) to call muscle (s3,s7)
CWGC                      ## 6. Restriction overhang (e.g., C|TGCAG -> TGCAG) (s1,s2)
16                        ## 7. N processors (parallel) (all)
6                         ## 8. Mindepth: min coverage for a cluster (s4,s5)
4                         ## 9. NQual: max # sites with qual < 20 (or see line 20)(s2)
.88                       ## 10. Wclust: clustering threshold as a decimal (s3,s6)
merged                    ## 11. Datatype: rad,gbs,pairgbs,pairddrad,(others:see docs)(all)
4                         ## 12. MinCov: min samples in a final locus (s7)
3                         ## 13. MaxSH: max inds with shared hetero site (s7)
oly_gbs_pyrad             ## 14. Prefix name for final output (no spaces) (s7)
==== optional params below this line =================================== affected step ==
                          ## 15.opt.: select subset (prefix* only selector) (s2-s7)
                          ## 16.opt.: add-on (outgroup) taxa (list or prefix*) (s6,s7)
                          ## 17.opt.: exclude taxa (list or prefix*) (s7)
/Users/Sam/oly_pyrad/analysis_pyrad/fastq/*.assembled.fastq  ## 18.opt.: loc. of de-multiplexed data (s2)
                          ## 19.opt.: maxM: N mismatches in barcodes (def= 1) (s1)
                          ## 20.opt.: phred Qscore offset (def= 33) (s2)
2                         ## 21.opt.: filter: def=0=NQual 1=NQual+adapters. 2=strict (s2)
                          ## 22.opt.: a priori E,H (def= 0.001,0.01, if not estimated) (s5)
4                         ## 23.opt.: maxN: max Ns in a cons seq (def=5) (s5)
8                         ## 24.opt.: maxH: max heterozyg. sites in cons seq (def=5) (s5)
                          ## 25.opt.: ploidy: max alleles in cons seq (def=2;see docs) (s4,s5)
                          ## 26.opt.: maxSNPs: (def=100). Paired (def=100,100) (s7)
                          ## 27.opt.: maxIndels: within-clust,across-clust (def. 3,99) (s3,s7)
                          ## 28.opt.: random number seed (def. 112233) (s3,s6,s7)
                          ## 29.opt.: trim overhang left,right on final loci, def(0,0) (s7)
*                         ## 30.opt.: output formats: p,n,a,s,v,u,t,m,k,g,* (see docs) (s7)
                          ## 31.opt.: maj. base call at depth>x<mindepth (def.x=mindepth) (s5)
50                        ## 32.opt.: keep trimmed reads (def=0). Enter min length. (s2)
                          ## 33.opt.: max stack size (int), def= max(500,mean+2*SD) (s3)
                          ## 34.opt.: minDerep: exclude dereps with <= N copies, def=1 (s3)
                          ## 35.opt.: use hierarchical clustering (def.=0, 1=yes) (s6)
                          ## 36.opt.: repeat masking (def.=1='dust' method, 0=no) (s3,s6)
16                        ## 37.opt.: vsearch max threads per job (def.=6; see docs) (s3,s6)
==== optional: list group/clade assignments below this line (see docs) ==================
1HL 5 1HL*
1NF 5 1NF*
1SN 5 1SN*
%%bash
time pyrad -p params.txt -s 3
------------------------------------------------------------
pyRAD : RADseq for phylogenetics & introgression analyses
------------------------------------------------------------
de-replicating files for clustering...
step 3: within-sample clustering of 29 samples at '.88' similarity. Running 16 parallel jobs with up to 16 threads per job. If needed, adjust to avoid CPU and MEM limits
sample 1HL_7A finished, 93536 loci
sample 1HL_9A finished, 99179 loci
sample 1HL_2A finished, 84916 loci
sample 1NF_8A finished, 88468 loci
sample 1SN_5A finished, 106967 loci
sample 1SN_7A finished, 96438 loci
sample 1SN_1A finished, 86975 loci
sample 1HL_10A finished, 96319 loci
sample 1SN_8A finished, 93143 loci
sample 1SN_9A finished, 110001 loci
sample 1SN_2A finished, 95605 loci
sample 1SN_4A finished, 99963 loci
sample 1NF_9A finished, 94576 loci
sample 1SN_6A finished, 97635 loci
sample 1NF_2A finished, 93430 loci
sample 1NF_4A finished, 100841 loci
sample 1NF_1A finished, 54628 loci
sample 1NF_5A finished, 88013 loci
Process Worker-7:
Traceback (most recent call last):
  File "/usr/local/bioinformatics/anaconda/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "build/bdist.macosx-10.5-x86_64/egg/pyrad/potpour.py", line 31, in run
    res = self.func(*job)
  File "build/bdist.macosx-10.5-x86_64/egg/pyrad/cluster7dp.py", line 545, in final
    alignwrap(outfolder+"/"+handle.split("/")[-1].replace(".edit",".clust.gz"), mindepth, muscle, w1)
  File "build/bdist.macosx-10.5-x86_64/egg/pyrad/cluster7dp.py", line 432, in alignwrap
    keys.sort(key=lambda x:int(x.split(";")[1].replace("size=","")), reverse=True)
  File "build/bdist.macosx-10.5-x86_64/egg/pyrad/cluster7dp.py", line 432, in <lambda>
    keys.sort(key=lambda x:int(x.split(";")[1].replace("size=","")), reverse=True)
IndexError: list index out of range
sample 1HL_5A finished, 78327 loci
sample 1HL_3A finished, 96685 loci
sample 1HL_6A finished, 77553 loci
sample 1NF_7A finished, 85566 loci
sample 1SN_10A finished, 94779 loci
sample 1HL_1A finished, 81103 loci
sample 1HL_8A finished, 80377 loci
sample 1NF_6A finished, 83802 loci
sample 1HL_4A finished, 84504 loci
sample 1SN_3A finished, 106541 loci
real 250m0.701s
user 1679m50.366s
sys 194m22.084s
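The traceback in the step 3 log points at a sort key that takes the second `;`-separated field of a sequence header and reads the number after `size=`. The sketch below reproduces that failure mode with made-up headers and shows a tolerant variant, plus a small helper that lists which samples reported "finished" in a log. The log does not say which header or sample triggered the error. The header strings and the names `size_key`, `safe_size_key` and `finished_samples` are all illustrative assumptions, not pyRAD code.

```python
import re


def size_key(header):
    """Same logic as the key in the traceback: int after 'size=' in field 2."""
    return int(header.split(";")[1].replace("size=", ""))


def safe_size_key(header):
    """Tolerant variant: headers without a usable size field sort as 0."""
    for field in header.split(";"):
        if field.startswith("size="):
            try:
                return int(field[len("size="):])
            except ValueError:
                return 0
    return 0


def finished_samples(log_text):
    """Sample names that appear in 'sample <name> finished' log entries."""
    return set(re.findall(r"sample (\S+) finished", log_text))


# Made-up headers: the last one has no ';size=' field at all.
headers = [">read_2;size=3;", ">read_1;size=12;", ">read_3"]

try:
    sorted(headers, key=size_key, reverse=True)
except IndexError as err:
    print("original key fails: %s" % err)

print(sorted(headers, key=safe_size_key, reverse=True))
```

In the log above, 28 of the 29 samples report "finished" and 1NF_10A is not among them. That suggests, but does not prove, that the crashed worker was handling that sample.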
cat stats/s3.clusters.txt
taxa               total   dpt.me dpt.sd  d>5.tot  d>5.me  d>5.sd   badpairs

## total = total number of clusters, including singletons
## dpt.me = mean depth of clusters
## dpt.sd = standard deviation of cluster depth
## >N.tot = number of clusters with depth greater than N
## >N.me = mean depth of clusters with depth greater than N
## >N.sd = standard deviation of cluster depth for clusters with depth greater than N
## badpairs = mismatched 1st & 2nd reads (only for paired ddRAD data)

HISTOGRAMS

taxa               total   dpt.me dpt.sd  d>5.tot  d>5.me  d>5.sd   badpairs
1HL_10A.assembled  96319   8.175  26.643  27546    24.034  46.113   0
1HL_1A.assembled   81103   6.915  32.77   21685    20.681  61.266   0
1HL_2A.assembled   84916   8.135  26.405  25225    22.924  45.083   0
1HL_3A.assembled   96685   5.794  21.024  21927    19.527  41.239   0
1HL_4A.assembled   84504   7.699  36.772  24572    21.916  66.042   0
1HL_5A.assembled   78327   6.242  24.341  19956    18.949  45.874   0
1HL_6A.assembled   77553   6.448  23.051  19925    19.647  42.773   0
1HL_7A.assembled   93536   7.511  35.323  26658    21.716  63.97    0
1HL_8A.assembled   80377   7.702  29.826  22318    22.847  53.688   0
1HL_9A.assembled   99179   7.095  27.517  25142    22.827  51.495   0
1NF_1A.assembled   54628   5.986  16.562  13499    18.349  30.043   0
1NF_2A.assembled   93430   8.834  32.408  28560    24.709  55.405   0
1NF_4A.assembled   100841  8.704  35.853  30880    24.282  62.006   0
1NF_5A.assembled   88013   7.512  25.075  25169    21.616  43.778   0
1NF_6A.assembled   83802   7.107  30.616  22252    21.568  56.934   0
1NF_7A.assembled   85566   7.348  25.729  23063    22.143  46.393   0
1NF_8A.assembled   88468   7.492  24.994  24030    22.604  44.525   0
1NF_9A.assembled   94576   8.75   42.972  28263    24.948  76.169   0
1SN_10A.assembled  94779   6.814  26.835  23903    21.522  50.616   0
1SN_1A.assembled   86975   8.342  31.219  25283    24.13   54.753   0
1SN_2A.assembled   95605   7.934  37.881  26414    23.834  69.577   0
1SN_3A.assembled   106541  5.684  24.182  22272    20.421  50.179   0
1SN_4A.assembled   99963   8.253  30.238  29839    23.322  52.308   0
1SN_5A.assembled   106967  6.996  22.104  28188    21.451  39.58    0
1SN_6A.assembled   97635   8.526  47.897  29140    24.224  85.627   0
1SN_7A.assembled   96438   7.718  29.465  27143    22.727  52.608   0
1SN_8A.assembled   93143   8.68   89.174  28977    23.765  158.831  0
1SN_9A.assembled   110001  7.246  27.658  28911    22.59   50.867   0

## total = total number of clusters, including singletons
## dpt.me = mean depth of clusters
## dpt.sd = standard deviation of cluster depth
## >N.tot = number of clusters with depth greater than N
## >N.me = mean depth of clusters with depth greater than N
## >N.sd = standard deviation of cluster depth for clusters with depth greater than N
## badpairs = mismatched 1st & 2nd reads (only for paired ddRAD data)

HISTOGRAMS

sample: 1HL_10A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 65252
5 ***** 12231
10 *** 5823
15 ** 3322
20 ** 2177
25 * 1510
30 * 1156
35 * 841
40 * 1380
50 ** 2101
100 * 401
250 * 83
500 * 42

sample: 1HL_1A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 56029
5 ***** 11364
10 *** 4582
15 ** 2526
20 ** 1641
25 * 1178
30 * 969
35 * 746
40 * 932
50 * 850
100 * 216
250 * 49
500 * 21

sample: 1HL_2A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 56361
5 ***** 11825
10 *** 5506
15 ** 2800
20 ** 1841
25 * 1380
30 * 995
35 * 841
40 * 1265
50 ** 1652
100 * 344
250 * 69
500 * 37

sample: 1HL_3A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 *********************** 71269
5 ***** 11612
10 *** 4992
15 ** 2525
20 ** 1697
25 * 1190
30 * 926
35 * 666
40 * 860
50 * 679
100 * 202
250 * 46
500 * 21

sample: 1HL_4A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 56535
5 ***** 11765
10 *** 5419
15 ** 2958
20 ** 1960
25 * 1304
30 * 996
35 * 843
40 * 1153
50 * 1210
100 * 265
250 * 67
500 * 29

sample: 1HL_5A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 55099
5 ***** 11043
10 *** 4350
15 ** 2286
20 ** 1510
25 * 1202
30 * 855
35 * 582
40 * 626
50 * 557
100 * 165
250 * 35
500 * 17

sample: 1HL_6A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 54404
5 ***** 10719
10 *** 4149
15 ** 2356
20 ** 1592
25 * 1164
30 * 940
35 * 717
40 * 706
50 * 562
100 * 187
250 * 40
500 * 17

sample: 1HL_7A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 63246
5 ***** 12604
10 *** 6017
15 ** 3462
20 ** 2115
25 * 1426
30 * 1123
35 * 763
40 * 1168
50 * 1221
100 * 291
250 * 66
500 * 34

sample: 1HL_8A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 54804
5 ***** 11048
10 *** 4511
15 ** 2450
20 ** 1718
25 * 1250
30 * 981
35 * 767
40 * 1114
50 ** 1356
100 * 293
250 * 52
500 * 33

sample: 1HL_9A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 70776
5 **** 11563
10 *** 5531
15 ** 2890
20 ** 1852
25 * 1415
30 * 1103
35 * 793
40 * 1225
50 * 1596
100 * 340
250 * 68
500 * 27

sample: 1NF_1A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 38814
5 ***** 7149
10 *** 3021
15 ** 1863
20 ** 1303
25 * 877
30 * 527
35 * 316
40 * 301
50 * 308
100 * 119
250 * 21
500 * 9

sample: 1NF_2A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 61356
5 ***** 12215
10 *** 6132
15 ** 3496
20 ** 2281
25 * 1523
30 * 1188
35 * 938
40 * 1415
50 ** 2323
100 * 425
250 * 97
500 * 41

sample: 1NF_4A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 66278
5 ***** 12991
10 *** 6516
15 ** 4062
20 ** 2727
25 ** 1821
30 * 1243
35 * 946
40 * 1459
50 ** 2239
100 * 421
250 * 85
500 * 53

sample: 1NF_5A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 59287
5 ***** 12241
10 *** 5483
15 ** 3067
20 ** 1895
25 * 1339
30 * 1045
35 * 846
40 * 1214
50 * 1220
100 * 281
250 * 63
500 * 32

sample: 1NF_6A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 58114
5 ***** 11484
10 *** 4719
15 ** 2537
20 ** 1638
25 * 1212
30 * 948
35 * 778
40 * 1065
50 * 992
100 * 227
250 * 55
500 * 33

sample: 1NF_7A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 58873
5 ***** 11822
10 *** 4784
15 ** 2579
20 ** 1680
25 * 1282
30 * 971
35 * 756
40 * 1172
50 * 1287
100 * 267
250 * 59
500 * 34

sample: 1NF_8A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 60973
5 ***** 11764
10 *** 5081
15 ** 2592
20 ** 1713
25 * 1323
30 * 1059
35 * 811
40 * 1250
50 ** 1510
100 * 296
250 * 66
500 * 30

sample: 1NF_9A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 62779
5 ***** 12604
10 *** 5809
15 ** 3317
20 ** 2191
25 * 1554
30 * 1170
35 * 873
40 * 1362
50 ** 2357
100 * 433
250 * 75
500 * 52

sample: 1SN_10A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 67109
5 ***** 12369
10 *** 5164
15 ** 2708
20 ** 1690
25 * 1295
30 * 976
35 * 800
40 * 1131
50 * 1186
100 * 266
250 * 59
500 * 26

sample: 1SN_1A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************* 58247
5 ***** 11888
10 *** 5228
15 ** 2764
20 ** 1725
25 * 1419
30 * 1100
35 * 829
40 * 1331
50 ** 2002
100 * 326
250 * 78
500 * 38

sample: 1SN_2A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ********************** 65452
5 ***** 12751
10 *** 5595
15 ** 2993
20 ** 1862
25 * 1320
30 * 1024
35 * 878
40 * 1292
50 ** 1962
100 * 342
250 * 95
500 * 39

sample: 1SN_3A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 ************************ 80441
5 **** 12267
10 ** 4617
15 ** 2503
20 * 1672
25 * 1247
30 * 979
35 * 755
40 * 922
50 * 841
100 * 226
250 * 53
500 * 18

sample: 1SN_4A.assembled
bins depth_histogram cnts
: 0------------50-------------100%
0 
********************* 66393 5 ***** 13090 10 *** 6546 15 ** 3852 20 ** 2395 25 * 1627 30 * 1198 35 * 941 40 * 1400 50 ** 1996 100 * 399 250 * 80 500 * 46 sample: 1SN_5A.assembled bins depth_histogram cnts : 0------------50-------------100% 0 ********************** 74908 5 ***** 13523 10 *** 6269 15 ** 3448 20 ** 2200 25 * 1485 30 * 1106 35 * 839 40 * 1279 50 * 1488 100 * 310 250 * 82 500 * 30 sample: 1SN_6A.assembled bins depth_histogram cnts : 0------------50-------------100% 0 ********************* 64836 5 ***** 12907 10 *** 6379 15 ** 3500 20 ** 2255 25 * 1567 30 * 1095 35 * 1008 40 * 1394 50 ** 2180 100 * 379 250 * 92 500 * 43 sample: 1SN_7A.assembled bins depth_histogram cnts : 0------------50-------------100% 0 ********************* 65621 5 ***** 12628 10 *** 6010 15 ** 3312 20 ** 2150 25 * 1473 30 * 1084 35 * 875 40 * 1268 50 * 1563 100 * 339 250 * 73 500 * 42 sample: 1SN_8A.assembled bins depth_histogram cnts : 0------------50-------------100% 0 ********************* 60561 5 ***** 12830 10 *** 6261 15 ** 3761 20 ** 2392 25 ** 1617 30 * 1149 35 * 979 40 * 1374 50 ** 1745 100 * 364 250 * 70 500 * 40 sample: 1SN_9A.assembled bins depth_histogram cnts : 0------------50-------------100% 0 ********************** 77269 5 ***** 13128 10 *** 6562 15 ** 3704 20 ** 2318 25 * 1550 30 * 1102 35 * 878 40 * 1302 50 * 1715 100 * 357 250 * 72 500 * 44
%%bash
time pyrad -p params.txt -s 4
------------------------------------------------------------
 pyRAD : RADseq for phylogenetics & introgression analyses
------------------------------------------------------------

step 4: estimating error rate and heterozygosity
.............................

real 471m18.216s
user 3176m23.286s
sys 3m33.770s
%%bash
time pyRAD -p params.txt -s 5
------------------------------------------------------------
 pyRAD : RADseq for phylogenetics & introgression analyses
------------------------------------------------------------

step 5: creating consensus seqs for 29 samples, using H=0.01108 E=0.00184
.............................

real 150m47.055s
user 1150m47.513s
sys 1m8.497s
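The `time` figures for steps 4 and 5 can be turned into a rough measure of how many cores were kept busy, by dividing CPU time (user + sys) by wall-clock time (real). The sketch below does that arithmetic with the values printed above; the helper names `to_seconds` and `busy_cores` are made up for this illustration and are not part of pyRAD.

```python
from __future__ import division, print_function
import re


def to_seconds(stamp):
    """Convert a bash `time` stamp such as '471m18.216s' to seconds."""
    match = re.match(r"(\d+)m([\d.]+)s$", stamp)
    if match is None:
        raise ValueError("unrecognized time stamp: %r" % stamp)
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the `time` output of steps 4 and 5 above.
timings = {
    "step 4": {"real": "471m18.216s", "user": "3176m23.286s", "sys": "3m33.770s"},
    "step 5": {"real": "150m47.055s", "user": "1150m47.513s", "sys": "1m8.497s"},
}


def busy_cores(t):
    """Average number of cores in use: (user + sys) / real."""
    cpu = to_seconds(t["user"]) + to_seconds(t["sys"])
    return cpu / to_seconds(t["real"])


for step in sorted(timings):
    print(step, "average busy cores: %.1f" % busy_cores(timings[step]))
```

A ratio above 4 on this machine (4 physical cores, reported as 8 by vsearch) would suggest that the hyper-threaded logical cores were in use for most of both steps.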
%%bash
time pyRAD -p params.txt -s 6
vsearch v1.11.1_osx_x86_64, 16.0GB RAM, 8 cores
https://github.com/torognes/vsearch

finished clustering
------------------------------------------------------------
 pyRAD : RADseq for phylogenetics & introgression analyses
------------------------------------------------------------

step 6: clustering across 29 samples at '.88' similarity

Reading file /Users/Sam/oly_pyrad/clust.88/cat.haplos_ 100%
62229589 nt in 556070 seqs, min 32, max 245, avg 112
Counting unique k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 71397 Size min 1, max 65, avg 7.8
Singletons: 18100, 3.3% of seqs, 25.4% of clusters

real 11m41.582s
user 15m13.421s
sys 0m10.647s
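The summary figures vsearch prints for step 6 follow directly from three counts: sequences, clusters, and singletons. As a quick sanity check, the sketch below recomputes the average cluster size and the two singleton percentages from the numbers in the output above; the variable names are chosen for this example only.

```python
from __future__ import division, print_function

# Counts copied from the vsearch output of step 6 above.
seqs = 556070
clusters = 71397
singletons = 18100

avg_cluster_size = seqs / clusters                # reported as "avg 7.8"
pct_of_seqs = 100 * singletons / seqs             # reported as "3.3% of seqs"
pct_of_clusters = 100 * singletons / clusters     # reported as "25.4% of clusters"

print("average cluster size: %.1f" % avg_cluster_size)
print("singletons: %.1f%% of seqs, %.1f%% of clusters"
      % (pct_of_seqs, pct_of_clusters))
```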
%%bash
time pyRAD -p params.txt -s 7
ingroup 1HL_10A.assembled,1HL_1A.assembled,1HL_2A.assembled,1HL_3A.assembled,1HL_4A.assembled,1HL_5A.assembled,1HL_6A.assembled,1HL_7A.assembled,1HL_8A.assembled,1HL_9A.assembled,1NF_10A.assembled,1NF_1A.assembled,1NF_2A.assembled,1NF_4A.assembled,1NF_5A.assembled,1NF_6A.assembled,1NF_7A.assembled,1NF_8A.assembled,1NF_9A.assembled,1SN_10A.assembled,1SN_1A.assembled,1SN_2A.assembled,1SN_3A.assembled,1SN_4A.assembled,1SN_5A.assembled,1SN_6A.assembled,1SN_7A.assembled,1SN_8A.assembled,1SN_9A.assembled
addon
exclude

final stats written to:
 /Users/Sam/oly_pyrad/stats/oly_gbs_pyrad.stats
output files being written to:
 /Users/Sam/oly_pyrad/outfiles/ directory

filtering & writing to phylip file
writing nexus file
Writing gphocs file
  + writing full SNPs file
  + writing unlinked bi-allelic SNPs file
  + writing STRUCTURE file
  + writing geno file
  + writing treemix file
data set reduced for group coverage minimums
1HL ['1HL_10A.assembled', '1HL_1A.assembled', '1HL_2A.assembled', '1HL_3A.assembled', '1HL_4A.assembled', '1HL_5A.assembled', '1HL_6A.assembled', '1HL_7A.assembled', '1HL_8A.assembled', '1HL_9A.assembled'] minimum= 5
1NF ['1NF_10A.assembled', '1NF_1A.assembled', '1NF_2A.assembled', '1NF_4A.assembled', '1NF_5A.assembled', '1NF_6A.assembled', '1NF_7A.assembled', '1NF_8A.assembled', '1NF_9A.assembled'] minimum= 5
1SN ['1SN_10A.assembled', '1SN_1A.assembled', '1SN_2A.assembled', '1SN_3A.assembled', '1SN_4A.assembled', '1SN_5A.assembled', '1SN_6A.assembled', '1SN_7A.assembled', '1SN_8A.assembled', '1SN_9A.assembled'] minimum= 5
writing vcf file
writing migrate-n file
data set reduced for group coverage minimums
1HL ['1HL_10A.assembled', '1HL_1A.assembled', '1HL_2A.assembled', '1HL_3A.assembled', '1HL_4A.assembled', '1HL_5A.assembled', '1HL_6A.assembled', '1HL_7A.assembled', '1HL_8A.assembled', '1HL_9A.assembled'] minimum= 5
1NF ['1NF_10A.assembled', '1NF_1A.assembled', '1NF_2A.assembled', '1NF_4A.assembled', '1NF_5A.assembled', '1NF_6A.assembled', '1NF_7A.assembled', '1NF_8A.assembled', '1NF_9A.assembled'] minimum= 5
1SN ['1SN_10A.assembled', '1SN_1A.assembled', '1SN_2A.assembled', '1SN_3A.assembled', '1SN_4A.assembled', '1SN_5A.assembled', '1SN_6A.assembled', '1SN_7A.assembled', '1SN_8A.assembled', '1SN_9A.assembled'] minimum= 5
------------------------------------------------------------
 pyRAD : RADseq for phylogenetics & introgression analyses
------------------------------------------------------------

Cluster input file: using /Users/Sam/oly_pyrad/clust.88/cat.clust_.gz
........Traceback (most recent call last):
  File "/usr/local/bioinformatics/anaconda/bin/pyRAD", line 9, in <module>
    load_entry_point('pyrad==3.0.66', 'console_scripts', 'pyrad')()
  File "build/bdist.macosx-10.5-x86_64/egg/pyrad/pyRAD.py", line 527, in main
  File "build/bdist.macosx-10.5-x86_64/egg/pyrad/alignable.py", line 1013, in main
  File "build/bdist.macosx-10.5-x86_64/egg/pyrad/loci2mig.py", line 75, in make
IndexError: list index out of range

real 8m51.355s
user 31m44.363s
sys 4m53.344s
cat stats/oly_gbs_pyrad.stats
35373       ## loci with > minsp containing data
26281       ## loci with > minsp containing data & paralogs removed
26281       ## loci with > minsp containing data & paralogs removed & final filtering

## number of loci recovered in final data set for each taxon.
taxon               nloci
1HL_10A.assembled   10633
1HL_1A.assembled    8091
1HL_2A.assembled    9825
1HL_3A.assembled    8641
1HL_4A.assembled    9753
1HL_5A.assembled    7550
1HL_6A.assembled    7099
1HL_7A.assembled    10707
1HL_8A.assembled    8555
1HL_9A.assembled    9876
1NF_10A.assembled   6648
1NF_1A.assembled    5081
1NF_2A.assembled    10995
1NF_4A.assembled    12148
1NF_5A.assembled    9897
1NF_6A.assembled    8795
1NF_7A.assembled    9020
1NF_8A.assembled    8517
1NF_9A.assembled    11097
1SN_10A.assembled   9426
1SN_1A.assembled    9878
1SN_2A.assembled    10092
1SN_3A.assembled    8229
1SN_4A.assembled    12028
1SN_5A.assembled    11164
1SN_6A.assembled    11243
1SN_7A.assembled    10620
1SN_8A.assembled    11504
1SN_9A.assembled    11687

## nloci = number of loci with data for exactly ntaxa
## ntotal = number of loci for which at least ntaxa have data
ntaxa   nloci   saved   ntotal
1       -
2       -               -
3       -               -
4       4023    *       26281
5       3130    *       22258
6       2433    *       19128
7       2093    *       16695
8       1756    *       14602
9       1461    *       12846
10      1392    *       11385
11      1170    *       9993
12      957     *       8823
13      817     *       7866
14      812     *       7049
15      651     *       6237
16      624     *       5586
17      569     *       4962
18      534     *       4393
19      499     *       3859
20      430     *       3360
21      379     *       2930
22      328     *       2551
23      327     *       2223
24      309     *       1896
25      300     *       1587
26      284     *       1287
27      266     *       1003
28      318     *       737
29      419     *       419

## nvar = number of loci containing n variable sites (pis+autapomorphies).
## sumvar = sum of variable sites (SNPs).
## pis = number of loci containing n parsimony informative sites.
## sumpis = sum of parsimony informative sites.
        nvar    sumvar  PIS     sumPIS
0       11292   0       15943   0
1       3989    3989    3613    3613
2       2426    8841    2024    7661
3       1748    14085   1404    11873
4       1285    19225   974     15769
5       1104    24745   694     19239
6       909     30199   531     22425
7       734     35337   352     24889
8       599     40129   230     26729
9       487     44512   164     28205
10      360     48112   123     29435
11      292     51324   62      30117
12      199     53712   60      30837
13      199     56299   45      31422
14      154     58455   21      31716
15      91      59820   17      31971
16      101     61436   9       32115
17      71      62643   6       32217
18      49      63525   2       32253
19      43      64342   0       32253
20      34      65022   1       32273
21      22      65484   0       32273
22      28      66100   1       32295
23      10      66330   1       32318
24      19      66786   1       32342
25      2       66836   2       32392
26      2       66888   0       32392
27      6       67050   1       32419
28      3       67134   0       32419
29      3       67221   0       32419
30      3       67311   0       32419
31      4       67435   0       32419
32      1       67467   0       32419
33      4       67599   0       32419
34      1       67633   0       32419
35      1       67668   0       32419
36      1       67704   0       32419
37      0       67704   0       32419
38      1       67742   0       32419
39      3       67859   0       32419
40      0       67859   0       32419
41      0       67859   0       32419
42      0       67859   0       32419
43      1       67902   0       32419
total var= 67902
total pis= 32419
sampled unlinked SNPs= 14989
sampled unlinked bi-allelic SNPs= 14881
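The two cumulative columns in the stats file can be rebuilt from the per-row counts, which is a handy way to check how they are defined. Going by the legend in the file, `ntotal` should be the running sum of `nloci` from the highest `ntaxa` downward, and `sumvar` the running sum of n × `nvar`. The sketch below checks both against values copied from the output above (only the first four `nvar` rows are used); it is an illustration written for this notebook, not pyRAD code.

```python
from __future__ import division, print_function

# nloci per ntaxa (rows 4..29), copied from the stats output above.
nloci = {
    4: 4023, 5: 3130, 6: 2433, 7: 2093, 8: 1756, 9: 1461, 10: 1392,
    11: 1170, 12: 957, 13: 817, 14: 812, 15: 651, 16: 624, 17: 569,
    18: 534, 19: 499, 20: 430, 21: 379, 22: 328, 23: 327, 24: 309,
    25: 300, 26: 284, 27: 266, 28: 318, 29: 419,
}

# ntotal[k] = number of loci shared by at least k taxa:
# accumulate nloci from the largest ntaxa down to the smallest.
ntotal = {}
running = 0
for ntaxa in sorted(nloci, reverse=True):
    running += nloci[ntaxa]
    ntotal[ntaxa] = running

# First four rows of the nvar column; sumvar accumulates n * nvar[n].
nvar = [11292, 3989, 2426, 1748]
sumvar = []
running = 0
for n, count in enumerate(nvar):
    running += n * count
    sumvar.append(running)

print("loci in all 29 samples:", ntotal[29])
print("loci in at least 4 samples:", ntotal[4])
print("sumvar for n = 0..3:", sumvar)
```

With these inputs, `ntotal[4]` reproduces the 26281 loci reported at the top of the stats file.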
ls outfiles/
oly_gbs_pyrad.alleles oly_gbs_pyrad.migrate oly_gbs_pyrad.snps oly_gbs_pyrad.unlinked_snps oly_gbs_pyrad.excluded_loci oly_gbs_pyrad.nex oly_gbs_pyrad.snps.geno oly_gbs_pyrad.usnps.geno oly_gbs_pyrad.gphocs oly_gbs_pyrad.phy oly_gbs_pyrad.str oly_gbs_pyrad.vcf oly_gbs_pyrad.loci oly_gbs_pyrad.phy.partitions oly_gbs_pyrad.treemix.gz
%%bash
head -n 39 outfiles/oly_gbs_pyrad.loci | cut -c 1-75
>1HL_1A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1HL_2A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGYCACAACGTGA
>1HL_5A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1HL_6A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1HL_7A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGYCACAACGTGA
>1HL_9A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1NF_10A.assembled    ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1NF_1A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1NF_6A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1NF_7A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGCCACAACGTGA
>1SN_10A.assembled    ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1SN_2A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1SN_3A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1SN_6A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1SN_7A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
>1SN_9A.assembled     ATACAGTCAGATTGGTGCAAAGCCTTCCTGACTGCCGGGAAGTCACAACGTGA
//                                                              *
>1HL_10A.assembled    AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1HL_3A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1HL_4A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1HL_7A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1HL_8A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1NF_4A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1NF_7A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1NF_9A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1SN_10A.assembled    AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1SN_1A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTNNTCATGGGATCCAGGAAA
>1SN_2A.assembled     AAGCCAAAATATCTTGNATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1SN_4A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1SN_5A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
>1SN_8A.assembled     AAGCCAAAATATCTTGAATCTTGATGCGACCAGTCCTCATGGGATCCAGGAAA
//
>1HL_2A.assembled     NAATCGGAGGCATTTTCCCATGCAGGTTTGTCGAGTGGACAATTGTCGGAGAC
>1HL_5A.assembled     GAATCGGAGGCATTTTCCCATGCAGGTTTGTCGAGTGGACAATTGTCGGAGAC
>1HL_7A.assembled     GAATCGGAGGCATTTTCCCATGCAGGTTTGTCGAGTGGACAATTGTCGGRGAC
>1NF_2A.assembled     GAATCGGAGGCATTTTCCCATGCAGGTTTGTCGAGTGGACAATTGTCGGRGAC
>1NF_4A.assembled     GAATCGGAGGCATTTTCCCATGCAGGTTTGTCGAGTGGACAATTGTCGGAGAC
>1NF_5A.assembled     GAATCGGAGGCATTTTCCCATGCAGGTTTGTCGAGTGGACAATTGTCGGAGAC
>1NF_6A.assembled     GAATCGGAGGCATTTTCCCATGCAGGTTTGTCGAGTGGACAATTGTCGGAGAC
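Judging from the `.loci` excerpt above, each locus is a run of `>sample sequence` lines closed by a line starting with `//` (which may also carry `*` markers under variable sites). Under that assumption, the following sketch groups lines into loci and counts samples per locus. The function name `read_loci` and the toy input are invented for this example; the toy sequences are not taken from the real data.

```python
from __future__ import print_function


def read_loci(lines):
    """Yield one {sample_name: sequence} dict per locus.

    Assumes the layout seen in the excerpt: '>name  SEQUENCE' lines,
    with a line beginning '//' closing each locus.
    """
    locus = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("//"):
            if locus:
                yield locus
            locus = {}
        elif line.startswith(">"):
            name, seq = line[1:].split(None, 1)
            locus[name] = seq.strip()
    if locus:
        # Last locus had no closing '//' line (e.g. input truncated by head).
        yield locus


# Toy input in the same layout (made-up names and sequences).
toy = """\
>sampleA     ACGTY
>sampleB     ACGTC
//               *
>sampleA     TTGA
>sampleC     TTGA
>sampleD     TNGA
//
>sampleB     GGCA
""".splitlines()

loci = list(read_loci(toy))
print("samples per locus:", [len(locus) for locus in loci])
```

On the real file, the same generator could be fed an open file handle for `outfiles/oly_gbs_pyrad.loci` instead of the toy list.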
%%bash
head -1 outfiles/oly_gbs_pyrad.snps
## 29 taxa, 52562 loci, 105475 snps
%%bash
pwd
/Users/Sam
cd /Users/Sam/oly_pyrad/outfiles/
/Users/Sam/oly_pyrad/outfiles
%%bash
zcat < oly_gbs_pyrad.treemix.gz | head
1HL 1NF 1SN
20,0 15,3 18,0
13,1 8,2 7,3
12,0 10,0 12,2
10,0 4,6 14,0
14,0 9,1 12,0
9,1 12,0 14,0
17,1 16,0 16,0
20,0 17,1 19,1
14,0 12,0 15,1
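The treemix file shown above has a header of population names followed by one row per SNP, where each cell is a comma-separated pair of allele counts for that population. A minimal sketch of turning those pairs into first-allele frequencies follows, using the first three SNP rows from the output; `parse_treemix` is a hypothetical helper written for this illustration.

```python
from __future__ import division, print_function


def parse_treemix(lines):
    """Return (populations, rows); each row maps population -> (count1, count2)."""
    lines = [line.strip() for line in lines if line.strip()]
    pops = lines[0].split()
    rows = []
    for line in lines[1:]:
        pairs = [tuple(int(c) for c in cell.split(",")) for cell in line.split()]
        rows.append(dict(zip(pops, pairs)))
    return pops, rows


# Header and first three SNP rows copied from the zcat output above.
text = """\
1HL 1NF 1SN
20,0 15,3 18,0
13,1 8,2 7,3
12,0 10,0 12,2
"""

pops, rows = parse_treemix(text.splitlines())

for i, row in enumerate(rows):
    freqs = []
    for pop in pops:
        a, b = row[pop]
        # Frequency of the first allele among the allele copies sampled.
        freqs.append("%s=%.2f" % (pop, a / (a + b)))
    print("SNP %d:" % i, " ".join(freqs))
```

The sum of each pair is the number of allele copies scored in that population for that SNP, so it varies from row to row with missing data.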