docker run -p 8888:8888 -v /Users/sam/data/:/data -v /Users/sam/owl_home/:/owl_home -v /Users/sam/owl_web/:/owl_web -v /Users/sam/gitrepos:/gitrepos -it f99537d7e06a
The command allows access to Jupyter Notebook over port 8888 and makes my Jupyter Notebook GitHub repo and my data files on Owl/home and Owl/web accessible to the Docker container.
Once the container was running, I started Jupyter Notebook with the following command inside the Docker container:
jupyter notebook
This is configured in the Docker container to launch a Jupyter Notebook without a browser on port 8888.
The Docker container is running on an image created from this Dockerfile (Git commit 443bc42)
%%bash
date
Mon Feb 27 18:32:53 UTC 2017
%%bash
hostname
0f2bca9c664b
%%bash
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 26
Model name: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
Stepping: 5
CPU MHz: 2260.998
BogoMIPS: 4521.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
Hi Sam,
Your new directory is on the server "Dimond_170224", the checksum info in listed in a text file for the three files. Is there a better way to list the checksums? I'm new to this.
Also, I gave you the 2 reads and the 6bp index file.
Best,
Shana
Shana McDevitt
Director
Vincent J. Coates Genomics Sequencing Laboratory
California Institute for Quantitative Biosciences (QB3)
University of California, Berkeley
cd /data/20170227_jay_data_tmp/
/data/20170227_jay_data_tmp
Use wget to download all of the files in the target directory. Here's an explanation of the code:
time: Evaluates how long it takes for the command to complete.
WGETRC=/data/wgetrc_berk_seq: Sets the WGETRC environment variable to the path of the wgetrc_berk_seq file for this one command, so that wget reads its settings from that file. The file contains the username and password needed to FTP the data from the UC Berkeley server. Using this allows me to run the command in a Jupyter notebook without pasting the actual username and password into the command string.
-r: Recursive; i.e. download everything in this directory and in any subdirectories.
-np: No parent; i.e. do not ascend to higher directories.
-nc: No clobber; i.e. do not overwrite any existing files in the download directory.
-q: Quiet; i.e. do not print wget status to screen. This is to prevent bogging down the Jupyter notebook with thousands of output lines.
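For reference, here is a minimal sketch of what a credentials file like this could look like and how it gets picked up. The file name wgetrc_example and the EXAMPLE_USER / EXAMPLE_PASSWORD values are placeholders I made up for illustration; they are not the contents of the real wgetrc_berk_seq file. The download command is only echoed, so the sketch runs without network access.

```shell
# Hypothetical sketch: placeholder credentials, NOT the real wgetrc_berk_seq contents.
workdir=$(mktemp -d)
rcfile="$workdir/wgetrc_example"

# ftp_user and ftp_password are standard wgetrc settings.
cat > "$rcfile" <<'EOF'
ftp_user = EXAMPLE_USER
ftp_password = EXAMPLE_PASSWORD
EOF

# Restrict the file to its owner, since it holds a password.
chmod 600 "$rcfile"

# wget reads the file named by the WGETRC environment variable at startup.
# The command is only printed here rather than run.
echo "WGETRC=$rcfile wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/Dimond_170224"

# Count the settings written, then clean up.
n_settings=$(grep -c '=' "$rcfile")
rm -rf "$workdir"
```

Keeping the credentials in a separate file also means the notebook can be shared publicly without editing the command cells.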
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/Dimond_170224
real 37m7.999s user 0m3.130s sys 20m34.060s
%%bash
ls -lh
total 0
drwxr-xr-x 1 srlab staff 102 Feb 27 19:54 gslserver.qb3.berkeley.edu
cd gslserver.qb3.berkeley.edu/Dimond_170224/
/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/Dimond_170224
%%bash
ls -lh
total 41G
-rw-r--r-- 1 srlab staff 2.1G Feb 24 23:28 JD002_S0_L005_I1_001.fastq.gz
-rw-r--r-- 1 srlab staff 18G Feb 24 23:28 JD002_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 22G Feb 24 23:28 JD002_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 192 Feb 25 00:01 md5sum_report
cat md5sum_report
baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
%%bash
time for i in *.gz
do
md5sum "$i" >> checksums.md5
done
real 8m0.815s user 0m4.260s sys 4m44.580s
cat checksums.md5
baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
Visual inspection suggests that these are good to go, but we'll compare them programmatically anyway...
%%bash
diff checksums.md5 md5sum_report
No output means no differences between the two files. To verify further, we can check the exit status of diff (0 means the files are identical, 1 means they differ), which bash stores in the variable $?. Note that $? has to be read in the same shell as the diff command: each %%bash cell starts a fresh shell, so an echo $? in a separate cell prints 0 regardless of what diff returned.
%%bash
echo $?
0
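A minimal sketch of capturing the exit status of diff in the same shell that ran it. The file names and contents here are made up for illustration (they stand in for checksums.md5 and md5sum_report); the point is only where $? gets read.

```shell
# Hypothetical stand-ins for the two checksum files being compared.
workdir=$(mktemp -d)
printf 'aaa  file1\nbbb  file2\n' > "$workdir/expected"
printf 'aaa  file1\nbbb  file2\n' > "$workdir/same"
printf 'aaa  file1\nccc  file2\n' > "$workdir/changed"

# Capture the status right where diff runs, in the same shell.
status_same=0
diff "$workdir/expected" "$workdir/same" > /dev/null || status_same=$?

status_changed=0
diff "$workdir/expected" "$workdir/changed" > /dev/null || status_changed=$?

echo "identical files: $status_same"
echo "differing files: $status_changed"
rm -rf "$workdir"
```

In a notebook, this means putting the diff and the status check in a single %%bash cell.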
Jay has three different species in his sequencing data, so I'm copying the data to each of three different species folders on Owl. The code below uses the --no-clobber argument to prevent cp from overwriting any existing files in the destination directory that might have the same file name.
%%bash
time for file in *.gz
do
cp --no-clobber "$file" /owl_web/nightingales/P_generosa/
cp --no-clobber "$file" /owl_web/nightingales/Porites_spp/
cp --no-clobber "$file" /owl_web/nightingales/A_elegantissima/
done
real 130m18.473s user 0m0.020s sys 18m23.290s
%%bash
time for i in /owl_web/nightingales/P_generosa/JD002_S0_L005*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 31m19.144s user 0m3.860s sys 4m44.720s
%%bash
time for i in /owl_web/nightingales/Porites_spp/JD002_S0_L005*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 27m17.756s user 0m4.390s sys 4m44.170s
%%bash
time for i in /owl_web/nightingales/A_elegantissima/JD002_S0_L005*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 26m45.605s user 0m4.870s sys 4m43.930s
I screwed up and didn't write the temp_checksums.md5 file into the different directories on Owl. Instead, I'll create a concatenated md5sum_report file (three copies, one per species directory) that mimics the contents of the temp_checksums.md5 file.
%%bash
cat temp_checksums.md5
baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/P_generosa/JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/P_generosa/JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/P_generosa/JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/Porites_spp/JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/Porites_spp/JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/Porites_spp/JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R2_001.fastq.gz
%%bash
cat md5sum_report >> md5sum_report_cat
cat md5sum_report >> md5sum_report_cat
cat md5sum_report >> md5sum_report_cat
%%bash
cat md5sum_report_cat
baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
%%bash
diff md5sum_report_cat temp_checksums.md5
1,9c1,9
< baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
< e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
< 9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
< baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
< e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
< 9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
< baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz
< e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz
< 9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz
---
> baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/P_generosa/JD002_S0_L005_I1_001.fastq.gz
> e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/P_generosa/JD002_S0_L005_R1_001.fastq.gz
> 9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/P_generosa/JD002_S0_L005_R2_001.fastq.gz
> baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/Porites_spp/JD002_S0_L005_I1_001.fastq.gz
> e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/Porites_spp/JD002_S0_L005_R1_001.fastq.gz
> 9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/Porites_spp/JD002_S0_L005_R2_001.fastq.gz
> baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_I1_001.fastq.gz
> e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R1_001.fastq.gz
> 9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R2_001.fastq.gz
Well, I didn't take into account that the full path to each file would be written to the checksum file, so diff reports every line as different. However, the checksums themselves match on visual inspection. Will proceed with adding the checksums to the checksum files in each Owl directory.
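One way to sidestep the path mismatch is to let md5sum do the comparison itself: md5sum -c reads a checksum list and re-checks each named file, and running it from inside the destination directory makes the bare file names in the list resolve to the copies. This is a sketch under assumptions, not what I ran above; the directories and sample_R*.fastq files are made up, and --quiet is a GNU coreutils option.

```shell
# Hypothetical source and destination directories with tiny stand-in files.
workdir=$(mktemp -d)
mkdir "$workdir/source" "$workdir/dest"
printf 'read one\n' > "$workdir/source/sample_R1.fastq"
printf 'read two\n' > "$workdir/source/sample_R2.fastq"

# Checksum list with bare file names, like md5sum_report.
(cd "$workdir/source" && md5sum sample_R*.fastq > md5sum_report)

# Copy the data files to the destination.
cp "$workdir/source"/sample_R*.fastq "$workdir/dest/"

# Verify the copies: bare names in the list resolve inside dest.
# --quiet suppresses the per-file OK lines; failures are still reported.
verify_status=0
(cd "$workdir/dest" && md5sum -c --quiet "$workdir/source/md5sum_report") || verify_status=$?

echo "verification exit status: $verify_status"
rm -rf "$workdir"
```

An exit status of 0 means every file listed in the report matched its checksum, with no visual comparison needed.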
%%bash
cat checksums.md5sums.md5sum_report >> /owl_web/nightingales/P_generosa/checksums.md5
cat md5sum_report >> /owl_web/nightingales/Porites_spp/checksums.md5
cat md5sum_report >> /owl_web/nightingales/A_elegantissima/checksums.md5
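Whoops! Typo in that first line above! The other two cat commands in that cell still ran, so only the P_generosa line needs to be redone. Fixed below: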
%%bash
cat md5sum_report >> /owl_web/nightingales/P_generosa/checksums.md5
%%bash
ls -lh
total 41G
-rw-r--r-- 1 srlab staff 2.1G Feb 24 23:28 JD002_S0_L005_I1_001.fastq.gz
-rw-r--r-- 1 srlab staff 18G Feb 24 23:28 JD002_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 22G Feb 24 23:28 JD002_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 192 Feb 27 20:44 checksums.md5
-rw-r--r-- 1 srlab staff 192 Feb 25 00:01 md5sum_report
-rw-r--r-- 1 srlab staff 576 Feb 28 01:13 md5sum_report_cat
-rw-r--r-- 1 srlab staff 891 Feb 28 01:13 temp_checksums.md5
rm -rf /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/
%%bash
ls -lh /data/20170227_jay_data_tmp/
total 0
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
cd /data/20170227_jay_data_tmp/
/data/20170227_jay_data_tmp
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts
real 41m4.943s user 0m3.130s sys 17m12.850s
%%bash
ls -lh /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts/
total 34G -rw-r--r-- 1 srlab staff 1.1G Feb 21 23:13 JD002A_S131_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002A_S131_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002B_S132_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.6G Feb 21 23:13 JD002B_S132_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.2G Feb 21 23:13 JD002C_S133_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002C_S133_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002D_S134_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.7G Feb 21 23:13 JD002D_S134_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002E_S135_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002E_S135_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002F_S136_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.9G Feb 21 23:13 JD002F_S136_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.1G Feb 21 23:13 JD002G_S137_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002G_S137_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002H_S138_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002H_S138_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.2G Feb 21 23:13 JD002I_S139_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.6G Feb 21 23:13 JD002I_S139_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.2G Feb 21 23:13 JD002J_S140_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002J_S140_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002K_S141_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002K_S141_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002L_S142_L005_R1_001.fastq.gz -rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002L_S142_L005_R2_001.fastq.gz -rw-r--r-- 1 srlab staff 1.6K Feb 27 19:11 demultiplexed_checksums
cd /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts/
/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts
%%bash
cat demultiplexed_checksums
c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz
%%bash
time for i in *.gz
do
md5sum "$i" >> checksums.md5
done
real 7m0.125s user 0m6.260s sys 3m52.740s
%%bash
cat checksums.md5
c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz
%%bash
diff demultiplexed_checksums checksums.md5
%%bash
echo $?
0
Jay has three different species in his sequencing data, so I'm copying the data to each of three different species folders on Owl. The code below uses the --no-clobber argument to prevent cp from overwriting any existing files in the destination directory that might have the same file name.
%%bash
time for file in *.gz
do
cp --no-clobber "$file" /owl_web/nightingales/P_generosa/
cp --no-clobber "$file" /owl_web/nightingales/Porites_spp/
cp --no-clobber "$file" /owl_web/nightingales/A_elegantissima/
done
real 155m12.124s user 0m0.180s sys 15m13.790s
%%bash
time for i in /owl_web/nightingales/P_generosa/JD002[A-Z]*.gz
do
md5sum "$i" >> temp_checksums.md5
done
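md5sum: /owl_web/nightingales/P_generosa/JD002[A-Z].gz: No such file or directory
real 0m3.691s user 0m0.000s sys 0m0.000s
Whoops! Typo! As the error message shows, the pattern that actually ran was JD002[A-Z].gz, without the * wildcard, so it matched no files and bash passed the literal pattern to md5sum. Fixed below...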
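A small sketch of why the missing * produces that error: when a glob matches nothing, the shell hands the pattern itself to the loop as a literal string. The file names below are shortened, made-up stand-ins, and count_matches is a helper I'm defining just for this illustration. The [ -e ... ] guard skips the literal pattern instead of passing it on to md5sum.

```shell
# Hypothetical directory with two stand-in files.
workdir=$(mktemp -d)
touch "$workdir/JD002A_R1.fastq.gz" "$workdir/JD002B_R1.fastq.gz"

# Hypothetical helper: count the existing files a glob pattern expands to.
count_matches() {
    n=0
    # $1 is left unquoted on purpose so the shell expands the glob.
    for f in $1
    do
        # An unmatched glob stays as a literal string; skip it.
        [ -e "$f" ] || continue
        n=$((n + 1))
    done
    echo "$n"
}

without_star=$(count_matches "$workdir/JD002[A-Z].gz")
with_star=$(count_matches "$workdir/JD002[A-Z]*.gz")

echo "JD002[A-Z].gz matched:  $without_star"
echo "JD002[A-Z]*.gz matched: $with_star"
rm -rf "$workdir"
```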
%%bash
time for i in /owl_web/nightingales/P_generosa/JD002[A-Z]*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 24m29.381s user 0m5.040s sys 3m53.230s
%%bash
time for i in /owl_web/nightingales/Porites_spp/JD002[A-Z]*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 25m3.500s user 0m5.080s sys 3m52.580s
%%bash
time for i in /owl_web/nightingales/A_elegantissima/JD002[A-Z]*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 25m41.897s user 0m4.710s sys 3m52.660s
%%bash
cat temp_checksums.md5
c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/P_generosa/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/P_generosa/JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/P_generosa/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/P_generosa/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/P_generosa/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/P_generosa/JD002J_S140_L005_R2_001.fastq.gz 
66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/P_generosa/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/P_generosa/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/P_generosa/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/P_generosa/JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R2_001.fastq.gz 
1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 
/owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R2_001.fastq.gz
Since there are so many files this time, I'm going to strip the leading file path from the filenames so that I can actually use the diff command to compare checksums. The code below uses sed to edit the file in place (the -i argument); the .bak suffix attached to -i makes sed save a backup of the original file with the extension .bak. The substitution deletes everything from the start of each line up to and including the last slash.
%%bash
time sed -i.bak 's/^.*\///' temp_checksums.md5
real 0m0.021s user 0m0.000s sys 0m0.000s
%%bash
cat temp_checksums.md5
JD002A_S131_L005_R1_001.fastq.gz JD002A_S131_L005_R2_001.fastq.gz JD002B_S132_L005_R1_001.fastq.gz JD002B_S132_L005_R2_001.fastq.gz JD002C_S133_L005_R1_001.fastq.gz JD002C_S133_L005_R2_001.fastq.gz JD002D_S134_L005_R1_001.fastq.gz JD002D_S134_L005_R2_001.fastq.gz JD002E_S135_L005_R1_001.fastq.gz JD002E_S135_L005_R2_001.fastq.gz JD002F_S136_L005_R1_001.fastq.gz JD002F_S136_L005_R2_001.fastq.gz JD002G_S137_L005_R1_001.fastq.gz JD002G_S137_L005_R2_001.fastq.gz JD002H_S138_L005_R1_001.fastq.gz JD002H_S138_L005_R2_001.fastq.gz JD002I_S139_L005_R1_001.fastq.gz JD002I_S139_L005_R2_001.fastq.gz JD002J_S140_L005_R1_001.fastq.gz JD002J_S140_L005_R2_001.fastq.gz JD002K_S141_L005_R1_001.fastq.gz JD002K_S141_L005_R2_001.fastq.gz JD002L_S142_L005_R1_001.fastq.gz JD002L_S142_L005_R2_001.fastq.gz JD002A_S131_L005_R1_001.fastq.gz JD002A_S131_L005_R2_001.fastq.gz JD002B_S132_L005_R1_001.fastq.gz JD002B_S132_L005_R2_001.fastq.gz JD002C_S133_L005_R1_001.fastq.gz JD002C_S133_L005_R2_001.fastq.gz JD002D_S134_L005_R1_001.fastq.gz JD002D_S134_L005_R2_001.fastq.gz JD002E_S135_L005_R1_001.fastq.gz JD002E_S135_L005_R2_001.fastq.gz JD002F_S136_L005_R1_001.fastq.gz JD002F_S136_L005_R2_001.fastq.gz JD002G_S137_L005_R1_001.fastq.gz JD002G_S137_L005_R2_001.fastq.gz JD002H_S138_L005_R1_001.fastq.gz JD002H_S138_L005_R2_001.fastq.gz JD002I_S139_L005_R1_001.fastq.gz JD002I_S139_L005_R2_001.fastq.gz JD002J_S140_L005_R1_001.fastq.gz JD002J_S140_L005_R2_001.fastq.gz JD002K_S141_L005_R1_001.fastq.gz JD002K_S141_L005_R2_001.fastq.gz JD002L_S142_L005_R1_001.fastq.gz JD002L_S142_L005_R2_001.fastq.gz JD002A_S131_L005_R1_001.fastq.gz JD002A_S131_L005_R2_001.fastq.gz JD002B_S132_L005_R1_001.fastq.gz JD002B_S132_L005_R2_001.fastq.gz JD002C_S133_L005_R1_001.fastq.gz JD002C_S133_L005_R2_001.fastq.gz JD002D_S134_L005_R1_001.fastq.gz JD002D_S134_L005_R2_001.fastq.gz JD002E_S135_L005_R1_001.fastq.gz JD002E_S135_L005_R2_001.fastq.gz JD002F_S136_L005_R1_001.fastq.gz JD002F_S136_L005_R2_001.fastq.gz 
JD002G_S137_L005_R1_001.fastq.gz JD002G_S137_L005_R2_001.fastq.gz JD002H_S138_L005_R1_001.fastq.gz JD002H_S138_L005_R2_001.fastq.gz JD002I_S139_L005_R1_001.fastq.gz JD002I_S139_L005_R2_001.fastq.gz JD002J_S140_L005_R1_001.fastq.gz JD002J_S140_L005_R2_001.fastq.gz JD002K_S141_L005_R1_001.fastq.gz JD002K_S141_L005_R2_001.fastq.gz JD002L_S142_L005_R1_001.fastq.gz JD002L_S142_L005_R2_001.fastq.gz
Well, that didn't do what I wanted: it eliminated the first column (the checksums). Actually, sed did exactly what it was told. Sed processes the file line by line, and the pattern ^.*\/ greedily matches from the start of each line through the last slash, which includes the checksum, leaving just the file name on each line. Let's restore our file from the .bak backup file.
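To make the greedy match concrete, here is a sketch on a single sample line. It assumes the line is in standard md5sum output format, i.e. the checksum and the path are separated by two spaces. The second sed expression is an alternative I did not run on the real file: it starts the match at the separator instead of at the start of the line, so the checksum is left alone.

```shell
# One sample line, assuming md5sum's "checksum<two spaces>path" format.
line='c6d2bab7dabb6043a8565482b7b03cda  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz'

# Anchored at the line start: the greedy .* runs from column 1 to the last slash,
# so the checksum is deleted along with the path.
anchored=$(printf '%s\n' "$line" | sed 's/^.*\///')

# Match from the two-space separator to the last slash and put the separator back.
from_separator=$(printf '%s\n' "$line" | sed 's|  .*/|  |')

echo "$anchored"
echo "$from_separator"
```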
%%bash
# Restore with cp rather than mv so the .bak file still exists for later steps
cp temp_checksums.md5.bak temp_checksums.md5
%%bash
cat temp_checksums.md5
c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/P_generosa/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/P_generosa/JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/P_generosa/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/P_generosa/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/P_generosa/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/P_generosa/JD002J_S140_L005_R2_001.fastq.gz 
66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/P_generosa/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/P_generosa/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/P_generosa/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/P_generosa/JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R2_001.fastq.gz 
1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 
/owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R2_001.fastq.gz
Found a solution using awk (duh, as it's perfect for operating on specific columns). The code below uses the gsub function in awk to replace the longest string that starts and ends with a forward slash with nothing (the empty quotes) in column 2 ($2) and then prints each modified line in its entirety. In this case, I redirect the output of that command to the temp_checksums.md5 file.
%%bash
awk '{gsub(/\/.*\//,"",$2); print}' < temp_checksums.md5.bak > temp_checksums.md5
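To see what that gsub pattern does in isolation, here's a minimal sketch that runs it on a single line copied from the listing above. The variable names `sample` and `stripped` are only for illustration and aren't part of the actual workflow.

```shell
# One line from the checksum listing above, used as sample input.
sample='66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/P_generosa/JD002K_S141_L005_R1_001.fastq.gz'

# /\/.*\// is greedy: it matches from the first slash through the last slash,
# so only the file name is left in column 2.
# Assigning to $2 also makes awk rebuild the line, joining the fields with
# its output field separator (a single space by default).
stripped=$(printf '%s\n' "$sample" | awk '{gsub(/\/.*\//,"",$2); print}')

printf '%s\n' "$stripped"
# prints: 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz
```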
%%bash
cat temp_checksums.md5
c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 
27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 
83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz
%%bash
cat demultiplexed_checksums >> demultiplexed_checksums_cat
cat demultiplexed_checksums >> demultiplexed_checksums_cat
cat demultiplexed_checksums >> demultiplexed_checksums_cat
%%bash
cat demultiplexed_checksums_cat
c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 
27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 
83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz
%%bash
diff demultiplexed_checksums_cat temp_checksums.md5
echo $?
1,72c1,72 < c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz < 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz < 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz < 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz < 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz < c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz < 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz < 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz < 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz < 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz < ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz < d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz < 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz < 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz < 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz < 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz < 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz < 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz < 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz < e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz < 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz < 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz < 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz < 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz < c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz < 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz < 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz < 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz < 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz < 
c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz < 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz < 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz < 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz < 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz < ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz < d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz < 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz < 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz < 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz < 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz < 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz < 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz < 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz < e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz < 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz < 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz < 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz < 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz < c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz < 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz < 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz < 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz < 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz < c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz < 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz < 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz < 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz < 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz < 
ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz < d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz < 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz < 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz < 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz < 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz < 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz < 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz < 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz < e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz < 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz < 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz < 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz < 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz --- > c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz > 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz > 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz > 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz > 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz > c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz > 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz > 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz > 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz > 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz > ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz > d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz > 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz > 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz > 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz > 
0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz > 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz > 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz > 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz > e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz > 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz > 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz > 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz > 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz > c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz > 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz > 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz > 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz > 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz > c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz > 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz > 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz > 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz > 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz > ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz > d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz > 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz > 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz > 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz > 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz > 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz > 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz > 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz > e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz > 
66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz > 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz > 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz > 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz > c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz > 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz > 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz > 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz > 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz > c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz > 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz > 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz > 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz > 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz > ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz > d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz > 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz > 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz > 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz > 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz > 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz > 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz > 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz > e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz > 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz > 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz > 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz > 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz 1
Argh! The delimiter between the two columns is different!! That leads to diff identifying every line as different between the two files. Let's try again...
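A quick way to confirm that only the spacing differs (and not the checksums or file names) is to repeat the comparison with diff's `-b` option, which ignores changes in the amount of whitespace. The sketch below is an assumption-laden toy: the two files and their contents (`abc123`, `sample.fastq.gz`) are made up for illustration and are not the real checksum files.

```shell
# Throwaway directory with two hypothetical one-line checksum files that
# differ only in the number of spaces between the columns.
workdir=$(mktemp -d)
printf '%s  %s\n' abc123 sample.fastq.gz > "$workdir/two_spaces.md5"
printf '%s %s\n'  abc123 sample.fastq.gz > "$workdir/one_space.md5"

# A plain diff reports the lines as different (exit status 1).
plain_status=0
diff "$workdir/two_spaces.md5" "$workdir/one_space.md5" > /dev/null || plain_status=$?

# With -b, runs of whitespace compare equal, so the files match (exit status 0).
ignore_ws_status=0
diff -b "$workdir/two_spaces.md5" "$workdir/one_space.md5" > /dev/null || ignore_ws_status=$?

echo "plain diff: $plain_status, diff -b: $ignore_ws_status"
# prints: plain diff: 1, diff -b: 0

rm -r "$workdir"
```

This only tells me the content matches; the spacing itself still has to be fixed so the two files are byte-for-byte identical.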
%%bash
awk '{print $1 "  " $2}' temp_checksums.md5.bak > temp_checksums.md5
%%bash
cat temp_checksums.md5
c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/P_generosa/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/P_generosa/JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/P_generosa/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/P_generosa/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/P_generosa/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/P_generosa/JD002J_S140_L005_R2_001.fastq.gz 
66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/P_generosa/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/P_generosa/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/P_generosa/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/P_generosa/JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R2_001.fastq.gz 
1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 
/owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R2_001.fastq.gz
Yeesh, screwed it up again (but fixed the spacing!). Here we go with another shot. The awk code differs from the first attempt in that the print statement now prints the first column ($1), followed by two spaces (that's what's contained in the double quotes), and then the second column ($2).
%%bash
awk '{gsub(/\/.*\//,"",$2); print $1 "  " $2}' < temp_checksums.md5.bak > temp_checksums.md5
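An alternative I didn't use here, sketched under the assumption that a standard awk is available: instead of spelling out the two spaces in the print statement, set awk's output field separator (`OFS`) to two spaces. Because gsub modifies `$2`, awk rebuilds the line using `OFS` when it prints. The `sample` and `fixed` names are illustrative only.

```shell
# One line from the checksum listing, used as sample input.
sample='c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz'

# -v OFS='  ' sets the output field separator to two spaces.
# Modifying $2 forces awk to rebuild the line with that separator.
fixed=$(printf '%s\n' "$sample" | awk -v OFS='  ' '{gsub(/\/.*\//,"",$2); print}')

printf '%s\n' "$fixed"
# prints: c6d2bab7dabb6043a8565482b7b03cda  JD002A_S131_L005_R1_001.fastq.gz
```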
%%bash
cat temp_checksums.md5
c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 
27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz 
83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz
%%bash
diff demultiplexed_checksums_cat temp_checksums.md5
echo $?
0
Boom! Got it! Now, I'll append the facility checksums to the checksum files in the directories on Owl.
%%bash
cat demultiplexed_checksums >> /owl_web/nightingales/P_generosa/checksums.md5
cat demultiplexed_checksums >> /owl_web/nightingales/Porites_spp/checksums.md5
cat demultiplexed_checksums >> /owl_web/nightingales/A_elegantissima/checksums.md5
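The three appends above could also be written as a loop over the directory names. This is only a sketch: it uses a throwaway directory created with `mktemp` in place of the real /owl_web/nightingales/ paths, and a placeholder checksum line (the MD5 of an empty file, paired with a made-up file name), so it can be run anywhere without touching Owl.

```shell
# Sketch only: temporary stand-ins for the Owl directories (assumption).
base=$(mktemp -d)
mkdir -p "$base"/P_generosa "$base"/Porites_spp "$base"/A_elegantissima

# Placeholder facility checksum file (hypothetical content, not real data).
printf '%s  %s\n' "d41d8cd98f00b204e9800998ecf8427e" "example.fastq.gz" > "$base/demultiplexed_checksums"

# Append the facility checksums to the checksum file in each directory.
for dir in P_generosa Porites_spp A_elegantissima
do
  cat "$base/demultiplexed_checksums" >> "$base/$dir/checksums.md5"
done
```

With the real paths, `$base` would be /owl_web/nightingales and the `mkdir` and `printf` lines would be dropped.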
%%bash
cd
rm -rf /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/
ls -lh /data/20170227_jay_data_tmp
total 0
cd /data/20170227_jay_data_tmp
/data/20170227_jay_data_tmp
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined*" --reject "*S0_L00[1234678]"ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
wget: missing URL
Usage: wget [OPTION]... [URL]...
Try `wget --help' for more options.
real 0m0.026s user 0m0.010s sys 0m0.000s
Typo! The command needs a space after the reject list (between the closing quotation mark and the URL).
The wget command below adds an accept list and reject list to download only Jay's sequencing files (his were in Lane 5; L005) and the corresponding MD5 checksum file, named "Undetermined_checksums".
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined*" --reject "*S0_L00[1234678]" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
Process is interrupted.
Turns out, the accept/reject lists weren't working: all of the "Undetermined" files were being downloaded.
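A likely explanation, offered as an assumption rather than something verified against the server: wget treats an accept/reject entry that contains wildcard characters as a pattern for the whole file name, not as a suffix. The reject pattern stops at the lane number, so it never matches names that continue with `_R1_001.fastq.gz`. The sketch below uses bash's own glob matching as a stand-in for wget's matching to show the difference a trailing `*` makes. The `matches` helper is hypothetical and nothing here contacts the server.

```shell
# Stand-in for wget's matching: bash glob matching against the file name.
# The helper function is hypothetical, for illustration only.
matches() {
  # $1 = pattern, $2 = file name; succeeds only if the whole name matches.
  case "$2" in
    $1) return 0 ;;
    *)  return 1 ;;
  esac
}

# One of the unwanted Lane 1 file names.
name="Undetermined_S0_L001_R1_001.fastq.gz"

# Original reject pattern: stops at the lane number.
if matches "*S0_L00[1234678]" "$name"; then orig="match"; else orig="no match"; fi

# Same pattern with a trailing wildcard covering the rest of the name.
if matches "*S0_L00[1234678]*" "$name"; then fixed="match"; else fixed="no match"; fi

echo "original pattern: $orig"
echo "with trailing *: $fixed"
```

If the assumption holds, a reject list of `"*S0_L00[1234678]*"` would have excluded the other lanes while leaving the L005 files alone.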
ls -lh
total 0
drwxr-xr-x 1 srlab staff 102 Mar 1 21:35 gslserver.qb3.berkeley.edu/
ls -lh gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
total 9.1G
-rw-r--r-- 1 srlab staff 528M Feb 21 21:32 Undetermined_S0_L001_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 745M Feb 21 21:32 Undetermined_S0_L001_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 2.0G Feb 21 22:00 Undetermined_S0_L002_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 2.4G Feb 21 22:00 Undetermined_S0_L002_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.9G Feb 21 22:27 Undetermined_S0_L003_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.7G Mar 1 21:53 Undetermined_S0_L003_R2_001.fastq.gz
rm *.gz gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
rm: cannot remove '*.gz': No such file or directory
rm: cannot remove 'gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/': Is a directory
%%bash
for i in gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/*.gz
do
rm "$i"
done
ls -lh gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
total 0
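The earlier `rm` failed because it received two separate arguments: a `*.gz` glob for the current directory (which matched nothing) and the directory itself. Attaching the glob to the directory path removes the files in a single command, without a loop. A minimal sketch, using a scratch directory and empty placeholder files rather than the real downloads:

```shell
# Scratch directory standing in for the download directory (assumption).
target=$(mktemp -d)
touch "$target/Undetermined_S0_L001_R1_001.fastq.gz" "$target/Undetermined_S0_L001_R2_001.fastq.gz"

# The glob is part of the path, so it expands to the files inside the directory.
rm -- "$target"/*.gz

# Count what is left; the directory itself is kept.
remaining=$(ls -A "$target" | wc -l | tr -d ' ')
echo "files remaining: $remaining"
```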
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined_S0_L005*" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
real 11m7.430s user 0m4.890s sys 4m35.840s
ls -lh
total 0
drwxr-xr-x 1 srlab staff 102 Mar 1 21:35 gslserver.qb3.berkeley.edu/
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined_checksums" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
real 0m2.741s user 0m0.020s sys 0m0.040s
ls -lh
total 0
drwxr-xr-x 1 srlab staff 102 Mar 1 21:35 gslserver.qb3.berkeley.edu/
%%bash
time for i in *.gz
do
md5sum "$i" >> checksums.md5
done
md5sum: *.gz: No such file or directory
real 0m0.016s user 0m0.000s sys 0m0.000s
%%bash
diff Undetermined_checksums checksums.md5
echo $?
2
diff: Undetermined_checksums: No such file or directory
Duh! I ran all of this from the wrong directory. Here we go again...
cd gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA
%%bash
ls -lh
total 5.2G
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Alfaro
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Chang
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Coates
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Doudna
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Johnson
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Pachter
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Roberts
-rw-r--r-- 1 srlab staff 2.2G Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 3.1G Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 142 Feb 28 17:18 Undetermined_checksums
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Wayne
%%bash
time for i in *.gz
do
md5sum "$i" >> checksums.md5
done
real 1m48.294s user 0m15.830s sys 0m33.760s
%%bash
diff Undetermined_checksums checksums.md5
echo $?
0
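Instead of regenerating the checksums and diffing the two files, `md5sum` can verify files directly against a checksum list with its `-c` (`--check`) option, provided the list is in md5sum's own output format. That `Undetermined_checksums` is in this format is an assumption, though the clean diff against md5sum's output suggests it. A minimal sketch with a small placeholder file in a scratch directory (the file names are made up for illustration):

```shell
# Scratch directory and placeholder file (hypothetical, not the sequencing data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'ACGT\n' > example.fastq

# Facility side: record the checksum.
md5sum example.fastq > facility_checksums

# Download side: verify the file against the recorded checksum.
# md5sum -c exits 0 only if every listed file matches.
md5sum -c facility_checksums
check_status=$?
echo "md5sum -c exit status: $check_status"
```

For the real data, this would amount to running `md5sum -c Undetermined_checksums` from the directory that holds the downloaded files.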
%%bash
cat Undetermined_checksums
484082c497c7a52fa225cb0983c709a9  Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  Undetermined_S0_L005_R2_001.fastq.gz
%%bash
time for file in *.gz
do
cp --no-clobber "$file" /owl_web/nightingales/P_generosa/
cp --no-clobber "$file" /owl_web/nightingales/Porites_spp/
cp --no-clobber "$file" /owl_web/nightingales/A_elegantissima/
done
real 20m26.143s user 0m0.040s sys 3m3.080s
%%bash
time for i in /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 5m2.141s user 0m7.910s sys 0m37.020s
%%bash
time for i in /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 5m42.952s user 0m6.680s sys 0m39.320s
%%bash
time for i in /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R*.gz
do
md5sum "$i" >> temp_checksums.md5
done
real 5m57.994s user 0m6.940s sys 0m38.520s
%%bash
cat temp_checksums.md5
484082c497c7a52fa225cb0983c709a9  /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R2_001.fastq.gz
484082c497c7a52fa225cb0983c709a9  /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R2_001.fastq.gz
484082c497c7a52fa225cb0983c709a9  /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R2_001.fastq.gz
%%bash
cat Undetermined_checksums
484082c497c7a52fa225cb0983c709a9  Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  Undetermined_S0_L005_R2_001.fastq.gz
Looks like everything matches.
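Because temp_checksums.md5 records full paths, a plain `diff` against Undetermined_checksums would report differences even when every hash agrees, which is why the comparison here is by eye. One way to automate it is to compare only the hash column. The sketch below recreates small versions of both files in a scratch directory, using the two hashes and the P_generosa paths from the output above. The scratch directory and the use of `awk` and `sort` are assumptions for illustration, not part of the original workflow.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Facility checksums (hashes copied from the output above).
cat > Undetermined_checksums <<'EOF'
484082c497c7a52fa225cb0983c709a9  Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  Undetermined_S0_L005_R2_001.fastq.gz
EOF

# Checksums of the copies on Owl, with full paths (one directory shown).
cat > temp_checksums.md5 <<'EOF'
484082c497c7a52fa225cb0983c709a9  /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R2_001.fastq.gz
EOF

# Keep only the hash column, de-duplicate, and compare.
awk '{print $1}' Undetermined_checksums | sort -u > expected_hashes
awk '{print $1}' temp_checksums.md5 | sort -u > observed_hashes
diff expected_hashes observed_hashes
hash_status=$?
echo "diff exit status: $hash_status"
```

De-duplicating with `sort -u` means this confirms that no unexpected hash appears in any copy. It does not confirm that every directory contains both files.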
%%bash
cat Undetermined_checksums >> /owl_web/nightingales/P_generosa/checksums.md5
cat Undetermined_checksums >> /owl_web/nightingales/Porites_spp/checksums.md5
cat Undetermined_checksums >> /owl_web/nightingales/A_elegantissima/checksums.md5
I'm not entirely sure what the laneBarcode.html file on the sequencing server is, but it might be useful to have.
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "laneBarcode.html" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/
real 0m2.784s user 0m0.000s sys 0m0.070s
ls -lh
total 5.2G
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Alfaro/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Chang/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Coates/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Doudna/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Johnson/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Pachter/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Roberts/
-rw-r--r-- 1 srlab staff 2.2G Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 3.1G Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 142 Feb 28 17:18 Undetermined_checksums
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Wayne/
-rw-r--r-- 1 srlab staff 142 Mar 1 22:18 checksums.md5
drwxr-xr-x 1 srlab staff 102 Mar 1 23:13 gslserver.qb3.berkeley.edu/
-rw-r--r-- 1 srlab staff 636 Mar 1 22:59 temp_checksums.md5
%%bash
head gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html xmlns:bcl2fastq="http://www.illumina.com/bcl2fastq">
<link rel="stylesheet" href="../../../../Report.css" type="text/css">
<body>
<table width="100%"><tr>
<td><p><p>HG3WNBBXX / [all projects] / [all samples] / [all barcodes]</p></p></td>
<td><p align="right"><a href="../../../../HG3WNBBXX/all/all/all/lane.html">hide barcodes</a></p></td>
%%bash
tail gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html
<td>GTCCCG</td>
<td>187,142</td>
<td>CGGACGAG</td>
<td>290,611</td>
<td>CGAGGCTG+CACAAAAA</td>
</tr>
</table>
<p></p>
</body>
</html>
I'm going to rename the file so that it's clearly associated with the Lane 5 files, and then copy it to each of the directories on Owl.
%%bash
mv gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html JD_L005_laneBarcode.html
ls -l
total 5426292
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Alfaro/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Chang/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Coates/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Doudna/
-rw-r--r-- 1 srlab staff 43031 Feb 22 00:29 JD_L005_laneBarcode.html
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Johnson/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Pachter/
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Roberts/
-rw-r--r-- 1 srlab staff 2281224946 Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 3275238267 Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 142 Feb 28 17:18 Undetermined_checksums
drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Wayne/
-rw-r--r-- 1 srlab staff 142 Mar 1 22:18 checksums.md5
drwxr-xr-x 1 srlab staff 102 Mar 1 23:13 gslserver.qb3.berkeley.edu/
-rw-r--r-- 1 srlab staff 636 Mar 1 22:59 temp_checksums.md5
%%bash
cp JD_L005_laneBarcode.html /owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html
cp JD_L005_laneBarcode.html /owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html
cp JD_L005_laneBarcode.html /owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html
%%bash
ls /owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html
ls /owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html
ls /owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html
/owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html
/owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html
/owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html