#!/usr/bin/env python
# coding: utf-8

# # 6. Reading and writing files
# One important aspect of using Python is being able to read and write files. To do this you use the `open` function:
#
# ```
# open(filename-in-quotes, mode)
# ```
#
# In place of `mode` you can specify:
# - `'r'`, to open a file that you want to read.
# - `'w'`, to open a file that you want to write (starting a new file).
# - `'a'`, to open a file that you want to append to (adding to an existing file).
#
# So, to open a file with filename `file1.txt` in "read" mode:
#
# ```
# open('file1.txt', 'r')
# ```
#
# To open a file, `file2.txt`, in "write" mode:
#
# ```
# open('file2.txt', 'w')
# ```
#
# **CAREFUL: this will create a new file called `file2.txt`. If you had an existing file named `file2.txt` it will be overwritten.**
#
# To open a file, `file3.txt`, in "append" mode:
#
# ```
# open('file3.txt', 'a')
# ```
#
# If you don't specify the mode, the default is `'r'`.
#
# ## 6.1 Reading files
#
# When we open a file in "read" mode, we will also want to be able to access what has been read.
#
# To start with, we will create an opened file object and assign it to a new variable, `f`. (You could change `f` to whatever you want!):

# In[ ]:


f = open('datafiles/file1.txt', 'r')


# The file object, `f`, is an *iterator*: we can iterate through it (using a `for` loop) to obtain each line of the file in turn. We can then perform some operation on each line of the file.
#
# For example, if we just want to print each line in the file:

# In[ ]:


for line in f:
    print(line)


# Open the file, `file1.txt`, in a text editor and check that the file indeed contains the lines we printed above.
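#
# As a minimal, self-contained sketch of the loop above: each line read from a file keeps its trailing newline character, and `print` adds another one, which is why the printed output looks double-spaced. The file name `example_lines.txt` and its contents are hypothetical; the file is created in a temporary directory so that nothing in `datafiles` is touched.

```python
import os
import tempfile

# Hypothetical example file, created in a temporary directory
tmp_dir = tempfile.mkdtemp()
example_path = os.path.join(tmp_dir, 'example_lines.txt')

# Write two lines so that there is something to read back
example_file = open(example_path, 'w')
example_file.write('first line\nsecond line\n')
example_file.close()

# Iterate through the file object, one line at a time
example_file = open(example_path, 'r')
lines_seen = []
for line in example_file:
    lines_seen.append(line)
    # end='' stops print from adding a second newline
    print(line, end='')
example_file.close()
```

# Passing `end=''` to `print` is one way to avoid the blank lines; the strings themselves are left unchanged.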
#
# Once we have iterated through the file object, `f`, it is exhausted (the read position is at the end of the file), so if we do the same thing again, nothing will happen:

# In[ ]:


for line in f:
    print(line)


# If we want to read the lines again, we need to close the file object, using `f.close()`, and reopen the file:
#

# In[ ]:


f.close()
f = open("datafiles/file1.txt", "r")


# Now we can read the lines in the file again:

# In[ ]:


for line in f:
    print(line)


# Let's close the file object again, so we don't have it hanging around!

# In[ ]:


f.close()


# If you only wanted to read through the contents of the file once, this would be fine, but it is quite a nuisance if you want to refer back to the lines in the file later on.
#
# Instead, we can convert the file object to a list, which gives us a list of the lines in the file that we can go back to:

# In[ ]:


# First open the file object
file_object = open("datafiles/file1.txt", "r")
# Then we convert it to a list
file_list = list(file_object)
# Let's print the contents of the list
print(file_list)
# Finally we have to make sure that the file object is closed
file_object.close()


# Note that:
# 1. This is a list of string objects, and each string ends in a newline character.
# 2. We can still use `file_list` to print out the lines of the file in a `for` loop.

# In[ ]:


for line in file_list:
    print(line)


# You can also use the `readlines()` *method*:

# In[ ]:


# First open the file object
file_object = open("datafiles/file1.txt", "r")
# Then we use the readlines method to get all the lines
file_list2 = file_object.readlines()
# We close the file object for safety
file_object.close()
# We print the extracted list
print(file_list2)


# Now we can use the functions that work on lists to access the lines in the file.
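#
# A minimal sketch of treating the lines as an ordinary list, using a hypothetical file (`example_names.txt`, created in a temporary directory) rather than `file1.txt`: `len` counts the lines, and the string method `rstrip('\n')` removes the trailing newline from each one.

```python
import os
import tempfile

# Hypothetical example file, created in a temporary directory
tmp_dir = tempfile.mkdtemp()
example_path = os.path.join(tmp_dir, 'example_names.txt')

example_file = open(example_path, 'w')
example_file.write('Name\nalpha\nbeta\n')
example_file.close()

# Read all the lines into a list, then close the file
example_file = open(example_path, 'r')
example_list = example_file.readlines()
example_file.close()

# The usual list functions now work on the lines
number_of_lines = len(example_list)

# Build a second list without the trailing newline characters
stripped_list = []
for line in example_list:
    stripped_list.append(line.rstrip('\n'))

print(number_of_lines)
print(stripped_list)
```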
#
# Let's put that into practice with an exercise:

# ### Exercise 6.1.1
#
# a) Print out just the first line in the file

# In[ ]:


# Exercise 6.1.1 a)


# b) Print out all except the first line in the file

# In[ ]:


# Exercise 6.1.1 b)


# c) Change the third peptide sequence from 'PLSDMASI' to 'PLSEMASI'

# In[ ]:


# Exercise 6.1.1 c)


# d) Add a new peptide sequence (YYVHNKSERFT) to the end of `file_list`

# In[ ]:


# Exercise 6.1.1 d)


# e) Now print all the peptide sequences again (without the first line)

# In[ ]:


# Exercise 6.1.1 e)


# ## 6.2 Writing files
#
# To write the updated list of peptides to a new file we would need to open a new file, using "write" mode, and then use the `write` *method*.
#
# The following example opens a new file, `file2.txt` (as mentioned above, we need to be careful that this file doesn't already exist, otherwise it will be overwritten), then writes the first two lines to the new file. Notice that the argument to the `write` method is a single string. We make use of the newline character `\n` as part of the string to make separate lines.

# In[ ]:


file_object_2 = open('file2.txt', 'w')
file_object_2.write('In this new file\nthis is the second line\n')


# Note: the `write` method returns the number of characters written to the file.

# If I want to write more to the file I can do so by using the `write` method again:

# In[ ]:


file_object_2.write('and this is the third line\n')


# Once finished writing to the file, we should close the file, using `close` as before.

# In[ ]:


file_object_2.close()


# We can no longer write to `file_object_2`; trying to do so raises an error:

# In[ ]:


# This raises "ValueError: I/O operation on closed file."
file_object_2.write('another line\n')


# Open `file2.txt` in a text editor to check that the lines we wrote to the file are there.
#
# If we now want to append a fourth line to `file2.txt`, we can open the file again, this time in "append" mode.
#
# The following opens `file2.txt` in append mode, writes a fourth line to that file, then closes the file object.

# In[ ]:


file_object_3 = open('file2.txt', 'a')
file_object_3.write('this line will be appended\n')
file_object_3.close()


# Check `file2.txt` again (by opening it in a text editor).
#
# ### Exercise 6.2.1
#
# Going back to our file of peptide sequences (`file1.txt`): write the updated list of peptide sequence lines to a new file called `peptides.txt`. Check your answer by opening `peptides.txt` in a text editor.
#

# In[ ]:


# Exercise 6.2.1


# ## 6.3 The context manager
#
# If you don't want to have to remember to close the file when you're done with it, you can work with your file inside a `with` statement. Your file will be automatically closed at the end of the `with` statement. The technical term you may see for this type of usage is using a file as a 'context manager'.

# In[ ]:


with open('file3.txt', 'w') as out_file:
    # within this statement, the file is open
    out_file.write('test')


# If we try to write outside the context manager, Python will give us an error as the file is no longer open:

# In[ ]:


with open('file3.txt', 'w') as out_file:
    # within this statement, the file is open
    out_file.write('test')

# now the file is closed, so this line raises an error
out_file.write('test')


# ## Review
#
# In this section we have learned the following:
# - How to open files using `open`('*filename*', '*mode*').
# - `'r'` is read mode, `'w'` is write mode and `'a'` is append mode.
# - To perform operations on the lines in a file that has been opened in read mode, convert the file object to a list, using list(*file_object*).
# - If opening a file in write mode, the new file will overwrite another file of the same name.
# - To write lines to a file opened in append or write mode, use *file_object*.write('*text-to-write-to-file*').
# - Once finished with a file object, close it using *file_object*.close().
# - How to use a context manager (a `with` statement) to handle file opening and closing.
#
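# The points in the review can be sketched together in one short example. This is only an illustration under stated assumptions: the file name `example_notes.txt` and its contents are hypothetical, and the file is created in a temporary directory so that no existing file is overwritten.

```python
import os
import tempfile

# Hypothetical example file, created in a temporary directory
tmp_dir = tempfile.mkdtemp()
notes_path = os.path.join(tmp_dir, 'example_notes.txt')

# 'w' starts a new file (an existing file of this name would be overwritten)
with open(notes_path, 'w') as out_file:
    out_file.write('line one\n')

# 'a' adds to the end of the existing file
with open(notes_path, 'a') as out_file:
    out_file.write('line two\n')

# 'r' reads the file; list() gives us the lines to refer back to
with open(notes_path, 'r') as in_file:
    notes = list(in_file)

print(notes)
# Each file object was closed automatically at the end of its with statement
print(in_file.closed)
```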