Write a Program That Reads a File Containing Text

Reading and Writing Text Files

Overview

Teaching: 60 min
Exercises: 30 min

Questions

  • How can I read in information that is stored in a file or write data out to a file?

Objectives

  • Be able to open a file and read in the data stored in that file

  • Understand the difference between the file name, the opened file object, and the data read in from the file

  • Be able to write output to a text file with basic formatting

Why do we want to read and write files?

Being able to open and read in files allows us to work with larger data sets, where it wouldn't be possible to type in each and every value and store them one at a time as variables. Writing files allows us to process our data and then save the output to a file so we can look at it later.

Right now, we will practice working with a comma-delimited text file (.csv) that contains several columns of data. However, what you learn in this lesson can be applied to any general text file. In the next lesson, you will learn another way to read and process .csv data.

Paths to files

In order to open a file, we need to tell Python exactly where the file is located, relative to where Python is currently working (the working directory). In Spyder, we can do this by setting our current working directory to the folder where the file is located. Or, when we provide the file name, we can give a complete path to the file.
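The two options can be sketched with the standard-library os module. This is only an illustration: joining the file name onto the current working directory is an assumption made so the example runs anywhere, and your own complete path will look different.

```python
import os

# Ask Python which folder it is currently working in
working_dir = os.getcwd()
print(working_dir)

# Option 1: a bare file name is looked up in the working directory
filename = 'Plates_output_simple.csv'

# Option 2: build a complete path to the file
# (joining onto working_dir is only for illustration)
full_path = os.path.join(working_dir, filename)
print(full_path)

# os.path.exists() reports whether a file is really at that location
print(os.path.exists(full_path))
```

If the last line prints False, the file is not where Python is looking, and opening it by that path would fail.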

Lesson Setup

We will work with the practice file Plates_output_simple.csv.

  1. Locate the file Plates_output_simple.csv in the directory home/Desktop/workshops/bash-git-python.
  2. Copy the file to your working directory, home/Desktop/workshops/YourName.
  3. Make sure that your working directory is also set to the folder home/Desktop/workshops/YourName.
  4. As you are working, make sure that you save your file opening script(s) to this directory.

The File Setup

Let's open and examine the structure of the file Plates_output_simple.csv. If you open the file in a text editor, you will see that the file contains several lines of text.

DataFileRaw

However, this is fairly difficult to read. If you open the file in a spreadsheet program such as LibreOffice Calc or Excel, you can see that the file is organized into columns, with each column separated by the commas in the image above (hence the file extension .csv, which stands for comma-separated values).

DataFileColumns

The file contains one header row, followed by eight rows of data. Each row represents a single plate image. If we look at the column headings, we can see that we have collected data for each plate:

  • The name of the image from which the data was collected
  • The plate number (there were four plates, with each plate imaged at two different time points)
  • The growth condition (either control or experimental)
  • The observation timepoint (either 24 or 48 hours)
  • Colony count for the plate
  • The average colony size for the plate
  • The percentage of the plate covered by bacterial colonies

We will read in this data file and then work to analyze the data.

Opening and reading files is a three-step process

We will open and read the file in three steps.

  1. We will create a variable to hold the name of the file that we want to open.
  2. We will call a function to open the file.
  3. We will call a function to actually read the data in the file and store it in a variable so that we can process it.

And then, there's one more step to do!

  • When we are done, we should remember to close the file!

You can think of these three steps as being similar to checking out a book from the library. First, you have to go to the catalog or database to find out which book you need (the filename). Then, you have to go and get it off the shelf and open the book up (the open function). Finally, to gain any information from the book, you have to read the words (the read function)!

Here is an example of opening, reading, and closing a file.

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'  # This is just a string of text

    # Open the file
    # 'r' says we are opening the file to read; infile is the opened file object that we will read from
    infile = open(filename, 'r')

    # Store the data from the file in a variable
    data = infile.read()

    # Print the data in the file
    print(data)

    # Close the file
    infile.close()

Once we have read the data in the file into our variable data, we can treat it like any other variable in our code.
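Because forgetting to close a file is an easy mistake, Python also offers the with statement, which closes the file automatically. This lesson does not use it, so treat the following as an optional sketch. The file name with_demo.csv and its contents are made up so that the example runs on its own.

```python
# Make a tiny stand-in file so this sketch runs on its own
# (the name and contents are made up for illustration)
filename = 'with_demo.csv'
outfile = open(filename, 'w')
outfile.write('plate,count\n1,42\n')
outfile.close()

# 'with' opens the file and closes it automatically at the end of the block
with open(filename, 'r') as infile:
    data = infile.read()

print(data)
print(infile.closed)  # True: the file was closed for us
```

The three steps are still there (file name, open, read); only the closing step is handled for you.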

Use consistent names to make your code clearer

It is a good idea to develop some consistent habits about the way you open and read files. Using the same (or similar!) variable names each time will make it easier for you to keep track of which variable is the name of the file, which variable is the opened file object, and which variable contains the read-in data.

In these examples, we will use filename for the text string containing the file name, infile for the open file object from which we can read in data, and data for the variable holding the contents of the file.

Commands for reading in files

There are a variety of commands that allow us to read in data from files.
infile.read() will read in the entire file as a single string of text.
infile.readline() will read in one line at a time (each time you call this command, it reads in the next line).
infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list.
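To compare what each command returns, here is a small sketch that runs all three on the same file. The file read_demo.txt and its three lines are hypothetical stand-ins, written first so the example is self-contained.

```python
# Write a three-line stand-in file (hypothetical name and contents)
filename = 'read_demo.txt'
outfile = open(filename, 'w')
outfile.write('first\nsecond\nthird\n')
outfile.close()

# read(): the whole file as one string
infile = open(filename, 'r')
whole = infile.read()
infile.close()

# readline(): one line per call
infile = open(filename, 'r')
line1 = infile.readline()
line2 = infile.readline()
infile.close()

# readlines(): a list with one item per line
infile = open(filename, 'r')
lines = infile.readlines()
infile.close()

print(repr(whole))  # 'first\nsecond\nthird\n'
print(repr(line1))  # 'first\n'
print(repr(line2))  # 'second\n'
print(lines)        # ['first\n', 'second\n', 'third\n']
```

The file is closed and reopened before each command so that every read starts from the top of the file.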

Mixing these commands can have some unexpected results.

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')

    # Print the first two lines of the file
    print(infile.readline())
    print(infile.readline())

    # Call infile.read()
    print(infile.read())

    # Close the file
    infile.close()

Notice that the infile.read() command started at the third line of the file, where the first two infile.readline() commands left off.

Think of it like this: when the file is opened, a pointer is placed at the top left corner of the file at the beginning of the first line. Any time a read function is called, the cursor or pointer advances from where it already is. The first infile.readline() started at the beginning of the file and advanced to the end of the first line. Now, the pointer is positioned at the beginning of the second line. The second infile.readline() advanced to the end of the second line of the file, and left the pointer positioned at the beginning of the third line. infile.read() began from this position, and advanced through to the end of the file.

In general, if you want to switch between the different kinds of read commands, you should close the file and then open it again to start over.
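The pointer behavior can be seen directly in a short sketch. It also uses infile.seek(0), a file method not covered in this lesson that moves the pointer back to the start without reopening the file. The file pointer_demo.txt and its contents are hypothetical.

```python
# Write a three-line stand-in file (hypothetical name and contents)
filename = 'pointer_demo.txt'
outfile = open(filename, 'w')
outfile.write('abc\ndef\nghi\n')
outfile.close()

infile = open(filename, 'r')
first = infile.readline()     # pointer moves to the start of line 2
rest = infile.read()          # reads from line 2 through the end of the file
nothing_left = infile.read()  # pointer is already at the end, so this is ''
infile.seek(0)                # move the pointer back to the start of the file
again = infile.readline()     # reads the first line again
infile.close()

print(repr(first))         # 'abc\n'
print(repr(rest))          # 'def\nghi\n'
print(repr(nothing_left))  # ''
print(repr(again))         # 'abc\n'
```

Closing and reopening the file, as recommended above, has the same effect as seek(0) and is easier to reason about when you are starting out.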

Reading all of the lines of a file into a list

infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list. This is extremely useful, because once we have read the file in this way, we can loop through each line of the file and process it. This approach works well on data files where the data is organized into columns similar to a spreadsheet, because it is likely that we will want to handle each line in the same way.

The example below demonstrates this approach:

    # Create a variable for the file name
    filename = "Plates_output_simple.csv"

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines:  # lines is a list with each item representing a line of the file
        if 'control' in line:
            print(line)  # print lines for control condition

    infile.close()  # close the file when you're done!

Using .split() to separate "columns"

Since our data is in a .csv file, we can use the split command to separate each line of the file into a list. This can be useful if we want to access specific columns of the file.

    # Create a variable for the file name
    filename = "Plates_output_simple.csv"

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines:
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the line at the commas
        print(sline)  # each line is now a list

    infile.close()  # Always close the file!

Consistent names, again

At first glance, the variable name sline in the example above may not make much sense. In fact, we chose it to be an abbreviation for "split line", which exactly describes the contents of the variable.

You don't have to use this naming convention if you don't want to, but you should work to use consistent variable names across your code for common operations like this. It will make it much easier to open an old script and quickly understand exactly what it is doing.

Converting text to numbers

When we called the readlines() command in the previous code, Python read in the contents of the file as text strings. If we want our code to recognize something in the file as a number, we need to tell it this!

For example, float('5.0') will tell Python to treat the text string '5.0' as the number 5.0. int(sline[4]) will tell our code to treat the text string stored in the 5th position of the list sline as an integer (non-decimal) number.

For each line in the file, the ColonyCount is stored in the 5th column (index 4 with our 0-based counting).
Modify the code above to print the line only if the ColonyCount is greater than 30.
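A short sketch shows why the conversion matters. The list sline below is a hand-typed stand-in for one split line of the file, so its values are illustrative rather than read from disk.

```python
# A stand-in for one split line of the file (values are illustrative)
sline = ['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']

count = int(sline[4])    # the string '84' becomes the integer 84
size = float(sline[5])   # the string '3.2' becomes the decimal number 3.2
percent = int(sline[6])  # int() ignores surrounding whitespace such as '\n'

print(sline[4] + sline[4])  # strings are joined together: 8484
print(count + count)        # numbers are added: 168
print(size)                 # 3.2
print(percent)              # 22
```

Without the conversion, a comparison such as sline[4] > 30 raises a TypeError in Python 3, because a string cannot be compared with a number.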

Solution

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the line at the commas
        colonyCount = int(sline[4])  # store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    # Close the file
    infile.close()

Writing data out to a file

Often, we will want to write data to a new file. This is especially useful if we have done a lot of computations or data processing and we want to be able to save it and come back to it later.

Writing a file is the same multi-step process

Just like reading a file, we will open and write the file in multiple steps.

  1. Create a variable to hold the name of the file that we want to open. Often, this will be a new file that doesn't yet exist.
  2. Call a function to open the file. This time, we will specify that we are opening the file to write into it!
  3. Write the data into the file. This requires some careful attention to formatting.
  4. When we are done, we should remember to close the file!

The code below gives an example of writing to a file:

    filename = "output.txt"

    # 'w' tells Python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file")
    outfile.write("This is the second line of the file")

    outfile.close()  # Close the file when we're done!

Where did my file terminate up?

Any time you open a new file and write to it, the file will be saved in your current working directory, unless you specified a different path in the variable filename.

Newline characters

When you examine the file you just wrote, you will see that all of the text is on the same line! This is because we must tell Python when to start a new line by using the special string character '\n'. This newline character will tell Python exactly where to start each new line.

The example below demonstrates how to use newline characters:

    filename = 'output_newlines.txt'

    # 'w' tells Python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file\n")
    outfile.write("This is the second line of the file\n")

    outfile.close()  # Close the file when we're done!

Go open the file you just wrote and check that the lines are spaced correctly.

Dealing with newline characters when you read a file

You may have noticed in the last file reading example that the printed output included newline characters at the end of each line of the file:

['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']
['colonies03.tif', '3', 'exp', '24', '792', '3', '78\n']
['colonies06.tif', '2', 'exp', '48', '85', '5.2', '46\n']

We can get rid of these newlines by using the .strip() function, which removes whitespace, including newline characters, from both ends of a string:

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()

    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.strip()  # get rid of trailing newline characters at the end of the line
        sline = sline.split(',')  # separates line into a list of items; ',' tells it to split the line at the commas
        colonyCount = int(sline[4])  # store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    # Close the file
    infile.close()
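To see what .strip() does in isolation, here is a small sketch on hand-typed strings. The string raw is an illustrative stand-in for one line of the file. The sketch also shows .rstrip('\n'), a related string method that removes only newlines at the end of a string.

```python
# A stand-in for one line read from the file (illustrative values)
raw = 'colonies02.tif,2,exp,24,84,3.2,22\n'

unstripped = raw.split(',')
stripped = raw.strip().split(',')
print(unstripped[-1] == '22\n')  # True: the newline is still attached
print(stripped[-1] == '22')      # True: the newline is gone

# strip() removes whitespace from both ends, not only newlines
both_ends = '  padded text \n'.strip()
# rstrip('\n') removes only newline characters at the end
newline_only = '  padded text \n'.rstrip('\n')
print(repr(both_ends))     # 'padded text'
print(repr(newline_only))  # '  padded text '
```

If leading or trailing spaces in your data are meaningful, .rstrip('\n') is the safer choice of the two.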

Writing numbers to files

Just like Python automatically reads files in as strings, the write() function expects to only write strings. If we want to write numbers to a file, we will need to "cast" them as strings using the function str().

The code below shows an example of this:

    numbers = range(0, 10)
    filename = "output_numbers.txt"

    # 'w' tells Python we are opening the file to write into it
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number))

    outfile.close()  # Close the file when we're done!

Writing new lines and numbers

Go open and examine the file you just wrote. You will see that all of the numbers are written on the same line.

Modify the code to write each number on its own line.

Solution

    numbers = range(0, 10)  # Create the range of numbers
    filename = "output_numbers.txt"  # provide the file name

    # Open the file in 'write' mode
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number) + '\n')

    outfile.close()  # Close the file when we're done!

The file you just wrote should be saved in your Working Directory. Open the file and check that the output is correctly formatted with one number on each line.

Opening files in different 'modes'

When we have opened files to read or write data, we have used the function parameter 'r' or 'w' to specify which "mode" to open the file in.
'r' indicates we are opening the file to read data from it.
'w' indicates we are opening the file to write data into it.

Be very, very careful when opening an existing file in 'w' mode.
'w' will over-write any data that is already in the file! The overwritten data will be lost!

If you want to add on to what is already in the file (instead of erasing and over-writing it), you can open the file in append mode by using the 'a' parameter instead.
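The difference between 'w' and 'a' is easiest to see by writing to the same file several times. The sketch below uses a hypothetical file named mode_demo.txt.

```python
filename = 'mode_demo.txt'  # hypothetical file name

# 'w' creates the file (or erases it if it already exists)
outfile = open(filename, 'w')
outfile.write('first run\n')
outfile.close()

# 'a' keeps the existing contents and adds to the end
outfile = open(filename, 'a')
outfile.write('second run\n')
outfile.close()

infile = open(filename, 'r')
after_append = infile.read()
infile.close()
print(repr(after_append))  # 'first run\nsecond run\n'

# Opening in 'w' mode again erases everything written so far
outfile = open(filename, 'w')
outfile.write('third run\n')
outfile.close()

infile = open(filename, 'r')
after_overwrite = infile.read()
infile.close()
print(repr(after_overwrite))  # 'third run\n'
```

Note that the file is erased as soon as it is opened in 'w' mode, even before anything is written to it.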

Pulling it all together

Read in the data from the file Plates_output_simple.csv that we have been working with. Write a new csv-formatted file that contains only the rows for control plates.
You will need to do the following steps:

  1. Open the file.
  2. Use .readlines() to create a list of lines in the file. Then close the file!
  3. Open a file to write your output into.
  4. Write the header line of the output file.
  5. Use a for loop to allow you to loop through each line in the list of lines from the input file.
  6. For each line, check if the growth condition was experimental or control.
  7. For the control lines, write the line of data to the output file.
  8. Close the output file when you're done!

Solution

Here's one way to do it:

    # Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()  # We will process the lines of the file later

    # Close the input file
    infile.close()

    # Create the file we will write to
    filename = 'ControlPlatesData.txt'
    outfile = open(filename, 'w')

    outfile.write(lines[0])  # This will write the header line of the file

    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the line at the commas
        condition = sline[2]  # store the condition for the line as a string
        if condition == "control":
            outfile.write(line)  # The variable line is already formatted correctly!

    outfile.close()  # Close the file when we're done!

Challenge Problem

Open and read in the data from Plates_output_simple.csv. Write a new csv-formatted file that contains only the rows for the control condition and includes only the columns for Time, colonyCount, avgColonySize, and percentColonyArea. Hint: you can use the .join() function to join a list of items into a string.

    names = ['Erin', 'Mark', 'Tessa']
    nameString = ', '.join(names)  # the ', ' tells Python to join the list with each item separated by a comma + space
    print(nameString)

'Erin, Mark, Tessa'

Solution

    # Create a variable for the input file name
    filename = 'Plates_output_simple.csv'

    # Open the file
    infile = open(filename, 'r')
    lines = infile.readlines()  # We will process the lines of the file later

    # Close the file
    infile.close()

    # Create the file we will write to
    filename = 'ControlPlatesData_Reduced.txt'
    outfile = open(filename, 'w')

    # Write the header line
    headerList = lines[0].split(',')[3:]  # This will return the list of column headers from 'time' on
    headerString = ','.join(headerList)  # join the items in the list with commas
    outfile.write(headerString)  # There is already a newline at the end, so no need to add one

    # Write the remaining lines
    for line in lines[1:]:  # skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items; ',' tells it to split the line at the commas
        condition = sline[2]  # store the condition for the line as a string
        if condition == "control":
            dataList = sline[3:]
            dataString = ','.join(dataList)
            outfile.write(dataString)  # The last item still ends with '\n', so no need to add one

    outfile.close()  # Close the file when we're done!

Key Points

  • Opening and reading a file is a multistep process: Defining the filename, opening the file, and reading the data

  • Data stored in files can be read in using a variety of commands

  • Writing data to a file requires attention to data types and formatting that isn't necessary with a print() statement


Source: https://eldoyle.github.io/PythonIntro/08-ReadingandWritingTextFiles/
