Python tutorial: File Operations




A file is a collection of data stored on a storage device such as a hard disk. When running test scripts, files can hold the inputs or outputs of the script execution. Like other high-level languages, Python has built-in support for file handling, so you don’t have to import any libraries to use these functions.

Commonly used file handling functions:

File functions      Description
open()              Used to create or open a file
fid.read()          To read the full contents or some bytes from file
fid.readline()      To read line by line
fid.write()         To write something in a file
fid.writelines()    To write a list of lines to a file
fid.seek()          To move the position of file object
fid.tell()          To get the position of file object
fid.readable()      To check whether file can be used for reading
fid.writable()      To check whether file can be used for writing

Using the open() function

open() opens a file and returns a file object that provides the methods listed above.

Syntax:

fid = open ("abcd.txt","r")

Here first argument “abcd.txt” is a file name and second argument “r” is the mode indicating how the file is to be opened.

“fid” is not a keyword; it is just a variable holding the file object, which has methods like read(), write(), seek(), close() etc.

Note: In the above code, the second argument is optional. If only one argument is given, the file will be opened in “read” mode.

If the file is in another location, you have to provide the full path including the file name.

File modes    Description
r             Open file for reading only. The file must already exist
r+            Open file for reading and writing. The file must already exist
w             Open file for writing only. If the file is not present, it will be created. If present, its contents will be deleted
w+            Open file for reading and writing. If the file is not present, it will be created. If present, its contents will be deleted
a             Open file for appending. If the file is not present, it will be created
a+            Open file for appending and reading. If the file is not present, it will be created
x             Exclusive open: a new file will be created for writing. If the file is already present, an exception is raised
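The behaviour of the “w”, “a” and “x” modes can be checked with a short experiment. The following is only a sketch: it works in a temporary folder, and the file name "demo.txt" is a made-up name for illustration.

```python
import os
import tempfile

# Work in a throwaway folder; "demo.txt" is a hypothetical file name.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")

    # "w" creates the file (or empties it if it already exists).
    fid = open(path, "w")
    fid.write("first\n")
    fid.close()

    # "a" keeps the existing contents and adds to the end.
    fid = open(path, "a")
    fid.write("second\n")
    fid.close()

    # "x" refuses to open a file that already exists.
    try:
        fid = open(path, "x")
        fid.close()
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    fid = open(path, "r")
    contents = fid.read()
    fid.close()

print(repr(contents))
print(exclusive_failed)
```

Here the append did not disturb the first line, and the exclusive open was rejected because the file was already there.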

Reading from a file:

We can read the contents of a file in a couple of ways.

Make sure that the file is present before using the “read” mode. If the file is not present, a “FileNotFoundError” (a subclass of “OSError”) is raised.
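A small sketch of what happens when the file is missing. The name "no_such_file.txt" is hypothetical, and the temporary folder is only there to guarantee that the file really does not exist.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical name; nothing with this name exists in the new folder.
    missing = os.path.join(folder, "no_such_file.txt")
    try:
        fid = open(missing, "r")
        fid.close()
        message = "file opened"
    except FileNotFoundError as err:
        # FileNotFoundError is a subclass of OSError.
        message = "cannot open file: " + type(err).__name__

print(message)
```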

Before you try the following code, create a file "abcd.txt" with the following lines in the same folder from where you are executing your script


ethernet0 status: up

ethernet1 status: down

ethernet2 status: up


1st method : Using read() function


fid = open ("abcd.txt","r")

tmp =  fid.read () 

print (tmp)


------------Output-----------

ethernet0 status: up

ethernet1 status: down

ethernet2 status: up


Here “fid.read()” returns the entire contents of the file “abcd.txt” as a string.

Example 2:
		
fid = open ("abcd.txt","r")

print ( fid.read(10) )


------------Output-----------

ethernet0


If you provide a number as an argument to read(), it will read at most that many characters from the current position (in binary mode the count is in bytes).
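Because each read() continues from where the previous one stopped, successive calls walk through the file. A minimal sketch, assuming a two-line sample file written into a temporary folder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "abcd.txt")
    fid = open(path, "w")
    fid.write("ethernet0 status: up\nethernet1 status: down\n")
    fid.close()

    fid = open(path, "r")
    first = fid.read(9)    # the first 9 characters
    second = fid.read(8)   # the next 8 characters
    rest = fid.read()      # everything that is left
    at_end = fid.read()    # nothing left, so an empty string
    fid.close()

print(repr(first))
print(repr(second))
print(repr(rest))
print(repr(at_end))
```

An empty string from read() is the usual sign that the end of the file has been reached.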

2nd method : Using readline() function

readline() will return one line at a time


fid = open ("abcd.txt","r")

print (fid.readline())

------------Output-----------

ethernet0 status: up
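Calling readline() repeatedly returns the following lines one after another, and an empty string once the end of the file is reached. The sketch below uses that to collect all lines; the sample file is created in a temporary folder only to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "abcd.txt")
    fid = open(path, "w")
    fid.write("ethernet0 status: up\nethernet1 status: down\nethernet2 status: up\n")
    fid.close()

    fid = open(path, "r")
    lines = []
    while True:
        line = fid.readline()
        if line == "":          # empty string means end of file
            break
        lines.append(line.rstrip("\n"))   # drop the trailing newline
    fid.close()

print(lines)
```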


3rd method : Using readlines()

		
fid = open ("abcd.txt","r")

print ( fid.readlines() )


------------Output-----------

['ethernet0 status: up\n', 'ethernet1 status: down\n', 'ethernet2 status: up\n']


readlines() will return the entire contents of a file in “LIST” format

Reading a particular line using readlines()

You can use “list indexing” on the value returned by readlines() to take out a particular line. List indexes start at 0, so the index is the line number minus one.

For example, if you want to print only the second line from the above file, try the following code.



fid = open ("abcd.txt","r")

tmp = fid.readlines ()

print (tmp[1])

# or 

fid = open ("abcd.txt","r")

print (fid.readlines()[1])


4th method : Reading line by line by looping over the file object


fid = open ("abcd.txt","r")

for tmp in fid:

	print (tmp)
	

------------Output-----------

ethernet0 status: up

ethernet1 status: down

ethernet2 status: up
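The blank lines in this output appear because every line read from the file still ends with "\n" and print() adds one more. A sketch of one way to avoid that, assuming the same three-line sample file (created here in a temporary folder):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "abcd.txt")
    fid = open(path, "w")
    fid.write("ethernet0 status: up\nethernet1 status: down\nethernet2 status: up\n")
    fid.close()

    fid = open(path, "r")
    cleaned = []
    for tmp in fid:
        # Each line still ends with "\n"; remove it before storing.
        cleaned.append(tmp.rstrip("\n"))
    fid.close()

for line in cleaned:
    print(line)
```

Another option is to keep the line unchanged and call print (tmp, end="") so that print does not add its own newline.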


5th method: Using list

	
fid = open ("abcd.txt","r")

print (list (fid))


------------Output-----------

['ethernet0 status: up\n', 'ethernet1 status: down\n', 'ethernet2 status: up\n']




Writing to a file using write() function

To open a file for writing use “w” mode. Note that the existing contents will be deleted when “w” mode is used.

		
fid = open ("abcd.txt", "w")

fid.write ("hello world")

fid.close()


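write() takes a single string as its argument, so you can build that string with format specifiers (as you would for the print function). Remember to close the file so that everything is written to disk.


a =['ethernet0 status: up',  'ethernet1 status: down',  'ethernet2 status: up']

fid = open ("abcd.txt", "w")

for tmp in a:
	fid.write ("%s \n" %tmp)

fid.close()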

------------Output-----------

ethernet0 status: up

ethernet1 status: down

ethernet2 status: up


Use writelines() to write a list of lines to a file

		
a =['ethernet0 status: up\n', 'ethernet1 status: down\n', 'ethernet2 status: up\n']

fid = open ("xyz.txt", "w")

fid.writelines(a)

fid.close()


------------Output-----------

ethernet0 status: up

ethernet1 status: down

ethernet2 status: up
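Note that writelines() does not add line endings by itself, which is why every string in the list above ends with "\n". A short sketch of the difference, using a temporary folder and a hypothetical list of interface names:

```python
import os
import tempfile

names = ["ethernet0", "ethernet1", "ethernet2"]   # hypothetical sample data

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "xyz.txt")

    # Without "\n" the items are written one after another on a single line.
    fid = open(path, "w")
    fid.writelines(names)
    fid.close()
    fid = open(path, "r")
    joined = fid.read()
    fid.close()

    # Adding "\n" to each item gives one line per item.
    fid = open(path, "w")
    fid.writelines([name + "\n" for name in names])
    fid.close()
    fid = open(path, "r")
    separated = fid.read()
    fid.close()

print(repr(joined))
print(repr(separated))
```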




Using seek() to move the position of file object

Assume that there is an invisible cursor moving through the file when you are reading or writing. Each read or write command changes the position of this cursor depending on how many bytes you read or write. The seek function is used to move this invisible cursor to a different position.

Examples:

fid.seek(5,0) will move the cursor to 5 bytes from the starting position.

fid.seek(5,1) will move the cursor 5 bytes forward from the current position.

fid.seek(-5,2) will move the cursor to 5 bytes before the end position.

Here the first argument is the offset and the second argument says where to measure it from: 0 means the start, 1 means the current position and 2 means the end.

If seek is used without the second argument, the offset is measured from the starting position.

For example, fid.seek(5) moves the cursor to 5 bytes from the starting position.

Note: A file opened in text mode can only seek relative to the start (the exception is jumping to the very end with fid.seek(0,2)). Nonzero offsets from the current or end position need a binary mode such as “rb”.
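The three reference points can be tried out on a small file. This is a sketch under a few assumptions: the file holds the ten bytes "0123456789" so that each byte shows its own position, it lives in a temporary folder, and it is opened in binary mode ("rb") because text mode rejects nonzero offsets from the current or end position.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "digits.bin")   # hypothetical file name
    fid = open(path, "wb")
    fid.write(b"0123456789")
    fid.close()

    fid = open(path, "rb")
    fid.seek(5)                  # 5 bytes from the start
    from_start = fid.read(1)     # reads the byte at position 5
    fid.seek(2, 1)               # cursor is at 6, skip 2 more bytes
    from_current = fid.read(1)   # reads the byte at position 8
    fid.seek(-3, 2)              # 3 bytes before the end
    from_end = fid.read()        # reads the last 3 bytes
    fid.close()

    # The same relative seek in text mode is rejected.
    fid = open(path, "r")
    try:
        fid.seek(2, 1)
        text_seek_allowed = True
    except io.UnsupportedOperation:
        text_seek_allowed = False
    fid.close()

print(from_start, from_current, from_end)
print(text_seek_allowed)
```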



Using tell() function:

The tell function can be used to get the current position of the cursor.

Syntax: fid.tell()

tell() will return a number which is the current position of the cursor.
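A brief sketch of tell() in action, assuming a ten-byte file opened in binary mode so that the returned number is a plain byte offset. The file name is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "digits.bin")   # hypothetical file name
    fid = open(path, "wb")
    fid.write(b"0123456789")
    fid.close()

    fid = open(path, "rb")
    at_open = fid.tell()      # cursor starts at the beginning
    fid.read(4)
    after_read = fid.tell()   # 4 bytes have been consumed
    fid.seek(0, 2)            # jump to the end of the file
    at_end = fid.tell()       # equals the file size in bytes
    fid.close()

print(at_open, after_read, at_end)
```

Seeking to the end and calling tell() is a simple way to find the size of a file.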


Difference between “r+” and “w+” modes.

"r+" allows reading and writing

If the file has some contents, you can read the contents first and then start writing.

If you read first, the cursor will be moved to the end of the file, so whatever you write is added after the existing contents instead of overwriting them.

	
fid = open ("abcd.txt" , "r+")

tmp = fid.read()

fid.write ("hello world")

fid.seek(0)

print (fid.read())

fid.close()


------------Output-----------

ethernet0 status: up 

ethernet1 status: down
 
ethernet2 status: up 

hello world


"w+" can also be used for reading and writing.

If you use “w+”, it will delete all contents first, so there will be nothing to read initially. You have to write something, then use the seek function to move the cursor back, and only then can you use the read function.

	
fid = open ("abcd.txt" , "w+")

fid.write ("hello world")

fid.seek(0)

print (fid.read())

fid.close()


------------Output-----------

hello world
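One more difference shows up when you write without reading first. With “r+” the cursor starts at the beginning, so the new text overwrites the old text character by character, while “w+” has already emptied the file. A sketch, assuming a small sample file in a temporary folder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "abcd.txt")
    fid = open(path, "w")
    fid.write("abcdefghij")
    fid.close()

    # "r+": writing straight away overwrites from position 0.
    fid = open(path, "r+")
    fid.write("XYZ")
    fid.seek(0)
    after_r_plus = fid.read()
    fid.close()

    # "w+": the file is emptied as soon as it is opened.
    fid = open(path, "w+")
    after_w_plus = fid.read()
    fid.close()

print(repr(after_r_plus))
print(repr(after_w_plus))
```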


Python file types:

In Python, files are classified into two categories: text or binary.

A text file contains a group of characters or set of lines in the form of text strings. (The files used in the above examples are all text files).

The end of every line is marked with a special character sequence called EOL (end of line). The characters used for EOL depend on the platform.

EOL in Windows is “\r\n”

EOL in Unix/Linux is “\n”

So when Python runs on a Windows machine, it has to convert “\n” to “\r\n” while writing to a text file, and vice versa while reading.

Text strings are stored in a file using an encoding such as “UTF-8” or ASCII. When we use a read or write command, Python decodes/encodes the data in the background along with the EOL conversions.
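Both conversions can be made visible by writing in text mode and then looking at the stored bytes in binary mode. This is only a sketch: it passes newline="\r\n" to open() to force Windows-style line endings on any platform, and the file name and sample word are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "eol_demo.txt")   # hypothetical file name

    # newline="\r\n" makes every "\n" written come out as "\r\n".
    fid = open(path, "w", encoding="utf-8", newline="\r\n")
    fid.write("caf\u00e9\n")
    fid.close()

    # Binary mode shows the bytes actually stored on disk.
    fid = open(path, "rb")
    raw = fid.read()
    fid.close()

    # Text mode decodes UTF-8 and converts "\r\n" back to "\n".
    fid = open(path, "r", encoding="utf-8")
    text = fid.read()
    fid.close()

print(raw)
print(repr(text))
```

The accented letter takes two bytes in UTF-8 and the line ending takes two bytes on disk, yet the text read back is the same five-character string that was written.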

Binary files:

Any non-text file, such as an image, executable or audio file, is considered a binary file. Python will neither decode/encode nor convert EOL when working with binary files.

When a binary file is read, the return value is a byte string (bytes), not a text string. When writing, we have to provide the input as a bytes-like object (bytes or bytearray).

Modes to work with binary files : “rb” , “wb”, “ab” , “r+b” , “w+b” , “a+b”
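A minimal sketch of writing and reading in binary mode. The file name and the four byte values are arbitrary choices for illustration.

```python
import os
import tempfile

payload = bytes([0, 255, 16, 32])   # arbitrary sample bytes

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.bin")   # hypothetical file name

    fid = open(path, "wb")
    fid.write(payload)
    # A text string is not accepted by a file opened in binary mode.
    try:
        fid.write("text")
        str_accepted = True
    except TypeError:
        str_accepted = False
    fid.close()

    fid = open(path, "rb")
    data = fid.read()
    fid.close()

print(type(data).__name__)
print(data == payload)
print(str_accepted)
```

To store text in a binary file, encode it first, for example fid.write("text".encode("utf-8")).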

Buffering:

When Python reads the contents of a file, it reads a chunk of data and keeps it in a buffer. Further reads take data from this buffer, which is refilled as we read. This is more efficient than frequently reading from the hard disk or other storage through the operating system’s file interface.

Commonly seen buffer sizes are 4096 or 8192 bytes. You can check the default with the following code

	
import io

print (io.DEFAULT_BUFFER_SIZE)


------------Output-----------

8192


Syntax of the open() function with buffering:

fid = open ("abcd.txt" , "r",buffering=100)

buffering = 0 - Means buffering is disabled

buffering = 1 - Means “line buffering” (text mode only)

buffering = n - Means buffer size is “n” bytes (n should be > 1)

buffering = -1 - Means buffering is enabled and the default buffer size is used.

Note: Buffering can be disabled only for files opened in binary mode
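The rules above can be checked directly. The following sketch assumes a small file in a temporary folder and inspects what open() returns for different buffering values.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "abcd.txt")
    fid = open(path, "w")
    fid.write("ethernet0 status: up\n")
    fid.close()

    # Text mode does not allow buffering to be switched off.
    try:
        fid = open(path, "r", buffering=0)
        fid.close()
        text_unbuffered_ok = True
    except ValueError:
        text_unbuffered_ok = False

    # Binary mode does; the result is a raw, unbuffered file object.
    fid = open(path, "rb", buffering=0)
    is_raw = isinstance(fid, io.FileIO)
    fid.close()

    # buffering=1 selects line buffering in text mode.
    fid = open(path, "r", buffering=1)
    line_buffered = fid.line_buffering
    fid.close()

print(text_unbuffered_ok, is_raw, line_buffered)
```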



Some more file commands:

	
>>> fid.encoding
'UTF-8'

>>> fid.mode
'r'

>>> fid.readable()
True

>>> fid.writable()
False

>>> fid.seekable()
True

>>> fid.line_buffering
False

>>> fid.name
'abcd.txt'




Python Exercises

1. Write a Python program to count the number of words in a file

2. Write a Python program to count the number of lines in a file

3. Write a Python program to copy the contents of one file to another

4. Write a Python program to print from line 2 to line 5 (assuming the file has more than 5 lines)

5. Write a Python program to insert a new line at the beginning of a file

6. Write a Python program to replace one line of a file with another

7. Write a Python program to store the lines of a file in a list

8. Write a Python program to print the last two lines of a file

Router# show ip interface brief
Interface     IP-Address     OK?  Method  Status                  Protocol
Ethernet0     10.108.00.5    YES  NVRAM   up                      up      
Ethernet1     unassigned     YES  unset   administratively down   down    
Loopback0     10.108.200.5   YES  NVRAM   up                      up      
Serial0       10.108.100.5   YES  NVRAM   up                      up      

Store the above lines in a file and:

9.1. Write a Python program to check whether a given IP address is present in the file

9.2. Write a Python program to print the status of a given interface

9.3. Write a Python program to find how many interfaces are up

9.4. Write a Python program to print all interfaces which are up

