
Iterating over lines in a file in Python

I have seen these two ways to process a file:

file = open("file.txt")
for line in file:
    pass  # do something

file = open("file.txt")
contents = file.read()
for line in contents:
    pass  # do something

I know that in the first case, the file will act like a list, so the for loop iterates over the file as if it were a list. What exactly happens in the second case, where we read the file and then iterate over the contents? What are the consequences of taking each approach, and how should I choose between them?


Answer

In the first one, you are iterating over the file line by line. The entire file is not read into memory at once; Python reads it in buffered chunks and hands you one line at a time. This is useful for very large files, and it is the more robust choice when you don’t know how large the file will be.
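
As a rough sketch of that first approach, the snippet below counts the non-blank lines of a file while only ever holding one line at a time. The helper name count_nonblank_lines and the temporary sample file are made up for illustration; the with block is an addition to the question's code so that the file is closed when the loop finishes.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Count non-blank lines, holding only one line in memory at a time."""
    count = 0
    # The with block closes the file even if the loop raises an exception.
    with open(path, "r") as f:
        for line in f:  # each line still carries its trailing "\n", if any
            if line.strip():
                count += 1
    return count


# Build a small sample file so the sketch runs on its own (hypothetical data).
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("Hello\n\nBye")

nonblank = count_nonblank_lines(sample_path)
os.remove(sample_path)
print(nonblank)
```

Because the loop never builds a list of lines, memory use stays roughly constant regardless of file size, as long as no single line is huge.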

In the second one, file.read() reads the complete file into memory and returns it as a single string. Iterating over that string goes character by character, not line by line.

Here’s an example to show this behavior.

The file a.txt contains:

Hello
Bye

Code:

>>> f = open('a.txt','r')
>>> for l in f:
...     print(l)
...
Hello

Bye


>>> f = open('a.txt','r')
>>> r = f.read()
>>> print(repr(r))
'Hello\nBye'
>>> for c in r:
...     print(c)
...
H
e
l
l
o


B
y
e
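
If you do want the whole file in memory but still want to loop over lines rather than characters, one option is to split the string returned by read(). This is a minimal sketch, assuming the file is small enough to fit in memory comfortably; the temporary file stands in for a.txt and is created only so the snippet runs on its own.

```python
import os
import tempfile

# Recreate the two-line sample from the example above (no trailing newline).
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("Hello\nBye")

with open(sample_path, "r") as f:
    contents = f.read()  # the whole file as one string
os.remove(sample_path)

chars = list(contents)         # iterating a str yields single characters
lines = contents.splitlines()  # split on line boundaries, newlines removed

print(len(chars))
print(lines)
```

Here splitlines() gives a list of lines without their newline characters, which is usually what a loop written as "for line in contents" was meant to see.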