Python split function. Too many values to unpack error

I have a Python function that must read data from a file, split each line into a key and a value, and store the pair in a dictionary. Example file:

http://google.com 2
http://python.org 3
# and so on a lot of data

I use the split function for this, but when there is a lot of data it raises a ValueError:

ValueError: too many values to unpack

What can I do about this?

This is the exact code that fails

with open(urls_file_path, "r") as f:
    for line in f.readlines():
        url, count = line.split()  # fails here
        url_dict[url] = int(count)

Answer

You are trying to unpack the list returned by split into exactly two variables.

url, count = line.split()

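What if a line contains no space, or more than one? Then split returns one item, or three or more, and there is nowhere for the missing or extra items to go, so the unpacking raises ValueError.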

data = "abcd"
print(data.split())    # ['abcd']
data = "ab cd"
print(data.split())    # ['ab', 'cd']
data = "a b c d"
print(data.split())    # ['a', 'b', 'c', 'd']

You can check the length of the list before unpacking it:

with open(urls_file_path, "r") as f:
    for idx, line in enumerate(f, 1):
        split_list = line.split()
        if len(split_list) != 2:
            raise ValueError("Line {}: '{}' has {} spaces, expected 1"
                .format(idx, line.rstrip(), len(split_list) - 1))
        else:
            url, count = split_list
            print("Read Data: {} {}".format(url, count))

With the input file,

http://google.com 2
http://python.org 3
http://python.org 4 Welcome
http://python.org 5

This program produces,

$ python Test.py
Read Data: http://google.com 2
Read Data: http://python.org 3
Traceback (most recent call last):
  File "Test.py", line 6, in <module>
    .format(idx, line.rstrip(), len(split_list) - 1))
ValueError: Line 3: 'http://python.org 4 Welcome' has 2 spaces, expected 1
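
If stopping at the first malformed line is too strict, the same length check can be used to skip bad lines and still build the dictionary from the question. The following is only a sketch in Python 3: the function name load_url_counts and the choice to collect rejected lines in a list are assumptions made for illustration, not part of the original code.

```python
def load_url_counts(lines):
    """Build {url: count} from an iterable of lines, collecting malformed ones.

    Hypothetical helper: takes any iterable of strings, e.g. an open file.
    """
    url_dict = {}
    bad_lines = []  # (line number, stripped text) for every rejected line
    for idx, line in enumerate(lines, 1):
        parts = line.split()
        if not parts:
            continue  # assumption: blank lines are silently ignored
        if len(parts) != 2:
            bad_lines.append((idx, line.rstrip()))
            continue
        url, count = parts
        try:
            url_dict[url] = int(count)
        except ValueError:
            # Two fields, but the second one is not an integer.
            bad_lines.append((idx, line.rstrip()))
    return url_dict, bad_lines


sample = [
    "http://google.com 2\n",
    "http://python.org 3\n",
    "http://python.org 4 Welcome\n",
    "http://python.org 5\n",
]
url_dict, bad_lines = load_url_counts(sample)
print(url_dict)
print(bad_lines)
```

With a real file, the call would look like load_url_counts(f) inside the with open(...) block. Note that a URL appearing twice keeps only its last count, exactly as in the original loop.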

Following @abarnert's comment, you can use the str.partition method like this:

url, _, count = data.partition(" ")

If the string contains more than one space, count will hold everything after the first space; if it contains no space, count will be an empty string.
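
A short Python 3 sketch of the three cases may make this clearer. The sample strings are made up for illustration; partition itself never raises here, but note that int(count) would still fail for the second and third cases, so the value needs checking before conversion.

```python
# Illustrative inputs: exactly one space, more than one space, no space.
samples = [
    "http://google.com 2",
    "http://python.org 4 Welcome",
    "http://python.org",
]

results = []
for data in samples:
    # partition always returns a 3-tuple: (head, separator, tail)
    url, sep, count = data.partition(" ")
    results.append((url, sep, count))
    print(repr(url), repr(sep), repr(count))
```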

If you are using Python 3.x, you can do something like this

first, second, *rest = data.split()

The first two values are assigned to first and second, and whatever remains of the list is assigned to rest (an empty list if nothing remains). This still raises ValueError if the line has fewer than two values.
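
As a rough illustration of that behaviour in Python 3, the sketch below wraps the starred assignment in a small function. The name first_two is hypothetical and exists only to make the cases easy to compare.

```python
def first_two(data):
    # Starred target soaks up any items beyond the first two.
    first, second, *rest = data.split()
    return first, second, rest


print(first_two("http://python.org 4 Welcome"))
print(first_two("http://google.com 2"))

# A line with fewer than two fields still cannot be unpacked.
try:
    first_two("abcd")
    failed = False
except ValueError as exc:
    failed = True
    print("still fails:", exc)
```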
