
python: class vs tuple huge memory overhead (?)

I’m storing a lot of complex data in tuples/lists, but would prefer to use small wrapper classes to make the data structures easier to understand, e.g.

class Person:
    def __init__(self, first, last):
        self.first = first
        self.last = last

p = Person('foo', 'bar')

would be preferable over

p = ['foo', 'bar']

however there seems to be a horrible memory overhead:

l = [Person('foo', 'bar') for i in range(10000000)]
# ipython now takes 1.7 GB RAM


del l
l = [('foo', 'bar') for i in range(10000000)]
# now just 118 MB RAM

Why? Is there any obvious alternative solution that I didn’t think of?


(I know, in this example the ‘wrapper’ class looks silly. But when the data becomes more complex and nested, it is more useful)



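As others have said in their answers, you’ll have to generate different objects for the comparison to make sense. In the question, the literal ('foo', 'bar') is a constant, so the list ends up holding 10 million references to one and the same tuple, while every Person('foo', 'bar') call creates a new object.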
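To see why the original comparison is skewed, here is a minimal sketch that counts how many distinct objects each list actually holds. It relies on CPython storing a tuple of literals as a single constant, which is an implementation detail and not a language guarantee; the list length of 1000 is an arbitrary choice for illustration.

```python
class Person:
    def __init__(self, first, last):
        self.first = first
        self.last = last

# The literal tuple is a constant of the comprehension, so (on CPython)
# every list slot refers to the same tuple object.
same = [('foo', 'bar') for i in range(1000)]
print(len({id(t) for t in same}))

# Tuples built from the loop variable are created anew on each iteration.
distinct = [(i, i) for i in range(1000)]
print(len({id(t) for t in distinct}))

# A constructor call always builds a new instance, even with equal arguments.
people = [Person('foo', 'bar') for i in range(1000)]
print(len({id(p) for p in people}))
```

On CPython I would expect the first count to be 1 and the other two to be 1000, which is why the tuple list in the question looked so cheap.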

So, let’s compare some approaches.


tuple

l = [(i, i) for i in range(10000000)]
# memory taken by Python 3: 1.0 GB

class Person

class Person:
    def __init__(self, first, last):
        self.first = first
        self.last = last

l = [Person(i, i) for i in range(10000000)]
# memory: 2.0 GB

namedtuple (tuple + __slots__)

from collections import namedtuple
Person = namedtuple('Person', 'first last')

l = [Person(i, i) for i in range(10000000)]
# memory: 1.1 GB

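namedtuple is basically a class that extends tuple and sets an empty __slots__, so instances carry no per-instance __dict__. The values live in the tuple itself; the class only adds getters for the named fields and some other helper methods (before Python 3.7 you could see the exact generated code by calling it with verbose=True; that parameter has since been removed).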
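The structure described above can be checked interactively. This is a small sketch using only documented namedtuple behaviour; the field values are placeholders.

```python
from collections import namedtuple

Person = namedtuple('Person', 'first last')
p = Person('foo', 'bar')

# The generated class is a tuple subclass with an empty __slots__,
# so instances have no per-instance __dict__.
print(issubclass(Person, tuple))
print(Person.__slots__)
print(hasattr(p, '__dict__'))

# Named access is a class-level getter that reads from the tuple positions.
print(p.first == p[0], p.last == p[1])

# Some of the helper methods that namedtuple adds.
print(p._asdict())
print(p._replace(last='baz'))
```

Because the data is stored in the tuple, instances are also immutable: assigning to p.first raises AttributeError.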

class Person + __slots__

class Person:
    __slots__ = ['first', 'last']
    def __init__(self, first, last):
        self.first = first
        self.last = last

l = [Person(i, i) for i in range(10000000)]
# memory: 0.9 GB

This is a trimmed-down version of the namedtuple above. A clear winner: at 0.9 GB it takes even less memory than pure tuples (1.0 GB).
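If you want to repeat the comparison on your own machine, one option is tracemalloc from the standard library. The sketch below is not how the figures above were obtained (those are process memory readings); it counts only allocations made by Python, uses a much smaller list, and the names PlainPerson, SlotsPerson, TuplePerson and measure are made up for this example. Absolute numbers will differ between Python versions.

```python
import tracemalloc
from collections import namedtuple


class PlainPerson:
    def __init__(self, first, last):
        self.first = first
        self.last = last


class SlotsPerson:
    __slots__ = ['first', 'last']

    def __init__(self, first, last):
        self.first = first
        self.last = last


TuplePerson = namedtuple('TuplePerson', 'first last')


def measure(factory, n=100_000):
    """Return the bytes tracemalloc attributes to building a list of n objects."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    data = [factory(i, i) for i in range(n)]  # distinct objects, as in the answer
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del data
    return after - before


results = {
    'tuple': measure(lambda first, last: (first, last)),
    'plain class': measure(PlainPerson),
    'namedtuple': measure(TuplePerson),
    'slots class': measure(SlotsPerson),
}

for name, size in results.items():
    print(f'{name:12} {size / 1e6:6.1f} MB')
```

The saving comes from the slotted class having no per-instance __dict__; the trade-off is that you cannot add attributes that are not listed in __slots__.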

User contributions licensed under: CC BY-SA