Do attribute names consume memory on a per-instance basis in Python?

Suppose I have millions of objects, each with 3 __slots__.

Is it more memory efficient to have short slot names like x vs. long like would_you_like_fries_with_that_cheeseburger?

Or are the names allocated only once per class (as opposed to once per instance)?

Answer

Names for slots only take memory per class, not per instance.

Slots use descriptors that map directly into the memory reserved for an instance, and the attribute names are mapped to these descriptors on the class.

Thus, the length of the names has no influence on how much memory each instance uses for its slots. The names only take space in the __dict__ attribute on the class (which maps each name to its descriptor) and in the descriptor object itself (which uses the name for its string representation and in error messages). The string is even interned.
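The claim above can be checked from Python itself. The following is a minimal sketch, assuming CPython; the class names Short and Long are made up for illustration. It compares the per-instance size for short and long slot names and then looks up where a name actually lives:

```python
import sys
import types


class Short:
    __slots__ = ("x", "y", "z")


class Long:
    __slots__ = (
        "would_you_like_fries_with_that_cheeseburger",
        "would_you_like_a_drink_with_that_cheeseburger",
        "would_you_like_anything_else_with_that_cheeseburger",
    )


# Per-instance size does not depend on how long the slot names are.
short_size = sys.getsizeof(Short())
long_size = sys.getsizeof(Long())
print(short_size, long_size)  # the two numbers match

# The name is stored once, on the class: it is a key in the class
# __dict__ that maps to a member descriptor.
descriptor = Long.__dict__["would_you_like_fries_with_that_cheeseburger"]
print(type(descriptor) is types.MemberDescriptorType)
print(descriptor.__name__)
```

Note that sys.getsizeof() only reports the instance's own footprint, not the objects the slots point to, which is exactly the part the slot names could have affected.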

You can check how the custom descriptors are given their state in the C source code for type.__new__() (responsible for creating class objects):

if (et->ht_slots != NULL) {
    for (i = 0; i < nslots; i++, mp++) {
        mp->name = PyUnicode_AsUTF8(
            PyTuple_GET_ITEM(et->ht_slots, i));
        if (mp->name == NULL)
            goto error;
        mp->type = T_OBJECT_EX;
        mp->offset = slotoffset;

        /* __dict__ and __weakref__ are already filtered out */
        assert(strcmp(mp->name, "__dict__") != 0);
        assert(strcmp(mp->name, "__weakref__") != 0);

        slotoffset += sizeof(PyObject *);
    }
}

where mp->offset is the byte offset of the slot within the memory for the instance.
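The slotoffset += sizeof(PyObject *) step means each slot costs exactly one pointer per instance. A rough way to observe this from Python, assuming CPython (the class names NoSlots and ThreeSlots are hypothetical):

```python
import struct

# Size of a C pointer on this build, i.e. sizeof(PyObject *).
POINTER_SIZE = struct.calcsize("P")


class NoSlots:
    __slots__ = ()


class ThreeSlots:
    __slots__ = ("a", "b", "c")


# __basicsize__ is the fixed size of an instance's C struct; each slot
# adds one pointer-sized field to it, regardless of the slot's name.
extra_bytes = ThreeSlots.__basicsize__ - NoSlots.__basicsize__
print(extra_bytes, 3 * POINTER_SIZE)
```

On a typical 64-bit build both printed values would be 24, but the exact number depends on the platform's pointer size.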

The descriptor used is a PyMemberDescr_Type object, whose member_get function uses the (very generic) PyMember_GetOne() function; it adds the offset to the address of the instance to locate the stored pointer:

PyObject *
PyMember_GetOne(const char *addr, PyMemberDef *l)
{
    PyObject *v;

    addr += l->offset;

addr is the memory address of the instance. The rest of the function deals with various types of members; the slot member is always set to type T_OBJECT_EX:

case T_OBJECT_EX:
    v = *(PyObject **)addr;
    if (v == NULL)
        PyErr_SetString(PyExc_AttributeError, l->name);
    Py_XINCREF(v);
    break;

The function then returns v; if the attribute was never set (and so v == NULL), an AttributeError exception is raised instead.
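The NULL check is visible from Python as well. A small sketch (the Point class and the is_set helper are made up for illustration) showing that a slot that was never assigned, or that was deleted, raises AttributeError on access:

```python
class Point:
    __slots__ = ("x", "y")


def is_set(obj, name):
    """Return True if the slot currently holds a value."""
    try:
        getattr(obj, name)
    except AttributeError:
        # The slot's pointer is NULL: never assigned, or deleted.
        return False
    return True


p = Point()
p.x = 1

x_after_assign = is_set(p, "x")
y_never_assigned = is_set(p, "y")
del p.x  # deleting the attribute clears the slot again
x_after_delete = is_set(p, "x")

print(x_after_assign, y_never_assigned, x_after_delete)  # True False False
```

The space for the slot is reserved in the instance either way; an unset slot is simply an empty pointer.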

User contributions licensed under: CC BY-SA