Why can sub-module names be accessed in __init__.py even without explicitly importing them?

The issue:

pkg/
    __init__.py
    sub1.py
    sub2.py


$ cat pkg/__init__.py
from .sub2 import *
print("init", dir())

$ cat pkg/sub1.py
from .sub2 import *
print("sub1", dir())

$ cat pkg/sub2.py
def spam():
    ...

$ python -c "import pkg"
init [... 'spam', 'sub2']

$ python -c "import pkg.sub1"
init [... 'spam', 'sub2']
sub1 [... 'spam']

Note how sub2 is in the namespace of pkg, even though I never import the name sub2 itself. I would expect only the names defined inside sub2 to be imported. Why is that not the case? It seems to have something to do with importing a package vs. importing a module, because:

$ python -c "import pkg.__init__"
init [... 'spam', 'sub2']
init [... 'spam']

It also seems to confuse mypy. If I edit __init__.py to access sub2 explicitly:

$ cat pkg/__init__.py
from .sub2 import *
print(sub2)

Then running mypy pkg gives:

pkg/__init__.py:2: error: Name "sub2" is not defined
Found 1 error in 1 file (checked 3 source files)

Why is this happening? Is this a documented feature? I should note that this “feature” is used in the CPython source; see, for example, Lib/asyncio/__init__.py.

Answer

This is a quirk of submodules, but it is documented behavior. The import system section of the Python language reference says:

When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.

Given Python’s familiar name binding rules this might seem surprising, but it’s actually a fundamental feature of the import system. The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.
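
The rule quoted above can be checked directly. The following is only a sketch, not part of the original question: it writes a throwaway package called demo_pkg (a hypothetical name chosen for illustration) with the same three files into a temporary directory, imports it, and inspects the resulting namespaces.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG_NAME = "demo_pkg"  # hypothetical package name, mirrors pkg/ from the question


def inspect_namespaces():
    """Build the package on disk, import it, and report which names exist where."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / PKG_NAME
        pkg_dir.mkdir()
        # Same layout as in the question, minus the print() calls.
        (pkg_dir / "__init__.py").write_text("from .sub2 import *\n")
        (pkg_dir / "sub1.py").write_text("from .sub2 import *\n")
        (pkg_dir / "sub2.py").write_text("def spam():\n    ...\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module(PKG_NAME)
            sub1 = importlib.import_module(PKG_NAME + ".sub1")
            return {
                # Names copied into the package by the star import.
                "pkg_has_spam": "spam" in vars(pkg),
                # Binding added by the import system, not by the star import.
                "pkg_has_sub2": "sub2" in vars(pkg),
                # The invariant: the attribute is the object in sys.modules.
                "same_object": pkg.sub2 is sys.modules[PKG_NAME + ".sub2"],
                # sub1 is not the parent of sub2, so it gets no such binding.
                "sub1_has_spam": "spam" in vars(sub1),
                "sub1_has_sub2": "sub2" in vars(sub1),
            }
        finally:
            # Undo the changes so the sketch leaves no trace behind.
            sys.path.remove(tmp)
            for name in list(sys.modules):
                if name == PKG_NAME or name.startswith(PKG_NAME + "."):
                    del sys.modules[name]


if __name__ == "__main__":
    for check, value in inspect_namespaces().items():
        print(check, value)
```

The sub2 binding lands on the package because the import system sets it on the parent module as a side effect of loading pkg.sub2. A sibling such as sub1 only receives the names that the star import copies, which is why its dir() shows spam but not sub2.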
