The issue:
pkg/
    __init__.py
    sub1.py
    sub2.py
$ cat pkg/__init__.py
from .sub2 import *
print("init", dir())
$ cat pkg/sub1.py
from .sub2 import *
print("sub1", dir())
$ cat pkg/sub2.py
def spam():
    pass
$ python -c "import pkg"
init [ 'spam', 'sub2']
$ python -c "import pkg.sub1"
init [ 'spam', 'sub2']
sub1 [ 'spam']
Note how sub2 is in the namespace of pkg, even though I don’t actually import it. I would expect only the names inside sub2 to be imported. Why is that not the case? I see that it has something to do with importing a package vs. importing a module, because:
$ python -c "import pkg.__init__"
init [ 'spam', 'sub2']
init [ 'spam']
It also seems to confuse mypy; I edit __init__.py to explicitly access sub2:
$ cat pkg/__init__.py
from .sub2 import *
print(sub2)
Then running mypy pkg gives:
pkg/__init__.py:2: error: Name "sub2" is not defined
Found 1 error in 1 file (checked 3 source files)
Why is this happening? Is this a documented feature? I should note that this “feature” is used in the CPython source; check, for example, Lib/asyncio/__init__.py.
Answer
This is a bit of a quirk of submodules, but it is documented behavior. From the "Submodules" section of the Python import system documentation:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
…
Given Python’s familiar name binding rules this might seem surprising, but it’s actually a fundamental feature of the import system. The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.
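To see the invariant in action, here is a minimal sketch that rebuilds the layout from the question in a temporary directory and checks where the submodule gets bound. The package name demo_pkg, the PUBLIC variable and the run_demo helper are made up for this illustration; they are not part of the question's code or of any library.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG_NAME = "demo_pkg"  # hypothetical name, stands in for "pkg" from the question


def run_demo() -> dict:
    """Build a throwaway package, import it, and report what got bound where."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / PKG_NAME
        pkg_dir.mkdir()

        # Same star import as in the question; PUBLIC records the public
        # names visible in the module right after the import ran.
        body = (
            "from .sub2 import *\n"
            'PUBLIC = [n for n in dir() if not n.startswith("_")]\n'
        )
        (pkg_dir / "__init__.py").write_text(body)
        (pkg_dir / "sub1.py").write_text(body)
        (pkg_dir / "sub2.py").write_text("def spam():\n    pass\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module(PKG_NAME)
            sub1 = importlib.import_module(f"{PKG_NAME}.sub1")
        finally:
            sys.path.remove(tmp)

        return {
            # Names seen inside __init__.py: the star-imported function plus
            # the submodule binding added by the import system.
            "init": pkg.PUBLIC,
            # Names seen inside sub1.py: only the star-imported function.
            "sub1": sub1.PUBLIC,
            # The invariant from the docs: parent attribute is the sys.modules entry.
            "invariant": pkg.sub2 is sys.modules[f"{PKG_NAME}.sub2"],
            # The binding goes on the parent package, not on the importing module.
            "sub1_has_sub2": hasattr(sub1, "sub2"),
        }


result = run_demo()
print("init", result["init"])            # init ['spam', 'sub2']
print("sub1", result["sub1"])            # sub1 ['spam']
print("invariant", result["invariant"])  # invariant True
```

The binding lands on the parent package only, which is why sub1 sees just spam while __init__.py also sees sub2: in __init__.py the importing namespace and the parent package's namespace happen to be the same object.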