I have this code, which type checks fine on its own:
import pandas as pd
from typing import Tuple, Type
def df2objects(df: pd.DataFrame, object_type: Type[BaseObject]) -> Tuple[BaseObject, ...]:
    return tuple(object_type(**kwargs) for kwargs in df.to_dict(orient='records'))
But if I try to use it:
def use_it(df: pd.DataFrame) -> Tuple[DerivedObject, ...]:
    return df2objects(df, DerivedObject)
I get
Incompatible return value type (got "Tuple[BaseObject, ...]", expected "Tuple[DerivedObject, ...]") mypy(error)
I can fix this with
import typing
def use_it(df: pd.DataFrame) -> Tuple[DerivedObject, ...]:
    return typing.cast(Tuple[DerivedObject, ...], df2objects(df, DerivedObject))
But what I would really like to do is specify that df2objects returns a Tuple of object_type, like this:
def df2objects(df: pd.DataFrame, object_type: Type[BaseObject]) -> Tuple[object_type, ...]:
or this:
def df2objects(df: pd.DataFrame, object_type: Type[BaseObject]) -> Tuple[BaseObject, ...]:
    return typing.cast(Tuple[object_type, ...], tuple(object_type(**kwargs) for kwargs in df.to_dict(orient='records')))
Of course, neither of those actually works.
Answer
As was alluded to in the comments, using a TypeVar is the solution here, I believe. Assuming DerivedObject is a subclass of BaseObject, the following should work:
import pandas as pd
from typing import Type, TypeVar, Tuple
T = TypeVar('T', bound=BaseObject)
def df2objects(df: pd.DataFrame, object_type: Type[T]) -> Tuple[T, ...]:
    return tuple(object_type(**kwargs) for kwargs in df.to_dict(orient='records'))
def use_it(df: pd.DataFrame) -> Tuple[DerivedObject, ...]:
    return df2objects(df, DerivedObject)
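By using the bound keyword argument, we restrict the possible types the TypeVar T could be to BaseObject and its subclasses. Because T appears in both the parameter type Type[T] and the return type Tuple[T, ...], mypy infers T from the class passed at each call site, so passing DerivedObject gives a return type of Tuple[DerivedObject, ...].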
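To try the idea without pandas installed, here is a minimal self-contained sketch. The BaseObject and DerivedObject dataclasses and the records2objects name are hypothetical stand-ins, not from the question, and the function takes the list of dicts that df.to_dict(orient='records') would produce instead of a DataFrame.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple, Type, TypeVar


@dataclass
class BaseObject:
    # Hypothetical stand-in for the question's BaseObject.
    name: str


@dataclass
class DerivedObject(BaseObject):
    # Hypothetical subclass that adds one field with a default.
    value: int = 0


# T may be BaseObject or any subclass of it.
T = TypeVar('T', bound=BaseObject)


def records2objects(records: List[Dict[str, Any]], object_type: Type[T]) -> Tuple[T, ...]:
    # Same body as df2objects, but fed the list of dicts directly
    # (an assumption made so the sketch runs without pandas).
    return tuple(object_type(**kwargs) for kwargs in records)


records = [{'name': 'a', 'value': 1}, {'name': 'b', 'value': 2}]

# T is solved from the argument, so this is Tuple[DerivedObject, ...]
# as far as the type checker is concerned; no cast is needed.
derived: Tuple[DerivedObject, ...] = records2objects(records, DerivedObject)
```

With the bound in place, a call such as records2objects(records, int) should be rejected by mypy, since int is not a subclass of BaseObject.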
For more detail, see the mypy documentation on generic functions, the docs for TypeVar, and the information on bound TypeVars (type variables with an upper bound).