I know I can specify the dtypes when reading CSVs into a Dask DataFrame:

ddf = dd.read_csv("data/csvs/*.part", dtype=better_dtypes)
Is there an equally easy way to convert all the columns of a Dask DataFrame (converted from a pandas DataFrame) using a dictionary? I have a dictionary as follows:
better_dtypes = {
    "id1": "string[pyarrow]",
    "id2": "string[pyarrow]",
    "id3": "string[pyarrow]",
    "id4": "int64",
    "id5": "int64",
    "id6": "int64",
    "v1": "int64",
    "v2": "int64",
    "v3": "float64",
}
and I would like to convert the pandas/Dask DataFrame dtypes all at once to the dtypes suggested in the dictionary. This is what I tried:

ddf = ddf.astype(better_dtypes).dtypes
Answer
Not sure if I understand the question correctly, but converting the dtypes can indeed be done with .astype (as you wrote), except you would want to remove the trailing .dtypes from the assignment. .dtypes returns a Series of the column dtypes rather than the converted DataFrame, so assigning it would overwrite ddf with that Series:
# this will store the converted ddf
ddf = ddf.astype(better_dtypes)
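
For context, here is a minimal end-to-end sketch of the same idea. The column names and sample values below are made up for illustration (they only cover a subset of your dictionary), and "string[pyarrow]" requires pyarrow to be installed:

import pandas as pd
import dask.dataframe as dd

# hypothetical sample data; in practice the pandas DataFrame comes from elsewhere
df = pd.DataFrame({
    "id1": ["a", "b"],
    "id4": [1, 2],
    "v3": [1.5, 2.5],
})

better_dtypes = {
    "id1": "string[pyarrow]",
    "id4": "int64",
    "v3": "float64",
}

# convert the pandas DataFrame to a Dask DataFrame
ddf = dd.from_pandas(df, npartitions=1)

# .astype accepts a {column: dtype} mapping and returns a new, lazily evaluated Dask DataFrame
ddf = ddf.astype(better_dtypes)

# inspecting the dtypes does not require .compute()
print(ddf.dtypes)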