I know I can specify dtypes with a dictionary when reading CSV files with Dask:

ddf = dd.read_csv("data/csvs/*.part", dtype=better_dtypes)

Is there an easy equivalent way to convert all columns in a Dask DataFrame (converted from a pandas DataFrame) using a dictionary? I have a dictionary as follows:
better_dtypes = {
    "id1": "string[pyarrow]",
    "id2": "string[pyarrow]",
    "id3": "string[pyarrow]",
    "id4": "int64",
    "id5": "int64",
    "id6": "int64",
    "v1": "int64",
    "v2": "int64",
    "v3": "float64",
}
and I would like to convert all of the pandas/Dask DataFrame dtypes at once to the dtypes suggested in the dictionary. This is what I have tried:

ddf = ddf.astype(better_dtypes).dtypes
Answer
Not sure if I understand the question correctly, but the conversion of dtypes can be done using .astype (as you wrote), except you would want to remove .dtypes from the assignment: .dtypes is a property that returns the column dtypes, so your version stores that Series rather than the converted DataFrame:
# this will store the converted ddf
ddf = ddf.astype(better_dtypes)
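For completeness, here is a minimal, self-contained sketch of the same idea, assuming the pandas DataFrame is converted with dd.from_pandas; the sample column values below are made up purely for illustration:

import dask.dataframe as dd
import pandas as pd

better_dtypes = {
    "id1": "string[pyarrow]",
    "id2": "string[pyarrow]",
    "id3": "string[pyarrow]",
    "id4": "int64",
    "id5": "int64",
    "id6": "int64",
    "v1": "int64",
    "v2": "int64",
    "v3": "float64",
}

# hypothetical sample data with the same columns, just for illustration
pdf = pd.DataFrame({
    "id1": ["a", "b"],
    "id2": ["c", "d"],
    "id3": ["e", "f"],
    "id4": [1, 2],
    "id5": [3, 4],
    "id6": [5, 6],
    "v1": [7, 8],
    "v2": [9, 10],
    "v3": [1.5, 2.5],
})

# convert the pandas DataFrame to a Dask DataFrame, then cast all columns at once
ddf = dd.from_pandas(pdf, npartitions=2)
ddf = ddf.astype(better_dtypes)

# .dtypes only inspects the result; it does not change the frame
print(ddf.dtypes)

Note that the string[pyarrow] dtype requires pyarrow to be installed alongside pandas and Dask.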