I want to do a left assignment of one column’s values between DataFrame slices where the indexes don’t match.
import pandas as pd

df = pd.DataFrame(
    data=[('A', '20210101', 5.0), ('B', '20210101', 3.0), ('C', '20210101', 2.0),
          ('A', '20210102', 0.0), ('C', '20210102', 0.0),
          ('A', '20210103', 0.0), ('C', '20210103', 0.0), ('D', '20210103', 0.0)],
    columns=('Name', 'Date', 'Dollars'),
).set_index(['Name', 'Date'])

# One slice per date, each indexed by Name only
dft = df.groupby(df.index.get_level_values('Date'))
dates = list(dft.groups.keys())
df0 = dft.get_group(dates[0]).reset_index(level=1)
df1 = dft.get_group(dates[1]).reset_index(level=1)
df2 = dft.get_group(dates[2]).reset_index(level=1)
Is there a single expression that will work whether the left slice’s indexes are a subset or a superset of the right slice’s? The following attempt fails when left is a subset:
df0.loc[df1.index, 'Dollars'] = df1.Dollars  # Works because every key in df1 is in df0
df0.loc[df2.index, 'Dollars'] = df2.Dollars  # KeyError: "['D'] not in index"
Answer
If you want the missing index in the left DataFrame
You can take the union of df0.index and df2.index with Index.union, then reindex() df0 on that union before assigning the values of df2 to df0, as follows:
df0 = df0.reindex(df0.index.union(df2.index))
df0.loc[df2.index, 'Dollars'] = df2.Dollars  # now this runs successfully
Result:
print(df0)

          Date  Dollars
Name
A     20210101      0.0
B     20210101      3.0
C     20210101      0.0
D          NaN      0.0
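For anyone who wants to try the union approach in isolation, here is a minimal self-contained sketch. It rebuilds df0 and df2 directly instead of going through the groupby from the question, and the names all_keys and left are only illustrative. Note that the new row 'D' gets NaN in Date, because reindex() has no value to put there.

```python
import pandas as pd

# Rebuild two of the per-date slices from the question, indexed by Name.
df0 = pd.DataFrame(
    {"Date": ["20210101"] * 3, "Dollars": [5.0, 3.0, 2.0]},
    index=pd.Index(["A", "B", "C"], name="Name"),
)
df2 = pd.DataFrame(
    {"Date": ["20210103"] * 3, "Dollars": [0.0, 0.0, 0.0]},
    index=pd.Index(["A", "C", "D"], name="Name"),
)

# Union of both indexes; reindex adds a NaN-filled row for the new key 'D'.
all_keys = df0.index.union(df2.index)
left = df0.reindex(all_keys)

# Every key of df2 now exists in `left`, so the .loc assignment cannot raise.
left.loc[df2.index, "Dollars"] = df2["Dollars"]

print(left)
```

If Date should not stay NaN for the added rows, it would have to be filled separately; which value belongs there depends on your data.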
If you don’t want the missing index in the left DataFrame
commonKeys = df0.index.intersection(df2.index)
df0.loc[commonKeys, 'Dollars'] = df2.loc[commonKeys].Dollars
Result df0:

          Date  Dollars
Name
A     20210101      0.0
B     20210101      3.0
C     20210101      0.0
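Since the question asks for something that works whether the right slice's keys are a subset of the left's or not, the intersection approach can be wrapped in a small function. This is only a sketch: assign_common is a hypothetical helper name, not a pandas API, and it returns a modified copy rather than changing the left frame in place.

```python
import pandas as pd


def assign_common(left: pd.DataFrame, right: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical helper: copy `column` from right to left for shared keys only."""
    out = left.copy()
    # Keys present in both frames; keys only in `right` are ignored.
    common = out.index.intersection(right.index)
    out.loc[common, column] = right.loc[common, column]
    return out


# Slices from the question, reduced to the Dollars column and indexed by Name.
df0 = pd.DataFrame({"Dollars": [5.0, 3.0, 2.0]}, index=pd.Index(list("ABC"), name="Name"))
df1 = pd.DataFrame({"Dollars": [0.0, 0.0]}, index=pd.Index(list("AC"), name="Name"))
df2 = pd.DataFrame({"Dollars": [0.0, 0.0, 0.0]}, index=pd.Index(list("ACD"), name="Name"))

res1 = assign_common(df0, df1, "Dollars")  # df1's keys are a subset of df0's
res2 = assign_common(df0, df2, "Dollars")  # df2 has the extra key 'D'; no KeyError

print(res1)
print(res2)
```

Both calls leave 'B' untouched and never add 'D', so the same expression covers both cases.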