I want to do a left assignment of one column’s values between DataFrame slices where the indexes don’t match.
import pandas as pd

df = pd.DataFrame(data=[('A', '20210101', 5.0),
                        ('B', '20210101', 3.0),
                        ('C', '20210101', 2.0),
                        ('A', '20210102', 0.0),
                        ('C', '20210102', 0.0),
                        ('A', '20210103', 0.0),
                        ('C', '20210103', 0.0),
                        ('D', '20210103', 0.0)],
                  columns=('Name', 'Date', 'Dollars')).set_index(['Name', 'Date'])
dft = df.groupby(df.index.get_level_values('Date'))
dates = list(dft.groups.keys())
df0 = dft.get_group(dates[0]).reset_index(level=1)
df1 = dft.get_group(dates[1]).reset_index(level=1)
df2 = dft.get_group(dates[2]).reset_index(level=1)
Is there a single expression that will work whether the left slice’s indexes are a subset or a superset of the right slice’s? The following attempt fails when left is a subset:
df0.loc[df1.index, 'Dollars'] = df1.Dollars  # Works because every key in df1 is in df0
df0.loc[df2.index, 'Dollars'] = df2.Dollars  # KeyError: "['D'] not in index"
Answer
If you want the missing index in the left DataFrame
You can take the union of df0.index and df2.index with Index.union, then reindex() df0 on that union before assigning the values from df2. The reindex adds a row for every key that exists only in df2, so the .loc assignment no longer raises a KeyError:
df0 = df0.reindex(df0.index.union(df2.index))
df0.loc[df2.index, 'Dollars'] = df2.Dollars  # now this runs successfully
Result:
print(df0)
          Date  Dollars
Name
A     20210101      0.0
B     20210101      3.0
C     20210101      0.0
D          NaN      0.0
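The union approach can be wrapped in a small helper so that one call works whether the right slice's keys are a subset or a superset of the left slice's. This is only a minimal sketch: `assign_union` is a hypothetical name, not a pandas function, it assumes both indexes are unique, and it returns a new frame instead of modifying the left one. The sample frames are rebuilt by hand here so the block runs on its own.

```python
import pandas as pd


def assign_union(left, right, column):
    """Hypothetical helper: copy right[column] into left, adding rows for keys found only in right."""
    # Reindex on the union so every key in `right` exists in the result.
    # Rows that are new to `left` start out as NaN in every column.
    out = left.reindex(left.index.union(right.index))
    # Every label in right.index is now present, so .loc cannot raise KeyError.
    out.loc[right.index, column] = right[column]
    return out


# Hand-built stand-ins for the question's df0 and df2 slices.
df0 = pd.DataFrame({'Date': ['20210101'] * 3, 'Dollars': [5.0, 3.0, 2.0]},
                   index=pd.Index(['A', 'B', 'C'], name='Name'))
df2 = pd.DataFrame({'Date': ['20210103'] * 3, 'Dollars': [0.0, 0.0, 0.0]},
                   index=pd.Index(['A', 'C', 'D'], name='Name'))

result = assign_union(df0, df2, 'Dollars')
print(result)
```

Only the named column is copied, so the new row D keeps NaN in Date, as in the result above.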
If you don’t want the missing index in the left DataFrame
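Restrict the assignment to the keys that both indexes share, using Index.intersection, so that keys found only in df2 (here 'D') are skipped:
commonKeys = df0.index.intersection(df2.index)
df0.loc[commonKeys, 'Dollars'] = df2.loc[commonKeys].Dollars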
Result df0:
          Date  Dollars
Name
A     20210101      0.0
B     20210101      3.0
C     20210101      0.0
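As a possible alternative for this second case, DataFrame.update aligns on the index, modifies the left frame in place, and ignores keys that exist only in the right frame, so it may serve as the single expression the question asks for. The sketch below assumes unique indexes and that NaN values in the right slice should not overwrite anything (update skips them). The sample frames are rebuilt by hand so the block runs on its own.

```python
import pandas as pd

# Hand-built stand-ins for the question's df0 and df2 slices.
df0 = pd.DataFrame({'Date': ['20210101'] * 3, 'Dollars': [5.0, 3.0, 2.0]},
                   index=pd.Index(['A', 'B', 'C'], name='Name'))
df2 = pd.DataFrame({'Date': ['20210103'] * 3, 'Dollars': [0.0, 0.0, 0.0]},
                   index=pd.Index(['A', 'C', 'D'], name='Name'))

# Pass only the Dollars column so Date is left untouched.
# Keys present only in df2 ('D') are dropped by the index alignment.
df0.update(df2[['Dollars']])
print(df0)
```

The same call also works when the right slice is a subset of the left (for example df1, with keys A and C), because rows without a match in the right frame are simply left unchanged.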