I currently have two dataFrames that look like this:
JavaScript
x
16
16
1
Df3 - which is the output dataFrame:
2
3
| CompanyNm | CpID | Date |
4
:--------|:-----------------------------------|:-------------|:----------|
5
0 | {Converting) | {C} | 2020-02-14|
6
1 | {Converting) | {C} | 2020-02-17|
7
2 | {Converting) | {C} | 2020-02-18|
8
3 | {Converting) | {C} | 2020-02-19|
9
4 | {Converting) | {C} | 2020-02-20|
10
| | | |
11
521473 |ZHENGZHOU WARMSHINE PAPER CO., LTD. | 2316 | 2022-02-08|
12
521474 |ZHENGZHOU WARMSHINE PAPER CO., LTD. | 2316 | 2022-02-09|
13
521475 |ZHENGZHOU WARMSHINE PAPER CO., LTD. | 2316 | 2022-02-10|
14
521476 |ZHENGZHOU WARMSHINE PAPER CO., LTD. | 2316 | 2022-02-11|
15
521477 |ZHENGZHOU WARMSHINE PAPER CO., LTD. | 2316 | 2022-02-14|
16
Df4:
JavaScript
1
14
14
1
| BuySellNo | CustID | InvoiceDt | DueDt | SalesAmt |
2
:-----|:------------|:-------------|:----------|:----------|------------|
3
0 | 100 | 1047 | 2017-07-10| 2017-11-04| 46137.30 |
4
1 | 1000 | 1145 | 2017-10-12| 2018-04-03| 362133.15 |
5
2 | 10000 | 1047 | 2019-12-27| 2020-02-25| 28445.90 |
6
3 | 10001 | 2007 | 2019-12-27| 2020-02-25| 28445.90 |
7
4 | 10002 | 2304 | 2019-12-26| 2020-01-03| 27558.62 |
8
| | | | | |
9
21362 | ZD229 | 1064 | 2017-07-24| 2017-10-11| 0.00 |
10
21363 | ZD230 | 1064 | 2017-06-14| 2017-10-12| 1400.39 |
11
21364 | ZD231 | 1064 | 2017-06-13| 2017-10-11| 1967.19 |
12
21365 | ZD232 | 1063 | 2017-06-28| 2017-06-28| 128.26 |
13
21366 | ZD233 | 1064 | 2017-06-14| 2017-10-12| 3146.06 |
14
I am trying to add a new column to Df3, which is the sum of all Sales (SalesAmt) where the invoicing date (InvoiceDt) is prior to the date column in Df3.
JavaScript
1
2
1
df3['Total sales'] = df4[['SalesAmt']].sum(axis=1).where(df3['Date'] <= df4['InvoiceDt'], 0)
2
I get the following error in this case:
JavaScript
1
2
1
ValueError: Can only compare identically-labeled Series objects
2
Any idea how to fix this? Or a more efficient way to obtain the wanted result?
Thanks a lot,
Advertisement
Answer
EDIT: I misunderstood your problem definition earlier. Now corrected it:
JavaScript
1
5
1
def func(g):
2
mask = (df4['CustID'] == g.name[0]) & (df4['InvoiceDt'] <= g.name[1])
3
return df4[mask]['SalesAmt'].sum()
4
df3.groupby(['CpID','Date']).apply(func)
5