Background: In mplfinance, I want to be able to plot multiple trade markers in the same bar. Currently to my understanding you can add only 1 (or 1 buy and 1 sell) to the same bar. I cannot have 2 more trades on the same side in the same bar unless I create another series.
Here is an example:
d = {'TradeDate': ['2018-10-15 06:00:00', '2018-10-29 03:00:00', '2018-10-29 03:00:00', '2018-10-29 06:00:00', '2018-11-15 05:00:00', '2018-11-15 05:00:00', '2018-11-15 05:00:00'], 'Price': [1.1596, 1.1433, 1.13926, 1.14015, 1.1413, 1.1400, 1.1403]} df = pd.DataFrame(data=d)
df
TradeDate Price 0 2018-10-15 06:00:00 1.15960 1 2018-10-29 03:00:00 1.14330 2 2018-10-29 03:00:00 1.13926 3 2018-10-29 06:00:00 1.14015 4 2018-11-15 05:00:00 1.14130 5 2018-11-15 05:00:00 1.14000 6 2018-11-15 05:00:00 1.14030
As you can see there are multiple trades for 2 datetimes. Now I would like to apply a rule that says “If there is more than 1 trade(here: Price) per date, create a new column for the additional price, keep doing so until all prices for the same TradeDate (datetime) have been distributed across columns, and all datetimes are unique”. So the more prices for the same date, the more extra columns are needed.
The end result would look like this (I finagled this data manually):
TradeDate Price Price2 Price3 0 2018-10-15 06:00:00 1.15960 NaN NaN 1 2018-10-29 03:00:00 1.14330 1.13926 NaN 3 2018-10-29 06:00:00 1.14015 NaN NaN 4 2018-11-15 05:00:00 1.14130 1.14000 1.1403
Advertisement
Answer
The trick is to add an incremental counter to each unique datetime. Such that if a datetime is encountered more than once, this counter increases.
To do this, we groupby tradedate, and get a cumulative count of the number of duplicate tradedates there are for a given tradedate. I then add 1 to this value so our counting starts at 1 instea of 0.
df["TradeDate_count"] = df.groupby("TradeDate").cumcount() + 1 print(df) TradeDate Price TradeDate_count 0 2018-10-15 06:00:00 1.15960 1 1 2018-10-29 03:00:00 1.14330 1 2 2018-10-29 03:00:00 1.13926 2 3 2018-10-29 06:00:00 1.14015 1 4 2018-11-15 05:00:00 1.14130 1 5 2018-11-15 05:00:00 1.14000 2 6 2018-11-15 05:00:00 1.14030 3
Now that we’ve added that column, we can simply pivot to achieve your desired result. Note that I added a rename(...)
method simply to add “price” to our column names. I also used the rename_axis
method since our pivot returned us a named index for the columns which some users find hard to look at, so I figured it would be best to remove it.
new_df = (df.pivot(index="TradeDate", columns="TradeDate_count", values="Price") .rename(columns="price{}".format) .rename_axis(columns=None)) price1 price2 price3 TradeDate 2018-10-15 06:00:00 1.15960 NaN NaN 2018-10-29 03:00:00 1.14330 1.13926 NaN 2018-10-29 06:00:00 1.14015 NaN NaN 2018-11-15 05:00:00 1.14130 1.14000 1.1403