I have the following dataframe df with a datetime index:
                        open     high      low    close
DateTime
2016-02-04 01:00:00  1.10886  1.10887  1.10702  1.10747
2016-02-04 02:00:00  1.10746  1.10893  1.10695  1.10810
2016-02-04 03:00:00  1.10809  1.11403  1.10788  1.11326
2016-02-04 04:00:00  1.11327  1.11673  1.11306  1.11628
2016-02-04 05:00:00  1.11628  1.11835  1.11523  1.11815
2016-02-04 06:00:00  1.11813  1.11848  1.11653  1.11723
2016-02-04 07:00:00  1.11712  1.12020  1.11705  1.11941
I need to filter out the indexes where the following condition is met.
It’s written in mql4 code, which is what I’m converting this from, so bear with me. The numbers in the names are row offsets: bar2Open is the 'open' value two rows before the current one, and bar1Low is the 'low' value of the previous row:
double buffer = 10;
double a = (bar2Open < bar2Close) ? bar1Low : bar1High;
double b = (bar2Open < bar2Close) ? bar3High : bar3Low;
if(abs(a-b) >= buffer) {
   do something;
}
I’d like to do this with some form of itertuples(), but I’m not sure how to access values from previous rows in different columns. The output can also be another column that gives 1 if the condition is met and 0 otherwise.
Answer
There is no need to iterate over the dataframe rows; in pandas, row iteration is usually a sign that a vectorized approach is being missed. You can access values from previous rows using shift. The filtering can be done using something like:
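import numpy as np
import pandas as pd

buffer = 10

# True where the bar two rows back closed above its open
cond = df['open'].shift(2) < df['close'].shift(2)
a = np.where(cond, df['low'].shift(1), df['high'].shift(1))
b = np.where(cond, df['high'].shift(3), df['low'].shift(3))

# Boolean mask; >= buffer matches the mql4 condition
idx = np.abs(a - b) >= buffer
# Stored under a new name so df and idx keep the same length
filtered = df.loc[idx]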
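Using the example dataframe in the question, this will result in an empty dataframe, because every price lies between 1.10 and 1.13, so no difference comes anywhere near the buffer of 10.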
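If shift is new to you, the following minimal sketch may help show how it lines up earlier rows with the current one. It uses the first four 'low' values from the question; the column names prev1 and prev3 are only illustrative and not part of the original code.

```python
import pandas as pd

# First four 'low' values from the question's dataframe
low = pd.Series([1.10702, 1.10695, 1.10788, 1.11306], name="low")

aligned = pd.DataFrame({
    "low": low,
    "prev1": low.shift(1),  # value from the previous row (bar1Low)
    "prev3": low.shift(3),  # value from three rows back (bar3Low)
})
print(aligned)
```

The leading rows of the shifted columns are NaN because there is no earlier row to pull from. Comparisons against NaN evaluate to False, so the first three rows of the dataframe can never satisfy the condition.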
Assigning a 0 or 1 to a new column can be done using idx as follows (apply this to the original, unfiltered dataframe, so that idx and df have the same length):
df['filter'] = idx.astype(int)
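To try the whole approach end to end, here is a self-contained sketch that rebuilds the question's dataframe and wraps the logic in a helper. The function name condition_mask and the second buffer value of 0.004 are assumptions made purely for illustration: the question does not say what units its buffer of 10 is in, so 0.004 is just a price-scale value that happens to flag a row.

```python
import numpy as np
import pandas as pd

index = pd.DatetimeIndex(
    [
        "2016-02-04 01:00:00", "2016-02-04 02:00:00", "2016-02-04 03:00:00",
        "2016-02-04 04:00:00", "2016-02-04 05:00:00", "2016-02-04 06:00:00",
        "2016-02-04 07:00:00",
    ],
    name="DateTime",
)
df = pd.DataFrame(
    {
        "open":  [1.10886, 1.10746, 1.10809, 1.11327, 1.11628, 1.11813, 1.11712],
        "high":  [1.10887, 1.10893, 1.11403, 1.11673, 1.11835, 1.11848, 1.12020],
        "low":   [1.10702, 1.10695, 1.10788, 1.11306, 1.11523, 1.11653, 1.11705],
        "close": [1.10747, 1.10810, 1.11326, 1.11628, 1.11815, 1.11723, 1.11941],
    },
    index=index,
)


def condition_mask(frame, buffer):
    """Return a boolean array that is True where |a - b| >= buffer."""
    # Did the bar two rows back close above its open?
    cond = frame["open"].shift(2) < frame["close"].shift(2)
    a = np.where(cond, frame["low"].shift(1), frame["high"].shift(1))
    b = np.where(cond, frame["high"].shift(3), frame["low"].shift(3))
    return np.abs(a - b) >= buffer


# Buffer from the question: no row qualifies
print(condition_mask(df, 10).sum())

# Hypothetical price-scale buffer, chosen only for illustration
df["filter"] = condition_mask(df, 0.004).astype(int)
print(df[df["filter"] == 1])
```

With the illustrative buffer of 0.004, only the 05:00 bar is flagged: the low of the 04:00 bar (1.11306) and the high of the 02:00 bar (1.10893) differ by about 0.0041.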