I would like to know if there is some way of replacing all negative numbers in a DataFrame with zeros?
Answer
If all your columns are numeric, you can use boolean indexing:
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

    In [3]: df
    Out[3]:
       a  b
    0  0 -3
    1 -1  2
    2  2  1

    In [4]: df[df < 0] = 0

    In [5]: df
    Out[5]:
       a  b
    0  0  0
    1  0  2
    2  2  1
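If you would rather not modify the DataFrame in place, `DataFrame.clip` with a lower bound is one possible alternative. This is a different technique from the boolean indexing above, shown here only as a minimal sketch; the name `clipped` is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1]})

# clip() returns a new DataFrame; values below `lower` are raised to it.
# The original df is left unchanged.
clipped = df.clip(lower=0)

print(clipped)
```

Assign the result back (`df = df.clip(lower=0)`) if you want to keep only the clipped version.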
For the more general case, this answer shows the private method _get_numeric_data:
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
       ...:                    'c': ['foo', 'goo', 'bar']})

    In [3]: df
    Out[3]:
       a  b    c
    0  0 -3  foo
    1 -1  2  goo
    2  2  1  bar

    In [4]: num = df._get_numeric_data()

    In [5]: num[num < 0] = 0

    In [6]: df
    Out[6]:
       a  b    c
    0  0  0  foo
    1  0  2  goo
    2  2  1  bar
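Since `_get_numeric_data` is private, it may change between pandas versions. A sketch of the same idea using only public API could look like the following: it selects the numeric columns with `select_dtypes` and assigns the clipped values back. This is not what the linked answer does, and `num_cols` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar']})

# Labels of the numeric columns only; the string column 'c' is skipped.
num_cols = df.select_dtypes(include='number').columns

# Replace negatives with zero in those columns and write them back.
df[num_cols] = df[num_cols].clip(lower=0)

print(df)
```

Because the result is assigned back explicitly, this version does not rely on `num` being a view of `df`.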
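With the timedelta type, boolean indexing seems to work on separate columns, but not on the whole DataFrame. So you can loop over the columns (note that iteritems, used in the session below, was removed in pandas 2.0; items is its replacement):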
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
       ...:                    'b': pd.to_timedelta([-3, 2, 1], 'd')})

    In [3]: df
    Out[3]:
            a       b
    0  0 days -3 days
    1 -1 days  2 days
    2  2 days  1 days

    In [4]: for k, v in df.iteritems():
       ...:     v[v < 0] = 0
       ...:

    In [5]: df
    Out[5]:
            a       b
    0  0 days  0 days
    1  0 days  2 days
    2  2 days  1 days
Update: comparison with a pd.Timedelta works on the whole DataFrame:
    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], 'd'),
       ...:                    'b': pd.to_timedelta([-3, 2, 1], 'd')})

    In [3]: df[df < pd.Timedelta(0)] = 0

    In [4]: df
    Out[4]:
            a       b
    0  0 days  0 days
    1  0 days  2 days
    2  2 days  1 days
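A non-mutating variant of the same comparison can be sketched with `DataFrame.mask`, using `pd.Timedelta(0)` as the replacement value so that no integer is written into a timedelta column. This is an assumption about what is convenient rather than part of the original answer, and `result` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': pd.to_timedelta([0, -1, 2], unit='D'),
                   'b': pd.to_timedelta([-3, 2, 1], unit='D')})

zero = pd.Timedelta(0)

# mask() replaces values where the condition is True and returns a new
# DataFrame; here every negative timedelta becomes a zero-length one.
result = df.mask(df < zero, zero)

print(result)
```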