Tag: pandas

NaN values in pivot_table index cause loss of data

Here is a simple DataFrame: Pivot method 1: The data can be pivoted to this: Downside: data in the 2nd row is lost because df['b'][1] == None. Pivot method 2: Downside: column b is lost. How can the two methods be combined so that column b and the 2nd row are kept like so: More generally: How can i…
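The excerpt's original DataFrame and pivot code are not shown, so the following is only a sketch on a hypothetical three-row frame. It illustrates one common workaround: fill the missing key with a placeholder before pivoting, then restore NaN afterwards. The column names `a`, `b`, `c` and the `SENTINEL` value are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: the second row has a missing value in key column "b".
df = pd.DataFrame({
    "a": ["x", "y", "z"],
    "b": ["p", None, "q"],
    "c": [1, 2, 3],
})

# With default settings, rows whose index key is missing are dropped.
lost = df.pivot_table(index=["a", "b"], values="c", aggfunc="sum")

# Workaround: replace the missing key with a placeholder, pivot,
# then turn the placeholder back into NaN.
SENTINEL = "<missing>"  # assumed not to occur in the real data
kept = (
    df.fillna({"b": SENTINEL})
      .pivot_table(index=["a", "b"], values="c", aggfunc="sum")
      .reset_index()
)
kept["b"] = kept["b"].mask(kept["b"] == SENTINEL)

print(len(lost), len(kept))
```

The placeholder must be a value that cannot appear in the real key column; otherwise genuine rows would be turned into NaN in the last step.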

Select multiple ranges of columns in Pandas DataFrame

I have to read several files, some in Excel format and some in CSV format. Some of the files have hundreds of columns. Is there a way to select several ranges of columns without specifying all the column names or positions? For example, something like selecting columns 1-10, 15, 17 and 50-100: I need to know h…
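The answer to this question is not included in the excerpt. One widely used approach, sketched here on a hypothetical 120-column frame, is to build the positions with `numpy.r_` and pass them to `DataFrame.iloc`. Positions are zero-based and slice ends are exclusive, so `1:11` covers positions 1 through 10.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame standing in for a file read from Excel or CSV.
df = pd.DataFrame(
    np.zeros((2, 120)),
    columns=[f"col{i}" for i in range(120)],
)

# np.r_ concatenates slices and single positions into one index array:
# positions 1-10, 15, 17 and 50-100 (slice ends are exclusive).
positions = np.r_[1:11, 15, 17, 50:101]
subset = df.iloc[:, positions]

print(subset.shape)
```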

What is as_index in groupby in pandas?

What exactly is the function of as_index in groupby in Pandas? Answer: print() is your friend when you don’t understand something; it often clears up doubts. Take a look: Output: When as_index=True, the key(s) you use in groupby() become the index of the new DataFrame. The benefits you get when yo…
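The answer's own code and output are missing from the excerpt. A minimal sketch of the difference, using a hypothetical two-column frame, might look like this:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"key": ["a", "b", "a"], "val": [1, 2, 3]})

# as_index=True (the default): the group key becomes the index.
indexed = df.groupby("key", as_index=True).sum()

# as_index=False: the group key stays a regular column and the
# result gets a default integer index.
flat = df.groupby("key", as_index=False).sum()

print(indexed)
print(flat)
```

With the key in the index, a group can be looked up directly, for example `indexed.loc["a"]`; the flat form is usually more convenient for merging or exporting.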

Pandas: filter dataframe with type of data

I have a DataFrame; this is part of it. How can I filter df by data type? Usually I do it with str.contains; is it normal to specify something like df[df.event_duration.astype(int) == True]? Answer: If all the other row values are valid, as in they are not NaN, then you can convert the column to numeric using to_numeri…
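The rest of the answer is cut off, so the following is a sketch of how `pandas.to_numeric` is commonly used for this kind of filter, not necessarily the answer's exact code. It assumes `errors="coerce"` is passed so that non-numeric entries become NaN; the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical column mixing numeric strings and other text.
df = pd.DataFrame({"event_duration": ["12", "abc", "7"]})

# errors="coerce" turns anything that cannot be parsed into NaN.
as_number = pd.to_numeric(df["event_duration"], errors="coerce")

# Keep only the rows whose value parsed as a number.
numeric_rows = df[as_number.notna()]

print(numeric_rows)
```

This only works as a type filter when the column has no genuine NaN values to begin with, which matches the condition stated in the answer.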

Pandas Timedelta in months

How can I calculate the elapsed months using pandas? I have written the following, but the code is not elegant. Could you tell me a better way? Answer: Update for pandas 0.24.0: since 0.24.0 changed the API to return a MonthEnd object from period subtraction, you could do some manual calculation as follows to…
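The answer's calculation is truncated, so here is a sketch of two ways this is often done, using hypothetical dates. Both count calendar-month boundaries and ignore the day of the month. The second relies on the behaviour the answer describes: subtracting two monthly periods yields an offset object whose `n` attribute holds the month count.

```python
import pandas as pd

# Hypothetical start and end dates.
start = pd.Timestamp("2018-01-15")
end = pd.Timestamp("2019-03-02")

# Manual calculation from the year and month components.
months_manual = (end.year - start.year) * 12 + (end.month - start.month)

# Period subtraction (pandas 0.24.0 and later) returns an offset object;
# its .n attribute is the number of months.
months_period = (end.to_period("M") - start.to_period("M")).n

print(months_manual, months_period)
```

For whole columns, the manual form works the same way through the `.dt` accessor, for example `(e.dt.year - s.dt.year) * 12 + (e.dt.month - s.dt.month)` for two datetime Series `s` and `e`.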