I want to search for ‘row3’ in the index of the DataFrame below, which should be at position 2 in a zero-based index. Is there a function that returns the row number of ‘row3’? Thanks in advance for any help. Answer You could use Index.get_loc (docs): But this is a somewhat unusual access pattern; I have never needed it myself.
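The DataFrame and the code from the answer are not shown in this excerpt, so here is a minimal sketch of the Index.get_loc lookup it mentions. The column name and the values are made up for illustration; only the row labels follow the question.

```python
import pandas as pd

# Hypothetical data: the original DataFrame is not shown in the excerpt.
df = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=["row1", "row2", "row3", "row4"],
)

# For a unique label, Index.get_loc returns its zero-based integer position.
position = df.index.get_loc("row3")
print(position)
```

If the label occurs more than once in the index, get_loc returns a slice or a boolean mask instead of a single integer, so this sketch assumes unique labels.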
Tag: pandas
Python Pandas Setting Dataframe index and Column names from an array
Let’s say I have data loaded from a spreadsheet: and I have the names for the rows and the columns in another dataframe, for example Is there a way I can set the values in colnames[‘Names’].value as the index for df? And is there a way to do this for column names? Answer How about df.index = colnames[‘Names’] for
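The answer is cut off here, so the following is only a sketch of the assignment it starts to describe. The data values are invented, and using the same Names column for the column labels is an assumption, since the excerpt does not show how the column names are stored.

```python
import pandas as pd

# Hypothetical stand-ins for the spreadsheet data and the names frame.
df = pd.DataFrame([[1, 2], [3, 4]])
colnames = pd.DataFrame({"Names": ["a", "b"]})

# Row labels: assign the Series directly, as the answer suggests.
df.index = colnames["Names"]

# Column labels (assumption): the same names, passed as a plain array.
df.columns = colnames["Names"].to_numpy()

print(df)
```

Both assignments require the number of names to match the number of rows or columns, otherwise pandas raises a ValueError.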
Pandas dataframe.dot division method
I am trying to divide two series of different length to return the matrix product dataframe of them. I can multiply them using the dot method (from this answer): I’ve tried the div method, but this just fills the dataframe with NaNs: Likewise the standard division operator also returns the same result: So I’m a bit stumped as to what
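The accepted answer is not included in this excerpt, so what follows is one possible approach rather than the answer's own. Series.div and the / operator align on index labels, which is why two series with different labels produce NaNs. An outer division on the underlying arrays avoids the alignment. The series below are made-up examples.

```python
import numpy as np
import pandas as pd

# Hypothetical series of different lengths with unrelated labels.
a = pd.Series([2.0, 4.0, 6.0], index=["x", "y", "z"])
b = pd.Series([1.0, 2.0], index=["p", "q"])

# np.divide.outer computes a[i] / b[j] for every pair (i, j),
# working on raw arrays so no label alignment takes place.
quotient = pd.DataFrame(
    np.divide.outer(a.to_numpy(), b.to_numpy()),
    index=a.index,
    columns=b.index,
)

print(quotient)
```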
Matplotlib’s fill_between doesn’t work with plot_date, any alternatives?
I want to create a plot just like this: The code: but with dates in the x axis, like this (without bands): the code: the problem is that fill_between fails when x values are date_time objects. Does anyone know of a workaround? DF is a pandas DataFrame. Answer It would help if you show how df is defined. What does
Ordered Logit in Python?
I’m interested in running an ordered logit regression in Python (using pandas, numpy, sklearn, or something in that ecosystem). But I cannot find any way to do this. Is my google-skill lacking? Or is this not something that’s been implemented in a standard package? Answer Update: Logit and Probit Ordinal regression models are now built in to statsmodels. https://www.statsmodels.org/devel/examples/notebooks/generated/ordinal_regression.html Examples are
How to replace negative numbers in Pandas Data Frame by zero
I would like to know if there is some way of replacing all negative numbers in a DataFrame with zeros? Answer If all your columns are numeric, you can use boolean indexing: For the more general case, this answer shows the private method _get_numeric_data: With timedelta type, boolean indexing seems to work on separate columns, but not on the whole dataframe. So you
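The code for the all-numeric case is missing from this excerpt. Here is a small sketch of the boolean indexing the answer refers to, using invented data. The clip call is an added alternative that the excerpt does not mention.

```python
import pandas as pd

# Hypothetical all-numeric frame.
original = pd.DataFrame({"a": [1, -2, 3], "b": [-4, 5, -6]})

# Boolean indexing: set every cell where the mask is True to zero.
zeroed = original.copy()
zeroed[zeroed < 0] = 0

# Alternative (not from the excerpt): clip values below zero up to zero.
clipped = original.clip(lower=0)

print(zeroed)
```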
How to loop over grouped Pandas dataframe?
DataFrame: Code: I’m trying to just loop over the aggregated data, but I get the error: ValueError: too many values to unpack @EdChum, here’s the expected output: The output is not the problem, I wish to loop over every group. Answer df.groupby(‘l_customer_id_i’).agg(lambda x: ‘,’.join(x)) does already return a dataframe, so you cannot loop over the groups anymore. In general: df.groupby(…)
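The rest of the answer is truncated. As a sketch of the distinction it draws: iterate over the groupby object itself to visit each group, and do the aggregation inside the loop if needed. The item column and its values are hypothetical; only l_customer_id_i comes from the question.

```python
import pandas as pd

# Hypothetical data; only the grouping column name comes from the question.
df = pd.DataFrame(
    {
        "l_customer_id_i": [1, 1, 2],
        "item": ["a", "b", "c"],
    }
)

joined = {}
# Iterating a groupby object yields (key, sub-DataFrame) pairs.
for customer_id, group in df.groupby("l_customer_id_i"):
    joined[customer_id] = ",".join(group["item"])

print(joined)
```

Looping over the result of .agg(...) instead iterates over the column labels of the aggregated DataFrame, which is one way to end up with an unpacking error like the one in the question.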
How do you represent missing data in a Pandas DataFrame?
Does Pandas have an equivalent of R’s NA (meaning not available)? If not, what is the convention for representing a missing value, as opposed to NaN which represents a mathematically impossible value such as a divide by zero? Answer Currently there is no NA value available in Pandas or NumPy. From the section “Working with missing data” in the Pandas
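The quoted documentation passage is cut off. As a minimal sketch of the NaN-as-missing convention the answer describes, with made-up values: NaN marks the missing entry, isna detects it, and reductions such as sum skip it by default. Note that pandas releases newer than this answer (1.0 and later) added an experimental pd.NA scalar, which is not used here.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing observation marked as NaN.
s = pd.Series([1.0, np.nan, 3.0])

missing = s.isna()       # boolean mask of missing entries
total = s.sum()          # NaN is skipped by default (skipna=True)
filled = s.fillna(0.0)   # replace missing entries with a chosen value

print(missing.tolist(), total, filled.tolist())
```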
Plot multiple boxplot in one graph in pandas or matplotlib?
I have two boxplots But I want to place them in one graph to compare them. Have you any advice to solve this problem? Thanks! Answer Use return_type=’axes’ to get a1.boxplot to return a matplotlib Axes object. Then pass that axes to the second call to boxplot using ax=ax. This will cause both boxplots to be drawn on the
Check if dataframe column is Categorical
I can’t seem to get a simple dtype check working with Pandas’ improved Categoricals in v0.15+. Basically I just want something like is_categorical(column) -> True/False. We can see that the dtype for the categorical column is ‘category’: And normally we can do a dtype check by just comparing to the name of the dtype: But this doesn’t seem to work
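The answer to this question is not part of the excerpt. One way to write the is_categorical(column) check the question asks for is sketched below; the helper name follows the question, the data is invented, and the approach assumes a pandas version that exposes pd.CategoricalDtype at the top level.

```python
import pandas as pd

# Hypothetical frame with one categorical and one plain object column.
df = pd.DataFrame({"x": ["a", "b", "a"], "y": ["u", "v", "w"]})
df["x"] = df["x"].astype("category")


def is_categorical(column: pd.Series) -> bool:
    """Return True if the column has a categorical dtype."""
    return isinstance(column.dtype, pd.CategoricalDtype)


print(is_categorical(df["x"]), is_categorical(df["y"]))

# Comparing the dtype's name is another option.
name_check = df["x"].dtype.name == "category"
```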