Tag: pandas

How do I find the position of a value in a pandas.DataFrame?

I want to search for ‘row3’ in the index of the DataFrame below, which should be 2 for a zero-based array. Is there a function which returns the row number of ‘row3’? Thanks in advance for any help. Answer: You could use Index.get_loc (docs). This is a somewhat unusual access pattern, though; I have never needed it myself.
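A minimal sketch of the Index.get_loc lookup mentioned in the answer. The DataFrame here is made up for illustration, since the original one is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical frame; only the index labels matter for this lookup.
df = pd.DataFrame({"value": [10, 20, 30]}, index=["row1", "row2", "row3"])

# For a unique index, get_loc returns the zero-based integer position of the label.
position = df.index.get_loc("row3")
print(position)
```

If the index contains duplicate labels, get_loc can return a slice or a boolean mask instead of a single integer, so this sketch assumes unique labels.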

Python Pandas Setting Dataframe index and Column names from an array

pandas python

Let’s say I have data loaded from a spreadsheet: and I have the names for the rows and the columns in another dataframe, for example Is there a way I can set the values in colnames['Names'].values as the index for df? And is there a way to do this for the column names? Answer: How about df.index = colnames['Names'] for
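A small sketch of assigning labels from another DataFrame, along the lines of the df.index = colnames['Names'] suggestion. The frames row_names and col_names and their contents are assumptions for illustration; the original data is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical data standing in for the spreadsheet contents.
df = pd.DataFrame([[1, 2], [3, 4], [5, 6]])

# Hypothetical frames holding the desired labels in a 'Names' column.
row_names = pd.DataFrame({"Names": ["a", "b", "c"]})
col_names = pd.DataFrame({"Names": ["x", "y"]})

# The number of labels must match the number of rows / columns.
# to_numpy() passes plain values, so the index is not named 'Names'.
df.index = row_names["Names"].to_numpy()
df.columns = col_names["Names"].to_numpy()
print(df)
```

Assigning the Series directly also works; the resulting index then carries the Series name.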

Pandas dataframe.dot division method

matrix pandas python

I am trying to divide two series of different length to return the matrix product dataframe of them. I can multiply them using the dot method (from this answer): I’ve tried the div method, but this just fills the dataframe with NaNs: Likewise the standard division operator also returns the same result: So I’m a bit stumped as to what
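One possible approach, not necessarily the one given in the original answer: div and the / operator align the two Series on their index labels, which is why mismatched labels produce NaNs. Dividing the underlying NumPy arrays with an outer operation sidesteps alignment. The Series below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical Series of different lengths with unrelated index labels.
a = pd.Series([2.0, 4.0, 6.0], index=["p", "q", "r"])
b = pd.Series([1.0, 2.0], index=["x", "y"])

# np.divide.outer computes a[i] / b[j] for every pair (i, j).
ratio = pd.DataFrame(
    np.divide.outer(a.to_numpy(), b.to_numpy()),
    index=a.index,
    columns=b.index,
)
print(ratio)
```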

Matplotlib’s fill_between doesn’t work with plot_date, any alternatives?

matplotlib pandas python

I want to create a plot just like this: The code: but with dates on the x axis, like this (without bands): The code: The problem is that fill_between fails when the x values are datetime objects. Does anyone know of a workaround? DF is a pandas DataFrame. Answer: It would help if you showed how df is defined. What does

Ordered Logit in Python?

machine-learning numpy pandas python scikit-learn

I’m interested in running an ordered logit regression in Python (using pandas, numpy, sklearn, or something in that ecosystem). But I cannot find any way to do this. Is my google-skill lacking? Or is this not something that’s been implemented in a standard package? Answer: Update: Logit and Probit Ordinal regression models are now built in to statsmodels. https://www.statsmodels.org/devel/examples/notebooks/generated/ordinal_regression.html Examples are

How to replace negative numbers in a Pandas DataFrame with zero

dataframe negative-number pandas python replace

I would like to know if there is some way of replacing all negative numbers in a DataFrame with zeros? Answer: If all your columns are numeric, you can use boolean indexing: For the more general case, this answer shows the private method _get_numeric_data: With timedelta type, boolean indexing seems to work on separate columns, but not on the whole dataframe. So you
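A minimal sketch of the boolean-indexing approach for an all-numeric frame. The data is made up, and clip(lower=0) is shown as an alternative that does not appear in the excerpt.

```python
import pandas as pd

# Hypothetical all-numeric frame.
df = pd.DataFrame({"a": [1, -2, 3], "b": [-4.5, 5.0, -6.0]})

# Boolean indexing: overwrite every cell where the mask is True.
masked = df.copy()
masked[masked < 0] = 0

# Alternative (an assumption, not from the excerpt): clip values below zero.
clipped = df.clip(lower=0)

print(masked)
print(clipped)
```

Both variants assume every column supports comparison with 0; mixed-type frames need the numeric columns selected first.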

How to loop over grouped Pandas dataframe?

dataframe iteration pandas pandas-groupby python

DataFrame: Code: I’m trying to just loop over the aggregated data, but I get the error: ValueError: too many values to unpack @EdChum, here’s the expected output: The output is not the problem; I wish to loop over every group. Answer: df.groupby('l_customer_id_i').agg(lambda x: ','.join(x)) already returns a DataFrame, so you can no longer loop over the groups. In general: df.groupby(…)
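A short sketch of looping over the groups themselves rather than over the aggregated result. The column l_customer_id_i comes from the excerpt; the item column and the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data; only 'l_customer_id_i' is named in the excerpt.
df = pd.DataFrame({
    "l_customer_id_i": [1, 1, 2],
    "item": ["a", "b", "c"],
})

# Iterating a GroupBy object yields (key, sub-DataFrame) pairs.
joined = {}
for customer_id, group in df.groupby("l_customer_id_i"):
    joined[customer_id] = ",".join(group["item"])

print(joined)
```

Iterating over the aggregated DataFrame instead yields its column labels, which is the likely source of the unpacking error.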

How do you represent missing data in a Pandas DataFrame?

na nan pandas python

Does Pandas have an equivalent of R’s NA (meaning not available)? If not, what is the convention for representing a missing value, as opposed to NaN which represents a mathematically impossible value such as a divide by zero? Answer: Currently there is no NA value available in Pandas or NumPy. From the section “Working with missing data” in the Pandas
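A minimal sketch of the NaN-based convention the answer describes, using made-up data. Note that the answer predates pandas 1.0, which added an experimental pd.NA scalar for some extension dtypes; the sketch sticks to the NaN/None convention.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing float and one missing string.
df = pd.DataFrame({
    "x": [1.0, np.nan, 3.0],
    "y": ["a", None, "c"],
})

# isna() flags both NaN and None as missing.
missing = df.isna()
n_missing = int(missing.sum().sum())
print(n_missing)
```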

Plot multiple boxplot in one graph in pandas or matplotlib?

matplotlib pandas python

I have two boxplots, but I want to place them in one graph to compare them. Do you have any advice to solve this problem? Thanks! Answer: Use return_type='axes' to get a1.boxplot to return a matplotlib Axes object. Then pass that axes to the second call to boxplot using ax=ax. This will cause both boxplots to be drawn on the

Check if dataframe column is Categorical

pandas python

I can’t seem to get a simple dtype check working with Pandas’ improved Categoricals in v0.15+. Basically I just want something like is_categorical(column) -> True/False. We can see that the dtype for the categorical column is ‘category’: And normally we can do a dtype check by just comparing to the name of the dtype: But this doesn’t seem to work
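One way such a check could be written in current pandas, as a sketch rather than the answer from the original thread. The helper name is_categorical follows the question; the data is made up.

```python
import pandas as pd

# Hypothetical frame with one categorical and one integer column.
df = pd.DataFrame({
    "kind": pd.Categorical(["a", "b", "a"]),
    "n": [1, 2, 3],
})

def is_categorical(column):
    """Return True if the Series has a categorical dtype."""
    return isinstance(column.dtype, pd.CategoricalDtype)

print(is_categorical(df["kind"]), is_categorical(df["n"]))
```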