Tag: pandas

Splitting a column based on a delimiter

I would like to extract some information from a column in my dataframe: Example I was using str.contains to extract the first part (i.e., all the information before the first dash, where there is one). I am still getting the same original column (so no extraction). My output would consist of two columns, one withou…
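A minimal sketch of one way to do this, assuming string values and a hypothetical column named `info` (the question's own example table is not shown in the excerpt). `str.contains` only returns a boolean mask, so it cannot extract anything; `str.split` with `n=1` splits on the first dash only:

```python
import pandas as pd

# Hypothetical data: the original example is not shown in the excerpt.
df = pd.DataFrame({"info": ["abc-123-x", "no dash here", "left-right"]})

# Split once, on the first dash only. Rows without a dash keep the
# whole string in the first column and get a missing value in the second.
parts = df["info"].str.split("-", n=1, expand=True)
df["before_dash"] = parts[0]
df["after_dash"] = parts[1]

print(df)
```

The column names `before_dash` and `after_dash` are illustrative only.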

Plotting average linear regression of data set consisting of missing values

matplotlib numpy pandas python

I was trying to plot a linear fit using the m, b = np.polyfit(x0, y0, 1) function; however, when I print m2, b2, m3, b3 I get nan because of the empty values. How do I fix this? Answer You seem to have a typo in It would probably help to rename the variables idxy12,idxy13 and idxy14 or so. You also could write all this wi…
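As a general illustration (not the code from the answer above, which is cut off), `np.polyfit` returns nan coefficients when the input contains nan, so one common approach is to mask out the missing points first. The arrays below are made-up data:

```python
import numpy as np

# Hypothetical data; the question's own arrays are not shown in the excerpt.
x0 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y0 = np.array([1.0, 3.0, np.nan, 7.0, 9.0])

# Keep only the positions where both x and y are finite.
mask = np.isfinite(x0) & np.isfinite(y0)

# Fit a degree-1 polynomial (slope m, intercept b) on the valid points.
m, b = np.polyfit(x0[mask], y0[mask], 1)
print(m, b)
```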

Multiple dates in a pandas column

datetime pandas python

I am trying to make the dates in a Pandas DataFrame all of the same format. Currently the DataFrame stores the dates in two formats: “6/08/2017 2:15:00 AM” and “2016-01-01T00:05:00”. The column these dates are stored under is named INTERVAL_END. As you can see, one of the dates is a st…
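One possible sketch, assuming both kinds of value are stored as strings and that the slash-separated format is day/month/year (the excerpt does not say which): parse the column once per format with `errors="coerce"`, then combine the two results.

```python
import pandas as pd

df = pd.DataFrame(
    {"INTERVAL_END": ["6/08/2017 2:15:00 AM", "2016-01-01T00:05:00"]}
)

# Assumption: day/month/year. Swap %d and %m if the data is month/day/year.
slash = pd.to_datetime(
    df["INTERVAL_END"], format="%d/%m/%Y %I:%M:%S %p", errors="coerce"
)
iso = pd.to_datetime(
    df["INTERVAL_END"], format="%Y-%m-%dT%H:%M:%S", errors="coerce"
)

# Each parse yields NaT where its format did not match, so fill the gaps
# of one result with the other.
df["INTERVAL_END"] = slash.fillna(iso)
print(df)
```

Parsing each format explicitly avoids relying on format inference, whose behaviour differs between pandas versions.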

Multiple XML files in directory Python

dataframe pandas python xml

I am fairly new to Python and this community has been a great help! I am learning a lot. I’m trying to use this existing code to loop through multiple XML files in the same directory. Currently, the code is looking at one specific file. Any help is greatly appreciated! Answer This should help you…
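Since the original code is not shown in the excerpt, here is only a generic sketch of looping over every XML file in a directory with the standard library. The helper name `parse_xml_dir` is hypothetical, and what you do with each parsed root depends on the file layout:

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def parse_xml_dir(directory):
    """Parse every .xml file in `directory`; return {file name: root element}."""
    roots = {}
    for path in sorted(Path(directory).glob("*.xml")):
        # ET.parse reads one file; getroot() gives its top-level element.
        roots[path.name] = ET.parse(path).getroot()
    return roots
```

Calling `parse_xml_dir("some_folder")` would then replace the single hard-coded file name with a loop over all matching files.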

After a groupby, create a new column with a list of unique values from another column of the grouped values

pandas python

So I have a dataframe with two columns: artistID and genre: And what I want to do is to group by the column artistID (so the resulting dataframe has as many rows as there are artistID values in this dataframe), and I want the second column of the new dataframe to be like a list or an array or whatever
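A small sketch of one way to get this, using made-up rows for the `artistID` and `genre` columns named in the question. The output column name `genres` is an assumption:

```python
import pandas as pd

# Hypothetical rows; the question's own data is not shown in the excerpt.
df = pd.DataFrame(
    {
        "artistID": [1, 1, 2, 2, 2],
        "genre": ["rock", "pop", "jazz", "jazz", "blues"],
    }
)

# One row per artistID, with the unique genres collected into a list.
genres = (
    df.groupby("artistID")["genre"]
    .agg(lambda s: list(s.unique()))
    .reset_index(name="genres")
)
print(genres)
```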

Converting a dataframe with a line separator

dataframe pandas python

I wrote a function that accepts a dataframe as input: And returns a dataframe, where a certain delimiter number (6 in the example) is passed as a parameter: Here’s what I got: How can I simplify the function and make it more versatile? How do I make the function faster? Thanks. Answer You can do th…

Get the max value from each group with pandas.DataFrame.groupby

pandas python

I need to aggregate two columns of my dataframe, count the values of the second column and then take only the row with the highest value in the “count” column. Let me show: so far so good, but now I need to get only the row of each ‘col1’ group that has the maximum ‘count’…
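A possible sketch of both steps, with made-up data. `col1` and `count` come from the question; the second column name `col2` is an assumption:

```python
import pandas as pd

# Hypothetical data; the question's own frame is not shown in the excerpt.
df = pd.DataFrame(
    {
        "col1": ["a", "a", "a", "b", "b"],
        "col2": ["x", "x", "y", "z", "z"],
    }
)

# Step 1: count how often each (col1, col2) pair occurs.
counts = df.groupby(["col1", "col2"]).size().reset_index(name="count")

# Step 2: within each col1 group, keep the row with the largest count.
# idxmax returns the index label of the first maximum per group.
top = counts.loc[counts.groupby("col1")["count"].idxmax()].reset_index(drop=True)
print(top)
```

Note that `idxmax` keeps only the first row when several rows tie for the maximum.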

I’m getting a float axis even with the command MaxNLocator(integer=True)

matplotlib pandas python

I have this df called normales: With this code I’m plotting time series and bars: You can see that I’m using ax.yaxis.set_major_locator(MaxNLocator(integer=True)) on every axis to make the axis numbers integers. Although I’m using ax.yaxis.set_major_locator(MaxNLocator(integer=True…

How to replace the ‘,’ between two numbers, like X,X% into X.X%, across the whole dataframe in Python

pandas python

I have a column in a pandas data frame like below. The column name is ‘ingredients_text’. Now I want to replace all the values like 5,5% with 5.5% in this column throughout the dataframe. Answer We can use str.replace here: The pattern \b(\d+),(\d+)% matches in the first and second capture groups, respectively, t…
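A sketch of what such a call might look like, using the pattern quoted in the answer. The sample rows and the replacement string `\1.\2%` are assumptions, since the answer's code is cut off in the excerpt:

```python
import pandas as pd

# Hypothetical rows for the 'ingredients_text' column.
df = pd.DataFrame(
    {"ingredients_text": ["sugar 5,5%, salt 12,25%", "water, oil 3%"]}
)

# \b(\d+),(\d+)% captures the digits on either side of the comma;
# the replacement joins them with a dot and keeps the percent sign.
df["ingredients_text"] = df["ingredients_text"].str.replace(
    r"\b(\d+),(\d+)%", r"\1.\2%", regex=True
)
print(df["ingredients_text"].tolist())
```

Commas that are not between digits followed by `%` (such as the list separators) are left untouched.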