This question already has answers here: Concatenate strings from several rows using Pandas groupby (8 answers). I currently have the dataframe shown at the top. Is there a way to use groupby to group the data and concatenate the words into the format shown further below, using Python pandas? Thanks. Answer
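The frames from the question are not reproduced in this excerpt, so the following is only a minimal sketch with made-up data. The column names `name` and `word` are assumptions. It shows one common way to concatenate strings per group by passing `str.join` to `agg`.

```python
import pandas as pd

# Hypothetical data: the original dataframe is not shown in the excerpt.
df = pd.DataFrame({
    "name": ["a", "a", "b", "b", "b"],
    "word": ["red", "blue", "green", "red", "pink"],
})

# Join every word within a group into one comma-separated string.
result = df.groupby("name")["word"].agg(", ".join).reset_index()
print(result)
```

`reset_index()` turns the group key back into an ordinary column, so the result is again a flat dataframe.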
Tag: pandas
catch exception and return empty dataframe
I query a database and save the result as a dataframe, which I then transform using factorize with pivot_table. This works fine when the database query returns data, but it throws an error when no data is returned (which is to be expected). How can I catch this exception and return an empty dataframe? DataError: No numeric types to aggregate Answer Credits to brenbarn who
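The accepted answer is cut off above, so this is not a reconstruction of it. It is a hedged sketch of one possible approach: check for an empty frame first, and additionally catch `DataError` around the pivot. The helper name `pivot_or_empty` and the column names in the usage lines are hypothetical, and `pd.errors.DataError` is assumed to be available (pandas 1.5 or later).

```python
import pandas as pd


def pivot_or_empty(df, index, columns, values):
    """Return the pivot table, or an empty DataFrame when there is nothing to aggregate."""
    # Guard: an empty query result has no numeric data to aggregate.
    if df.empty:
        return pd.DataFrame()
    try:
        return df.pivot_table(index=index, columns=columns, values=values)
    except pd.errors.DataError:
        # Assumes pandas >= 1.5, where DataError is exposed in pandas.errors.
        return pd.DataFrame()


# Hypothetical usage: an empty result and a populated one.
empty = pd.DataFrame(columns=["day", "kind", "amount"])
full = pd.DataFrame({"day": [1, 1, 2], "kind": ["x", "y", "x"], "amount": [1.0, 2.0, 3.0]})
print(pivot_or_empty(empty, "day", "kind", "amount"))
print(pivot_or_empty(full, "day", "kind", "amount"))
```

Checking `df.empty` up front avoids relying on which exception a given pandas version raises for empty input.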
A function to create interaction variables: what is wrong with the code?
I’ve written a function below that takes, as arguments, a dataframe (df) and two of its column names (var1, var2). It then creates interaction variables for the two variables and adds those columns to the original dataframe. The code works when I hard-code it, but when I try to call the function like: I receive no errors but the
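The asker's function and the rest of the question are not shown, so the sketch below does not diagnose that code. It only illustrates one way such a helper might look, assuming both columns are numeric and that the interaction is their elementwise product. The name `add_interaction` and the sample columns are hypothetical.

```python
import pandas as pd


def add_interaction(df, var1, var2):
    """Return a copy of df with one extra column holding var1 * var2."""
    out = df.copy()
    out[f"{var1}_x_{var2}"] = out[var1] * out[var2]
    return out


# Hypothetical usage: the function returns a new frame, so the caller reassigns it.
df = pd.DataFrame({"age": [1, 2, 3], "income": [10, 20, 30]})
df = add_interaction(df, "age", "income")
print(df)
```

Returning the modified copy and reassigning it at the call site makes it explicit which dataframe carries the new column.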
Python StatsModels Time Series Decomposition Duplicate Plot
I am using a mixture of Pandas and StatsModels to plot a time series decomposition. I followed this answer but when I call plot() it seems to be plotting a duplicate. My DataFrame looks like My index looks like but when I plot the decomposition I get this Strangely, if I plot only an element of the decomposition, the duplication
How to sort a pandas dataframe by one column
I have a data frame like this: As you can see, the months are not in calendar order. So I created a second column holding the month number (1-12) for each month. From there, how can I sort this data frame in calendar-month order? Answer Use sort_values to sort the df by a specific column’s values: If
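As a small illustration of the `sort_values` suggestion, the sketch below sorts by a numeric month column. The data and the column names `month`, `month_num` and `value` are assumptions, since the original frame is not shown.

```python
import pandas as pd

# Hypothetical frame with months out of calendar order.
df = pd.DataFrame({
    "month": ["Mar", "Jan", "Feb"],
    "month_num": [3, 1, 2],
    "value": [30, 10, 20],
})

# Sort by the helper column, then renumber the index.
sorted_df = df.sort_values("month_num").reset_index(drop=True)
print(sorted_df)
```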
How do I select and store columns greater than a number in pandas?
I have a pandas DataFrame with a column of integers. I want the rows containing numbers greater than 10. I am able to evaluate True or False but not the actual value, by doing: I don’t use Python very often so I’m going round in circles with this. I’ve spent 20 minutes Googling but haven’t been able to find what
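The answer is not included in this excerpt. As a hedged sketch, the usual pandas idiom for this is boolean indexing: the comparison yields a True/False mask, and indexing the frame with that mask returns the matching rows. The column name `n` and the data are made up.

```python
import pandas as pd

df = pd.DataFrame({"n": [4, 12, 9, 25, 10]})

# The comparison alone gives a boolean Series (the True/False values).
mask = df["n"] > 10

# Indexing with the mask keeps only the rows where it is True.
over_ten = df[mask]
print(over_ten)
```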
Python/Pandas create zip file from csv
Can anyone provide an example of how to create a zip file from a csv file using the Python pandas package? Thank you Answer Use From the docs: compression : string, optional a string representing the compression to use in the output file, allowed values are ‘gzip’, ‘bz2’, ‘xz’, only used when the first argument is a filename See discussion of support of zip files
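The code from the answer is missing in this excerpt. The sketch below shows two options under stated assumptions: writing a gzip-compressed CSV through the `compression` argument of `DataFrame.to_csv`, which matches the quoted docs, and building a real `.zip` archive with the standard-library `zipfile` module. The file names and data are hypothetical.

```python
import os
import tempfile
import zipfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
out_dir = tempfile.mkdtemp()

# gzip: one of the compression values listed in the quoted docs.
gz_path = os.path.join(out_dir, "data.csv.gz")
df.to_csv(gz_path, index=False, compression="gzip")

# zip: write the CSV text into an archive with the standard library.
zip_path = os.path.join(out_dir, "data.zip")
with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("data.csv", df.to_csv(index=False))

# Read the gzip file back to confirm the round trip.
round_trip = pd.read_csv(gz_path, compression="gzip")
print(round_trip)
```

Calling `to_csv` without a path returns the CSV as a string, which is what `writestr` stores inside the archive.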
Convert month int to month name in Pandas
I want to transform an integer between 1 and 12 into an abbreviated month name. I have a df which looks like: I want the df to look like this: Most of the info I found was not in python>pandas>dataframe hence the question. Answer You can do this efficiently by combining calendar.month_abbr and df[col].apply()
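A minimal sketch of the `calendar.month_abbr` plus `apply()` combination mentioned in the answer. The column names `month` and `month_name` are assumptions, and the abbreviations depend on the active locale (English names under the default C locale).

```python
import calendar

import pandas as pd

df = pd.DataFrame({"month": [1, 4, 12]})

# calendar.month_abbr is indexable by month number (index 0 is an empty string).
df["month_name"] = df["month"].apply(lambda m: calendar.month_abbr[int(m)])
print(df)
```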
Pandas groupby and make set of items
I am using pandas groupby and want to apply a function that makes a set from the items in each group. The following results in TypeError: ‘type’ object is not iterable: But the following works: In my understanding the two expressions are similar; what is the reason the first does not work? Answer Update As late as pandas version
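Neither of the two expressions from the question is shown here, so the following is only a hedged sketch of one form that is commonly used to build a set per group: wrapping `set` in a lambda and passing it to `agg`. The column names `key` and `item` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "a", "b"], "item": [1, 2, 1, 3]})

# Each group's values are passed to the lambda as a Series.
sets = df.groupby("key")["item"].agg(lambda s: set(s))
print(sets)
```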
Converting Pandas dataframe into Spark dataframe error
I’m trying to convert a pandas DataFrame into a Spark one. DF head: Code: And I got an error: Answer You need to make sure your pandas dataframe columns are appropriate for the type Spark is inferring. If your pandas dataframe lists something like: And you’re getting that error, try: Now, make sure .astype(str) is actually the type you want those columns
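The dtype listing and the code from the answer are missing in this excerpt. The sketch below assumes the troublesome columns are `object`-dtype columns holding mixed values, and casts them to `str` before the frame is handed to Spark. The helper name `stringify_object_columns` and the sample data are hypothetical, and the Spark call is left as a comment because it needs an active session.

```python
import pandas as pd


def stringify_object_columns(pdf):
    """Return a copy with every object-dtype column cast to str."""
    out = pdf.copy()
    for col in out.select_dtypes(include="object").columns:
        out[col] = out[col].astype(str)
    return out


# Hypothetical frame: "mixed" holds ints, strings and floats, so it is object dtype.
pdf = pd.DataFrame({"id": [1, 2, 3], "mixed": [1, "a", 2.5]})
clean = stringify_object_columns(pdf)
print(clean.dtypes)

# With an active SparkSession named `spark` (assumed, not created here):
# sdf = spark.createDataFrame(clean)
```

Casting to `str` discards the original numeric values in those columns, so it only fits columns that really should be strings.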