I have the following dataframe: I need to get a separate dataframe for each category. For instance, as output for category A: Answer Let’s split the categories, explode the dataframe and group by category: And you get, for example, df_dicts['A']:
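The original dataframe and code are not shown in this excerpt, so here is only a minimal sketch of the split / explode / groupby idea. The column names `item` and `category`, and the comma-separated category format, are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample: the real dataframe is not shown, so the column names
# and the comma-separated category strings are assumptions.
df = pd.DataFrame({
    "item": ["x", "y", "z"],
    "category": ["A,B", "A", "B,C"],
})

# Split each category string into a list, then explode to one row per category.
exploded = df.assign(category=df["category"].str.split(",")).explode("category")

# Build one dataframe per category, keyed by the category label.
df_dicts = {
    name: group.reset_index(drop=True)
    for name, group in exploded.groupby("category")
}

print(df_dicts["A"])
```

With this sample, `df_dicts["A"]` holds the rows for items `x` and `y`.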
Tag: pandas
Shorter and more efficient pandas code for cumulative-based and column-based data selection
Below is the requirement. There are 2 tables: brand_df (brands’ values) and score_df (containing the subject score for each brand). [Generating samples below] What is being done: pick only the top brands that make up 75% of the cumulative value, then pick the subjects where 75% of the selected brands have a score (i.…
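For the first step only (top brands by cumulative value), one possible sketch is below. The sample data and the column names `brand` and `value` are hypothetical, and including the brand that crosses the 75% line is one reading of the requirement, not necessarily the asker's.

```python
import pandas as pd

# Hypothetical sample; the original brand_df is not shown in the excerpt.
brand_df = pd.DataFrame({
    "brand": ["b1", "b2", "b3", "b4"],
    "value": [50, 30, 15, 5],
})

# Sort by value, largest first, and compute each brand's cumulative share.
ranked = brand_df.sort_values("value", ascending=False).reset_index(drop=True)
cum_share = ranked["value"].cumsum() / ranked["value"].sum()

# Keep a brand while the cumulative share *before* it is still below 75%,
# so the brand that crosses the threshold is included (an assumption).
top_brands = ranked[cum_share.shift(fill_value=0) < 0.75]

print(top_brands)
```

This avoids an explicit loop: the sort, cumulative sum and comparison are all vectorized.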
pandas not converting an object dtype to float64 even after error-free execution of df.astype('float64')
I have tried to convert an object dtype column to float64 using .astype('float64'). It ran without raising any error, but when I check the dtype using .dtype or .dtypes, the converted column still shows as object. real_estate.dtypes Why is it not converting and why isn’t it giving any…
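The excerpt does not show the exact call, so the cause cannot be confirmed here. One common reason for this symptom is that `astype` returns a new object and the result was never assigned back. A small sketch with a hypothetical `price` column:

```python
import pandas as pd

# Hypothetical data; the real real_estate columns are not shown.
real_estate = pd.DataFrame({"price": ["100.5", "200", "300.25"]}, dtype=object)

# astype returns a new Series; without assignment the column is unchanged.
real_estate["price"].astype("float64")
dtype_without_assignment = real_estate["price"].dtype  # still object

# Assigning the result back is what actually changes the column.
real_estate["price"] = real_estate["price"].astype("float64")
dtype_after_assignment = real_estate["price"].dtype  # now float64

print(dtype_without_assignment, dtype_after_assignment)
```

If the column contains values that cannot be parsed as numbers, `astype` raises instead; `pd.to_numeric(..., errors="coerce")` is a common alternative in that case.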
How to vectorize pandas operation
I have a dataset of house sales with timestamped Periods (one per quarter). I want to adjust the price according to the house pricing index change per region. I have a separate dataframe with 3 columns: the Quarter, the Region and the % change in price. I am currently achieving this by iterating over both datafram…
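A typical way to avoid iterating over both dataframes is to merge on the shared keys and then do the arithmetic on whole columns. The sketch below uses made-up data; the column names `Price` and `PctChange` and the adjustment formula are assumptions, since the excerpt does not give them.

```python
import pandas as pd

# Hypothetical sample data; names and values are illustrative only.
sales = pd.DataFrame({
    "Quarter": ["2020Q1", "2020Q2", "2020Q1"],
    "Region": ["North", "North", "South"],
    "Price": [100.0, 200.0, 300.0],
})
index_change = pd.DataFrame({
    "Quarter": ["2020Q1", "2020Q2", "2020Q1"],
    "Region": ["North", "North", "South"],
    "PctChange": [10.0, -5.0, 0.0],
})

# Attach the matching % change to every sale in one vectorized join.
merged = sales.merge(index_change, on=["Quarter", "Region"], how="left")

# Assumed adjustment: scale the price by the percentage change.
merged["AdjustedPrice"] = merged["Price"] * (1 + merged["PctChange"] / 100)

print(merged)
```

With `how="left"`, sales without a matching quarter and region keep their row and get NaN in the adjusted column, which makes missing index data easy to spot.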
Label dataframe column with a list
I have a dataframe column, text. I would like to create another column, term, by matching it against a list ['apple', 'banana, melon']. The result I get: However, I expected it to be: I sense that it might be because I use a list, but I don’t know how to make a loop in the lambda itself. Answer …
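The answer is cut off in this excerpt, so the following is not the original solution, just one possible sketch that avoids a loop inside a lambda: build a regular expression from the list and use `str.extract`. The sample texts and the three-item `terms` list are hypothetical.

```python
import re

import pandas as pd

# Hypothetical data and term list, for illustration only.
df = pd.DataFrame({"text": ["I like apple pie", "melon juice", "plain water"]})
terms = ["apple", "banana", "melon"]

# One alternation pattern with a single capture group: (apple|banana|melon).
pattern = "(" + "|".join(re.escape(t) for t in terms) + ")"

# extract returns the first matching term per row, or NaN when none matches.
df["term"] = df["text"].str.extract(pattern, expand=False)

print(df)
```

Only the first match per row is kept; `str.findall` with the same pattern would return every matching term as a list.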
Has something changed in reading a published CSV from Google Sheets in Google Colab?
I have been using a file from Google Sheets, published as CSV, and reading it with pandas to make the dataframe, but today it stopped working. Here is the error output: 2155 def read(self, nrows=None): 2156 try: -> 2157 data = self._reader.read(nrows) 2158 except StopIteration: 2159 if self._first_chunk: p…
How to replace rows which do not follow a specific schema-pattern? [closed]
I would like to delete all the rows that do not follow this pattern My c…
UnboundLocalError while using datetime to create a new dataframe
I keep getting an error for this function: This isn’t the full function, but the whole problem lies here. I am trying to create a new dataframe that only pulls from the past year of the initial dataframe. I will need to create options for year count using a function variable once I get this figured out,…
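The function itself is not shown, so the error cannot be diagnosed from this excerpt. As a rough sketch of the stated goal only (keep rows from the past year, with the year count as a function argument), something like the following could work. The function name `last_n_years`, its parameters and the sample data are all hypothetical.

```python
import pandas as pd


def last_n_years(df, date_col, years=1, now=None):
    """Return the rows whose date falls within the last `years` years."""
    # `now` is injectable so the result is reproducible in examples.
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)
    cutoff = now - pd.DateOffset(years=years)
    return df[df[date_col] >= cutoff]


# Hypothetical sample data.
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-06-01", "2020-08-15", "2021-01-10"]),
    "value": [1, 2, 3],
})

recent = last_n_years(df, "date", years=1, now="2021-03-01")
print(recent)
```

Using `pd.Timestamp` and `pd.DateOffset` keeps everything in pandas types, so no name from the `datetime` module has to be reused inside the function.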
Replace rows with a different number of characters
I have a column of strings with varying numbers of characters. Most rows have the following number of characters: but there are also rows with a different number, for instance I would like to replace those values that have a number of characters different from xx.xx.xxxx xx-xx-xx with a null value (e.g…
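One way to do this kind of replacement is to test every value against the expected shape and keep only the matches. In the sketch below the sample values are made up, and treating each `x` in xx.xx.xxxx xx-xx-xx as a digit is an assumption.

```python
import pandas as pd

# Hypothetical sample values.
s = pd.Series(["12.03.2021 10-15-30", "1.3.21 10-15", "31.12.2020 23-59-59"])

# Assumed shape: every x in xx.xx.xxxx xx-xx-xx is a digit.
pattern = r"\d{2}\.\d{2}\.\d{4} \d{2}-\d{2}-\d{2}"

# fullmatch is True only when the whole string fits the pattern.
mask = s.str.fullmatch(pattern)

# where keeps matching values and replaces the rest with NaN.
cleaned = s.where(mask)

print(cleaned)
```

If only the length matters and not the exact shape, `s.where(s.str.len() == 19)` would be a simpler variant of the same idea.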
Equidistant timeseries filling the blanks
I have the following code that generates a timeseries with 1-minute steps, but I would like to have the time gaps filled, i.e. 13:58 is missing in between. Every ip should be represented in the gap with zero values. How can this be achieved? Answer First change unstack by first level for DatetimeIndex, and add DataF…
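The answer is truncated here, so the sketch below is not the original solution. It shows one general way to fill minute gaps once the data is in wide form (DatetimeIndex as rows, one column per ip): reindex against a complete `pd.date_range`. The ip addresses and counts are made up.

```python
import pandas as pd

# Hypothetical wide-format counts: one column per ip, 13:58 is missing.
counts = pd.DataFrame(
    {"10.0.0.1": [3, 1, 2], "10.0.0.2": [0, 4, 1]},
    index=pd.to_datetime(
        ["2021-01-01 13:56", "2021-01-01 13:57", "2021-01-01 13:59"]
    ),
)

# Build the complete 1-minute index between the first and last timestamps.
full_index = pd.date_range(counts.index.min(), counts.index.max(), freq="min")

# Reindex so missing minutes appear as rows, filled with zero for every ip.
filled = counts.reindex(full_index, fill_value=0)

print(filled)
```

Because every ip is a column, the inserted 13:58 row carries a zero for each of them.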