I have this df: I then want to create a new column, df['category'], that takes as its value the name of the column whose value is True, so that df['category'] reflects the TRUE-valued column for each row as follows: no two columns have a TRUE value in the same row. Expected output: Answer Simple: use idxmax along axis=1 to get the name of the column holding the True value, then
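A minimal sketch of that idea on made-up boolean columns (the real column names are not shown in the excerpt):

    import pandas as pd

    # hypothetical boolean columns standing in for the original df
    df = pd.DataFrame({
        "fruit": [True, False, False],
        "veg":   [False, True, False],
        "grain": [False, False, True],
    })

    # idxmax along axis=1 returns, per row, the label of the column with the
    # maximum value; since True > False, that is the column holding True
    df["category"] = df.idxmax(axis=1)
    print(df)

This relies on the stated guarantee that no row has more than one True value.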
Tag: dataframe
How to search and select column names based on values?
Suppose I have a pandas DataFrame like this: I want (a) the names of any columns that contain a value of 2 anywhere in the column (i.e., col1, col3), and (b) the names of any columns that contain only values of 2 (i.e., col3). I understand how to use DataFrame.any() and DataFrame.all() to select rows in a DataFrame where a
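One way to express both selections, sketched on invented data that matches the description:

    import pandas as pd

    # hypothetical frame: col1 and col3 contain a 2, col3 contains only 2s
    df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [2, 2, 2]})

    # (a) columns that contain a 2 anywhere
    cols_any = df.columns[(df == 2).any()]   # ['col1', 'col3']

    # (b) columns that contain only 2s
    cols_all = df.columns[(df == 2).all()]   # ['col3']

    print(list(cols_any), list(cols_all))

The boolean mask df == 2 is reduced column-wise by any()/all(), and the resulting Series is used to index df.columns.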
Python Scrape Same item from all subpages using BeautifulSoup
I am trying to scrape "salary" from each subpage. For one of the subpages, I am copying the specific contents of soup = BeautifulSoup(requests.get('url_of_job').text). I copied the soup content to a Word file, sliced out the content surrounding the salary, and copied it here; copying all of the text would exceed the limit here. soup = My code: Present solution: Expected solution: Answer Here is a
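Since the page's HTML is not reproduced in the excerpt, the selector below is only a placeholder assumption; a loop over the subpages would look roughly like this:

    import requests
    from bs4 import BeautifulSoup

    # hypothetical list of subpage URLs; replace with the real ones
    subpage_urls = ["https://example.com/job/1", "https://example.com/job/2"]

    for url in subpage_urls:
        soup = BeautifulSoup(requests.get(url).text, "html.parser")
        # assumed markup such as <span class="salary">...</span>; adjust the
        # selector to whatever element actually wraps the salary on the page
        tag = soup.find(class_="salary")
        salary = tag.get_text(strip=True) if tag else None
        print(url, salary)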
Count occurrences in last 30 days with Pandas Dataframe
I have a pandas DataFrame with an ID column and a date column (YYYY-MM-DD):

ID   Date
001  2022-01-01
001  2022-01-04
001  2022-02-07
002  2022-01-02
002  2022-01-03
002  2022-01-28

There may be gaps in the date field, as shown. I would like to have a new column, "occurrences_last_month", which counts the number of occurrences for each ID in the last
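One possible approach, assuming "last month" means a trailing 30-day window that includes the current row:

    import pandas as pd

    df = pd.DataFrame({
        "ID":   ["001", "001", "001", "002", "002", "002"],
        "Date": ["2022-01-01", "2022-01-04", "2022-02-07",
                 "2022-01-02", "2022-01-03", "2022-01-28"],
    })
    df["Date"] = pd.to_datetime(df["Date"])
    df = df.sort_values(["ID", "Date"])

    # per ID, count rows in a trailing 30-day window (the row itself included)
    df["occurrences_last_month"] = (
        df.assign(one=1)
          .set_index("Date")
          .groupby("ID")["one"]
          .rolling("30D")
          .sum()
          .values
    )
    print(df)

Because df is pre-sorted by ID and Date, flattening the grouped result with .values keeps the rows aligned with the original frame.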
How to filter a dataframe column having multiple values in Python
I have a data frame that sometimes has multiple values in cells, like this: Now, I want to filter the data frame for rows having 'apple' in the value. So my output should look like this: I used str.contains('apple'), but this is not returning the ideal result. Can anyone help me with how I can get this result? Answer You
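str.contains('apple') also matches substrings such as 'pineapple', which is one likely reason for the unexpected result. A sketch that splits each cell instead, assuming comma-separated values and a hypothetical column name:

    import pandas as pd

    # hypothetical column name and separator; adjust to the real data
    df = pd.DataFrame({"fruits": ["apple, banana", "pineapple", "banana", "apple"]})

    # split each cell into items and test for an exact item, not a substring
    mask = df["fruits"].apply(lambda s: "apple" in [x.strip() for x in s.split(",")])
    print(df[mask])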
Applying custom function to groupby object keeps groupby column
I have a dataframe which has a column for grouping by and several other columns. Play dataframe: When using a groupby on this dataframe followed by a default aggregation function, the groupby column is set as the index and is not included in the results: But when I define a custom function and use apply, I get an unwanted additional column: How
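One common explanation: groupby(...).apply passes each sub-frame including the grouping column, so a custom function that returns it sees it come back in the output. A small sketch on made-up data, selecting the other columns before apply:

    import pandas as pd

    # hypothetical play data
    df = pd.DataFrame({"grp": ["a", "a", "b"], "x": [1, 2, 3], "y": [4, 5, 6]})

    def summarize(g):
        # custom per-group function; here just the column means as an example
        return g.mean()

    # selecting the non-group columns first keeps 'grp' out of the result,
    # matching the behaviour of the built-in aggregations
    out = df.groupby("grp")[["x", "y"]].apply(summarize)
    print(out)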
Pivot matrix to time-series – Python
I’ve got a dataframe with date as the first column and times as the names of the other columns:

Date        13:00  14:00  15:00  16:00  …
2022-01-01  B      R      M      M      …
2022-01-02  B      B      B      M      …
2022-01-03  R      B      B      M      …

How could I transform that matrix into a datetime time series? My objective is something like this: Date Data
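A melt-based sketch, reconstructing the wide matrix shown above:

    import pandas as pd

    df = pd.DataFrame({
        "Date":  ["2022-01-01", "2022-01-02", "2022-01-03"],
        "13:00": ["B", "B", "R"],
        "14:00": ["R", "B", "B"],
        "15:00": ["M", "B", "B"],
        "16:00": ["M", "M", "M"],
    })

    # melt the hour columns into rows, then fuse Date + hour into one timestamp
    long = df.melt(id_vars="Date", var_name="Time", value_name="Data")
    long["Date"] = pd.to_datetime(long["Date"] + " " + long["Time"])
    long = long.drop(columns="Time").sort_values("Date").reset_index(drop=True)
    print(long)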
Search substrings in strings and return relevant string when matched
I have a dataframe with product titles, which contain keywords that can identify the product type, as such: df_product_titles dataframe I have another dataframe with two columns, where the 1st column has the keyword and the 2nd the relevant product type: df_product_types dataframe I want to search for each keyword from the product_types dataframe in the product_titles dataframe and return the relevant product type.
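A sketch with invented stand-ins for df_product_titles and df_product_types (the real column names are not shown), mapping each title to the type of the first keyword found in it:

    import pandas as pd

    # hypothetical data; the column names are assumptions
    df_product_titles = pd.DataFrame({"title": ["Acme cordless drill 18V",
                                                "Super blender 900W"]})
    df_product_types = pd.DataFrame({"keyword": ["drill", "blender"],
                                     "product_type": ["power tool", "kitchen appliance"]})

    def match_type(title):
        # return the product type of the first keyword that appears in the title
        for kw, ptype in zip(df_product_types["keyword"],
                             df_product_types["product_type"]):
            if kw.lower() in title.lower():
                return ptype
        return None

    df_product_titles["product_type"] = df_product_titles["title"].apply(match_type)
    print(df_product_titles)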
Concat pandas dataframes in Python with different row size without getting NaN values
I have to combine some dataframes in Python. I’ve tried to combine them using the concat operation, but I am getting NaN values because each dataframe has a different row size. For example: In this example, dataframes 1 and 2 only have 1 row each. However, dataframe 3 has 3 rows. When I combine these 3 dataframes, I get NaN values for
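Side-by-side concat aligns on the index, so the shorter frames leave NaNs in the extra rows. One possible reading of the expected output is that single-row values should repeat down the longer frame; a sketch under that assumption:

    import pandas as pd

    # hypothetical frames: two single-row frames and one three-row frame
    df1 = pd.DataFrame({"a": [1]})
    df2 = pd.DataFrame({"b": [2]})
    df3 = pd.DataFrame({"c": [10, 20, 30]})

    # reindex every frame to the longest length and forward-fill before concat
    frames = [df1, df2, df3]
    n = max(len(f) for f in frames)
    frames = [f.reset_index(drop=True).reindex(range(n)).ffill() for f in frames]
    combined = pd.concat(frames, axis=1)
    print(combined)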
Pandas long format of success table
I have a table with the following structure in pandas: I would like to put it in a long format. In this case, for each user, we have a different number of events and successes. I would like to transform this into an event table (each row corresponds to an event, and there is a column that tells you whether
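A sketch of one way to expand such a table, assuming hypothetical columns user / events / successes that hold the per-user counts:

    import pandas as pd

    # hypothetical wide table: per user, how many events and how many successes
    df = pd.DataFrame({"user": ["u1", "u2"], "events": [3, 2], "successes": [1, 2]})

    # one row per event: the first `successes` rows of a user are flagged 1,
    # the remaining events are flagged 0
    rows = []
    for _, r in df.iterrows():
        flags = [1] * r["successes"] + [0] * (r["events"] - r["successes"])
        rows.extend({"user": r["user"], "success": f} for f in flags)

    long = pd.DataFrame(rows)
    print(long)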