I have this df: I then want to create a new column, df['category'], that takes as its value the name of the column whose value is True, so that df['category'] reflects the TRUE-valued column for each row as follows: no two columns have a TRUE value in the same row. Expected output: Answer Simple: use idxmax along axis=1 to get the name of the column holding the True value, then
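A minimal sketch of that idea on made-up boolean columns (the real column names are not shown in the excerpt):

    import pandas as pd

    # hypothetical boolean columns standing in for the original df
    df = pd.DataFrame({
        "fruit": [True, False, False],
        "veg":   [False, True, False],
        "grain": [False, False, True],
    })

    # idxmax along axis=1 returns, per row, the label of the column with the
    # maximum value; since True > False, that is the column holding True
    df["category"] = df.idxmax(axis=1)
    print(df)

This relies on the stated guarantee that no row has more than one True value.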
Tag: dataframe
How to search and select column names based on values?
Suppose I have a pandas DataFrame like this: I want (a) the names of any columns that contain a value of 2 anywhere in the column (i.e., col1, col3), and (b) the names of any columns that contain only values of 2 (i.e., col3). I understand how to use DataFrame.any() and DataFrame.all() to select rows in a DataFrame where a
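One way to express both selections, sketched on invented data that matches the description:

    import pandas as pd

    # hypothetical frame: col1 and col3 contain a 2, col3 contains only 2s
    df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [2, 2, 2]})

    # (a) columns that contain a 2 anywhere
    cols_any = df.columns[(df == 2).any()]   # ['col1', 'col3']

    # (b) columns that contain only 2s
    cols_all = df.columns[(df == 2).all()]   # ['col3']

    print(list(cols_any), list(cols_all))

The boolean mask df == 2 is reduced column-wise by any()/all(), and the resulting Series is used to index df.columns.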
Python Scrape Same item from all subpages using BeautifulSoup
I am trying to scrape "salary" from each subpage. For one of the subpages, I am copying the specific contents of soup = BeautifulSoup(requests.get('url_of_job').text). I copied the soup content to a Word file, sliced out the content surrounding the salary, and copied it here; copying all of the text would exceed the limit here. soup = My code: Present solution: Expected solution: Answer Here is a
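Since the page's HTML is not reproduced in the excerpt, the selector below is only a placeholder assumption; a loop over the subpages would look roughly like this:

    import requests
    from bs4 import BeautifulSoup

    # hypothetical list of subpage URLs; replace with the real ones
    subpage_urls = ["https://example.com/job/1", "https://example.com/job/2"]

    for url in subpage_urls:
        soup = BeautifulSoup(requests.get(url).text, "html.parser")
        # assumed markup such as <span class="salary">...</span>; adjust the
        # selector to whatever element actually wraps the salary on the page
        tag = soup.find(class_="salary")
        salary = tag.get_text(strip=True) if tag else None
        print(url, salary)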
Count occurrences in last 30 days with Pandas Dataframe
I have a pandas DataFrame with an ID column and a date column (YYYY-MM-DD):

ID   Date
001  2022-01-01
001  2022-01-04
001  2022-02-07
002  2022-01-02
002  2022-01-03
002  2022-01-28

There may be gaps in the date field, as shown. I would like to have a new column, "occurrences_last_month", which counts the number of occurrences for each ID in the last
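One possible approach, assuming "last month" means a trailing 30-day window that includes the current row:

    import pandas as pd

    df = pd.DataFrame({
        "ID":   ["001", "001", "001", "002", "002", "002"],
        "Date": ["2022-01-01", "2022-01-04", "2022-02-07",
                 "2022-01-02", "2022-01-03", "2022-01-28"],
    })
    df["Date"] = pd.to_datetime(df["Date"])
    df = df.sort_values(["ID", "Date"])

    # per ID, count rows in a trailing 30-day window (the row itself included)
    df["occurrences_last_month"] = (
        df.assign(one=1)
          .set_index("Date")
          .groupby("ID")["one"]
          .rolling("30D")
          .sum()
          .values
    )
    print(df)

Because df is pre-sorted by ID and Date, flattening the grouped result with .values keeps the rows aligned with the original frame.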
How to filter a dataframe column having multiple values in Python
I have a data frame that sometimes has multiple values in cells, like this: Now, I want to filter the data frame for rows having 'apple' in the value. So my output should look like this: I used str.contains('apple'), but this is not returning the ideal result. Can anyone help me with how I can get this result? Answer You
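str.contains('apple') also matches substrings such as 'pineapple', which is one likely reason for the unexpected result. A sketch that splits each cell instead, assuming comma-separated values and a hypothetical column name:

    import pandas as pd

    # hypothetical column name and separator; adjust to the real data
    df = pd.DataFrame({"fruits": ["apple, banana", "pineapple", "banana", "apple"]})

    # split each cell into items and test for an exact item, not a substring
    mask = df["fruits"].apply(lambda s: "apple" in [x.strip() for x in s.split(",")])
    print(df[mask])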
Applying custom function to groupby object keeps groupby column
I have a dataframe which has a column for grouping by and several other columns. Play dataframe: When using a groupby on this dataframe followed by a default aggregation function, the groupby column is set as the index and is not included in the results: But when I define a custom function and use apply, I get an unwanted additional column: How
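One common explanation: groupby(...).apply passes each sub-frame including the grouping column, so a custom function that returns it sees it come back in the output. A small sketch on made-up data, selecting the other columns before apply:

    import pandas as pd

    # hypothetical play data
    df = pd.DataFrame({"grp": ["a", "a", "b"], "x": [1, 2, 3], "y": [4, 5, 6]})

    def summarize(g):
        # custom per-group function; here just the column means as an example
        return g.mean()

    # selecting the non-group columns first keeps 'grp' out of the result,
    # matching the behaviour of the built-in aggregations
    out = df.groupby("grp")[["x", "y"]].apply(summarize)
    print(out)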
Pivot matrix to time-series – Python
I’ve got a dataframe with date as the first column and times as the names of the other columns:

Date        13:00  14:00  15:00  16:00  …
2022-01-01  B      R      M      M      …
2022-01-02  B      B      B      M      …
2022-01-03  R      B      B      M      …

How could I transform that matrix into a datetime time series? My objective is something like this: Date Data
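A melt-based sketch, reconstructing the wide matrix shown above:

    import pandas as pd

    df = pd.DataFrame({
        "Date":  ["2022-01-01", "2022-01-02", "2022-01-03"],
        "13:00": ["B", "B", "R"],
        "14:00": ["R", "B", "B"],
        "15:00": ["M", "B", "B"],
        "16:00": ["M", "M", "M"],
    })

    # melt the hour columns into rows, then fuse Date + hour into one timestamp
    long = df.melt(id_vars="Date", var_name="Time", value_name="Data")
    long["Date"] = pd.to_datetime(long["Date"] + " " + long["Time"])
    long = long.drop(columns="Time").sort_values("Date").reset_index(drop=True)
    print(long)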
Search substrings in strings and return relevant string when matched
I have a dataframe with product titles, which contain keywords that can identify the product type, as such: df_product_titles dataframe I have another dataframe with two columns, where the 1st column has the keyword and the 2nd the relevant product type: df_product_types dataframe I want to search for each keyword from the product_types dataframe in the product_titles dataframe and return the relevant product type.
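A sketch with invented stand-ins for df_product_titles and df_product_types (the real column names are not shown), mapping each title to the type of the first keyword found in it:

    import pandas as pd

    # hypothetical data; the column names are assumptions
    df_product_titles = pd.DataFrame({"title": ["Acme cordless drill 18V",
                                                "Super blender 900W"]})
    df_product_types = pd.DataFrame({"keyword": ["drill", "blender"],
                                     "product_type": ["power tool", "kitchen appliance"]})

    def match_type(title):
        # return the product type of the first keyword that appears in the title
        for kw, ptype in zip(df_product_types["keyword"],
                             df_product_types["product_type"]):
            if kw.lower() in title.lower():
                return ptype
        return None

    df_product_titles["product_type"] = df_product_titles["title"].apply(match_type)
    print(df_product_titles)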
Concat pandas dataframes in Python with different row size without getting NaN values
I have to combine some dataframes in Python. I’ve tried to combine them using the concat operation, but I am getting NaN values because each dataframe has a different row size. For example: In this example, dataframes 1 and 2 only have 1 row each. However, dataframe 3 has 3 rows. When I combine these 3 dataframes, I get NaN values for
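Side-by-side concat aligns on the index, so the shorter frames leave NaNs in the extra rows. One possible reading of the expected output is that single-row values should repeat down the longer frame; a sketch under that assumption:

    import pandas as pd

    # hypothetical frames: two single-row frames and one three-row frame
    df1 = pd.DataFrame({"a": [1]})
    df2 = pd.DataFrame({"b": [2]})
    df3 = pd.DataFrame({"c": [10, 20, 30]})

    # reindex every frame to the longest length and forward-fill before concat
    frames = [df1, df2, df3]
    n = max(len(f) for f in frames)
    frames = [f.reset_index(drop=True).reindex(range(n)).ffill() for f in frames]
    combined = pd.concat(frames, axis=1)
    print(combined)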
Pandas long format of success table
I have a table with the following structure in pandas: I would like to put it in a long format. In this case, for each user, we have a different number of events and successes. I would like to transform this into an event table (each row corresponds to an event, and there is a column that tells you whether
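A sketch of one way to expand such a table, assuming hypothetical columns user / events / successes that hold the per-user counts:

    import pandas as pd

    # hypothetical wide table: per user, how many events and how many successes
    df = pd.DataFrame({"user": ["u1", "u2"], "events": [3, 2], "successes": [1, 2]})

    # one row per event: the first `successes` rows of a user are flagged 1,
    # the remaining events are flagged 0
    rows = []
    for _, r in df.iterrows():
        flags = [1] * r["successes"] + [0] * (r["events"] - r["successes"])
        rows.extend({"user": r["user"], "success": f} for f in flags)

    long = pd.DataFrame(rows)
    print(long)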