Hi, I have a dataset with two columns, one of which is target. If I group all the unique values in target, I get an array of 826 elements. My problem arises when trying to assign some values based on this uniqueness. I have a second array, called array, which contains a total of 826 values (of string type) to
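The excerpt is cut off before the goal is fully stated, so the following is only a sketch under one assumption: each unique value in target should receive the value at the same position in the second array. The column name `feature`, the sample data, and the name `labels` (standing in for the 826-element array) are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the two-column dataset described above.
df = pd.DataFrame({"feature": [1, 2, 3, 4], "target": ["x", "y", "x", "z"]})

# unique() returns values in order of first appearance.
unique_targets = df["target"].unique()

# Hypothetical second array: one string per unique target value.
labels = ["label_a", "label_b", "label_c"]

# Pair each unique value with its label, then look every row up in that mapping.
mapping = dict(zip(unique_targets, labels))
df["assigned"] = df["target"].map(mapping)
```

If the pairing between the two arrays follows a different order (for example sorted values), the mapping would need to be built accordingly.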
Tag: dataframe
Questions about In-place memory operations in pandas (1/2)
I was explaining[1] in-place operations vs out-of-place operations to a new user of Pandas. This resulted in us discussing passing objects by reference or by value. Naturally, I wanted to show pandas.DataFrame.values, as I thought it shared the memory location of the underlying data of the DataFrame. However, I was surprised with and then sidetracked by the results of the
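The experiment from the question is not included in this excerpt. As a rough illustration of the two ideas it mentions, the sketch below contrasts an out-of-place call with an in-place one and then asks NumPy whether two `.values` results share memory. The sample frame is hypothetical, and no expected result is claimed for the memory check, since it can depend on dtypes and the pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 2]})

# Out-of-place: a new DataFrame is returned and df keeps its original order.
sorted_df = df.sort_values("a")
original_order = df["a"].tolist()

# In-place: df itself is reordered and the call returns None.
result = df.sort_values("a", inplace=True)

# Ask NumPy whether two .values calls point at overlapping memory.
# The answer is printed rather than assumed.
shares = bool(np.shares_memory(df.values, df.values))
print(shares)
```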
Combine all column elements except two particular columns
I want to combine the elements in all columns except two columns, ‘SourceFile’ and ‘Label’. I tried the above code, which resulted in a ValueError. There are so many columns, so I can’t use Answer col != ['SourceFile', 'Label'] is syntactically wrong and it gives a NameError, not a ValueError. First get the columns you don’t want and convert them to a set.
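The answer's code is not part of this excerpt. A minimal sketch of the step it describes (collect the unwanted columns in a set, then work with everything else) might look like this. The sample data, the columns `w1` and `w2`, the output column `combined`, and the choice to "combine" by joining strings with a space are all assumptions.

```python
import pandas as pd

# Hypothetical frame; only 'SourceFile' and 'Label' come from the question.
df = pd.DataFrame({
    "SourceFile": ["a.txt", "b.txt"],
    "Label": [0, 1],
    "w1": ["foo", "bar"],
    "w2": ["baz", "qux"],
})

# Columns to leave out, stored as a set for fast membership tests.
excluded = {"SourceFile", "Label"}

# Keep the remaining columns in their original order.
cols = [c for c in df.columns if c not in excluded]

# Join the remaining values row by row into one string.
df["combined"] = df[cols].astype(str).agg(" ".join, axis=1)
```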
Counting unique mentions in Pandas dataframe column while grouped by multiple other columns
For a school project I am attempting to determine the number of mentions specific words have in Reddit titles and comments. More specifically, stock ticker mentions. Currently the dataframe looks like this (where type could be a string of either title or comment): Where the mentions column contains a set of tickers mentioned in the body (could be multiple). What
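The dataframe itself is not shown in this excerpt, so the following is a sketch on made-up data. Only the `type` and `mentions` columns come from the description; the `date` column, the sample tickers, and the output column `count` are hypothetical. The idea is to give each ticker its own row and then count rows per group.

```python
import pandas as pd

# Hypothetical data: each row is a title or comment with a set of tickers.
df = pd.DataFrame({
    "date": ["2021-03-01", "2021-03-01", "2021-03-01", "2021-03-02"],
    "type": ["title", "comment", "comment", "title"],
    "mentions": [{"GME", "AMC"}, {"GME"}, {"GME"}, {"AMC"}],
})

# Turn each set into a sorted list so the exploded row order is deterministic,
# then give every ticker its own row.
exploded = df.assign(mentions=df["mentions"].map(sorted)).explode("mentions")

# Count how many posts mention each ticker within each (date, type) group.
counts = (
    exploded.groupby(["date", "type", "mentions"])
    .size()
    .reset_index(name="count")
)
```

Because each row already holds a set, a ticker is counted at most once per title or comment.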
Pandas create column of dictionaries based on condition from another column
Let’s say I have a Pandas df called df_1 like this:
id    date_created  rank_1            rank_2          rank_3          rank_dict
2223  3/3/21 3:26   www.google.com    www.yahoo.com   www.ford.com    {www.google.com:3, www.yahoo.com:2, www.ford.com:1}
1112  2/25/21 1:35  www.autoblog.com  www.motor1.com  www.webull.com  {www.autoblog.com:3, www.motor1.com:2, www.webull.com:1}
and another df called df_2 that looks like this:
id    date_created  rank_1         rank_2            rank_3
2223  4/9/21 5:15   www.yahoo.com  www.whatever.com  www.google.com
1112  8/20/21 2:30  www.gm.com
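The condition the question asks about is cut off, so this sketch covers only the part that can be read from the df_1 sample: a rank_dict that maps the rank_1, rank_2, and rank_3 entries to 3, 2, and 1. It uses the one complete df_2 row from the excerpt. The names `rank_cols` and `weights` are hypothetical.

```python
import pandas as pd

# The one complete df_2 row shown in the excerpt.
df_2 = pd.DataFrame({
    "id": [2223],
    "date_created": ["4/9/21 5:15"],
    "rank_1": ["www.yahoo.com"],
    "rank_2": ["www.whatever.com"],
    "rank_3": ["www.google.com"],
})

rank_cols = ["rank_1", "rank_2", "rank_3"]
weights = (3, 2, 1)  # assumed from the rank_dict values shown for df_1

# Build one dictionary per row: site -> weight of the rank column it sits in.
df_2["rank_dict"] = [
    dict(zip(sites, weights))
    for sites in df_2[rank_cols].itertuples(index=False)
]
```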
Join columns in a single Pandas DataFrame
I have a DataFrame with 4 columns and want to merge the first 3 columns into a new DataFrame. The data is identical, the order is irrelevant and any duplicates must remain. Desired DataFrame How do I get this done? Answer Here is one way of merging the first three columns with the help of numpy:
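The answer's original code did not survive in this excerpt. One possible NumPy-based approach, shown purely as a sketch, is to flatten the first three columns into a single one-dimensional array. The sample data and the column names `a` to `d` and `merged` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical 4-column frame; the first three columns hold the same kind of data.
df = pd.DataFrame({"a": [1, 2], "b": [2, 3], "c": [3, 4], "d": [9, 9]})

# Flatten the first three columns into one 1-D array.
# Duplicates are kept; values come out row by row.
flat = np.ravel(df.iloc[:, :3].to_numpy())

merged = pd.DataFrame({"merged": flat})
```

Since the order is stated to be irrelevant, the row-by-row order produced by `np.ravel` should be acceptable here.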
Pandas DataFrame adding two zeros
Hi, can someone explain why it adds two 0 0 to my data frame in this function? The output looks like Answer You may want to revisit how you are creating the dataframe. Here are some changes for you to consider. I have limited information about what you are doing, so my answer caters to just the code
dropna() got an unexpected keyword argument ‘thresh’
I have a list of column names and want to drop the rows that have more than 1 NaN value, but this error occurs: dropna() got an unexpected keyword argument ‘thresh’. My pandas is updated; the version is 1.1.5. Previously I did a little data cleaning, and I think it caused my df rows to become str, but I converted them to
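The failing call is not shown in this excerpt, so the cause of the error cannot be confirmed from it. For reference, `thresh` is a parameter of `DataFrame.dropna`, and the sketch below shows that form on made-up data. The column names and values are hypothetical. Keeping rows with at most one NaN among the listed columns is the same as requiring at least `len(cols) - 1` non-NaN values.

```python
import numpy as np
import pandas as pd

# Hypothetical data with 0, 1, and 2 missing values per row.
df = pd.DataFrame({
    "a": [1.0, 1.0, np.nan],
    "b": [2.0, np.nan, np.nan],
    "c": [3.0, 3.0, 3.0],
})

cols = ["a", "b", "c"]  # the list of column names to inspect

# Keep rows that have at least len(cols) - 1 non-NaN values in those columns,
# which drops every row with more than one NaN.
cleaned = df.dropna(subset=cols, thresh=len(cols) - 1)
```

Note that `thresh` counts real missing values only; a literal string such as "nan" left over from a str conversion would not be treated as missing.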
How to split a column based on a character and append the rest of the columns with each split
Suppose I have a dataframe: First, how do I print all the rows that have “|” in column 1? I am trying the following, but it prints all rows of the frame: Second, how do I split column 1 and column 2 on “|”, so that each split in column 1 gets its corresponding split from column 2 and
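Neither the dataframe nor the attempted code is included in this excerpt, so the following is a sketch on made-up data with hypothetical column names `col1`, `col2`, and `other`. One thing worth checking in the original attempt: `str.contains` treats its pattern as a regular expression by default, and "|" is the regex alternation operator, so a literal match needs `regex=False`. The multi-column `explode` used here assumes pandas 1.3 or later and equal numbers of pieces in both columns of each row.

```python
import pandas as pd

# Hypothetical frame: col1 and col2 hold '|'-separated values that pair up.
df = pd.DataFrame({
    "col1": ["a|b", "c", "d|e|f"],
    "col2": ["1|2", "3", "4|5|6"],
    "other": ["x", "y", "z"],
})

# Rows whose col1 contains a literal '|' (regex matching switched off).
has_pipe = df[df["col1"].str.contains("|", regex=False)]

# Split both columns into lists of pieces.
split = df.assign(
    col1=df["col1"].str.split("|"),
    col2=df["col2"].str.split("|"),
)

# Explode both columns together so the pieces stay aligned;
# the remaining columns are repeated for each piece.
exploded = split.explode(["col1", "col2"], ignore_index=True)
```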
How to create the NumPy array X of shape (2638, 1838) from a dataframe that has shape (2638, 1840)?
Hi, can someone please help me with this? What should I do if I want to use NumPy to get an array X which has a shape (2638, 1838) while the dataframe has a shape of (2638, 1840)? Here is my code: Answer Conversion to Numpy and back to Pandas, as advised in one of the comments to your post, is not
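Neither the question's code nor the rest of the answer is included here, and the excerpt does not say which two columns should be left out. As a sketch only, the example below assumes they are the last two columns and uses a zero-filled placeholder frame with the stated shape.

```python
import numpy as np
import pandas as pd

# Placeholder frame with the shape from the question; the contents are made up.
df = pd.DataFrame(np.zeros((2638, 1840)))

# Assumption: the two columns to exclude are the last two.
# Select everything else by position and convert to a NumPy array.
X = df.iloc[:, :-2].to_numpy()

print(X.shape)
```

If the two columns are identified by name instead, `df.drop(columns=[...]).to_numpy()` with those names would give an array of the same shape.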