I am having trouble with a for loop inside a function. I am calculating cosine distances for a list of word vectors. For each vector, I calculate the cosine distance and then append it as a new column to the pandas DataFrame. The problem is that there are several models, so I am comparing a word vector from model
Tag: dataframe
Pandas skipping lines in read_csv: can I record these to a variable/log file?
I’ve seen similar questions on here but nothing that is quite what I want to do. I’m reading in a tsv/csv file using I have clearly defined headers within the file, but sometimes the file has unexpected additional columns and I get the following messages in the console: Skipping line 251643: Expected 20 fields in line 251643, saw
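One possible way to capture the skipped rows instead of only seeing them in the console, sketched under some assumptions: pandas 1.4 or newer, where `on_bad_lines` accepts a callable when `engine="python"` is used. The sample data and the names `raw`, `skipped_rows` and `record_bad_line` are made up for illustration and are not from the question.

```python
import io
import pandas as pd

# Hypothetical sample data: the second data row has one field too many.
raw = "a\tb\tc\n1\t2\t3\n4\t5\t6\t7\n8\t9\t10\n"

skipped_rows = []  # collects every malformed row as a list of strings


def record_bad_line(fields):
    """Store the malformed row and return None so pandas skips it."""
    skipped_rows.append(fields)
    return None


df = pd.read_csv(
    io.StringIO(raw),
    sep="\t",
    engine="python",  # a callable for on_bad_lines needs the python engine
    on_bad_lines=record_bad_line,
)

print(df)
print(skipped_rows)
```

After the read, `skipped_rows` can be written to a log file or inspected directly. Note that the callable receives only the split fields, not the line number.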
Creating new columns within a dataframe, based on the latest value from previous columns
I’ve just completed a beginner’s course in Python, so please bear with me if the code below doesn’t make sense or my issue is due to some rookie mistake. I’ve been trying to put the learning to use by working with college production of NFL players, with a view to understanding which statistics can be predictive or at least correlate
How to group by multiple columns and count unique values in Python pandas
I have a DataFrame df_data: I have a function and parameter like this: Explain Parameters: with CustID = 1 the parameters should be list_minor = [3,1] (position is not important), list_major = [1] because LocationID = 324 occurs 3 times and LocationID = 490 occurs 1 time (324 and 490 have isMajor = 0, so they should go into
How to assign a value to a column for a subset of dataframe based on a condition in Pandas?
I have a data frame df:

index  A  class  label
0      4  0      0
1      5  1      0
2      6  0      0
3      7  1      0

I want to change the label to 1 if the mean of column A for rows with class 0 is bigger than the mean of all data in column A. How to do this
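A minimal sketch of one reading of this question, assuming that only the rows of the chosen class should receive the new label. The helper name `relabel_if_class_mean_exceeds` is hypothetical; the data frame is the one shown in the question.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [4, 5, 6, 7], "class": [0, 1, 0, 1], "label": [0, 0, 0, 0]}
)


def relabel_if_class_mean_exceeds(frame, cls, new_label=1):
    """Return a copy where rows of `cls` get `new_label` if their mean of A
    is greater than the mean of A over all rows."""
    out = frame.copy()
    in_class = out["class"] == cls  # boolean mask selecting the subset
    if out.loc[in_class, "A"].mean() > out["A"].mean():
        out.loc[in_class, "label"] = new_label
    return out


# Class 0 has mean 5.0, the whole column has mean 5.5, so nothing changes.
result_class0 = relabel_if_class_mean_exceeds(df, 0)

# Class 1 has mean 6.0 > 5.5, so its rows are relabelled.
result_class1 = relabel_if_class_mean_exceeds(df, 1)
```

The key piece is `df.loc[mask, "label"] = value`, which assigns to a column only for the rows selected by the mask.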
Fastest way to create a new list of dictionaries from the index in a DataFrame in Python
I have ~200 million entries in a dictionary index_data: Key is a value in CustID and Value is an index of CustID in df_data: I have a DataFrame df_data: NOTE: If CustID is duplicated, only the Score column has different data in each row. I want to create a new list of dicts (Total_Score is the average Score of each CustID, Number is
How to convert a list with dictionaries into new pandas columns?
I have a dataframe which has a list of dictionaries as a column: This column has the following format: How can I convert this column into 4 new columns? I mean: route_id (x2), stop_id (x2) as new columns. Thanks in advance! Answer: You can use df.explode with df.apply:
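The code from the original answer is not reproduced in this excerpt. As a rough sketch of the explode-then-apply idea, assuming each cell holds a list of two dictionaries with `route_id` and `stop_id` keys (the column name `stops`, the `id` column and all values below are invented for illustration):

```python
import pandas as pd

# Hypothetical data; the real column format is not shown in the excerpt.
df = pd.DataFrame(
    {
        "id": [1, 2],
        "stops": [
            [{"route_id": "A", "stop_id": 10}, {"route_id": "B", "stop_id": 11}],
            [{"route_id": "C", "stop_id": 20}, {"route_id": "D", "stop_id": 21}],
        ],
    }
)

# One row per dictionary; the original row number is kept in column "index".
long = df.explode("stops").reset_index()

# Turn each dictionary into its own columns (route_id, stop_id).
fields = long["stops"].apply(pd.Series)
long = pd.concat([long.drop(columns="stops"), fields], axis=1)

# Number the dictionaries within each original row: 1, 2, ...
long["pos"] = long.groupby("index").cumcount() + 1

# Spread back to one row per original record and flatten the column names.
wide = long.pivot(index="index", columns="pos", values=["route_id", "stop_id"])
wide.columns = [f"{name}_{pos}" for name, pos in wide.columns]

result = df.drop(columns="stops").join(wide)
print(result)
```

This yields the four columns `route_id_1`, `route_id_2`, `stop_id_1` and `stop_id_2` next to the untouched `id` column.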
Transpose a 3-column Excel file with K:V pairs into columns in pandas
I have a 3-column Excel file I’m reading into pandas with basically k:v pairs in the columns. I need to not only tie the information in unnamed:1 & unnamed:2 to the unique animal ID, as this is how I will track the animal, but also transpose these columns, where everything to the left of the “:” is the column header
Extract new columns and fill values based on categorical values in a data frame in Python
I have a data frame where one column is categorical strings and the next one is the values corresponding to it: I want to create new columns based on the df.status column and fill empty ones with np.nan, which requires a pivot on multiple columns: I am looking for an efficient solution that works for large data frames. Answer: You want:
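The answer's code is cut off in this excerpt. As a simplified sketch of the general technique only, assuming a single key column: `DataFrame.pivot` turns each category into its own column and leaves NaN where a combination is missing. The `id` and `value` columns and all data below are hypothetical; only the `status` column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the "status" column name is taken from the question.
df = pd.DataFrame(
    {
        "id": [1, 1, 2],
        "status": ["open", "closed", "open"],
        "value": [10.0, 20.0, 30.0],
    }
)

# One column per distinct status; id 2 has no "closed" row, so that cell is NaN.
wide = df.pivot(index="id", columns="status", values="value")
print(wide)
```

`pivot` raises an error if an (index, column) pair occurs more than once; in that case `pivot_table` with an explicit aggregation function is the usual alternative.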
Check if Dataframe is empty and print results
I would like to go over an Excel file with different stock symbols. How can I check, after reading the stock values (Open, Close, High, Low, Volume) into a dataframe with Yahoo, whether the dataframe is empty? This Excel list contains more than 700 symbols, and sometimes Yahoo has no data for some symbols. So I would like to exclude these symbols,
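A small sketch of the emptiness check, assuming the per-symbol data has already been downloaded into a dictionary of DataFrames. The download step is left out on purpose; the function name `split_by_availability` and the symbols below are made up for illustration.

```python
import pandas as pd


def split_by_availability(frames):
    """Separate symbols with data from symbols whose DataFrame is empty."""
    with_data = {sym: f for sym, f in frames.items() if not f.empty}
    missing = [sym for sym, f in frames.items() if f.empty]
    return with_data, missing


# Hypothetical stand-ins for the downloaded data.
columns = ["Open", "Close", "High", "Low", "Volume"]
frames = {
    "AAA": pd.DataFrame([[1.0, 1.1, 1.2, 0.9, 1000]], columns=columns),
    "BBB": pd.DataFrame(columns=columns),  # no data came back for this symbol
}

with_data, missing = split_by_availability(frames)
print("symbols with data:", list(with_data))
print("symbols without data:", missing)
```

`DataFrame.empty` is True when the frame has no rows (or no columns), so it also covers a frame that has headers but no records.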