Tag: dataframe

Pandas .loc[].index

What is the most efficient way (using the fewest lines possible) to locate and drop multiple strings in a specified column? Information about the .tsv dataset that may help: ‘tconst’ = movie ID; ‘region’ = region in which the movie was released; ‘language’ = language of the movie. Here is what I have right now: I am trying
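A minimal sketch of one way to drop rows matching several strings, assuming the goal is to remove rows whose ‘region’ value is in a given list. The sample rows and the `unwanted_regions` list are hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "tconst": ["tt0001", "tt0002", "tt0003", "tt0004"],
    "region": ["US", "DE", "FR", "US"],
    "language": ["en", "de", "fr", "en"],
})

unwanted_regions = ["DE", "FR"]  # assumed list of strings to drop

# Option 1: keep only the rows whose region is NOT in the list.
filtered = df[~df["region"].isin(unwanted_regions)]

# Option 2: the same result via .loc[...].index, as in the question title.
dropped = df.drop(df.loc[df["region"].isin(unwanted_regions)].index)

print(filtered)
```

Both options leave the original `df` unchanged; the boolean-mask form is usually the shorter of the two.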

Pandas add missing weeks from range to dataframe

dataframe pandas python

I am computing a DataFrame with weekly amounts and now I need to fill it with missing weeks from a provided date range. This is how I’m generating the dataframe with the weekly amounts: Which outputs: If a date range is given as start='2020-08-30' and end='2020-10-30', then I would expect the following dataframe: So far, I have managed to just
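One possible approach, sketched under assumptions: the weekly frame is indexed by week dates that fall on Sundays, and missing weeks should get an amount of 0. The `weekly` data and the `amount` column name are hypothetical, since the question's own frame is not shown.

```python
import pandas as pd

# Hypothetical weekly amounts, indexed by Sunday dates (assumed layout).
weekly = pd.DataFrame(
    {"amount": [10.0, 25.0]},
    index=pd.to_datetime(["2020-09-06", "2020-09-20"]),
)

# Every Sunday inside the requested range.
all_weeks = pd.date_range(start="2020-08-30", end="2020-10-30", freq="W-SUN")

# Reindex onto the full range; weeks absent from the data get amount 0.
filled = weekly.reindex(all_weeks, fill_value=0.0)

print(filled)
```

If the weeks are anchored on a different weekday, the `freq` alias would need to change accordingly (for example `W-MON`).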

Python: Dropping specific rows in a dataframe and keeping a specific one

dataframe drop duplicates pandas python

Let’s say that I have this dataframe. I want to reduce this dataframe! I want to reduce only the rows that contain the string “info”, keeping the ones that have the highest level in the column “Group”. So in this dataframe, it would mean that I keep the row “ID_info_1” in the group 4, and “ID_info_1” in the
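A rough sketch of the idea, assuming the identifiers live in a column named `ID` (hypothetical) and that “highest level” means the largest value in “Group”. The sample data is invented because the question's frame is not shown.

```python
import pandas as pd

# Hypothetical data; the "ID" column name is an assumption.
df = pd.DataFrame({
    "ID": ["ID_info_1", "ID_info_1", "ID_info_2", "ID_info_2", "ID_other", "ID_other"],
    "Group": [2, 4, 1, 3, 1, 2],
})

has_info = df["ID"].str.contains("info")

# Among the "info" rows, keep one row per ID: the one with the largest Group.
info_best = (
    df[has_info]
    .sort_values("Group")
    .drop_duplicates(subset="ID", keep="last")
)

# Rows without "info" are left untouched; sort_index restores the original order.
result = pd.concat([info_best, df[~has_info]]).sort_index()

print(result)
```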

Difference between two ways of setting a DataFrame’s columns

dataframe pandas python

I don’t understand the difference between two ways of setting the columns of a DataFrame. The code is here: When I printed A[‘ftr3’] to see the elements of ftr3 in A, there was no problem. But when I printed B[‘ftr3’], a problem occurred: Moreover, what confuses me about this result is that print(A) and print(B) print exactly the same output. The results

Joining two dataframes on columns they match

dataframe pandas python

I have two dataframes. df1 has more elements (3) in column ‘Table_name’ than df2 (2). I want a resultant dataframe that only outputs the rows where df1 and df2 share the same column names. df1 df2 I want this to be the result. df_result This is what I tried, but it doesn’t work: Answer You need loc here
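The answer's code is not shown, so the following is only a sketch of how `loc` might be used here, assuming the goal is to keep the df1 rows whose ‘Table_name’ also appears in df2. All values and the extra columns (`rows`, `owner`) are hypothetical.

```python
import pandas as pd

# Hypothetical frames; only the 'Table_name' column comes from the question.
df1 = pd.DataFrame({"Table_name": ["orders", "users", "items"], "rows": [10, 20, 30]})
df2 = pd.DataFrame({"Table_name": ["orders", "items"], "owner": ["a", "b"]})

# Keep only the df1 rows whose Table_name also appears in df2.
df_result = df1.loc[df1["Table_name"].isin(df2["Table_name"])]

# Alternative: an inner merge also brings in df2's columns.
merged = df1.merge(df2, on="Table_name", how="inner")

print(df_result)
print(merged)
```

The `loc` form filters df1 only, while the merge combines columns from both frames; which one fits depends on the desired df_result.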

Getting error in dataframe: TypeError: ‘Series’ objects are mutable, thus they cannot be hashed

dataframe pandas python

I am trying to apply this operation on my dataframe df: where data types of a,b,c are: But I am getting the error TypeError: ‘Series’ objects are mutable, thus they cannot be hashed. Is this happening because of NA values present in column b or c? If so, is there a way to skip the operation for NA values? Thanks.

Is there a quick way in python to convert a string ‘1/100’ to float 0.01?

dataframe pandas python

I have this df: which I would like to convert to decimal odds. I know I could use .split(‘/’) to achieve this but was wondering if there was a quicker way to do this. Answer As suggested by @ch3steR, use pd.eval and try this
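The answer's pd.eval code is not shown. As an alternative sketch (a different technique from the one the answer names), the standard library's `fractions.Fraction` parses strings such as ‘1/100’ directly. The `odds` column name and its values are hypothetical.

```python
from fractions import Fraction

import pandas as pd

# Hypothetical column of fractional strings.
df = pd.DataFrame({"odds": ["1/100", "5/2", "3/1"]})

# Fraction parses "numerator/denominator" strings; float() converts the result.
df["decimal"] = df["odds"].map(lambda s: float(Fraction(s)))

print(df)
```

Unlike evaluating the string as an expression, `Fraction` only accepts numeric input, so malformed values raise a `ValueError` instead of being executed.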

Error when trying to set column as index in pandas dataframe

data-science dataframe numpy pandas python

I have the following code: which works fine until I do (trying to set column ‘idx’ as the index for the dataframe) which throws an error. What does this mean? Answer The error is when you create A with If you print A.columns you will get: So ‘idx’ is not really in your columns for you to set as the index.
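The question's code is not shown, so this is a hypothetical reproduction of the same kind of problem: a frame built without column names has no ‘idx’ column, and `set_index` fails. The array contents and the names `x` and `y` are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical reproduction: no column names are passed, so pandas
# assigns default integer labels instead of 'idx'.
A = pd.DataFrame(np.arange(6).reshape(2, 3))
print(A.columns.tolist())

try:
    A.set_index("idx")
except KeyError as err:
    # 'idx' is not among the column labels, so set_index raises KeyError.
    print("KeyError:", err)

# Naming the columns at construction makes 'idx' available as an index.
B = pd.DataFrame(
    np.arange(6).reshape(2, 3), columns=["idx", "x", "y"]
).set_index("idx")
print(B)
```

Printing `A.columns` before calling `set_index`, as the answer suggests, is a quick way to confirm which labels actually exist.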

Calculate rolling average for all columns pandas

dataframe pandas python rolling-average

I have the below dataframe: I want to replace the NaN values with the 3 month rolling average. How should I go about this? Answer If you count NaNs as 0 in your means, you can do: This will give you:
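The answer's code is not shown. A minimal sketch of the idea it describes, assuming one row per month, a 3-row window that includes the current row, and NaNs counted as 0 in the mean. The data and column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly data with gaps (one row per month is an assumption).
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, np.nan, 5.0],
    "b": [2.0, np.nan, 4.0, 6.0, 8.0],
})

# 3-row rolling mean for all columns, with NaN treated as 0.
# min_periods=1 lets the first rows use a shorter window.
rolling_mean = df.fillna(0).rolling(window=3, min_periods=1).mean()

# Only the NaN cells are replaced; existing values are kept.
filled = df.fillna(rolling_mean)

print(filled)
```

Because the NaN cell itself contributes a 0 to its own window, the filled value is lower than the mean of the two preceding months alone.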