I am doing some web scraping (getting the plots of books on Goodreads). I have this info in a TSV file. When I read that TSV file into a dataframe, it looks like I lose the best part of my string. How can I access the whole string? Cheers Answer The problem is that data['Plot'] is returning a Series with 1
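In cases like this the string is usually intact in the frame and only the printed representation is truncated, because pandas caps the displayed column width. A minimal sketch, using a made-up plot string in place of the Goodreads data:

```python
import pandas as pd

# Hypothetical stand-in for the scraped TSV data: one long plot string.
data = pd.DataFrame({"Plot": ["A very long plot summary " * 20]})

# The full string is still there -- only print(data) truncates it.
full_text = data["Plot"].iloc[0]

# Widen the display so print(data) shows the whole column too.
pd.set_option("display.max_colwidth", None)
```

Pulling the value out with `.iloc[0]` (rather than printing the Series) is the simplest way to confirm nothing was lost.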
Tag: dataframe
Get several dataframes from an original one
I have the following dataframe: I need to get a number of dataframes, one for each category. For instance, as output for category A: Answer Let's split the categories, explode the data frame and groupby: And you get, for example, df_dicts['A']:
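The split/explode/groupby answer can be sketched like this, on a hypothetical frame whose `category` column holds comma-separated labels:

```python
import pandas as pd

# Hypothetical input: each row may belong to several categories.
df = pd.DataFrame({"category": ["A,B", "B", "A,C"], "value": [1, 2, 3]})

# Split the strings into lists, explode so each (category, row) pair
# becomes its own row, then collect one sub-frame per category.
exploded = df.assign(category=df["category"].str.split(",")).explode("category")
df_dicts = {cat: g.reset_index(drop=True)
            for cat, g in exploded.groupby("category")}
```

`df_dicts["A"]` then holds every row tagged with A, with the other categories untouched.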
pandas not converting an object dtype to float64 even after error-free execution of df.astype('float64')
I have tried to convert an object-dtype column to float64 using .astype('float64'). It ran without raising any error, but when I check the dtype using .dtype or .dtypes, it still shows the converted column as object. real_estate.dtypes Why is it not converting, and why isn't it giving any error? Also, real_estate['Age at time of purchase'].dtype is giving
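A common cause of this symptom is that `astype()` returns a new Series rather than modifying the frame, so the result has to be assigned back. A sketch with hypothetical data in the same column name:

```python
import pandas as pd

# Hypothetical object-dtype column of numeric strings.
real_estate = pd.DataFrame(
    {"Age at time of purchase": ["25", "40", "33"]}, dtype=object
)

# astype() is not in-place: without this assignment the frame keeps
# its original object dtype, silently.
real_estate["Age at time of purchase"] = (
    real_estate["Age at time of purchase"].astype("float64")
)
```

For columns with unparseable entries, `pd.to_numeric(col, errors="coerce")` is the usual alternative: it converts what it can and turns the rest into NaN instead of failing silently.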
How to vectorize pandas operation
I have a dataset of house sales with timestamped Periods (per quarter). I want to adjust the price according to the house pricing index change per region. I have a separate dataframe with 3 columns: the quarter, the region and the % change in price. I am currently achieving this by iterating over both dataframes. Is there a better way? Minimal
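The usual vectorised replacement for that double loop is a merge on the two keys followed by a single column expression. A sketch with hypothetical quarter/region data:

```python
import pandas as pd

# Hypothetical sales, one row per sale.
sales = pd.DataFrame({
    "quarter": ["2020Q1", "2020Q1", "2020Q2"],
    "region": ["N", "S", "N"],
    "price": [100.0, 200.0, 300.0],
})

# Hypothetical index change per (quarter, region), in percent.
index_change = pd.DataFrame({
    "quarter": ["2020Q1", "2020Q1", "2020Q2"],
    "region": ["N", "S", "N"],
    "pct_change": [2.0, -1.0, 3.0],
})

# One merge aligns every sale with its index row; the adjustment
# is then a single vectorised arithmetic expression.
merged = sales.merge(index_change, on=["quarter", "region"], how="left")
merged["adjusted"] = merged["price"] * (1 + merged["pct_change"] / 100)
```

This does the same work as the nested iteration but lets pandas align and multiply whole columns at once.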
How to merge two dataframes and eliminate dupes
I am trying to merge two dataframes together. One has 1.5M rows and one has 15M rows. I was expecting the merged dataframe to have 15M rows, but it actually has 178M rows!! I think my merge is doing some kind of Cartesian product, and this is not what I want. This is what I tried, and got 178M rows.
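Row explosion like this means the join key is duplicated on the lookup side, so each left row matches several right rows. De-duplicating the lookup frame on the key first keeps the output the same length as the big frame. A toy sketch:

```python
import pandas as pd

# Toy stand-ins: `small` should hold one attribute row per key,
# but accidentally contains a duplicate, which multiplies rows on merge.
big = pd.DataFrame({"key": [1, 1, 2, 3], "x": [10, 11, 20, 30]})
small = pd.DataFrame({"key": [1, 1, 2], "attr": ["a", "a", "b"]})

# Drop duplicate keys on the lookup side before merging, so each
# row of `big` matches at most one row of `small`.
merged = big.merge(small.drop_duplicates(subset="key"), on="key", how="left")
```

Passing `validate="many_to_one"` to `merge` is also worth knowing: it raises an error as soon as the right-hand keys are not unique, instead of silently producing 178M rows.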
Is there a more efficient way to find and downgrade int64 columns with to_numeric() in Python Pandas?
tl;dr: Need help cleaning up my downcast_int(df) function below. Hello, I'm trying to write my own downcasting functions to save memory. I am curious about alternatives to my (frankly, quite messy, but functioning) code, to make it more readable and, perhaps, faster. The downcasting function directly modifies my dataframe, something I am not sure I should be doing.
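One tidy alternative is to lean on `pd.to_numeric(..., downcast="integer")`, which already picks the smallest integer type that fits. A sketch (`downcast_ints` is a hypothetical name) that also avoids the in-place-modification concern by working on a copy:

```python
import pandas as pd

def downcast_ints(df):
    """Return a copy with every int64 column downcast to the smallest
    integer type that holds its values (a sketch, not in-place)."""
    out = df.copy()
    for col in out.select_dtypes(include="int64").columns:
        out[col] = pd.to_numeric(out[col], downcast="integer")
    return out

df = pd.DataFrame({"small": [1, 2, 3], "big": [1, 2, 2**40]})
shrunk = downcast_ints(df)
```

Returning a copy keeps the caller's frame untouched; if memory is the whole point, dropping the `copy()` and documenting the in-place behaviour is a defensible trade-off.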
How to perform index/match excel function equivalent using pandas?
I am facing the below challenge. For instance, let the dummy dataframes be, Let another dataframe be, The output dataframe should be the following, My train of thought was to create a dictionary (or dictionaries), in this case it would be, followed by this function, but I am always getting the following error. Also, I think this is not an efficient solution at all. Are
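The dictionary-based train of thought is sound: the pandas equivalent of Excel's INDEX/MATCH is usually a dict built from the lookup table plus `Series.map`. A sketch with hypothetical column names:

```python
import pandas as pd

# Hypothetical lookup table (the INDEX/MATCH range).
lookup = pd.DataFrame({"id": ["x", "y", "z"], "name": ["Ada", "Bob", "Cy"]})

# Hypothetical frame whose ids should be resolved to names.
df = pd.DataFrame({"id": ["y", "x", "q"]})

# Build {key: value} once, then map every id in one vectorised pass;
# ids with no match come back as NaN, like an #N/A in Excel.
mapping = dict(zip(lookup["id"], lookup["name"]))
df["name"] = df["id"].map(mapping)
```

`df.merge(lookup, on="id", how="left")` does the same job when several columns need to come across at once.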
How to create sum of columns in Pandas based on a conditional of multiple columns?
I am trying to sum two columns of the DataFrame to create a third column, where the value in the third column is equal to the sum of the positive elements of the other two columns. I have tried the below and just receive a column of NaN values. DataFrame: Answer You can use df.mask here and fill values less than
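The `df.mask` approach the answer mentions can be sketched like this, on a hypothetical two-column frame: replace every negative element with 0, then sum row-wise.

```python
import pandas as pd

# Hypothetical frame with mixed-sign values.
df = pd.DataFrame({"a": [1, -2, 3], "b": [-4, 5, 6]})

# mask() replaces every element where the condition holds (value < 0)
# with 0, so the row-wise sum counts only the positive elements.
df["c"] = df[["a", "b"]].mask(df[["a", "b"]] < 0, 0).sum(axis=1)
```

`df["a"].clip(lower=0) + df["b"].clip(lower=0)` is an equivalent two-column spelling of the same idea.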
Adding an increment to duplicates within a Python dataframe
I'm looking to concatenate two columns in a data frame and, where there are duplicates, append an integer at the end. The wrinkle here is that I will keep receiving feeds of data, and the increment needs to be aware of historical values that were generated and not reuse them. I've been trying to do this with an apply function
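A vectorised sketch of this, assuming the history can be kept as a hypothetical dict mapping each base key to the highest suffix already handed out: `groupby(...).cumcount()` numbers the duplicates within the current feed, and the historical high-water mark is added on top so old suffixes are never reused.

```python
import pandas as pd

# Hypothetical history: highest suffix already issued per base key.
history = {"foo_bar": 2}

df = pd.DataFrame({"a": ["foo", "foo", "baz"], "b": ["bar", "bar", "qux"]})
base = df["a"] + "_" + df["b"]

# Keys with no history start at 0; previously seen keys resume
# one past their recorded high-water mark.
offset = base.map(history).fillna(-1) + 1

# cumcount numbers duplicates within this feed: 0, 1, 2, ...
suffix = base.groupby(base).cumcount() + offset
df["key"] = base + "_" + suffix.astype(int).astype(str)
```

After each feed the `history` dict would be updated from the suffixes just issued, so the next feed continues where this one stopped.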
Iterate through a dictionary and update dataframe values
I have a dictionary, and a df column contains country codes "BHR", "SAU", "ARE", etc. How do I update this so that, if a row matches any of the dict keys, a new column ["TIMEZONE"] is set to the dict value for that row? Also, add an if statement so that if the row does not equal any key, a default value is used. Here is my try
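There is no need to iterate over the dictionary at all: `Series.map` does the key lookup for every row at once, and `fillna` supplies the default for codes missing from the dict. A sketch with hypothetical timezone values:

```python
import pandas as pd

# Hypothetical country-code -> timezone dictionary.
tz = {"BHR": "Asia/Bahrain", "SAU": "Asia/Riyadh", "ARE": "Asia/Dubai"}

df = pd.DataFrame({"COUNTRY": ["BHR", "SAU", "XXX"]})

# map() looks each code up in the dict; unmatched codes become NaN,
# which fillna then replaces with the default -- no per-row if needed.
df["TIMEZONE"] = df["COUNTRY"].map(tz).fillna("UTC")
```

The `fillna("UTC")` plays the role of the requested if-statement: any code not present as a key gets the default.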