I have an incoming JSON file on which I perform some operations and trimming. The result looks like this: When applying df = pd.DataFrame(user, index=[0]) I get the following DataFrame: When applying df = pd.DataFrame(user) I get: I understand why that happens, but neither is what I want. I’d like t…
Tag: pandas
What is pandas equivalent of the following SQL?
OK, I have a dataframe that looks like the following: In SQL, to filter unique segments (segment_id) by travelmode I would do: What is the pandas equivalent of this expression? Answer Maybe: as suggested in this post.
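The SQL statement and the suggested answer did not survive in this excerpt, so the following is only a sketch of two common readings of "unique segments by travelmode". The sample rows are made up; only the column names `segment_id` and `travelmode` come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "segment_id": [1, 1, 2, 3, 3, 3],
    "travelmode": ["car", "car", "car", "bike", "bike", "car"],
})

# Reading 1 (assumption), roughly:
#   SELECT travelmode, COUNT(DISTINCT segment_id) FROM df GROUP BY travelmode
unique_counts = df.groupby("travelmode")["segment_id"].nunique()

# Reading 2 (assumption), roughly:
#   SELECT DISTINCT segment_id, travelmode FROM df
unique_pairs = df[["segment_id", "travelmode"]].drop_duplicates()

print(unique_counts)
print(unique_pairs)
```

Which of the two applies depends on whether the query counted segments or listed them.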
Replace column values between two dataframes according to index
I have a dataframe named calls; each call is recorded with a date, (calls)[datetime], and has an answer status, (calls)[Status]. A second dataframe named NewStatus has the same column (NewStatus)[datetime] and a column (NewStatus)[New_Status], whose values I want to use to replace those in the first dataframe with a date join Des…
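One possible way to do this kind of replacement, sketched with made-up rows, is to turn NewStatus into a lookup Series keyed by datetime and map it onto calls. This assumes each datetime appears at most once in NewStatus, and that calls without a match should keep their old status.

```python
import pandas as pd

# Hypothetical sample data; the frame and column names follow the question.
calls = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2020-01-01 10:00", "2020-01-01 11:00", "2020-01-01 12:00"]
    ),
    "Status": ["missed", "answered", "missed"],
})
NewStatus = pd.DataFrame({
    "datetime": pd.to_datetime(["2020-01-01 10:00", "2020-01-01 12:00"]),
    "New_Status": ["answered", "voicemail"],
})

# Build a datetime -> New_Status lookup (assumes unique datetimes in NewStatus).
lookup = NewStatus.set_index("datetime")["New_Status"]

# Map each call to its new status; keep the old status where no match exists.
calls["Status"] = calls["datetime"].map(lookup).fillna(calls["Status"])

print(calls)
```

A left `merge` on datetime followed by the same `fillna` step would be an alternative if NewStatus carries more columns.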
Pandas get difference from first row at a set dtime with groupby
I have a dataframe with [Group], [DTime] and [Value] columns. For each [Group] I’m trying to find the difference between the first [Value] and every subsequent value from a set [DTime]; for this example, say it’s the start of the df at 2015-01-01. Ultimately I would like to plot a timeseries of […
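A minimal sketch of one way to compute this, assuming made-up data and that "first" means the earliest row at or after the chosen start time within each group:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "B", "B", "B"],
    "DTime": pd.to_datetime(["2014-12-31", "2015-01-01", "2015-01-02"] * 2),
    "Value": [5.0, 10.0, 12.0, 1.0, 20.0, 17.0],
})

start = pd.Timestamp("2015-01-01")

# Keep rows from the start time onward, ordered in time within each group.
subset = df[df["DTime"] >= start].sort_values(["Group", "DTime"]).copy()

# Subtract each group's first remaining value from every value in that group.
subset["Diff"] = subset["Value"] - subset.groupby("Group")["Value"].transform("first")

print(subset)
```

`transform("first")` broadcasts the group's first value back to every row, so the subtraction stays aligned without a merge.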
How to query/filter cells against single values when cells have multiple values?
I have a CSV file in the following format (two columns, "Column one" and "Column two"): row 1 is Key1 | Value1,Value2,value3 and row 2 is Key2 | value5. I can easily use a list and .isin to filter the dataframe as follows: Which gives me the second row, but if there are cells with multiple values (like in the first row in the example table above w…
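The filtering code was lost from this excerpt, so here is a sketch of one approach for multi-valued cells: split each cell on commas and test whether any piece is in the wanted list. The column names are taken from the table above, and the `wanted` list is a hypothetical example.

```python
import pandas as pd

# Data reconstructed from the table in the question.
df = pd.DataFrame({
    "Column one": ["Key1", "Key2"],
    "Column two": ["Value1,Value2,value3", "value5"],
})

wanted = ["Value2"]  # hypothetical list of values to search for

# A plain .isin compares the whole cell string, so it finds nothing here.
plain = df[df["Column two"].isin(wanted)]

# Split each cell into its individual values, then match any of them.
values = df["Column two"].str.split(",")
mask = values.apply(lambda items: any(item.strip() in wanted for item in items))
filtered = df[mask]

print(filtered)
```

The comparison is case-sensitive, so "value2" would not match "Value2" in this sketch.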
Looping through a second column using a probability input
I have a similar question to one I posed here, but subtly different as it includes an extra step to the process involving a probability: Using a Python pandas dataframe column as input to a loop through another column I’ve got two pandas dataframes: one has these variables Another is a table with these …
Compare variables through a range
I have 2 DataFrames: one gives me 3 dates and a slope, and the other gives me a calendar with prices. I want to know if any High price in Dates breaks the line between Date1 and Date3 of the slope_and_dates DataFrame. I expect an extra column in slopes_and_dates called Break where I can s…
Add averages to existing plot with pandas.DataFrame
I have a pandas data-frame of the form and I want to plot the last 7 days together with the average over the weekdays. I can create / plot the average by using and I can create / plot the last 7 days by using but I fail to combine them to a single plot since the average uses weekday
Pandas Dataframe: Retrieve the Maximum Value in a Pandas Dataframe using .groupby and .idxmax()
I have a Pandas DataFrame that contains a series of Airbnb prices grouped by neighbourhood group, neighbourhood and room_type. My objective is to return the maximum average price for each room_type per neighbourhood and return only this. My approach to this was to use .groupby and .idxmax() to get the maximum …
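The question's own code is not shown here, so the following is only a sketch of how .groupby and .idxmax() can be combined. It assumes made-up rows, hypothetical column names `neighbourhood_group`, `neighbourhood` and `price`, and one reading of the goal: for every neighbourhood group and room_type, keep the neighbourhood with the highest average price.

```python
import pandas as pd

# Hypothetical sample data; apart from room_type, column names are assumed.
df = pd.DataFrame({
    "neighbourhood_group": ["North", "North", "North", "South", "South", "South"],
    "neighbourhood": ["N1", "N1", "N2", "S1", "S1", "S2"],
    "room_type": ["Private room", "Entire home", "Private room",
                  "Private room", "Entire home", "Entire home"],
    "price": [50, 120, 70, 40, 200, 150],
})

# Average price per (neighbourhood_group, neighbourhood, room_type), as a flat frame.
avg = (
    df.groupby(["neighbourhood_group", "neighbourhood", "room_type"], as_index=False)["price"]
    .mean()
)

# idxmax returns the row label of the highest average within each group;
# .loc then pulls back the full rows for those labels.
best_rows = avg.loc[avg.groupby(["neighbourhood_group", "room_type"])["price"].idxmax()]

print(best_rows)
```

Working on a flat frame with `as_index=False` keeps the labels returned by `idxmax` as plain integers, which makes the `.loc` lookup straightforward.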
Pandas: Is better aggregation possible
I have the sample dataframe above. I wish to calculate the percentage of True values for each date. I am able to do it as below, but I feel it can be done with groupby + agg. Is it possible? My attempt: Answer You can do groupby like this: Output: You can get both percentages for T and F with crosstab: Output: Note 1: Extra commen…
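The answer's code and output are missing from this excerpt. As a rough sketch of the two techniques it names (groupby and crosstab), assuming made-up data with hypothetical column names `date` and `flag`:

```python
import pandas as pd

# Hypothetical sample data; column names are assumed.
df = pd.DataFrame({
    "date": ["2021-01-01"] * 4 + ["2021-01-02"] * 2,
    "flag": [True, False, True, True, False, True],
})

# The mean of a boolean column is the fraction of True values.
pct_true = df.groupby("date")["flag"].mean() * 100

# crosstab with normalize="index" gives row-wise shares for both False and True.
pct_both = pd.crosstab(df["date"], df["flag"], normalize="index") * 100

print(pct_true)
print(pct_both)
```

The groupby version answers the original question directly; the crosstab version is handy when both shares are needed side by side.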