Tag: pandas

Sum rows based on columns inside pandas dataframe

I am quite new to pandas, but I use Python at a good level. I have a pandas dataframe which is organized as follows. It is a fairly large dataframe (7 columns and ~600k rows). What I would like to do is: given a tuple containing values referring to the idbasin column (e.g. (1,2)), if the idrun value is the

Trouble With SQL Query in Python

pandas python sql sqlite

Hello, I’m getting an error: near “join”: syntax error. Is there an obvious issue with this that I’m not picking up on? I’ve changed names in the query, but I’ve already gone over it and checked for spelling errors. Answer In SQL, the order of clauses is SELECT, FROM, JOIN, WHER…
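The clause order mentioned in the answer can be checked with a small in-memory SQLite database. This is only a sketch: the tables `orders` and `customers` and their columns are hypothetical and are not taken from the original question.

```python
import sqlite3

# Hypothetical tables and data, used only to illustrate clause order.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 5.0), (12, 1, 40.0);
    """
)

# JOIN belongs right after FROM and before WHERE.
good_query = """
    SELECT c.name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.total > 10
    ORDER BY o.id
"""
rows = conn.execute(good_query).fetchall()

# Putting JOIN after WHERE is rejected by the SQLite parser.
bad_query = """
    SELECT c.name, o.total
    FROM orders AS o
    WHERE o.total > 10
    JOIN customers AS c ON c.id = o.customer_id
"""
join_error = None
try:
    conn.execute(bad_query)
except sqlite3.OperationalError as exc:
    join_error = exc

conn.close()
```

If a query raises a syntax error near `join`, comparing its clause order against `good_query` is a reasonable first check.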

How to make this code not consume so much RAM?

dataframe low-memory memory pandas python

I have these two functions, and when I run them my kernel dies so freaking quickly. What can I do to prevent it? It happens after appending about 10 files to the dataframe. Unfortunately, the JSON files are so big (approx. 150 MB each, with dozens of them) and I have no idea how to join them together. EDIT: Due
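One common way to reduce memory pressure when combining many files is to build one frame per file, keep only the columns that are needed, and concatenate once at the end instead of appending repeatedly. The sketch below assumes each file holds a JSON array of flat records; the helper `load_json_files`, the file names, and the column names are hypothetical, since the asker's two functions are not shown in the excerpt.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_json_files(paths, columns=None):
    """Read each JSON file into its own frame, then concatenate once."""
    frames = []
    for path in paths:
        with open(path, "r", encoding="utf-8") as fh:
            records = json.load(fh)  # assumed: a list of flat dicts
        # Dropping unneeded columns here keeps the accumulated frames small.
        frames.append(pd.DataFrame(records, columns=columns))
    # A single concat avoids re-copying a growing frame on every append.
    return pd.concat(frames, ignore_index=True)


# Tiny demo: two temporary files stand in for the real, much larger ones.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(2):
        path = Path(tmp) / f"part_{i}.json"
        records = [{"a": i, "b": "x", "c": 0.5}, {"a": i + 10, "b": "y", "c": 1.5}]
        path.write_text(json.dumps(records), encoding="utf-8")
        paths.append(path)
    combined = load_json_files(paths, columns=["a", "b"])
```

Each file is still parsed fully into memory here, so for very large files this only removes the cost of repeated appends and unused columns; it does not stream the JSON.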

Get values from dataframe with MultiIndex index containing NaNs

multi-index pandas pandas-groupby python

I cannot access the values of an index position that has a NaN in it and wonder how I could solve this. (In my project this index has a very special meaning and I really need to keep it, otherwise I would need to make some dirty manual modifications: “there is always a solution” even if it is a ve…

create new column based on weekly change, based on ID

calculated-columns dataframe pandas python

I have the above data for 1 month and I want to create a new column delta_rank_7 which tells me the change in rank over the last 7 days for each id (NaNs for 2021-06-01 to 2021-06-07). I can do something like what is mentioned in “Calculating difference between two rows in Python / Pandas”, but I have multiple entries for ea…
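A minimal sketch of a per-id 7-day change, assuming exactly one row per id per calendar day with no gaps, so that a 7-row shift within each id equals a 7-day shift. The column names `date`, `id`, and `rank` and the sample values are assumptions; only `delta_rank_7` comes from the question.

```python
import pandas as pd

# Hypothetical data: two ids, ten consecutive days each.
dates = pd.date_range("2021-06-01", periods=10, freq="D")
df = pd.DataFrame(
    {
        "date": list(dates) * 2,
        "id": ["a"] * 10 + ["b"] * 10,
        "rank": list(range(1, 11)) + list(range(20, 0, -2)),
    }
)

# Sort so that rows within each id are in date order before differencing.
df = df.sort_values(["id", "date"])

# diff(7) subtracts the value 7 rows earlier within the same id;
# the first 7 rows of each id have no earlier value and become NaN.
df["delta_rank_7"] = df.groupby("id")["rank"].diff(7)
```

If days can be missing for some ids, a row-based shift no longer corresponds to 7 days, and a date-based join or reindex would be needed instead.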

Arranging call data from salesforce in 15 minute intervals

datetime explode intervals pandas python

I am new to Python and pandas, and also to Stack Overflow, so I apologize in advance for any mistakes I make. I have this dataframe; the output is shown, and my desired outcome is to have something like the one below. I have this code from another topic (link), but it does not take “interval_start” into consideration, I …

Dropping rows at specific minutes

pandas python timestamp

I am trying to drop rows at specific minutes (05, 10, 20). I have datetime as the index. When I run the code below, it returns an invalid syntax error. Answer You can just do it using boolean indexing, assuming that the index is already parsed as datetime. Or the opposite of the same answer:
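The boolean-indexing idea from the answer might look like the sketch below. It assumes the index is a `DatetimeIndex`; the sample frame and the `value` column are hypothetical.

```python
import pandas as pd

# Hypothetical frame with one row every 5 minutes, starting at 00:00.
index = pd.date_range("2021-06-01 00:00", periods=13, freq="5min")
df = pd.DataFrame({"value": range(13)}, index=index)

minutes_to_drop = [5, 10, 20]

# True for rows whose timestamp falls on one of the listed minutes.
mask = df.index.minute.isin(minutes_to_drop)

kept = df[~mask]    # rows outside the listed minutes
dropped = df[mask]  # the opposite selection: only the listed minutes
```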

Lookup Values by Corresponding Column Header in Pandas 1.2.0 or newer

dataframe pandas python

The operation pandas.DataFrame.lookup is “Deprecated since version 1.2.0”, and has since invalidated a lot of previous answers. This post attempts to function as a canonical resource for looking up corresponding row col pairs in pandas versions 1.2.0 and newer. Standard LookUp Values With Default …
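One commonly used replacement for the deprecated lookup combines `pd.factorize` with NumPy integer indexing. The sketch below is an illustration under assumed data: the column names `Col`, `A`, `B`, and `Val` are hypothetical and not taken from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "Col" names the column to read from in each row.
df = pd.DataFrame(
    {
        "Col": ["B", "A", "A", "B"],
        "A": [1, 2, 3, 4],
        "B": [5, 6, 7, 8],
    }
)

# Encode the wanted column labels as integer codes plus the unique labels.
col_codes, col_labels = pd.factorize(df["Col"])

# Reorder the value columns to match the label order, then pick one
# value per row with paired (row, column) integer indices.
values = df.reindex(columns=col_labels).to_numpy()
df["Val"] = values[np.arange(len(df)), col_codes]
```

Labels in `Col` that do not exist as columns would produce all-NaN columns after `reindex`, so this sketch assumes every label is present.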

How to efficiently do operation on pandas each group

dataframe numpy pandas python

So I have a data frame like this. What I am doing is grouping by id and doing a rolling operation on the delay column, like below. It is working just fine, but I am curious whether .apply on a grouped data frame is vectorized or not. Since my dataset is huge, is there a better, vectorized way to do this …
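As a rough illustration, a Python function passed to a per-group call runs once per group, while the built-in `groupby(...).rolling(...)` path stays inside pandas' compiled routines. The sketch below assumes a window of 2 and a mean aggregation, neither of which is given in the excerpt; only the `id` and `delay` column names come from the question.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame(
    {
        "id": [1, 1, 1, 2, 2, 2],
        "delay": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)

# Per-group Python function: the lambda is called once for each id.
per_group = df.groupby("id")["delay"].transform(lambda s: s.rolling(2).mean())

# Built-in grouped rolling: no Python-level function per group.
# The result is indexed by (id, original index), so drop the id level
# to align it with the original frame.
built_in = (
    df.groupby("id")["delay"]
    .rolling(2)
    .mean()
    .reset_index(level=0, drop=True)
)

df["rolling_mean"] = built_in
```

Both paths give the same values here; the difference is only in how much work happens in Python versus compiled code.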

Pandas: AttributeError: ‘float’ object has no attribute ‘MACD’

dataframe pandas python row

I would like to compare 2 rows in a pandas dataframe, but I always get an error saying: AttributeError: ‘float’ object has no attribute ‘MACD’. This is the df: Now I want to count how many times it would buy and sell based on some information in the rows, so I’m trying to iterat…
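The asker's loop is not shown in the excerpt, so the following is only a guess at one way this error can arise: iterating over a single column yields plain numbers, which have no `MACD` attribute. The sketch also shows two alternatives, `itertuples` for row-wise attribute access and `shift` for comparing each row with the previous one. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical values; only the column name MACD comes from the question.
df = pd.DataFrame({"MACD": [0.5, -0.2, 0.3, 0.4]})

# Iterating a column yields scalar values, not rows, so attribute
# access on each item fails with an AttributeError.
error_message = None
try:
    for value in df["MACD"]:
        value.MACD
except AttributeError as exc:
    error_message = str(exc)

# itertuples yields one namedtuple per row, so row.MACD works.
macd_values = [row.MACD for row in df.itertuples(index=False)]

# Comparing each row with the previous one without an explicit loop:
# shift(1) moves values down one row, so the first comparison is False.
rising = df["MACD"] > df["MACD"].shift(1)
n_rising = int(rising.sum())
```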