I am quite new to pandas, but I use Python at a good level. I have a pandas dataframe which is organized as follows. It is a fairly large dataframe (7 columns and ~600k rows). What I would like to do is: given a tuple containing values referring to the idbasin column (e.g. (1,2)), if the idrun value is the
Tag: pandas
Trouble With SQL Query in Python
Hello, I’m getting an error: near “join”: syntax error. Is there an obvious issue with this that I’m not picking up on? I’ve changed names in the query, but I’ve already gone over it and checked for spelling errors. Answer: In SQL, the order of clauses is SELECT, FROM, JOIN, WHER…
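The clause order described in the answer can be illustrated with a small sketch. The original query is not shown in the excerpt, so the table names (orders, customers), the columns, and the use of an in-memory sqlite3 database are all assumptions made for illustration only.

```python
import sqlite3

# Hypothetical schema; the original question's tables are not shown.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 5.0);
""")

# JOIN belongs directly after FROM; WHERE comes after the join.
good_query = """
    SELECT c.name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.total > 10
"""
rows = conn.execute(good_query).fetchall()

# Putting WHERE before JOIN triggers a syntax error near the JOIN keyword.
bad_query = """
    SELECT c.name, o.total
    FROM orders AS o
    WHERE o.total > 10
    JOIN customers AS c ON c.id = o.customer_id
"""
try:
    conn.execute(bad_query)
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
```

If the failing query has a WHERE (or GROUP BY, ORDER BY) ahead of the JOIN, moving the JOIN up next to FROM is usually the first thing to try.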
How to make this code not consume so much RAM?
I have these two functions, and when I run them my kernel dies so freaking quickly. What can I do to prevent it? It happens after appending about 10 files to the dataframe. Unfortunately, the JSON files are so big (approx. 150 MB each, and there are dozens of them) that I have no idea how to join them together. EDIT: Due
Get values from dataframe with MultiIndex index containing NaNs
I cannot access the values of an index position that has a NaN in it and wonder how I could solve this. (In my project this index has a very special meaning and I really need to keep it, otherwise I would need to make some dirty manual modifications: “there is always a solution” even if it is a ve…
create new column based on weekly change, based on ID
I have the above data for 1 month, and I want to create a new column delta_rank_7 which tells me the change in rank over the last 7 days for each id (NaNs for 2021-06-01 to 2021-06-07). I can do something like what is mentioned in “Calculating difference between two rows in Python / Pandas”, but I have multiple entries for ea…
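One possible way to compute such a per-id weekly change is a grouped diff. This is only a sketch: the original data is not shown, so the column names date, id and rank are assumed, and the approach relies on each id having exactly one row per day with no gaps.

```python
import pandas as pd

# Hypothetical data: two ids, ten consecutive days each.
dates = pd.date_range("2021-06-01", periods=10, freq="D")
df = pd.DataFrame({
    "date": list(dates) * 2,
    "id": ["a"] * 10 + ["b"] * 10,
    "rank": list(range(1, 11)) + list(range(20, 10, -1)),
})

# Sort so that rows within each id are in date order, then take the
# difference to the row 7 positions earlier within the same id.
# Assumes one row per id per day; with gaps, 7 rows is not 7 days.
df = df.sort_values(["id", "date"])
df["delta_rank_7"] = df.groupby("id")["rank"].diff(7)
```

The first seven days of each id have no earlier row to compare against, so they come out as NaN, which matches the behaviour asked for in the question.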
Arranging call data from salesforce in 15 minute intervals
I am new to Python and pandas, and also to Stack Overflow, so I apologize in advance for any mistakes I make. I have this dataframe, with its output, and my desired outcome is something like the one below. I have this code from another topic (link), but it does not take “interval_start” into consideration, I …
Dropping rows at specific minutes
I am trying to drop rows at specific minutes (05, 10, 20). I have datetime as an index. When I run the code below, it returns an invalid syntax error. Answer: You can just do it using boolean indexing, assuming that the index is already parsed as datetime. Or the opposite of the same answer:
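The answer's code is not included in this excerpt. A minimal sketch of the boolean-indexing idea, assuming a DatetimeIndex and a made-up value column, might look like this:

```python
import pandas as pd

# Hypothetical frame with a DatetimeIndex at 5-minute steps
# (minutes 0, 5, 10, 15, 20, 25).
idx = pd.date_range("2021-06-01 00:00", periods=6, freq="5min")
df = pd.DataFrame({"value": range(6)}, index=idx)

minutes_to_drop = [5, 10, 20]

# True for rows whose timestamp falls on one of the listed minutes.
mask = df.index.minute.isin(minutes_to_drop)

kept = df[~mask]     # rows at the listed minutes removed
dropped = df[mask]   # the opposite selection: only the listed minutes
```

If the index is still made of strings, it would need to be converted first, for example with pd.to_datetime(df.index).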
Lookup Values by Corresponding Column Header in Pandas 1.2.0 or newer
The operation pandas.DataFrame.lookup is “Deprecated since version 1.2.0”, and this has since invalidated a lot of previous answers. This post attempts to function as a canonical resource for looking up corresponding row-column pairs in pandas versions 1.2.0 and newer. Standard LookUp Values With Default …
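The post's own examples are cut off here. One commonly used replacement for DataFrame.lookup combines pd.factorize with NumPy integer indexing; the sketch below uses made-up column names (Col, A, B, Val) and is not necessarily the exact code from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "Col" names the column to read from in each row.
df = pd.DataFrame({
    "Col": ["B", "A", "A", "B"],
    "A": [1, 2, 3, 4],
    "B": [5, 6, 7, 8],
})

# factorize gives integer codes per row plus the unique labels.
idx, cols = pd.factorize(df["Col"])

# Reorder the columns to match the unique labels, then pick one
# element per row: row i, column idx[i].
values = df.reindex(cols, axis=1).to_numpy()
df["Val"] = values[np.arange(len(df)), idx]
```

Because reindex fills labels that do not exist as columns with NaN, a row pointing at a missing column yields NaN instead of raising.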
How to efficiently do an operation on each pandas group
So I have a data frame like this. What I am doing is grouping by id and doing a rolling operation on the delay column, like below. It is working just fine, but I am curious whether .apply on a grouped data frame is vectorized or not. Since my dataset is huge, is there a better-vectorized way to do this …
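The original code is not shown, so the following is only a sketch of one alternative to .apply: calling .rolling directly on the grouped column. The rolling mean, the window of 2, and the sample values are assumptions; only the id and delay column names come from the question.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "delay": [10.0, 20.0, 30.0, 5.0, 15.0],
})

# Rolling directly on the groupby object avoids a Python-level
# function call per group. Window size and mean are assumed.
rolled = df.groupby("id")["delay"].rolling(window=2, min_periods=1).mean()

# The result is indexed by (id, original row label); drop the id
# level so it aligns with the original frame on assignment.
df["rolling_delay"] = rolled.reset_index(level=0, drop=True)
```

Whether this is faster in practice depends on the operation and the number of groups, so it is worth timing both versions on a sample of the real data.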
Pandas: AttributeError: ‘float’ object has no attribute ‘MACD’
I would like to compare 2 rows in a pandas dataframe, but I always get an error saying: AttributeError: ‘float’ object has no attribute ‘MACD’. This is the df: Now I want to count how many times it would buy and sell based on some information in the rows, so I’m trying to iterat…