DataFrame: I have columns “group” and “val”, and I don’t know how to write pandas code to produce the column “count”. The logic is this: it should count the number of consecutive values that are on the same side (either positive or negative), grouped by column “group…
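The excerpt is cut off, so the exact expected output is unknown. The following is only a minimal sketch under one reading of it: a running count that restarts whenever the sign of “val” flips or the group changes. The sample data is made up for illustration, and zeros are treated as their own "side" by np.sign.

```python
import numpy as np
import pandas as pd

# Made-up sample data; only the column names "group" and "val" come from the question.
df = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b"],
    "val": [1.0, 2.0, -1.0, -3.0, -2.0, 4.0, 5.0],
})

sign = np.sign(df["val"])
# Sign of the previous row within the same group (NaN on each group's first row).
prev_sign = sign.groupby(df["group"]).shift()
# A new run starts wherever the sign differs from the previous row of the group.
run_id = (sign != prev_sign).cumsum().rename("run")
# Position inside each run, starting at 1.
df["count"] = df.groupby(["group", run_id]).cumcount() + 1
```

If the total length of each run is wanted instead of a running position, replacing cumcount() + 1 with ["val"].transform("size") on the same grouping would be one option.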
Tag: pandas
Websocket Json Data to DataFrame
I am learning how to work with APIs and web sockets in finance. My goal for this code is to access data and create a DataFrame with only the columns index, ask, bid and quote. I have tried appending values to the DataFrame, but it creates a new DataFrame every time I receive a message, similar to df = new_df…
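One common way around the "new DataFrame per message" problem is to collect parsed messages in a plain list and build the DataFrame once. The sketch below assumes each message is a flat JSON object containing ask, bid and quote keys; the handler name on_message and the simulated messages are hypothetical, since the original feed is not shown.

```python
import json

import pandas as pd

rows = []  # one dict per received message


def on_message(message: str) -> None:
    """Hypothetical handler: parse one JSON message and keep only the wanted fields."""
    data = json.loads(message)
    rows.append({key: data.get(key) for key in ("ask", "bid", "quote")})


# Simulated messages standing in for the real socket feed (assumed structure).
for raw in (
    '{"ask": 1.2, "bid": 1.1, "quote": 1.15, "extra": 0}',
    '{"ask": 1.3, "bid": 1.2, "quote": 1.25, "extra": 0}',
):
    on_message(raw)

# Build the DataFrame once from the accumulated rows; the default RangeIndex serves as the index.
df = pd.DataFrame(rows, columns=["ask", "bid", "quote"])
```

Appending to a list is cheap, whereas growing a DataFrame row by row copies the data on every message.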
How to compare each value of column B with the value of column A?
Compare each value in column B with the first value in column A until it is greater than it, then set the expected column to true. Then compare the value of column A in the row where the expected column is true until a column B value is greater than it, then set the expected column to true. Input: Expected Output Answe…
Pandas groupby column and sum nulls of all other columns
I have a dataframe with the following structure: I’d like to know, grouping by group, how many nulls there are in each column. In this case, the output should be: I don’t have control over how many columns I have or their names. Thanks! Answer Convert column group to index, test all other values f…
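The truncated answer appears to describe moving group into the index and testing the remaining values for missing data. A minimal sketch along those lines, with made-up sample data and hypothetical column names x and y:

```python
import numpy as np
import pandas as pd

# Made-up sample; only the column name "group" comes from the question.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x": [1.0, np.nan, np.nan, np.nan],
    "y": [np.nan, 2.0, 3.0, 4.0],
})

# Move "group" to the index, flag missing values, then sum the flags per group.
null_counts = df.set_index("group").isna().groupby(level=0).sum()
```

Because isna() is applied to every remaining column, this works without knowing how many columns there are or what they are called.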
Pandas – How to use multiple cols for mapping (without merging)?
I have a dataframe like the one below. I would like to do the following: a) Attach the location column from key_df to data_df based on two fields, p_id and company. So I tried the below, but this resulted in an error like the one below: KeyError: "None of [Index(['p_id', 'company'], dtype='o…
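The attempted code and its answer are not visible in the excerpt. One possible way to look up on two columns without merge is to index the key table by both fields and reindex it with the key pairs from the data table. The sketch below uses made-up rows and assumes each (p_id, company) pair occurs at most once in key_df.

```python
import pandas as pd

# Made-up rows; only the names key_df, data_df, p_id, company and location come from the question.
key_df = pd.DataFrame({
    "p_id": [1, 1, 2],
    "company": ["A", "B", "A"],
    "location": ["NY", "LA", "SF"],
})
data_df = pd.DataFrame({"p_id": [1, 2, 1, 3], "company": ["B", "A", "A", "A"]})

# Series of locations keyed by the (p_id, company) pair; pairs must be unique.
lookup = key_df.set_index(["p_id", "company"])["location"]
# The same pair of keys for every row of data_df.
keys = pd.MultiIndex.from_frame(data_df[["p_id", "company"]])
# Align the lookup to those keys; pairs missing from key_df become NaN.
data_df["location"] = lookup.reindex(keys).to_numpy()
```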
Apply T-Test test per group
I have a dataframe like this: I want to calculate the p-value from a t-test for each variable between groups. I can manually calculate each p-value like this: So the question is, how can I get a result dataframe like the one shown below for all variables automatically? Answer There are several ways; the core idea is to us…
Add additional timestamp to Pandas DataFrame items based on item timestamp/index
I have a large time-indexed Pandas DataFrame with time-series data from a couple of devices. The structure of this DataFrame (self._combined_data_frame in the code below) looks like this: The DatetimeIndex and device_name are filled for every row; the other columns contain NaN values. Sample data is available on Go…
In Jupyter notebooks, how to connect to MS SQL with a different Windows user
I have SELECT access to an MS SQL database from which I would like to extract data into a Pandas dataframe inside a Jupyter notebook. For reasons outside my control, I have access to the database from a different user. How can I query the database from Jupyter while logged in to my current user account? Answe…
Creating another column in pandas based on a pre-existing column
My data frame has a third column from which I want to create a fourth column that looks almost the same, except that it has no double quotes and has a ‘user/’ prefix before each ID in the list. Also, sometimes the value is just a single ID rather than a list of IDs (as shown in the example DF).
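The example DF is not included in the excerpt, so the sketch below rests on an assumption: each cell holds either a Python list of double-quoted ID strings or a single such string. The column names ids and user_ids and the helper add_prefix are hypothetical.

```python
import pandas as pd


def add_prefix(value):
    """Strip double quotes and prefix each ID with 'user/' (accepts a list or a single ID)."""
    if isinstance(value, list):
        return ["user/" + str(v).strip('"') for v in value]
    return "user/" + str(value).strip('"')


# Made-up data: one row with a list of IDs, one row with a single ID.
df = pd.DataFrame({"ids": [['"a1"', '"b2"'], '"c3"']})
df["user_ids"] = df["ids"].apply(add_prefix)
```

If the cells are actually strings that merely look like lists, they would need to be parsed first (for example with ast.literal_eval) before applying the helper.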
Update column based on grouped date values
Edited/reposted with correct sample output. I have a dataframe that looks like the following: This dataframe is split into groups by ID. I would like to make an updated combined column based on if df[‘bool’] == True, but only if df[‘bool’] == True AND there is another ‘finished…