Tag: dataframe

Python in Databricks

How do I even start a basic query in Databricks using Python? The data I need is in Databricks, and so far I have been using JupyterHub to pull the data and modify a few things. Now I want to eliminate the step of pulling the data into JupyterHub, move my Python code directly into Databricks, and then schedule the job.
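One possible starting point is sketched below. It assumes the code runs in a Databricks notebook or job, where a `SparkSession` named `spark` is already defined. The helper name `query_to_pandas` and the table name in the usage comment are hypothetical, chosen only for illustration.

```python
def query_to_pandas(spark, sql_text):
    """Run a SQL query through a SparkSession and return a pandas DataFrame.

    `spark` is expected to be the SparkSession that Databricks provides
    in notebooks and jobs; this helper name is hypothetical.
    """
    # spark.sql() returns a Spark DataFrame; toPandas() collects it onto
    # the driver, so keep the result small (filter or LIMIT in the query).
    return spark.sql(sql_text).toPandas()


# Usage inside Databricks (table name is a placeholder, not from the question):
# pdf = query_to_pandas(spark, "SELECT * FROM my_schema.my_table LIMIT 1000")
```

Once the existing pandas code works on the returned frame, the notebook can be scheduled as a job, which removes the separate JupyterHub pull step.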

Compare two DataFrames and find missing timestamps

I have two dataframes, and df2 is missing some timestamps compared to df1. I am able to find those timestamps using the following code: I want to insert those missing timestamps into df2 and fill in the column values with the average of the two previous rows. So the new df2 should look
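A minimal sketch of one way to approach this with pandas follows. The sample frames and the column names `timestamp` and `value` are assumptions, since the original data is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical sample data: df1 has every hourly timestamp, df2 lacks 02:00.
df1 = pd.DataFrame(
    {"timestamp": pd.date_range("2021-01-01 00:00", periods=5, freq=pd.Timedelta(hours=1))}
)
df2 = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2021-01-01 00:00", "2021-01-01 01:00", "2021-01-01 03:00", "2021-01-01 04:00"]
        ),
        "value": [10.0, 20.0, 40.0, 50.0],
    }
)

# Timestamps that appear in df1 but not in df2.
missing = df1.loc[~df1["timestamp"].isin(df2["timestamp"]), "timestamp"]

# Reindex df2 onto df1's timestamps so the missing rows show up as NaN.
filled = df2.set_index("timestamp").reindex(df1["timestamp"]).rename_axis("timestamp")

# Walk the gaps in time order so an already filled row can feed a later gap.
for ts in missing:
    pos = filled.index.get_loc(ts)
    if pos >= 2:  # needs two earlier rows; otherwise the row stays NaN
        filled.iloc[pos] = filled.iloc[pos - 2:pos].mean()

filled = filled.reset_index()
```

The loop is used instead of a vectorized rolling mean so that consecutive gaps are filled from values computed in earlier iterations.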

Filling empty months in pandas dataframe not working

I have a pandas DataFrame that contains only dates: Using groupby, I get a count of the monthly occurrences, as seen below: (date is only used for plotting). My issue is that for 09-2021 I have zero monthly counts, and I want my gh dataframe to include the missing rows, so that they look something like: All the way through
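One way this could be handled is to reindex the monthly counts against a complete range of months. The sketch below uses made-up dates and the column names `date`, `month` and `count`, which are assumptions; only the name `gh` comes from the question.

```python
import pandas as pd

# Hypothetical dates; note that 2021-09 and 2021-11 have no entries.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2021-07-03", "2021-07-19", "2021-08-11",
             "2021-10-05", "2021-10-28", "2021-12-01"]
        )
    }
)

# Count occurrences per calendar month (index is a monthly PeriodIndex).
monthly = df.groupby(df["date"].dt.to_period("M")).size()

# Every month from the first to the last observed month.
all_months = pd.period_range(monthly.index.min(), monthly.index.max(), freq="M")

# Reindex so that months with no rows appear with a count of 0.
gh = (
    monthly.reindex(all_months, fill_value=0)
    .rename_axis("month")
    .reset_index(name="count")
)
```

Grouping alone can only produce months that occur in the data, which is why the empty months have to be introduced through an explicit range.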

How to transform dataframe into data set/object

I have a data set in a dataframe that’s almost 9 million rows and 30 columns. The columns become more specific from left to right, so the data in the first columns is very repetitive. See example:

park_code  camp_ground  parking_lot
acad       campground1  parking_lot1
acad       campground1  parking_lot2
acad       campground2  parking_lot3
bisc       campground3  parking_lot4

I’m looking to feed that
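The excerpt is cut off before it names the target format, so the sketch below assumes a nested dictionary keyed by the repetitive leading columns. It uses the three example columns from the question; the variable name `nested` is hypothetical.

```python
import pandas as pd

# The example rows from the question.
df = pd.DataFrame(
    {
        "park_code": ["acad", "acad", "acad", "bisc"],
        "camp_ground": ["campground1", "campground1", "campground2", "campground3"],
        "parking_lot": ["parking_lot1", "parking_lot2", "parking_lot3", "parking_lot4"],
    }
)

# Group on the repetitive leading columns and store each repeated value once,
# as a dictionary key, with the most specific column collected into a list.
nested = {}
for (park, camp), group in df.groupby(["park_code", "camp_ground"], sort=False):
    nested.setdefault(park, {})[camp] = group["parking_lot"].tolist()
```

For the example rows, `nested["acad"]["campground1"]` holds both of that campground's parking lots. With 30 columns the same idea would need more grouping levels, or a recursive helper.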