Tag: dataframe

Python in Databricks

How do I even start a basic query in Databricks using Python? The data I need is in Databricks, and so far I have been using JupyterHub to pull the data and modify a few things. Now I want to eliminate the step of pulling the data into JupyterHub, move my Python code directly into Databricks, and then schedule the job…

Compare two DataFrames and find missing timestamps

I have two DataFrames, df1 and df2, and df2 is missing some timestamps compared to df1. I am already able to find those timestamps in code. I want to insert the missing timestamps into df2 and fill the values of the other columns with the average of the two previous rows. So the new df2 …
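The fill described in this question can be sketched in pandas as follows. This is only a minimal sketch under assumptions the excerpt does not confirm: the column names `timestamp` and `value`, the helper name `fill_missing_from_previous_two`, and the sample data are all hypothetical, timestamps are assumed unique within each DataFrame, and every non-timestamp column is assumed numeric.

```python
import pandas as pd


def fill_missing_from_previous_two(df1, df2, ts_col="timestamp"):
    """Add timestamps that are in df1 but not df2, filling each new row
    with the mean of the two rows before it (hypothetical helper)."""
    # Timestamps present in df1 but absent from df2 (unique and sorted).
    missing = pd.Index(df1[ts_col]).difference(pd.Index(df2[ts_col]))

    # Reindex df2 onto its own timestamps plus the missing ones;
    # the new rows start out as NaN.
    out = df2.set_index(ts_col)
    out = out.reindex(out.index.union(missing)).sort_index()

    # Fill in chronological order so that a run of consecutive gaps
    # can build on the rows filled just before it.
    for ts in missing:
        pos = out.index.get_loc(ts)
        prev = out.iloc[max(pos - 2, 0):pos]
        if not prev.empty:  # a gap before the first row stays NaN
            out.loc[ts] = prev.mean()

    return out.rename_axis(ts_col).reset_index()


# Made-up sample data: df2 lacks the third and fourth timestamps of df1.
stamps = pd.date_range("2024-01-01", periods=5, freq="min")
df1 = pd.DataFrame({"timestamp": stamps, "value": [1.0, 3.0, 5.0, 7.0, 10.0]})
df2 = df1.iloc[[0, 1, 4]].reset_index(drop=True)

result = fill_missing_from_previous_two(df1, df2)
print(result)
```

With this sample the first gap becomes the mean of 1.0 and 3.0, and the second gap becomes the mean of 3.0 and the value just filled. If only a single previous row exists, the sketch uses that row alone.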

How to transform a DataFrame into a data set/object

I have a data set in a DataFrame with almost 9 million rows and 30 columns. Moving from the first column to the last, the data becomes more specific, so the values in the first columns are very repetitive. See example: park_code camp_ground parking_lot acad campground1 parking_lot1 acad campground1 parking_lo…
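One possible reading of this question is that the repeated leading columns should collapse into a nested object, with each column forming one level. The sketch below illustrates that idea with pandas and plain dictionaries. The helper name `nest_rows` is hypothetical, and the sample rows go beyond what the excerpt shows: only `acad`, `campground1`, and `parking_lot1` come from the question, the other values are invented for illustration.

```python
import pandas as pd


def nest_rows(df, levels):
    """Build a nested dict with one level per column in `levels`
    (hypothetical helper; leaf keys map to empty dicts)."""
    tree = {}
    # Drop duplicate combinations first so repeated rows are visited once.
    unique_rows = df[levels].drop_duplicates()
    for row in unique_rows.itertuples(index=False, name=None):
        node = tree
        for key in row:
            # Descend one level, creating the branch if it is new.
            node = node.setdefault(key, {})
    return tree


# Illustrative rows only; values other than the first row are made up.
parks = pd.DataFrame(
    {
        "park_code": ["acad", "acad", "acad"],
        "camp_ground": ["campground1", "campground1", "campground2"],
        "parking_lot": ["parking_lot1", "parking_lot2", "parking_lot3"],
    }
)

nested = nest_rows(parks, ["park_code", "camp_ground", "parking_lot"])
print(nested)
```

Each repeated value is stored once as a dictionary key, so the repetition in the leading columns disappears. For millions of rows the Python-level loop may be slow, which is why the sketch removes duplicate combinations before iterating.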