I am brand new to coding and was given a web scraping tutorial (found here) to help build my skills as I learn. I’ve already had to make several adjustments to the code in this tutorial, but I digress. I’m scraping http://books.toscrape.com/ and, when I try to export a DataFrame of just the book categories to Excel, I
Tag: dataframe
Write Dataframe outputs from a for loop to Excel without overwriting Pandas
I have an extensive set of code that produces 6 DataFrames per for-loop run. The column names depend on which vehicle is running through the loop, so they differ between runs, but the DataFrames are all the same size. I want to write a couple of DataFrames to the same sheet, but I have issues with
Python in Databricks
How do I even start a basic query in Databricks using Python? The data I need is in Databricks, and so far I have been using JupyterHub to pull the data and modify a few things. Now I want to eliminate the step of pulling the data into JupyterHub, move my Python code directly into Databricks, and then schedule the job.
Compare two DataFrames and find missing timestamps
I have the following two dataframes: and in df2 I have some missing timestamps compared to df1. I am able to find those timestamps using the following code: I want to populate those missing timestamps in df2 and fill in the values of the columns with the average value of the two previous rows. So the new df2 should look
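One way this could be approached, as a minimal sketch rather than a definitive answer. It assumes both frames are indexed by timestamp; the sample data, the `value` column, and the name `filled` are hypothetical. Missing rows are filled in chronological order, so an already filled row can feed into the next average.

```python
import pandas as pd

# Hypothetical data: df2 lacks two of the timestamps present in df1.
df1 = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2021-01-01", periods=5, freq="D"),
)
df2 = df1.drop(df1.index[[2, 4]])

# Timestamps that appear in df1 but not in df2.
missing = df1.index.difference(df2.index)

# Add empty rows for the missing timestamps, keeping chronological order.
filled = df2.reindex(df2.index.union(missing))

# Fill each missing row with the column-wise mean of the two rows before it.
for ts in missing:
    pos = filled.index.get_loc(ts)
    filled.loc[ts] = filled.iloc[max(pos - 2, 0):pos].mean()
```

If a missing timestamp has fewer than two rows before it, the mean is taken over whatever rows exist, and a gap at the very start stays NaN.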
Filling empty months in pandas dataframe not working
I have a pandas DataFrame exclusively with dates: Using groupby I get a count for the number of monthly occurrences as seen below: (date is only used for plotting reasons). My issue is, come 09-2021 I have zero monthly counts and I want to obtain my gh dataframe such that the missing rows look something like: All the way through
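A possible way to get zero-count rows for the empty months, sketched under assumptions: the column is called `date`, the sample dates are made up, and the result reuses the name `gh` from the question. The idea is to count per monthly period and then reindex against the full run of months.

```python
import pandas as pd

# Hypothetical sample dates; nothing falls in September 2021.
df = pd.DataFrame({"date": pd.to_datetime([
    "2021-07-03", "2021-07-19", "2021-08-11",
    "2021-10-02", "2021-10-25", "2021-10-30",
])})

# Count rows per calendar month, keyed by a monthly Period.
counts = df.groupby(df["date"].dt.to_period("M")).size()

# Build every month between the first and last observed month,
# then reindex so months with no rows get a count of 0.
all_months = pd.period_range(counts.index.min(), counts.index.max(), freq="M")
gh = (
    counts.reindex(all_months, fill_value=0)
    .rename_axis("date")
    .reset_index(name="count")
)
```

To extend the zeros past the last observed month, the end of `pd.period_range` can be set to an explicit month instead of `counts.index.max()`.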
how to extract values based upon month in xarray
I have an array of dimensions (9131,101,191). The first dimension is the days from 1/1/2075 to 31/12/2099. I want to extract all the days that fall in the month of July. How can I do this in xarray? I have tried using loops and numpy but am not getting the desired result. Ultimately, I want to extract all the arrays which
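A minimal sketch of one way to select by month, assuming the array carries a datetime `time` coordinate on a standard calendar. The dimension names `time`, `y`, `x` and the placeholder data are assumptions, and the spatial size is shrunk from (101, 191) to keep the demo light.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Daily time axis matching the question: 1 Jan 2075 through 31 Dec 2099.
time = pd.date_range("2075-01-01", "2099-12-31", freq="D")

# Placeholder data with a small, made-up spatial grid.
data = np.zeros((time.size, 4, 5))
da = xr.DataArray(data, coords={"time": time}, dims=("time", "y", "x"))

# Boolean mask over the time coordinate: True where the month is July.
july = da.sel(time=da["time"].dt.month == 7)
```

The same mask works for other months, or for several at once with `da["time"].dt.month.isin([6, 7, 8])`.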
From DataFrame to the body of an email automatically, several formatting issues: thousands separator, color (red for negative numbers and green for positive)
I have a DataFrame that looks like this: I wish to send it as the body of an email with Outlook. It would be great to automate this in the future (as a daily report without human intervention), but for the moment I am struggling with some formatting: how to get it directly into the body of the email or I have
how to transform dataframe into data set/object
I have a data set in a dataframe that’s almost 9 million rows and 30 columns. Moving from the first column to the last, the data becomes more specific, so the values in the first columns are very repetitive. See example:
park_code  camp_ground  parking_lot
acad       campground1  parking_lot1
acad       campground1  parking_lot2
acad       campground2  parking_lot3
bisc       campground3  parking_lot4
I’m looking to feed that
Slice Dataframe in sub-dataframes when specific string in column is found
Assume I have the dataframe df and I want to slice it into multiple dataframes and store each in a list (list_of_dfs). Each sub-dataframe should only contain the rows “Result”. A sub-dataframe starts where column “Point” has the value “P1” and column “X_Y” has the value “X”. I tried this by first finding the indices of each “P1”
Use fields of one dataframe as conditions to fill a field of another dataframe
I have 2 dataframes, the first is a small dataframe (df1) with information to use to fill a field (named Flag) of the second dataframe (df2). I need to write a function that uses each row of df1 as parameters to fill each row of df2 with a certain value (Y or N). df1 = type q25 q75 A 13