My data set is much larger so I have simplified it. I want to convert the dataframe into a time-series. The bit I am stuck on: I have overlapping date ranges, where I have a smaller date range inside a larger one, as shown by row 0 and row 1, where row 1 and row 2 are inside the date
Tag: pandas
Create table by grouping mean values by column and list of one-hot encoded columns (Python, pandas)
I am working with tweets and I would like to report the mean sentiment score by topic and by community. This is what my dataframe looks like, where each row is a document (tweet): I want to create a dataframe that contains a mean sentiment value in each cell, like this: Any thoughts on how to go about this plea…
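The question's sample data is not included in the excerpt, so the following is only a minimal sketch under assumed column names: `community`, `sentiment`, and one-hot topic flags `topic_1` and `topic_2` are all hypothetical. It shows one way to build a community-by-topic table of mean sentiment by filtering on each one-hot column and grouping.

```python
import pandas as pd

# Hypothetical data: one row per tweet; topic_* columns are one-hot flags.
df = pd.DataFrame({
    "community": ["A", "A", "B", "B"],
    "sentiment": [0.2, 0.4, -0.1, 0.5],
    "topic_1": [1, 0, 1, 1],
    "topic_2": [0, 1, 1, 0],
})
topic_cols = ["topic_1", "topic_2"]


def mean_sentiment_table(frame, topics, group_col="community", value_col="sentiment"):
    """Return a table with one row per group and one column per topic."""
    columns = {}
    for topic in topics:
        # Keep only the tweets flagged with this topic, then average per group.
        flagged = frame.loc[frame[topic] == 1]
        columns[topic] = flagged.groupby(group_col)[value_col].mean()
    # Series are aligned on the group labels; missing combinations become NaN.
    return pd.DataFrame(columns)


table = mean_sentiment_table(df, topic_cols)
print(table)
```

A tweet flagged with several topics contributes to each of those topic columns, which is usually what a one-hot layout implies.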
Read excel file in python using pandas
I am trying to read an Excel file in PyCharm using pandas. I installed the package successfully. My issue is that I am trying to use the file location in addition to its name. I tried many things, as follows: However, I keep receiving the following error. Any idea? Thanks in advance. Answer Your fileLocation variable inclu…
How to iterate through a nested for loop in pandas dataframe?
I am attempting to iterate through a Hacker News dataset and was trying to create 3 categories (i.e. types of posts) found on the HN forum, namely ask_posts, show_posts and other_posts. In short, I am trying to find out the average number of comments per post per category (described below). The results respective…
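The excerpt does not show the dataset or the loop, so this is a hedged sketch rather than the asker's code. It assumes hypothetical columns `title` and `num_comments`, and assumes that a post's category is determined by whether its title starts with "Ask HN" or "Show HN". It replaces the nested loop with a label column and a `groupby`.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns may be named differently.
hn = pd.DataFrame({
    "title": [
        "Ask HN: How do you learn?",
        "Show HN: My project",
        "A news story",
        "ask hn: lowercase?",
        "Show HN: Tool",
    ],
    "num_comments": [10, 4, 2, 6, 8],
})


def categorize(title):
    """Map a post title to one of the three category names."""
    lowered = title.lower()
    if lowered.startswith("ask hn"):
        return "ask_posts"
    if lowered.startswith("show hn"):
        return "show_posts"
    return "other_posts"


# Label every row once, then average the comment counts per label.
hn["category"] = hn["title"].map(categorize)
avg_comments = hn.groupby("category")["num_comments"].mean()
print(avg_comments)
```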
Type hints for a pandas DataFrame with mixed dtypes
I’ve been looking for robust type hints for a pandas DataFrame, but cannot seem to find anything useful. This question barely scratches the surface: "Pythonic type hints with pandas?" Normally, if I want to hint the type of a function that has a DataFrame as an input argument, I would do: What I cannot seem…
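A plain `pd.DataFrame` annotation says nothing about the columns or their dtypes. As one possible workaround, and only as a sketch, the annotation can be paired with a runtime check. The schema below (`id`, `score`, `active`) and the helper `validate_schema` are hypothetical names introduced for illustration; the dtype predicates come from `pandas.api.types`.

```python
from typing import Callable, Dict

import pandas as pd
from pandas.api import types as ptypes

# Hypothetical schema: column name -> predicate that accepts a dtype.
EXPECTED: Dict[str, Callable[[object], bool]] = {
    "id": ptypes.is_integer_dtype,
    "score": ptypes.is_float_dtype,
    "active": ptypes.is_bool_dtype,
}


def validate_schema(df: pd.DataFrame) -> pd.DataFrame:
    """Raise TypeError if a column is missing or has an unexpected dtype."""
    for column, check in EXPECTED.items():
        if column not in df.columns:
            raise TypeError(f"missing column: {column}")
        if not check(df[column].dtype):
            raise TypeError(
                f"column {column!r} has unexpected dtype {df[column].dtype}"
            )
    return df


good = pd.DataFrame({
    "id": pd.Series([1, 2], dtype="int64"),
    "score": [0.5, 1.5],
    "active": [True, False],
})
checked = validate_schema(good)
```

The static annotation still only says `pd.DataFrame`; the dtype guarantees here are enforced at call time, not by a type checker.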
How to upsert pandas DataFrame to PostgreSQL table?
I’ve scraped some data from web sources and stored it all in a pandas DataFrame. Now, in order to harness the powerful db tools afforded by SQLAlchemy, I want to convert said DataFrame into a Table() object and eventually upsert all data into a PostgreSQL table. If this is practical, what is a workable met…
Datetime object in DataFrame with just time
This is the plain column. I would then like to put that column in the index. The problem is that when I try to use the method resample(), I always get the same error: TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of ‘Index’ I’ve been using this to…
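The error message lists `TimedeltaIndex` as an accepted index type, so one option for a time-only column is to convert it to timedeltas before resampling. This is a minimal sketch, assuming the column holds `HH:MM:SS` values; the column names `time` and `value` are hypothetical.

```python
import pandas as pd

# Hypothetical data: a time-of-day column with no date part.
df = pd.DataFrame({
    "time": ["00:00:00", "00:00:40", "00:01:05", "00:01:50"],
    "value": [1.0, 3.0, 5.0, 7.0],
})

# astype(str) also covers columns that hold datetime.time objects;
# pd.to_timedelta then parses the "HH:MM:SS" strings.
df.index = pd.to_timedelta(df["time"].astype(str))

# With a TimedeltaIndex in place, resample() no longer raises the TypeError.
per_minute = df["value"].resample("1min").mean()
print(per_minute)
```

If real calendar dates matter, combining the time with a known date through `pd.to_datetime` to get a `DatetimeIndex` would be the alternative.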
Pandas FutureWarning: Columnar iteration over characters will be deprecated in future releases
I have an existing solution to split a dataframe with one column into 2 columns. Recently, I got the following warning: FutureWarning: Columnar iteration over characters will be deprecated in future releases. How to fix this warning? I’m using Python 3.7. Answer That’s not entirely correct, plus the…
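The asker's code is not shown in the excerpt. This warning is raised when the `.str` accessor itself is iterated, which typically happens when the result of `.str.split(...).str` is unpacked into two columns. Assuming that is the cause here, a sketch of the usual replacement is `expand=True`; the column names `raw`, `left`, and `right` and the `-` separator are hypothetical.

```python
import pandas as pd

# Hypothetical single-column frame to split into two columns.
df = pd.DataFrame({"raw": ["a-1", "b-2", "c-3"]})

# Pattern that iterates over the .str accessor (assumed source of the warning):
#   df["left"], df["right"] = df["raw"].str.split("-").str

# expand=True returns a DataFrame with one column per piece, so nothing
# iterates over the accessor; n=1 limits the result to two pieces.
df[["left", "right"]] = df["raw"].str.split("-", n=1, expand=True)
print(df)
```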
INSERT INTO SELECT based on a dataframe
I have a dataframe df and I want to execute a query to insert into a table all the values from the dataframe. Basically I am trying to load as the following query: For that I have the following code: However, I am getting the following error: Does anyone know what I am doing wrong? Answer See below my
Importing data from URL using Python (into pandas dataframe)?
I’ve gone around in circles on this one. A bit frustrating as the solution is probably close at hand. Anyway, I found a URL that returns some data in CSV format. However, the URL itself does not contain the csv file name. In a web browser, I can easily go to the link and then I’m asked whether I w…
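The URL in question is not given, so this is only a sketch of the general idea: the parser needs the CSV text, not a file name. `pd.read_csv` accepts any file-like object, so the response body can be wrapped in `io.StringIO`. The helper names below are hypothetical, and UTF-8 encoding is an assumption about the server's response.

```python
import io
import urllib.request

import pandas as pd


def frame_from_csv_text(text: str) -> pd.DataFrame:
    """Parse CSV text that is already in memory."""
    return pd.read_csv(io.StringIO(text))


def frame_from_url(url: str) -> pd.DataFrame:
    """Download the response body and parse it as CSV (assumes UTF-8)."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return frame_from_csv_text(text)


# Offline demonstration with literal CSV text standing in for a response body.
sample = "name,value\nx,1\ny,2\n"
frame = frame_from_csv_text(sample)
print(frame)
```

`pd.read_csv` can also be given a URL string directly; the explicit download step is useful mainly when headers or authentication need to be set on the request.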