Tag: pandas

How to count pandas datetime months by continuous season

I have a large time-series dataframe whose date column has already been formatted as datetime. I want to plot the number of samples in each season, as in the following, where the values are the sample counts for that season. I did a little searching and realized I can create a dictionary to convert the months into seasons.
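The dictionary approach mentioned in the question could be sketched roughly as follows. This is only a sketch under assumptions: the season labels (DJF, MAM, JJA, SON), the function name `count_by_season`, and the sample dates are all illustrative, and "continuous season" is read here as meaning that December is counted with the January and February that follow it.

```python
import pandas as pd

# Month number -> season label (assumed meteorological convention).
MONTH_TO_SEASON = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def count_by_season(dates: pd.Series) -> pd.Series:
    """Count samples per (season year, season) label, e.g. '2021-DJF'."""
    dates = pd.to_datetime(dates)
    season = dates.dt.month.map(MONTH_TO_SEASON)
    # Shift December into the following year so each winter stays contiguous.
    season_year = dates.dt.year + (dates.dt.month == 12).astype(int)
    labels = season_year.astype(str) + "-" + season
    return labels.value_counts(sort=False)


# Hypothetical sample data, not taken from the question.
sample = pd.Series(
    pd.to_datetime(
        ["2020-12-15", "2021-01-10", "2021-02-01", "2021-03-05", "2021-12-31"]
    )
)
counts = count_by_season(sample)
```

The resulting Series can be passed to `counts.plot.bar()` if matplotlib is installed.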

How to plot a bar-plot with only one bar colored?

matplotlib pandas python seaborn

name      grade
chandler  A
joey      B
phoebe    B
monica    C
ross      A
rachel    B
mike      C
gunther   A
How do I proceed from here if I want to make 8 different report cards (a small graph on A4-size paper each) and highlight the grade category to which the student belongs? Edit: I want to show gunther which group he falls in.
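One possible way to color a single bar is to pass matplotlib a list of colors, one per bar, and pick the highlight color only for the student's own grade. The sketch below uses the names and grades from the question; the function name `report_card`, the color choices, and the figure size are assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "name": ["chandler", "joey", "phoebe", "monica", "ross", "rachel", "mike", "gunther"],
    "grade": ["A", "B", "B", "C", "A", "B", "C", "A"],
})


def report_card(df, student, highlight="tab:red", base="lightgray"):
    """Bar chart of students per grade, with the student's own grade highlighted."""
    counts = df["grade"].value_counts().sort_index()
    own_grade = df.loc[df["name"] == student, "grade"].iloc[0]
    # One color per bar: the highlight color only where the grade matches.
    colors = [highlight if grade == own_grade else base for grade in counts.index]
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.bar(list(counts.index), counts.values, color=colors)
    ax.set_title(f"{student}: grade {own_grade}")
    ax.set_xlabel("grade")
    ax.set_ylabel("number of students")
    return fig, colors


fig, colors = report_card(df, "gunther")
```

Calling `report_card` once per name in `df["name"]` would give the eight separate figures, each of which can be saved with `fig.savefig(...)`.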

Enumerate rows in each group starting from one

count dataframe pandas pandas-groupby python

I have a dataframe (sorted by date; the date column is omitted from the example for simplicity) that looks like this: I want to create a new column that counts the occurrences of each value in the letters column, incrementing by one each time the value appears. The data frame I want to reach
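A running count per group of this kind can be sketched with `groupby(...).cumcount()`, which numbers the rows of each group from zero, so adding one starts the count at one. The example frame is not shown in the excerpt, so the letters below and the column name `occurrence` are made up for illustration.

```python
import pandas as pd

# Hypothetical data; the question's own example is not reproduced here.
df = pd.DataFrame({"letters": ["a", "b", "a", "c", "b", "a"]})

# cumcount() numbers rows within each group starting at 0, in row order.
df["occurrence"] = df.groupby("letters").cumcount() + 1
```

Because `cumcount` follows the existing row order, the frame should already be sorted by date, as the question says it is.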

Can I use numpy.polyfit(x, y, deg) for multiple linear regression

numpy olsmultiplelinearregression pandas python

Is there any way I can fit two independent variables and one dependent variable in numpy.polyfit()? I have a pandas DataFrame that I loaded from a CSV file. I wish to include two columns as independent variables to run multiple linear regression using NumPy. Currently my simple linear regression looks like this: model_combined = np.polyfit(data.Exercise, y, 1) I wish
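`numpy.polyfit` fits a polynomial in a single variable, so for two predictors a different routine is needed. A minimal sketch using `numpy.linalg.lstsq` on a design matrix is shown below; this swaps in ordinary least squares via `lstsq` rather than `polyfit`. The function name `fit_two_predictors` and the synthetic data are assumptions for illustration only.

```python
import numpy as np


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = intercept + b1 * x1 + b2 * x2."""
    # Design matrix: a column of ones for the intercept, then the two predictors.
    X = np.column_stack([np.ones(len(y)), x1, x2])
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs  # order: intercept, b1, b2


# Synthetic, noise-free data generated from known coefficients.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
y = 1.0 + 2.0 * x1 - 0.5 * x2

intercept, b1, b2 = fit_two_predictors(x1, x2, y)
```

With DataFrame columns, the same call would take something like `data.Exercise.to_numpy()` and a second column in place of `x1` and `x2`.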

Exporting data as CSV file from ServiceNow instance using Python

pandas python servicenow servicenow-rest-api

I have some data in an instance that I would like to export to a CSV file using Python and the REST API. I wish to use REST because some rows are missing when the data is emailed as a CSV file. The query gives me 12,000 rows; however, the file that is emailed to me contains only 10,001 rows. Here is
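One common pattern for pulling more rows than a single response allows is to request the data in pages and write them to one CSV file. The sketch below keeps the HTTP call out of the picture: `fetch_page(offset, limit)` is a hypothetical callable that would wrap the actual REST request. The ServiceNow Table API is understood to accept `sysparm_limit` and `sysparm_offset` query parameters for this purpose, but treat that as an assumption to check against the instance's documentation. The in-memory table is a stand-in so the example runs offline.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def iter_rows(fetch_page: Callable[[int, int], List[Row]], page_size: int = 1000) -> Iterator[Row]:
    """Yield rows page by page until the source returns an empty page."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        offset += len(page)


def write_csv(rows: Iterable[Row], out) -> int:
    """Write rows to an open text file object; return the number of rows written."""
    writer = None
    written = 0
    for row in rows:
        if writer is None:
            # Take the header from the first row's keys.
            writer = csv.DictWriter(out, fieldnames=list(row.keys()))
            writer.writeheader()
        writer.writerow(row)
        written += 1
    return written


# Stand-in for the remote table (hypothetical field names and values).
_fake_table = [{"number": f"INC{i:05d}", "state": "open"} for i in range(2500)]


def fake_fetch(offset: int, limit: int) -> List[Row]:
    return _fake_table[offset:offset + limit]


buffer = io.StringIO()
written = write_csv(iter_rows(fake_fetch, page_size=1000), buffer)
```

In real use, `buffer` would be replaced by a file opened with `open(path, "w", newline="")`.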

duplicated rows in pandas append inside for loop

append dataframe pandas python

I am having trouble with a for loop inside a function. I am calculating cosine distances for a list of word vectors. With each vector, I am calculating the cosine distance and then appending it as a new column to the pandas dataframe. The problem is that there are several models, so I am comparing a word vector from model
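The excerpt is cut off before the failing code, so the following is only a guess at a cleaner structure: collect one list of distances per model in a dictionary and build the DataFrame once, instead of appending inside the loop (note that `DataFrame.append` was removed in pandas 2.0). The model names, words, and two-dimensional vectors are all invented for illustration.

```python
import numpy as np
import pandas as pd


def cosine_distance(u, v) -> float:
    """1 minus the cosine similarity of two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def distance_table(models, words, query) -> pd.DataFrame:
    """One row per word, one column per model, distances measured from `query`."""
    columns = {}
    for model_name, vectors in models.items():
        columns[model_name] = [
            cosine_distance(vectors[query], vectors[word]) for word in words
        ]
    # Building the frame once keeps exactly one row per word.
    return pd.DataFrame(columns, index=words)


# Hypothetical toy models: word -> vector.
words = ["cat", "dog", "car"]
models = {
    "model_a": {"cat": [1, 0], "dog": [1, 1], "car": [0, 1]},
    "model_b": {"cat": [1, 0], "dog": [1, 0], "car": [-1, 0]},
}
table = distance_table(models, words, "cat")
```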

Find duplicate values in two arrays, Python

numpy pandas python

I have two arrays (A and B) with about 50,000 values in each. Every value represents an ID. I want to create a pandas dataframe with three columns, col1: values from array A, col2: values from array B, col3: a string with the labels “unique” or “duplicate”. In each array the IDs are unique. The arrays are of different
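One reading of the requested layout is an outer merge on the ID, where an ID present in both arrays fills col1 and col2 on the same row and is labeled "duplicate". That interpretation is an assumption, as are the tiny sample arrays; the sketch relies on the `indicator=True` option of `DataFrame.merge`, which adds a `_merge` column recording where each row came from.

```python
import numpy as np
import pandas as pd

# Small hypothetical arrays; IDs are unique within each array.
a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5])

left = pd.DataFrame({"id": a, "col1": a})
right = pd.DataFrame({"id": b, "col2": b})

# The outer merge keeps every ID; _merge is "both" when the ID is in A and B.
merged = left.merge(right, on="id", how="outer", indicator=True)
merged["col3"] = np.where(merged["_merge"] == "both", "duplicate", "unique")

result = merged[["col1", "col2", "col3"]]
```

IDs found in only one array get NaN in the other column, which turns integer columns into floats; casting to the nullable `Int64` dtype is one way around that.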

Pandas skipping lines in read_csv: can I record these to a variable/log file?

dataframe pandas python

I’ve seen similar questions on here but nothing that is quite what I want to do. I’m reading in a tsv/csv file using read_csv. I have clearly defined headers within the file, but sometimes the file has unexpected additional columns and I get the following messages in the console: Skipping line 251643: Expected 20 fields in line 251643, saw
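In recent pandas versions (1.4 or newer, as far as I know), `read_csv` accepts a callable for `on_bad_lines` when `engine="python"` is used, which gives a place to record lines with too many fields. The sketch below assumes such a version; the inline CSV text and the names `record_bad_line` and `bad_lines` are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical file contents: the third line has one field too many.
raw = "a,b\n1,2\n3,4,5\n6,7\n"

bad_lines = []


def record_bad_line(fields):
    """Called by pandas with the split fields of each line that has too many."""
    bad_lines.append(fields)
    return None  # returning None tells pandas to skip the line


df = pd.read_csv(io.StringIO(raw), on_bad_lines=record_bad_line, engine="python")
```

The callable receives only the fields, not the line number, so `bad_lines` could be written to a log file afterwards but would need extra bookkeeping to recover positions.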

Creating new columns within a dataframe, based on the latest value from previous columns

dataframe pandas python

I’ve just completed a beginner’s course in Python, so please bear with me if the code below doesn’t make sense or my issue is because of some rookie mistake. I’ve been trying to put the learning to use by working with college production of NFL players, with a view to understanding which statistics can be predictive or at least correlate