Tag: pandas

Create a new category by using a value from another column

My dataset currently has one column with different opportunity types and another column with a dummy variable indicating whether the opportunity is a first-time client. I would like to create a new category within col_opptype based on col_first, where only one category (e.g. a) is matched to its corresponding col_first value, i.e., col_opptype
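One possible way to do this is a boolean mask with .loc. This is only a minimal sketch: the column names col_opptype and col_first and the category a come from the question, while the sample rows, the 0/1 coding of the dummy and the new label a_first are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names and category "a" come from the question.
df = pd.DataFrame({
    "col_opptype": ["a", "a", "b", "c"],
    "col_first": [1, 0, 1, 0],  # assumed coding: 1 = first-time client
})

# Rows where the type is "a" AND the client is a first-time client.
mask = (df["col_opptype"] == "a") & (df["col_first"] == 1)

# "a_first" is an assumed label for the new category.
df.loc[mask, "col_opptype"] = "a_first"
```

If col_opptype were stored as a pandas categorical, the new label would probably have to be registered first with df["col_opptype"].cat.add_categories before the assignment.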

python, How to get smoother value?

Somehow seaborn draws a smoother line than the actual data. For example, at x-value 0.18 the actual data is around 11, but the value on the smoothed line is about 3. Given the list of data, how would I get the value 3 for that x-value? The actual data are: Answer: You can access the plot data with: out:
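The code from the answer is not included in this excerpt. As a rough sketch of the general idea, the coordinates of a drawn curve can be read back from the Matplotlib Axes and interpolated. It assumes the smoothed curve is the first Line2D on the axes, which depends on which seaborn function drew it; the straight line plotted here is a hypothetical stand-in for that curve.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the smoothed curve; a seaborn call drawing
# onto `ax` would take the place of ax.plot here.
fig, ax = plt.subplots()
x = np.linspace(0.0, 1.0, 11)
ax.plot(x, 2.0 * x)

# Read back the coordinates of the first line drawn on the axes.
line_x, line_y = ax.get_lines()[0].get_data()

# Linearly interpolate the curve at the x-value of interest
# (np.interp expects the x-coordinates to be increasing).
value = float(np.interp(0.18, line_x, line_y))
```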

Repeat pattern using python regex

I’m cleaning a dataset using pandas. I have a column called “Country” in which some rows contain numbers or other information in parentheses that I have to remove, for example: Australia1, Perú (country), 3Costa Rica, etc. To do this, I take the column and map a function over it. But I have a problem with this regex,
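The regex from the question is cut off in this excerpt, so the pattern below is not the asker's. It is a minimal sketch under the assumption that every digit and every parenthesised note should be dropped; clean_country is a hypothetical helper name.

```python
import re

# Assumed rule: drop any parenthesised note (with the spaces before it)
# and any run of digits, then trim leftover whitespace.
_NOISE = re.compile(r"\s*\([^)]*\)|\d+")

def clean_country(name: str) -> str:
    """Return the country name without digits or parenthesised notes."""
    return _NOISE.sub("", name).strip()

# The three examples given in the question.
cleaned = [clean_country(s) for s in ["Australia1", "Perú (country)", "3Costa Rica"]]
```

On a DataFrame, the same helper could be applied with df["Country"].map(clean_country), which matches the mapping approach described in the question.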

make correlation plot on time series data in python

I want to see a correlation on a rolling weekly basis in time series data, because I want to see how the rolling correlation moves each year. To do so, I tried to use the built-in pandas.corr() and pandas.rolling_corr() functions to get the rolling correlation and tried to make a line plot, but I couldn’t get the correlation line chart right. I don’t know
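In recent pandas versions a rolling correlation is usually computed through the .rolling(...).corr(...) method chain rather than a top-level function. The sketch below assumes daily data and a 7-observation window standing in for "one week"; the series a and b are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data; the names `a` and `b` are illustrative.
idx = pd.date_range("2020-01-01", periods=20, freq="D")
a = pd.Series(np.arange(20, dtype=float), index=idx)
b = 2.0 * a + 1.0  # perfectly linearly related to `a`

# 7-observation rolling Pearson correlation (one week of daily rows).
# The first 6 entries are NaN because the window is not yet full.
rolling_corr = a.rolling(window=7).corr(b)
```

Calling rolling_corr.plot() on the result should then draw the correlation as a line chart, provided Matplotlib is installed.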

Issue with conversion of text data into a dataframe

I have a text file with several lines and, between them, some useful data that I need to convert to a dataframe. I iterated over the text file line by line and captured the useful data with the help of a regex. Something like this: The captured data look like this: I thought to iterate over each captured row and
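The file contents and the regex are not shown in this excerpt, so everything concrete below is an assumption: the id=... value=... line layout, the pattern and the column names are all hypothetical. The sketch only illustrates the general approach of collecting one dict per matching line and building the DataFrame in a single call, rather than appending row by row.

```python
import re
import pandas as pd

# Hypothetical file content; the real layout is not shown in the question.
text = """header line
id=1 value=3.5
some unrelated text
id=2 value=4.0
"""

# Hypothetical pattern with named groups that become column names.
pattern = re.compile(r"id=(?P<id>\d+) value=(?P<value>[\d.]+)")

# Keep one dict per matching line and skip everything else.
rows = [m.groupdict() for line in text.splitlines() if (m := pattern.search(line))]

# Build the frame once, then convert the captured strings to numeric types.
df = pd.DataFrame(rows).astype({"id": int, "value": float})
```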