
Tag: pandas

plyr or dplyr in Python

This is more of a conceptual question; I do not have a specific problem. I am learning Python for data analysis, but I am very familiar with R. One of the great things about R is plyr (and of course ggplot2) and, even better, dplyr. Pandas of course has split-apply as well; however, in R I can do things

Python pandas apply function if a column value is not NULL

I have a dataframe (in Python 2.7, pandas 0.15.0): I want to apply a simple function to rows that do not contain NULL values in a specific column. My function is as simple as possible: And my apply code is the following: It works perfectly. If I want to check column ‘B’ for NULL values, pd.notnull() works perfectly as
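The excerpt's own dataframe, function, and apply call are not shown, so the following is only a minimal sketch of the general idea: build a boolean mask with `notnull()` on the column to check, then apply the function to the masked rows. The data, the `double` function, and the `A_doubled` column name are hypothetical stand-ins, and the code is written for a current pandas rather than the 0.15.0 mentioned in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "B" has one missing value.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, np.nan, 30.0]})


def double(x):
    """Hypothetical 'simple function' applied to each selected value."""
    return x * 2


# True for rows where "B" is present, False where it is NaN/None.
mask = df["B"].notnull()

# Apply only to the selected rows; assignment aligns on the index,
# so rows excluded by the mask end up as NaN in the new column.
df["A_doubled"] = df.loc[mask, "A"].apply(double)

print(df)
```

Filtering first and applying second keeps the function itself free of NULL checks, which may or may not match what the original poster was after.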

Read specific columns with pandas or other python module

I have a csv file from this webpage. I want to read some of the columns in the downloaded file (the csv version can be downloaded in the upper right corner). Let’s say I want 2 columns: column 59, which in the header is star_name, and column 60, which in the header is ra. However, for some reason the authors of the webpage
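As a rough illustration of reading only some columns, `pandas.read_csv` accepts a `usecols` argument that takes either header names or zero-based positions. The sketch below uses a tiny in-memory CSV as a stand-in for the downloaded file (its contents are made up), and it does not address whatever header quirk the question goes on to describe, since that part of the excerpt is cut off.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded file; only the column
# names star_name and ra come from the question.
csv_text = (
    "name,star_name,ra,dec\n"
    "b,Sun,0.0,1.0\n"
    "c,Vega,279.2,38.8\n"
)

# Select columns by header name.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["star_name", "ra"])

# Select the same columns by zero-based position.
by_pos = pd.read_csv(io.StringIO(csv_text), usecols=[1, 2])

print(by_name)
```

Selecting by position can be handy when header names are unreliable, at the cost of breaking if the file's column order changes.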

pandas: Convert string column to ordered Category?

I’m working with pandas for the first time. I have a column of survey responses, which can take the values ‘strongly agree’, ‘agree’, ‘disagree’, ‘strongly disagree’, and ‘neither’. This is the output of describe() and value_counts() for the column: I want to do a linear regression on this question versus overall score. However, I have a feeling that I should
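One way to approach the conversion in the title is `pd.CategoricalDtype` with `ordered=True`, sketched below. The sample responses are invented, and placing ‘neither’ in the middle of the scale is an assumption about the survey, not something the excerpt states.

```python
import pandas as pd

# Assumed ordering of the scale, lowest to highest agreement.
order = ["strongly disagree", "disagree", "neither", "agree", "strongly agree"]

# Hypothetical survey responses stored as plain strings.
responses = pd.Series(["agree", "neither", "strongly agree", "disagree"])

# Convert to an ordered categorical using the explicit category order.
ordered = responses.astype(pd.CategoricalDtype(categories=order, ordered=True))

# Integer codes follow the category order (0 = "strongly disagree").
codes = ordered.cat.codes

print(ordered.min(), "<", ordered.max())
print(codes.tolist())
```

Whether the integer codes are a sensible input for a linear regression depends on treating the scale as evenly spaced, which is a modelling decision rather than something pandas settles.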

Plotting CDF of a pandas series in python

Is there a way to do this? I cannot seem to find an easy way to interface a pandas Series with plotting a CDF. Answer: In case you are also interested in the values, not just the plot. This will always work (discrete and continuous distributions). Alternative example with a sample drawn from a continuous distribution or you have a lot of individual
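The answer's code is not included in the excerpt, so the following is not necessarily its method. It is a minimal sketch of one common way to get empirical CDF values from a Series: count each distinct value, sort by value, and take the cumulative share. The sample data is made up.

```python
import pandas as pd

# Hypothetical sample.
s = pd.Series([3, 1, 2, 2, 5])

# Count occurrences of each distinct value, order by value,
# then accumulate and normalise to get P(X <= x).
cdf = s.value_counts().sort_index().cumsum() / len(s)

print(cdf)

# With matplotlib installed, the values can be drawn as a step plot:
# cdf.plot(drawstyle="steps-post")
```

The resulting Series is indexed by the observed values, so the CDF values stay available for inspection as well as for plotting.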
