I am trying to draw a linear regression plot for the data provided. When I try to plot it, the plot is completely empty, and when I print the type of X, it shows that the type is string. Where am I going wrong? Answer: No need to make new DataFrames for X and y. Try astype(float) if you want them as
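A minimal sketch of the conversion the answer points to. The column names `X` and `y` follow the question, but the values are made up, and `np.polyfit` is only an assumed stand-in for whatever regression routine the poster used.

```python
import numpy as np
import pandas as pd

# Hypothetical data: numbers stored as strings, as described in the question
df = pd.DataFrame({"X": ["1", "2", "3", "4"], "y": ["2.0", "4.1", "5.9", "8.2"]})

# Convert the existing columns instead of building new DataFrames for X and y
df[["X", "y"]] = df[["X", "y"]].astype(float)

# Fit a straight line y = slope * X + intercept (assumed stand-in for the regression)
slope, intercept = np.polyfit(df["X"], df["y"], deg=1)
```

Once the columns are numeric, plotting `df["X"]` against `df["y"]` and the fitted line should no longer produce an empty figure.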
Tag: pandas
How to use get_dummies or one hot encoding to encode a categorical feature with multiple elements?
I’m working on a dataset which has a feature called categories. The data for each observation in that feature consists of a semicolon-delimited list, e.g. Row 1: “categorya;categoryb;categoryc”, Row 2: “categorya;categoryb”, Row 3: “categoryc”, Row 4: “cat…
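One possible approach, sketched on the three complete rows from the excerpt (Row 4 is truncated, so it is left out): `Series.str.get_dummies` splits each string on a separator and returns one indicator column per distinct element.

```python
import pandas as pd

# The three complete rows shown in the question
df = pd.DataFrame(
    {"categories": ["categorya;categoryb;categoryc", "categorya;categoryb", "categoryc"]},
    index=["Row 1", "Row 2", "Row 3"],
)

# Split on ";" and build one 0/1 column per category
dummies = df["categories"].str.get_dummies(sep=";")

# Attach the indicator columns to the original frame
encoded = df.join(dummies)
```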
Matplotlib: how to classify values/data in a scatter plot?
I’m trying to create a scatter plot in which you can differentiate two things on the graph: (1) by color, for example, if the value is negative the color is red and if the value is positive the color is blue; (2) by marker size, for example, if the value is between -0.20 and 0 the size is 100, if the value is…
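A rough sketch of one way to do this with `np.where`. The color rule comes from the question; the size rule is truncated in the excerpt, so the symmetric threshold of 0.20 and the larger size of 300 are assumptions, as are the helper names `classify` and `plot_classified`.

```python
import numpy as np

def classify(values, small_size=100, large_size=300, threshold=0.20):
    """Return per-point colors and marker sizes (sizes beyond 100 are assumed)."""
    values = np.asarray(values, dtype=float)
    # Negative values are red; everything else (including zero) is blue
    colors = np.where(values < 0, "red", "blue")
    # Assumed rule: values within +/- threshold get the small marker
    sizes = np.where(np.abs(values) <= threshold, small_size, large_size)
    return colors, sizes

def plot_classified(x, values):
    """Draw the scatter plot; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt
    colors, sizes = classify(values)
    fig, ax = plt.subplots()
    ax.scatter(x, values, c=colors, s=sizes)
    return fig, ax

colors, sizes = classify([-0.5, -0.1, 0.1, 0.4])
```

Calling `plot_classified(range(4), [-0.5, -0.1, 0.1, 0.4])` would then draw the four points with those colors and sizes.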
Groupby aggregate and transpose in pandas
df= Of all the genres in the genre field, I only need to consider ‘Rock’, ‘Latin’, ‘Metal’, ‘Blues’ and build a new dataframe based on the following requirements: a. how many songs the singer has from that genre (count of each genre must be in a separate column).…
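A sketch of requirement (a) only, since the remaining requirements are cut off. The column names `singer` and `genre` and the sample rows are hypothetical; the original dataframe is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical sample; the real dataframe is not visible in the question
df = pd.DataFrame({
    "singer": ["Ann", "Ann", "Ann", "Bob", "Bob", "Cy"],
    "genre": ["Rock", "Rock", "Pop", "Metal", "Blues", "Latin"],
})

wanted = ["Rock", "Latin", "Metal", "Blues"]

# Keep only the genres of interest
subset = df[df["genre"].isin(wanted)]

# Count songs per singer and genre, then spread genres into separate columns
counts = (
    subset.groupby(["singer", "genre"])
    .size()
    .unstack(fill_value=0)
    .reindex(columns=wanted, fill_value=0)
)
```

The final `reindex` keeps a column for every wanted genre even when no singer in the data has a song in it.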
New column in dataset based on last value of item
I have this dataset. I want to add a new column to the dataset based on the last value of the item, like this: A New Column 1 2 1 3 2 4 3 5 4 I tried to use apply with iloc, but it didn’t work. Can you help? Thank you. Answer: With your shown samples, could you please try following. You
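The answer itself is truncated, so the following is only one reading of the flattened sample: column A holds 1 to 5 and the new column holds the value of A from the previous row. Under that assumption, `Series.shift` would produce it.

```python
import pandas as pd

# Column A as read from the sample in the question
df = pd.DataFrame({"A": [1, 2, 3, 4, 5]})

# Each row receives the value of A from the row above; the first row has none (NaN)
df["New Column"] = df["A"].shift(1)
```

Because the first row becomes NaN, the new column is stored as float rather than int.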
Create a matrix of pairwise comparisons between columns
I would like to create a matrix showing the number of row-wise differences for each pairwise comparison of columns. This is what I’m starting with: This is what I want to end up with: How can I do this in Python or R? Answer: Try adist like below
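The quoted answer uses R's `adist`. For the Python side of the question, a possible sketch with NumPy broadcasting is shown here; it counts, for every pair of columns, the rows where the two values differ. The sample data is made up, since the original input is not included in the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical input; the question's data is not shown
df = pd.DataFrame({"a": [1, 0, 1, 1], "b": [1, 1, 1, 0], "c": [0, 0, 1, 1]})

values = df.to_numpy()

# Compare every column with every other column, row by row:
# shapes (rows, cols, 1) and (rows, 1, cols) broadcast to (rows, cols, cols)
diff_counts = (values[:, :, None] != values[:, None, :]).sum(axis=0)

# Label the result with the column names on both axes
matrix = pd.DataFrame(diff_counts, index=df.columns, columns=df.columns)
```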
Function to move specific row to top or bottom of pandas dataframe
I have two functions which shift a row of a pandas dataframe to the top or bottom, respectively. After applying them more than once to a dataframe, they seem to work incorrectly. These are the 2 functions to move the row to top / bottom: Note: I don’t want to reset_index for the returned df. Example: Th…
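The poster's functions are not included in the excerpt, so this is not a fix of their code. It is a sketch of one label-based approach that keeps the original index and stays correct under repeated calls, assuming the index labels are unique; the names `move_to_top` and `move_to_bottom` are hypothetical.

```python
import pandas as pd

def move_to_top(df, label):
    """Return df with the row labelled `label` first; the index is kept as is."""
    rest = [i for i in df.index if i != label]
    return df.loc[[label] + rest]

def move_to_bottom(df, label):
    """Return df with the row labelled `label` last; the index is kept as is."""
    rest = [i for i in df.index if i != label]
    return df.loc[rest + [label]]

# Hypothetical frame with a non-default index
df = pd.DataFrame({"val": ["a", "b", "c", "d"]}, index=[10, 20, 30, 40])

# Apply the functions more than once
moved = move_to_top(move_to_top(df, 30), 40)
moved_again = move_to_bottom(moved, 30)
```

Selecting by label rather than by position means an earlier reordering cannot make a later call pick the wrong row.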
Loading CSV into dataframe results in all records becoming “NaN”
I’m new to Python (and posting on SO), and I’m trying to use some code I wrote that worked in another similar context to import data from a file into a MySQL table. To do that, I need to convert it to a dataframe. In this particular instance I’m using Federal Election Commission data that is …
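The excerpt is cut off before the cause is identified, so the following only illustrates one common way this symptom can arise, not necessarily the poster's case: passing `columns=` to the `DataFrame` constructor selects existing columns by name rather than renaming them, so names that do not match come back as all NaN. The pipe-delimited sample and the column names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical header-less, pipe-delimited sample
raw_text = "C001|SMITH|100\nC002|JONES|200\n"
names = ["cand_id", "cand_name", "amount"]

# Without a header row, read_csv labels the columns 0, 1, 2
raw = pd.read_csv(io.StringIO(raw_text), sep="|", header=None)

# columns= looks up these names in `raw`; none exist, so every value is NaN
broken = pd.DataFrame(raw, columns=names)

# Supplying the names while reading assigns them to the parsed columns
fixed = pd.read_csv(io.StringIO(raw_text), sep="|", header=None, names=names)
```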
How to merge two dataframes where the second one has different column names and length?
I have two dataframes. The first one is just a column of daily datetime, whereas the second one has both dates and data. This is an example: What I want to do is to merge df1 and df2 to get a dataframe (dataset) where: when the data exist it takes the date position; when it doesn’t exist, it just gets
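A sketch of a left merge on differently named key columns, which keeps every date from the first dataframe. The column names `date`, `Date` and `Data` and the values are hypothetical, and since the excerpt is cut off before saying what should fill the missing rows, they are simply left as NaN here.

```python
import pandas as pd

# Hypothetical df1: a single column of daily dates
df1 = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05"]
    )
})

# Hypothetical df2: fewer rows, different column names
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-02", "2021-01-04"]),
    "Data": [10.0, 20.0],
})

# Keep every row of df1; rows without a match in df2 get NaN in "Data"
dataset = (
    df1.merge(df2, how="left", left_on="date", right_on="Date")
    .drop(columns="Date")
)
```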
Set an index while merging two dataframes
I have a dataframe like this: dte res year 1995-01-01 65.3 1995 1995-01-02 65.5 1995 … … … 2019-01-03 55.2 2019 2019-01-04 52.2 2019 and I’m trying to create another file in this format: basically, I want every year in a different column. Here is what I already did: when I write in m…
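The target format is not visible in the excerpt, so this is only a sketch of one plausible reading: one row per calendar day (month and day) and one column per year, built with `DataFrame.pivot`. The sample rows and the helper column `day` are hypothetical.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's dataframe
df = pd.DataFrame({
    "dte": pd.to_datetime(["1995-01-01", "1995-01-02", "2019-01-01", "2019-01-02"]),
    "res": [65.3, 65.5, 55.2, 52.2],
})
df["year"] = df["dte"].dt.year

# Assumed row key: month and day, so the same day lines up across years
df["day"] = df["dte"].dt.strftime("%m-%d")

# One row per day, one column per year
wide = df.pivot(index="day", columns="year", values="res")
```

Days that are missing in a given year would show up as NaN in that year's column.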