Tag: dataframe

Python – Parse text file with no delimiter and dynamic width values

My goal is to parse a text file in Python that has no headers (and therefore no column names) and no delimiters. A sample of the original file looks as follows: I tried to import the file into Excel, but since it has neither delimiters nor fixed-width fields, all the values of a row end up in a single cell (cell A).

pandas.read_csv() returns strings from columns instead of numbers

dataframe jupyter-notebook matplotlib pandas python

I am trying to produce a linear regression plot for the data provided. When I try to plot it, the plot is completely empty, and when I print the type of X it shows that the type is string. Where am I going wrong? Answer: No need to make new DataFrames for X and y. Try astype(float) if you want them as
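To illustrate the `astype(float)` suggestion in the answer, here is a minimal sketch. The column names `x` and `y` and the inline data are assumptions made up for the example, and `dtype=str` is used only to reproduce the "columns come back as strings" symptom.

```python
import io

import pandas as pd

# Hypothetical data; the question's real file is not shown in the excerpt.
csv_text = "x,y\n1.0,2.1\n2.0,3.9\n3.0,6.2\n"

# dtype=str forces every column to be read as strings, imitating the symptom.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Convert the string columns to floats before plotting or fitting.
X = df["x"].astype(float)
y = df["y"].astype(float)

# Alternative: to_numeric with errors="coerce" turns unparseable entries into NaN
# instead of raising, which helps when a few cells contain stray text.
y_safe = pd.to_numeric(df["y"], errors="coerce")
```

If `astype(float)` raises a `ValueError`, the column probably contains non-numeric text, and the `pd.to_numeric(..., errors="coerce")` variant can help locate it.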

Matplotlib: how to classify values/data in a scatter plot?

dataframe matplotlib pandas python scatter-plot

I’m trying to create a scatter plot in which two things can be distinguished: By color: for example, if the value is negative the color is red, and if the value is positive the color is blue. By marker size: for example, if the value is between -0.20 and 0 the size is 100, if the value is between
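One possible way to derive per-point colors and sizes is sketched below. Only the red/blue rule and the "-0.20 to 0 gives size 100" rule come from the question; the sample values, the placeholder size of 50 for all other values, and the choice of which interval ends are inclusive are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical values to classify.
values = pd.Series([-0.15, 0.30, -0.40, 0.05])

# Negative values are red, everything else is blue (zero lands in "blue" here).
colors = np.where(values < 0, "red", "blue")

# Only the -0.20..0 -> 100 rule is stated in the excerpt; 50 is a placeholder
# for the remaining ranges, which are cut off in the question.
in_first_range = (values >= -0.20) & (values < 0)
sizes = np.where(in_first_range, 100, 50)
```

The two arrays can then be passed to `matplotlib.pyplot.scatter` through its `c` and `s` parameters.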

Groupby aggregate and transpose in pandas

dataframe pandas pandas-groupby python python-3.x

df= Of all the genres in the genre field, I only need to consider ‘Rock’, ‘Latin’, ‘Metal’, ‘Blues’ and build a new dataframe based on the following requirements: a. How many songs the singer has from each genre (the count of each genre must be in a separate column). b. A count of how many albums the singer has in the data. c. Count of
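Requirements a and b could be sketched as follows. The column names `singer`, `album` and `genre` and the sample rows are hypothetical, since the original dataframe is not shown, and counting albums after filtering to the four genres is an assumption about what the question intends.

```python
import pandas as pd

# Hypothetical data with assumed column names.
songs = pd.DataFrame({
    "singer": ["A", "A", "A", "B", "B", "C"],
    "album": ["a1", "a1", "a2", "b1", "b1", "c1"],
    "genre": ["Rock", "Metal", "Rock", "Latin", "Pop", "Jazz"],
})

wanted = ["Rock", "Latin", "Metal", "Blues"]
subset = songs[songs["genre"].isin(wanted)]

# a. One column per genre holding the number of songs per singer.
genre_counts = pd.crosstab(subset["singer"], subset["genre"]).reindex(
    columns=wanted, fill_value=0
)

# b. Number of distinct albums per singer (within the filtered rows).
album_counts = subset.groupby("singer")["album"].nunique().rename("albums")

summary = genre_counts.join(album_counts)
```

The `reindex` call keeps a column for every wanted genre even when no song of that genre appears, as with `Blues` in this sample.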

Create a matrix of pairwise comparisons between columns

dataframe pandas python r

I would like to create a matrix showing the number of row-wise differences for each pairwise comparison of columns. This is what I’m starting with: This is what I want to end up with: How can I do this in Python or R? Answer: Try adist like below
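The answer in the excerpt points to R's `adist`. As a rough pandas counterpart, a plain element-wise comparison (not an edit-distance computation) might look like the sketch below; the sample dataframe is made up for illustration.

```python
import pandas as pd

# Hypothetical input; the question's data is not shown in the excerpt.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 2, 0, 4], "c": [0, 2, 0, 0]})

cols = df.columns
diff = pd.DataFrame(0, index=cols, columns=cols)

# For every pair of columns, count the rows where the two values differ.
for i in cols:
    for j in cols:
        diff.loc[i, j] = int((df[i] != df[j]).sum())
```

Note that `NaN != NaN` evaluates to true, so rows where both columns are missing would be counted as differences in this sketch.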

Function to move specific row to top or bottom of pandas dataframe

dataframe indexing pandas python row

I have two functions which shift a row of a pandas dataframe to the top or bottom, respectively. After applying them more than once to a dataframe, they seem to work incorrectly. These are the 2 functions to move the row to top / bottom: Note: I don’t want to reset_index for the returned df. Example: This is my dataframe:
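The asker's two functions are not included in the excerpt. A minimal sketch of one way to move a row by its index label without calling `reset_index` is shown below; the helper name `move_row` is hypothetical, and the approach assumes the index labels are unique.

```python
import pandas as pd


def move_row(df, label, to_top=True):
    """Return a copy of df with the row `label` moved to the top or bottom.

    Assumes unique index labels; the original labels are preserved.
    """
    rest = [i for i in df.index if i != label]
    order = [label] + rest if to_top else rest + [label]
    return df.loc[order]


df = pd.DataFrame({"v": [10, 20, 30, 40]}, index=["w", "x", "y", "z"])

top = move_row(df, "y")
# Applying the helper repeatedly keeps working because it selects by label,
# not by position.
bottom = move_row(top, "x", to_top=False)
```

Selecting by label rather than by integer position means earlier moves do not invalidate later ones.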

Loading CSV into dataframe results in all records becoming “NaN”

csv dataframe mysql pandas python

I’m new to Python (and posting on SO), and I’m trying to use some code I wrote that worked in another similar context to import data from a file into a MySQL table. To do that, I need to convert it to a dataframe. In this particular instance I’m using Federal Election Commission data that is pipe-delimited (It’s the “Committee
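The excerpt does not show what caused the NaN values, so the following is only a sketch of reading a headerless pipe-delimited file with explicit column names. The three column names and the sample rows are hypothetical and do not reflect the real file layout.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited content without a header row.
raw = "C001|Committee One|DC\nC002|Committee Two|NY\n"

# Assumed column names, supplied because the file itself has no header.
names = ["cmte_id", "cmte_name", "state"]

df = pd.read_csv(
    io.StringIO(raw),
    sep="|",        # pipe-delimited input
    header=None,    # the first line is data, not column names
    names=names,
    dtype=str,      # keep identifiers as text
)
```

Checking `df.isna().sum()` right after loading is a quick way to see whether the NaN values appear at read time or in a later step.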

How to merge two dataframes where the second one has different column names and length?

dataframe merge pandas python

I have two dataframes. The first one is just a column of daily datetimes, whereas the second one has both dates and data. This is an example: What I want to do is to merge df1 and df2 to get a dataframe (dataset) where: when the data exist it takes the date position; when it doesn’t exist, it just gets
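A left merge with `left_on` and `right_on` is one way to align two frames whose key columns have different names and lengths. In the sketch below the column names `date`, `day` and `value` are assumptions, and the missing rows simply receive the merge default of NaN, since the excerpt is cut off before saying what they should contain.

```python
import pandas as pd

# df1: one row per day. df2: data for only some of those days,
# with a differently named date column.
df1 = pd.DataFrame({"date": pd.date_range("2020-01-01", periods=5, freq="D")})
df2 = pd.DataFrame({
    "day": pd.to_datetime(["2020-01-02", "2020-01-04"]),
    "value": [1.5, 2.5],
})

# how="left" keeps every row of df1; unmatched dates get NaN in "value".
dataset = df1.merge(df2, how="left", left_on="date", right_on="day").drop(
    columns="day"
)
```

If a different fill value is wanted for the missing days, `dataset["value"].fillna(...)` can be applied afterwards.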

Set an index while merging two dataframes

dataframe pandas python

I have a dataframe like this:

dte         res   year
1995-01-01  65.3  1995
1995-01-02  65.5  1995
…           …     …
2019-01-03  55.2  2019
2019-01-04  52.2  2019

and I’m trying to create another file in this format: basically I want every year in a different column. Here is what I already did: when I write in my loop df=pd.DataFrame(myDict,index=[row.dte]) and
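Putting every year in its own column can be sketched with `DataFrame.pivot` instead of a loop. The target format is not shown in the excerpt, so using the month and day as the row key (the helper column `month_day`) is an assumption; the column names `dte`, `res` and `year` come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "dte": pd.to_datetime(
        ["1995-01-01", "1995-01-02", "2019-01-03", "2019-01-04"]
    ),
    "res": [65.3, 65.5, 55.2, 52.2],
})
df["year"] = df["dte"].dt.year

# Assumed row key: month and day, so that the same calendar day of
# different years lines up on one row.
df["month_day"] = df["dte"].dt.strftime("%m-%d")

wide = df.pivot(index="month_day", columns="year", values="res")
```

Days that exist in one year but not in another show up as NaN in the missing year's column.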

Segregate a column data based on regex using pandas

dataframe pandas python regex string

I have a dataframe like the one shown below. I would like to create 3 new columns: val_num – will store ONLY NUMBER values that come along with symbols, e.g. 1234 (from >1234) and 1000 (from <1000), but WILL NOT STORE 31 (from 31sadj) because it doesn’t have any symbol. val_str – will store only values that are a mix of NUMBER, symbols, ALPHABETS or
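The `val_num` rule can be sketched with `Series.str.extract`. The sample values are taken from the examples in the question plus one made-up entry (`abc`), and treating only a leading `>` or `<` as a "symbol" is an assumption, since the full list of symbols is not given. The other columns are left out because their description is cut off.

```python
import pandas as pd

# Sample values based on the question's examples; "abc" is made up.
s = pd.Series([">1234", "<1000", "31sadj", "abc"])

# Match a leading > or < followed by digits only, and capture the digits.
# Values that do not match (no symbol, or trailing letters) become NaN.
extracted = s.str.extract(r"^[<>](\d+)$", expand=False)

# Convert the captured digit strings to numbers; NaN stays NaN.
val_num = pd.to_numeric(extracted)
```

Extending the character class `[<>]` would be the place to admit further symbols if the data contains them.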