
Merging two dataframes in pandas without column names (new to pandas)

Short explanation:

If you have duplicate column names in your data, rename one of the columns when you read the file.

If your data contains NaN values, remove those rows.

Then merge as shown in the answer below.
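
The NaN cleanup step could be sketched roughly as follows. The inline CSV text is made up for illustration, and dropping only the rows whose key (column 0) is missing is an assumption about what "remove those" means here.

```python
import io
import pandas as pd

# Hypothetical stand-in for one of the CSV files (no header row).
# The third row has an empty key, which pandas reads as NaN.
raw = "a,1\nb,2\n,3\n"
df = pd.read_csv(io.StringIO(raw), header=None)

# Count missing keys in column 0 before cleaning.
n_missing = int(df[0].isna().sum())

# Drop rows whose key (column 0) is NaN so they cannot interfere with the merge.
clean = df.dropna(subset=[0])

print(n_missing)
print(clean)
```

If NaN values in the non-key columns also matter for your data, `dropna()` without `subset` drops any row that contains a NaN.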


Probably a pretty simple question.

I have two datasets that I read in using pandas.read_csv().

My data is in two separate CSV files.

With the following code:

JavaScript

My two data heads look like this:

JavaScript

and

JavaScript

So column 0 in each dataset is the key I want to merge on, and I want to keep all data from both datasets.

How would I go about doing this? All the examples I am finding online require named key columns, but my data has no column names.

But on the join I get the following errors:

JavaScript

I searched through my data sets and there are no duplicates.

Thank you!

Answer

You should still be able to merge on the columns:

JavaScript
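
A minimal sketch of such a merge, assuming both files were read with `header=None` so that pandas labels the columns 0, 1, and so on. The inline CSV text and the names `df1`, `df2` and `inner` are made up for illustration and are not the asker's data.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two header-less CSV files.
df1 = pd.read_csv(io.StringIO("a,1\nb,2\nc,3\n"), header=None)
df2 = pd.read_csv(io.StringIO("b,20\nc,30\nd,40\n"), header=None)

# Merge on the integer column label 0 from each frame.
# The default is how="inner", so only keys present in both frames are kept.
inner = df1.merge(df2, left_on=0, right_on=0)

# The non-key columns are both labelled 1, so pandas appends the
# default suffixes _x and _y to tell them apart.
print(inner)
```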

This performs an inner merge, which keeps only the intersection of both datasets, i.e. rows whose value in column 0 exists in both. If you want all values, specify an outer merge:

JavaScript
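
For comparison, an outer merge might be sketched like this, again assuming header-less frames with made-up sample data. Keys that exist in only one frame are kept, and the columns from the other frame are filled with NaN.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two header-less CSV files.
df1 = pd.read_csv(io.StringIO("a,1\nb,2\nc,3\n"), header=None)
df2 = pd.read_csv(io.StringIO("b,20\nc,30\nd,40\n"), header=None)

# how="outer" keeps the union of the keys found in column 0.
outer = df1.merge(df2, on=0, how="outer")

# Key "a" has no match in df2 and key "d" has no match in df1,
# so each of the suffixed columns contains one NaN.
print(outer)
```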

You would have to rename or move the columns that clashed (1_x and 1_y above).

It is probably better to rename the columns to something sensible beforehand. When reading the CSV you can pass a list of column names:

JavaScript
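
As a rough sketch of that approach, the `names` parameter of `pandas.read_csv` assigns the column labels at read time. The labels `key`, `left_val` and `right_val` and the inline CSV text are hypothetical and chosen only for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two header-less CSV files.
left = pd.read_csv(io.StringIO("a,1\nb,2\nc,3\n"),
                   header=None, names=["key", "left_val"])
right = pd.read_csv(io.StringIO("b,20\nc,30\nd,40\n"),
                    header=None, names=["key", "right_val"])

# With distinct names for the value columns there is nothing to clash,
# so no _x/_y suffixes appear in the result.
merged = left.merge(right, on="key", how="outer")
print(merged)
```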
User contributions licensed under: CC BY-SA