
Pandas: DataFrame columns are not unique when making a dictionary

I have a dataframe like this:

Name  Alt_01  Alt_02
AAPL  Apple   apple Inc.
AMZN  Amazon  NaN

To check whether a string contains any of the alternative names, I wrote code that builds a dictionary from this dataframe.

Since not every name has the same number of alternative names, I added dropna() to remove the NaN values.

But after doing this, I get a warning like:

UserWarning: DataFrame columns are not unique, some columns will be omitted.

and the result is a dict with only the first alternative name for each entry, e.g. {'AAPL': ['Apple'], 'AMZN': ['Amazon']}

Is there a good way to solve this?


Answer

If I interpret your question correctly, you want a resulting dictionary that looks like this:

{'AAPL': ['Apple', 'apple Inc.'], 'AMZN': ['Amazon']}.

If that is the case, then the following code will work:
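A minimal sketch of code that produces this dictionary, assuming the dataframe from the question with the columns Name, Alt_01 and Alt_02. The name search_dict comes from this answer; df, alt_columns and the use of pd.notna() are choices made for this sketch and may differ from the code originally posted.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame(
    {
        "Name": ["AAPL", "AMZN"],
        "Alt_01": ["Apple", "Amazon"],
        "Alt_02": ["apple Inc.", np.nan],
    }
)

# Columns holding alternative names (everything except "Name").
alt_columns = [col for col in df.columns if col != "Name"]

# Outer dictionary comprehension: one entry per row, keyed by "Name".
# Inner list comprehension: keep only the non-missing alternative names.
search_dict = {
    row["Name"]: [row[col] for col in alt_columns if pd.notna(row[col])]
    for _, row in df.iterrows()
}

print(search_dict)
# {'AAPL': ['Apple', 'apple Inc.'], 'AMZN': ['Amazon']}
```

Filtering happens per row, so each name keeps exactly the alternatives it has and no row or column is discarded as a whole.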

The reason pandas' dropna() doesn't work here is that it deletes either a whole column (so in your example 'apple Inc.' would also be deleted) or a whole row (in your example the whole 'AMZN' row would be deleted).
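To see this behaviour directly, here is a small sketch that applies dropna() along both axes. The dataframe is rebuilt from the table in the question; the variable names rows_dropped and cols_dropped are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame(
    {
        "Name": ["AAPL", "AMZN"],
        "Alt_01": ["Apple", "Amazon"],
        "Alt_02": ["apple Inc.", np.nan],
    }
)

# Default (axis=0): drop every row that contains at least one NaN.
rows_dropped = df.dropna()
print(rows_dropped["Name"].tolist())  # ['AAPL']  (the AMZN row is gone)

# axis=1: drop every column that contains at least one NaN.
cols_dropped = df.dropna(axis=1)
print(cols_dropped.columns.tolist())  # ['Name', 'Alt_01']  (Alt_02 is gone)
```

Neither call removes just the single NaN cell, which is why the missing values have to be filtered row by row instead.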

In case the way search_dict is created is unfamiliar to you: it consists of a dictionary comprehension with a list comprehension nested inside it. For more info see: https://realpython.com/list-comprehension-python/
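As a rough illustration of how such a nested comprehension reads, the sketch below builds the same kind of dictionary twice, once with comprehensions and once with explicit loops. It uses plain Python data instead of a dataframe; rows, is_missing and search_dict_loops are hypothetical names introduced only for this example.

```python
import math

# Hypothetical stand-in for the dataframe rows: (name, alternative names).
rows = [
    ("AAPL", ["Apple", "apple Inc."]),
    ("AMZN", ["Amazon", float("nan")]),
]


def is_missing(value):
    """Return True for a float NaN, the marker pandas uses for missing values."""
    return isinstance(value, float) and math.isnan(value)


# Comprehension form: a list comprehension nested in a dict comprehension.
search_dict = {
    name: [alt for alt in alts if not is_missing(alt)]
    for name, alts in rows
}

# Equivalent explicit loops.
search_dict_loops = {}
for name, alts in rows:
    kept = []
    for alt in alts:
        if not is_missing(alt):
            kept.append(alt)
    search_dict_loops[name] = kept

print(search_dict == search_dict_loops)  # True
```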

User contributions licensed under: CC BY-SA