Pandas: DataFrame columns are not unique when making dictionary

Question

I have a dataframe like this:

    Name  Alt_01  Alt_02
    AAPL  Apple   apple Inc.
    AMZN  Amazon  NaN

In order to check whether a string contains any of the alternative names, I wrote code like:

Since not all the names have the same number of alternative names, I used the dropna() function to remove the NaN values. But after I do this, I receive a message like: Us…

Accepted Answer

If I interpret your question correctly, you want a resulting dictionary that looks like this:

    {'AAPL': ['Apple', 'apple Inc.'], 'AMZN': ['Amazon']}

If that is the case, then the following code will work (assuming numpy is imported as np):

    temp = df.set_index('Name').T.to_dict('list')
    search_dict = {k: [elem for elem in v if elem is not np.nan] for k, v in temp.items()}

The reason pandas' dropna() doesn't work here is that it deletes either a whole column (so in your example 'apple Inc.' would also be deleted) or a whole row (in your example the whole 'AMZN' row would be deleted). It cannot remove individual cells.

In case the way search_dict is created is alien to you: it combines a dictionary comprehension with a list comprehension. For more info see: https://realpython.com/list-comprehension-python/
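For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the approach described in the answer. The sample frame is rebuilt from the table in the question. Two things are my own assumptions rather than part of the original answer: the filter uses pd.notna instead of the identity check against np.nan (it should also catch NaN values that are not the np.nan object itself), and find_names is a hypothetical helper added only to illustrate the lookup. The sketch also assumes the values in Name are unique, since to_dict keys on the column labels after the transpose.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame(
    {
        "Name": ["AAPL", "AMZN"],
        "Alt_01": ["Apple", "Amazon"],
        "Alt_02": ["apple Inc.", np.nan],
    }
)

# dropna() removes whole rows or whole columns, never single cells.
rows_kept = df.dropna(axis=0)  # drops the AMZN row (it contains a NaN)
cols_kept = df.dropna(axis=1)  # drops the Alt_02 column (it contains a NaN)

# Transpose so each Name becomes a column, then collect each column as a list.
temp = df.set_index("Name").T.to_dict("list")

# Drop missing values per key. pd.notna is an assumption swapped in for the
# answer's `elem is not np.nan` check.
search_dict = {k: [elem for elem in v if pd.notna(elem)] for k, v in temp.items()}


def find_names(text, lookup):
    """Hypothetical helper: return the keys whose alternative names occur in text."""
    return [key for key, alts in lookup.items() if any(alt in text for alt in alts)]


print(search_dict)
print(find_names("Shares of Amazon rose today", search_dict))
```

The substring match in find_names is case-sensitive, so 'apple' would not match 'Apple'. Lowercasing both sides is one option if that matters for your data.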

Answer