I have a huge DataFrame that I read from a .csv file. After defining the columns, I want to keep only the ones I need. With Python 3.8.1 this worked fine, although it raised a FutureWarning: “Passing list-likes to .loc or [] with any missing label will raise KeyError in the future, you can use .reindex() as an alternative.”
If I try the same thing with Python 3.10.x, I now get a KeyError: "['empty'] not in index"
To get rid of the columns I don’t need, I use .loc like this:
df = df.loc[:, ['laenge','Timestamp', 'Nick']]
How can I get the same result with .reindex (or any other function) without getting the KeyError?
Thanks
Answer
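The KeyError comes from the installed pandas version rather than from Python itself: since pandas 1.0, .loc and [] raise it when a list of labels contains a missing one.
If you need only the columns that exist in the DataFrame, use numpy.intersect1d (it returns the matching labels sorted, so the column order can change):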
df = df[np.intersect1d(['laenge','Timestamp', 'Nick'], df.columns)]
DataFrame.reindex selects the same columns, in the requested order, if you then drop the columns that contain only missing values:
df = df.reindex(['laenge','Timestamp', 'Nick'], axis=1).dropna(how='all', axis=1)
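To make the reindex approach concrete, here is a minimal sketch on a small made-up frame. The names `wanted`, `reindexed`, `result` and `df2` are only illustrative and do not come from the question. It also shows one caveat worth keeping in mind: dropna(how='all', axis=1) cannot tell a column that reindex just added from a requested column that already existed but holds only NaN, so both are dropped.

```python
import numpy as np
import pandas as pd

# Small illustrative frame; 'Timestamp' is deliberately missing
df = pd.DataFrame({'laenge': [0, 5], 'col': [1, 7], 'Nick': [2, 8]})
wanted = ['laenge', 'Timestamp', 'Nick']

# reindex adds 'Timestamp' as an all-NaN column instead of raising KeyError
reindexed = df.reindex(wanted, axis=1)

# Dropping all-NaN columns removes it again; the requested order is kept
result = reindexed.dropna(how='all', axis=1)
print(result.columns.tolist())  # ['laenge', 'Nick']

# Caveat: an existing column that is entirely NaN is dropped as well
df2 = pd.DataFrame({'laenge': [np.nan, np.nan], 'Nick': [2, 8]})
result2 = df2.reindex(wanted, axis=1).dropna(how='all', axis=1)
print(result2.columns.tolist())  # ['Nick']
```

If all-NaN columns can occur in the real data and must be kept, filtering the label list before selecting is probably the safer route.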
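Sample with numpy.intersect1d (the selected columns come back in sorted order):
import numpy as np
import pandas as pd

df = pd.DataFrame({'laenge': [0,5], 'col': [1,7], 'Nick': [2,8]})
print (df)
   laenge  col  Nick
0       0    1     2
1       5    7     8

df = df[np.intersect1d(['laenge','Timestamp', 'Nick'], df.columns)]
print (df)
   Nick  laenge
0     2       0
1     8       5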
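If the order of the requested columns matters, one possible alternative (not part of the answer above, just a sketch) is to filter the label list with a plain list comprehension before selecting. The names `wanted`, `present` and `subset` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'laenge': [0, 5], 'col': [1, 7], 'Nick': [2, 8]})
wanted = ['laenge', 'Timestamp', 'Nick']  # 'Timestamp' is missing from df

# Keep only the requested labels that exist, in the requested order
present = [c for c in wanted if c in df.columns]

# Every label in `present` exists, so this selection cannot raise KeyError
subset = df[present]
print(subset.columns.tolist())  # ['laenge', 'Nick']
```

Unlike the reindex variant, this never touches the data, so an existing column that happens to contain only NaN is kept.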