How to remove the first n characters from all the cells in a column using python pandas?

df['opposition'].apply(lambda x: x[2:])

Please help me to understand the lambda function and how it works in this case.

Answer

A few things are at play here:

  • df[column].apply(f) takes a function f as argument and applies that function to every value in column, returning a new column with the modified values. The original column is left unchanged unless you assign the result back.
  • lambda x: x[2:] defines a function that takes a value x and returns the slice x[2:]. I.e., when x is a string, it returns x without the first two characters.
  • Hence, df['opposition'].apply(lambda x: x[2:]) returns the 'opposition' column modified by removing the first 2 characters from all strings in it.
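To make the three points above concrete, here is a minimal sketch. The sample values in 'opposition' are made up for illustration (the question does not show the real data), and drop_first_two is a hypothetical named function that does the same thing as the lambda.

```python
import pandas as pd

# Hypothetical sample data; the real 'opposition' values are not shown in the question.
df = pd.DataFrame({"opposition": ["v Pakistan", "v Sri Lanka", "v India"]})


def drop_first_two(x):
    """Named equivalent of `lambda x: x[2:]` (hypothetical helper)."""
    return x[2:]


# Both calls run the function once per value and return a new Series.
with_lambda = df["opposition"].apply(lambda x: x[2:])
with_named = df["opposition"].apply(drop_first_two)

# apply does not modify df by itself; assign the result back to keep it.
df["opposition"] = with_lambda
print(df)
```

So the lambda is just an unnamed, one-line function. To remove the first n characters instead of the first 2, use x[n:] in place of x[2:].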

However, for this particular use case, there is a more idiomatic way to do this. You can use .str.slice() to perform the same operation without writing a lambda, and it leaves missing values as missing instead of raising an error:

df['opposition'].str.slice(start=2)

The methods under .str are specific to columns holding string values. See the pandas documentation on working with text data for more info.
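As a rough comparison of the two approaches, the sketch below runs both on a column that contains a missing value. The data and the variable n are assumptions for illustration only. The .str[n:] form is shorthand for the same slice.

```python
import pandas as pd

n = 2  # number of leading characters to drop (hypothetical variable)

# Hypothetical sample data with one missing value.
s = pd.Series(["v Pakistan", None, "v India"], name="opposition")

sliced = s.str.slice(start=n)  # missing value stays missing
indexed = s.str[n:]            # shorthand for the same slice

# The lambda version fails on the missing value, because a
# non-string value such as None or NaN cannot be sliced.
try:
    s.apply(lambda x: x[n:])
    lambda_failed = False
except TypeError:
    lambda_failed = True

print(sliced)
print("lambda raised TypeError:", lambda_failed)
```

If the column may contain missing values, the .str version is the safer choice. With apply you would have to handle them inside the function yourself.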

User contributions licensed under: CC BY-SA