What is the difference between str(Series).split() and Series.str.split()?

Question

I wanted to understand conceptually why str(Series).split() and Series.str.split() give different output when applied to a series object. I was trying to split dates on their punctuation: str(Series).split() didn't give me the desired output, while Series.str.split() did. However, I heard that using the [dot] accessor is frowned upon. I've searched

Accepted Answer

str(series).split() first converts the whole series object into a single string (its printed representation, which includes the index labels and the dtype footer) and then splits that string on the specified delimiter. In this case no delimiter is given, so it splits on whitespace.

On the other hand, series.str.split() works like mapping the split function over each string in the series object, which gives you a series object containing a list of strings for each string in the original series.

See the official documentation for series.str.split() for more info.

Also, the dot operator is generally frowned upon when it's used to access a dataframe column, as it won't work if the column name contains whitespace. That concern is about column access such as df.column; the .str accessor is the standard way to reach the string methods of a series.
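As a rough illustration of the difference, here is a minimal sketch. The sample dates, the "-" delimiter, and the column name "order date" are assumptions made up for this example, since the original question does not show its data.

```python
import pandas as pd

# Hypothetical sample data; the question's real dates are not shown.
dates = pd.Series(["2021-01-05", "2022-11-30"])

# str(series) is the printed representation of the whole series,
# so splitting it on whitespace mixes index labels, values, and
# the dtype footer into one flat Python list.
tokens = str(dates).split()
print(tokens)

# series.str.split() splits each element separately and returns a
# new series holding one list of parts per original string.
per_element = dates.str.split("-")
print(per_element.tolist())

# expand=True spreads the parts into DataFrame columns instead of lists.
parts = dates.str.split("-", expand=True)
print(parts)

# Column access: bracket indexing works for any column name, while
# attribute access (df.order date) is not even valid Python syntax
# when the name contains a space.
df = pd.DataFrame({"order date": dates})
split_column = df["order date"].str.split("-")
```

The first print shows why the str(series) route is rarely what you want: the list starts with the index label "0" rather than with a date. The .str route keeps the result aligned with the original index.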

Answer