
Adding a full stop to text when missing

How can I add a full stop to a text value only when it is missing? My attempt below does not produce the desired combined text.

# Import libraries
import pandas as pd
import numpy as np
 
# Initialize list of lists
data = [['text with a period.', '111A.'], 
        ['text without a period', '222B'], 
        ['text with many periods...', '333C'],
        [np.nan, '333C'],
        [np.nan, np.nan]]
 
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['text1', 'text2'])

combined_df=df.copy()
combined_df["combined_text"]=df["text1"].fillna("") + ". " + df["text2"].fillna("") + '.'
combined_df

Desired output

combined_df snapshot


Answer

You can use Series.where to append a full stop only where one is missing, and Series.str.cat to join the two columns:

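# Append '.' only where the string does not already end with one; NaN stays NaN.
text1 = df.text1.where(df.text1.str.endswith('.', na=False), df.text1 + '.')
text2 = df.text2.where(df.text2.str.endswith('.', na=False), df.text2 + '.')

# Join with a space, treating NaN as '', then turn all-empty results back into NaN.
df['combined_text'] = text1.str.cat(text2, sep=' ', na_rep='').str.strip().replace('', np.nan)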

Result:

                       text1  text2                    combined_text
0        text with a period.  111A.        text with a period. 111A.
1      text without a period   222B     text without a period. 222B.
2  text with many periods...   333C  text with many periods... 333C.
3                        NaN   333C                            333C.
4                        NaN    NaN                              NaN

(This also works when text1 is given and text2 is NaN.)
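To check that last case, here is a minimal sketch that wraps the same where/cat approach in a small helper. The helper name add_full_stop and the sample rows are made up for illustration and are not part of the original question. Note that text2 needs at least one string value, otherwise pandas infers a float column and the .str accessor is not available.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows (not from the question): text2 is missing in the first two.
df = pd.DataFrame({
    "text1": ["only text1", "ends with a period.", "both"],
    "text2": [np.nan, np.nan, "x"],
})

def add_full_stop(s):
    """Append '.' to strings that do not end with one; missing values stay missing."""
    # na=False makes missing values count as "does not end with '.'",
    # so they take the s + "." branch, which is still NaN.
    return s.where(s.str.endswith(".", na=False), s + ".")

df["combined_text"] = (
    add_full_stop(df["text1"])
    .str.cat(add_full_stop(df["text2"]), sep=" ", na_rep="")  # NaN joins as ''
    .str.strip()             # drop the stray separator left by an empty side
    .replace("", np.nan)     # rows where both sides were missing become NaN again
)

print(df["combined_text"].tolist())
# ['only text1.', 'ends with a period.', 'both. x.']
```

The strip step is what removes the trailing space when text2 is missing, so the first two rows end cleanly with a single full stop.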

User contributions licensed under: CC BY-SA