Get rid of default text

Question

I am trying to parse a user's event descriptions after obtaining access to their Google Calendar through the Google Calendar API. When I input the description into my program, I want to get rid of default (and useless) text such as Zoom meeting invitations. If the following is the description string, how can I parse it so that only

Accepted Answer

Methodology

I think this would be a good use for regular expressions (regex). This is essentially a pattern-matching standard that lets you describe the general structure of a string. While using it on HTML and XML is not a good idea, since it is not designed for extracting information from that kind of markup, it should work if all you want to do is discard certain sections.

Explanation

If I understand correctly, you would like to be left with
Hi, please keep this text.

Also not Zoom default text.
Which means we need to come up with a pattern to match the following portion (the brackets indicate the information that changes every time):
[Name] is inviting you to a scheduled Zoom meeting.

Topic: [Name]'s Personal Meeting Room

Join Zoom Meeting
[Link]

Meeting ID: [ID]
Password: [Password]
Important Pieces:

The beginning: [Name] will be some string of at least one character. To make sure you don’t match “Hi, please keep this text.”, that part should match any characters that aren’t “\n” (this is represented in regex with [^\n]), where “character” means anything other than a line break. The rest of the sentence should be matched word for word, so we’re not just matching anything.

The end: [Password], like [Name], is just [^\n] for the same reason.

The boundaries: the portion to remove starts and ends with “\n”. This should be reflected in the regex.

The middle: everything between that first sentence and the password portion has a format, but it can be treated as a wildcard, some mix of at least one character or line break (represented in regex with (.|\n)+).

Replacing all of the appropriate portions in the text, you get the following:

\n[^\n]+? is inviting you to a scheduled Zoom meeting\.(.|\n)+?Password: [^\n]+?\n
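To see how the two building blocks behave differently, here is a small sketch. The sample text and the variable names are made up for illustration; only the re calls come from Python itself. It shows that [^\n]+ stays on one line, while (.|\n)+? can cross line breaks.

```python
import re

# Hypothetical sample text, only for illustration.
sample = "Password: abc123\nnext line"

# [^\n]+ matches a run of characters but stops at the first line break.
single_line = re.search(r"Password: [^\n]+", sample)
print(single_line.group(0))

# (.|\n)+? matches characters or line breaks, so it can span several lines.
# Being lazy, it stops as soon as the rest of the pattern ("line") can match.
multi_line = re.search(r"Password(.|\n)+?line", sample)
print(repr(multi_line.group(0)))
```

The first match ends at the end of the password line, and the second runs across the line break until it reaches the word "line".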
Code

As for the Python, the re module will come in handy here as your regex aid. We want to save the above pattern into a variable, and use it to cut the appropriate portion out of the string.

To “save” the pattern, the re module allows you to compile the regex into an object (the r before the string makes it a raw string, so backslashes such as \n reach the regex engine unchanged):

import re

zoom_pattern = re.compile(r"\n[^\n]+? is inviting you to a scheduled Zoom meeting\.(.|\n)+?Password: [^\n]+?\n")

The module also provides the ability to replace regex matches within strings, and we can replace our match with nothing to cut it out of the string:

import re

s = " - string with zoom meeting stuff - "
zoom_pattern = re.compile(r"\n[^\n]+? is inviting you to a scheduled Zoom meeting\.(.|\n)+?Password: [^\n]+?\n")
clean_string = zoom_pattern.sub("", s)

Since we compiled the pattern, you now have a reusable way to clean up your string!

If you’d like to change your regex to match each individual thing, just adjust the “Important Pieces” from earlier to match your goal. If you want to test your ideas, this is a wonderful resource!
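For an end-to-end check, here is a minimal sketch that runs the approach on a made-up description. The function name strip_zoom_invite and every value in the sample (name, link, meeting ID, password) are placeholders I invented for illustration, not real Zoom output.

```python
import re

# The pattern described above: the line ending in the Zoom sentence,
# everything up to "Password: ", the rest of that line, and the line
# breaks on either side of the block.
ZOOM_PATTERN = re.compile(
    r"\n[^\n]+? is inviting you to a scheduled Zoom meeting\."
    r"(.|\n)+?Password: [^\n]+?\n"
)


def strip_zoom_invite(description: str) -> str:
    """Return the description with the Zoom invitation block removed."""
    return ZOOM_PATTERN.sub("", description)


# Made-up description; all values below are placeholders.
description = (
    "Hi, please keep this text.\n"
    "\n"
    "Jane Doe is inviting you to a scheduled Zoom meeting.\n"
    "\n"
    "Topic: Jane Doe's Personal Meeting Room\n"
    "\n"
    "Join Zoom Meeting\n"
    "https://example.com/j/0000000000\n"
    "\n"
    "Meeting ID: 000 000 0000\n"
    "Password: 000000\n"
    "\n"
    "Also not Zoom default text."
)

cleaned = strip_zoom_invite(description)
print(cleaned)
```

One limitation worth knowing: because the pattern starts and ends with a line break, an invitation that sits on the very first or very last line of the description will not be matched. If that can happen in your data, the boundaries of the pattern would need adjusting.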
