I want to split a WhatsApp chat backup text by date and keep the date as part of each message. I tried but couldn’t achieve the exact result I want. If anyone can suggest a way to achieve this, that would be a big help. (I don’t know much about regex.) The above code does the job and keeps the
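A minimal sketch of one way to split such a backup while keeping the date with each message. The excerpt does not show the export format, so the line shape assumed here ("1/2/21, 9:15 PM - Name: text") and the names DATE_START and split_messages are hypothetical. The idea is to split on a zero-width lookahead, so the date is never consumed by the split.

```python
import re

# ASSUMPTION: each message starts a line with a date such as "1/2/21, ".
# The lookahead (?=...) matches without consuming, so the date stays in the message.
DATE_START = re.compile(r"(?m)^(?=\d{1,2}/\d{1,2}/\d{2,4}, )")


def split_messages(chat):
    """Split a chat export into messages, each still starting with its date."""
    # Splitting on an empty match requires Python 3.7 or later.
    parts = DATE_START.split(chat)
    # Drop the empty piece before the first date and trim trailing newlines.
    return [part.strip() for part in parts if part.strip()]
```

Lines that do not start with a date (continuations of a multi-line message) stay attached to the message before them.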
Tag: regex
Dashes with(out) spaces with python’s regex
What I’ve managed to do: I’m new to both Python and regex. With Python’s re.compile, across a massive number of text files, I wanted to find all kinds of dashes surrounded by spaces. I used: (Yeah, I know about the regex module on PyPI, but I’m trying to use what I know better) It seems to have worked fine: I got
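The pattern used in the question is not shown in this excerpt, so the following is only a sketch of the general idea with the standard re module. The set of dash characters (hyphen-minus, en dash, em dash) and the names SPACED_DASH and find_spaced_dashes are assumptions for illustration.

```python
import re

# ASSUMPTION: "all kinds of dashes" means hyphen-minus, en dash (U+2013), em dash (U+2014).
# Lookbehind and lookahead require whitespace on both sides without consuming it.
SPACED_DASH = re.compile(r"(?<=\s)[-\u2013\u2014](?=\s)")


def find_spaced_dashes(text):
    """Return every dash in text that has whitespace directly before and after it."""
    return SPACED_DASH.findall(text)
```

Because the surrounding whitespace is not consumed, two spaced dashes separated by a single space would both be found.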
Python regular expression: how to deal with multiple backslashes
I’m dealing with text data and having a problem erasing multiple backslashes. I found out that using .sub works quite well. So I coded as below to erase a backslash followed by r, n, t, f or v (\r, \n, \t, \f, \v). However, the code above can’t deal with the string below. So I coded it like this: But it’s showing a result like this. I don’t know why this happens.
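The code and sample strings are missing from this excerpt, so this is a guess at the intent: removing literal backslashes (one or more) followed by one of r, n, t, f, v. The names ESCAPE_RESIDUE and strip_escape_residue are hypothetical. In a raw string, `\\` matches one literal backslash, and `+` lets the same pattern handle doubled or tripled backslashes.

```python
import re

# One or more literal backslashes followed by r, n, t, f or v.
# ASSUMPTION: the text contains escape sequences as literal characters,
# not real newline or tab characters.
ESCAPE_RESIDUE = re.compile(r"\\+[rntfv]")


def strip_escape_residue(text):
    """Remove literal sequences such as \\n or \\\\t from text."""
    return ESCAPE_RESIDUE.sub("", text)
```

Note that this also removes a backslash sequence that happens to precede an ordinary word starting with one of those letters, so it may need tightening for real data.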
Overlapping regular expression substitution in Python, but contingent on values of capture groups
I’m currently writing a program in Python that is supposed to transliterate all the characters in a language from one orthography into another. There are two things at hand here, one of which is already solved, and the second is the problem. In the first step, characters from the source orthography are converted into the target orthography, e.g. (ffr: the
Returning empty string for missing capture group Python regex
I’m working on parsing string text containing information on university, year, degree field, and whether or not a person graduated. Here are two examples: What I am struggling to accomplish is to have an entry for each school experience regardless of whether specific information is missing. In particular, imagine I wanted to pull whether each degree was finished from ex1
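The two example strings are not included in this excerpt, so the entry format below ("School, year" with an optional ", graduated") is invented for illustration, as are PATTERN and parse_entry. The technique itself is standard: make the group optional and ask the match object for a default, since `Match.groupdict(default="")` returns an empty string instead of None for groups that did not participate.

```python
import re

# ASSUMPTION: hypothetical entry format "School Name, 2010, graduated",
# where the ", graduated" part may be absent.
PATTERN = re.compile(
    r"(?P<school>[A-Za-z ]+), (?P<year>\d{4})(?:, (?P<status>graduated))?"
)


def parse_entry(entry):
    """Return a dict of fields; missing optional groups become empty strings."""
    match = PATTERN.match(entry)
    if match is None:
        return None
    return match.groupdict(default="")
```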
How can I convert a struct timestamp column with start and end into a normal Python timestamp column?
I have a time-series pivot table with a struct timestamp column that includes the start and end of the time frame of each record, as follows: Since I will later use the timestamps as the index for time-series analysis, I need to convert it into timestamps with just the end or the start. I have tried, so far unsuccessfully, to find a solution using regex based on this post, as follows:
Put comma after a pattern in python regex
Just like the question says, I am trying to add a comma at the end of a pattern or substring. I found 3 solutions that should do the job, and they look logical too. But they are not changing anything. I will show you all of that code. The goal is to find out if there is something that I am
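The three attempts are not shown in this excerpt, so the cause cannot be confirmed here. One common reason for "nothing changes" is that Python strings are immutable: re.sub returns a new string and the result must be assigned. A small sketch, where add_comma_after is a hypothetical helper and `\g<0>` in the replacement stands for the whole match:

```python
import re


def add_comma_after(pattern, text):
    """Return a copy of text with a comma appended after every match of pattern."""
    # \g<0> refers to the entire match, so no extra capture group is needed.
    return re.sub(pattern, r"\g<0>,", text)


original = "a 12 b 345"
updated = add_comma_after(r"\d+", original)  # the result must be captured
```

Here `original` is left unchanged and only `updated` carries the commas.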
Convert strings with an unknown number of hex strings embedded in them to strings using regex
So I have a list of strings (content from Snort rules), and I am trying to convert the hex portions of them to UTF-8/ASCII, so I can send the content over netcat. The method I have now works fine for strings with single hex characters (e.g. 3A), but breaks when there’s a series of hex characters (e.g. 3A 4B 00
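The current method is not shown in this excerpt. As a sketch, assuming the hex portions are delimited by pipes the way Snort content strings usually write them (for example `|3A 4B 00|`), a single regex can match a whole run of hex pairs and bytes.fromhex can decode the run in one call. HEX_RUN and decode_content are hypothetical names, and the function returns bytes because values such as 00 are not printable text.

```python
import re

# ASSUMPTION: hex runs look like |3A 4B 00| (pairs of hex digits between pipes).
HEX_RUN = re.compile(r"\|((?:[0-9A-Fa-f]{2}\s*)+)\|")


def decode_content(content):
    """Convert a content string with embedded |hex| runs into raw bytes."""
    out = bytearray()
    pos = 0
    for match in HEX_RUN.finditer(content):
        # Literal text between hex runs is encoded as-is.
        out += content[pos:match.start()].encode("utf-8")
        # bytes.fromhex skips the spaces between the hex pairs.
        out += bytes.fromhex(match.group(1))
        pos = match.end()
    out += content[pos:].encode("utf-8")
    return bytes(out)
```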
Regex expression for words with length of even number
I want to write a regex expression for words with even-numbered length. For example, the output I want from the list containing the words: {“blue”, “ah”, “sky”, “wow”, “neat”} is {“blue”, “ah”, “neat”}. I know that the expression \w{2} or \w{4} would match 2-letter or 4-letter words, but what I want is something that could work for all even numbers.
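One possible approach, sketched here rather than taken from any answer on the page: repeat a two-character unit one or more times, `(?:\w{2})+`, and require the whole word to match. EVEN_WORD and even_length_words are hypothetical names.

```python
import re

# A word of even length is one or more repetitions of exactly two word characters.
EVEN_WORD = re.compile(r"(?:\w{2})+")


def even_length_words(words):
    """Keep only the words whose length is a positive even number."""
    # fullmatch anchors the pattern at both ends of each word.
    return [word for word in words if EVEN_WORD.fullmatch(word)]
```

When searching inside running text instead of a list, the same unit can be wrapped in word boundaries, as in `\b(?:\w{2})+\b`.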
Disable metacharacters in regular expressions
Perl has \Q and \E operators in its regular expression toolkit: Does such a facility exist in Python? I am aware of the in operator to do literal string comparison, but my use case is that I’m using the lineinfile Ansible module, which relies on Python’s regular expression library. I’d like something like this (if you can see the intent even
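Python's re module has no inline \Q...\E syntax, but when the pattern is built in Python code, re.escape gives the same effect by escaping every metacharacter in a string. A minimal sketch, where literal_pattern is a hypothetical wrapper; whether this helps inside an Ansible task depends on where the pattern is constructed, which the excerpt does not show.

```python
import re


def literal_pattern(text):
    """Return a regex that matches text literally, metacharacters included."""
    return re.escape(text)


# The dot and star are matched as plain characters, not as regex operators.
found = re.search(literal_pattern("a.b*c"), "xa.b*cx")
not_found = re.search(literal_pattern("a.b"), "axb")
```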