I’m looking for emails whose subject says how many Bitcoin I received. Because the subject contains a number, I want a way to find emails where that number is greater than or equal to a given amount.
Example: I have an email with the subject “You received 0.000666703 BTC”, and I want to match this subject or any subject with a larger amount. For instance, I want to find “You received 0.002719281 BTC”, but not “You received 0.000028181 BTC”, because that number is smaller. In other words, I want to find amounts greater than or equal to the one in the first subject. This is my code:
    import imaplib
    import credentials
    import email
    from bs4 import BeautifulSoup

    imap_ssl_host = 'imap.gmail.com'
    imap_ssl_port = 993
    username = "myemail"
    password = "mypass"

    server = imaplib.IMAP4_SSL(imap_ssl_host, imap_ssl_port)
    server.login(username, password)
    server.select('INBOX')

    typ, data = server.search(None, '(FROM "no-reply@coinbase.com" SUBJECT "You received 0,00066703 BTC" SINCE "24-Sep-2021")')
    for num in data[0].split():
        typ, data = server.fetch(num,'(RFC822)')
        msg = email.message_from_bytes(data[0][1])
        print(msg.get_payload(decode=True))
The subject always begins with “You received”, followed by the amount and then “BTC”, as in my example above. How can I extract only the number?
The console output is HTML content. I only need to know whether a subject like the one described above exists so I can do the rest. Is there a more efficient way to do this?
Answer
If you only care about the subject, only fetch the subject.
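    import imaplib
    from email.parser import HeaderParser
    from email.policy import default  # use Python >= 3.6 EmailMessage API

    ...

    parser = HeaderParser(policy=default)
    server.select('INBOX')
    # Search only on the fixed prefix; the numeric comparison happens client-side.
    typ, data = server.search(None, '(FROM "no-reply@coinbase.com" SUBJECT "You received" SINCE "24-Sep-2021")')
    if typ == 'OK':  # imaplib reports status in upper case
        for num in data[0].split():
            # Fetch only the Subject header, without marking the message as read.
            ok, fetched = server.fetch(num, '(BODY.PEEK[HEADER.FIELDS (SUBJECT)])')
            if ok != 'OK':
                continue
            # The header arrives as bytes; decode it, then let the parser
            # undo any MIME encoding of the subject.
            msg = parser.parsestr(fetched[0][1].decode('us-ascii'))
            subj = msg['Subject'] or ''
            if not subj.startswith('You received'):
                continue
            try:
                # "You received 0.000666703 BTC" -> third word is the amount
                amount = float(subj.split()[2])
            except (IndexError, ValueError):
                continue
            if amount >= 0.000666703:
                print('Message %s: %s' % (num.decode(), subj))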
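As a rough sketch of the "extract only the number" step on its own, the amount can be pulled out of an already-decoded subject with a regular expression and compared as a `Decimal`, which avoids float rounding on amounts this small. The names `SUBJECT_RE`, `btc_amount` and `meets_threshold` are made up for this illustration, and the pattern assumes the subject always has the form "You received <amount> BTC"; it also accepts a comma as the decimal separator, since the search string in the question uses one.

```python
import re
from decimal import Decimal

# Hypothetical pattern, assuming subjects look like "You received <amount> BTC".
# Accepts "." or "," as the decimal separator.
SUBJECT_RE = re.compile(r"^You received\s+(\d+(?:[.,]\d+)?)\s+BTC\b")


def btc_amount(subject):
    """Return the amount in the subject as a Decimal, or None if it does not match."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", "."))


def meets_threshold(subject, threshold=Decimal("0.000666703")):
    """True if the subject carries an amount greater than or equal to threshold."""
    amount = btc_amount(subject)
    return amount is not None and amount >= threshold


if __name__ == "__main__":
    for s in ("You received 0.000666703 BTC",
              "You received 0.002719281 BTC",
              "You received 0.000028181 BTC"):
        print(s, meets_threshold(s))
```

A helper like this could replace the `split()` and `float()` step in the loop, so that subjects with an unexpected shape are skipped rather than raising.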
The Subject: header comes back as a bytes string, which at a minimum you have to decode. However, it may also be MIME-encoded (for example Subject: =?UTF-8?B?WW91IHJlY2VpdmVkIDAuMTIzIEJUQw==?=), in which case you need to decode it using email.parser.HeaderParser with a modern policy, or something similar. The interface is a bit messy; if you would rather not decode the bytes separately, email.parser.BytesHeaderParser accepts bytes directly through its parsebytes method.
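To illustrate the MIME decoding, here is a minimal sketch that parses raw header bytes with `email.parser.BytesHeaderParser` and the `default` policy, which decodes RFC 2047 encoded words when the header is read. The `raw_header` value is invented for the example and only imitates what a Subject-only fetch might return; `decode_subject` is a hypothetical helper name.

```python
from email.parser import BytesHeaderParser
from email.policy import default

# Made-up header bytes, shaped like the payload of a Subject-only fetch.
raw_header = b"Subject: =?UTF-8?B?WW91IHJlY2VpdmVkIDAuMTIzIEJUQw==?=\r\n\r\n"


def decode_subject(header_bytes):
    """Parse header bytes and return the decoded Subject, or '' if absent."""
    msg = BytesHeaderParser(policy=default).parsebytes(header_bytes)
    value = msg["Subject"]
    return "" if value is None else str(value)


if __name__ == "__main__":
    # The encoded word above is base64 for "You received 0.123 BTC".
    print(decode_subject(raw_header))
```

A plain ASCII subject passes through the same function unchanged, so the caller does not need to know whether the server sent an encoded form.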
Fetching with BODY.PEEK does not modify the message’s flags, whereas plain BODY would set the \Seen flag, i.e. mark the message as read.
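Standard IMAP SEARCH only does substring matching on the subject, so the numeric comparison has to happen on the client. Some IMAP servers support more complex search syntax (perhaps even regex), but this should be reasonably portable and robust, I hope.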