I am running an HPC simulation on AWS using spot instances. AWS can terminate a spot instance with 2 minutes' notice. To check for termination, you need to execute curl on a specific URL every 5 seconds. It is a simple request that returns JSON containing the termination time if AWS has initiated the termination process.
Currently I am using subprocess to run the script:
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, bufsize=1, universal_newlines=True)
    for line in p.stdout:
        if "Floating point exception" in line:
            print(line.rstrip())
        log.write(line)
        log.flush()
    p.wait()
    status = p.returncode
    print(status)
Is it possible to add a callback that is called every 5 seconds? The callback would check the output of the curl command. If it finds a termination time, it would set a flag in a file and exit. The main process would then end gracefully because of this flag.
To clarify, I do not want to interact with or kill the main process. This particular process (not written by me) continuously checks the contents of a file and exits gracefully if it finds a specific keyword. The callback would set this keyword.
Is this the right approach?
Answer
Write a function that runs the following loop:
- Launch curl in a subprocess and process the returned JSON.
- If the sim should terminate, write the required file and return.
- Otherwise, sleep for 4.5 seconds, so that each cycle takes roughly 5 seconds.
Start that function in a threading.Thread before you launch the simulation.
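A minimal sketch of such a watcher thread, polling roughly every 5 seconds as in the question, could look like the following. The names `fetch_notice_with_curl`, `watch_for_termination`, `flag_path`, `keyword` and `stop` are hypothetical, the URL is left as a parameter because the question does not give it, and treating any valid JSON reply as a termination notice is an assumption you should adapt to the actual response.

```python
import json
import subprocess
import threading


def fetch_notice_with_curl(url):
    """Run curl on url; return the parsed JSON, or None if there is no notice."""
    result = subprocess.run(
        ["curl", "--silent", "--fail", "--max-time", "2", url],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    if result.returncode != 0:  # HTTP error, timeout, etc.: assume no notice yet
        return None
    try:
        return json.loads(result.stdout)
    except ValueError:  # reply was not valid JSON
        return None


def watch_for_termination(fetch_notice, flag_path, keyword, stop, interval=4.5):
    """Poll until a notice appears or stop is set; return True if the flag was written."""
    while not stop.is_set():
        if fetch_notice() is not None:
            # Assumption: the simulation looks for this keyword in flag_path.
            with open(flag_path, "w") as f:
                f.write(keyword + "\n")
            return True
        stop.wait(interval)  # sleeps, but wakes up early once stop is set
    return False


# Usage (hypothetical values), started before the Popen call:
#   stop = threading.Event()
#   watcher = threading.Thread(
#       target=watch_for_termination,
#       args=(lambda: fetch_notice_with_curl(url), "flag.txt", "STOP", stop),
#       daemon=True,
#   )
#   watcher.start()
#   ... run the simulation and read its output ...
#   stop.set()  # simulation finished on its own; let the watcher exit
```

Using a `threading.Event` instead of a plain `time.sleep` is a design choice, not a requirement: it lets the main thread stop the watcher promptly when the simulation finishes normally.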
You’d have to test what happens to your for line in p.stdout loop if the program running in p exits. Maybe it will generate an exception. Or you might want to check p.poll() in the loop to handle that gracefully.
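One way to run that test is with a short-lived stand-in for the simulation. The sketch below uses a hypothetical helper, `run_and_collect`, and a throwaway Python child process. It also merges stderr into stdout, which differs from the question's code: that is an assumption on my part, made so that an unread stderr pipe cannot fill up and block the child.

```python
import subprocess
import sys


def run_and_collect(cmd):
    """Read a child's output line by line; return (lines, exit status)."""
    p = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merged so there is only one pipe to drain
        bufsize=1,
        universal_newlines=True,
    )
    lines = []
    for line in p.stdout:  # iteration stops at end-of-file on the pipe
        lines.append(line.rstrip())
    p.stdout.close()
    return lines, p.wait()  # wait() reaps the child and gives its return code


# Stand-in child: prints two lines, then exits with status 3.
child = [sys.executable, "-c", "print('step 1'); print('step 2'); raise SystemExit(3)"]
lines, status = run_and_collect(child)
print(lines, status)
```

In this sketch the loop just ends once the child has exited and its output has been read, and the exit status is then available from `p.wait()`. It is still worth checking this against the real simulation, for example if it spawns helpers that keep the pipe open.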