I have one web scraping function which fetches data from 190 URLs. To complete it quickly I used concurrent.futures.ThreadPoolExecutor. I am saving that data to a SQL Server database. I have to repeat this whole process every 3 minutes from 9 AM to 4 PM. But when I use a while loop or scheduler, that concurr…
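A minimal sketch of the repeat-every-3-minutes pattern, assuming a hypothetical `fetch_url` scraping function: creating a fresh `ThreadPoolExecutor` per cycle (via `with`) releases the worker threads between runs instead of letting them accumulate across iterations of the loop.

```python
import concurrent.futures
import datetime
import time

def fetch_url(url):
    # Hypothetical stand-in for the real scraping logic.
    return url

def fetch_all(urls):
    # A fresh executor per cycle; the with-block joins all workers
    # before the next sleep, so threads never pile up.
    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
        return list(pool.map(fetch_url, urls))

def run_every_3_minutes(urls, start_hour=9, end_hour=16):
    while True:
        now = datetime.datetime.now()
        if now.hour >= end_hour:
            break
        if now.hour >= start_hour:
            results = fetch_all(urls)
            # save `results` to the SQL Server database here
        time.sleep(180)  # wait 3 minutes before the next cycle
```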
Tag: concurrent.futures
Retrying failed futures in Python’s ThreadPoolExecutor
I want to implement retry logic with Python’s concurrent.futures.ThreadPoolExecutor. I would like the following properties: A new future is added to the work queue as soon as it fails. A retried future can be retried again, either indefinitely or up to a maximum retry count. A lot of existing code I fou…
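One way to get the "resubmit as soon as it fails" property is to resubmit from `add_done_callback` and hand the caller a proxy `Future`. This is a sketch, not the asker's code; `submit_with_retry` and `max_retries` are names introduced here.

```python
import concurrent.futures

def submit_with_retry(pool, fn, *args, max_retries=3):
    """Submit fn; on failure, resubmit it up to max_retries times.

    Returns a proxy Future that resolves with the first successful
    result, or with the last exception once the budget is spent.
    """
    proxy = concurrent.futures.Future()

    def schedule(retries_left):
        inner = pool.submit(fn, *args)
        inner.add_done_callback(lambda f: on_done(f, retries_left))

    def on_done(inner, retries_left):
        exc = inner.exception()
        if exc is None:
            proxy.set_result(inner.result())
        elif retries_left > 0:
            # Resubmit immediately: the retry re-enters the work
            # queue the moment the failure is observed.
            schedule(retries_left - 1)
        else:
            proxy.set_exception(exc)

    schedule(max_retries)
    return proxy
```

Passing `max_retries=None`-style unlimited retries would just mean skipping the `retries_left` check, at the risk of a permanently failing task looping forever.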
Faster processing of a for loop with PUT requests in Python
I have a JSON file that I am reading, processing each element to build the payload that the API requires. I am looking for ways to speed this up through multiprocessing / concurrency – not sure of the proper approach. I think either would work, as they are individual requests up…
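Since PUT requests are I/O-bound (the GIL is released while waiting on the network), threads are usually the better fit here than processes. A sketch, where `send_put` is a hypothetical stand-in for the real API call (e.g. `requests.put(url, json=item)` with the third-party `requests` library):

```python
import concurrent.futures

def send_put(item):
    # Hypothetical placeholder for the real HTTP PUT call.
    return {"id": item.get("id"), "status": 200}

def process_all(items, max_workers=10):
    # map() preserves input order in its results, so responses line
    # up with the elements read from the JSON file.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_put, items))
```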
How to force close ProcessPoolExecutor even when there is deadlock
I’m trying to use a separate process to stream data via concurrent.futures. However, sometimes the other party stops the data feed, and as long as I restart this thread it works again. So I designed something like this, to be able to keep streaming data without intervention. …
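`ProcessPoolExecutor` has no supported way to force-kill a deadlocked worker, so one workaround is to drop down to `multiprocessing.Process`, whose `terminate()` kills the child outright. A sketch under assumed names (`stream_worker` stands in for the real feed loop, `supervise` for the watchdog): restart the worker whenever no data arrives within a timeout.

```python
import multiprocessing
import time

def stream_worker(conn):
    # Hypothetical stand-in for the real data-feed loop.
    while True:
        conn.send("tick")
        time.sleep(0.1)

def supervise(run_for=0.5, timeout=1.0):
    parent, child = multiprocessing.Pipe()
    proc = multiprocessing.Process(target=stream_worker, args=(child,), daemon=True)
    proc.start()
    received = []
    deadline = time.time() + run_for
    while time.time() < deadline:
        if parent.poll(timeout):
            received.append(parent.recv())
        else:
            # No data within `timeout`: assume the feed is stuck.
            proc.terminate()        # force-kill even if deadlocked
            proc.join()
            proc = multiprocessing.Process(
                target=stream_worker, args=(child,), daemon=True)
            proc.start()
    proc.terminate()
    proc.join()
    return received
```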
Identify current thread in concurrent.futures.ThreadPoolExecutor
The following code has 5 workers, each of which runs its own worker_task(), but inside worker_task() I cannot identify which of the 5 workers is currently being used (worker ID). If I want to print('worker 3 has finished') inside worker_task(), I cannot…
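A sketch of one common answer: give the pool a `thread_name_prefix` and read `threading.current_thread().name` inside the task. The pool names its threads `prefix_0`, `prefix_1`, … up to `max_workers - 1`, which serves as a worker ID.

```python
import concurrent.futures
import threading

def worker_task(n):
    # current_thread().name combines the pool's thread_name_prefix
    # with a per-worker index, e.g. "worker_0", "worker_1", ...
    name = threading.current_thread().name
    return f"{name} finished task {n}"

with concurrent.futures.ThreadPoolExecutor(
        max_workers=5, thread_name_prefix="worker") as pool:
    results = list(pool.map(worker_task, range(5)))
```

Note this identifies the *thread* servicing the task, which is what the question means by worker; tasks are not pinned to particular threads, so the same thread may handle several tasks.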
Predicting in parallel using concurrent.futures of tensorflow.keras models
I am trying to implement some parallel jobs using concurrent.futures. Each worker requires a copy of a TensorFlow model and some data. I implemented it in the following way (MWE): simple_model() creates the model, clone_model clones a TensorFlow model, work represents an MWE of possible work, and worker assigns th…
Unable to use dynamic classes with concurrent.futures.ProcessPoolExecutor
In the code below, I am dynamically creating an object of the class inside the _py attribute by using the generate_object method. The code works perfectly if I am not using a concurrent approach. However, if I use concurrency from concurrent.futures, I do not get the desired result because of an error saying …
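The usual cause of that error: `ProcessPoolExecutor` pickles arguments to send them to workers, and pickle stores a class by its qualified module-level name. A class created inside a function has no such importable name, so instances of it cannot be pickled. A small demonstration (names here are illustrative, not the asker's `generate_object`):

```python
import pickle

# A class defined at module level pickles fine: pickle records its
# qualified name and the worker process re-imports it.
class Generated:
    def __init__(self, value):
        self.value = value

def make_dynamic_class():
    # A class created inside a function is invisible by name to a
    # worker process, so pickling its instances fails.
    class Dynamic:
        pass
    return Dynamic

obj = pickle.loads(pickle.dumps(Generated(42)))

try:
    pickle.dumps(make_dynamic_class()())
    dynamic_picklable = True
except (pickle.PicklingError, AttributeError):
    dynamic_picklable = False
```

The common fixes are to move the class definition to module level, or to register a custom reduction (`__reduce__`/`copyreg`) that reconstructs the dynamic class in the worker.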
How to create a continuous stream of Python’s concurrent.futures.ProcessPoolExecutor.submits()?
I am able to submit batches of concurrent.futures.ProcessPoolExecutor.submit() where each batch may contain several submit() calls. However, I noticed that if each batch of submits consumes a significant amount of RAM, there can be quite a bit of RAM-usage inefficiency: I need to wait for all futures in the batch t…
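One sketch of a continuous stream that avoids batch-sized RAM spikes: keep a fixed window of in-flight futures and top it up one submit per completion, using `concurrent.futures.wait(..., FIRST_COMPLETED)`. The helper name `stream_submit` and the `max_in_flight` parameter are introduced here for illustration.

```python
import concurrent.futures
import itertools

def stream_submit(pool, fn, args_iter, max_in_flight=4):
    """Yield results while holding at most max_in_flight futures
    (and their inputs/outputs) in memory at any time."""
    args_iter = iter(args_iter)
    in_flight = {pool.submit(fn, a)
                 for a in itertools.islice(args_iter, max_in_flight)}
    while in_flight:
        done, in_flight = concurrent.futures.wait(
            in_flight, return_when=concurrent.futures.FIRST_COMPLETED)
        for fut in done:
            yield fut.result()
            # Refill the window: one new submit per completed future.
            for a in itertools.islice(args_iter, 1):
                in_flight.add(pool.submit(fn, a))
```

Because results are yielded as futures finish, they arrive out of order; sort or tag them if order matters.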
Using concurrent.futures to call a fn in parallel every second
I’ve been trying to get to grips with how I can use concurrent.futures to call a function 3 times every second, without waiting for it to return. I will collect the results after I’ve made all the calls I need to make. Here is where I am at the moment, and I’m surprised that sleep() within t…
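The key point is that `submit()` returns immediately, so a `sleep()` in the submitting loop only paces submissions, while a `sleep()` inside the task runs in a worker thread and does not block the loop. A sketch (the `slow_call` task is a placeholder, with a shortened sleep for demonstration):

```python
import concurrent.futures
import time

def slow_call(i):
    time.sleep(0.2)   # simulated work; runs in a worker, not the loop
    return i

futures = []
with concurrent.futures.ThreadPoolExecutor(max_workers=9) as pool:
    for second in range(2):
        for i in range(3):
            # submit() returns at once; nothing here waits on slow_call.
            futures.append(pool.submit(slow_call, second * 3 + i))
        time.sleep(1)  # pace submissions: 3 calls per second
    # Collect all results only after every call has been made.
    results = [f.result() for f in futures]
```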
Create a separate logger for each process when using concurrent.futures.ProcessPoolExecutor in Python
I am cleaning up a massive CSV data dump. I was able to split the single large file into smaller ones with gawk, initially using a Unix SE query, as the following flow: I have about 12 split CSV files created by the above-mentioned flow, each with ~170K lines. I am using Python 3.7.7 on