Skip to content
Advertisement

How to schedule concurrent.future.Threadpool.executor map() function in python

I have one web scraping function which fetches data of 190 URL’s. To complete it fast I used concurrent.future.Threadpool.executor. I am saving that data to SQL Server database. I have to do these all process repeatedly to every 3 mins from 9AM to 4PM. But when I use while loop or scheduler that concurrent future not works. No error and no output.

# required libraries
import request

urls = []

def data_fetched(url):
    # data fetching
    # operations on data
    # data saving to SQL server
    return ''

while True: 
    with concurrent.future.ThreadPool.executor() as executor:
        executor.map(data_fetched, url)
    time.sleep(60)

I want to repeat all these things to every 3 mins, explained flow of code. Please help me how to schedule it.

start = dt.strptime("09:15:00", "%H:%M:%S")
end = dt.strptime("15:30:00", "%H:%M:%S")
# min_gap
min_gap = 3

# compute datetime interval
arr = [(start + timedelta(hours=min_gap*i/60)).strftime("%H:%M:%S")
       for i in range(int((end-start).total_seconds() / 60.0 / min_gap))]
while True:
    weekno = datetime.datetime.today().weekday()
    now = dt.now() # gets current datetime
    hour = str(now.hour) # gets current hour
    minute = str(now.minute) # gets current minute
    second = str(now.second)
    current_time = f"{hour}:{minute}:{second}" # combines current hour and minute

    # checks if current time is in the hours list
    if weekno < 5 and current_time in arr:
            print('data_loaded')
    else:  # 5 Sat, 6 Sun
        pass
    time.sleep(60)

So under these while loop I want to call that function using concurrent.futures.

Advertisement

Answer

You can create a seperate function and schedule it to execute the data_fetched(). I hope your urls variable contains the list of urls and not empty list.

from schedule import every, repeat, run_pending
import time
import request


urls = []

def data_fetched(url):
    # data fetching
    # operations on data
    # data saving to SQL server
    return ''

@repeat(every(3).minutes)
def execute_script():
    with concurrent.future.ThreadPool.executor() as executor:
        executor.map(data_fetched, urls)

while True:
    run_pending()
    time.sleep(1)
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement