I have a web scraping function that fetches data from 190 URLs. To finish quickly I used concurrent.futures.ThreadPoolExecutor. I save that data to a SQL Server database. I have to repeat this whole process every 3 minutes from 9 AM to 4 PM, but when I use a while loop or a scheduler, concurrent.futures no longer works: no error and no output.
# required libraries
import concurrent.futures
import time

import requests

urls = []

def data_fetched(url):
    # data fetching
    # operations on data
    # data saving to SQL server
    return ''

while True:
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(data_fetched, urls)
    time.sleep(60)
I want to repeat all of this every 3 minutes, as in the flow above. Please help me schedule it.
from datetime import datetime as dt, timedelta
import time

start = dt.strptime("09:15:00", "%H:%M:%S")
end = dt.strptime("15:30:00", "%H:%M:%S")

# gap between runs, in minutes
min_gap = 3

# compute the list of run times at 3-minute intervals
arr = [(start + timedelta(minutes=min_gap * i)).strftime("%H:%M:%S")
       for i in range(int((end - start).total_seconds() / 60.0 / min_gap))]

while True:
    weekno = dt.today().weekday()
    # zero-padded so it can match the strftime-formatted entries in arr
    current_time = dt.now().strftime("%H:%M:%S")

    # checks if current time is in the times list
    if weekno < 5 and current_time in arr:
        print('data_loaded')
    else:  # 5 Sat, 6 Sun
        pass
    time.sleep(60)
Inside this while loop I want to call that function using concurrent.futures.
Answer
You can create a separate function that calls data_fetched() and schedule it to run. I assume your urls variable contains the list of URLs and is not an empty list.
from schedule import every, repeat, run_pending
import concurrent.futures
import time

import requests

urls = []

def data_fetched(url):
    # data fetching
    # operations on data
    # data saving to SQL server
    return ''

@repeat(every(3).minutes)
def execute_script():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(data_fetched, urls)

while True:
    run_pending()
    time.sleep(1)
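One detail from the question: the @repeat(every(3).minutes) job runs around the clock, but the scrape should only happen on weekdays from 9 AM to 4 PM. A guard at the top of the job can skip runs outside that window. Here is a minimal sketch; the helper name within_market_hours and the exact window bounds are assumptions based on the question:

```python
from datetime import datetime, time as dtime

def within_market_hours(now=None):
    """Return True on weekdays between 09:00 and 16:00 (assumed window)."""
    now = now or datetime.now()
    return now.weekday() < 5 and dtime(9, 0) <= now.time() <= dtime(16, 0)

# Then, at the top of the scheduled job:
# @repeat(every(3).minutes)
# def execute_script():
#     if not within_market_hours():
#         return  # skip runs outside the trading window
#     ...
```

This keeps the schedule loop simple while confining the actual work to the required hours.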