I am having trouble writing a loop in Python that produces the result I want. Here is my issue.
First, I have one text file, urls.txt, which contains multiple URLs. Second, I have multiple JSON files. Let's say there are 5 JSON files.
I want to process the first n lines of urls.txt with 1.json, then the next n lines of urls.txt with 2.json, and so on. After all 5 JSON files have been used, I want to start again from 1.json and repeat the process until all the lines in urls.txt have been processed.
In my case I want to rotate the JSON files after every 100 lines of urls.txt.
I have written some code to do that, but unfortunately I am not able to figure out how to repeat the operation once all the JSON files have been used.
batch_size = 100
JSON_KEY_FILE_PATH = "json_files/"
JSON_FILENAME = '*.json'
json_file_list = glob.glob(JSON_KEY_FILE_PATH + JSON_FILENAME, recursive=True)
itr_length = len(json_file_list)

from itertools import count

def UrlCall(URL_FILE):
    with open(URL_FILE, mode='r') as urllist:
        for j in range(0,itr_length):
            for i in count():
                line = urllist.readlines(20)
                print ('===>' + str(i) + '===>' + str(line))
                if (i/batch_size).is_integer() and line != '' and i != 0 and j != itr_length:
                    # #define the json message
                    print ("Value of J" + str(j))
                    print ("JSON FILE IN USE:" + str(json_file_list[j]))
                    if j == itr_length-1:
                        print ("====>RESTARTING JSON EXECUTION")
                        time.sleep(10)
                        print ("Value of J" + str(j))
                        print ('===>' + str(i) + '===>' + str(line))
                        print ("JSON FILE IN USE:" + str(json_file_list[j]))
                        return
                    break
This code exits after all the JSON files have been used. But I want to restart from the beginning of the JSON file list and process the next n lines in urls.txt.
Answer
You can use itertools.cycle(json_file_list) to loop through the list repeatedly.
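To make the behaviour of cycle() concrete, here is a minimal sketch. The file names are placeholders standing in for the real glob() result, and islice() is used only to stop the otherwise endless iterator for the demonstration.

```python
from itertools import cycle, islice

# Hypothetical file names; in the question these come from glob.glob().
json_file_list = ["1.json", "2.json", "3.json"]

# cycle() never ends by itself, so take a finite number of items with islice().
first_seven = list(islice(cycle(json_file_list), 7))
print(first_seven)
```

After the last file name is produced, the iterator simply starts again from the first one, which is the "restart" the question asks for.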
You can use one of the techniques from "What is the most “pythonic” way to iterate over a list in chunks?" to iterate over the file in groups of N lines.
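One possible chunking technique, sketched here with islice(), is shown below. The helper name batches is made up for this example, and a short list of strings stands in for the lines of the URL file. Unlike a zip_longest-based approach, it yields a shorter final chunk instead of padding it.

```python
from itertools import islice

def batches(iterable, n):
    """Yield lists of up to n items; the last list may be shorter."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return  # the underlying iterator is exhausted
        yield chunk

# Placeholder data standing in for lines read from a file.
chunks = list(batches(["a", "b", "c", "d", "e"], 2))
print(chunks)
```

Because an open file object is itself an iterator over its lines, the same helper should work if a file handle is passed in place of the list.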
Then you can zip the two together, so that each group of lines is paired with the next JSON file in the cycle.
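import json
from itertools import cycle, zip_longest

batch_size = 100  # number of URL lines processed with each JSON file

def grouper(iterable, n, fillvalue=None):
    # Collect items into fixed-length groups; the last group is padded with fillvalue.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def url_call(url_file, json_file_list):
    with open(url_file) as f:
        # cycle() restarts from the first JSON file after the last one is used;
        # zip() stops as soon as the URL file has no more lines.
        for json_file, lines in zip(cycle(json_file_list), grouper(f, batch_size)):
            with open(json_file) as jf:
                json_data = json.load(jf)
            for line in lines:
                if line:  # skip the None padding in the final group
                    # do something with line and json_data
                    pass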
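If you would rather stay closer to the index-based loop in the question, an alternative (not part of the approach above) is to compute which JSON file applies to each line from its line number. The sketch below assumes a hypothetical helper name, assign_json_files, and uses placeholder file names and URLs with a batch size of 3 so the rotation is easy to see.

```python
def assign_json_files(lines, json_files, batch_size):
    """Yield (json_file, line) pairs, switching files every batch_size lines."""
    for i, line in enumerate(lines):
        # Integer division gives the batch number; modulo wraps it around the list.
        json_file = json_files[(i // batch_size) % len(json_files)]
        yield json_file, line

# Placeholder data: two JSON file names and seven "lines".
json_files = ["1.json", "2.json"]
urls = [f"url{i}" for i in range(1, 8)]

assignment = list(assign_json_files(urls, json_files, batch_size=3))
for json_file, url in assignment:
    print(json_file, url)
```

This version needs no padding or grouping helper, but it only tells you which file name goes with each line. If loading a JSON file is expensive, you would probably want to reload it only when the file name changes from one line to the next.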