
Faster processing of a for loop with PUT requests in Python

I have a JSON file that I am reading, processing each element into the payload required by the API.

I am looking for ways to speed this up through multiprocessing or concurrency, though I am not sure of the proper approach. I think either would work, since these are individual requests updating a specific role within the API; the calls can run concurrently without impacting the role itself.

The function I currently have iterates as follows:

newapproach.py

import requests
import json
#from multiprocessing import Pool
import urllib3
import cpack_utility
from cpack_utility import classes
#import concurrent.futures
import time
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
def update_role(data):
    url, header, verifySSL = mApi.role()
    session = requests.Session()
    session.headers.update(header)
    permissions_roleprivs = data["rolePermissions"]["roleprivs"]

    def update_role_permissions():
        start = time.time()
        for k, v in permissions_roleprivs.items():
            perm_code = v["permissionCode"]
            perm_access = v["access"]
            payload = json.dumps(
                {"permissionCode": perm_code, "access": perm_access}
            )
            # each PUT blocks until the server responds, so the round trips
            # run back to back
            result = session.put(url, verify=verifySSL, headers=header, data=payload)
            response = result.json()
            logger.debug(response)
        end = time.time()
        print(f"Time to complete: {round(end - start, 2)}")

    update_role_permissions()

def main(file):
    global mApi
    global logger
    logger = cpack_utility.logging.get_logger("role")
    mApi = classes.morphRequests_config()

    with open(file, 'r') as f:
        data = json.load(f)
    update_role(data)

if __name__ == '__main__':
    main("data.json")

Right now it takes around 60 seconds to process all of the payloads that need to be sent.

logs
2022-06-20 14:39:16,925:88:update_role:update_role_permissions:DEBUG:{'success': True, 'access': 'none'}
2022-06-20 14:39:17,509:88:update_role:update_role_permissions:DEBUG:{'success': True, 'access': 'full'}
2022-06-20 14:39:17,953:88:update_role:update_role_permissions:DEBUG:{'success': True, 'access': 'none'}
2022-06-20 14:39:18,449:88:update_role:update_role_permissions:DEBUG:{'success': True, 'access': 'full'}
2022-06-20 14:39:19,061:88:update_role:update_role_permissions:DEBUG:{'success': True, 'access': 'none'}
2022-06-20 14:39:19,493:88:update_role:update_role_permissions:DEBUG:{'success': True, 'access': 'none'}
2022-06-20 14:39:19,899:88:update_role:update_role_permissions:DEBUG:{'success': True, 'access': 'none'}
Time to complete: 63.22

The JSON file that gets read in contains a number of entries that need updating via the API.

data.json

{
  "rolePermissions":{
    "roleprivs": {
      "admin-appliance": {
        "permissionCode": "admin-appliance",
        "access": "none"
      },
      "admin-backupSettings": {
        "permissionCode": "admin-backupSettings",
        "access": "none"
      }
     }
    }
}
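For reference, pulling the inner `roleprivs` mapping out of that structure and pre-building the payloads is straightforward (a sketch using the sample data above, with the JSON inlined as a string):

```python
import json

raw = """
{
  "rolePermissions": {
    "roleprivs": {
      "admin-appliance": {"permissionCode": "admin-appliance", "access": "none"},
      "admin-backupSettings": {"permissionCode": "admin-backupSettings", "access": "none"}
    }
  }
}
"""

data = json.loads(raw)
# this is the dict the update loop iterates over
permissions_roleprivs = data["rolePermissions"]["roleprivs"]
payloads = [
    json.dumps({"permissionCode": v["permissionCode"], "access": v["access"]})
    for v in permissions_roleprivs.values()
]
```

Building the payload list up front like this also makes it easy to hand the whole batch to an executor later.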

The old version I was testing looked something like the following and used YAML, which was kind of a nightmare to manage.
oldversion.py

import asyncio

def background(f):
    def wrapped(*args, **kwargs):
        # schedule the blocking call on the default thread pool
        return asyncio.get_event_loop().run_in_executor(None, f, *args, **kwargs)
    return wrapped

@background
def role_update_post(strRoleID, access, code):
    url, header, verifySSL = mApi.role()
    session.headers.update(header)
    url = f'{url}/{strRoleID}{mApi.updateRolePermissions()}'
    payload = classes.cl_payload.pl_permissionsRole(code, access)
    result = session.put(url, verify=verifySSL, headers=header, data=payload)
    response = result.json()
    if response["success"]:
        logger.debug(f"Permission updated: {code}")
    else:
        logger.debug("Error updating permission. Enable debugging.")
        logger.debug(f"Result: {response}")
        logger.debug(f"Access: {access}")
        logger.debug(f"Code: {code}")

However, this let the main script finish while pushing the role updates to the background, and then it would stall at the end waiting for the background tasks to complete. It still took the same amount of time; it was just less noticeable.
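For what it's worth, that invisible stall can be made explicit by awaiting the executor futures instead of fire-and-forget scheduling. A minimal sketch of the pattern, with a stand-in `blocking_put` in place of the real request:

```python
import asyncio
import time

def blocking_put(i):
    time.sleep(0.1)  # stand-in for a blocking HTTP PUT
    return i

async def run_all():
    loop = asyncio.get_running_loop()
    # schedule every blocking call on the default thread pool first...
    futures = [loop.run_in_executor(None, blocking_put, i) for i in range(5)]
    # ...then await them all, so the script blocks here, visibly, until done
    return await asyncio.gather(*futures)

results = asyncio.run(run_all())
```

Because `gather` preserves input order, `results` comes back as `[0, 1, 2, 3, 4]` even though the calls overlap.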

Ideally, I think multiprocessing is the route I want, but I am still not quite grasping how to properly parallelize that for loop.
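Since these calls are I/O-bound (the time goes to waiting on the server, not on the CPU), threads are usually sufficient; multiprocessing adds process startup and pickling overhead without helping here. A sketch of the for loop parallelized with `ThreadPoolExecutor.map`, using a stand-in `fake_put` in place of the real request:

```python
import concurrent.futures
import time

def fake_put(payload):
    time.sleep(0.2)  # stand-in for one network round trip
    return {"success": True, "payload": payload}

payloads = [f"payload-{i}" for i in range(8)]

start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
    # map submits everything up front and yields results in input order
    results = list(executor.map(fake_put, payloads))
elapsed = time.time() - start
# the eight 0.2 s waits overlap, so elapsed is roughly 0.2 s instead of ~1.6 s
```

The same shape drops into the real script by swapping `fake_put` for a function that does the `session.put`.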

OR I am just crazy, there is a much better way to do it all, and what I have is an anti-pattern.

UPDATED: this concurrent version actually runs correctly, but it is still just as slow as the serial one.

    def testing2():
        def post_req(payload):
            result = session.put(url, verify=verifySSL, headers=header, data=payload)
            response = result.json()
            logger.debug(response)
            logger.debug('post_req')
            return result
        start = time.time()
        futures = []
        with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
            for k,v in permissions_roleprivs.items():
                perm_code = v["permissionCode"]
                perm_access = v["access"]
                payload = json.dumps(
                    {"permissionCode": perm_code, "access": perm_access}
                )
                futures.append(executor.submit(post_req, payload))
                for future in futures:
                    future.result()
        end = time.time()
        logger.debug('intesting 2')
        print(f"Time to complete: {round(end - start, 2)}")


Answer

So concurrent.futures is exactly the sweetness needed to process this. After a little more testing, a process that used to take 60 to 80 seconds (depending on the server I was hitting) now takes 10 seconds.

    def testing2():
        def post_req(payload):
            result = session.put(url, verify=verifySSL, headers=header, data=payload)
            response = result.json()
            logger.debug(response)
            return result
        start = time.time()
        futures = []
        with concurrent.futures.ThreadPoolExecutor() as executor:
            for k,v in permissions_roleprivs.items():
                perm_code = v["permissionCode"]
                perm_access = v["access"]
                payload = json.dumps(
                    {"permissionCode": perm_code, "access": perm_access}
                )
                futures.append(executor.submit(post_req, payload))
            for future in concurrent.futures.as_completed(futures):
                future.result()
        end = time.time()
        logger.debug('intesting 2')
        print(f"Time to complete: {round(end - start, 2)}")
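One caveat with the snippet above: it shares a single `requests.Session` across worker threads, and `Session` is not documented as thread-safe. A common defensive pattern is one session per worker thread via `threading.local` (sketched here with a plain object standing in for `requests.Session()`):

```python
import threading

_local = threading.local()

def get_session():
    # Lazily create one session per worker thread. In the real script the
    # factory call would be requests.Session(); a bare object stands in here.
    if not hasattr(_local, "session"):
        _local.session = object()
    return _local.session
```

Each worker then calls `get_session()` instead of touching a shared session; repeat calls in the same thread reuse the same object, while each new thread gets its own.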

One of the key mistakes I found in my previous attempts was:

for future in concurrent.futures.as_completed(futures):
                future.result()

I didn't have this properly set up, or in my initial tests it didn't exist at all. Even once I had it in place, I was still seeing 60 seconds.

The next problem was that it sat inside the for loop over roleprivs.items(); pulling it out of that loop made everything process much faster.
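That difference can be demonstrated in isolation. In the sketch below (using a stand-in `work` function), draining the futures inside the submit loop serializes everything, while submitting first and collecting afterwards lets the waits overlap:

```python
import concurrent.futures
import time

def work(i):
    time.sleep(0.2)  # stand-in for one PUT round trip
    return i

# Anti-pattern: waiting on the futures inside the submit loop blocks before
# the next task is ever submitted, so nothing actually overlaps.
start = time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = []
    for i in range(5):
        futures.append(executor.submit(work, i))
        for future in futures:
            future.result()
serial_elapsed = time.time() - start

# Fix: submit everything first, then collect once, outside the loop.
start = time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(work, i) for i in range(5)]
    for future in concurrent.futures.as_completed(futures):
        future.result()
concurrent_elapsed = time.time() - start
```

The first version takes roughly five sleeps back to back (~1 s); the second takes roughly one (~0.2 s), which mirrors the 60-seconds-to-10-seconds improvement above.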

User contributions licensed under: CC BY-SA