Calculating a for loop with different indexes simultaneously

I have the following for loop:

JavaScript

This loop is slow because it iterates 50 times for each row of the data frame; it takes approximately 62 seconds to calculate all the values.

I tried to use a multiprocessing pool, following this question. My code now looks like this:

JavaScript

I run the function asynchronously with different values for the for loop. This takes 19 seconds to complete, and I can see the result of each function call printed correctly, but the final value of dfClosePirce is a dataframe with only one column (Trade Close); the new columns from each async call are not added to it. How can I do this the right way?

Answer

Solution Using NumPy Vectorization

Issue

  1. The line if(index-i > 0): should be if(index-i >= 0): otherwise the case where index-i equals 0 is skipped and we miss a valid difference.
  2. Use ‘Close’ rather than ‘Trade Close’ (this doesn’t matter for performance, but it avoids renaming the column after pulling the data from the web).
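
The boundary condition in item 1 can be illustrated with a small sketch. The original loop is not reproduced here, so this assumes it computes, for each row and each offset i from 1 to 50, the difference between the current close and the close i rows earlier. The function name lagged_differences and the diff_i column names are hypothetical, and this is not necessarily the code the answer used.

```python
import numpy as np
import pandas as pd


def lagged_differences(close, max_lag=50):
    """Assumed computation: diff_i[index] = close[index] - close[index - i].

    Hypothetical helper for illustration; column names are made up.
    """
    values = np.asarray(close, dtype=float)
    n = len(values)
    # Start with NaN everywhere; rows without an earlier value stay NaN.
    out = np.full((n, max_lag), np.nan)
    for i in range(1, max_lag + 1):
        if i >= n:
            break  # no row has a value i steps back
        # All rows with index - i >= 0 are filled in one slice operation.
        # Row i pairs with row 0, the case that "index - i > 0" would skip.
        out[i:, i - 1] = values[i:] - values[:-i]
    columns = [f"diff_{i}" for i in range(1, max_lag + 1)]
    return pd.DataFrame(out, columns=columns)


if __name__ == "__main__":
    demo = lagged_differences([10, 12, 15, 11], max_lag=2)
    print(demo)
```

Under this assumption, the only Python-level loop runs over the offsets; the per-row work is done by array slicing, which is where a speedup over row-by-row iteration would come from.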

Code

JavaScript

Usage

JavaScript

Performance

Summary

  • Posted Code: 37.9 s ± 143 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Numpy Code: 1.56 ms ± 27.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
  • Result: roughly a 24,000-fold speedup (37.9 s vs. 1.56 ms)

Test Code

JavaScript
User contributions licensed under: CC BY-SA