
Convert normal Python code to an MPI code

I have this code that I would like to edit and run as an MPI code. The array mass_array1 in the code is a multi-dimensional array with a total of about 80 million i*j 'iterations'; that is, if I flatten the array into a one-dimensional array, it has 80 million elements.

The code takes almost 2 days to run, which is quite annoying as it is only a small part of the whole project. Since I can log into a cluster and run the code on 20 or so processors (or even more), can someone help me convert this code to an MPI code?

Even an MPI version written in C would work.

#Allotting Black Holes at z=6
import numpy as np
from tqdm import tqdm

bhs=[0]*1000

for i in tqdm(range(0,1000),leave=True):
    bhs[i]=np.zeros(len(mass_array1[i]))
    for j in range(len(mass_array1[i])):
        bhs[i][j]=np.random.lognormal(np.log(Mbhthfit6(mass_array1[i],6)[j]),np.log(5))
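
For illustration, a minimal mpi4py sketch of the kind of split I am asking about could look like the following (this is only a sketch: it assumes mpi4py is installed, uses a random placeholder in place of the real mass_array1, and inlines Mbhthfit6 as defined further down). Each rank handles a strided share of the 1000 sub-arrays and rank 0 gathers the results:

import numpy as np
from mpi4py import MPI

def Mbhthfit6(Mdm, z):
    a = 5.00041824
    b = 0.31992748
    return (10**a)*(Mdm**b)

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # placeholder data; in practice this would be the real mass_array1
    mass_array1 = [np.random.uniform(1e7, 1e9, 800) for _ in range(1000)]
else:
    mass_array1 = None
mass_array1 = comm.bcast(mass_array1, root=0)

rng = np.random.default_rng(rank)  # independent random stream per rank
local = {}
for i in range(rank, len(mass_array1), size):
    local[i] = rng.lognormal(np.log(Mbhthfit6(mass_array1[i], 6)), np.log(5))

gathered = comm.gather(local, root=0)
if rank == 0:
    bhs = {}
    for part in gathered:
        bhs.update(part)
    print(len(bhs), "sub-arrays computed")

With the job script shown below, this would be launched with something like mpirun -np $NSLOTS python script.py instead of ./a.out.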

Current C program using MPI on that cluster:

int main(int argc, char **argv){
  float epsran;
  FILE *fp;
  char str[256];
  fp = fopen("parameterfile.dat","w");
  fprintf(fp,
          " cosmological parameter\n"
          "h:%f\n"
          "omegam:%f\n"
          "omegab:%f\n"
          "omegal:%f\n"
          "sigma8:%f\n"
          "rho0mMpc:%e\n"
          "alpha:%f\n"
          "deltac:%f\n", ndh,
          omegam, omegab, omegal, sigma8, rho0mMpc, alpha, deltac);
  fclose(fp);
  /* MPI test */
  int i, Petot, MyRank;
  clock_t start, end;
  start = clock();
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &Petot);
  MPI_Comm_rank(MPI_COMM_WORLD, &MyRank);
  srand((unsigned)(time(NULL) + MyRank));
  //printf("Hello World %d\n%d", MyRank, Petot);
  float samples[100];
  for(i = 0; i < 100/Petot; i++){
    samples[i] = halo_samples(1.68, 1000);
    outputS(235, 30, varipsapp(samples[i], 0), MyRank*(100/Petot) + i);
  }
  printf("Length:%d", (int)(sizeof(samples)/sizeof(samples[0])));
  /*  FILE *fpw;
  fpw = fopen("Minitial.dat","w");
  for(i = 0; i < MyRank*(100/Petot); i++){
    fprintf(fpw, "%f\n", samples[i]);
  }
  fclose(fpw); */
  MPI_Finalize();
  end = clock();
}

Submitting a job

After this, there is a job.sh file that looks something like this:

#!/bin/sh     
#$ -S /bin/sh                                                                  
#$ -cwd                                          
#$ -V
#$ -N mergertree
#$ -q all.q@messier04
#$ -q all.q@messier05
#$ -pe openmpi10 20 
#$ -o resultfile/out.txt
#$ -e resultfile/error.txt
                                                       
mpirun -np $NSLOTS ./a.out

Mbhthfit6

This is how I have defined Mbhthfit6 in my code:

def Mbhthfit6(Mdm,z):
    a= 5.00041824
    b= 0.31992748
    Mbhth=(10**a)*(Mdm**b)
    return Mbhth
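
Since this function only uses element-wise operations, it also works when Mdm is a whole NumPy array rather than a single number, for example (the masses below are just made-up values for illustration):

import numpy as np

masses = np.array([1.0e8, 5.0e8, 2.0e9])
print(Mbhthfit6(masses, 6))  # returns an array of the same shape as masses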

mass_array1

Here, I have uploaded one of the files (in zip format) that contains the data for mass_array1. https://drive.google.com/file/d/1C-G28OSND7jxqkFZQS3dlW6_40yBN6Fy/view?usp=sharing

You need to unzip the file into a folder and then use the code below to import it in Python

This is my code to import the file (it's only 3 MB):

#import all the files from the directory
import glob
import os
import re
import time
import numpy as np
from tqdm import tqdm

dirlist=["bh2e8"]

mass_array1=[0]*1000
#print(mass_array)
#read all the files
for i,X in enumerate(dirlist):
    exec('filelist=glob.glob("%s/test*.dat")'%(X))
    #exec("mass_array%s=[]"%X)
    initial_mass=[]
    for j,Y in tqdm(enumerate(filelist),position=0, leave=True, total=1000):
        Y=Y.replace(os.sep, '/')
        #Z=int(Y[10:13])
        Z=int(re.findall(r"\d+", Y)[2])
        #print(Z)
        mass_array1[Z]=[]
        #print('i=',Z,end="\r")
        exec("initial_partial=np.loadtxt('%s',max_rows=1)"%(Y))
        exec("initial_mass=np.append(initial_mass,initial_partial)")
        exec("mass_partial=np.loadtxt('%s',skiprows=1)"%(Y))
        mass_array1[Z]=np.append(mass_partial,mass_array1[Z])
        #mass_array1[Z]=mass_partial


Answer

I don't view this as a big enough set of data to require MPI, provided you take an efficient approach to processing the data.

As I mentioned in the comments, I find the best approach to processing large amounts of numerical data is first to use numpy vectorization, then to try numba jit compilation, and to use multi-core processing only as a last resort. In general that follows the order of easiest to hardest, and will also get you the most speed for the least work. In your case I think vectorization is truly the way to go, and while I was at it, I did some re-organization which isn't really necessary, but helped me to keep track of the data.

import numpy as np
from pathlib import Path
import re

dirlist=[r"C:\Users\aaron\Downloads\bh2e8"]
dirlist = [Path(d) for d in dirlist] #convert directory paths to pathlib.Path objects for ease of file system manipulation

initial_mass = {} #use a dictionary so we don't have to preallocate indices
mass_array = {} #use a dictionary so we don't have to preallocate indices

for dir_path in dirlist:
    for child in dir_path.iterdir():
        m = re.match(r".*?test(?P<index>\d+)\.dat$", str(child))
        if m: #if we match the end of the child path as a testxxx.dat file (not another directory or some other file type)
            file_index = int(m["index"])
            with child.open() as f:
                arr = [float(line) for line in f if line.strip()] #1d array of float numbers skipping any empty lines
            initial_mass[file_index] = arr[0]
            mass_array[file_index] = np.array(arr[1:])

I started off reading in the data in a slightly different way because I found it more natural to create a dictionary of arrays so the order they were created wouldn’t matter. The index of the file (number at the end of the file name) is used as the key of the dictionary, so it is easy to convert it back to a list if you want with something like: mass_array = list(mass_array[i] for i in range(1000))

Then, looking at the rest of your code, all the numpy functions you used can process an entire array of data at a time, much faster than one element at a time in your inner loop (j), so I simply removed the inner loop and re-wrote the body to use vectorization:


#Allotting Black Holes at z=6

bhs={} #use a dictionary to avoid the need for preallocation

for i, arr in mass_array.items(): #items in python3 iteritems in python2
    
    #inline Mbhthfit6 function, and calculate using vectorization (compute an entire array at once per iteration of `i`)
    bhs[i] = np.random.lognormal(
                                np.log((10**5.00041824)*(arr**0.31992748)),
                                np.log(5)
                                )

Again, if you want to convert the bhs dictionary back to a list like you previously had, it's quite simple: bhs = list(bhs[i] for i in range(1000))

With these changes (and a relatively powerful PC) the code executed on the data files you provided in under half a second. With just over 700,000 values in the example dataset, extrapolating out to 80 million, that should be on the order of a minute or two.
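
If the full 80-million-element dataset ever turns out to be too slow even after vectorization, the next step in the order above would be multi-core processing. A minimal sketch using the standard-library multiprocessing module (placeholder data stands in for the mass_array dictionary built above; each task creates its own random generator so the workers don't all repeat the same random numbers):

import numpy as np
from multiprocessing import Pool

def process_one(item):
    # compute the lognormal black-hole masses for one (index, array) pair
    i, arr = item
    rng = np.random.default_rng()  # fresh, independent random stream per task
    return i, rng.lognormal(np.log((10**5.00041824)*(arr**0.31992748)), np.log(5))

if __name__ == "__main__":
    # placeholder data standing in for the mass_array dictionary built above
    mass_array = {i: np.random.uniform(1e7, 1e9, 700) for i in range(1000)}

    with Pool() as pool:
        bhs = dict(pool.map(process_one, mass_array.items()))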

P.S. if you find yourself using exec a lot with generated strings of code, you'll almost always find there's a better way to accomplish the same thing, usually with just a slightly different data structure.
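
For instance, the string-built calls in the original file-reading loop each have a direct, exec-free equivalent (using the same variable names as in the question; the directory name here is just the example one):

import glob
import numpy as np

X = "bh2e8"  # directory name, as in the question's dirlist
filelist = glob.glob("%s/test*.dat" % X)  # instead of exec('filelist=glob.glob("%s/test*.dat")'%(X))

for Y in filelist:
    initial_partial = np.loadtxt(Y, max_rows=1)  # instead of exec("initial_partial=np.loadtxt('%s',max_rows=1)"%(Y))
    mass_partial = np.loadtxt(Y, skiprows=1)     # instead of exec("mass_partial=np.loadtxt('%s',skiprows=1)"%(Y))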

User contributions licensed under: CC BY-SA