Skip to content
Advertisement

How to find which doctor a patient is using, when only given a list of doctor’s patients? (code improvement request) [closed]

I need to create a dataframe which lists all patients and their matching doctors.

I have a txt file with doctor/patient records organized in the following format:

Doctor_1: patient23423,patient292837,patient1232423...
Doctor_2: patient456785,patient25363,patient23425665...

And a list of all unique patients.

To do this, I imported the txt file into a doctorsDF dataframe, separated by a colon. I also created a patientsDF dataframe with 2 columns: ‘Patients’ filled from the patient list, and ‘Doctors’ column empty.

I then ran the following:

for pat in patientsDF['Patient']:
    for i, doc in enumerate(doctorsDF[1]):
        if doctorsDF[1][i].find(str(pat)) >= 0 :
            patientsDF['Doctor'][i] = doctorsDF.loc[i,0]
        else: 
            continue

This worked fine, and now all patients are matched with the doctors, but the method seems clumsy. Is there any function that can more cleanly achieve the result? Thanks!

(First StackOverflow post here. Sorry if this is a newb question!)

Advertisement

Answer

If you use Pandas, try:

df = pd.read_csv('data.txt', sep=':', header=None, names=['Doctor', 'Patient'])
df = df[['Doctor']].join(df['Patient'].str.strip().str.split(',')
                                      .explode()).reset_index(drop=True)

Output:

>>> df

     Doctor          Patient
0  Doctor_1     patient23423
1  Doctor_1    patient292837
2  Doctor_1   patient1232423
3  Doctor_2    patient456785
4  Doctor_2     patient25363
5  Doctor_2  patient23425665

How to search:

>>> df.loc[df['Patient'] == 'patient25363', 'Doctor'].squeeze()
'Doctor_2'
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement