Checking previous elements in a list with Python and storing a value in a new pandas column based on the previous element

list_Crashes = ['Startup', 'Crash in A', 'Shutdown', 'Crash in B', 'Crash in C', 'Startup', 'Crash in D',
                'Startup', 'Crash in E', 'Crash in F', 'Crash in G', 'Shutdown', 'Crash in X', 'Crash in Y', 'Crash in Z']

I have a table that contains 2 columns. The code should check the previous elements of the list and look for the most recent Startup / Shutdown entry. Example: if a crash comes after a Startup, the State column is filled with Startup next to that crash, as in the table below:

Crashes      State
Crash in A   Startup
Crash in B   Shutdown
Crash in C   Shutdown
Crash in D   Startup
Crash in E   Startup
Crash in F   Startup
Crash in G   Startup
Crash in X   Shutdown
Crash in Y   Shutdown
Crash in Z   Shutdown

The challenge I’m having is that the letters are random each time, so I have to match on “Crash in” in my code and not on specific letters!

Any suggestions on how to do this?

EDIT: Real-life example (each line is an element of a list):

 12:33:04.1753    | Startup Configuration dazdazdazd
 12:35:15.0142    | Crash in A <546464>, thread 61
 12:35:53.0396    | Crash in B <5>, 3e9fc dazdazd
 12:35:54.1664    | Crash in C <70>,bfc690dasfff
 12:35:55.3817    | Crash in D <80>,de5484sdazdazd
 12:36:01.6642    | Crash in E <50>,bfc428fdsfsgdgsgsd
 12:53:34.6462    | System Shutdown
 12:53:48.1724    | Exception: Crash in Y <01>, 38310dazdazdafaga

Code used from @mozway’s Answer :

def gen(lst):
    last_non_crash =''
    for x in lst:
        if  'Crash in' in x:
            last_non_crash = x
        else:
            yield [x, last_non_crash]
dataf = pd.DataFrame(gen(Crashtype), columns = ['Crashes', 'State'])

Output :

                                            Crashes                                              State
0   12:53:34.6462    | [1230.490] System shutdownn   12:36:01.6642    | Exception: Crash in E<50>,...

Expected Output :

      Crashes     State
0  Crash in A   Startup
1  Crash in B   Startup
2  Crash in C   Startup
3  Crash in D   Startup
4  Crash in E   Startup
5  Crash in Y   Shutdown

Answer

IIUC, you can use a generator:

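import pandas as pd

def gen(lst):
    last_non_crash = ''  # most recent non-crash entry (Startup / Shutdown)
    for x in lst:
        if not x.startswith('Crash in'):
            last_non_crash = x
        else:
            # pair each crash with the state that preceded it
            yield [x, last_non_crash]

pd.DataFrame(gen(list_Crashes), columns=['Crashes', 'State'])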

output:

      Crashes     State
0  Crash in A   Startup
1  Crash in B  Shutdown
2  Crash in C  Shutdown
3  Crash in D   Startup
4  Crash in E   Startup
5  Crash in F   Startup
6  Crash in G   Startup
7  Crash in X  Shutdown
8  Crash in Y  Shutdown
9  Crash in Z  Shutdown

input:

list_Crashes = ['Startup', 'Crash in A', 'Shutdown', 'Crash in B', 'Crash in C', 'Startup', 'Crash in D',
                'Startup', 'Crash in E', 'Crash in F', 'Crash in G', 'Shutdown', 'Crash in X', 'Crash in Y', 'Crash in Z']
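
As an alternative to the generator, the same pairing can probably be done with pandas operations alone. This is only a sketch, not part of the original answer: it blanks out the crash entries, forward-fills the remaining Startup / Shutdown markers, and then keeps the crash rows. The names `s`, `is_crash`, `state` and `df` are made up for illustration. One difference to be aware of: a crash that appears before any Startup / Shutdown gets NaN as its state here, rather than the empty string the generator yields.

```python
import pandas as pd

list_Crashes = ['Startup', 'Crash in A', 'Shutdown', 'Crash in B', 'Crash in C', 'Startup', 'Crash in D',
                'Startup', 'Crash in E', 'Crash in F', 'Crash in G', 'Shutdown', 'Crash in X', 'Crash in Y', 'Crash in Z']

s = pd.Series(list_Crashes)
is_crash = s.str.startswith('Crash in')

# Keep only the non-crash markers (crash rows become NaN),
# then carry the latest marker forward onto the rows below it.
state = s.where(~is_crash).ffill()

# Keep the crash rows and pair each with the marker carried onto it.
df = pd.DataFrame({'Crashes': s[is_crash], 'State': state[is_crash]}).reset_index(drop=True)
print(df)
```

For long lists this avoids a Python-level loop, although for a few hundred log lines the generator is likely just as practical and arguably easier to read.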
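
updated answer:

import re
import pandas as pd

def gen(lst):
    last_non_crash = ''
    for x in lst:
        # keep only the relevant token from each raw log line
        m = re.search(r'(Crash in \w+|Shutdown|Startup)', x)
        x = m.group() if m else 'unknown'
        if 'Crash in' not in x:
            last_non_crash = x
        else:
            yield [x, last_non_crash]

# list_Crashes is assumed to hold the raw log lines from the question's edit
pd.DataFrame(gen(list_Crashes), columns=['Crashes', 'State'])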

output:

      Crashes     State
0  Crash in A   Startup
1  Crash in B   Startup
2  Crash in C   Startup
3  Crash in D   Startup
4  Crash in E   Startup
5  Crash in Y  Shutdown
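
To check the regex approach end to end, here is a minimal self-contained sketch that runs it on the log lines from the question's edit. It is a variation on the answer, not the answer itself: `log_lines`, `PATTERN` and `gen_rows` are hypothetical names, and lines that match none of the three patterns are skipped here (an assumption) rather than recorded as 'unknown'.

```python
import re

# Log lines copied from the question's real-life example.
log_lines = [
    ' 12:33:04.1753    | Startup Configuration dazdazdazd',
    ' 12:35:15.0142    | Crash in A <546464>, thread 61',
    ' 12:35:53.0396    | Crash in B <5>, 3e9fc dazdazd',
    ' 12:35:54.1664    | Crash in C <70>,bfc690dasfff',
    ' 12:35:55.3817    | Crash in D <80>,de5484sdazdazd',
    ' 12:36:01.6642    | Crash in E <50>,bfc428fdsfsgdgsgsd',
    ' 12:53:34.6462    | System Shutdown',
    ' 12:53:48.1724    | Exception: Crash in Y <01>, 38310dazdazdafaga',
]

# Note the backslash in \w+: it matches the word that follows "Crash in ".
PATTERN = re.compile(r'Crash in \w+|Shutdown|Startup')

def gen_rows(lines):
    state = ''  # last Startup / Shutdown seen so far
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # assumption: ignore lines with no recognised token
        token = m.group()
        if token.startswith('Crash in'):
            yield [token, state]
        else:
            state = token

rows = list(gen_rows(log_lines))
for row in rows:
    print(row)
```

The resulting `rows` can be passed to `pd.DataFrame(rows, columns=['Crashes', 'State'])` as in the answer. Matching is case-sensitive, so a line spelled like the "System shutdownn" seen in the question's output would not be recognised unless `re.IGNORECASE` is passed to `re.compile`.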
User contributions licensed under: CC BY-SA