I have a column in a dataframe with different strings.
```
Additional Information                                   |
IP=192.168.1.1, MAC ADDR=00:0a:95:9d:68:16, USER=kwfinn
IP=192.168.0.1, MAC ADDR=00:0a:95:9d:68:17, USER=wattray
Undefined System Error
Specific groupname=CUSTGR1
IP=192.168.1.2, MAC ADDR=00:1B:44:11:3A:B7, USER=stwnck
```
What I want to do is create two new columns, IP Address and MAC Address, holding the corresponding values from the column above, so that the expected output looks like this:
```
Additional Information                                   | IP Address  | MAC Address       |
IP=192.168.1.1, MAC ADDR=00:0a:95:9d:68:16, USER=kwfinn  | 192.168.1.1 | 00:0a:95:9d:68:16 |
IP=192.168.0.1, MAC ADDR=00:0a:95:9d:68:17, USER=wattray | 192.168.0.1 | 00:0a:95:9d:68:17 |
Undefined System Error                                   |             |                   |
Specific groupname=CUSTGR1                               |             |                   |
IP=192.168.1.2, MAC ADDR=00:1B:44:11:3A:B7, USER=stwnck  | 192.168.1.2 | 00:1B:44:11:3A:B7 |
```
The problem is that I cannot deal with the rows that do not contain IP and MAC. I tried splitting with np.where as well as finding partial matches, but didn't succeed.
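The difficulty can be reproduced with a short sketch. The frame name df and the helper naive_parse are illustrative assumptions, not part of the original attempt; the data is the sample shown above. Splitting every value on ", " and "=" works for the key=value rows but raises on a row such as "Undefined System Error", which yields no pairs:

```python
import pandas as pd

# Sample data copied from the question; the name `df` is an assumption.
df = pd.DataFrame({
    "Additional Information": [
        "IP=192.168.1.1, MAC ADDR=00:0a:95:9d:68:16, USER=kwfinn",
        "IP=192.168.0.1, MAC ADDR=00:0a:95:9d:68:17, USER=wattray",
        "Undefined System Error",
        "Specific groupname=CUSTGR1",
        "IP=192.168.1.2, MAC ADDR=00:1B:44:11:3A:B7, USER=stwnck",
    ]
})


def naive_parse(value):
    # Hypothetical helper: assumes every value is a ", "-separated
    # list of key=value pairs, which is not true for all rows.
    return dict(pair.split("=") for pair in value.split(", "))


failed = False
try:
    df["Additional Information"].map(naive_parse)
except ValueError as exc:
    # dict() needs 2-element items; a row without "=" gives a 1-element item.
    failed = True
    print("naive parsing failed:", exc)
```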
Answer
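The idea is to use a list comprehension that parses only the values that are not missing (NaN or None) and that contain both ", " and "=", pass the resulting list of dictionaries to the DataFrame constructor, and finally use DataFrame.join to add the new columns to the original: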
```python
import pandas as pd

# Parse "key=value" pairs only for rows that are not missing and contain
# both separators; every other row becomes an empty dict (an all-NaN row).
L = [dict(y.split("=") for y in v.split(", "))
     if pd.notna(v) and ('=' in v) and (', ' in v)
     else {}
     for v in df['Additional Information']]

# Reuse the original index so the later join aligns row by row.
df1 = pd.DataFrame(L, index=df.index)
print(df1)
```

```
            IP           MAC ADDR     USER
0  192.168.1.1  00:0a:95:9d:68:16   kwfinn
1  192.168.0.1  00:0a:95:9d:68:17  wattray
2          NaN                NaN      NaN
3          NaN                NaN      NaN
4  192.168.1.2  00:1B:44:11:3A:B7   stwnck
```

```python
# Keep only the two columns of interest and attach them to the original.
df = df.join(df1[['IP', 'MAC ADDR']])
print(df)
```

```
                            Additional Information           IP
0  IP=192.168.1.1, MAC ADDR=00:0a:95:9d:68:16, US  192.168.1.1
1  IP=192.168.0.1, MAC ADDR=00:0a:95:9d:68:17, US  192.168.0.1
2                           Undefined System Error          NaN
3                       Specific groupname=CUSTGR1          NaN
4  IP=192.168.1.2, MAC ADDR=00:1B:44:11:3A:B7, US  192.168.1.2

            MAC ADDR
0  00:0a:95:9d:68:16
1  00:0a:95:9d:68:17
2                NaN
3                NaN
4  00:1B:44:11:3A:B7
```
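A regex-based variant may also be worth considering. The sketch below is not the approach above: it swaps in Series.str.extract, and its patterns assume that the value after IP= or MAC ADDR= runs up to the next comma. The sample frame is rebuilt from the question so the block is self-contained, and the column names IP Address and MAC Address follow the expected output in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Additional Information": [
        "IP=192.168.1.1, MAC ADDR=00:0a:95:9d:68:16, USER=kwfinn",
        "IP=192.168.0.1, MAC ADDR=00:0a:95:9d:68:17, USER=wattray",
        "Undefined System Error",
        "Specific groupname=CUSTGR1",
        "IP=192.168.1.2, MAC ADDR=00:1B:44:11:3A:B7, USER=stwnck",
    ]
})

col = df["Additional Information"]

# With a single capture group and expand=False, str.extract returns a Series.
# Rows where the pattern does not match become NaN, so no filtering is needed.
df["IP Address"] = col.str.extract(r"\bIP=([^,]+)", expand=False)
df["MAC Address"] = col.str.extract(r"\bMAC ADDR=([^,]+)", expand=False)

print(df[["IP Address", "MAC Address"]])
```

Because unmatched rows simply yield NaN, this variant needs no explicit check for missing values or separators, at the cost of one pattern per extracted field.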