I have a huge file that I split into a list of lines with text.splitlines(). From these lines I need to extract the information that belongs to one keyword: “ref-p”. What I did is:
for index, line in enumerate(tpr_linee):
    ref = "ref-p"
    a = []
    if ref in line:
        a.append(line)
        print(a)
What I obtained is:
[' ref-p (3x3):']
[' ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}']
[' ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}']
[' ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}']
Now I need to move the three rows of numbers into a dictionary of the form:
{'ref-p': [[number, number, number], [number, number, number], ...]}
Also, in the larger dataset the array is not always 3×3; its shape may differ from file to file.
So my main goal is to find a way to extract all the numbers that belong to ref-p, taking only the numbers and ignoring the first appearance of the ref-p key (the header line).
Answer
I have edited the first part of your code so that the list a is created once, before the loop, and ends up containing all the matching strings to be analysed.
Then I split each string on the “=” sign and strip the curly braces “{” and “}” so that only the comma-separated numbers remain. The header line has no “=”, so it is skipped.
After conversion to float, the values are simply 0.0 and 1.0. Try this:
# Collect every line that mentions the keyword
a = []
for index, line in enumerate(tpr_linee):
    if 'ref-p' in line:
        a.append(line)
print(a)

# Sample data, the same lines the loop above collects
a = [' ref-p (3x3):',
     ' ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}',
     ' ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}',
     ' ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}'
     ]

result = {'ref-p': []}
for strg in a:
    # Only the data rows contain '='; the header line is skipped
    if '=' in strg:
        # Keep the part after '=', drop the braces, split on commas
        num_list = strg.split('=')[-1].strip('{').strip('}').split(',')
        print(num_list)
        result['ref-p'].append([float(e.strip()) for e in num_list])
print(result)
Output
[' 1.00000e+00', ' 0.00000e+00', ' 0.00000e+00']
[' 0.00000e+00', ' 1.00000e+00', ' 0.00000e+00']
[' 0.00000e+00', ' 0.00000e+00', ' 1.00000e+00']
{'ref-p': [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]}
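Since the question mentions that the array may have a different shape in other files, here is a possible variant that wraps the same idea in a function and uses a regular expression instead of split/strip. It is only a sketch: the function name extract_matrix and the variable sample_lines are made up for illustration, and it assumes every data row looks like key[ i]={ n, n, ... } as in the output shown in the question. It does not depend on the number of rows or columns.

```python
import re


def extract_matrix(lines, key):
    """Collect the numeric rows that belong to `key` (hypothetical helper)."""
    # Assumed row format: "<key>[ <index>]={ n, n, ... }".
    # The header line ("ref-p (3x3):") has no "[i]={...}" part, so it never matches.
    row_pattern = re.compile(
        r"^\s*" + re.escape(key) + r"\[\s*\d+\]\s*=\s*\{(.*)\}"
    )
    rows = []
    for line in lines:
        match = row_pattern.match(line)
        if match:
            # float() tolerates the surrounding spaces left after the split
            rows.append([float(x) for x in match.group(1).split(",")])
    return {key: rows}


# Sample lines copied from the question
sample_lines = [
    ' ref-p (3x3):',
    ' ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}',
    ' ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}',
    ' ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}',
]

print(extract_matrix(sample_lines, 'ref-p'))
# {'ref-p': [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]}
```

With the real data you would call extract_matrix(tpr_linee, 'ref-p'). Because the pattern is anchored at the start of the line, a line that merely contains the text ref-p somewhere in the middle is ignored, which is a little stricter than the 'ref-p' in line test.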