Issue with conversion of text data into a dataframe

I have a text file containing several lines of text and, in between them, some useful data that I need to convert into a dataframe.

I iterated over the text file line by line and captured the useful data with a regex.

Something like this,

JavaScript

The captured data looks like this:

JavaScript

I thought of iterating over each captured row and splitting it on whitespace, but the problem is that there is whitespace between each value and its unit, for example:

-300.0000 mV, -100.0000 uA etc

Another issue is the trailing newline character, which is also treated as a new element by .split(" ").

Can someone please help me find a smarter way to do this?

All I want is to have each value in a separate column.
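A minimal illustration of both problems, using a made-up row (the real layout is not shown, so the two-space column padding and the trailing space before the newline are assumptions). It also shows two alternatives: `str.split()` with no argument, which discards all whitespace including the newline, and a regex split on runs of two or more whitespace characters, which keeps each value together with its unit.

```python
import re

# Made-up row: columns padded with two spaces, a single space before each unit,
# and a trailing space before the newline (all assumptions for illustration).
line = "100  0  PASS  Continuity_PPMU_mV  -300.0000 mV  -100.0000 uA \n"

# Splitting on a single space leaves empty strings between padded columns
# and turns the trailing newline into an element of its own.
naive = line.split(" ")

# split() with no argument treats any run of whitespace as one separator
# and ignores leading/trailing whitespace, so the newline disappears.
# Values and units end up as separate tokens.
by_any_whitespace = line.split()

# Splitting only on 2+ whitespace characters keeps "value unit" pairs intact.
by_columns = re.split(r"\s{2,}", line.strip())

print(naive)
print(by_any_whitespace)
print(by_columns)
```

Which of the two alternatives fits depends on whether the unit should stay attached to the value or go into a column of its own.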

For example in the first string,

100 becomes the 1st column, 0 the 2nd, PASS the 3rd, Continuity_PPMU_mV the 4th, etc.

Thanks.

Edit:

The raw data looks roughly like this:

JavaScript

EDIT:

The top rows are not fixed, they are dynamically generated. Also, some other text data can appear in between the relevant data, like between two useful rows. So, I don’t think skipping rows will work here.

Answer

  • Read the file and look for the row that starts with 'Number', then append the rows after it to data.
  • In the data rows, only the units are separated by a space.
  • It’s better to have the unit separate from the numeric value, so we can split the rows on spaces.
  • Create a new header, with new columns for the units.
  • This will allow the numeric values to be interpreted as floats.
JavaScript
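A minimal sketch of the steps above, under stated assumptions: the sample text, the column names in `COLUMNS`, and the helper names `parse_rows` and `to_dataframe` are all hypothetical, since the real file layout is not shown. Rows are kept only if they start with an integer and have exactly one token per new column, which is one possible way to skip unrelated lines that appear between data rows.

```python
import io

# Hypothetical sample text; the real file layout is an assumption.
SAMPLE = """\
Some dynamically generated preamble
Number  Site  Result  Test_Name           Low        Measured      High          Force
100     0     PASS    Continuity_PPMU_mV  -1.0000 V  -300.0000 mV  -200.0000 mV  -100.0000 uA
unrelated log line in between
101     0     PASS    Continuity_PPMU_mV  -1.0000 V  -310.0000 mV  -200.0000 mV  -100.0000 uA
"""

# New header (hypothetical names): every measured column gets a unit column.
COLUMNS = [
    "Number", "Site", "Result", "Test_Name",
    "Low", "Low_unit", "Measured", "Measured_unit",
    "High", "High_unit", "Force", "Force_unit",
]
NUMERIC = ["Low", "Measured", "High", "Force"]


def parse_rows(lines):
    """Return whitespace-split data rows found after the 'Number' header row."""
    rows = []
    seen_header = False
    for line in lines:
        tokens = line.split()  # no argument: drops the newline and repeated spaces
        if not tokens:
            continue
        if tokens[0] == "Number":
            seen_header = True
            continue
        # Assumption: a data row starts with an integer and has one token per column.
        if seen_header and tokens[0].isdigit() and len(tokens) == len(COLUMNS):
            rows.append(tokens)
    return rows


def to_dataframe(rows):
    """Build a DataFrame and convert the value columns to floats."""
    import pandas as pd  # imported here so parsing alone needs only the stdlib

    df = pd.DataFrame(rows, columns=COLUMNS)
    df[NUMERIC] = df[NUMERIC].astype(float)
    return df


rows = parse_rows(io.StringIO(SAMPLE))
print(rows)
```

Calling `to_dataframe(rows)` should then give a frame in which `Low`, `Measured`, `High`, and `Force` are floats and each unit sits in its own column. For a real file, `parse_rows` can be passed the open file object instead of the `io.StringIO` wrapper.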
