
How do I iterate through an entire directory and select only one class from a multi-class file in Python?

I could use some help iterating through a directory with multi-class files. Each sample contains two classes (for example, the first sample in my database is 1001, and this file includes 1001.dat and 1001.hea), and I want to iterate through my directory and access all .dat files separately from .hea files. Right now, simply iterating through the directory produces a File-Not-Found error.

I’ll supply additional source code to give this some context, but first let me show you where I’m stuck.

Using a PhysioNet ECG database, my goal right now is to analyze every .dat file (my example below runs the Dickey-Fuller test, using adfuller from statsmodels.tsa.stattools). I have uploaded my data to Google Colab using the following:

JavaScript

I am able to access a specific sample from my database easily. For example, if I want to read a sample using WFDB, I can do this without a problem:

JavaScript

But when I try to iterate through all of these samples, I run into an issue. Here is what I have so far:

JavaScript

At the commented line, I get the following error:

JavaScript

I believe this is because each file contains two classes, as you can see when I print the type of my file…

Sourcecode:

JavaScript

Result:

JavaScript

So, what I want to do is specify the first class ‘str’ (which is .dat). I only need to use the data contained in 1001.dat, etc. I just don’t know how to specify this in Python.

Now, as promised, some more code for more context.

All this stuff works:

JavaScript

This is what I’m working on now. My syntax might not be entirely correct for the body of my for loop (again, I’m a Python newbie) but I can figure out the rest if I can access the correct samples for each iteration:

JavaScript

Thank you, and absolutely let me know how I could have formatted this post better. I am still learning how to post useful questions on this platform. This is my 3rd question ever on StackOverflow.


Answer

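I found the answer by exploring the rdsamp() function a little more.

rdsamp() expects a record name without an extension and locates the matching .hea and .dat files itself. This is why rdsamp('1001') works, while passing a full file name such as '1001.dat' raises the File-Not-Found error.

The solution, then, is to strip the last 4 characters (the extension, e.g. .dat) from each file name before passing it to rdsamp():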

for file in uploaded:
    print(file[:-4])
    file = wfdb.rdsamp(file[:-4])
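Note that the loop above also visits the .hea files, so each record ends up being read twice. A minimal sketch of one way around that, assuming `uploaded` maps file names to contents as in the question: keep only the .dat names and strip the extension with `os.path.splitext` rather than a fixed slice. The helper name `record_names` and the sample dictionary are hypothetical, and the `wfdb.rdsamp` call is left commented out because it needs the wfdb package and the record files on disk.

```python
import os


def record_names(filenames):
    """Return the sorted, unique record names for the .dat files in `filenames`."""
    names = set()
    for filename in filenames:
        stem, ext = os.path.splitext(filename)
        # Keep only the signal files; each .hea header shares its record's stem.
        if ext.lower() == ".dat":
            names.add(stem)
    return sorted(names)


# Hypothetical stand-in for the `uploaded` dict from the question.
uploaded = {"1001.dat": b"", "1001.hea": b"", "1002.dat": b"", "1002.hea": b""}

for name in record_names(uploaded):
    print(name)
    # signals, fields = wfdb.rdsamp(name)  # needs wfdb and the files on disk
```

Using `os.path.splitext` instead of `file[:-4]` also avoids cutting the wrong number of characters if an extension is not exactly three letters long.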

User contributions licensed under: CC BY-SA