
Trying to detect speech using VAD (Voice Activity Detector)

I am able to read the audio, but I get an error when passing it to the VAD (Voice Activity Detector). I think the error occurs because the frames are in bytes. When calling vad.is_speech(frame, sample_rate), should frame be bytes? Here is the code:


Here is the error message:

TypeError                                 Traceback (most recent call last)
     16 speech_frame = []
     17 for frame in frames:
---> 18     is_speech = vad.is_speech(frame, sample_rate)
     19     #print(frames)

C:\Program Files\Python38\lib\site-packages\webrtcvad.py in is_speech(self, buf, sample_rate, length)
     20
     21     def is_speech(self, buf, sample_rate, length=None):
---> 22         length = length or int(len(buf) / 2)
     23         if length * 2 > len(buf):
     24             raise IndexError(

TypeError: object of type 'int' has no len()


Answer

I have solved it. vad.is_speech(frame, sample_rate) takes the buffer and calculates its length with len(), but an integer has no len() in Python. If frames is a single bytes object, looping over it yields integers rather than byte chunks, which is what triggers the error. For example, this throws an error:

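
A minimal sketch of the failure, assuming frames is a single bytes object (the contents here are made up for illustration). Iterating over bytes yields integers, and calling len() on an integer raises the same TypeError that is_speech hits internally:

```python
# Hypothetical stand-in for raw 16-bit PCM audio: four bytes of silence.
frames = b"\x00\x00\x00\x00"

# Iterating over a bytes object yields ints, not 1-byte bytes objects.
items = [frame for frame in frames]
first_type = type(items[0]).__name__

# webrtcvad's is_speech calls len(buf); on an int that raises TypeError.
try:
    len(items[0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(first_type)     # int
print(error_message)  # object of type 'int' has no len()
```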

Use this instead:

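
One way to do that, sketched under the assumption that the audio is mono 16-bit PCM: slice the buffer into fixed-size chunks, since slicing bytes returns bytes. The helper name split_frames and the 30 ms default are illustrative choices, not taken from the original post. WebRTC VAD accepts frames of 10, 20, or 30 ms.

```python
from typing import List


def split_frames(pcm: bytes, sample_rate: int, frame_ms: int = 30) -> List[bytes]:
    """Split mono 16-bit PCM into equal frames, dropping any short tail."""
    bytes_per_frame = int(sample_rate * frame_ms / 1000) * 2  # 2 bytes per sample
    return [
        pcm[start:start + bytes_per_frame]
        for start in range(0, len(pcm) - bytes_per_frame + 1, bytes_per_frame)
    ]


# 16 kHz * 30 ms = 480 samples = 960 bytes per frame.
audio = bytes(2000)
frames = split_frames(audio, 16000)
print(len(frames), len(frames[0]), type(frames[0]).__name__)  # 2 960 bytes
```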

So here is the corrected code:

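
A hedged sketch of what the corrected loop could look like, not the poster's exact code. The function name speech_flags is hypothetical, and StubVad is a made-up stand-in so the example runs without webrtcvad installed. The key point is that each frame passed to is_speech is a bytes slice rather than an int:

```python
from typing import List


def speech_flags(vad, pcm: bytes, sample_rate: int, frame_ms: int = 30) -> List[bool]:
    """Return one is_speech result per complete frame of mono 16-bit PCM."""
    step = int(sample_rate * frame_ms / 1000) * 2  # bytes per frame
    flags = []
    for start in range(0, len(pcm) - step + 1, step):
        frame = pcm[start:start + step]  # a bytes slice, so len(frame) works
        flags.append(vad.is_speech(frame, sample_rate))
    return flags


class StubVad:
    """Hypothetical stand-in so the sketch runs without webrtcvad installed."""

    def is_speech(self, buf: bytes, sample_rate: int) -> bool:
        return any(buf)  # treats any non-zero byte as "speech"


# With the real library this would be: vad = webrtcvad.Vad(3)
audio = bytes(960) + bytes([1]) * 960
print(speech_flags(StubVad(), audio, 16000))  # [False, True]
```

With webrtcvad installed, passing webrtcvad.Vad(3) in place of StubVad() should work the same way, provided the audio is mono 16-bit PCM sampled at 8000, 16000, 32000, or 48000 Hz.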
User contributions licensed under: CC BY-SA