The signal is 1 second long with a sampling rate of 16000 Hz. I compute 13 MFCCs with a hop length of 400. The output dimensions are (13, 41). Why do I get 41 frames? Isn’t it supposed to be time*sr/hop_length = 40?

Answer

TL;DR answer

Yes, 41 frames is correct.

Long answer

You are using a time-series as input (signal), which means that librosa first computes a