I’m trying to debug my tflite model, which uses custom ops. I’ve found the correspondence between op names (in *.pb) and op ids (in *.tflite), and I’m doing a layer-by-layer comparison to make sure the output differences always stay within 1e-4 (since the error blows up at the end, I want to find the exact place where my custom layer fails), as follows:
Method 1: I use get_tensor to get the output as follows:
from tensorflow.contrib.lite.python import interpreter
# load the model
model = interpreter.Interpreter(model_path='model.tflite')
model.allocate_tensors()
# ... feed the input and call model.invoke() here ...
# get tensors (tensor_ids are the op ids I mapped from the *.pb op names)
tensor_output = {}
for i in tensor_ids:
    tensor_output[i] = model.get_tensor(i)
It shows totally inadequate, seemingly random values (compared to the outputs of the TensorFlow model).
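For reference, the per-tensor comparison itself is roughly the following (a minimal sketch; tf_outputs is a hypothetical dict holding the corresponding outputs computed by running the original *.pb model):
import numpy as np
# compare each TF-Lite tensor against the reference from the *.pb model
for i in tensor_ids:
    max_diff = np.max(np.abs(tensor_output[i] - tf_outputs[i]))
    if max_diff > 1e-4:
        print('tensor %d differs from TensorFlow by %g' % (i, max_diff))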
Method 2: convert the *.pb only up to a certain layer, then repeat, basically:
1. Create a *.pb so that it contains the network only from input up to layer_1.
2. Convert to tflite (so the output is now layer_1) and check the outputs of TF-Lite against TensorFlow.
3. Repeat steps 1-2 for layer_2, layer_3, … outputs.
This method requires much more work and many more conversions, but it correctly shows that for built-in operations the outputs of the tflite and pb models are identical, and the difference only appears in my custom ops (whereas with Method 1 the outputs diverge right away from the first layers).
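For completeness, the conversion step in Method 2 looks roughly like this (a minimal sketch assuming the TF 1.9 tf.contrib.lite.TocoConverter API; 'input', 'layer_1/output' and the input shape are placeholders for my actual node names):
import tensorflow as tf
# convert the frozen graph cut at layer_1, so layer_1 becomes the tflite output
converter = tf.contrib.lite.TocoConverter.from_frozen_graph(
    'model_up_to_layer_1.pb',
    input_arrays=['input'],
    output_arrays=['layer_1/output'],
    input_shapes={'input': [1, 224, 224, 3]})
converter.allow_custom_ops = True  # only needed once the cut graph contains the custom ops
tflite_model = converter.convert()
with open('model_up_to_layer_1.tflite', 'wb') as f:
    f.write(tflite_model)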
Question: Why is the behaviour of get_tensor so strange? Maybe it is because I am using tensorflow 1.9 (when TF-Lite was not yet released and only available in developer preview)?
PS: I am aware of the release of TF-Lite, but I’ve manually compiled TensorFlow 1.9 for my project and now it is hard to change the version.
Answer
I had the same problem a few months ago. The thing is, TF-Lite works quite differently from TensorFlow: it uses static memory allocation and a static execution plan, memory-mapped model files for faster loading, and it is supposed to optimize everything possible in the network’s forward-propagation pipeline.
I’m not a developer of TF-Lite, but I suppose it keeps its memory footprint extremely low by re-using the memory areas that were used for previously computed ops. Let’s look at the idea in the following illustration:
Step 1: first, we feed the inputs into a symbolic tensor I (in parentheses). Let’s say its value is stored in a buffer called buffer_1.
    op1     op2     op3
(I) ----> A ----> B ----> O
_________________________________
^^^     ^^^^^^^^^^^^     ^^^
input   intermediate    output
tensor    tensors       tensor
Step 2: Now, we need to compute op1 on symbolic tensor I to attain the symbolic tensor A. We compute on buffer_1 and store the value of symbolic tensor A in a buffer called buffer_2.
    [op1]      op2     op3
(I) ----> (A) ----> B ----> O
Step 3: Now, we’re computing op2 on symbolic tensor A to attain the symbolic tensor B. We compute on buffer_2 and store the value of symbolic tensor B in a buffer called buffer_3…
     op1      [op2]      op3
 I  ----> (A) ----> (B) ----> O
But wait! Why waste memory on buffer_3 if buffer_1 is now unused, and its contents are useless for getting the output O? So, instead of storing in buffer_3, we will actually store the result of this operation in buffer_1!
That’s the basic idea of efficient memory re-use, which I think is implemented in TF-Lite, given the static graph analysis built into toco and the rest of the toolchain. And that’s why you can’t simply apply get_tensor to non-output tensors.
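To make the idea concrete, here is a toy sketch of such a greedy buffer re-use plan for the chain above (purely illustrative; plan_buffers and consumed_when are made-up names, and this is not TF-Lite’s actual memory planner):
# toy greedy re-use: each tensor grabs a freed buffer if one exists,
# and a buffer is freed as soon as its tensor has been consumed
def plan_buffers(tensors, consumed_when):
    free, assignment, next_id = [], {}, 1
    for t in tensors:
        if free:                      # reuse a buffer whose value is no longer needed
            assignment[t] = free.pop()
        else:                         # otherwise allocate a fresh buffer
            assignment[t] = 'buffer_%d' % next_id
            next_id += 1
        # free every buffer whose tensor was consumed while computing t
        for other, consumer in consumed_when.items():
            if consumer == t and other in assignment:
                free.append(assignment[other])
    return assignment

# I is consumed when computing A (op1), A when computing B (op2), B when computing O (op3)
print(plan_buffers(['I', 'A', 'B', 'O'], {'I': 'A', 'A': 'B', 'B': 'O'}))
# -> {'I': 'buffer_1', 'A': 'buffer_2', 'B': 'buffer_1', 'O': 'buffer_2'}
Note how B ends up in buffer_1: after invocation, reading an intermediate tensor with get_tensor may therefore give you the value of a completely different tensor.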
An easier way to debug?
You’ve mentioned that you’re writing a custom op, so I suppose you’ve built tflite with bazel, right? Then you can actually inject some logging code into Interpreter::Invoke() in the file tensorflow/lite/interpreter.cc. An ugly hack, but it works.
PS: I would be glad if any TensorFlow Lite developer comes across this and comments on it :)