I’m trying to inspect a buffer which contains a binary formatted message, but also contains string data. As an example, I’m using this C code:
    int main (void)
    {
        char buf[100] = "\x01\x02\x03\x04String Data\xAA\xBB\xCC";
        return 0;
    }
I’d like to get a hex dump of what’s in buf, in a format similar to xxd (I don’t care if it’s an exact match; what I’m really looking for is a hex dump side by side with printable chars).
Inside GDB I can use something like:
    (gdb) x /100bx buf
    0x7fffffffdf00: 0x01 0x02 0x03 0x04 0x53 0x74 0x72 0x69
    0x7fffffffdf08: 0x6e 0x67 0x20 0x44 0x61 0x74 0x61 0xaa
    0x7fffffffdf10: 0xbb 0xcc 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf18: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf20: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf28: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf30: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf38: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf40: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf48: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf50: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x7fffffffdf58: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
which is fine, but it’s hard to pick out strings that way… or I can use
    (gdb) x /100bs buf
    0x7fffffffdf00: "\001\002\003\004String Data\252\273\314"
    0x7fffffffdf13: ""
    0x7fffffffdf14: ""
    0x7fffffffdf15: ""
    0x7fffffffdf16: ""
    0x7fffffffdf17: ""
    ...
which makes it hard to read the binary part… the actual messages I’m dealing with have plenty of ASCII NULs in them, too, so really it just looks like a mess.
The best I can come up with is to do this:
(gdb) dump binary memory dump.bin buf buf+100
and then
    $ xxd dump.bin
    0000000: 0102 0304 5374 7269 6e67 2044 6174 61aa  ....String Data.
    0000010: bbcc 0000 0000 0000 0000 0000 0000 0000  ................
    0000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    0000060: 0000 0000                                ....
but it’s a pain to do that every time, and you lose the addresses from the original memory this way. I figured somebody out there has wanted this before, so I’m wondering if anybody has found a way to do it inside GDB.
I’m using GDB 7.4 with Python support built in, so I’m open to the idea of using a pretty printer or similar, but I don’t know how to set that up.
Answer
    (gdb) define xxd
    >dump binary memory dump.bin $arg0 $arg0+$arg1
    >shell xxd dump.bin
    >end
    (gdb) xxd &j 10
    0000000: 0000 0000 0000 0000 0000 0000 4d8c a7f7  ............M...
    0000010: ff7f 0000 0000 0000 0000 0000 c8d7 ffff  ................
    0000020: ff7f 0000 0000 0000
Seems easy enough ;-)
You could likely write a Python script (modern GDB versions have an embedded Python interpreter) to do the same, and get rid of the need to “shell out”.
Update:
Here is a possible Python implementation (save this into xxd.py):
    import gdb  # provided by GDB's embedded interpreter


    class XXD(gdb.Command):
        """xxd <addr> <count>: hex-dump <count> bytes starting at <addr>."""

        def __init__(self):
            super(XXD, self).__init__("xxd", gdb.COMMAND_USER)

        def _PrintLine(self, offset, data, size):
            print('{:08x}: '.format(offset), end='')
            todo = size
            # Print full groups of 4 bytes, separated by spaces.
            while todo >= 4:
                print(''.join('{:02x}'.format(b) for b in data[0:4]), end='')
                todo -= 4
                data = data[4:]  # advance past the 4 bytes just printed
                if todo:
                    print(' ', end='')

            # Print any remaining bytes
            print(''.join('{:02x}'.format(b) for b in data[0:todo]), end='')
            print()
            return size

        def invoke(self, arg, from_tty):
            args = arg.split()
            if len(args) != 2:
                print("xxd: <addr> <count>")
                return
            size = int(args[1])
            addr = gdb.parse_and_eval(args[0])
            inferior = gdb.inferiors()[0]
            data = inferior.read_memory(addr, size).tobytes()
            offset = int(addr)
            # Emit at most 16 bytes per line.
            while size > 0:
                n = self._PrintLine(offset, data, min(len(data), 16))
                size -= n
                offset += n
                data = data[n:]


    XXD()

Use it like so:

    // Sample program x.c
    char foo[] = "abcdefghijklmopqrstuvwxyz";
    int main() { return 0; }

    gcc -g x.c
    gdb -q ./a.out

    (gdb) source xxd.py
    Temporary breakpoint 1, main () at x.c:3
    3 int main() { return 0; }
    (gdb) xxd &foo[0] 18
    00404030: 61626364 65666768 696a6b6c 6d6f7071
    00404040: 7273
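The question also asks for printable characters next to the hex bytes, with the original addresses kept. Here is a minimal sketch of one way that might look. The helper `hexdump_lines` and the command name `hexdump` are made up for illustration, and the sketch assumes a GDB built against Python 3 (so that `read_memory(...).tobytes()` and `int(addr)` behave as in the script above). The formatting helper is plain Python, so it can be tried outside GDB as well.

```python
try:
    import gdb  # only importable inside GDB's embedded interpreter
except ImportError:
    gdb = None


def hexdump_lines(data, base=0, width=16):
    """Yield xxd-like lines: address, hex byte pairs, printable ASCII.

    Hypothetical helper, not part of GDB. `base` is the address of data[0].
    """
    # Width of a full hex column: 2 chars per byte plus a space between pairs.
    hex_width = width * 2 + (width + 1) // 2 - 1
    for off in range(0, len(data), width):
        chunk = bytearray(data[off:off + width])
        pairs = [''.join('{:02x}'.format(b) for b in chunk[i:i + 2])
                 for i in range(0, len(chunk), 2)]
        # Non-printable bytes are shown as '.', as xxd does.
        text = ''.join(chr(b) if 32 <= b < 127 else '.' for b in chunk)
        yield '{:08x}: {}  {}'.format(
            base + off, ' '.join(pairs).ljust(hex_width), text)


if gdb is not None:
    class HexDump(gdb.Command):
        """hexdump <addr> <count> (hypothetical command name)."""

        def __init__(self):
            super(HexDump, self).__init__("hexdump", gdb.COMMAND_USER)

        def invoke(self, arg, from_tty):
            args = gdb.string_to_argv(arg)
            if len(args) != 2:
                print("usage: hexdump <addr> <count>")
                return
            addr = gdb.parse_and_eval(args[0])
            size = int(gdb.parse_and_eval(args[1]))
            data = gdb.selected_inferior().read_memory(addr, size).tobytes()
            for line in hexdump_lines(data, int(addr)):
                print(line)

    HexDump()
```

Saved as, say, hexdump.py and loaded with `source hexdump.py`, a call such as `hexdump buf 100` on the buffer from the question should print lines of the form `7fffffffdf00: 0102 0304 5374 7269 6e67 2044 6174 61aa  ....String Data.`, with real addresses in place of file offsets. Keeping the formatting in a separate function means it can be checked without a running inferior.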