<p>Using MAT the Memory Analysis Tool, by Stephane Carrez, 2015-05-25 (Java 2 Ada, tags: memory, analysis, binutils)</p>
<div class="post-text"><p><a href="https://github.com/stcarrez/mat/wiki/mat">mat</a> assigns a unique number to each event that it collects. The tool reconciles the events that are related by their allocation address, so that you can navigate forward and backward to find who allocates or releases a given memory block. When started, the tool provides a set of interactive commands that you can enter with the <a href="http://cnswww.cns.cwru.edu/php/chet/readline/rltop.html">readline</a> editing capabilities.</p><h4>Instrumenting your application: the file event mode</h4><p>The first way to instrument your application is to run your program and collect the information into a file. Later, when the program has finished, use the <code>mat</code> tool to analyze the results.</p><p><div class="wiki-img-center"><div class="wiki-img-inner"><img src="/images/Ada/mat-analyse.png" longdesc="mat-analyse.png" alt="mat-analyse.png"></img></div></div></p><p>To instrument a program and save the results into a file, you can use the <a href="https://github.com/stcarrez/mat/wiki/matl">matl</a> launcher as follows:</p><pre><code>matl -o name command
</code></pre><p>where <code>command</code> is the command to instrument and <code>name</code> is the file name prefix. The generated file name contains the process ID and has the <code>.mat</code> extension.</p><p>To start the analysis, launch <code>mat</code> with the name of the generated file:</p><pre><code>mat name-xxx.mat
</code></pre><h4>Instrumenting a live application: the TCP/IP socket mode</h4><p>You can also instrument your application and analyze it while your program is running. For this, you will use the TCP/IP socket mode provided by <code>libmat.so</code> and <code>mat</code>. You must start the <code>mat</code> tool first, so that its TCP/IP server is listening before the program connects to it through the <code>libmat.so</code> shared library.</p><p><div class="wiki-img-center"><div class="wiki-img-inner"><img src="/images/Ada/mat-analyse-tcp.png" longdesc="mat-analyse-tcp.png" alt="mat-analyse-tcp.png"></img></div></div></p><p>To use this mode, start <code>mat</code> in a first terminal console with either the <code>-s</code> or the <code>-b</code> option to start the TCP/IP server and wait for a program to connect. For example:</p><pre><code>mat -b 192.168.0.17:4606
</code></pre><p>Then, in a second terminal console start your program through the <a href="https://github.com/stcarrez/mat/wiki/matl">matl</a> launcher as follows:</p><pre><code>matl -s 192.168.0.17:4606 command
</code></pre><p>Here you give the IP address (you may use <code>localhost</code> or any IP) and the TCP/IP port (the default port is <code>4606</code>). The <a href="https://github.com/stcarrez/mat/wiki/mat">mat</a> server may run on a different host with a different architecture.</p><h4>General information</h4><p>When <code>mat</code> reads the events, it detects the endianness and the size of pointers so that it can analyze 32-bit as well as 64-bit applications in little-endian or big-endian format. The <b>info</b> command gives information about the program and the events that have been collected.</p><p>The output presented below comes from the analysis of <a href="http://www.gnu.org/software/gdb/">gdb 7.9.1</a> while it was debugging <a href="https://github.com/stcarrez/mat/wiki/mat">mat</a>.</p><pre><code>mat>info
Pid : 5291
Path : /data/ext/gnu/i386/gdb-7.9.1/gdb/gdb
Endianness : LITTLE_ENDIAN
Memory regions : 24
Events : 0..586514
Duration : 41.26s
Stack frames : 0
Malloc count : 279768
Realloc count : 36272
Free count : 270462
Memory allocated : 8341613
Memory slots : 10484
</code></pre><p>The number of collected events can be quite high, 586514 in the above example, and it can easily exceed several million.</p><h3>Timeline</h3><p>With so many events, you may not know where to start. You can use the <code>timeline</code> command to analyze the events, find interesting groups and report information about them. The command takes a <i>duration</i> parameter that defines the maximum duration of a group in seconds. For each group, the command indicates the event ID range, the number of <code>malloc</code>, <code>realloc</code> and <code>free</code> calls as well as the memory growth or shrinkage during the period.</p><pre><code>mat>timeline 5
Start     End time  Duration  Event range       # malloc  # realloc  # free  Memory
0us       4.32s     4.32s     0..533645         261393    20423      251923  +4618592
15.60s    20.57s    4.96s     533646..542619    4495      102        4379    +438803
20.62s    21.05s    432.68ms  542620..581167    11555     15663      11338   +3198744
26.16s    28.86s    2.70s     581168..582388    539       4          678     -9425
31.40s    32.41s    1.01s     582389..583133    288       11         446     +19246
</code></pre><p>In this sample, the first group corresponds to <code>gdb</code>'s startup, during which it loads the symbol table. The second group corresponds to the command <code>b main</code> followed by <code>run</code> in gdb. The third group is when the program reaches the breakpoint and gdb handles it and reads the DWARF2 debugging information. The last two groups correspond to the <code>step</code>, <code>where</code>, <code>cont</code> and <code>quit</code> commands.</p><h3>Looking at allocation sizes</h3><p>The <code>sizes</code> command also helps when there are many events, as it counts and groups the events by their allocation size. Let's say we want to look further into the third group identified by the <code>timeline 5</code> command: we can use its event ID range as a filter so that the <code>sizes</code> command only takes these events into account. With the <code>-c</code> option, the command prints only a summary:</p><pre><code>mat>sizes -c 542620..581167
Found 9550 different sizes, +3198744 bytes, with 11555 malloc, 15663 realloc, 11338 free
</code></pre><p>There are still many allocations, so we can narrow the filter to report only the memory allocations larger than 75000 bytes. This time we want all the details, so the <code>-c</code> option is omitted:</p><pre><code>mat>sizes 542620..581167 and size > 75000
Event Id range    Time    Event    Size    Count  Total size  Memory
554191..554243    20.75s  realloc  75248   4      300992      -75240
565067            20.88s  malloc   161323  1      161323      +161323
562816            20.85s  malloc   179035  1      179035      +179035
564877            20.87s  malloc   213328  1      213328      +213328
Found 7 different sizes, +730796 bytes, with 7 malloc, 1 realloc, 2 free
</code></pre><h3>Looking at the event</h3><p>When you identify an interesting event, you can use the <code>event</code> command and give it the event ID to dump the information with the complete stack frame. This is probably the most useful command, as the stack frame helps you pinpoint where in the code and through which call path the allocation is made.</p><pre><code>mat>event 554191
75248 bytes reallocated after 20.75s, freed 817us after by event 554243 +8 bytes
Id Frame Address Function
1 0x0000000000408898 _start
2 0x00007F688D18AEC5 __libc_start_main (libc-start.c:321)
3 0x0000000000408855 main (gdb.c:33)
4 0x000000000054999B gdb_main (main.c:1161)
5 0x00000000005459F5 catch_errors (exceptions.c:237)
6 0x0000000000549506 captured_main (main.c:1150)
7 0x00000000005459F5 catch_errors (exceptions.c:237)
8 0x00000000005484C3 captured_command_loop (main.c:329)
9 0x000000000054EABE start_event_loop (event-loop.c:334)
10 0x000000000054EA25 gdb_do_one_event (event-loop.c:296)
11 0x000000000054E755 gdb_wait_for_event (event-loop.c:773)
12 0x0000000000550482 inferior_event_handler (inf-loop.c:57)
13 0x000000000053A218 fetch_inferior_event (infrun.c:3273)
14 0x0000000000537E22 handle_signal_stop (infrun.c:4264)
15 0x000000000061B750 get_current_frame (frame.c:1486)
16 0x000000000054582C catch_exceptions_with_msg (exceptions.c:189)
17 0x000000000061E36C unwind_to_current_frame (frame.c:1451)
18 0x000000000061E081 get_prev_frame (frame.c:2212)
19 0x000000000061D949 get_prev_frame_always_1 (frame.c:1954)
20 0x000000000061B63B compute_frame_id (frame.c:454)
21 0x000000000061EFAF frame_unwind_find_by_frame (frame-unwind.c:157)
22 0x000000000061EBF6 frame_unwind_try_unwinder (frame-unwind.c:106)
23 0x00000000005CCEB4 dwarf2_frame_sniffer (dwarf2-frame.c:1405)
24 0x00000000005CCB38 dwarf2_frame_find_fde (dwarf2-frame.c:1772)
25 0x00000000005CC8E9 dwarf2_build_frame_info (dwarf2-frame.c:2313)
26 0x00000000005CAC4B add_fde (dwarf2-frame.c:1812)
</code></pre><p>This event is a <code>realloc</code> call that reallocates a memory slot to 75248 bytes; the previous slot size was 75240 bytes. The function that makes the call is <code>add_fde</code> in gdb.</p><h3>Looking at frames</h3><p>The <code>frames</code> command analyzes the stack frames of all events and reports the functions with the number of calls and the memory growth that they created. The command takes a level parameter that indicates the stack frame level to take into account. The level counts the stack frames starting from the bottom (the frame closest to the allocation call), so that level 1 should give you the functions that directly call the <a href="http://man7.org/linux/man-pages/man3/malloc.3.html">malloc</a>, <a href="http://man7.org/linux/man-pages/man3/realloc.3.html">realloc</a> and <a href="http://man7.org/linux/man-pages/man3/free.3.html">free</a> functions.</p><p>A filter can be defined to control which allocation events are taken into account. For example, if we want to look at the functions that allocate or free large memory blocks and which are in the third group reported by the <code>timeline</code> command, we can use the following command:</p><pre><code>mat>frames 1 542620..581167 and size > 50000
Level  Size     Count  Function
1      -53328   1      bfd_elf64_slurp_symbol_table (elfcode.h:1172)
1      +743794  8      __GI__obstack_newchunk (obstack.c:269)
1      -187272  3      dwarf2_build_frame_info (dwarf2-frame.c:2451)
1      +53328   1      bfd_elf_get_elf_syms (elf.c:419)
1      -71104   1      _bfd_elf_canonicalize_dynamic_symtab (elf.c:7215)
1      +461056  4      bfd_alloc (opncls.c:956)
1      +25248   3156   add_fde (dwarf2-frame.c:1812)
1      =75248   2      dwarf2_build_frame_info (dwarf2-frame.c:2422)
1      -75248   1      dwarf2_frame_find_fde (dwarf2-frame.c:1772)
1      +71104   1      bfd_elf_get_elf_syms (elf.c:453)
</code></pre><h3>Download</h3><p>You can download the sources of the first release of MAT at <a href="http://download.vacs.fr/mat/mat-1.0.0.tar.gz">http://download.vacs.fr/mat/mat-1.0.0.tar.gz</a> and follow the <a href="https://github.com/stcarrez/mat/wiki/Build">Build</a> instructions.</p><p>Ubuntu 14.04 packages for 64-bit platforms are available:</p><p><a href="http://download.vacs.fr/mat/mat_1.0.0_amd64.deb">http://download.vacs.fr/mat/mat_1.0.0_amd64.deb</a> and <a href="http://download.vacs.fr/mat/libmat_1.0.0_amd64.deb">http://download.vacs.fr/mat/libmat_1.0.0_amd64.deb</a></p><p>Ubuntu 14.04 packages for 32-bit platforms are available:</p><p><a href="http://download.vacs.fr/mat/mat_1.0.0_i386.deb">http://download.vacs.fr/mat/mat_1.0.0_i386.deb</a> and <a href="http://download.vacs.fr/mat/libmat_1.0.0_i386.deb">http://download.vacs.fr/mat/libmat_1.0.0_i386.deb</a></p><h3>Conclusion</h3><p>Unlike <a href="http://valgrind.org/">valgrind</a>, <a href="https://github.com/stcarrez/mat/wiki/mat">mat</a> does not instrument the program. Instead, it overrides the <a href="http://man7.org/linux/man-pages/man3/malloc.3.html">malloc</a>, <a href="http://man7.org/linux/man-pages/man3/realloc.3.html">realloc</a> and <a href="http://man7.org/linux/man-pages/man3/free.3.html">free</a> calls and only monitors these events. This makes the implementation easier to port and allows <a href="https://github.com/stcarrez/mat/wiki/mat">mat</a> to be used on some embedded systems where <a href="http://valgrind.org/">valgrind</a> is more difficult to use (due to portability and memory resource constraints).</p><p>I've used <a href="https://github.com/stcarrez/mat/wiki/mat">mat</a> on several MIPS boards (BCM6362, BCM63168, Vox185) and it was very useful to understand the memory allocations and reduce the memory used by several programs.
It does not yet provide a graphical front end but this will come one day.</p><p>Give it a try, you may find some interesting features in it!</p></div>
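<p>Because only <code>malloc</code>, <code>realloc</code> and <code>free</code> events are monitored, the reconciliation by allocation address mentioned at the beginning of this article can be pictured with a few lines of Python. This is only a minimal sketch and not mat's actual implementation or file format: the <code>Event</code> record, its fields and the <code>reconcile</code> function are hypothetical names made up for the illustration, and the addresses and event numbers in the sample are invented.</p>

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Event:
    """One monitored allocator call (hypothetical layout, not mat's format)."""
    id: int                         # unique event number
    kind: str                       # "malloc", "realloc" or "free"
    addr: int                       # address returned, or released for "free"
    size: int = 0                   # requested size (unused for "free")
    old_addr: Optional[int] = None  # previous address, "realloc" only


def reconcile(events: List[Event]) -> Tuple[Dict[int, int], Dict[int, Event]]:
    """Pair each allocation with the event that releases it, by address.

    Returns (pairs, live): pairs maps an allocating event id to the id of
    the event that released or moved its block, live maps each address
    that is still allocated to the event that allocated it.
    """
    live: Dict[int, Event] = {}
    pairs: Dict[int, int] = {}
    for ev in events:
        if ev.kind == "malloc":
            live[ev.addr] = ev
        elif ev.kind == "free":
            prev = live.pop(ev.addr, None)
            if prev is not None:
                pairs[prev.id] = ev.id
        elif ev.kind == "realloc":
            # The old block is released and a (possibly moved) new one is live.
            prev = live.pop(ev.old_addr, None)
            if prev is not None:
                pairs[prev.id] = ev.id
            live[ev.addr] = ev
    return pairs, live


# Invented sample: a block grows from 75240 to 75248 bytes, then is freed.
sample = [
    Event(1, "malloc", 0x1000, 75240),
    Event(2, "realloc", 0x2000, 75248, old_addr=0x1000),
    Event(3, "free", 0x2000),
    Event(4, "malloc", 0x3000, 64),
]
pairs, live = reconcile(sample)
print(pairs)                          # which event released which allocation
print([ev.id for ev in live.values()])  # allocations never released
```

<p>Following <code>pairs</code> from an event gives the "forward" direction (who released this block), and inverting the dictionary gives the "backward" direction. Whatever remains in <code>live</code> at the end is memory that was not released while events were being collected.</p>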