aboutsummaryrefslogtreecommitdiff
path: root/en/devices/tech/debug/native-crash.html
blob: fbebb9341d1dfdc862aac99a6bb70a80f2cdf980 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
<html devsite>
  <head>
    <title>Diagnosing Native Crashes</title>
    <meta name="project_path" value="/_project.yaml" />
    <meta name="book_path" value="/_book.yaml" />
  </head>
  <body>
  <!--
      Copyright 2017 The Android Open Source Project

      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
  -->



<p>
If you've never seen a native crash before, start with
<a href="/devices/tech/debug/index.html">Debugging Native Android
Platform Code</a>.
</p>

<h2 id=crashtypes>Types of native crash</h2>
<p>
The sections below detail the most common kinds of native crash. Each includes
an example chunk of <code>debuggerd</code> output, with the key evidence that helps you
distinguish that specific kind of crash highlighted in orange italic text.
</p>
<h3 id=abort>Abort</h3>
<p>
Aborts are interesting because they're deliberate. There are many different ways
to abort (including calling <code><a
href="http://man7.org/linux/man-pages/man3/abort.3.html">abort(3)</a></code>,
failing an <code><a
href="http://man7.org/linux/man-pages/man3/assert.3.html">assert(3)</a></code>,
using one of the Android-specific fatal logging types), but they all involve
calling <code>abort</code>. A call to <code>abort</code> basically signals the
calling thread with SIGABRT, so a frame showing "abort" in <code>libc.so</code>
plus SIGABRT are the things to look for in the <code>debuggerd</code> output to
recognize this case.</p>

<p>
As mentioned above, there may be an explicit "abort message" line. You
should also look in the <code>logcat</code> output to see what this thread logged before
deliberately killing itself, because unlike assert(3) or high level fatal logging
facilities, abort(3) doesn't accept a message.
</p>

<p>
Current versions of Android inline the <code><a
href="http://man7.org/linux/man-pages/man2/tgkill.2.html">tgkill(2)</a></code>
system call, so their stacks are the easiest to read, with the call to abort(3)
at the very top:
</p>
<pre class="devsite-click-to-copy">
pid: 4637, tid: 4637, name: crasher  >>> crasher <<<
signal 6 (<i style="color:Orange">SIGABRT</i>), code -6 (SI_TKILL), fault addr --------
<i style="color:Orange">Abort message</i>: 'some_file.c:123: some_function: assertion "false" failed'
    r0  00000000  r1  0000121d  r2  00000006  r3  00000008
    r4  0000121d  r5  0000121d  r6  ffb44a1c  r7  0000010c
    r8  00000000  r9  00000000  r10 00000000  r11 00000000
    ip  ffb44c20  sp  ffb44a08  lr  eace2b0b  pc  eace2b16
backtrace:
    #00 pc 0001cb16  /system/lib/<i style="color:Orange">libc.so</i> (<i style="color:Orange">abort</i>+57)
    #01 pc 0001cd8f  /system/lib/libc.so (__assert2+22)
    #02 pc 00001531  /system/bin/crasher (do_action+764)
    #03 pc 00002301  /system/bin/crasher (main+68)
    #04 pc 0008a809  /system/lib/libc.so (__libc_init+48)
    #05 pc 00001097  /system/bin/crasher (_start_main+38)
</pre>

<p>
Older versions of Android followed a convoluted path between the original abort call (frame 4 here)
and the actual sending of the signal (frame 0 here). This was especially true on 32-bit ARM,
which added <code>__libc_android_abort</code> (frame 3 here) to the other platforms' sequence of
<code>raise</code>/<code>pthread_kill</code>/<code>tgkill</code>:
</p>
<pre class="devsite-click-to-copy">
pid: 1656, tid: 1656, name: crasher  >>> crasher <<<
signal 6 (<i style="color:Orange">SIGABRT</i>), code -6 (SI_TKILL), fault addr --------
<i style="color:Orange">Abort message</i>: 'some_file.c:123: some_function: assertion "false" failed'
    r0 00000000  r1 00000678  r2 00000006  r3 f70b6dc8
    r4 f70b6dd0  r5 f70b6d80  r6 00000002  r7 0000010c
    r8 ffffffed  r9 00000000  sl 00000000  fp ff96ae1c
    ip 00000006  sp ff96ad18  lr f700ced5  pc f700dc98  cpsr 400b0010
backtrace:
    #00 pc 00042c98  /system/lib/libc.so (tgkill+12)
    #01 pc 00041ed1  /system/lib/libc.so (pthread_kill+32)
    #02 pc 0001bb87  /system/lib/libc.so (raise+10)
    #03 pc 00018cad  /system/lib/libc.so (__libc_android_abort+34)
    #04 pc 000168e8  /system/lib/<i style="color:Orange">libc.so</i> (<i style="color:Orange">abort</i>+4)
    #05 pc 0001a78f  /system/lib/libc.so (__libc_fatal+16)
    #06 pc 00018d35  /system/lib/libc.so (__assert2+20)
    #07 pc 00000f21  /system/xbin/crasher
    #08 pc 00016795  /system/lib/libc.so (__libc_init+44)
    #09 pc 00000abc  /system/xbin/crasher
</pre>
<p>
You can reproduce an instance of this type of crash using <code>crasher
abort</code>.
</p>
<h3 id=nullpointer>Pure null pointer dereference</h3>
<p>
This is the classic native crash, and although it's just a special case of the
next crash type, it's worth mentioning separately because it usually requires
the least thought.
</p>
<p>
In the example below, even though the crashing function is in
<code>libc.so</code>, because the string functions just operate on the pointers
they're given, you can infer that <code><a
href="http://man7.org/linux/man-pages/man3/strlen.3.html">strlen(3)</a></code>
was called with a null pointer; and this crash should go straight to the author
of the calling code. In this case, frame #01 is the bad caller.
</p>

<pre class="devsite-click-to-copy">
pid: 25326, tid: 25326, name: crasher  >>> crasher <<<
signal 11 (<i style="color:Orange">SIGSEGV</i>), code 1 (SEGV_MAPERR), <i style="color:Orange">fault addr 0x0</i>
    r0 00000000  r1 00000000  r2 00004c00  r3 00000000
    r4 ab088071  r5 fff92b34  r6 00000002  r7 fff92b40
    r8 00000000  r9 00000000  sl 00000000  fp fff92b2c
    ip ab08cfc4  sp fff92a08  lr ab087a93  pc efb78988  cpsr 600d0030

backtrace:
    #00 pc 00019988  /system/lib/libc.so (strlen+71)
    #01 pc 00001a8f  /system/xbin/crasher (strlen_null+22)
    #02 pc 000017cd  /system/xbin/crasher (do_action+948)
    #03 pc 000020d5  /system/xbin/crasher (main+100)
    #04 pc 000177a1  /system/lib/libc.so (__libc_init+48)
    #05 pc 000010e4  /system/xbin/crasher (_start+96)
</pre>
<p>
You can reproduce an instance of this type of crash using <code>crasher
strlen-NULL</code>.
</p>
<h3 id=lowaddress>Low-address null pointer dereference</h3>
<p>
In many cases the fault address won't be 0, but some other low number. Two- or
three-digit addresses in particular are very common, whereas a six-digit address
is almost certainly not a null pointer dereference&#8212that would require a 1MiB
offset. This usually occurs when you have code that dereferences a null pointer
as if it was a valid struct. Common functions are <code><a
href="http://man7.org/linux/man-pages/man3/fprintf.3.html">fprintf(3)</a></code>
(or any other function taking a FILE*) and <code><a
href="http://man7.org/linux/man-pages/man3/readdir.3.html">readdir(3)</a></code>,
because code often fails to check that the <code><a
href="http://man7.org/linux/man-pages/man3/fopen.3.html">fopen(3)</a></code> or
<code><a
href="http://man7.org/linux/man-pages/man3/opendir.3.html">opendir(3)</a></code>
call actually succeeded first.
</p>

<p>
Here's an example of <code>readdir</code>:
</p>

<pre class="devsite-click-to-copy">
pid: 25405, tid: 25405, name: crasher  >>> crasher <<<
signal 11 (<i style="color:Orange">SIGSEGV</i>), code 1 (SEGV_MAPERR), <i style="color:Orange">fault addr 0xc</i>
    r0 0000000c  r1 00000000  r2 00000000  r3 3d5f0000
    r4 00000000  r5 0000000c  r6 00000002  r7 ff8618f0
    r8 00000000  r9 00000000  sl 00000000  fp ff8618dc
    ip edaa6834  sp ff8617a8  lr eda34a1f  pc eda618f6  cpsr 600d0030

backtrace:
    #00 pc 000478f6  /system/lib/libc.so (pthread_mutex_lock+1)
    #01 pc 0001aa1b  /system/lib/libc.so (readdir+10)
    #02 pc 00001b35  /system/xbin/crasher (readdir_null+20)
    #03 pc 00001815  /system/xbin/crasher (do_action+976)
    #04 pc 000021e5  /system/xbin/crasher (main+100)
    #05 pc 000177a1  /system/lib/libc.so (__libc_init+48)
    #06 pc 00001110  /system/xbin/crasher (_start+96)
</pre>
<p>
Here the direct cause of the crash is that <code><a
href="http://man7.org/linux/man-pages/man3/pthread_mutex_lock.3p.html">pthread_mutex_lock(3)</a></code>
has tried to access address 0xc (frame 0). But the first thing
<code>pthread_mutex_lock</code> does is dereference the <code>state</code>
element of the <code>pthread_mutex_t*</code> it was given. If you look at the
source, you can see that element is at offset 0 in the struct, which tells you
that <code>pthread_mutex_lock</code> was given the invalid pointer 0xc. From the
frame 1 you can see that it was given that pointer by <code>readdir</code>,
which extracts the <code>mutex_</code> field from the <code>DIR*</code> it's
given. Looking at that structure, you can see that <code>mutex_</code> is at
offset <code>sizeof(int) + sizeof(size_t) + sizeof(dirent*)</code> into
<code>struct DIR</code>, which on a 32-bit device is 4 + 4 + 4 = 12 = 0xc, so
you found the bug: <code>readdir</code> was passed a null pointer by the caller.
At this point you can paste the stack into the stack tool to find out
<em>where</em> in logcat this happened.</p>

<pre class="prettyprint">
  struct DIR {
    int fd_;
    size_t available_bytes_;
    dirent* next_;
    pthread_mutex_t mutex_;
    dirent buff_[15];
    long current_pos_;
  };
</pre>
<p>
In most cases you can actually skip this analysis. A sufficiently low fault
address usually means you can just skip any <code>libc.so</code> frames in the
stack and directly accuse the calling code. But not always, and this is how you
would present a compelling case.
</p>
<p>
You can reproduce instances of this kind of crash using <code>crasher
fprintf-NULL</code> or <code>crasher readdir-NULL</code>.
</p>
<h3 id=fortify>FORTIFY failure</h3>
<p>
A FORTIFY failure is a special case of an abort that occurs when the C library
detects a problem that might lead to a security vulnerability. Many C library
functions are <em>fortified</em>; they take an extra argument that tells them how large
a buffer actually is and check at run time whether the operation you're trying
to perform actually fits. Here's an example where the code tries to
<code>read(fd, buf, 32)</code> into a buffer that's actually only 10 bytes
long...
</p>
<pre class="devsite-click-to-copy">
pid: 25579, tid: 25579, name: crasher  >>> crasher <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
Abort message: '<i style="color:Orange">FORTIFY: read: prevented 32-byte write into 10-byte buffer'</i>
    r0 00000000  r1 000063eb  r2 00000006  r3 00000008
    r4 ff96f350  r5 000063eb  r6 000063eb  r7 0000010c
    r8 00000000  r9 00000000  sl 00000000  fp ff96f49c
    ip 00000000  sp ff96f340  lr ee83ece3  pc ee86ef0c  cpsr 000d0010

backtrace:
    #00 pc 00049f0c  /system/lib/libc.so (tgkill+12)
    #01 pc 00019cdf  /system/lib/libc.so (abort+50)
    #02 pc 0001e197  /system/lib/libc.so (<i style="color:Orange">__fortify_fatal</i>+30)
    #03 pc 0001baf9  /system/lib/libc.so (__read_chk+48)
    #04 pc 0000165b  /system/xbin/crasher (do_action+534)
    #05 pc 000021e5  /system/xbin/crasher (main+100)
    #06 pc 000177a1  /system/lib/libc.so (__libc_init+48)
    #07 pc 00001110  /system/xbin/crasher (_start+96)
</pre>
<p>
You can reproduce an instance of this type of crash using <code>crasher
fortify</code>.
</p>
<h3 id=stackcorruption>Stack corruption detected by -fstack-protector</h3>
<p>
The compiler's <code>-fstack-protector</code> option inserts checks into
functions with on-stack buffers to guard against buffer overruns. This option is
on by default for platform code but not for apps. When this option is enabled,
the compiler adds instructions to the <a
href="https://en.wikipedia.org/wiki/Function_prologue">function prologue</a> to
write a random value just past the last local on the stack and to the function
epilogue to read it back and check that it's not changed. If that value changed,
it was overwritten by a buffer overrun, so the epilogue calls
<code>__stack_chk_fail</code> to log a message and abort.
</p>
<pre class="devsite-click-to-copy">
pid: 26717, tid: 26717, name: crasher  >>> crasher <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
<i style="color:Orange">Abort message: 'stack corruption detected'</i>
    r0 00000000  r1 0000685d  r2 00000006  r3 00000008
    r4 ffd516d8  r5 0000685d  r6 0000685d  r7 0000010c
    r8 00000000  r9 00000000  sl 00000000  fp ffd518bc
    ip 00000000  sp ffd516c8  lr ee63ece3  pc ee66ef0c  cpsr 000e0010

backtrace:
    #00 pc 00049f0c  /system/lib/libc.so (tgkill+12)
    #01 pc 00019cdf  /system/lib/libc.so (abort+50)
    #02 pc 0001e07d  /system/lib/libc.so (__libc_fatal+24)
    #03 pc 0004863f  /system/lib/libc.so (<i style="color:Orange">__stack_chk_fail</i>+6)
    #04 pc 000013ed  /system/xbin/crasher (smash_stack+76)
    #05 pc 00001591  /system/xbin/crasher (do_action+280)
    #06 pc 00002219  /system/xbin/crasher (main+100)
    #07 pc 000177a1  /system/lib/libc.so (__libc_init+48)
    #08 pc 00001144  /system/xbin/crasher (_start+96)
</pre>
<p>
You can distinguish this from other kinds of abort by the presence of
<code>__stack_chk_fail</code> in the backtrace and the specific abort message.
</p>
<p>
You can reproduce an instance of this type of crash using <code>crasher
smash-stack</code>.
</p>
<h3 id="seccomp">Seccomp SIGSYS from a disallowed system call</h3>
<p>
The <a href="https://en.wikipedia.org/wiki/Seccomp">seccomp</a> system (specifically seccomp-bpf)
restricts access to system calls. For more information about seccomp for platform developers, see
the blog post
<a href="https://android-developers.googleblog.com/2017/07/seccomp-filter-in-android-o.html">Seccomp filter in Android O</a>.
A thread that calls a restricted system call
will receive a SIGSYS signal with code SYS_SECCOMP. The system call number will be shown in the
cause line, along with the architecture. It is important to note that system call numbers vary
between architectures. For example, the readlinkat(2) system call is number 305 on x86
but 267 on x86-64. The call number is different again on both arm and arm64. Because system call
numbers vary between architectures, it's usually easier to use the stack trace to find out which
system call was disallowed rather than looking for the system call number in the headers.
</p>
<pre class="devsite-click-to-copy">
pid: 11046, tid: 11046, name: crasher  >>> crasher <<<
signal 31 (SIGSYS), code 1 (<i style="color:Orange">SYS_SECCOMP</i>), fault addr --------
<i style="color:Orange">Cause: seccomp prevented call to disallowed arm system call 99999</a>
    r0 cfda0444  r1 00000014  r2 40000000  r3 00000000
    r4 00000000  r5 00000000  r6 00000000  r7 0001869f
    r8 00000000  r9 00000000  sl 00000000  fp fffefa58
    ip fffef898  sp fffef888  lr 00401997  pc f74f3658  cpsr 600f0010

backtrace:
    #00 pc 00019658  /system/lib/libc.so (syscall+32)
    #01 pc 00001993  /system/bin/crasher (do_action+1474)
    #02 pc 00002699  /system/bin/crasher (main+68)
    #03 pc 0007c60d  /system/lib/libc.so (__libc_init+48)
    #04 pc 000011b0  /system/bin/crasher (_start_main+72)
</pre>
<p>
You can distinguish disallowed system calls from other crashes by the presence of
<code>SYS_SECCOMP</code> on the signal line and the description on the cause line.
</p>
<p>
You can reproduce an instance of this type of crash using <code>crasher
seccomp</code>.
</p>


<h2 id=crashdump>Crash dumps</h2>

<p>If you don't have a specific crash that you're investigating right now,
the platform source includes a tool for testing <code>debuggerd</code> called crasher. If
you <code>mm</code> in <code>system/core/debuggerd/</code> you'll get both a <code>crasher</code>
and a <code>crasher64</code> on your path (the latter allowing you to test
64-bit crashes). Crasher can crash in a large number of interesting ways based
on the command line arguments you provide. Use <code>crasher --help</code>
to see the currently supported selection.</p>

<p>To introduce the different pieces in a crash dump, let's work through this example crash dump:</p>

<pre class="devsite-click-to-copy">
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Android/aosp_flounder/flounder:5.1.51/AOSP/enh08201009:eng/test-keys'
Revision: '0'
ABI: 'arm'
pid: 1656, tid: 1656, name: crasher  &gt;&gt;&gt; crasher &lt;&lt;&lt;
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
Abort message: 'some_file.c:123: some_function: assertion "false" failed'
    r0 00000000  r1 00000678  r2 00000006  r3 f70b6dc8
    r4 f70b6dd0  r5 f70b6d80  r6 00000002  r7 0000010c
    r8 ffffffed  r9 00000000  sl 00000000  fp ff96ae1c
    ip 00000006  sp ff96ad18  lr f700ced5  pc f700dc98  cpsr 400b0010
backtrace:
    #00 pc 00042c98  /system/lib/libc.so (tgkill+12)
    #01 pc 00041ed1  /system/lib/libc.so (pthread_kill+32)
    #02 pc 0001bb87  /system/lib/libc.so (raise+10)
    #03 pc 00018cad  /system/lib/libc.so (__libc_android_abort+34)
    #04 pc 000168e8  /system/lib/libc.so (abort+4)
    #05 pc 0001a78f  /system/lib/libc.so (__libc_fatal+16)
    #06 pc 00018d35  /system/lib/libc.so (__assert2+20)
    #07 pc 00000f21  /system/xbin/crasher
    #08 pc 00016795  /system/lib/libc.so (__libc_init+44)
    #09 pc 00000abc  /system/xbin/crasher
Tombstone written to: /data/tombstones/tombstone_06
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
</pre>

<p>The line of asterisks with spaces is helpful if you're searching a log
for native crashes. The string "*** ***" rarely shows up in logs other than
at the beginning of a native crash.</p>

<pre class="devsite-click-to-copy">
Build fingerprint:
'Android/aosp_flounder/flounder:5.1.51/AOSP/enh08201009:eng/test-keys'
</pre>

<p>The fingerprint lets you identify exactly which build the crash occurred
on. This is exactly the same as the <code>ro.build.fingerprint</code> system property.</p>

<pre class="devsite-click-to-copy">
Revision: '0'
</pre>

<p>The revision refers to the hardware rather than the software. This is
usually unused but can be useful to help you automatically ignore bugs known
to be caused by bad hardware. This is exactly the same as the <code>ro.revision</code>
system property.</p>

<pre class="devsite-click-to-copy">
ABI: 'arm'
</pre>

<p>The ABI is one of arm, arm64, mips, mips64, x86, or x86-64. This is
mostly useful for the <code>stack</code> script mentioned above, so that it knows
what toolchain to use.</p>

<pre class="devsite-click-to-copy">
pid: 1656, tid: 1656, name: crasher &gt;&gt;&gt; crasher &lt;&lt;&lt;
</pre>

<p>This line identifies the specific thread in the process that crashed. In
this case, it was the process' main thread, so the process ID and thread
ID match. The first name is the thread name, and the name surrounded by
&gt;&gt;&gt; and &lt;&lt;&lt; is the process name. For an app, the process name
is typically the fully-qualified package name (such as com.facebook.katana),
which is useful when filing bugs or trying to find the app in Google Play. The
pid and tid can also be useful in finding the relevant log lines preceding
the crash.</p>

<pre class="devsite-click-to-copy">
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
</pre>

<p>This line tells you which signal (SIGABRT) was received, and more about
how it was received (SI_TKILL). The signals reported by <code>debuggerd</code> are SIGABRT,
SIGBUS, SIGFPE, SIGILL, SIGSEGV, and SIGTRAP. The signal-specific codes vary
based on the specific signal.</p>

<pre class="devsite-click-to-copy">
Abort message: 'some_file.c:123: some_function: assertion "false" failed'
</pre>

<p>Not all crashes will have an abort message line, but aborts will. This
is automatically gathered from the last line of fatal logcat output for
this pid/tid, and in the case of a deliberate abort is likely to give an
explanation of why the program killed itself.</p>

<pre class="devsite-click-to-copy">
r0 00000000 r1 00000678 r2 00000006 r3 f70b6dc8
r4 f70b6dd0 r5 f70b6d80 r6 00000002 r7 0000010c
r8 ffffffed r9 00000000 sl 00000000 fp ff96ae1c
ip 00000006 sp ff96ad18 lr f700ced5 pc f700dc98 cpsr 400b0010
</pre>

<p>The register dump shows the content of the CPU registers at the time the
signal was received. (This section varies wildly between ABIs.) How useful
these are will depend on the exact crash.</p>

<pre class="devsite-click-to-copy">
backtrace:
    #00 pc 00042c98 /system/lib/libc.so (tgkill+12)
    #01 pc 00041ed1 /system/lib/libc.so (pthread_kill+32)
    #02 pc 0001bb87 /system/lib/libc.so (raise+10)
    #03 pc 00018cad /system/lib/libc.so (__libc_android_abort+34)
    #04 pc 000168e8 /system/lib/libc.so (abort+4)
    #05 pc 0001a78f /system/lib/libc.so (__libc_fatal+16)
    #06 pc 00018d35 /system/lib/libc.so (__assert2+20)
    #07 pc 00000f21 /system/xbin/crasher
    #08 pc 00016795 /system/lib/libc.so (__libc_init+44)
    #09 pc 00000abc /system/xbin/crasher
</pre>

<p>The backtrace shows you where in the code we were at the time of
crash. The first column is the frame number (matching gdb's style where
the deepest frame is 0). The PC values are relative to the location of the
shared library rather than absolute addresses. The next column is the name
of the mapped region (which is usually a shared library or executable, but
might not be for, say, JIT-compiled code). Finally, if symbols are available,
the symbol that the PC value corresponds to is shown, along with the offset
into that symbol in bytes. You can use this in conjunction with <code>objdump(1)</code>
to find the corresponding assembler instruction.</p>

<h2 id=tombstones>Tombstones</h2>

<pre class="devsite-click-to-copy">
Tombstone written to: /data/tombstones/tombstone_06
</pre>

<p>This tells you where <code>debuggerd</code> wrote extra information.
<code>debuggerd</code> will keep up to 10 tombstones, cycling through
the numbers 00 to 09 and overwriting existing tombstones as necessary.</p>

<p>The tombstone contains the same information as the crash dump, plus a
few extras. For example, it includes backtraces for <i>all</i> threads (not
just the crashing thread), the floating point registers, raw stack dumps,
and memory dumps around the addresses in registers. Most usefully it also
includes a full memory map (similar to <code>/proc/<i>pid</i>/maps</code>). Here's an
annotated example from a 32-bit ARM process crash:</p>

<pre class="devsite-click-to-copy">
memory map: (fault address prefixed with ---&gt;)
---&gt;ab15f000-ab162fff r-x 0 4000 /system/xbin/crasher (BuildId:
b9527db01b5cf8f5402f899f64b9b121)
</pre>

<p>There are two things to note here. The first is that this line is prefixed
with "---&gt;". The maps are most useful when your crash isn't just a null
pointer dereference. If the fault address is small, it's probably some variant
of a null pointer dereference. Otherwise looking at the maps around the fault
address can often give you a clue as to what happened. Some possible issues
that can be recognized by looking at the maps include:</p>

<ul>
<li>Reads/writes past the end of a block of memory.</li>
<li>Reads/writes before the beginning of a block of memory.</li>
<li>Attempts to execute non-code.</li>
<li>Running off the end of a stack.</li>
<li>Attempts to write to code (as in the example above).</li>
</ul>

<p>The second thing to note is that executables and shared libraries files
will show the BuildId (if present) in Android M and later, so you can see
exactly which version of your code crashed. (Platform binaries include a
BuildId by default since Android M. NDK r12 and later automatically pass
<code>-Wl,--build-id</code> to the linker too.)</p>

<pre class="devsite-click-to-copy">
ab163000-ab163fff r--      3000      1000  /system/xbin/crasher
ab164000-ab164fff rw-         0      1000
f6c80000-f6d7ffff rw-         0    100000  [anon:libc_malloc]
</pre>

<p>On Android the heap isn't necessarily a single region. Heap regions will
be labeled <code>[anon:libc_malloc]</code>.</p>

<pre class="devsite-click-to-copy">
f6d82000-f6da1fff r--         0     20000  /dev/__properties__/u:object_r:logd_prop:s0
f6da2000-f6dc1fff r--         0     20000  /dev/__properties__/u:object_r:default_prop:s0
f6dc2000-f6de1fff r--         0     20000  /dev/__properties__/u:object_r:logd_prop:s0
f6de2000-f6de5fff r-x         0      4000  /system/lib/libnetd_client.so (BuildId: 08020aa06ed48cf9f6971861abf06c9d)
f6de6000-f6de6fff r--      3000      1000  /system/lib/libnetd_client.so
f6de7000-f6de7fff rw-      4000      1000  /system/lib/libnetd_client.so
f6dec000-f6e74fff r-x         0     89000  /system/lib/libc++.so (BuildId: 8f1f2be4b37d7067d366543fafececa2) (load base 0x2000)
f6e75000-f6e75fff ---         0      1000
f6e76000-f6e79fff r--     89000      4000  /system/lib/libc++.so
f6e7a000-f6e7afff rw-     8d000      1000  /system/lib/libc++.so
f6e7b000-f6e7bfff rw-         0      1000  [anon:.bss]
f6e7c000-f6efdfff r-x         0     82000  /system/lib/libc.so (BuildId: d189b369d1aafe11feb7014d411bb9c3)
f6efe000-f6f01fff r--     81000      4000  /system/lib/libc.so
f6f02000-f6f03fff rw-     85000      2000  /system/lib/libc.so
f6f04000-f6f04fff rw-         0      1000  [anon:.bss]
f6f05000-f6f05fff r--         0      1000  [anon:.bss]
f6f06000-f6f0bfff rw-         0      6000  [anon:.bss]
f6f0c000-f6f21fff r-x         0     16000  /system/lib/libcutils.so (BuildId: d6d68a419dadd645ca852cd339f89741)
f6f22000-f6f22fff r--     15000      1000  /system/lib/libcutils.so
f6f23000-f6f23fff rw-     16000      1000  /system/lib/libcutils.so
f6f24000-f6f31fff r-x         0      e000  /system/lib/liblog.so (BuildId: e4d30918d1b1028a1ba23d2ab72536fc)
f6f32000-f6f32fff r--      d000      1000  /system/lib/liblog.so
f6f33000-f6f33fff rw-      e000      1000  /system/lib/liblog.so
</pre>

<p>Typically a shared library will have three adjacent entries. One will be
readable and executable (code), one will be read-only (read-only
data), and one will be read-write (mutable data). The first column
shows the address ranges for the mapping, the second column the permissions
(in the usual Unix <code>ls(1)</code> style), the third column the offset into the file
(in hex), the fourth column the size of the region (in hex), and the fifth
column the file (or other region name).</p>

<pre class="devsite-click-to-copy">
f6f34000-f6f53fff r-x         0     20000  /system/lib/libm.so (BuildId: 76ba45dcd9247e60227200976a02c69b)
f6f54000-f6f54fff ---         0      1000
f6f55000-f6f55fff r--     20000      1000  /system/lib/libm.so
f6f56000-f6f56fff rw-     21000      1000  /system/lib/libm.so
f6f58000-f6f58fff rw-         0      1000
f6f59000-f6f78fff r--         0     20000  /dev/__properties__/u:object_r:default_prop:s0
f6f79000-f6f98fff r--         0     20000  /dev/__properties__/properties_serial
f6f99000-f6f99fff rw-         0      1000  [anon:linker_alloc_vector]
f6f9a000-f6f9afff r--         0      1000  [anon:atexit handlers]
f6f9b000-f6fbafff r--         0     20000  /dev/__properties__/properties_serial
f6fbb000-f6fbbfff rw-         0      1000  [anon:linker_alloc_vector]
f6fbc000-f6fbcfff rw-         0      1000  [anon:linker_alloc_small_objects]
f6fbd000-f6fbdfff rw-         0      1000  [anon:linker_alloc_vector]
f6fbe000-f6fbffff rw-         0      2000  [anon:linker_alloc]
f6fc0000-f6fc0fff r--         0      1000  [anon:linker_alloc]
f6fc1000-f6fc1fff rw-         0      1000  [anon:linker_alloc_lob]
f6fc2000-f6fc2fff r--         0      1000  [anon:linker_alloc]
f6fc3000-f6fc3fff rw-         0      1000  [anon:linker_alloc_vector]
f6fc4000-f6fc4fff rw-         0      1000  [anon:linker_alloc_small_objects]
f6fc5000-f6fc5fff rw-         0      1000  [anon:linker_alloc_vector]
f6fc6000-f6fc6fff rw-         0      1000  [anon:linker_alloc_small_objects]
f6fc7000-f6fc7fff rw-         0      1000  [anon:arc4random _rsx structure]
f6fc8000-f6fc8fff rw-         0      1000  [anon:arc4random _rs structure]
f6fc9000-f6fc9fff r--         0      1000  [anon:atexit handlers]
f6fca000-f6fcafff ---         0      1000  [anon:thread signal stack guard page]
</pre>

<p>
Note that since Android 5.0 (Lollipop), the C library names most of its anonymous mapped
regions so there are fewer mystery regions.
</p>

<pre class="devsite-click-to-copy">
f6fcb000-f6fccfff rw- 0 2000 [stack:5081]
</pre>

<p>
Regions named <code>[stack:<i>tid</i>]</code> are the stacks for the given threads.
</p>

<pre class="devsite-click-to-copy">
f6fcd000-f702afff r-x         0     5e000  /system/bin/linker (BuildId: 84f1316198deee0591c8ac7f158f28b7)
f702b000-f702cfff r--     5d000      2000  /system/bin/linker
f702d000-f702dfff rw-     5f000      1000  /system/bin/linker
f702e000-f702ffff rw-         0      2000
f7030000-f7030fff r--         0      1000
f7031000-f7032fff rw-         0      2000
ffcd7000-ffcf7fff rw-         0     21000
ffff0000-ffff0fff r-x         0      1000  [vectors]
</pre>

<p>Whether you see <code>[vector]</code> or <code>[vdso]</code> depends on the architecture. ARM uses [vector], while all other architectures use <a href="http://man7.org/linux/man-pages/man7/vdso.7.html">[vdso].</a></p>
  </body>
</html>