Android uses a wide variety of audio data formats internally, and exposes a subset of these in public APIs, file formats, and the Hardware Abstraction Layer (HAL).
The audio data formats are classified by their properties:
Fixed point is the most common representation for uncompressed PCM audio data, especially at hardware interfaces.
A fixed-point number has a fixed (constant) number of digits before and after the radix point. All of our representations use base 2, so we substitute bit for digit, and binary point or simply point for radix point. The bits to the left of the point are the integer part, and the bits to the right of the point are the fractional part.
We speak of integer PCM, because fixed-point values are usually stored and manipulated as integer values. The interpretation as fixed-point is implicit.
We use two's complement for all signed fixed-point representations, so the following holds where all values are in units of one LSB:
|largest negative value| = |largest positive value| + 1
There are various notations for fixed-point representation in an integer. We use Q notation: Qm.n means m integer bits and n fractional bits. The "Q" counts as one bit, the sign bit, and the value is expressed in two's complement. The total number of bits is m + n + 1.
Um.n is for unsigned numbers: m integer bits and n fractional bits, and the "U" counts as zero bits. The total number of bits is m + n.
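As a rough illustration of the two notations, the following Python sketch computes the total bit count, the representable range, and the size of one LSB for a given format. The helper name `q_format_info` is hypothetical and only serves this example.

```python
def q_format_info(m, n, signed=True):
    """Return (total_bits, min_value, max_value, lsb) for Qm.n or Um.n.

    Hypothetical helper for illustration: signed=True models Qm.n
    (two's complement, one extra sign bit), signed=False models Um.n.
    """
    total_bits = m + n + (1 if signed else 0)
    lsb = 2.0 ** -n                      # value of one least significant bit
    max_int = (1 << (m + n)) - 1         # largest integer code
    min_int = -(1 << (m + n)) if signed else 0
    return total_bits, min_int * lsb, max_int * lsb, lsb


# Q0.15: 16 bits, range -1.0 to +1.0 minus one LSB
print(q_format_info(0, 15))
# Q4.27: 32 bits, range -16.0 to +16.0 minus one LSB
print(q_format_info(4, 27))
# U8.8: 16 bits, range 0.0 to 256.0 minus one LSB
print(q_format_info(8, 8, signed=False))
```

In units of one LSB, the signed case gives a minimum of -2^(m+n) and a maximum of 2^(m+n) - 1, which matches the relation |largest negative value| = |largest positive value| + 1.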
The integer part may be used in the final result, or be temporary. In the latter case, the bits that make up the integer part are called guard bits. The guard bits permit an intermediate calculation to overflow, as long as the final value is within range or can be clamped to be within range. Note that fixed-point guard bits are at the left, while floating-point unit guard digits are used to reduce roundoff error and are on the right.
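To make the role of guard bits concrete, here is a minimal sketch that sums Q0.15 samples in a wider accumulator and clamps only the final result. The function names are hypothetical, and because Python integers are unbounded, the accumulator merely stands in for something like a 32-bit Q16.15 register.

```python
Q15_MIN, Q15_MAX = -32768, 32767


def clamp_q15(x):
    """Clamp an integer to the 16-bit signed range used by Q0.15."""
    return max(Q15_MIN, min(Q15_MAX, x))


def mix_q15(samples):
    """Sum Q0.15 samples in a wide accumulator, then clamp once at the end.

    The accumulator models a Q16.15 value: its integer part acts as guard
    bits, so the running sum may leave the Q0.15 range temporarily.
    """
    acc = 0
    for s in samples:
        acc += s
    return clamp_q15(acc)


# The running sum reaches 50000 (out of Q0.15 range) but ends in range.
print(mix_q15([30000, 20000, -25000]))  # 25000
# The final value is out of range, so it is clamped.
print(mix_q15([30000, 20000]))          # 32767
```

Clamping after every addition would instead yield 32767 - 25000 = 7767 for the first case, which is why the intermediate overflow is worth tolerating.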
Floating point is an alternative to fixed point, in which the location of the point can vary. The primary advantages of floating-point include:
Historically, floating-point arithmetic was slower than integer or fixed-point arithmetic, but now it is common for floating-point to be faster, provided control flow decisions aren't based on the value of a computation.
The major Android formats for audio are listed in the table below:
Property | Q0.15 | Q0.7 ¹ | Q0.23 | Q0.31 | float
---|---|---|---|---|---
Container bits | 16 | 8 | 24 or 32 ² | 32 | 32
Significant bits including sign | 16 | 8 | 24 | 24 or 32 ² | 25 ³
Headroom in dB | 0 | 0 | 0 | 0 | 126 ⁴
Dynamic range in dB | 90 | 42 | 138 | 138 to 186 | 900 ⁵
All fixed-point formats above have a nominal range of -1.0 to +1.0 minus one LSB. There is one more negative value than positive value due to the two's complement representation.
Footnotes:
0.10000000
.
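The fixed-point dynamic range figures in the table above appear consistent with roughly 6.02 dB per fractional bit, that is, 20·log10(2^n). This is an assumption about how those values were derived, not something the text states, and the sketch below does not attempt to reproduce the float column.

```python
import math


def fixed_point_dynamic_range_db(fractional_bits):
    """Ratio of full scale to one LSB, in dB (assumed derivation)."""
    return 20.0 * math.log10(2.0 ** fractional_bits)


for name, n in (("Q0.15", 15), ("Q0.7", 7), ("Q0.23", 23), ("Q0.31", 31)):
    # Truncate to whole dB for comparison with the table.
    print(name, int(fixed_point_dynamic_range_db(n)))
```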
This section discusses data conversions between various representations.
To convert a value from Qm.n format to floating point:
For example, to convert a Q4.27 internal value to floating point, use:
float = integer * (2 ^ -27)
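The same formula can be sketched in Python for any number of fractional bits. The function name `q_to_float` is hypothetical. Note that Python's `float` is double precision, so the result here is exact for any 32-bit input, which is not guaranteed for a single-precision float.

```python
def q_to_float(value, n):
    """Interpret the integer `value` as Qm.n and scale by 2**-n."""
    return float(value) * 2.0 ** -n


print(q_to_float(1 << 27, 27))     # 1.0
print(q_to_float(-(1 << 26), 27))  # -0.5
print(q_to_float(3 << 26, 27))     # 1.5
```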
Conversions from floating point to fixed point follow these rules:
Conversions between different Qm.n formats follow these rules:
For example, to convert a Q4.27 value to Q0.15 (without dither or rounding), right shift the Q4.27 value by 12 bits, and clamp any results that exceed the 16-bit signed range. This aligns the point of the Q representation.
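A minimal sketch of this shift-and-clamp conversion follows. The function name is hypothetical, and the code relies on Python's `>>` being a sign-preserving (arithmetic) shift for integers.

```python
def q4_27_to_q0_15(x):
    """Convert a Q4.27 integer to Q0.15 without dither or rounding."""
    shifted = x >> 12                        # 27 - 15 = 12 fractional bits dropped
    return max(-32768, min(32767, shifted))  # clamp to the 16-bit signed range


print(q4_27_to_q0_15(1 << 26))     # 0.5 in Q4.27  -> 16384
print(q4_27_to_q0_15(1 << 27))     # +1.0 in Q4.27 -> 32767 (clamped)
print(q4_27_to_q0_15(-(1 << 28)))  # -2.0 in Q4.27 -> -32768 (clamped)
```

The second case shows that +1.0 cannot be represented in Q0.15 and is clamped to +1.0 minus one LSB.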
To convert Q7.24 to Q7.23, do a signed divide by 2, or equivalently add the sign bit to the Q7.24 integer quantity, and then signed right shift by 1. Note that a simple signed right shift is not equivalent to a signed divide by 2.
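The difference between the two operations shows up for odd negative values, as this small sketch illustrates. The function names are hypothetical. A plain arithmetic shift rounds toward negative infinity, whereas a signed divide by 2 in the C sense rounds toward zero.

```python
def div2_trunc(x):
    """Signed divide by 2, rounding toward zero: add the sign bit, then shift."""
    sign_bit = 1 if x < 0 else 0
    return (x + sign_bit) >> 1


def shift_only(x):
    """Plain arithmetic right shift by 1, which rounds toward negative infinity."""
    return x >> 1


for x in (7, -7, -1):
    print(x, div2_trunc(x), shift_only(x))
# 7 3 3
# -7 -3 -4
# -1 0 -1
```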
A conversion is lossless if it is invertible: a conversion from A to B to C results in A = C.
Otherwise the conversion is lossy.
Lossless conversions permit round-trip format conversion.
Conversions from a fixed-point representation with 25 or fewer significant bits to floating point are lossless. Conversions from floating point to any common fixed-point representation are lossy.
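The 25-bit limit can be checked with a round trip through single precision. The sketch below uses the standard `struct` module to force a value to 32-bit float, since Python's own `float` is double precision. The helper names are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def roundtrip(value, n):
    """Qm.n integer -> float32 -> Qm.n integer."""
    f = to_float32(value * 2.0 ** -n)
    return int(f * 2.0 ** n)


# Q0.23 (24 significant bits including sign) survives the round trip.
q23_max = (1 << 23) - 1
print(roundtrip(q23_max, 23) == q23_max)  # True

# Q0.31 (32 significant bits) does not: the value rounds up to 1.0.
q31_max = (1 << 31) - 1
print(roundtrip(q31_max, 31) == q31_max)  # False
```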