TEKNOMARKET: Advanced Audio Coding

Advanced Audio Coding (AAC) is a standardized, lossy compression and encoding scheme for digital audio. AAC is promoted as the successor to the MP3 format by MP3’s creator, Fraunhofer IIS.

(Advanced Audio Coding (AAC), in Turkish "Gelişmiş Ses Kodlama", is a lossy compression encoder and decoder for digital audio.)

Depending on the encoder used, AAC generally achieves better sound quality than MP3 at the same bitrate, particularly below 192 kbit/s.^[1]

AAC is best known as the default audio format of Apple's iPhone, iPod and iTunes, and as the format used for all iTunes Store audio (with extensions for proprietary Digital Rights Management (DRM) where used).

AAC is also the standard audio format for Sony's PlayStation 3 and for the MPEG-4 video standard, and HE-AAC is part of digital radio standards such as DAB+ and Digital Radio Mondiale.

History

AAC was developed with the cooperation and contributions of companies including Dolby, Fraunhofer IIS, AT&T, Sony and Nokia, and was officially declared an international standard by the Moving Picture Experts Group in April 1997.

Standardization

It is specified both as Part 7 of the MPEG-2 standard and as Part 3 of the MPEG-4 standard. As such, it can be referred to as MPEG-2 Part 7 or MPEG-4 Part 3, depending on its implementation; however, it is most often referred to as MPEG-4 AAC, or AAC for short.

AAC was first specified in the standard MPEG-2 Part 7 (known formally as ISO/IEC 13818-7:1997) in 1997 as a new "part" (distinct from ISO/IEC 13818-3) in the MPEG-2 family of international standards.

It was updated in MPEG-4 Part 3 (known formally as ISO/IEC 14496-3:1999) in 1999. The reference software is specified in MPEG-4 Part 4 and the conformance bitstreams are specified in MPEG-4 Part 5. A notable addition in this version of the standard is Perceptual Noise Substitution (PNS).

HE-AAC (AAC with SBR) was first standardized in ISO/IEC 14496-3:2001/Amd.1. HE-AAC v2 (HE-AAC with Parametric Stereo) was first specified in ISO/IEC 14496-3:2001/Amd.4.^[2]

The current version of the AAC standard is ISO/IEC 14496-3:2005 (with 14496-3:2005/Amd.2 for HE-AAC v2^[3]).

AacPlus v2 is also standardized by ETSI (European Telecommunications Standards Institute) as TS 102005.^[2]

The MPEG-4 standard also contains other ways of compressing sound. These are low-bitrate schemes, generally used for speech.

AAC’s improvements over MP3

AAC was designed to have better performance than MP3, which ISO/IEC specified in MPEG-1 (ISO/IEC 11172-3) and MPEG-2 (ISO/IEC 13818-3).

Improvements include:

More sample frequencies (from 8 kHz to 96 kHz) than MP3 (16 kHz to 48 kHz)
Up to 48 channels (MP3 supports up to two channels in MPEG-1 mode and up to 5.1 channels in MPEG-2 mode)
Arbitrary bitrates and variable frame length. Standardized constant bit rate with bit reservoir.
Higher efficiency and simpler filterbank (hybrid → pure MDCT)
Higher coding efficiency for stationary signals (blocksize: 576 → 1024 samples)
Higher coding efficiency for transient signals (blocksize: 192 → 128 samples)
Can use a Kaiser-Bessel derived window function to reduce spectral leakage at the expense of widening the main lobe
Much better handling of audio frequencies above 16 kHz
More flexible joint stereo (separate for every scale band)
Adds additional modules (tools) to increase compression efficiency, such as TNS, backward prediction and PNS. These modules can be combined to constitute different encoding profiles.
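
The Kaiser-Bessel derived (KBD) window mentioned in the list can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the function names are hypothetical, and the alpha values used in the example (4 for a 2048-sample window, 6 for a 256-sample window) are stated here as an assumption about typical AAC settings. The window is built from the running sum of a Kaiser window, which makes it satisfy the power-complementarity condition that an MDCT needs for overlap-add reconstruction.

```python
import math


def bessel_i0(x):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total = 1.0
    term = 1.0
    k = 1
    while term > 1e-16 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total


def kbd_window(length, alpha):
    """Return a Kaiser-Bessel derived window with `length` samples (even)."""
    if length % 2 != 0:
        raise ValueError("window length must be even")
    half = length // 2
    beta = math.pi * alpha
    # Kaiser window of half + 1 points (the 1/I0(beta) scale cancels out below).
    kaiser = [
        bessel_i0(beta * math.sqrt(1.0 - (2.0 * j / half - 1.0) ** 2))
        for j in range(half + 1)
    ]
    total = sum(kaiser)
    # Left half: square root of the normalized running sum of the Kaiser window.
    running = 0.0
    left = []
    for j in range(half):
        running += kaiser[j]
        left.append(math.sqrt(running / total))
    # Right half mirrors the left half.
    return left + left[::-1]


# Assumed parameters: alpha = 4 for the long window, alpha = 6 for the short one.
long_window = kbd_window(2048, 4.0)
short_window = kbd_window(256, 6.0)
```

A larger alpha pushes the window's side lobes down (less leakage into distant frequency bins) while widening the main lobe, which is the trade-off named in the list.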

Overall, the AAC format allows developers more flexibility to design codecs than MP3 does. This increased flexibility often leads to more concurrent encoding strategies and, as a result, to more efficient compression. However, the case that AAC is better than MP3 across the board is not entirely conclusive, and the MP3 specification, while outdated, has proven surprisingly robust. Where AAC and HE-AAC do outperform MP3 is at low bitrates (typically below 192 kbit/s).

How AAC works

AAC is a wideband audio coding algorithm that exploits two primary coding strategies to dramatically reduce the amount of data needed to represent high-quality digital audio.

Signal components that are perceptually irrelevant are discarded;
Redundancies in the coded audio signal are eliminated.

Furthermore:

The signal is processed by a modified discrete cosine transform (MDCT) according to its complexity;
Internal error correction codes are added;
In order to prevent corrupt samples, a modern implementation of the Luhn mod N algorithm is applied to each frame;
The signal is stored or transmitted.

The MPEG-4 audio standard does not define a single or small set of highly efficient compression schemes but rather a complex toolbox to perform a wide range of operations from low bitrate speech coding to high-quality audio coding and music synthesis.

The MPEG-4 audio coding algorithm family spans the range from low bitrate speech encoding (down to 2 kbit/s) to high-quality audio coding (at 64 kbit/s per channel and higher).
AAC offers sampling frequencies between 8 kHz and 96 kHz and any number of channels between 1 and 48.
In contrast to MP3's hybrid filter bank, AAC uses the modified discrete cosine transform (MDCT) together with an increased window length of 1024 points. AAC is much more capable of encoding audio with streams of complex pulses and square waves than MP3 or MP2.
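
To make the MDCT step concrete, here is a small, unoptimized sketch of a windowed MDCT and its inverse with 50% overlap-add. It is an illustration under simplifying assumptions: it uses a sine window and a tiny block size so the direct O(N²) formula stays fast, whereas a real AAC codec uses 1024 or 128 coefficients per block and fast transforms. All function names are hypothetical.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Map 2N windowed samples to N coefficients (direct formula)."""
    n_coeffs = len(block) // 2
    return [
        sum(
            block[n]
            * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for n in range(2 * n_coeffs)
        )
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples (2/N scaling)."""
    n_coeffs = len(coeffs)
    return [
        2.0 / n_coeffs
        * sum(
            coeffs[k]
            * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for k in range(n_coeffs)
        )
        for n in range(2 * n_coeffs)
    ]


def analyze_synthesize(signal, n_coeffs):
    """Run the signal through MDCT and IMDCT with overlap-add and return it."""
    window = sine_window(2 * n_coeffs)
    # Pad so that every signal sample is covered by two overlapping blocks.
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * n_coeffs
    padded += [0.0] * (-len(padded) % n_coeffs)
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coeffs + 1, n_coeffs):
        block = [padded[start + i] * window[i] for i in range(2 * n_coeffs)]
        restored = imdct(mdct(block))
        for i in range(2 * n_coeffs):
            output[start + i] += restored[i] * window[i]
    return output[n_coeffs:n_coeffs + len(signal)]


signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(40)]
restored = analyze_synthesize(signal, n_coeffs=8)
```

Each block of 2N samples yields only N coefficients, so a single block cannot be inverted on its own. The aliasing it introduces cancels when neighbouring blocks are overlapped and added, which is why no data rate is wasted despite the 50% overlap. In an actual codec the coefficients would be quantized between `mdct` and `imdct`, and that is where the loss occurs.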

AAC encoders can switch dynamically between a single MDCT block of 1024 points and 8 blocks of 128 points.

If a signal change or a transient occurs, 8 shorter windows of 128 points each are chosen for their better temporal resolution.
Otherwise, the longer 1024-point window is used by default, because its higher frequency resolution allows for a more sophisticated psychoacoustic model, resulting in improved coding efficiency.
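
The trade-off between the two block sizes, and one way an encoder might choose between them, can be sketched as follows. The decision rule is an assumption made for illustration (a simple energy-jump test with an arbitrary threshold); real encoders rely on their own psychoacoustic models, and the standard does not prescribe a detector. The names `block_resolution` and `choose_block_type` are hypothetical.

```python
import math

LONG_BLOCK = 1024   # coefficients in one long block
SHORT_BLOCK = 128   # coefficients in each of the 8 short blocks


def block_resolution(block_len, sample_rate):
    """Return (time span in ms, frequency bin spacing in Hz) for one block."""
    duration_ms = block_len / sample_rate * 1000.0
    bin_spacing_hz = sample_rate / (2.0 * block_len)
    return duration_ms, bin_spacing_hz


def choose_block_type(frame, threshold=8.0):
    """Pick "short" if energy jumps sharply inside the frame, else "long".

    `frame` holds LONG_BLOCK samples. The threshold is an arbitrary
    illustrative value, not taken from any standard or encoder.
    """
    if len(frame) != LONG_BLOCK:
        raise ValueError("frame must contain %d samples" % LONG_BLOCK)
    energies = [
        sum(s * s for s in frame[i:i + SHORT_BLOCK])
        for i in range(0, LONG_BLOCK, SHORT_BLOCK)
    ]
    for previous, current in zip(energies, energies[1:]):
        # A sudden rise in energy suggests an attack (transient).
        if current > threshold * (previous + 1e-12):
            return "short"
    return "long"


# A steady 3 kHz tone at 48 kHz versus near-silence followed by a loud burst.
steady = [math.sin(2 * math.pi * 3000 * n / 48000) for n in range(LONG_BLOCK)]
attack = [0.001] * 512 + [1.0] * 512
```

At 48 kHz, `block_resolution(1024, 48000)` gives about 21.3 ms and 23.4 Hz per bin, while `block_resolution(128, 48000)` gives about 2.7 ms and 187.5 Hz per bin. The short blocks therefore confine the quantization noise of an attack to a much shorter time span, at the cost of coarser frequency resolution.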

Modular encoding

AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance and the acceptable output, implementers may create profiles to define which of a specific set of tools they want to use for a particular application. The standard offers four default profiles:

Low Complexity (LC) - the simplest and most widely used and supported;
Main Profile (MAIN) - like the LC profile, with the addition of backwards prediction;
Scalable Sampling Rate (SSR), a.k.a. Sample-Rate Scalable (MPEG-4 AAC-SSR);
Long Term Prediction (LTP), added in the MPEG-4 standard: an improvement of the MAIN profile using a forward predictor with lower computational complexity.
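
The four profiles above can be modelled as a small lookup table. The sketch below is only an illustration: the numeric values are the MPEG-4 audio object type identifiers commonly associated with these profiles, the tool descriptions paraphrase the list above, and the names `AacProfile`, `EXTRA_TOOLS` and `describe` are hypothetical. SSR is left without an entry in the tool table because the list does not describe its tools.

```python
from enum import IntEnum


class AacProfile(IntEnum):
    """The four default profiles, keyed by MPEG-4 audio object type ID."""
    MAIN = 1
    LC = 2
    SSR = 3
    LTP = 4


# Tools each profile adds on top of the Low Complexity baseline,
# as described in the list above (SSR intentionally omitted).
EXTRA_TOOLS = {
    AacProfile.LC: (),
    AacProfile.MAIN: ("backward prediction",),
    AacProfile.LTP: ("long-term (forward) prediction",),
}


def describe(profile):
    """Return a one-line summary of a profile and its extra tools."""
    tools = EXTRA_TOOLS.get(profile)
    if tools is None:
        return "%s: tools not listed here" % profile.name
    if not tools:
        return "%s: baseline tool set" % profile.name
    return "%s: baseline plus %s" % (profile.name, ", ".join(tools))
```

For example, `describe(AacProfile.MAIN)` returns "MAIN: baseline plus backward prediction".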

Depending on the AAC profile and the MP3 encoder, 96 kbit/s AAC can give nearly the same or better perceptual quality than 128 kbit/s MP3.
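
To put the 96 versus 128 kbit/s comparison in terms of file size, the following sketch computes the payload size of a constant-bitrate stream. It ignores container overhead and metadata, the four-minute track length is an arbitrary example, and the function name is hypothetical.

```python
def stream_size_bytes(bitrate_kbit_s, duration_s):
    """Payload size of a constant-bitrate stream (1 kbit = 1000 bits)."""
    return bitrate_kbit_s * 1000 * duration_s // 8


track_seconds = 4 * 60
mp3_bytes = stream_size_bytes(128, track_seconds)   # 3,840,000 bytes
aac_bytes = stream_size_bytes(96, track_seconds)    # 2,880,000 bytes
saving = 1 - aac_bytes / mp3_bytes                  # 0.25, i.e. 25 % smaller
```

If the two streams really do sound equivalent, the AAC file is a quarter smaller for any track length, since the ratio depends only on the two bitrates.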

AAC Low Delay

The MPEG-4 Low Delay Audio Coder (AAC-LD) is designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. It is closely derived from the MPEG-2 Advanced Audio Coding (AAC) format.

The most stringent requirements are a maximum algorithmic delay of only 20 ms and good audio quality for all kinds of audio signals, including speech and music. The AAC-LD coding scheme bridges the gap between speech coding schemes and high-quality audio coding schemes.

Licensing and patents

No licenses or payments are required to stream or distribute content in AAC format.^[4] This reason alone makes AAC a much more attractive format for distributing content, particularly streaming content (such as Internet radio).

However, a patent license is required for all manufacturers or developers of AAC codecs, whether they encode or decode.^[5] For this reason, FOSS implementations such as FAAC and FAAD are distributed in source form only, in order to avoid patent infringement.

AAC requires a patent license, and thus uses proprietary technology. But contrary to popular belief, it is not the property of a single company, having been developed in a standards-making organization.

Wednesday, 5 September 2007