How a CODEC works

A CODEC simply means enCOder-DECoder. It generally refers to the process of changing information from one form to another digitally.

In today's world of sound and video, CODECs take a programme signal and condition it for storage and retrieval, or for transfer via a communications or broadcast medium.

The encoding process can – and often does – apply some form of data reduction to the source signal in order to make effective use of the medium or to allow the signal to be conveyed in real time. For example, a basic rate ISDN circuit offers a throughput of 128 kbps. A broadcast quality stereo sound signal simply sampled at 44.1 kHz with a resolution of 16 bit PCM requires 1,411 kbps, roughly eleven times what the circuit can carry. An essential feature of codecs is therefore to reduce the amount of data needed to faithfully communicate or store the information, with the minimum of corruption or disturbance.
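The figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name and variable names are made up for this example.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM bit rate in kbps (1 kbps = 1000 bits per second)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Stereo, 16 bit PCM sampled at 44.1 kHz, as in the example above
source_kbps = pcm_bit_rate_kbps(44_100, 16, 2)

# Throughput of a basic rate ISDN circuit
isdn_kbps = 128

# How much the data must shrink to fit the circuit in real time
reduction_needed = source_kbps / isdn_kbps

print(f"PCM source: {source_kbps:.1f} kbps")
print(f"Reduction needed for ISDN: about {reduction_needed:.1f} to 1")
```

The raw signal comes to 1,411.2 kbps, so a reduction of about 11 to 1 would be needed to carry it over a single 128 kbps circuit.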

Algorithm

The principal benefits of handling a digital version of an audio signal were understood well over 70 years ago, but it wasn’t until fairly recently, with advances in digital signal processing hardware, that CODECs became a more accessible reality. Many types of audio coding algorithm have been developed over the years, and improvements are still being made because of the falling cost of powerful DSP chip sets.

apt-X™

apt-X™ is a highly successful proprietary audio coding algorithm, originally used for high quality audio over ISDN or data circuits. The algorithm is based on a multi-band predictive technique. Because the encoder and decoder are a matched pair, only signal difference information need be stored or transmitted. Typical net data savings are around 4 to 1. A key feature of apt-X™ is that the whole dynamic range and spectrum of the audio signal is processed; none is disregarded. The algorithm saves bits by analysing the waveform structure itself and does not depend on the sonic composition of the audio material.
apt-X™ is tolerant of multiple encode-decode processes and is extremely fast. A signal can be stored or re-transmitted via an apt-X™ codec chain many more times than via a chain based on psychoacoustic techniques (MPEG, ATRAC) or waveform modelling techniques (ACELP). This makes apt-X™ ideal for quality-sensitive and time-sensitive applications, or where further psychoacoustic processing is undesirable.
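The idea of storing only signal difference information can be illustrated with a generic differential coding sketch. This is not the apt-X™ algorithm, which is proprietary and works on several frequency bands. The predictor used here (the next sample is assumed to equal the previous one), the function names and the sample values are all assumptions chosen for illustration.

```python
def dpcm_encode(samples):
    """Return the difference between each sample and the predicted value."""
    previous = 0            # encoder and decoder start from the same state
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample   # toy predictor: next sample equals this sample
    return differences


def dpcm_decode(differences):
    """Rebuild the samples by running the same predictor as the encoder."""
    previous = 0
    samples = []
    for difference in differences:
        previous = previous + difference
        samples.append(previous)
    return samples


samples = [1000, 1012, 1030, 1041, 1039, 1020]   # hypothetical sample values
differences = dpcm_encode(samples)

# The matched decoder recovers the original samples from the differences
assert dpcm_decode(differences) == samples

print(differences)
```

After the first value, the differences are much smaller than the samples themselves, so they can be represented with fewer bits. A practical codec would also quantise the differences, which this sketch leaves out. As a rough guide, a 4 to 1 saving on 16 bit samples corresponds to an average of about 4 bits per sample.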

Enhanced apt-X™

A later development of the apt-X™ algorithm further improves noise floor and transient response. Additional benefits include support for phase coherent discrete multi-channel sound, popular in the film industry and in home cinema.

Baudion is the UK distributor for APT WorldCast Systems, the hardware spin-off of Audio Processing Technology, the original developer and owner of the apt-X™ suite of audio algorithms. Baudion offers a range of equipment for applications ranging from audio contribution schemes via IP Ethernet through to modular rack crate solutions carrying multiple encoder/decoders for connection over ISDN, E1/T1 or n x 64 connections.

MPEG Audio

The Moving Picture Experts Group (MPEG) defines professional audio and video standards – including data reduction schemes. MPEG-1 Layer 2 emerged in 1987 and allowed data compression ratios of 12:1 to be achieved. MPEG-1 Layers 1, 2 and 3 achieve data reduction by exploiting a characteristic of human hearing: quiet sounds become inaudible when louder sounds at nearby frequencies are present. Under this psychoacoustic masking principle, if a band of ‘quiet’ frequencies lies next to a louder band, the quieter sound is disregarded and excluded from the encoding process, which saves data.
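The masking principle can be sketched with a toy rule. The code below is not the psychoacoustic model defined in the MPEG standards. It assumes that a band is masked whenever an adjacent band is louder by more than a fixed margin. The 20 dB margin, the function name and the band levels are hypothetical values chosen for illustration.

```python
def audible_bands(levels_db, masking_margin_db=20.0):
    """Return indexes of bands that are not masked by a louder neighbour.

    A band is treated as masked when an adjacent band is louder by more
    than masking_margin_db. This is a toy rule, not the MPEG model.
    """
    kept = []
    for index, level in enumerate(levels_db):
        # Collect the band below and the band above, where they exist
        below = levels_db[max(index - 1, 0):index]
        above = levels_db[index + 1:index + 2]
        loudest_neighbour = max(below + above, default=float("-inf"))
        if loudest_neighbour - level <= masking_margin_db:
            kept.append(index)
    return kept


# Hypothetical levels for six adjacent frequency bands, in dB
levels_db = [60.0, 35.0, 58.0, 50.0, 20.0, 25.0]
kept = audible_bands(levels_db)

print(kept)
```

With these levels, bands 1 and 4 each sit more than 20 dB below a neighbour and are dropped, so only four of the six bands would need to be encoded.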

MPEG-1 Layers

Layer 1
– Used for the Digital Compact Cassette in the early 1990s; no longer used.

Layer 2 (sometimes known as MPEG L2, MP2, Musicam®)
– Uses fixed frequency bands during the encoding process which, although workable, is not ideal. Good results are possible at 192 kbps, which is regarded as CD-like quality. Layer 2 is also used at 128 kbps for occasional programme feeds, as it can be accommodated on a bonded pair of ISDN channels.

Layer 3 (sometimes known as MPEG 3, MP3)
– Uses logarithmically divided frequency bands, similar to the behaviour of the ear, and offers better quality than Layer 2 for a given bit rate. Despite advances in audio encoding, it is still commonly used for media devices and internet audio streams.
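The difference between fixed and logarithmically divided bands can be seen by comparing the band edges each approach produces. The sketch below is illustrative only and does not reproduce the filter bank of either layer. The function names, the number of bands and the frequency limits are assumptions.

```python
def fixed_band_edges(max_hz, bands):
    """Edges of equal-width bands from 0 Hz to max_hz."""
    width = max_hz / bands
    return [round(i * width) for i in range(bands + 1)]


def log_band_edges(min_hz, max_hz, bands):
    """Edges of bands whose width grows by a constant ratio."""
    ratio = (max_hz / min_hz) ** (1 / bands)
    return [round(min_hz * ratio ** i) for i in range(bands + 1)]


# Four bands in each case, with hypothetical frequency limits
fixed_edges = fixed_band_edges(16_000, 4)
log_edges = log_band_edges(1_000, 16_000, 4)

print(fixed_edges)
print(log_edges)
```

The fixed bands are each 4,000 Hz wide, whereas the logarithmic bands double in width at each step. This gives finer resolution at lower frequencies, where the ear is better at separating nearby tones.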

AAC – Advanced Audio Coding
– A more recent perceptual compression standard for digital audio that offers better sound quality than previous MPEG coding schemes at the same bit rate. AAC supports a large number of sample rates and is noticeably better than MP3 at low bit rates.

HE-AAC (sometimes known as MPEG-4 HE-AAC, and commercially available as aacPlus, aacPlusV2, AAC+)
– Designed for streaming audio applications. A development of the AAC standard that adds spectral band replication (SBR) and, in the latest version, parametric stereo operation. MPEG-4 HE-AAC is regarded as state of the art for streamed audio.

AAC LD
– A variation of AAC that provides high quality audio for two-way communications connections or programme feeds where delays are not acceptable. Coding delays comparable to older standards such as G.722 are possible.

MPEG-4
The MPEG-4 standard does not actually specify a particular audio coding scheme but offers a list, or kit of ‘tools’ (coding schemes) that are allowed within the standard. Various coding schemes can therefore be used in MPEG-4, depending on the application.

MPEG-2
– A standard for digital video used for Digital Video Broadcasting (DVB) and DVD. MPEG-2 should not be confused with MPEG-1 Layer 2. MPEG-2 Audio is similar to MPEG-1 Layer 2 but adds a multi-channel capability to support multi-lingual sound or tertiary audio information.