How does it work?
In layman's terms, MPEG audio compression can be compared to Persistence of Vision. In tech-speak, it's based on a perceptual coding scheme, where the decoded signal does not have to be exactly the same as the original signal - it just has to sound like it to the human ear. As Einstein is popularly credited with saying - it's all relative - and in this case, it's relative to the range of human hearing and the ear's ability to tell frequencies apart. This is called a psychoacoustic type of algorithm.
The encoder exploits the masking effect, where a loud sound hides quieter sounds close to it in frequency, to rip out the parts of the signal that are not audible to the human ear (This may explain why dogs haven't jumped on the MP3 bandwagon). To achieve this, a psychoacoustic model is used which attempts to mimic the human ear and its quirks. This model analyzes consecutive blocks of the signal and determines the spectrum of each block - then it models the masking properties based on this, and estimates the level below which noise in each part of the spectrum is just inaudible.
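To make the masking idea a little more concrete, here is a toy sketch in Python. It is not the psychoacoustic model from the MPEG standard: the function names, the 10 dB offset and the 15 dB-per-Bark slope are all made-up values for illustration, and real masking is not symmetric the way this sketch assumes. The only borrowed piece is Zwicker's well-known approximation for converting a frequency to the Bark (critical band) scale.

```python
import math

def hz_to_bark(freq_hz):
    """Approximate position of a frequency on the Bark scale (Zwicker's formula)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)

def masking_threshold_db(masker_hz, masker_db, probe_hz,
                         offset_db=10.0, slope_db_per_bark=15.0):
    """Toy threshold: masker level minus a fixed offset, dropping linearly
    with Bark distance. Offset and slope are illustrative assumptions."""
    distance = abs(hz_to_bark(probe_hz) - hz_to_bark(masker_hz))
    return masker_db - offset_db - slope_db_per_bark * distance

def audible_components(components, **kwargs):
    """Keep only the (frequency, level) pairs that no other component masks."""
    kept = []
    for i, (freq, level) in enumerate(components):
        masked = any(
            level < masking_threshold_db(other_freq, other_level, freq, **kwargs)
            for j, (other_freq, other_level) in enumerate(components)
            if j != i
        )
        if not masked:
            kept.append((freq, level))
    return kept

# A loud 1 kHz tone, a quiet neighbour at 1.1 kHz, and a quiet tone far away at 5 kHz.
components = [(1000.0, 80.0), (1100.0, 40.0), (5000.0, 40.0)]
print(audible_components(components))  # [(1000.0, 80.0), (5000.0, 40.0)]
```

The quiet 1.1 kHz tone sits right next to the loud 1 kHz tone and gets thrown away, while the equally quiet 5 kHz tone survives because nothing nearby is loud enough to hide it.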
All of this work happens at encoding time, when the MPEG file is created, so decoding is much easier (so don't hold your MP3 player in too much awe, the encoder did all the work above!). The decoder simply reads the file that was generated and reconstructs an audio signal - which sounds the same as the original to the human ear - but it isn't. Neat trick. You've been duped by people in white coats.
The more technical aspects are below, though they are nowhere near complete. I don't want too many people sleeping at their keyboards!
MPEG works in phases - MPEG-1, MPEG-2 and MPEG-4 for now. Phase 1 (MPEG-1) is used for encoding of monophonic and stereophonic sounds at the sampling frequencies mainly used for high quality audio (32, 44.1 and 48 kHz). Phase 2 (MPEG-2) adds an extension to lower sampling frequencies (16, 22.05 and 24 kHz), and also an extension to multichannel sound. Both MPEG-1 and MPEG-2 have a 3 layer structure, with each layer representing a family of coding algorithms.
Each layer has its own advantages, and the complexity increases from Layer I (simplest) to Layer III.
(1) Layer I, being the simplest, is suited for applications where low complexity (that is, a low processing requirement) is of high importance.
(2) Layer II is the middle ground, with the advantage that it suppresses more redundancy in a signal than Layer I and makes more efficient use of the psychoacoustic model.
(3) Layer III is used for applications requiring the lowest data rates; it does this with the highest level of redundant signal suppression and by using its filter to strip out barely audible frequencies. This is the layer better known as MP3.
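To put "lowest data rates" in perspective, here is a quick back-of-the-envelope calculation in Python. The CD figures (44.1 kHz, 16 bits, 2 channels) are standard; the 128 kbit/s target is just an assumed example of a common Layer III bitrate, not a number from the text above, and the function name is made up for this sketch.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

# Uncompressed CD audio: 44.1 kHz, 16 bits per sample, stereo.
cd_kbps = pcm_bitrate_kbps(44100, 16, 2)

# Assumed example bitrate for an MP3 file (illustrative choice).
mp3_kbps = 128.0

ratio = cd_kbps / mp3_kbps
print(f"CD audio: {cd_kbps:.1f} kbit/s")
print(f"Compression ratio at {mp3_kbps:.0f} kbit/s: about {ratio:.1f} to 1")
```

In other words, at that bitrate roughly ten out of every eleven bits of the original are gone by the time the file reaches your player.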
