
The Evolution of Sound Recording

Posted in audio engineering, audio recording, history by commorancy on February 14, 2023

Beginning in the 1920s and continuing to the present, sound recordings have changed and improved dramatically. This article spans 100 years of audio technology improvements, though audio recording itself reaches all the way back to the phonautograph in 1860. What drove these changes was primarily the quality of the recording media available at the time. This is a history-based article and runs to 20,000 words because of the material and the ten decades covered. Grab a cup of your favorite hot beverage, sit back and let’s explore the many highlights of what sound recording has achieved since 1920.

Before We Get Started

Some caveats to begin. This article isn’t intended to offer a comprehensive history of all possible sound devices or technologies produced since the 1920s. Instead, this article’s goal is to provide a glimpse of what has led to our current technologies by calling out the most important technological breakthroughs in each decade, according to this Randocity author. Some of these audio technology breakthroughs may not have been designed for the field of audio, but have impacted audio recording processes nonetheless. If you’re really interested in learning about every audio technology ever invented, sold or used within a given decade, Google is the best place to start for that level of exploration and research. Wikipedia is another good source of information.

Know then that this article will not discuss some technologies. If you think that a missing technology was important enough to have been included, please leave a comment below for consideration.

This article is broken down by decades, punctuated with deeper dives into specific topics. This sectioning is intended to make the article easier to read over multiple sittings if you’re short on time. There is intentionally no TL;DR section in this article. If you want a quick synopsis you can read in 5 minutes, this is not that article.

Finally, because of the length of this article, there may still be unintended typos, misspellings and other errors present. Randocity is continuing to comb through this article to shake out any loose grammatical problems. Please bear with us while we continue to clean it up. During this cleanup process, more information may also be added to improve clarity or as the article requires.

With that in mind, onto the roaring…

1920s

Ah, the flapper era. Let’s all reminisce over the quaint cloche hats covering tightly waved hair, the Charleston headbands adorned with a feather, the fringe dresses and the flapper era music in general, which wouldn’t be complete without including the Charleston itself.

For males, it was all about a pinstripe or grey suit with a Malone cap, straw hat or possibly a fedora. While women were burning their hair with hot irons to create that signature 1920s wave hairstyle and slipping into fringe flapper evening dresses, musicians were recording their music using, at least by today’s standards, antiquated audio equipment. At the time in the 1920s, though, that recording equipment was considered top end professional!

In the 1920s, recordings produced in a recording studio were recorded or ‘cut’ by a record cutting lathe. Hence, the use of the term “record cut”. This style of lathe recorder used a stylus which cut a continuous groove into a “master” record, usually made of wax in this era (lacquer-coated discs followed in the 1930s). The speed? 78 RPM (revolutions per minute). Prior to the 1920s, records were made via acoustic “microphones”, essentially horns, with no electricity involved, and a studio recording this way typically had just one. When electrical microphones appeared, the setup required an immense amplifier to feed the sound into the lathe recorder. The 1920s ushered in the use of electrical amplifiers, which improved sound quality, allowed for more microphones and better microphone placement, and made recordings sound more natural by improving the volume cut into the master recording. Effectively, a studio recorded the music straight onto a “cut” record, and that record would then be used as a master to mass produce shellac 78 RPM records, which were sold en masse to consumers in stores.

This also meant that there was no such thing as overdubbing. Musicians had to get the entire song down in one single take. It’s possible multiple takes were utilized to get the best possible version, but that would waste several master discs until that best take could be made.

Though audio recording processes would improve only a little from 1920 to 1929, the recording process would show much more improvement going into the 30s. A decidedly marked improvement in sound quality on records, however, would have to wait until 1948, when 33⅓ RPM records were introduced. Until then, 78 RPM shellac records would remain the steadfast, but average quality, standard for buying music.

The non-electrical recordings of the 1920s utilized only a single microphone to record the entire band, including the singer. It wouldn’t be until the mid to late 20s, with electrical recording processes using amplifiers, that a two channel mixing board became available, allowing for placement of two or more microphones connected via wire: one or more mics for the band and one for the singer.

Shellac Records

Before we jump into the 1930s, let’s take a moment to discuss the mass produced shellac 78 records, which remained popular well into the 1950s. Shellac is very brittle. Thus, dropping one of these records would result in the record shattering into small pieces. Of course, one of these records can also be intentionally broken by whacking it against something hard, like so…

[Image: DonnaReedBreak]

Shellac records fell out of vogue primarily because shellac became a scarce commodity due to World War II wartime efforts, but also because of its lack of quality when compared with vinyl records. The scarcity of shellac, combined with the rise of vinyl records, led to the demise of the shellac format by 1959.

This wartime scarcity also led to another problem: the loss of some audio recordings. People performed their civic duty by turning in their shellac 78 RPM records to help ease the shortage. Also around this time, some 78 records were made of vinyl because shellac was so hard to come by. While a noble goal, the turning in of shellac 78s to help out the war effort may have caused the loss of many audio recordings that will never be heard again.

1920s Continued

As for cinema sound, it would be the 1920s that introduced moviegoers to what would be affectionately dubbed “talkies”. Cinema sound as we know it, using an optical strip alongside the film, was not yet widely available. Synchronization of sound to motion pictures was immensely difficult and ofttimes impractical to achieve with the technologies available at the time. One system used was the sound-on-disc process, which required synchronizing a separate large phonograph disc with the projected motion picture. Unfortunately, this synchronization could be impossible to achieve reliably. The first commercial success of this sound-on-disc process, albeit with limited sound throughout, was The Jazz Singer in 1927.

Even though the sound-on-film (optical strip) process (aka Fox Film Corporation’s Movietone) would be invented during the 1920s, it wouldn’t be in wide use until the 1930s, when this audio film process became fully viable for commercial film use. The first movies released using Fox’s Movietone optical audio system were Sunrise: A Song of Two Humans (1927) and, a year later, Mother Knows Best (1928). Until optical sound became much more widely and easily accessible to filmmakers, most filmmakers in the 1920s utilized sound-on-disc (phonographs) to synchronize their sound separately with the film. Only the Fox Film Corporation, at that time, had access to the Movietone film process (due to William Fox having purchased the patents in 1926). Even then, most of the audio included in the two films above consisted of music and sound effects, with very limited voice acting.

Regardless of the clumsy, low quality and usually unworkable processes for providing motion picture sound in the 1920s, this decade was immensely important in ushering sound into the motion picture industry. If anything, the 1920s (and William Fox) proved that sound would become fundamental to the motion picture experience.

Commercial radio broadcasting also began during this era. In November of 1920, KDKA began broadcasting its first radio programs. This began the era of commercial radio broadcasting that we know today. With this broadcast, radio broadcasters needed ways to attenuate the signal to accommodate the broadcast frequency bandwidth requirements.

Thus, additional technologies would be both needed and created during the 1930s, such as audio compression and limiting. These compression technologies were designed for the sole purpose of keeping audio volumes strictly within radio broadcast specifications, to prevent overloading the radio transmitter, thus giving the listener a better audio experience when using their new radio equipment. On a related note, RCA apparently held most of the patents for AM radio broadcasting during this time.
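To make the idea of compression and limiting a little more concrete, here is a minimal Python sketch of the principle. The broadcast limiters of the 1930s were analog circuits, not software, so this is only an illustration. The function names, the threshold, the ratio and the ceiling values are all assumptions chosen for the example, and samples are assumed to be plain numbers between -1.0 and 1.0.

```python
def limit(samples, ceiling=0.8):
    """Hard limiter: clamp every sample into the range [-ceiling, +ceiling]."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def compress(samples, threshold=0.5, ratio=4.0):
    """Static compressor: shrink the part of each sample above the threshold.

    With ratio=4.0, every 4 units of level above the threshold come out as 1.
    The threshold and ratio here are illustrative, not historical values.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        # Restore the original sign of the sample.
        out.append(level if s >= 0 else -level)
    return out


if __name__ == "__main__":
    loud_passage = [0.2, 0.9, -1.5, 0.4]
    print(limit(loud_passage))
    print(compress(loud_passage))
```

The limiter simply refuses to let anything past the ceiling, which is the "don't overload the transmitter" job. The compressor is gentler: it turns loud passages down in proportion, which keeps volumes more consistent for the listener.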

1930s

By the 1930s, women’s fashion had become more sensible, less ostentatious, less flamboyant and more down-to-earth. Gone are the cloche hats and heavy eye makeup. This decade shares a lot of its women’s fashion sensibilities with 1950s dress.

Continuing with the sound-on-film discussion from the 1920s, William Fox’s investment in Movietone would only prove useful until 1931, when Western Electric introduced a light-valve optical recorder which superseded Fox’s Movietone process. This Western Electric process would become the optical film process of choice, utilized by filmmakers throughout 1930s film features and into the future, even though Western Electric’s recording equipment design proved to be bulky and heavy. Fox’s Movietone optical process would remain in use for producing Movietone news reels until 1939 due to the better portability of its sound equipment, thanks in part to Western Electric’s over-engineering of its light-valve’s unnecessarily heavy case designs.

As for commercial audio recording throughout the 1930s, recording processes hadn’t changed drastically from the 20s, except that new equipment was introduced to aid in recording music better, including better microphones. While the amplifiers got a little smaller, microphone quality improved along with the use of multi-channel mixing boards. These boards were introduced so that instead of recording only one microphone, many microphones (as many as six) could record an orchestra, including a singer, mixed down into one monophonic input for the lathe recorder. This change allowed for better, more accurate, more controlled sound recording and reproduction. However, recording to the lathe stylus recorder was still the main way to record, even though audio tape recording equipment was beginning to appear, such as the AEG/Telefunken Magnetophon K1 (1935).

At this time, RCA produced its first uni-directional ribbon microphone, the large 77A (1932), though there is some discrepancy on the exact date the 77A was introduced. It was big and bulky, but it became an instant favorite and a workhorse in many studios. In 1933, RCA introduced a smaller successor to the 77A, the RCA 44A, which was a bi-directional microphone. The model 77 would go on to see the release of the 77B, C, D and DX. The two latter 77 series microphones wouldn’t see release until the mid-40s, after having been redesigned to about the size of the 44.

There would be three 44 series models released: the 44A (1933), 44B (1936) and 44BX (1938). These figure 8 pattern bi-directional ribbon microphones also became the workhorse mics used in most of the recording and broadcast industries in the United States, ultimately replacing the 77A throughout the 30s. These microphones were, in fact, so popular that some are still in use today and can be found on eBay. There’s even an RCA 44A replica being produced today by AEA. Unfortunately, RCA discontinued manufacture of the 44 microphone series in 1955. RCA would discontinue producing microphones altogether in 1977. Ironically, RCA’s last model released was the model 77 in 1977. The 44A sported an audio pickup range from 50 Hz to 15,000 Hz… an impressive frequency range, even though the lathe recording system could not record or reproduce that full range.

A mixing board, when combined with several new workhorse 44A mics, allowed a sound engineer to bring certain channel volumes up and other volumes down. Using a mixing board allowed vocalists to be brought front and center in the recording, not drowned out by the band… with sound leveled on the fly during recording by the engineer’s hand and a pair of monitor headphones or speakers.
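As a rough illustration of what such a mixdown does, here is a minimal Python sketch that sums several microphone channels into one mono channel, with one gain setting per channel standing in for the engineer’s faders. The function name `mix_to_mono`, the list-of-numbers representation and the gain values are assumptions made for this example. A 1930s board did this with analog circuitry, not code.

```python
def mix_to_mono(channels, gains):
    """Sum equal-length microphone channels into one mono channel.

    channels: list of lists of sample values, one list per microphone.
    gains:    one multiplier per channel (the "fader" position).
    """
    if len(channels) != len(gains):
        raise ValueError("need exactly one gain per channel")
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must be the same length")

    mono = []
    for i in range(length):
        # Each output sample is the gain-weighted sum of every mic at that instant.
        mono.append(sum(g * ch[i] for g, ch in zip(gains, channels)))
    return mono


if __name__ == "__main__":
    vocal_mic = [1.0, 0.0]
    band_mic = [0.5, 0.5]
    # Vocal at full level, band pulled down to half so it doesn't drown out the singer.
    print(mix_to_mono([vocal_mic, band_mic], gains=[1.0, 0.5]))
```

Raising or lowering a single entry in `gains` is the software equivalent of bringing one channel up or down while everything still ends up in a single monophonic signal.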

One microphone problem during the 20s was that microphones were primarily omni-directional. This meant that noise would be picked up from anywhere around the microphone. It also meant that in recording situations, everything had to remain entirely silent during the recording process, except for the sound being recorded. By 1935, Siemens and RCA had introduced various cardioid microphones to attempt to solve the problem of extraneous side noise. These directional microphones picked up audio mainly from in front of the microphone (and, for bi-directional models, from behind it), rejecting sounds outside of the microphone’s pickup pattern. This improvement was important when recording audio for film on location. You can’t exactly stop car honking, tires squealing and general city noises during a take. The solution was the uni-directional microphone, introduced around 1935.

Most recording studios at the time relied on heavy wall-mounted gear that wasn’t at all easy to transport. This meant recording had to be done in the confines of a studio using fixed equipment. This portability need led to the release of this 1938 Western Electric 22D model mixer, which had 4 microphone inputs and a master gain output. It sported a lighted VU meter and it could be mounted in a portable carrying case or possibly in a rack. This unit even sported a battery pack! In 1938, this unit was primarily used for radio broadcast recording, but this or similar portable models were also used when recording on-location audio for films or news reels at the time.

[Image: Western Electric 22D mixer]

In the studio, larger channel versions were also utilized to allow for more microphone placement, but still mixing down into a single monophonic channel. Such studios typically used up to 6 microphones, though amplifiers sometimes added hiss and noise, which might be audibly detectable if too many were strung together. There was also the possibility of phase problems if too many microphones were utilized. The resulting output recording would be monophonic for the mass produced shellac 78 RPM records, for radio broadcast or for movies shown in a theater.

Here are more details about this portable Western Electric 22D mixing board…

[Image: Western Electric 22D schematic]

Lathe Recording

Unfortunately, electric current during this time was still considered too unreliable and could cause audio “wobble” if it was used to power the turntable during recording. In some cases, lathe recorders used a heavy counterweight suspended from the ceiling. The weight would slowly drop to the floor at a specified rate, powering the rotation of the lathe turntable and ensuring a continuous rotation speed. This weight system provided the lathe with a stable cut from the beginning to the end of the recording, unaffected by potentially unstable electrical power. Electrical power was used for amplification purposes, but not always for driving the turntable rotation while “cutting” the recording. Spring based or wound mechanisms may have also been used.

1930s Continued

All things considered, this Western Electric portable 4 channel mixer was pretty far ahead of the curve. Audio innovations like this from the 1930s led us directly into the 60s and 70s era of recording. Of course, this portability was likely being driven both by broadcasters, who wanted to record audio on location, and by the movie industry, which needed to record audio on location while filming. The person tasked with using this equipment, though, had to lug around 60 lbs of equipment, 30 lbs on each shoulder.

Additionally, during the 1930s and specifically in 1934, Bell Labs began experimenting with stereo (binaural) recordings in their audio labs. Here’s an early stereo recording from 1934.

Note that even though this recording was made in stereo in 1934, the first commercially produced stereo / binaural record wouldn’t hit the stores until 1957. Throughout the 1930s, monophonic / monaural 78 RPM records remained the primary standard for purchasing music.

For home recording (and even for some professional situations) in the 1930s, there were options. Presto created various models of home recording lathes. Some of Presto’s portable models include the 6D, D, M and J models, which were introduced between 1932 and 1937. The K8 model was introduced later, around 1941. Some of these recorders can still be found on the secondary market today in working order. These units required specialty blank records in 6″, 8″ and 10″ sizes, which sported 4 holes in the center. This home recording lathe system recorded at either 78 or 33⅓ RPM. In 1934, these recorder lathes cost around $400, equivalent to well over $2,000 today. By 1941, the price of the recorders had dropped to between $75 and $200. The blanks cost around $16 per disc during the 30s. That $16 then is equivalent to around $290 today. Just think about recording random noises on a $290 blank disc. Expensive.
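For readers who want to play with these numbers, here is a small Python sketch of the arithmetic. It uses only the figures quoted above: the $16 blank and its roughly $290 present-day equivalent imply a multiplier of about 18, and that same multiplier can be applied to the other prices. The multiplier is simply the ratio implied by this article’s own figures, not an official inflation index, and the function names are made up for the example.

```python
# Figures quoted in the article.
BLANK_PRICE_1930S = 16.0    # dollars per blank disc in the 30s
BLANK_PRICE_TODAY = 290.0   # the article's present-day equivalent


def implied_multiplier(then=BLANK_PRICE_1930S, now=BLANK_PRICE_TODAY):
    """Ratio between the present-day and 1930s price (an assumption, not a CPI value)."""
    return now / then


def scale_price(price_then, multiplier):
    """Convert a 1930s price to a rough present-day figure using the multiplier."""
    return price_then * multiplier


if __name__ == "__main__":
    m = implied_multiplier()
    print(f"implied multiplier: {m:.3f}")
    print(f"$400 recorder (1934): about ${scale_price(400, m):,.0f} today")
    print(f"$75 to $200 recorders (1941): about ${scale_price(75, m):,.0f} "
          f"to ${scale_price(200, m):,.0f} today")
```

By that ratio, the $400 recorder lands in the neighborhood of $7,250, which is consistent with “well over $2,000”. Applying a 1930s ratio to the 1941 prices is a further simplification, so treat those results as ballpark figures only.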

Finally, it is worth discussing Walt Disney’s contribution to audio recording during the late 30s and into 1940. Fantasia (produced in 1939, released in 1940) was the first movie to sport a full stereo soundtrack. This was achieved through the use of a 9 track optical recorder. These 9 optical tracks were mixed down to 4 optical tracks for use when presenting the audio in a theater. Optical audio recording and playback is the method a sprocket film projector uses to play back audio through theater sound equipment (see 1920s above), and it was the norm prior to the introduction of magnetic analog audio and, later, digital audio in the 90s. An audio track physically runs alongside the 35mm or 70mm film imagery, vertically throughout the entire length of the film. The audio track is run through a separate audio decoder and amplifier at the same time the projector is flipping images.

To operate the Fantasia film with stereo in 1940, a theater would need two projectors running simultaneously. The first projector ran the visual film image and that film also contained one mono optical audio track (for backup or for theaters running Fantasia only in mono). The second “stereo” projector ran four (4) optical tracks: the left, right and center audio tracks (technically, a 3.0 sound system) plus a fourth track which managed an automated gain control to allow for fades as well as volume increase and decrease in the audio. This system was dubbed Fantasound by Disney. Note that Fantasound apparently employed an RCA compression system to make the audio sound better and keep the audio volumes consistent (not too loud, not too low) while watching Fantasia. At a time when shellac recordings were common, seeing a full color and stereo feature in the theater would have been a technical marvel.

Disney’s Fantasia vs Wizard of Oz

It is worth pausing here to discuss the technical achievement of both Walt Disney and MGM in sound recording and reproduction. Walt Disney contributed greatly to the advancement of theatrical film audio quality and stereo films. Fantasia (produced in 1939, released in 1940) was the first movie to sport a full stereo soundtrack in a theater. This was achieved through the use of a 9 track optical recorder when recording the original music soundtrack. These 9 optical tracks were then mixed down to 4 optical tracks for use when presenting the audio in a theater. According to Wikipedia, the Fantasia orchestra was outfitted with 36 microphones, and these 36 mics were condensed down into the aforementioned 9 (fewer, actually) optical audio tracks when recorded. One of these 9 tracks was a click track for animators to use when producing their animations.

To explain optical audio a bit more, optical audio recording and playback is the method a sprocket film projector uses to play back audio through theater sound equipment. This optical audio system remained in use until the introduction of digital audio in the 90s. Physically running alongside the 35mm or 70mm film reel imagery, there is an optical, but analog, audio track that runs vertically throughout the entire length of the film. There have been many formats for this optical track. The audio track is run through a separate analog audio decoder and amplifier at the same time the projector is flipping through images.

To present Fantasia in stereo in 1940, a theater would need two projectors running simultaneously along with appropriate left, right and center speakers hidden behind the screen, speaker amplifiers and, in the case of Fantasound, speakers mounted in the back of the theater. The first projector would present the visual film image on the screen, while that film reel also contained one mono optical audio track (used for backup purposes or for theaters running the film only in mono). The second “stereo” projector ran four (4) optical tracks: the left, right and center audio tracks (likely the earliest 3.0 sound system) plus a fourth track which managed an automated gain control to allow for fades as well as automated audio volume increase and decrease. This stereo system was dubbed Fantasound by Disney. At a time when mono shellac recordings were common in the home, seeing a full color and stereo motion picture in the theater in November of 1940 would have been a technical marvel.
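To give a feel for what a gain control track does, here is a minimal Python sketch in which a fourth “control” list of gain values scales the left, center and right channels sample by sample. This is only a conceptual illustration. The article does not describe how Fantasound actually encoded its control information on film, so the per-sample gain numbers, the single shared gain for all three channels and both function names are assumptions made for this example.

```python
def linear_fade(n, start=1.0, end=0.0):
    """Build n gain values moving in a straight line from start to end."""
    if n == 1:
        return [start]
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]


def apply_control_track(left, center, right, gain):
    """Scale three equal-length channels by a shared per-sample gain value."""
    n = len(gain)
    if not (len(left) == len(center) == len(right) == n):
        raise ValueError("all channels and the gain track must be the same length")

    def scale(channel):
        return [sample * g for sample, g in zip(channel, gain)]

    return scale(left), scale(center), scale(right)


if __name__ == "__main__":
    fade_out = linear_fade(3)  # full volume down to silence over 3 samples
    left, center, right = apply_control_track(
        [1.0, 1.0, 1.0], [0.5, 0.5, 0.5], [0.0, 0.0, 0.0], fade_out
    )
    print(left, center, right)
```

The point of the design is that the audio tracks themselves never change. All the fades and volume moves live in the separate control data, which is what makes the volume automation repeatable from one screening to the next.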

Let’s pause here to savor this incredible Disney cinema sound innovation moment. Consider that it’s 1940, just barely out of the 30s. Stereo isn’t even a glimmer in the eye of record labels as yet, and Walt Disney is outfitting theaters with what would basically be considered a modern (as in 1970s or newer) multichannel theater sound system. Though Cinerama, a 7 channel audio standard, would land in theaters as early as 1952 featuring the documentary film This Is Cinerama, it wouldn’t be until 1962’s How The West Was Won that theater goers actually got a full scripted feature film using Cinerama’s 7 channel sound system. In fact, Disney’s Fantasound basically morphed into what would become Cinerama, which used multiple synchronized projectors like Fantasound (three, in Cinerama’s case), though Cinerama used multiple projectors for a different reason than multichannel sound.

Cinerama also gave pause to sound recording for film. It made filmmakers invest in more equipment and microphones to ensure that all 7 channels were recorded so that Cinerama could be used. Clearly, even though the technology was available for use in cinemas, filmmakers didn’t exactly embrace this new audio technology as readily as theater owners were willing to install it. Basically, it wouldn’t be until the 60s and on into the 70s that Cinerama and the later THX and Dolby sound systems came into common use in cinemas. Disney ushered in the idea of stereo in theaters in the 40s, but it took nearly 30 years for the entire film industry to embrace it, including easier and cheaper ways to achieve it.

Disney’s optical automated volume gain control track foreshadows Disney’s use of animatronics in its own theme parks beginning in the 1960s. Even though Disney’s animatronics use a completely different mechanism of control, the use of an optical track to control automation of the soundtrack’s volume in 1939 was (and still is) completely ingenious. This entire optical stereo system, at a time when theaters were still running monophonic motion pictures, was likewise quite ingenious (and expensive).

Unfortunately, Fantasia’s stereo run in theaters would be short, with only 11 roadshow runs using the Fantasound optical stereo system. The installation of Fantasound required a huge amount of equipment, including installation of amplifiers, speakers behind the screen and speakers in the back of the theater. In short, it required the equipment that modern stereo theaters require today. See the link just above for more details on this.

Consider also that The Wizard of Oz, which was released in 1939 by MGM, was also considered a technical marvel for its Technicolor process, but this musical film was released to theaters in mono. The film’s production did, however, record most, if not all, of its audio on a multitrack recorder during filming, which occurred between 1938 and 1939. It wouldn’t be until 1998 that The Wizard of Oz’s original 1939 multitrack audio was restored and remastered in stereo, finally giving The Wizard of Oz its full stereo soundtrack from its original 1930s on-set multitrack recordings.

Here’s Judy Garland singing Over the Rainbow from the original multitrack masters recorded in 1939, even though this song wasn’t released in stereo until 1998 after this film’s theatrical re-release. Note, I would embed the YouTube video inside this article, but this YouTube channel owner doesn’t allow for embedding. You’ll need to click through to listen.

As a side note, it wouldn’t be until the 1950s that stereo became commonplace in theaters and until the late 50s that stereo records became available for home use. In 1939, we’re many years away from stereo audio standards. It’s amazing then that, between 1938 and 1939, MGM had the foresight to record this film’s audio using a multitrack recorder during filming sessions, in addition to MGM’s choice of employing those spectacular Technicolor sequences.

1940s

In addition to Disney’s Fantasia, the 1940s were punctuated by World War II (1939-1945), the Holocaust (1933-1945) and the atomic bomb (1945). The frugality of the Great Depression, which began in 1929 and lasted through the late 1930s, carried into the 1940s, in part because of leftover anxieties from the Depression, but also because of the wartime necessity to ration certain items including sugar, tires, gasoline, meat, coffee, butter, canned goods and shoes. This rationing, along with wartime surges in the prices of some consumer goods, led housewives to be much more frugal in other areas of life, including hairstyles and dress.

Thus, this frugality influenced fashion and also impacted sound recording equipment manufacturing, most likely due to the early 1940s wartime efforts requiring manufacturers to convert to making wartime equipment instead of consumer goods. While RCA continued to manufacture microphones in the mid 40s (mostly after the war), a number of other manufacturers also jumped into the fray. Some microphone manufacturers targeted ham operators, while others created equipment targeted at “home recordists” (sic). These consumer microphones were fairly costly at the time, equivalent to hundreds of dollars today.

Some 1940s microphones sported a slider switch which allowed moving the microphone from uni-directional to bi-directional to omni-directional. This meant that the microphone could be used in a wide array of applications. For example, both RCA’s MI-6203-A and MI-6204-A microphones (both released in 1945) offered a slider switch to move between the 3 different pickup types. Earlier microphones, like RCA’s 44A, required opening up the microphone down to the main board and moving a “jumper” to various positions, if this change could be performed at all. Performing this change was inconvenient and meant extra setup time. Thus, the slider in the MI-6203 and MI-6204 made performing this change much easier and quicker. See, it’s the small innovations!

During the 1940s, both ASCAP and, later, BMI (music royalty services, aka performing rights organizations or PROs) changed the face of music. In the 1930s, most music played on broadcast programs had been performed by a live studio orchestra, employing many musicians. During the 1940s, this began to change. As sound reproduction improved, better quality sound recordings led broadcasters to use prerecorded music over live bands during broadcast segments.

This put a lot of musicians out of work, musicians who would have otherwise continued gainful employment with a radio program. ASCAP (established in 1914 as a PRO) tripled its royalties for broadcasters in January of 1941 to help out these musicians. In retaliation for these higher royalty costs, broadcasters dropped ASCAP music from their broadcasts, instead choosing public domain music and, at the time, unlicensed music (country, R&B and Latin). Already disenchanted by ASCAP’s doubled fees in 1939, broadcasters had created their own PRO that year, BMI (an acronym for Broadcast Music Incorporated). This meant that music placed under the BMI royalty catalog would either be free to broadcasters or supplied at a much lower cost than music licensed by ASCAP.

This tripling of fees in 1941, and the subsequent dropping of ASCAP’s catalog by broadcasters, put a hefty dent in ASCAP’s (and its artists’) bottom line. By October of 1941, ASCAP had reversed its tripled royalty requirement. During this several month period in 1941, ASCAP’s higher fees helped to popularize genres of music which were not only free to broadcasters, but were now being introduced to new listeners unfamiliar with them. Thus, musical genres which typically did not get much air play before, including country, R&B and Latin music, saw major growth in popularity during this time via radio broadcasters.

The genre popularity growth is partly responsible for the rise of Latin artists like Desi Arnaz and Carmen Miranda throughout the 1940s.

By 1945, many recording studios had converted away from using the lathe stylus recording turntables and began using magnetic tape to prerecord music and other audio. The lathe turntables were still used to create the final 78 RPM disc master from the audio tape master for commercial record purposes. However, broadcasters didn’t need this when using reel to reel tape for playback.

Reel to reel tape also allowed for better fidelity and better broadcast playback over those noisy 78 RPM shellac records at the time. It also cost less to use because a single reel of tape could be recorded over again and again. With tape, there is also less hiss and way less background noise, making for a more professional listening and playback experience in a broadcast or film use. Listeners couldn’t tell the difference between the live radio segments and prerecorded musical segments.

Magnetic recording and playback would also give rise to better sounding commercials, though commercial jingle producers did record commercials using 78 RPM discs during that era. From 1945 until about 1982, recordings would be produced almost exclusively using magnetic tape… a small preview of things to come.

While the very first vinyl record was released in 1930, this new vinyl format wouldn’t actually become viable as a prerecorded commercial product until 1948, when Columbia introduced its first 12″ 33⅓ RPM microgroove vinyl long playing (LP) record. CBS / Columbia was aided in producing this new format by the aforementioned Presto company who helped CBS develop the vinyl format. Considering Presto’s involvement with and innovation of its own line of lathe recorders, Columbia leaning on Presto was only natural. This Columbia LP format would eventually replace the shellac 78 RPMs in short order.

At around 23 minutes per side, the vinyl LP afforded a musical artist about 46 minutes of recording time. This format quickly became the standard for releasing new music, not only because of the format’s ~46 minutes of running time, but also because it offered way less surface noise than shellac 78s. Vinyl records were also slightly less brittle than shellac records, giving them a bit more durability.

By 1949, RCA had introduced a 7″ 45 RPM microgroove vinyl format intended for use with individual (single) songs… holding around 4-6 minutes per side. These vinyl records at the time were still all monaural / monophonic. Stereo wouldn’t become available and popular until the next decade.

Note that Presto Recording Corporation continued to release and sell portable, professional and home lathe recorders during the 1940s and on into the 50s and 60s. Unfortunately, the Presto Recording Corporation closed its doors in 1965.

1950s

By the 1950s, some big audio changes were in store; changes that Walt Disney helped usher in with Fantasia in 1940. Long past the World War II weary 1940s and the Great Depression ravaged 1930s, the 1950s had become a new prosperous era in American life. Along with this new prosperous time, fashion rebounded and so too did the musical recording industry and the movie theater industry. So too did musical artists, who now began focusing on a new type of music, rock and roll. Capturing this new musical genre required some changes in recording.

Because the late 1940s and early 1950s ushered in the new filmed musical format, many in Technicolor (and one in stereo in 1940), this led to audio advancements in theaters. Stereo radio broadcasts wouldn’t be heard until the 60s and stereo TV broadcasts wouldn’t begin until the early 80s, but stereo would become commonplace in theaters during the 1950s, particularly due to these musical features and the pressures placed on cinema by television.

Musical films like Guys and Dolls (1955) were released in stereo along with earlier releases like Thunder Bay (1953) and House of Wax (1953). Though, it seems that some big musicals, like Marilyn Monroe’s Gentlemen Prefer Blondes (1953), were not released in stereo.

This means that stereo film recording for some films in the early 50s was haphazard and depended entirely on the film’s production. Apparently, not all film producers placed value in having stereo soundtracks for their respective films. Blockbuster films, many including Marilyn Monroe, didn’t include stereo soundtracks. However, lower budget horror and suspense films did include them, probably to entice moviegoers in for the experience.

By 1957, the first stereo LP album was released, which ushered in the stereophonic LP era. Additionally, by the late 1950s, most film producers began to see the value in recording stereo soundtracks for their films. No longer was it in vogue to produce mono soundtracks for films. At this point, producers choosing to employ mono soundtracks, like Woody Allen, did so out of personal choice and artistic merit.

Here’s a vinyl monophonic version of Frank Sinatra’s Too Marvelous for Words recorded for his 1956 album Songs for Swingin’ Lovers. Notice the telltale pops and clicks of vinyl. Even the newest untouched vinyl straight from the store still had a certain amount of surface noise and pops. Songs for Swingin’ Lovers was released one year before stereo made its vinyl debut. Though, Sinatra would still release a mono album in 1958, his last monophonic release, entitled Only the Lonely. Sinatra may have begun recording the Only the Lonely album in late 1956 using monophonic recording equipment and likely didn’t want to release portions of the album in mono and portions in stereo. Though, he could have done this by making side 1 mono and side 2 stereo. This gimmick would have made a great introduction to the stereo format for his fans and likely helped to sell even more copies.

This song is included to show the recording techniques being used during the 1950s and what vinyl typically sounded like.

Cinerama

In the 1950s, cinema had the most impact on audio reproduction and recording. Disney’s 1940 Fantasound process led to the design of Cinerama, a more simplified design by Fred Waller modified from his previous ambitious multi-projector installations. Waller had been instrumental in creating training films for the military using multi-projector systems.

However, in addition to the 3 separate, but synchronized images projected by Cinerama, the audio was also significantly changed. Like Disney’s Fantasound, Cinerama offered up multichannel audio, but in the form of 7 channels, not 4 like Fantasound. Cinerama’s audio system design was likely what led to the modern versions using DTS, Dolby Digital and SDDS sound. Cinerama, however, wasn’t intended to be primarily about the sound, but about the picture on the screen. Cinerama was intended to provide 3 projected images across a curved screen and provide that curved widescreen imagery seamlessly (a tall order and it didn’t always work properly). Cinerama was only marginally intended to be about the 7 channel audio. The audio was important to the visual experience, but not as important as the 3 projectors driving the imagery across that curved screen.

Waller’s goal was to discard the old single projector ideology and replace it with a multi-projector system akin to having peripheral vision. The lenses used to capture the film images were intended to be nearly the same focal length as the human eye in an attempt to be as visually accurate as possible and give the viewer an experience as though they were actually there, though the images were still flat, not 3D.

While Waller’s intent was to create a ground breaking projection system, the audio system employed is what withstood the test of time and what drove change in the movie and cinema sound industries. Unlike Fantasound, which used two projectors, one for visuals and one for 4 channel sound using optical tracks, Cinerama’s sound system used a magnetic strip which ran the length of the film. This magnetic strip held 6 channels of audio with the 7th channel provided by the mono optical strip. Because Cinerama had 3 simultaneous projectors running, each able to carry 7 channels, the Cinerama system could have actually supported 21 channels of audio information.

However, Cinerama settled on 7 audio channels, likely provided by the center projector. Though, information about exactly which of the three projectors provided the 7 channels of audio output is unclear. It’s also entirely possible that all 3 film reels held identical audio content for backup purposes. If one projector’s audio dies, one of the other two projectors could be used. The speaker layout for the 7 channels was five speakers behind the screen (surround left, left, center, right, surround right), two speaker channels on the walls (left and right or whatever channels the engineer feeds) and two channels in the back of the theater (again whatever the engineer feeds). There may have been more speakers than just two on the walls and in the rear, but two channels were fed to these speakers. The audio arrangement was managed manually by a sound engineer who would move the audio around the room live while the performance was running to enhance the effect and provide surround sound features. The 7 channels were likely as follows:

Left
Right
Center
Surround Left
Surround Right
Fill Left
Fill Right

Fill channels could be ambient audio like ocean noises, birds, trees rustling, etc. These ambient noises would be separately recorded and then mixed in at the appropriate time during the performance to bring more of a sense of realism to the visuals. Likely, the vast majority of the time, the speakers would provide the first 5 channels of audio. I don’t believe that this 7 channel audio system supported a subwoofer. Subwoofers would arrive in theaters later as part of the Sensurround system in the mid 1970s. Audio systems used in Cinerama would definitely influence later audio systems like Sensurround.

The real problem with Cinerama wasn’t its sound system. It was, in fact, its projector system. The 3 synchronized projectors projected separately filmed, but synchronized visual sequences. As a result, the three projected images overlapped each other by a tiny bit. Because of this overlap, the projectors played tricks to keep that line of overlap as unnoticeable as possible. While it mostly worked, the fact that 3 cameras were used that weren’t 100% perfectly aligned when filming led to problems with certain imagery on the screen. In short, Cinerama was a bear to use as a cinematographer. Very few film projects wanted to use the system because scenes were difficult to film and it was even more difficult to make sure each scene appeared proper when projected. Thus, Cinerama wasn’t widely adopted by filmmakers or theater owners. Though, the multichannel sound system was definitely something that filmmakers were interested in using.

Ramifications of Television on Cinema

As a result of the introduction of NTSC Television in 1941 and because of TV’s wide and rapid adoption by the early 1950s, the cinema industry tried all manner of gimmicky ideas to get people back into cinema seats. These gimmicks included, for example, Cinerama. Other in-cinema gimmicks included 3D glasses, smell-o-vision, mechanical skeletons, rumbling seats, multichannel stereo audio and even simple tricks like Cinemascope… which used anamorphic lenses to increase the width of the image instead of requiring multiple projectors to provide that width. The 50s were an era of endless trial and error cinema gimmicks in an effort to get people back into the cinema. None of these gimmicks furthered audio recording much, however.

Transition between Mono and Stereo LPs

During the 1960s, stereophonic sound would become an important asset to the recording industry. Many albums had the words “Stereo”, “Stereophonic” or “In Stereophonic Sound” plastered in large type across parts of the album cover. Even the Beatles fell into this trap with a few of their albums. However, this marketing lingo was actually important at the time.

During the late 50s and into the early 60s, many albums were dual released, both as a monophonic release and as a separate stereophonic release. These words across the front of the album were intended to tell the consumer which version they were buying. This marketing text was only needed while the industry kept releasing both versions to stores. And yes, even though the words do appear prominently on the cover, some people didn’t understand and still bought the wrong version.

Thankfully, this mono vs stereo ambiguity didn’t last very long in stores. By the mid-1960s nearly every album released had converted to stereo, with very few being released in mono. By the 70s, no more mono recordings were being produced, except when an artist chose to make the recording mono for artistic purposes.

No longer was the consumer left wondering if they had bought the correct version, that is until 1976’s quadrophonic releases began… but that discussion is for the 70s. During the late 50s and early 60s, some artists were still recording in mono and some artists were recording in stereo. However, because many consumers didn’t yet own stereo players, record labels continued to release mono versions for consumers with monophonic equipment. It was assumed that stereo records wouldn’t play correctly on mono equipment, even though they played fine. Eventually, labels wised up and recorded the music in stereo, but mixed down to mono for some of the last monophonic releases… eventually abandoning monophonic releases altogether.

1960s

By the 1960s, big box recording studios were becoming the norm for recording bands like The Beatles, The Rolling Stones, The Beach Boys, The Who and vocalists like Barbra Streisand. These new studios were required to produce the professional and pristine stereo soundtracks on vinyl. This required heavy use of multitrack mixing boards. Here’s how RCA Studio B’s recording board looked when used in 1957. Most state of the art studios, at the time, would have used a mixing board similar to this one. The black and white picture shown on the wall behind this multitrack console depicts a 3 track mixing board, likely in use prior to the installation of this board.

Photo by cliff1066 under CC BY 2.0

RCA Studio B became a sort of standard for studio recording and mixing throughout the early to mid 1960s and even into the late 1970s. While this board may accept many input channels, the resulting master recording may have held as few as two tracks or as many as eight through the mid-60s. It wouldn’t be until the late 60s that magnetic tape technologies would improve to allow recording 16 channels and then later 24 channels by the 1970s.

Note that many modern mixing boards in use today resemble the layout and functionality of this 1957 RCA produced board, but these newer systems support more channels as well as effects.

Microphones of the 1960s were also majorly improved once again. No longer were microphones simply utilitarian; now they were being sold for luxury sound purposes. For example, Barbra Streisand almost exclusively recorded with the Neumann M49 microphone (called the Cadillac of Microphones, with a price tag to match) throughout the early 60s. In fact, this microphone became her staple. Whenever she recorded, she always requested a very specific serial number for her Neumann M49 from the rental service. She felt that this microphone specifically made her voice sound great.

However, part of the recording process was not just the microphone that Barbra Streisand used. It was also the recording equipment that Columbia owned at the time. While RCA’s studios made great sounding records, Columbia’s recording system was well beyond that. Barbra’s recordings from the 60s sound like they could have been recorded today on digital equipment. To some extent, that’s partially true. Barbra’s original 1960s recordings have been cleaned up and restored digitally. However, you have to have an excellent product from which to start to make it sound even better.

Columbia’s recordings of Barbra in the 60s were truly exceptional. These recordings were always crystal clear. Yes, the clarity is attributable to the microphone, but also to Columbia’s high quality recording equipment, which was leaps and bounds ahead of other studios at the time. Not all recording systems were as good as what Columbia used, as evidenced by the soundtrack to the film Hello Dolly (1969) which Barbra recorded for 20th Century Fox. This recording is more midrangy, less warm and not nearly as clear as the recordings Barbra made for Columbia Records.

There were obviously still pockets of less-than-stellar recording studios recording inferior material for film and television, even going into the 1970s.

Cassettes and 8-Tracks

During the early 1960s, specifically in 1963, a new audio format was introduced in the Compact Cassette, otherwise known simply as a cassette tape. The cassette tape would go on to rival the vinyl record and have a commercial life of its own, and it is still in diminished use to this day. Because the cassette didn’t rely on a moving stylus, there were far fewer constraints on the bass that could be laid down onto it. This meant that cassettes ultimately had better sonic capabilities than vinyl.

In 1965, the 8-track or Stereo 8 format was introduced, which initially became extremely popular for use inside of vehicles. Eventually, though, the cassette tape and later the multi changer CD would replace 8-track systems in car stereos. Today, CarPlay and similar Bluetooth systems are the norm.

The Stereo 8 Cartridge was created in 1964 by a consortium led by Bill Lear, of Lear Jet Corporation, along with Ampex, Ford Motor Company, General Motors, Motorola, and RCA Victor Records (RCA – Radio Corporation of America).

Quote Source: Wikipedia

1970s

The 1970s were punctuated by mod clothing, bell bottom jeans, Farrah Fawcett feathered hair, drive-in movies and leisure suits. Coming out of the psychedelic 1960s, these bright vibrant colors and polyester knits led to some rather colorful, but dated rock and roll music (and outfits) to go along.

Though drive-in theaters appeared as early as the 1930s, they would ultimately see their demise in the late 1970s, primarily due to urban sprawl and the rise of malls. Even still, drive-in theaters wouldn’t have lasted into the multitrack 7.1 sound era of rapidly improving cinema sound. There is no way to reproduce such incredible surround sound within the confines of automobiles of the era, let alone today. The best that drive-in theaters offered was either a mono speaker affixed to the window or tuning into a radio station with the car radio, which might or might not offer stereo sound, usually not. The sound capabilities afforded by indoor theaters, coupled with year round air conditioning, led people indoors to watch films any time of the day and all year round rather than watching movies in their cars only at night and when weather permitted. Brutally cold winters, in particular, don’t work well for drive-in theater viewing.

By the 1970s, sound recording engineers were also beginning to overcome the surface noise and sonic limitations of stereo vinyl records, making stereo records sound much better. During this era, audiophiles were born. Audiophiles are people who always want the best audio equipment to make their vinyl records sound their absolute best. To that end, audio engineers pushed vinyl’s capabilities to its limits. Because diamond needles must travel through a groove to play back audio, if the audio gained too much thumping bass or volume, it could jump the needle out of its track and cause skipping.

To avoid this turntable skipping problem, audio engineers had to tune down the bass and volume when mastering for vinyl. While audio engineers could create two masters, one for vinyl and one for cassette, that almost never happened. Most sound engineers were tasked to create one single audio master for a musical artist and that master was strictly geared towards vinyl. This meant that a prerecorded cassette got the same audio master as the vinyl record, instead of a unique master created for the dynamic range available on a cassette.

Additionally, cassettes came in various formulations, from ferric oxide to metal (Type I to Type IV). There were eventually four different cassette tape formulations available to consumers, all of which commercial producers could also use for commercial duplication. However, most commercial producers opted to use Type I or Type II cassettes (the least costly formulations available). These were also available all the way through the 1970s. Type IV was metal and could produce the best sound available due to its tape formulation, but didn’t arrive until late in the 1970s.

8-tracks could be recorded, but there was essentially only one tape formulation. 8-track recorders began appearing in the 1970s for home use. It was difficult to record an 8-track tape and sometimes more difficult to find blanks. Because each tape program was limited in length, you had to make sure a song didn’t span the gap from one program to the next or else you’d have a jarring audio experience. With audio cassettes, this was a bit easier to avoid. Because 8-tracks had 4 stereo programs, each of the 4 stereo program segments is fairly short. With an entire 8-track tape holding 80 minutes, that works out to 20 minutes per stereo program. It ends up more complicated for the home consumer to divide their music up into four 20 minute segments than it is to manage a 90 minute cassette with 45 minutes on each side.
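To make that juggling act concrete, here is a minimal Python sketch of the planning a home recordist would have had to do by hand. It assumes the figures above (4 programs of 20 minutes each) and keeps the songs in album order; the function name and the song lengths are made up for illustration and don’t come from any real recorder or album.

```python
def assign_to_programs(track_minutes, programs=4, program_minutes=20.0):
    """Place songs, in order, into fixed-length programs without splitting a song.

    Returns one list of song indexes per program. Raises ValueError when the
    songs cannot fit without one of them spanning a program change.
    """
    layout = [[] for _ in range(programs)]
    current = 0   # program currently being filled
    used = 0.0    # minutes already used in that program
    for index, length in enumerate(track_minutes):
        if length > program_minutes:
            raise ValueError(f"song {index} is longer than a whole program")
        if used + length > program_minutes:
            # The song would run across the program change, so start the next program.
            current += 1
            used = 0.0
            if current >= programs:
                raise ValueError("songs do not fit without splitting one")
        layout[current].append(index)
        used += length
    return layout


# Hypothetical album: eight songs, 61 minutes in total.
album = [8, 7, 6, 9, 10, 5, 12, 4]
plan = assign_to_programs(album)
```

With these made-up lengths, the eight songs land two per program. Notice that 61 minutes of music ends up consuming the whole 80 minute tape because of the dead space left at the end of each program, which is exactly the headache a 45 minute cassette side mostly avoids.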

Because a vinyl record only holds about 46 minutes, that length became the standard for musical artists until the CD arrived. Even though cassettes could hold up to 90 minutes of content, commercially produced prerecorded tapes held only the amount of tape needed to match the 46 minutes of content available on vinyl. In other words, musical artists didn’t offer extended releases on cassettes during the 70s and 80s. It wouldn’t be until the CD arrived that musical artists were able to extend the amount of content they could produce.

As for studio recording during the 1960s and 1970s, most studios relied on Ampex or 3M (or similar professional quality) 1/2 inch or 1 inch multitrack tape for recording musical artists in the studio. Unfortunately, many of these Ampex and 3M branded tape formulations ended up not being archival. This led to degradation (i.e., sticky-shed syndrome) in some of these audio masters 10-20 years later. The Tron Soundtrack, recorded in 1982 on Ampex tape, degraded in the 1990s to the point that the tape needed to be carefully baked in an oven to reaffix and solidify the ferric coating. After it had been carefully baked, there were effectively a few limited shots at re-recording the original tape audio onto a new master. It’s possible a baked master could also be played a few times onto several masters. Some Ampex tape audio master recordings may have been completely lost because the tape was not archival. Wendy Carlos explained in 1999 what it took to recover the masters for the 1982 Tron soundtrack.

Thankfully, cassette tape gluing formulations didn’t seem to suffer from sticky-shed syndrome like some formulations of Ampex and 3M professional tape masters did. It also seems that 8-track tapes may have been immune to this problem as well.

For cinematic films made during the 1970s, almost every single film was recorded and presented in stereo. In fact, George Lucas’s Star Wars in 1977 ushered in the absolute need for stereo soundtracks in summer blockbusters, with action sequences timed shot by shot to orchestral music. Musical cues timed to each visual beat have become a staple in filmmaking since the first Star Wars in 1977. While the recording of the music itself is much the same as it was in the 60s, the use of this orchestral music timed to visual beats became the breakthrough feature of filmmaking in the late 1970s and beyond. This musical beat system is still very much in use today in every summer blockbuster.

As for vinyl records and tapes of the 70s, surface noise and hiss were always a problem. To counter this problem, cassettes employed Dolby noise reduction techniques almost from the start. Commercially prerecorded tapes are encoded with a specific type of noise reduction. The associated player would need to be set to the same noise reduction type to reduce the inherent noise. Playing a tape with the wrong noise reduction setting (or none at all) could cause the high end to be lost or, in many cases, cause the audio playback to distort. Though tapes could be encoded with Dolby A, B, C or S, the most commonly used noise reduction for commercially prerecorded music cassettes was Dolby B, with the occasional use of Dolby C. Dolby B arrived in the late 1960s and remained in use throughout the 70s and 80s.

For vinyl, most albums didn’t offer or include noise reduction systems at all. However, starting around 1971, a relatively small number of vinyl releases were sold containing the DBX encoding noise reduction system. The discs were signified with the DBX encoded disc notation. This system, like Dolby’s tape noise reduction system, requires a decoder to play back the vinyl properly. Unfortunately, no turntables or amplifiers sold, that Randocity is aware of, had a built-in DBX decoder. Instead, you had to buy and then inline a separate DBX decoder component in your Stereo Hi-Fi chain of devices, like the DBX model 21 decoder. DBX vinyl noise reduction was not just noise reduction, however. It also changed the audio dynamics of the recorded vinyl groove. DBX grooved discs thinned out and reduced the sonics and dynamics dramatically, making listening to a DBX encoded vinyl disc without a decoder nearly impossible. The DBX decoder would uncompress these compressed and thinned tracks back into their original dynamic range.

Playing a DBX encoded vinyl disc back properly required buying a DBX decoder component (around $150-$200 in addition to the cost of an amplifier, speakers and a turntable). This extra cost was for only a handful of vinyl discs, though. Not really worth the investment. DBX is unlike Dolby B on tape, which still sounded relatively decent sonically even when played back without decoding. DBX encoded vinyl discs are almost impossible to listen to without a decoder. This is likely why only very few vinyl discs were released encoded with DBX. However, if you were willing to invest in a DBX decoder component, the high and low ends were said to sound much better than a standard vinyl disc containing no noise reduction. The DBX system expanded and played these dynamics better, but probably not with as full a sound as a CD can reproduce. DBX encoded vinyl likely meant that a fully remastered or at least better equalized version of the vinyl master was produced for these specific vinyl releases.
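For the technically curious, the compress-then-expand idea can be sketched in a few lines of Python. This is only an illustration working on signal levels in decibels, not on real audio. The 2:1 ratio is an assumption on my part (it is the figure commonly cited for DBX companding, but it isn’t stated anywhere above), and the function names and the 0 dB reference level are likewise hypothetical.

```python
def encode_db(level_db, ratio=2.0, ref_db=0.0):
    """Compress: squeeze a level toward the reference before cutting the disc."""
    return ref_db + (level_db - ref_db) / ratio


def decode_db(level_db, ratio=2.0, ref_db=0.0):
    """Expand: stretch a level away from the reference on playback."""
    return ref_db + (level_db - ref_db) * ratio


# A quiet passage at -60 dB is cut into the groove at -30 dB (assumed 2:1 ratio).
on_disc = encode_db(-60.0)

# The decoder restores the passage to its original level.
restored = decode_db(on_disc)

# Surface noise is added by the vinyl itself, after encoding. If it sits at
# -50 dB on the disc, the decoder pushes it down along with everything else.
noise_after_decode = decode_db(-50.0)
```

Under those assumptions, the quiet passage comes back at its original -60 dB while the surface noise drops from -50 dB to -100 dB. It also shows why an undecoded disc sounds so wrong: everything quiet is played back far too loud relative to the peaks, so the whole recording comes across as flat and squashed.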

With that said, Dolby C and Dolby S are more like DBX when reproducing dynamics than Dolby A and B, which were strictly noise reduction and offered no dynamic enhancement. These noise reduction techniques are explained in this section under the 1970s area because this is where they rose to their most prominent use, more so on cassettes than on vinyl. Of course, these noise reduction techniques are not needed on the CD format, which is yet to come during the 80s.

For professional audio recording, in 1978, 3M introduced the first digital multitrack recorder for professional studio use. This recorder used one inch tape for recording up to 32 tracks. However, it was priced at an astonishing $115,000 (32 tracks) and $32,000 (4 tracks), which only a professional recording studio could afford. Studios like Los Angeles’s A&M Studios, The Record Plant and Warner Bros.’ Amigo Studios all installed this 3M system.

Around 1971, IMAX was introduced. While this incredibly large screen format didn’t specifically change audio recording or drastically improve audio playback in the cinema, it did provide a much bigger screen experience which has endured to today. It’s included here to be complete for the 70s, but not so much for its improvements to audio recording, though it did improve film requirements for filmmakers.

For advancements in cinema sound, the 1970s saw the introduction of Sensurround. While there weren’t many features that supported this cinema sound system, it was mostly for good reason. The gimmick primarily featured a huge rumbling, theater shaking subwoofer (or several) aimed directly at the audience from below the screen. Nevertheless, subwoofers have since become common and have even endured as a constant in theaters since the introduction of Sensurround, just not to the degree of Sensurround. Like the 50s with its near endless gimmicks to drive people back into the theaters, the 1970s tried a few gimmicks of its own, such as Sensurround, to captivate audiences and drive them back into theaters.

Earthquake Sensurround

In case you’re really curious, a few film features supporting Sensurround were Earthquake (1974), the Towering Inferno (1974) and Battlestar Galactica (1978). The Sensurround experience was interesting, but the thundering, rattling subwoofer bass was, at times, more of a distraction than an addition to the film’s experience. It’s no wonder it only lasted through the 70s and only a few filmmakers used it. Successor cinema sound systems include DTS, Dolby Digital and SDDS, while THX ensured proper sound reproduction so that those rumbling, thundering bass segments can be properly heard (and felt).

Digital Audio Workstations

Let’s pause here to discuss a new audio recording methodology introduced as a result of the advent of digital audio… more or less required for producing the CD. As a result of digital audio recorders becoming available in the late 70s and early 80s and with accessibility of easy to use computers now dawning, the DAW or digital audio workstation was born. While computers in the late 70s and early 80s were fairly primitive, the introduction of the Macintosh computer (1984) with its impressive and easy to use UI made building and using a DAW much easier. It’s probably a little early to discuss DAWs during the late 70s and early 80s here, but because the DAW factors prominently into nearly every type of digital audio recording during the late 80s, 90s, 00s and beyond, the discussion is placed here.

Moving into the late 80s with even easier UI based computers like the Macintosh (1984), Amiga (1985), Atari ST (1985), Windows 3 (1990) and later Windows 95 (1995), DAWs became even more available, accessible and usable by the general public. With the release of Windows 98 and newer Mac OS systems, the DAW systems became even more feature rich, commonplace and easy to use, ultimately targeting home musicians.

Free open source products like Audacity, which was first released in 1999, also became available. By 2004, Apple would include its own DAW, GarageBand, with its own Mac OS X and iOS operating systems. Acid Music Studio by Sonic Foundry, another home consumer DAW, was introduced in 1998 for Windows. This product and Sonic Foundry would subsequently be acquired by Sony, and the product was later sold to Magix in 2016.

Let’s talk more specifically about DAWs for a moment. The Digital Audio Workstation was a ground breaking improvement over editing using analog recording technologies. This digital visual editing system allows for much easier digital audio recording and editing than any system before it. With tape recording technologies of the past, moving audio around required juggling tapes by re-recording and then overdubbing on top of existing recordings. If done incorrectly, it could ruin the original audio with no way back. Back in the 50s, the simplest editing which could be done with analog recordings was playing games with tape speeds and, if the tape recorder itself allowed it, overdubbing.

With digital audio clips in a DAW, you can now pull up individual audio clips and place them on as many tracks as are needed visually on the screen. This means you can place drums on one track, guitars on another, bass on another and vocals on another. You can easily add sound effects to individual tracks or layer them on top with simple drag, drop and mouse moves. If you don’t like the drums, you can easily swap them for an entirely new drum track or mute the drums altogether to create an acoustic type of effect. With a DAW, creative control is almost limitless when putting together audio materials. In addition, DAWs support plugins of all varying types including both digital instruments as well as digital effects. They can even be used to synchronize music to videos.

DAWs are intended to allow for mixing multiple tracks down into a stereo (2 track) mix in many different formats, including MP3, AAC and even uncompressed WAV files.
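As a rough illustration of what mixing down actually means, here is a tiny Python sketch that sums several mono tracks into a left and right channel. It is not how any particular DAW does it. The function name, the track data and the use of a constant-power pan law are all my own assumptions for the sake of the example.

```python
import math


def mix_to_stereo(tracks):
    """Mix mono tracks down to a stereo pair.

    tracks is a list of (samples, gain, pan) tuples, where samples is a list
    of floats, gain is a volume multiplier and pan runs from -1.0 (hard left)
    through 0.0 (center) to 1.0 (hard right).
    """
    length = max(len(samples) for samples, _, _ in tracks)
    left = [0.0] * length
    right = [0.0] * length
    for samples, gain, pan in tracks:
        # Constant-power pan law (an assumption): map pan onto 0..90 degrees.
        angle = (pan + 1.0) * math.pi / 4.0
        left_gain = gain * math.cos(angle)
        right_gain = gain * math.sin(angle)
        for i, sample in enumerate(samples):
            left[i] += sample * left_gain
            right[i] += sample * right_gain
    return left, right


# Hypothetical session: drums hard left, guitar hard right, vocals centered.
drums = ([1.0, 0.5, 0.25], 1.0, -1.0)
guitar = ([0.2, 0.2, 0.2], 0.5, 1.0)
vocals = ([0.4, 0.0, 0.4], 1.0, 0.0)
left, right = mix_to_stereo([drums, guitar, vocals])
```

Muting the drums, as described above, is then nothing more than leaving the drum track out of the list (or setting its gain to zero) and running the mix again. A real DAW does the same thing with millions of samples and then writes the result out as WAV, MP3 or AAC.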

DAWs can also control external music devices, like keyboards or drum machines or any other device that supports MIDI control. DAWs can also be used to record music or audio for films, allowing for easy placement using the industry standard SMPTE timecode. This allows complete synchronization of an audio track (or set of tracks) with a film’s visuals easily and, more importantly, consistently. SMPTE timecode can even drive such devices as lighting controllers to allow for live control over lighting rig automation, though some lighting rigs also support MIDI. A DAW is a very flexible and extensible piece of software used by audio recording engineers to take the hassle out of mixing and mastering musical pieces and to speed up the musical creation process… and it can even be used in live music situations.
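To make the timecode idea a little more concrete, here is a minimal Python sketch of how an `HH:MM:SS:FF` timecode maps to a frame count and back. The function names are my own invention for illustration, the 24 fps default is simply the traditional film frame rate, and the sketch only handles non-drop-frame timecode. It is not how any particular DAW implements synchronization.

```python
def timecode_to_frames(timecode: str, fps: int = 24) -> int:
    """Convert a non-drop-frame 'HH:MM:SS:FF' timecode to an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError(f"frame field {frames} is out of range for {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frames_to_timecode(total_frames: int, fps: int = 24) -> str:
    """Convert an absolute frame count back to 'HH:MM:SS:FF'."""
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def cue_offset_seconds(timecode: str, fps: int = 24) -> float:
    """Where, in seconds from the start of the reel, an audio cue should land."""
    return timecode_to_frames(timecode, fps) / fps


# A hypothetical sound effect that must hit 1 minute, 30 seconds and 12 frames in.
print(timecode_to_frames("00:01:30:12"))
print(cue_offset_seconds("00:01:30:12"))
```

Because every device in the chain counts the same frames, an audio cue placed at a given timecode lands on the same picture frame every time, which is the consistency described above.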

While DAWs came into existence in the early 1980s for professional use, it was the 1990s and into the 2000s which saw more home consumer musician use, especially with tools like Acid Music Studio, which based its entire DAW around managing creative loops… loops being short for looped audio clips. Sonic Foundry sold a lot of prerecorded royalty free loops which the user could use in the creation of musical works. Though, if you wanted to create your own loops in Acid Music Studio using your own musical instruments, that was (and still is) entirely possible.

The point is, once the DAW became commonplace, it changed the recording industry in very substantial ways. Unfortunately, with the good comes the bad. As technology improved with DAWs, so too did technologies to improve a singer’s vocals… thus was born the dreaded and now overused autotune vocal effect. This effect is now used by many vocalists as a crutch to make their already great voice supposedly sound even better. On the flip side, it can also be used to make bad vocalists sound passable… which, in my opinion, is how it’s mostly being used these days. I don’t personally think autotune ever makes vocals sound better, but my opinion doesn’t matter when it comes to such recordings. With DAWs out of the way, let’s segue into another tangential 1970s audio technology topic…

Quadrophonic Vinyl Releases

In the early 1970s, just as stereo began to take hold, JVC and RCA devised Quadrophonic vinyl albums. This format expected the home consumer to buy into an all new audio system including a quad decoder amplifier, a quad turntable and two additional speakers for a total of four, and to purchase albums that supported the quad format. This was a tall (and expensive) order. As people had just begun investing in somewhat expensive amplifiers and speakers to support stereo, JVC and RCA expected the consumer to toss all of their existing (mostly new) equipment and invest in brand new equipment AGAIN. Let’s just say that didn’t happen. Though, to be fair, you didn’t need to buy a quad turntable. Instead, you simply needed to buy a CD-4 cartridge for your turntable and have an amplifier that could decode the resulting CD-4 encoded signal.

For completeness, the CD-4 system offered 4 discrete audio channels: left front, left back, right front and right back. Quad was intended to be enjoyed with four speakers, each placed at a corner of a square around the listener.

This half-baked quad plan expected way too much of consumers. While many record labels did adopt this format and did produce perhaps hundreds of releases in quad, the format was not at all successful due to consumer reluctance. The equipment was simply too costly for most consumers to toss and replace their existing HiFi equipment. Stereo remained the dominant music format and has remained so since. Though, quad’s special stylus cartridges did help improve stereo playback, thanks to the better styluses and higher quality vinyl formulations needed to produce and play the quad vinyl LPs.

Note also that while quad vinyl LP releases made their way into record stores in the early 1970s, no cassette version of quad ever became available. However, Q8 or quad 8-track tapes arrived as early as 1970, two years before the first vinyl release. Of course, 8-track tapes at the time were primarily used in cars… which would have meant upgrading your car audio system with two more speakers and a new decoder car player with four amplifiers, one for each speaker.

The primary thing that the quad format was successful at doing, at least for consumers, was to muddy the waters at the record store and to introduce multichannel audio playback, which wouldn’t really become a home consumer “thing” until the DVD arrived in the 1990s. However, for a consumer shopping for albums in the 1970s, it would have been super easy to accidentally buy a quad album, take it home and then realize it doesn’t play. The same problem exists for Q8 tapes, though Q8 tapes had a special quad notch that may have prevented them from playing in some players. And now, onto the …

1980s

In the 1980s, we see big hair, glam rock bands and hear new wave, synth pop and alternative music on the radio. Along with all of these, this era ushers us into the digital music era using the new Compact Disc (CD) and, of course, CD players. The CD, however, would actually turn out to be a stopgap of a couple of decades for the music and film industries. While the CD is still very much in use and available today, its need is diminishing rapidly with the likes of music services, like Apple Music. But, that discussion is for the 2010s and into the 2020s.

Until 1983, vinyl, cassettes and, to a much lesser degree, 8-track tapes were the music formats available to buy at a record store. By late 1983 and into 1984, the newfangled CD hit the store shelves, though not yet in large numbers. At the same time, out went 8-track tapes. While the introduction of the CD was initially aimed at the classical music genre, where the CD’s silence and dynamic range work exceedingly well to capture orchestral music arrangements, pop music releases would take a bit more time to ramp up. By late 1984 and into 1985, popular music eventually began to dribble its way onto CD as record labels began re-releasing back catalog in an effort to slowly and begrudgingly embrace this new format. Though, bands were also embracing this new format, thus new music began releasing onto the CD format faster than back catalog.

However, the introduction of the all digital CD upped the sound engineer’s game once again. Just as vinyl took a while for sound engineers to grasp, so too did the CD format. Because the top and bottom sonic ends of the CD are far less constrained than vinyl’s, placing masters made for vinyl onto a CD made for a lower volume and a sonically unexciting and occasionally shrill music experience.

If you buy a CD made in the mid 1980s and listen to it, you can tell the master was originally crafted for a vinyl record. The sonics are usually tinny, harsh and flat with a very low volume. These vinyl master recordings were intended to prevent the needle from skipping and relied on some of the sonics being smoothed out and filled in by the turntable and amplifier itself. A CD needs no such help. This meant that CD sound engineers needed to find their footing on how deep the bass goes, how high the treble can get and how loud it can be. Because vinyl (and the turntable itself) tended to attenuate the audio to a more manageable level, placing a vinyl master onto CD foisted all of these inherent vinyl mastering flaws onto the CD buying public. This was especially galling considering that a CD was typically priced around $14.99 when vinyl records regularly sold for $5.99-$7.99. Asking a consumer to fork over almost double the price for no real benefit in audio quality was a tall order.

Instead, sound engineers needed to remix and remaster the audio to fill the audio dynamics and sonics of a CD. However, studios at the time were cheap and wanted to sell product fast. That meant existing vinyl masters instantly made their way onto CDs, only to sound thin, shrill and harsh. In effect, it introduced the buying public to a lateral, if not inferior, product that all but seemed to sound the same as vinyl. The only improved audio masters being tailored for CD were those of many classical music artists. Pop artists’ older catalog titles were simply being rote copied straight onto the CD format… no changes. To the pop, rock and R&B buying consumer, the CD appeared to be an expensive transition format with no real benefit.

The pop music industry more or less screwed itself with the introduction of the CD format before it even got a foothold. By the late 80s and into the early 90s, consumers began to hear the immense difference in a CD as musical artists began recording their new material using the full dynamic range of the CD, sometimes on digital recorders. Eventually, consumers began to hear the much better sonics and dynamics that the CD format was capable of. However, during the initial 2-4 years after the CD was introduced, many labels released previous vinyl catalog onto CD sounding way less than stellar… dare I say, most of those CD releases sounded bad. Even new releases were a mixed bag depending on the audio engineer’s capabilities and equipment access.

Further, format wars always seem to ensue with new audio formats and the CD was no exception. Sony felt the need to introduce its smaller MiniDisc format in 1992, a lossy compressed format. While the CD format offered straight up uncompressed digital audio at 16 bit, the MiniDisc offered compressed audio akin to an MP3. The introduction of the MiniDisc (MD) meant that this was the first time a consumer was effectively introduced to an MP3-like device. While the compression on the MD (called ATRAC) wasn’t the same as MP3, it effectively produced the same result. In effect, you might actually say a MiniDisc player was the first pseudo MP3 player, but one that used a small magneto-optical disc for its music storage.
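For a rough sense of why lossy compression mattered, here is a small Python sketch of the arithmetic behind uncompressed CD audio. The 16 bit figure comes from the paragraph above; the 44,100 Hz stereo sample rate is the standard audio CD rate, and the 128 kbps comparison figure is purely an assumed, typical lossy bitrate that I picked for illustration (it is not the MiniDisc’s actual rate). The dynamic range number is the usual theoretical approximation for linear PCM, not a measurement of any real player.

```python
import math


def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Raw data rate of uncompressed PCM audio, in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def theoretical_dynamic_range_db(bit_depth: int) -> float:
    """Approximate loudest-to-quietest ratio that linear PCM can represent."""
    return 20 * math.log10(2 ** bit_depth)


cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
assumed_lossy_kbps = 128  # hypothetical lossy bitrate, for comparison only

print(f"CD audio: {cd_kbps:.1f} kbps")
print(f"16 bit dynamic range: about {theoretical_dynamic_range_db(16):.0f} dB")
print(f"Size reduction at {assumed_lossy_kbps} kbps: about {cd_kbps / assumed_lossy_kbps:.0f}x")
```

The takeaway is simply that a lossy format has to throw away most of the raw data to hit its target size, which is why audiophiles were wary of it.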

The CD format was not dissuaded by the introduction of the MD format. If anything, many audiophile consumers didn’t like the MD for the fact that it used compressed audio, making it sometimes sound worse than a CD. Though, many vinyl audiophiles also didn’t embrace the CD format likening it to a very cold musical experience without warmth or expression. Many vinyl audiophiles preferred and even loved the warmth that a stylus brought to vinyl when dragged across a record’s groove. I was not one of these vinyl huggers, however. When a CD fades to silence, it’s 100% silent. When a vinyl record fades to silence, there’s still audible vinyl surface noise present. The silence and dynamics alone made the CD experience golden… especially when the deep bass and proper treble sonics are mixed correctly for the CD.

The MiniDisc did thrive to an extent, but only because recorders became available early in its life along with many, many players from a lot of different companies, thus ensuring price competition. That, and the MD sported an exceedingly small size when compared to carrying around a huge CD Walkman. This allowed people to record their own already purchased audio right to a MiniDisc and easily carry their music around with them in their pocket. The CD didn’t offer recordables until much, much later into the 90s, mostly after computers became commonplace and those computers needed to use CDs as data storage devices. And yes, there were also many prerecorded MiniDiscs available to buy.

During the late 70s and into the early 80s, bands began to experiment with digital recording units in studios, such as 3M’s. In 1982, Sony introduced its own 24 track PCM-3324 digital recorder in addition to 3M’s already existing 1978 32 track unit, thus widening studio options when looking for digital multitrack recorders. This expanded the ability for artists to record their music all digital at pretty much any studio. Onto the cinema scene…

In the early-mid 80s, a new theater sound system standard emerged in THX by Lucasfilm. This cinema acoustical sound standard is not a digital audio format and has nothing to do with recording and everything to do with audio playback and sound reproduction in a specific sized room space. At the time, theaters were coming out of the 1970s with short lived audio technologies like Sensurround. In the 1970s, theater acoustics were still fairly primitive and not at all optimized for the large theater room space. Thus, many of the theater sound systems were under-designed (read installed on the cheap) and didn’t appropriately or correctly fill the room with audio, leaving the soundtrack and music, at times, hard to hear. When Star Wars: Return of the Jedi was on the cusp of being released in 1983, George Lucas took an interest in theater acoustics to ensure moviegoers could hear all of the nuanced audio as he had intended in the film. Thus, the THX certification was born.

THX is essentially a movie theater certification program that ensures that all “certified theaters” provide an optimal audio acoustical experience for moviegoers. Like the meticulous setup of Walt Disney’s Fantasound in 1940, George Lucas likewise wanted to ensure his theater patrons could correctly hear all of the nuances and music within Star Wars: Return of the Jedi in 1983. Thus, any theater that chose to certify itself via the THX standard must outfit each of its auditoriums appropriately so that the audio acoustically fills the theater space correctly for all theater patrons.

However, THX is not a digital recording standard. The digital recording standards like Dolby Digital and DTS and even SDDS are all capable of supporting theaters certified for THX. Theaters certified for THX also play the Deep Note sound to signify that the theater is acoustically certified to present the feature film about to play. In fact, even multichannel analog systems such as Fantasound, if it were still available, could benefit from an acoustically certified THX theater. Further, each cinema must individually outfit each individual theater in the building to acoustically uphold the THX standard. That means that the manager of each cinema must work with THX to ensure that each theater in a given megaplex adheres to the THX acoustic standard before each theater can be certified. THX means having the appropriate volume levels needed to fill the space for each channel of audio no matter where the theater patron chooses to sit within the theater.

CD Nomenclature

When CDs were first introduced, it was difficult to determine whether a musician’s music was recorded analog or digital. To combat this confusion, CD producers put 3 letters (known as the SPARS code) onto the cover to tell consumers how the music was recorded, mixed and mastered. For example, DDD meant that the music was recorded, mixed and mastered using only digital equipment. In later years, this likely meant a DAW was used throughout to record, mix and master. Other labels you might see included:

DAD = Digital recording, Analog mixing, Digital mastering
ADD = Analog recording, Digital mixing, Digital mastering
AAD = Analog recording, Analog mixing, Digital mastering

The third letter on a CD would always be D because every CD had to be digitally mastered regardless of how it was recorded or mixed. This nomenclature has more or less dropped away today. I’m not even sure why it became that important during the 80s, but it did. It was probably included to placate audiophiles at the time. I honestly didn’t care about this nomenclature. For those who did, it was there.

Almost all back catalog releases recorded during the 70s and even some into the 80s would likely have been AAD simply because digital equipment wasn’t yet available when most 70s music would have been recorded and mixed. However, some artists did spend the money to take their original analog multitrack recordings back to an audio engineer to convert them to digital for remixing and remastering, thus making them ADD releases. This also explains why certain CD releases of some artists had longer intros, shorter outros and sometimes extended or changed content from their vinyl release.

DAT vs CD

Sony further introduced its two-track DAT professional audio recording systems around 1987. It would be these units that would allow bands to mix down to stereo digital recordings more easily. However, Sony messed this audio format up for the home consumer market.

Fearing that consumers could create infinite perfect duplicates of DAT tapes, Sony introduced a copy protection system, known as the Serial Copy Management System (SCMS), that would limit how many times a DAT tape could be duplicated. Each time a tape was duplicated, a marker was placed onto the duplicated tape. If a recorder supporting this copy protection system detected a marker at the allowed maximum duplication count, it would refuse to duplicate the tape again. This copy protection system all but sank Sony’s DAT system as a viable consumer alternative. Consumers didn’t understand the system, but more than this, they didn’t want to be limited by Sony’s stupidity. Thus, DAT was dead as a home consumer technology.
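To illustrate the general idea of a marker-based duplication limit as described above, here is a toy Python model. Everything in it is hypothetical: the `Tape` class, the `copy_generation` field and the limit of one generation are my own assumptions for the sake of the sketch. It does not reproduce the actual bits or rules that real DAT recorders used.

```python
from dataclasses import dataclass

# Assumption for this sketch: only one generation of digital copy is allowed.
MAX_COPY_GENERATION = 1


@dataclass(frozen=True)
class Tape:
    title: str
    copy_generation: int = 0  # 0 means an original recording


class CopyRefused(Exception):
    """Raised when a consumer recorder declines to duplicate a tape."""


def consumer_duplicate(source: Tape, max_generation: int = MAX_COPY_GENERATION) -> Tape:
    """Duplicate a tape the way a protected consumer deck might.

    The copy carries a marker one generation higher than its source.
    A source already at the limit is refused.
    """
    if source.copy_generation >= max_generation:
        raise CopyRefused(f"'{source.title}' has already reached the copy limit")
    return Tape(source.title, source.copy_generation + 1)


original = Tape("Garage Demo")
first_copy = consumer_duplicate(original)  # allowed
try:
    consumer_duplicate(first_copy)         # a copy of a copy is refused
except CopyRefused as err:
    print(err)
```

The frustration for consumers is visible even in this toy version: the audio itself is a perfect copy, yet the recorder refuses to work based on nothing more than a marker.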

This came at a time when MiniDisc had no such stupid duplication restrictions. Sony’s DAT format silently died while MiniDisc continued to thrive throughout the 1990s. Though, to be fair, the MD’s compression system would eventually turn duplicated music into unintelligible garbage after a fair number of recompression dupes. The DAT system utilized uncompressed audio where the MD didn’t.

The stupidity of Sony was that it and other manufacturers also sold semi-professional and professional DAT equipment. The “professional” gear was not subject to this limited duplication system. Anyone who wanted to buy a DAT recorder could simply buy up to semi-professional gear from any manufacturer, like Fostex, where no such copy protection schemes were enforced or used. By the time these other manufacturers’ gear became available, consumers didn’t care about the format.

A secondary problem with the DAT format was that it used helical scanning head technology, similar to the head used in a VHS or Betamax video system. These heads spin rapidly and can go out of alignment easily. As a result, a DAT car stereo system was likely not feasible long term. Meaning, if you hit a bump, the spinning head might change alignment and then you’d have to readjust. Enough bumps and the whole unit might need to be fully realigned. Even the heat of scorching summer days might damage the DAT system.

Worse, helical scanning systems are subject to getting dirty quickly, in addition to alignment problems. This meant the need to regularly clean these units with a specially designed cleaning tape. Many DAT recorders would stop working altogether until you used a cleaning tape in the unit, which would reset the cleaning counter and allow the unit to function again until it needed another cleaning. Alignment problems also didn’t help the format. A recording made on one DAT unit might not play on another unit. Head alignment is critical between two different units. This might mean getting a tape from your friend, whose DAT machine is aligned differently from yours, that won’t play. CDs and MDs didn’t suffer from this alignment problem. What that meant was that while you could always play back DATs recorded in your own unit, a friend might not be able to play your DAT tapes in their unit at all, suffering digital noise, static, long dropouts or silence on playback.

DAT was not an optimal technology for sharing or when using outside of the home for audio. Though, some bootleggers did adopt the portable DAT recorder for bootlegging concerts. That’s pretty much no longer needed, with smartphones now taking the place of such digital recorders.

Though, Sony would more than make up for the lack of DAT being adopted as a home audio format after the computer industry adopted the DAT tape as an enterprise backup tape solution. Once DAT tape changers and libraries became common, DATs became a staple in many computer centers. All was not lost for Sony in this format. DAT simply didn’t get used for its original intended purpose, to be a home consumer digital audio format. Though, it did (and does) have a cult audiophile and bootleg following.

1990s

By the 1990s, the CD had quickly become the new staple of the music industry (over vinyl and cassettes). It was so successful, it caused the music industry to all but stop producing vinyl records, before their recent resurgence in the 2010s for a completely different reason. Cassettes and 8-track tapes also went the way of the dinosaurs. Though 8-tracks had been more or less gone from stores by 1983, the prerecorded cassette continued to limp along into the early 90s. And though even newer digital audio technologies and formats were on the horizon, they wouldn’t make their way into consumers’ hands until the late 1990s.

Throughout the 1990s, the CD remains the primary digital audio format of choice for commercial prerecorded music. By 1995, you could even record your own audio CDs using blanks, thanks to the burgeoning computer industry. This meant that you could now copy an audio CD or convert all of the audio tracks from a CD into MP3s (called ripping) and/or make an MP3 CD, which some later CD players could play. And yes, there were even MiniDisc car stereos available later in the decade. The rise of the USB drive also gave life to MP3s as well. This meant you could easily carry a lot more music from place to place and from computer to computer than can be held on a single CD. The MP3’s portability and downloadability along with the Internet gave rise to music downloading and sharing sites like Napster.

Though MP3 CDs could be played in some CD players, this format didn’t really take off as a standard. This left players primarily using the audio CD as the means of playing music while in a car, thus multi-CD car changers were born. The car stereo models that supported MP3 formatted CDs would have an ‘MP3’ label printed on the front bezel near the slot where you insert a CD. No label meant MP3s were not supported. Though, the rise of separate MP3 players further gave rise to car auxiliary input jacks by car manufacturers, which began because of clumsy cassette adapters. If the car stereo had only a cassette player, you would need to use a cassette adapter to plug in your 3.5mm jack equipped MP3 player. Eventually, car players would adopt the Bluetooth standard so that wireless playback could be achieved when using smart phones, but the full usefulness of that technology wouldn’t become common until many years after the 1990s. However, Chrysler took a chance and integrated its own Bluetooth UConnect system into one of its cars as early as 1999! Talk about jumping on board early!?!

Throughout the 1990s, record stores were also still very much common places to shop and buy audio CDs. By the late 1990s, the DVD with its multichannel audio had also become common. Even big box electronics retailers tried to get into the DVD act with Circuit City banking on its new DIVX rental and/or purchase format, which mostly disappeared within a year of introduction. Big box record stores were also still around, such as Blockbuster Music, Virgin Megastore, Tower Records, Sound Warehouse, Sam Goody, Suncoast, Peaches, Borders and so on. The Blockbuster Video rental stores would eventually become defunct as VHS gave way to DVD, which in turn gave way to digital streaming around the time of the Blu-ray. Some blame Netflix for Blockbuster’s demise when it was, in fact, Redbox’s $1 rental that did in Blockbuster Video stores, which were still charging $5-6 for a rental at the time of their demise.

By 1999, Diamond had introduced the Rio MP3 player. Around that same time, Napster was born (a music sharing service). The Diamond Rio was one of the first actual MP3 players placed onto the market, not counting Sony’s MD players. It was a product that mirrored the want of digital music downloads, which were afforded by Napster. I won’t get into the nitty gritty legal details, but a battle ensued between Napster and the music industry and again between Diamond (for its Rio player) and the music industry. These two lawsuits were more or less settled. Diamond prevailed, which left the Rio player on the market and allowed subsequent MP3 players to come to market, which further led to Apple’s very own iPod player being released a few years later. Unfortunately, Napster lost its battle, which left Napster mostly out of business and without much of a future until it chose to reinvent or perish.

Without Diamond paving the legal way for the MP3 player’s coming in 1999, Apple wouldn’t have been able to benefit from this legal precedent with its first iPod, released in 2001. Napster’s loss also paved the way for Apple to succeed by doing music sharing and streaming right, by getting permission from the music industry first… which Napster failed to do and was not willing to do initially. If only Napster had had the foresight to loop in the music industry initially instead of alienating them.

As for recordings made during the 90s, read the DAW section above for more details on what a DAW is and how most audio recording worked during the 90s. Going into the early 90s, traditional recording methods may still have been employed, but those were quickly replaced by computer based DAW systems as Windows 98, Mac OS and other computer systems made a DAW quick and easy to install and operate. Not only is a DAW now used to record nearly all commercial music, it is also used to prerecord audio for movies and TV productions. Live audio productions might even use a DAW to add effects during a live performance.

Some commercial DAW systems like Pro Tools also sport the ability to control a physical mixing board’s controls. With Pro Tools, for example, the DAW shows a virtual mixing board identical to the attached physical mixing board. When the virtual mixing board controls are moved, the DAW also rotates the knobs and moves the sliders of the attached (and quite expensive) physical mixing board. While the Pro Tools demo was quite impressive, the setup was very expensive to buy (both Pro Tools and the supported mixing board), so it was mostly a novelty. When you’re recording a specific song with live musicians, such an automated system handling a physical board might be great if you want to make sure all of the musical parts are performed live in a professional sounding way without having a sound engineer sitting there tweaking all of the controls manually. Still, while moving the sliders and knobs with automation software is cool to watch, it is way overpriced and not very practical.

To be fair, though, Pro Tools was originally released at the end of 1989, but I’m still considering it a 1990s product as it would have taken until the mid-90s to mature into a useful DAW. Cubase, a rival DAW product, actually released earlier in 1989 than Pro Tools. Both products are mostly equivalent in features, with the exception of Pro Tools being able to control a physical mixing board where Cubase, at least at the time I tested it, could not.

As for cinema sound, 1990 ushered in a new digital format in Cinema Digital Sound (CDS). Unfortunately, CDS had a fatal flaw that left some movie theaters in the lurch when presenting. Because CDS replaced the analog audio track on film with a digital 5.1 sound track (left, right, center, S-left and S-right and low frequency effects), this left the feature (and the format) without sound if the digital track were damaged. As a result, Dolby Digital (1992) and Digital Theater Systems (DTS, 1993) quickly became the preferred formats for presenting films with digital sound to audiences. Dolby Digital and DTS used alternative placement for the film’s digital tracks, leaving the optical track available for backup audio “just in case”. For completeness, Sony’s SDDS uses alternative placement as well.

According to Wikipedia:

…unlike those formats [Dolby Digital and DTS], there was no analog optical backup in 35 mm and no magnetic backup in 70 mm, meaning that if the digital information were damaged in some way, there would be no sound at all.

CDS was quickly superseded by Digital Theatre Systems (DTS) and Dolby Digital formats.

Source: Wikipedia

However, Sony (aka Columbia Pictures) always prefers to create its own formats so that it doesn’t have to rely on or license technology from third parties (see: Blu-ray). As a result, Sony created Sony Dynamic Digital Sound (SDDS), which saw its first film release in 1993’s Last Action Hero. However, DTS and Dolby Digital, at the time, remained the digital track systems of choice when the film was not released by Sony. Likewise, Sony typically charged a mint to license and use its technologies. Thus, producers would opt for systems that cost less in the final product if the product were not being released by a Sony owned film studio. Because Sony also owned rival film studios, many non-Sony studios didn’t want to embrace or use Sony’s technological inventions, choosing Dolby Digital or DTS over Sony’s SDDS.

Wall of Sound and Loudness Wars

Sometime in the late 1990s, sound engineers using a DAW began to get a handle on properly remastering older 80s music. This is about the time that the Volume War (aka Loudness War) began. Sound engineers began using dynamic range compression tools and add-ons, like iZotope’s Ozone, to push audio volumes ever higher, while remaining under the CD’s maximum level to prevent noticeable clipping. (This kind of compression squeezes loudness differences and is not the same thing as the file size compression used by formats like MP3.) These remastering tools produced, at least in the resulting remastered audio, much louder sound output than before compression was added.

Such remastering tools have been a tremendous boon to audio and artists, though Ozone didn’t really take hold until the middle of the 2010s. Thus, we’re jumping ahead a little. Prior to such 2010s tools, Cubase and Pro Tools already offered built-in compression tools that afford similar audio compression to iZotope Ozone, but which required more manual tweaking and complexity. These built-in tools have likely existed in these products since the mid 1990s.

The Wall of Sound idea is basically pushing an audio track’s volume to the point where nearly every point in the track has the same level of volume. It makes a track difficult to listen to, offers up major ear fatigue and is generally an unpleasant sonic experience for the listener. Some engineers have pushed the compression way too far on some releases. CDs afford impressive volume differences, from the softest whisper to the loudest shout. These dynamics in music can make for tremendous artistic uses. When heavy compression is used on pop music, all of those dynamics are lost… instead replaced by a Wall of Sound that never ends. Many rock and pop tracks fall into this category, only made worse by tin eared, inexperienced sound engineers with no finesse over a track’s dynamics. Sometimes it’s the band requesting the remaster and giving explicit instructions, and sometimes it’s left up to the sound engineer to create what sounds best. Either way, a Wall of Sound is never a good idea.
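For the curious, here is a deliberately crude Python sketch of what heavy compression followed by a volume boost does to a track’s dynamics. This is my own simplified illustration, not how Ozone, Cubase or Pro Tools actually work: real compressors react over time with attack and release settings, while this one just squashes each sample on its own. The threshold, ratio and test “track” (a quiet verse followed by a loud chorus) are all made-up values.

```python
import math


def compress(samples, threshold=0.25, ratio=8.0):
    """Crude static compressor: levels above the threshold are reduced by the ratio."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, s))
    return out


def normalize(samples, peak=0.99):
    """Scale the whole track so its loudest sample sits just under full scale."""
    top = max(abs(s) for s in samples)
    return [s * peak / top for s in samples] if top else list(samples)


def rms(samples):
    """Average level, a rough stand-in for perceived loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def crest_factor_db(samples):
    """Peak-to-average ratio in dB; smaller means flatter dynamics."""
    return 20 * math.log10(max(abs(s) for s in samples) / rms(samples))


# Hypothetical track: a quiet verse followed by a loud chorus.
quiet = [0.1 * math.sin(2 * math.pi * i / 50) for i in range(500)]
loud = [0.9 * math.sin(2 * math.pi * i / 50) for i in range(500)]
track = quiet + loud

original = normalize(track)
squashed = normalize(compress(track))

print(f"original: rms={rms(original):.3f}, crest={crest_factor_db(original):.1f} dB")
print(f"squashed: rms={rms(squashed):.3f}, crest={crest_factor_db(squashed):.1f} dB")
```

Both versions peak at the same level, so neither clips, but the squashed version has a higher average level and a smaller gap between the verse and the chorus. Push the ratio far enough and the verse and chorus end up at nearly the same volume, which is the Wall of Sound in a nutshell.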

The improved sound quality made possible by these new mastering tools invigorated the process of remastering those old crappy-sounding, vinyl-mastered 1980s CD releases… finally giving that music the sound quality treatment it should have had when those CDs originally released in the 1980s. That, and record labels needed yet more cash to continue to operate.

These remastering efforts, unfortunately, left a problem for consumers. Because the CD releases mostly look identical, you can’t tell if what you’re buying (particularly when buying used) is the original 1980s release or the updated and remastered new release. You’d need to read the dates printed on the CD case to know if it were pressed in the 1980s or in the late 1990s. Even then, this vinyl master CD pressing problem continued into the early 1990s. It wouldn’t be until around the late 1990s or into the 2000s when the remastering efforts really began in earnest. This meant that you couldn’t assume a 1993 CD release of a 1980s album was remastered.

The only way to know if a CD is remastered is by 1) buying it new and seeing a sticker making this remastering claim or 2) listening to it. Even then, some older CDs only got very minimal sound improvements (usually only volume) when remastered over their 1980s CD release. Many remasters didn’t improve the bottom or top ends of the dynamics of the music and only focused on volume… which only served to make that tinny, vinyl remaster even louder. The Cars’ 1984 release, Heartbeat City, is a good example of this problem. The original release on CD had thin, tinny audio, clearly indicative that the music was originally mastered to accommodate vinyl. The 1990s and 2000s remasters only served to improve the volume, but left the music dynamics shallow, thin and tinny, with no bottom end at all… basically leaving the original vinyl remaster’s sound almost wholly intact.

A sound engineer really needed to spend some quality time with the original material (preferably from the original multitrack master) bringing out the bottom end of the drums, bass and keyboards while bringing the vocals front and center. If remastered correctly, that album (and many other 1980s albums) could sound like it was recorded on modern equipment from at least the 2000s, if not the 2010s or beyond. On the flip side, Barbra Streisand’s 1960s albums were fully digitally remastered by John Arrias, who was able to reveal incredible sonics. Barbra’s vocals are fully crisp and entirely clear alongside the music backing tracks. By contrast, the remixing and remastering of many rock and pop bands was ofttimes handed to ham-fisted, so-called sound engineers with no digital mastering experience at all.

Where in the 1930s, it was about simply getting a recording down to a shellac 78 rpm record, in the 90s for new music, it was all about pumping up the sub-bass and making the CD as loud as possible. All of this in the later 90s was made possible by digital editing using a DAW.

MP3

Seeing as this is an article about The Evolution of Sound, this article would be remiss if it didn’t discuss and describe the MP3 format’s contribution to audio evolution. The MP3 format, or more specifically, lossy compression, was invented by Karlheinz Brandenburg, a mathematician and electrical engineer working in conjunction with various people at Fraunhofer IIS. While Karlheinz has received numerous awards for this so-called audio technological improvement, one has to wonder if the MP3 really was an improvement to audio? Let’s dive deeper.

What exactly is lossy compression? Lossy compression is a technique in which a mathematical algorithm (the encoder) takes in a stream of uncompressed digital audio and then removes and rearranges parts of that audio to reduce or eliminate extraneous, unnecessary or repetitive segments. When the decoder plays back the resulting compressed audio file, it recreates the audio on the fly from the encoded data, producing a suitably similar form that is supposedly indistinguishable from the original uncompressed audio. The idea here is to produce audio so aurally similar to the original that the ears cannot distinguish a difference. That’s the theory, but unfortunately this format isn’t 100% perfect.
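To make the word “lossy” concrete, here is a minimal sketch of an encode and decode round trip. Note the assumption: this uses plain uniform quantization (storing each sample as one of only 16 coarse levels), which is a stand-in chosen for illustration and is not how MP3 actually works internally. The function names and the sample values are hypothetical.

```python
def encode(samples, levels=16):
    """Toy lossy encoder (an illustration, not MP3): keep only a coarse
    integer code for each sample, assumed to lie in the range -1.0..1.0."""
    half = levels // 2
    return [max(-half, min(half - 1, round(s * half))) for s in samples]


def decode(codes, levels=16):
    """Rebuild approximate samples from the coarse codes."""
    half = levels // 2
    return [c / half for c in codes]


original = [0.0, 0.5, -0.26, 0.03]   # hypothetical sample values
restored = decode(encode(original))  # close to, but not the same as, original
```

The loud samples come back nearly intact, while the very quiet 0.03 sample collapses to silence. The detail thrown away by the encoder cannot be recovered by the decoder, which is the defining trait of any lossy format.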

Unfortunately, not all audio is amenable to being compressed in such a way. For example, MP3 is not at all capable of producing low volume content without introducing noticeable audible artifacting. Instead of hearing only the audio as expected, the decoder also introduces a digital whine… the equivalent of analog static or white noise. Because pop, rock, R&B and Country music rely on guitars, bass and drums, keeping the volumes mostly consistent throughout the track, the MP3 format works perfectly fine for these. For orchestral music with low volume passages, the MP3 format isn’t always the best choice.

Some of this digital whine can be overcome by increasing the bit rate of the resulting file. For example, many MP3s are compressed at 128 kilobits per second (kbps). However, this bit rate can be increased to 320 kbps, thus reducing digital whine and increasing the overall sound fidelity. The problem with increasing bit rates is that it also increases the resulting size of the file. A 320 kbps MP3 is 2.5 times the size of a 128 kbps MP3 and, while still only around a quarter of the size of an uncompressed CD-quality .WAV file (roughly 1,411 kbps), it gives back much of the space savings. When storage space is plentiful, why suffer possible audio artifacts using the MP3 format when you can simply store uncompressed audio and avoid this?
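To put rough numbers on the bit rate versus file size trade-off, here is a small sketch of the arithmetic. It assumes constant bit rate files, a hypothetical four minute track, and CD audio at 44,100 samples per second, 16 bits and 2 channels. The function names are made up for this example.

```python
# CD audio: 44,100 samples/s x 16 bits x 2 channels = 1,411.2 kbps
CD_KBPS = 44_100 * 16 * 2 / 1000


def audio_size_mb(bitrate_kbps, seconds):
    """Size in megabytes (10**6 bytes) of audio at a constant bit rate."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000


def compare(seconds=240.0):
    """Sizes of one track (assumed 4 minutes by default) in three formats."""
    return {
        "mp3_128": audio_size_mb(128, seconds),
        "mp3_320": audio_size_mb(320, seconds),
        "cd_pcm": audio_size_mb(CD_KBPS, seconds),
    }
```

Under these assumptions, the four minute track works out to roughly 3.8 MB at 128 kbps, 9.6 MB at 320 kbps and 42.3 MB as uncompressed CD audio.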

Let’s understand why MP3s were needed throughout the 1990s. Around 1989-1990, a 1 GB sized SCSI hard drive might cost you around $1000 or more. Considering that a CD holds around 700 megabytes, you could extract the contents of about 1.5 CDs onto a 1 GB sized hard drive. If you MP3 compressed those same CD tracks, that same 1 GB hard drive might be able to hold 8-10 (or more) CDs worth of MP3s. As the 1990s progressed, hard drive sizes would increase and prices would decrease, eventually making both SCSI and IDE drives way more affordable. It wouldn’t be until 2007 that the first 1 TB sized drive launched. From 1990 through to 2007, hard drive sizes were not amenable to storing tons of uncompressed audio wave files. To a lesser degree, we’re still affected by storage sizes even today, making compressed audio still necessary, particularly when storing audio on smart phones. We’re getting too far ahead.
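The storage math above can be sketched the same way. This is a back-of-the-envelope estimate only, assuming a hypothetical 60 minute album, a drive of exactly 1,000 MB and constant bit rate encoding. The function name is made up for this example.

```python
def albums_per_drive(drive_mb, album_minutes, bitrate_kbps):
    """Roughly how many albums of a given length fit on a drive."""
    album_mb = bitrate_kbps * 1000 * album_minutes * 60 / 8 / 1_000_000
    return drive_mb / album_mb


uncompressed = albums_per_drive(1000, 60, 1411.2)  # CD-quality PCM
mp3_256 = albums_per_drive(1000, 60, 256)          # high bit rate MP3
mp3_128 = albums_per_drive(1000, 60, 128)          # common 1990s bit rate
```

With these assumptions, the drive holds about 1.6 uncompressed albums, between 8 and 9 albums at 256 kbps, and around 17 albums at 128 kbps, which lines up with the 8-10 (or more) estimate.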

Because of the small storage capacities of hard drives throughout the 1990s, much smaller audio files were necessary, thus the MP3 was born. You might be asking, “Well, what about the MiniDisc?” It is true that Sony’s MiniDisc format also used a compressed format. Sony, however, like it always does, devised its own compression technique called ATRAC. This compression format is not unlike MP3 in terms of its design. Specifically how Sony’s ATRAC algorithm works is unknown because it is a proprietary format. Because of ATRAC’s proprietary nature, this article will not speculate on how Sony came about creating it. Suffice it to say that Sony’s ATRAC arrived 2 years after the MP3 format’s initial release in 1991. Read into that what you will.

As for the advancement of audio in the MP3 format, lossy compression has really set back audio quality. While the CD format sought to improve on audio and did so by making tremendous strides in its near 0db silence, the MP3 only sought to make audio “sound” the same as an uncompressed CD track. With the word “sound” being the key to MP3. While MP3 did mostly achieve this goal with most musical genres, the format doesn’t work for all music and all musical styles. Specifically, certain electronic music with sawtooth or exactly square wave forms can suffer. Certain passages of very low volume can also suffer under MP3’s clutches. It’s most definitely not a perfect solution, but MP3 solved one big problem, reducing the file sizes down to fit on the small data storage products available at the time.

Data Compression vs Audio Compression

Note that the compression discussed above regarding the MP3 format is wholly different from the audio compression used to increase volumes and reduce clipping when remastering a track. The MP3 compression above is strictly a form of data compression, but data compression designed specifically for audio tracks. Audio volume compression used in remastering (see Loudness Wars) is not a form of data compression at all. Audio compression used in remastering is a form of dynamic range compression and limiting. It seeks to raise the volume of most of a track, but only compresses down (or lowers the volume of) the peaks that would otherwise reach above the volume ceiling of the audio media.

Remastering (music production) audio compression is intended to increase the overall volume of the audio without introducing audio clipping (clicking and static heard if audio volumes increase above the audio volume ceiling). In other words, remastering audio compression is almost solely intended to increase volumes without introducing unwanted noises. The MP3 compression described above is solely intended to reduce the storage sizes of audio files on disk, while maintaining the audio fidelity and quality as a reasonably close facsimile of the original uncompressed audio content. Note that while audio compression techniques began in the 1930s to support radio broadcasts, the MP3 format was created in the 1990s. While both of these techniques improved during the 1990s, they are entirely separate inventions used in entirely separate ways.
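For contrast with the data compression described earlier, here is a minimal sketch of volume compression as used in mastering. It rests on several assumptions: samples are floats between -1.0 and 1.0, the compressor reacts instantly (real compressors have attack and release times), and the threshold, ratio and function names are hypothetical values chosen for illustration.

```python
def compress_sample(x, threshold=0.5, ratio=4.0):
    """Shrink only the part of a sample's level that exceeds the threshold."""
    level = abs(x)
    if level <= threshold:
        return x  # quiet material passes through untouched
    squeezed = threshold + (level - threshold) / ratio
    return squeezed if x > 0 else -squeezed


def master_louder(samples, threshold=0.5, ratio=4.0, ceiling=1.0):
    """Compress the peaks, then raise everything until the loudest peak
    sits exactly at the ceiling (the 'makeup gain' step)."""
    compressed = [compress_sample(s, threshold, ratio) for s in samples]
    peak = max(abs(s) for s in compressed)
    if peak == 0:
        return compressed  # pure silence, nothing to raise
    gain = ceiling / peak
    return [s * gain for s in compressed]


louder = master_louder([0.1, -0.2, 1.0, 0.3])
```

In this example the peak at 1.0 stays at 1.0, so nothing clips, while the quiet 0.1 sample rises to about 0.16. No data is discarded to save space. The file stays the same size and only the difference between quiet and loud shrinks, which is exactly how pushing these settings too hard produces the Wall of Sound.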

For the reasons described in this section, I actually question the long term viability of the MP3 format once storage sizes become sufficiently large that uncompressed audio is the norm. MP3 wasn’t designed to improve audio fidelity at all. It was solely designed to reduce the storage sizes of audio files.

2000s

In the 2000s, we then faced the turn of the millennium and all of the computer problems that went along with that. With the near fall of Napster and the rise of Apple (again), the 2000s are punctuated by smart phone devices, the iPod and various ever smaller and lighter laptops.

At this point, I’d also be remiss in not discussing the rise of the video game console which has now become a new form of storytelling, like interactive cinema in its own right. These games also require audio recordings, but because they’re computer programs, they rely entirely on digital audio to operate. Thus, the importance of using a DAW to create waveforms for video games.

Additionally, the rise of digital audio and video in cinemas further pushes the envelope for audio recording. However, instead of needing a massive mixing board, audio can be recorded in smaller segments into audio files, then those files are “mixed” together using a DAW by a sound engineer, who can then play all of the waveforms back simultaneously in a mixed format. Because the sound files can be moved around on the fly, the timing can be changed, they can be added, removed, volumed up or down, have effects added, run backwards, sped up or slowed down or even duplicated multiple times to create unusual echo effects and new sounds. With video games, this can be done by the software while running live. Instead of baking down music into a single premade track, video games can live mix many audio clips, effects and sounds into a whole continuous composition live at the time the game plays. For example, if you enter a cave environment in a game, the developers are likely to apply some form of reverb onto the sound effects of walking and combat situations to mimic the sound you might experience inside of a cave environment. Once you leave the cave, that reverb effect goes away.
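The cave example can be sketched in a few lines. This is only an illustration under stated assumptions: clips are short lists of float samples, the “reverb” is a simple feedback echo standing in for the far more elaborate reverbs real game audio engines use, and every name here is hypothetical.

```python
def mix(clips):
    """Live mix: sum several clips sample by sample into one stream."""
    length = max(len(c) for c in clips)
    out = [0.0] * length
    for clip in clips:
        for i, sample in enumerate(clip):
            out[i] += sample
    return out


def echo(samples, delay=3, decay=0.5):
    """Feedback echo: each output sample adds a decayed copy of the
    output from `delay` samples earlier."""
    out = list(samples)
    for i in range(delay, len(out)):
        out[i] += decay * out[i - delay]
    return out


def render_frame(clips, in_cave):
    """Mix the active clips; apply the echo only while inside the cave."""
    mixed = mix(clips)
    return echo(mixed) if in_cave else mixed


footstep = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
outside = render_frame([footstep], in_cave=False)
inside = render_frame([footstep], in_cave=True)
```

Outside the cave the footstep plays back dry. Inside, the same clip gains trailing repeats at half and then a quarter of its level. Nothing is baked into the source clip, so the effect goes away the moment the flag flips back.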

The flexibility of sound creation in a DAW is fairly astounding, particularly when a sound engineer is doing all of this on a small laptop on location or when connected to their desk system at an office. The flexibility of using a video game console to live mix tracks into the gameplay on the fly is even more astounding. The flexibility of using a laptop remotely on a movie set is further amazing when you need to hear the resulting recordings played back instantly with effects and processing applied.

In the 2000s, these easy to use and affordable DAW software systems opened the door to home musicians and even professionals. This affordability put DAW systems within reach of small-time musicians, letting them create professional sounding tracks even on a limited budget. As long as home musicians were studious in learning the DAW software, they could now produce tracks that rivaled tracks professionally recorded, mixed and mastered at an expensive studio.

While the 1930s wanted to give home users a simple way to record audio at home, this was actually achieved once DAWs like Acid Music Studio arrived and could be easily run on a laptop with minimal requirements.

Not only were DAWs incredibly important to the recording industry, but so too were small portable recording and mixing devices like Zoom’s handheld recorders, such as the H4n and the later H1n. These handheld devices sport two built-in microphones and are battery operated. The H1n records a stereo pair of tracks, while the H4n supported 4 track inputs and could record two or four tracks simultaneously onto an SD card in various digital audio formats. These recorders also sported additional input types beyond the built-in microphones. While these handheld units are not technically a DAW, they do offer a few built-in minimal DAW-like tools. Additionally, the resulting audio files produced by one of these recorders could be imported into a DAW and used in any audio mix.

These audio recorders are incredibly flexible and can be used in a myriad of environments to capture audio clips. For capturing ambient background effects on the go, such as sirens, water running, rain falling or cars honking, this handheld recorder offers the perfect way to do this. The audio files produced from the built-in microphones are always incredibly crisp and clear, but you must remain perfectly silent so that distracting noises aren’t picked up by the incredibly sensitive microphones.

There have been a number of Zoom handy recorder products including models going back to 2007. The H1n is one of its newest models, but each of these Zoom recorder products works almost identically in recording capabilities to the earlier models.

iPhone, iPod, iPad and Mac OS X

This article would be remiss if it failed to discuss the impact the iPod, iPad and iPhone have had on various industries. However, one industry it has had very little impact on is the sound recording industry. While the iPad and iPhone do sport microphones, these microphones are not high quality. Meaning, you wouldn’t want to use the microphones built into these devices for attempting to capture professional audio.

These included microphones work fine for talking on the phone or using FaceTime or for purposes where the quality of the audio coming through the microphone is unimportant. As a sound designer, you wouldn’t want to use the built-in microphone for recording professional audio. With that said, the iPad does sport the ability to input audio channels into its Lightning or USB-C ports for recording into GarageBand (or possibly other DAWs) available on iOS, but that requires hooking up an external device.

Thus, these devices, while useful for their apps and for games and other mostly fun uses, are not intended to be used for trying to record professional audio content. With that said, it is possible to record audio into GarageBand via separate audio input devices connected to an iPhone or iPad.

A MacBook is much more useful for the purposes of audio recording because these typically have several ports which could sport several input or output audio devices such as mixing boards supporting multiple audio inputs, connecting a device up like the Zoom H1n or even controlling devices via MIDI and possibly all of the above. You can even attach extensive storage space to store these resulting recorded audio files, unlike an iPad and iPhone which don’t really have these large storage options available.

While the iPad and iPhone are groundbreaking devices in some areas, audio recording, mixing and mastering is not one of those areas… that’s also because of the limited storage space on these devices combined with their lack of high quality microphones. Apple has contributed very little to the improvement and ease of professional digital audio recording with its small handheld devices. The exception here is Apple’s MacBooks and Mac OS X, when using software like GarageBand, Audacity or Cubase… software that’s not easily used on an iPhone or iPad.

Let’s talk about the iPod here, but last. This device arrived in Apple’s inventory in 2001, long before the iPad or iPhone. This device was intended to be Apple’s claim to fame… and for a time, it was. This device set the tone for the future of the iPhone and iPad and even Apple Music. This device was small enough to carry, but had large enough storage capacity to hold a very large library of music while on the go. The iPod, however, didn’t really much change audio recording. It did slightly improve the quality of audio with its adoption of AAC. While AAC encoding did improve the audio quality and clarity over MP3 to a degree, the quality improvements were mostly negligible to the ears over a properly created MP3. What Apple’s AAC files did for Apple, more than anything, was carry a copy protection system (Apple’s FairPlay DRM) to prevent users from pirating music easily when saved in Apple’s protected AAC format. MP3 didn’t (and still doesn’t) offer these copy protections.

AAC ultimately became Apple’s way of enticing the music industry to sign onto the Apple iTunes store as it gave music producers peace of mind knowing that iPod users couldn’t easily copy and pirate music stored in that format. For audio consumers, the perceived enhanced quality is what got some consumers to buy into Apple’s iTunes marketplace. Though, AAC was really more about placating music industry executives than about enticing consumers.

2010s

The 2010s are mostly more of the same coming out of the 2000s with one exception, digital streaming services. By the 2010s, content delivery is quickly moving from physical media towards sales of digital product over the Internet via downloads. In some cases for video games, you don’t even get a digital copy. Instead, the software runs remotely with the only pieces pumped to your system being video and audio. With streaming music and video services, that’s entirely how they work. You never own a copy of the work. You only get to view that content “on demand”.

By this point in the 2010s, DVDs, Blu-rays and other physical media formats are quickly becoming obsolete. This is primarily due to conversion to streaming and digital download services. Even video games are not immune to this digital purchase conversion. This means that big box retailers of the past housing shelves and shelves of physically packaged audio CDs are quickly disappearing. These brick and mortar stores are now being replaced by digital streaming services (Apple Music, Netflix, Redbox, Hulu, iTunes and Amazon Prime), yes even for video games with services like Sony’s PlayStation Now (2014) and Microsoft’s GamePass (2017). Though, it can be said that Valve’s Steam began this video game digital evolution back in 2003. Sony has also decided to invest even more in its own game streaming and download platform in 2022, primarily in competition with GamePass, with its facelift to PlayStation Plus Extra.

As just stated above, we are now well underway in converting from physical media to digital downloads and digital streaming. Physical media is quickly becoming obsolete, along with the retailers who formerly sold those physical media products… thus many of these retailers have closed their doors (or are in the process)… including Circuit City, Fry’s Electronics, Federated, Borders / Waldenbooks and Incredible Universe. Some of these retailers like Barnes and Noble and Best Buy are still hanging on by a thread. Because Best Buy also sells appliances, such as washers, dryers along with large screen TVs, Best Buy is somewhat diversified to not be fully reliant on the conversion from physical media to digital purchases. It remains to be seen if Best Buy can survive once consumers switch entirely to digital goods and Blu-rays are no longer available to be sold. This means that were those digital content goods to disappear tomorrow, Best Buy may or may not be able to hang on. Barnes and Noble is still in a questionable position because they don’t have other tangible goods than books. They must rely primarily on physical book sales to keep this company afloat. GameStop is also in this same situation with physical video games, though they survive primarily by selling used consoles and used games.

Technological improvements in this decade include faster computers, but not necessarily better computers as well as somewhat faster Internet, but faster networking is entirely relative to where you live. While CPUs improve in speed, the operating systems seem to get more bloated and buggier… including iOS, Mac OS X and even Windows. Thus, while the CPUs and GPUs get faster, all of that performance is soaked up almost instantly by the extra bloatware installed on the operating systems by Apple and Microsoft and Google’s Android… making an investment in new hardware almost pointless.

Audio recording during this decade doesn’t really grow as much as one would hope. That’s mainly due to services like Apple Music, Amazon Music, Pandora, Tidal and even, yes, Napster. After Napster more or less lost its case with the music industry in 2001, it was forced to effectively change or die. Apparently, Napster decided to become a subscription service, reinventing itself. This allowed Napster to finally get the blessing of, and make royalty payments to, the industry to which it had lost its legal file sharing battle. Musical artists are now creating music that sells only because they have a fan base, but not because the music actually has artistic merit.

As for Napster, it all gets more convoluted. From 1999 to 2009, Napster continued to exist and grow its music subscription service. In 2009, Best Buy wanted a music subscription service for its brand and bought Napster. A couple years later, in 2011, and due primarily to money loss problems within Best Buy, Best Buy was forced to sell the remnants of Napster to the Rhapsody music service including Napster’s subscriber base along with the Napster name. In 2016, Rhapsody bizarrely renames itself to Napster… which is where we are today. The Napster that exists today isn’t the Napster from 1999 or even the Napster from 2009.

The above information about Napster is more or less included as a follow-on to the previous discussion about Napster’s near demise. This information doesn’t necessarily further the audio recording evolution, but it does tertiarily relate to the health of the music and recording industry as a whole. Clearly, if Best Buy can’t make a solid go of its own music subscription service, then maybe we have too many?

As for cinema sound, DTS and Dolby Digital (with its famous double D logo) alongside THX’s acoustical room engineering became the digital standards for theater sound. Since then, though, audio innovation in cinema has mostly halted. This decade has been more about using the previously designed innovations than about improving the cinema experience. In fact, you would have thought that after 2019’s COVID, cinemas would have wanted to invigorate the theater experience to get people back into the auditoriums. The only real innovation in the theater has been to seating, not to the sound or picture.

This article has intentionally overlooked the transition from analog film cameras to digital cameras (aka digital cinematography), which began in the mid 1990s and has now become quite common in cinemas. Because this transition doesn’t directly impact sound recording, it’s mentioned only in passing. Know that this transition from film to digital cameras occurred. Likewise, this article has chosen not to discuss Douglas Trumbull’s 60 FPS Showscan film projection process as it likewise didn’t impact the sound recording evolution. You can click through to any of the links to get more details for these visual cinema technologies if you’re so inclined.

Audio Streaming Services

While the recording industry is now firmly reliant on a DAW for producing new music, that new music must be consumed by someone, somewhere. That somewhere includes streaming services like Napster, Apple Music, Amazon Music and Pandora.

Why is this important? It’s important because of the former usefulness of the CD format. As discussed earlier, the CD was more or less a stop-gap for the music industry, but at the same time it propelled the audio recording process in a new direction and offered up a whole new format for consumers to buy. Streaming services, like those named above, are now the place to go to listen to music. No longer do you need to buy and own thousands of CDs. Now you just need to pay for a subscription service and you have instant access to perhaps millions of songs at your fingertips. That’s like walking into a record store, opening every single CD in the store and listening to every single one of them for a small monthly fee. This situation could only happen on a global Internet scale, never on a single store sized scale.

For this reason, record stores like Virgin Megastore and Blockbuster Music (now out of business) no longer need to exist. When getting CDs was the only way to get music, CDs made sense. Now that you can buy MP3s from Amazon or, better, sign up for a music streaming service, you can listen to any song you want at any time you want just by asking your device’s virtual assistant or by browsing.

The paradigm of listening to commercial music has now shifted during the 2010s. Apple Music launched in 2015, for example. Since 2015, this service has gained 88 million subscribers as of 2022 and counting. The need to buy prerecorded music, particularly CDs or vinyl, is almost nonexistent. The only people left buying CDs or vinyl are collectors, DJs or music diehards. You can get access to brand new albums the instant they drop simply by being a subscriber. With iOS devices and Apple Music, you can even download the music to your device for offline listening. You don’t need to rely on having access to the internet to listen. You simply need access to download the tracks, but not to listen to them. As long as your device remains subscribed, all downloaded tracks remain valid.

It also means that if you buy a new device, you still have access to all of the tracks you formerly had. You would simply need to download them again.

As for music recording during this era, the DAW is firmly entrenched as the recording software of choice whether in a studio or at home. Bands can even set up home studios and record their tracks right in their own studio. No need to lease out expensive studio space when you can record in your own studio. This has likely put a pinch on former studios that relied on bands showing up to record, but it was an inevitable outcome of the DAW, among other music equipment changes.

Though, it also means that the movie industry has an easier time of recording audio for films. You simply need a laptop or two and you can easily record audio for a movie production while on location. What was once cumbersome and required many people handling lots of equipment is likely down to one or two people using portable equipment.

As for cinema audio, effectively, not much has changed since the 1970s other than perhaps better amplifiers and speakers to better support THX certification. Though by the 2010s, digital sound has become ubiquitous, even when using actual developed film prints, though digital cinematography is now becoming the de facto standard. While cinemas have moved towards megaplexes containing 10, 20 or 30 screens, the technology driving these theaters hasn’t much changed this decade. Other competitors to THX have come into play, like Dolby Atmos (2012), which also offers up optimal speaker placement and volume to ensure high quality spatial audio in the space allotted. While THX’s certification system was intended for commercial theater use, Dolby Atmos can be used either in a commercial cinema setting or in a home cinema.

2020s

We’re slightly over 2 years into the 2020s (100 years since the 1920s) and it’s hard to say what this decade might hold for audio recording evolution. So far, this decade is still riding out what was produced in the 2010s. When this decade is over, this section will be written. Until then, this article is awaiting this decade to be complete. Because this article is intended as a 100 year history, no speculation will be offered as to what might happen this decade or farther out. Instead, we’ll need to let this decade play out to see where audio recording goes from here.

Note: All images used within this article are strictly used under the United States fair use doctrine for historical context and research purposes, except where specifically noted.

Please Like, Follow and Comment

If you enjoy reading Randocity’s historical content, such as this, please like this article, share it and click the follow button on the screen. Please comment below if you’d like to participate in the discussion or if you’d like to add information that might be missing.

If you have worked in the audio recording industry during any of these decades, Randocity is interested in hearing from you to help improve this article using your own personal stories. Please leave a comment below or feedback in the Contact Us area.


Tagged with: 00s, 100 years, 10s, 1930s, 1940s, 1950s, 1960s, 1970s, 1980s, 1990s, 2000s, 2010s, 2020s, 20s, 30s, 40s, 50s, 60s, 70s, 80s, audio, decades, history, recording, technology

Rant Time: Adobe VoCo’s ethical dilemma

Posted in best practices, botch, business, california, ethics by commorancy on February 28, 2018

I have to wonder about Adobe’s business ethics at times. First, there’s Photoshop. While I can admit that photo editing has a legitimate purpose, such as correcting red eye or removing telephone lines or removing reflections of the camera man from a photo, there is the much seedier and ethically murky purpose for Photoshop. Now comes Adobe VoCo. It is a product idea that does for spoken audio what Photoshop does for images. Let’s explore this YouTube clip from 2016:

Skip to 3:18 for the meat of this video.

VoCo’s Use Cases and Ethics

Though, yes, I will concede that the demonstration above was funny and we all laughed, the demonstration has a deep seated ethically murky undertone once the laughing stops. In fact, that’s what prompted this blog article.

Unlike Photoshop which has actual real world use cases (yes, other than making models thinner and glowier for the cover of Vogue), VoCo is one of those unnecessary tools that, while cool in theory, makes Adobe seem that it’s now in the business of causing world disruption instead of actually solving creative problems. After the ethical problems created by Photoshop, Adobe has to know the ethical quandary it introduces by bringing the VoCo audio editing tool to market. Adobe decides to go ahead with demoing this tool anyway. So much for business ethics. Instead, Adobe should have patented and shelved this product idea and never shown it off.

There’s no effective real world use case for this product other than for making someone say things that they actually didn’t say. The only use case where this technology might even be somewhat useful, depending on output quality, is in the voice over industry where an actor might be unavailable at a time when a line needs to be changed to fit continuity better. The voice over industry is the only industry where VoCo could have even the smallest glimmer of hope of a use case. This is such a tiny niche market segment to introduce this tool in such a public spectacle way.

The only other use case would be to sample all of the audio from a particular dead actor or actress’s productions and then recreate lines of new spoken dialog based on that. Again, this is one of those entertainment areas that fits firmly into the uncanny valley, particularly if the spoken lines are attached to a CG actor. Again, this is not a substantial use case in my opinion and is most definitely creepy. It’s definitely not a big enough use case to warrant this public release spectacle. Do we really want to see Marilyn Monroe or Elvis brought back to life on the big screen using CG and VoCo dialog?

There is no other legitimate use case for this product. It’s like Adobe intentionally wants to flaunt its lack of ….

Business Ethics and Self-Editing

Businesses today have no ability to self-edit or recognize ethics; that is, to stop ethically bad product ideas from making it to market. Just thinking about this product and how it could possibly be used, it doesn’t have legitimate use cases (other than the voice over use case I mentioned above). However, there are perhaps thousands of illegitimate uses for this tool. Let’s list a few of them, shall we:

Falsifying a deposition to make the person being deposed say something they didn’t say
Falsifying a statement of non-confession to make a person confess to a crime when they didn’t actually confess
Falsifying a phone conversation
Changing any spoken words from non-incriminating to incriminating evidence

In legal circles, the use for this tool is ripe for abuse and has use cases as wide as the Grand Canyon and as deep as the Mariana Trench. In other words, while VoCo has no substantial legitimate use cases, it has thousands of illegitimate use cases. There is no way Adobe couldn’t see this. There is no way for Adobe to feign ignorance about this tool or the ethical problems it imposes if released.

Legal Evidence

Some have theorized that this tool would follow the same path Photoshop has. Basically, because evidence can now be manufactured in products like VoCo, it means that audio evidence would no longer be easily admissible. While that idea has some soundness to it, the legal system is not always technically savvy and can sometimes move at a snail’s pace. Eventually, the courts and lawyers will be on board with this ‘manufactured evidence’ sound clip idea, but not before several someones are incriminated over manufactured evidence that isn’t caught in time.

Some have theorized that Adobe should watermark the sound clip. The difficulty with audio watermarking is that it ruins the audio. No one would buy a professional audio tool that intentionally makes the audio sound bad or introduces something that is audibly noticeable, strictly because Adobe wants to insert a watermark to legally cover their collective butts. No. No one would buy a tool that causes damage to the audio output. This means that only a silent kind of watermark could be introduced. Such a watermark would consist primarily of a tag within the saved audio clip file. Any tags introduced in a save file can easily be stripped away by converting the audio clip to a new format or by playing the audio clip back and recording it on analog equipment. In fact, a whole industry and set of tools would likely appear to strip out any watermarks imposed by Adobe onto the saved files.

Unless there is a substantial way to identify that the clip has been edited, and I don’t know how Adobe could even solve this problem fully, VoCo is a tool that would end up more abused than legitimately used.

Flawed Product Ideas

While this is somewhat of a cool technological advancement, it doesn’t need to exist. It doesn’t need to exist because it has basically one limited use case. I’d argue that as a production runner, you can just wait until the voice actor becomes available and ask them to re-record the lines you need. That is, instead of using a tool like this. A tool like VoCo might save you some time, but by demanding such a tool for your use, it means the rest of the world must also endure the consequences of a world full of falsified evidence. Is that the world you want to live in? Evidence that could even be used against you, the audio editor. No, thanks.

However, it’s clear that prototype code has been written based on the video above. This means that Adobe could release such a product into the wild in the future. Thankfully, as of this article in 2018, this product does not yet exist. Unfortunately, Adobe has already opened Pandora’s box. A working prototype means that any coder with leanings towards audio engineering could produce a similar tool and release it into the wild without the help of Adobe. Thanks Adobe.

It is as yet unclear when or if this product could ever be released. Note that this video segment apparently showcases experimental product ideas (products that may never see the light of day) and not actual products. After all, such a legally murky product would have to clear Adobe’s legal team before release. Considering the many negative use cases for such an audio editing product and the legal liability that Adobe might endure as a result, I’d hope that Adobe’s legal team has shelved this product idea permanently.

Agree or disagree? Please leave a comment below. Also, don’t miss any new Randocity articles by subscribing to this blog via clicking the blue follow button at the top right.

Tagged with: adobe, audio, business ethics, editing, editor, evidence, falsified, voco

Audio Tip: How to decode 5.1 DTS / AC3 to 6 WAV files

Posted in audio engineering by commorancy on April 10, 2016

[Updated 02/21/2017] Please see the updated Alternative Solutions below. These don’t require Cubase.

For those of us who are hobbyist home audio engineers, here’s a tip that might come in handy when trying to extract 5.1 (6 channel) audio from DTS/AC3 to individual WAV (or more specifically, WAV 64) files. This technique may or may not work for 7.1 (8 channel) audio. Let’s explore.

What You Will Need

Cubase 5 (or newer) or possibly Audacity
FFMpeg
TSMuxer GUI with tsmuxer
Windows only: eac3to

First Step – TSMuxer GUI

Extract the DTS/AC3 stream from video container using TSMuxer Gui. To do this…

Load the *.m2ts, *.vob, *.mkv, etc file into TSMuxer Gui using File=>Open.
Once the file is loaded, uncheck all other streams except the audio stream (DTS or AC3)
Choose ‘Demux’ as the Output type
Choose ‘Browse’ if you want to place the output file somewhere other than where the app has chosen
Click ‘Start Demuxing’
When completed, you will have a *.dts or *.ac3 file as output.

This step demuxes the audio from the full movie container.

Second Step – FFMpeg

Extract the DTS 5.1 audio to a single 6 channel WAV file via FFMpeg using the following:

ffmpeg -i 00000.track_4352.dts -acodec pcm_s24le output-file.w64

This will create a 6 channel w64 (wave 64) file. You’ll want to use *.w64 because of the 4GB max size of standard *.wav files. If you know your output file will be smaller than 4GB, you can use *.wav instead. Also, if you want to master in 32 bit or higher, you can choose the pcm output version that corresponds to the bit depth you want to use. I’m using 24 bits for my remastering efforts. The more bits you use for mastering, the more likely you will need to use w64.
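
As a rough check on the 4GB limit, uncompressed PCM size can be estimated as sample rate x channels x bytes per sample x duration in seconds. The sketch below only illustrates that arithmetic. The helper names pcm_size_bytes and pick_container are made up for this example, the 5833 second duration (1h37m13s) is borrowed from the ffprobe sample further down, and file header overhead is ignored.

```shell
# Estimate the size of uncompressed PCM audio, in bytes (header not counted).
# usage: pcm_size_bytes SAMPLE_RATE CHANNELS BITS_PER_SAMPLE SECONDS
pcm_size_bytes() {
    echo $(( $1 * $2 * ($3 / 8) * $4 ))
}

# Suggest an extension: standard WAV tops out around 4GB, so anything
# at or above 4 GiB (4294967296 bytes) is steered to w64.
# usage: pick_container SIZE_IN_BYTES
pick_container() {
    if [ $(( $1 >= 4294967296 )) -eq 1 ]; then
        echo w64
    else
        echo wav
    fi
}

# 48 kHz, 6 channels, 24 bits, 1h37m13s
size=$(pcm_size_bytes 48000 6 24 5833)
echo "$size bytes -> $(pick_container "$size")"
```

For that movie length, the 24 bit estimate comes to 5039712000 bytes, which is past the limit, while the same audio at 16 bits would be 3359808000 bytes and would still fit in a standard WAV.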

Third Step – Cubase

Use File=>Import to bring output-file.w64 into Cubase
When the small panel appears asking how you would like to import, choose Split Channels. You can number them if you like.
It may take a little while to split them all out.

Note, this is the part that I do not know if Audacity supports. It may be able to perform Split Channels like Cubase, but you would need to test Audacity to find out whether it can and how. Cubase can definitely split the channels, though.

Fourth Step – Exporting WAV / W64 files

From here, you can continue to use Cubase or Audacity to produce a remastered audio file or …
You can save each individual channel as a separate WAV file for some other use. Note, you should use *.w64 (wave 64) if the files are expected to be larger than 4GB in size.

It’s up to you what you want to do with the resulting files.

Alternative Solutions

Using ffmpeg only.

Note, you will need to install the latest version of ffmpeg to ensure compatibility with this solution.

I have since found you can accomplish the extraction to individual WAVes using ffmpeg only. You won’t need Cubase or tsmuxer for this alternative solution. You extract your WAVes by setting up channel mappings, then assigning those mappings to each output file. Though, this solution is just a tad bit more complicated in that you need to know what channels your input audio offers, which channels to extract and what the two letter abbreviation for the channel is within ffmpeg.

For extracting 5.1, use the following command (Linux line break style shown):

ffmpeg -i infile \
-filter_complex "channelsplit=channel_layout=5.1[FL][FR][FC][LFE][BL][BR]" \
-map "[FL]" front_left.wav \
-map "[FR]" front_right.wav \
-map "[FC]" front_center.wav \
-map "[LFE]" lfe.wav \
-map "[BL]" back_left.wav \
-map "[BR]" back_right.wav

To extract 7.1, use the following command. Note that ffmpeg’s plain 7.1 layout ends with the side channels (SL and SR). If your source is 7.1(wide), use channel_layout=7.1(wide) with FLC and FRC as the last two channels instead:

ffmpeg -i infile \
-filter_complex "channelsplit=channel_layout=7.1[FL][FR][FC][LFE][BL][BR][SL][SR]" \
-map "[FL]" front_left.wav \
-map "[FR]" front_right.wav \
-map "[FC]" front_center.wav \
-map "[LFE]" lfe.wav \
-map "[BL]" back_left.wav \
-map "[BR]" back_right.wav \
-map "[SL]" side_left.wav \
-map "[SR]" side_right.wav

Where infile is your source file. The input can be a video file (e.g., vob, m2ts, mkv, etc) or a multichannel audio file (e.g., AC3, DTS, etc). If you are running Windows using a CMD command shell, the backslash line breaks shown above won’t work, so copy and paste won’t directly work on Windows. Either type the command on a single line or use an editor to replace each trailing backslash with a caret (^), which is CMD’s line continuation character.

Note that there are a lot of different possible mappings for various types of audio input files. Since there are now many formats of soundtrack audio available such as Dolby Atmos, DTS, AC3 and various others, you should first determine the format of the input audio with ffprobe (see below) to better understand how to map and extract the audio. The channels and layouts available for possible extraction in ffmpeg include the following:

Command:

ffmpeg -layouts -hide_banner

Output:

Individual channels:
NAME        DESCRIPTION
FL          front left
FR          front right
FC          front center
LFE         low frequency
BL          back left
BR          back right
FLC         front left-of-center
FRC         front right-of-center
BC          back center
SL          side left
SR          side right
TC          top center
TFL         top front left
TFC         top front center
TFR         top front right
TBL         top back left
TBC         top back center
TBR         top back right
DL          downmix left
DR          downmix right
WL          wide left
WR          wide right
SDL         surround direct left
SDR         surround direct right
LFE2        low frequency 2

Standard channel layouts:
NAME        DECOMPOSITION
mono        FC
stereo      FL+FR
2.1         FL+FR+LFE
3.0         FL+FR+FC
3.0(back)   FL+FR+BC
4.0         FL+FR+FC+BC
quad        FL+FR+BL+BR
quad(side)  FL+FR+SL+SR
3.1         FL+FR+FC+LFE
5.0         FL+FR+FC+BL+BR
5.0(side)   FL+FR+FC+SL+SR
4.1         FL+FR+FC+LFE+BC
5.1         FL+FR+FC+LFE+BL+BR
5.1(side)   FL+FR+FC+LFE+SL+SR
6.0         FL+FR+FC+BC+SL+SR
6.0(front)  FL+FR+FLC+FRC+SL+SR
hexagonal   FL+FR+FC+BL+BR+BC
6.1         FL+FR+FC+LFE+BC+SL+SR
6.1(back)   FL+FR+FC+LFE+BL+BR+BC
6.1(front)  FL+FR+LFE+FLC+FRC+SL+SR
7.0         FL+FR+FC+BL+BR+SL+SR
7.0(front)  FL+FR+FC+FLC+FRC+SL+SR
7.1         FL+FR+FC+LFE+BL+BR+SL+SR
7.1(wide)   FL+FR+FC+LFE+BL+BR+FLC+FRC
7.1(wide-side)FL+FR+FC+LFE+FLC+FRC+SL+SR
octagonal   FL+FR+FC+BL+BR+BC+SL+SR
downmix     DL+DR
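
Rather than scanning this listing by eye, the decomposition for one layout can be pulled out with a small filter. This is only a sketch: layout_decomposition is a made-up helper name, and it assumes the listing keeps the format shown above (a "Standard channel layouts:" heading followed by NAME and DECOMPOSITION columns), which may differ in other ffmpeg versions.

```shell
# Print the channel decomposition for one layout name.
# usage: ffmpeg -layouts -hide_banner | layout_decomposition '5.1(side)'
layout_decomposition() {
    awk -v want="$1" '
        # Only look at lines after the layouts heading.
        /^Standard channel layouts:/ { in_layouts = 1; next }
        # The decomposition is the trailing run of channel names joined by "+".
        in_layouts && match($0, /[A-Z][A-Z0-9]*([+][A-Z][A-Z0-9]*)*$/) {
            name = substr($0, 1, RSTART - 1)
            sub(/[ \t]+$/, "", name)      # trim the column padding
            if (name == want) { print substr($0, RSTART); exit }
        }'
}
```

Matching on the whole name (rather than a prefix) matters here, since 5.1 and 5.1(side) start with the same characters, and long names such as 7.1(wide-side) run straight into the decomposition without a space.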

Using FFProbe

To determine the audio channels available to extract from your infile, use ffprobe as follows:

Linux / MacOS X

$ /path/to/ffprobe -i infile -hide_banner 2>&1 | egrep "^I|^ "

Windows

C:\path\to\ffprobe -i infile -hide_banner

The output will look something like

Input #0, mpegts, from 'my_movie.m2ts':
  Duration: 01:37:13.61, start: 11.650667, bitrate: 43437 kb/s
  Program 1
    Stream #0:0[0x1011]: Video: h264 (High) (HDMV / 0x564D4448), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 90k tbn, 47.95 tbc
    Stream #0:1[0x1100]: Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side), fltp, 1536 kb/s
    Stream #0:2[0x1200]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090)
    Stream #0:3[0x1201]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090)

The audio stream (Stream #0:1 in the example above) is the stream you will need to examine. In the output example above, this stream contains 5.1(side) audio in the DTS format. For 5.1(side) extraction, the command would look like the following:

# From the above -layouts output
# input = 24 bits and preserve output at 24 bits
# 5.1(side)   FL+FR+FC+LFE+SL+SR

ffmpeg -i infile \
-filter_complex "channelsplit=channel_layout=5.1(side)[FL][FR][FC][LFE][SL][SR]" \
-acodec pcm_s24le \
-map "[FL]" front_left.wav \
-acodec pcm_s24le \
-map "[FR]" front_right.wav \
-acodec pcm_s24le \
-map "[FC]" front_center.wav \
-acodec pcm_s24le \
-map "[LFE]" lfe.wav \
-acodec pcm_s24le \
-map "[SL]" side_left.wav \
-acodec pcm_s24le \
-map "[SR]" side_right.wav

Audio Output Format

If you don’t specify an audio output format, ffmpeg defaults to using pcm_s16le which creates a 16 bit WAV file for each channel. If you want to preserve the bit depth of the original audio, then you’ll want to specify the output format to be the same as the input for each WAV. This means that you can, if you want, specify 16 bit for some channels and 24 bit for others. For 24 bits, specify -acodec pcm_s24le just before each -map flag (see example above). For 32 bits, specify -acodec pcm_s32le.

Also note that while you can select an individual audio track within a movie container, you don’t really need to when there is only one. Ffmpeg picks an audio stream automatically. However, if there are multiple 5.1 or 7.1 audio streams or there are multiple separate programs, you will need to tell ffmpeg which stream to decode into WAV files. Note that if you need the syntax for selecting a specific audio stream using ffmpeg, please leave a comment below and I’ll write up an example for you.
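
Putting the pieces together, the filter string and the per-channel output arguments can be generated from a layout name and its decomposition. Treat this as a sketch, not a finished tool: split_filter and split_outputs are hypothetical helper names, the output files are simply named after the channel abbreviation, and the helpers only print text for you to inspect and paste into an ffmpeg command. The optional third argument uses ffmpeg’s stream specifier syntax, where [0:a:1] means the second audio stream of the first input.

```shell
# Print the -filter_complex value for a layout.
# usage: split_filter LAYOUT DECOMPOSITION [AUDIO_STREAM_INDEX]
split_filter() {
    # Turn "FL+FR+FC" into "[FL][FR][FC]".
    labels=$(printf '%s\n' "$2" | sed 's/[^+][^+]*/[&]/g; s/+//g')
    if [ -n "${3:-}" ]; then
        input="[0:a:$3]"      # pin a specific audio stream of input 0
    else
        input=""              # let ffmpeg choose the audio stream
    fi
    printf '%s\n' "${input}channelsplit=channel_layout=$1$labels"
}

# Print one set of output arguments per channel.
# usage: split_outputs DECOMPOSITION CODEC
split_outputs() {
    printf '%s\n' "$1" | tr '+' '\n' | while read -r ch; do
        printf '%s %s -map "[%s]" %s.wav\n' -acodec "$2" "$ch" "$ch"
    done
}

split_filter '5.1(side)' 'FL+FR+FC+LFE+SL+SR' 1
split_outputs 'FL+FR+FC+LFE+SL+SR' pcm_s24le
```

The first line printed is the filter for the second audio stream, and the remaining lines are the six -acodec / -map pairs, in the same order as the channels appear in the layout.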

Below is a list of PCM output formats available in ffmpeg version 3.2.4:

# Find these with the command

$ ffmpeg -formats | grep PCM

 DE alaw            PCM A-law
 DE f32be           PCM 32-bit floating-point big-endian
 DE f32le           PCM 32-bit floating-point little-endian
 DE f64be           PCM 64-bit floating-point big-endian
 DE f64le           PCM 64-bit floating-point little-endian
 DE mulaw           PCM mu-law
 DE s16be           PCM signed 16-bit big-endian
 DE s16le           PCM signed 16-bit little-endian
 DE s24be           PCM signed 24-bit big-endian
 DE s24le           PCM signed 24-bit little-endian
 DE s32be           PCM signed 32-bit big-endian
 DE s32le           PCM signed 32-bit little-endian
 DE s8              PCM signed 8-bit
 DE u16be           PCM unsigned 16-bit big-endian
 DE u16le           PCM unsigned 16-bit little-endian
 DE u24be           PCM unsigned 24-bit big-endian
 DE u24le           PCM unsigned 24-bit little-endian
 DE u32be           PCM unsigned 32-bit big-endian
 DE u32le           PCM unsigned 32-bit little-endian
 DE u8              PCM unsigned 8-bit

You’ll need to prefix pcm_ to the name of the format to use it on the command line. For example, if you want to use u16be, then you would specify that as -acodec pcm_u16be.
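
If you script your extractions, the naming rule above is easy to wrap in a helper. This is a small sketch that assumes you only want the signed little-endian variants used elsewhere in this article; pcm_codec_for_bits is a made-up name.

```shell
# Map a bit depth to the matching signed little-endian PCM codec name.
# usage: pcm_codec_for_bits 24
pcm_codec_for_bits() {
    case "$1" in
        16|24|32) echo "pcm_s${1}le" ;;
        *) echo "unsupported bit depth: $1" >&2; return 1 ;;
    esac
}

echo "-acodec $(pcm_codec_for_bits 24)"
```

Anything other than 16, 24 or 32 is rejected rather than guessed at, since an unexpected codec name would only fail later inside ffmpeg.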

Windows Only (eac3to)

Tool required: eac3to

Here is the command to use to extract all of the waves with eac3to:

# N: means the stream number in the input file
# You will need to determine which stream number to use
C:\path\to\eac3to "C:\path\to\infile" N: C:\path\to\output.wavs

Example:
  C:\bin\eac3to "C:\Movies Folder\mymovie.m2ts" 1: D:\Audio\mymovie.wavs

You’ll want to choose the proper drive for each of the paths instead of C:\. Also note, the .wavs (with the ‘s’) extension is important so that all of the WAVes will be exported from this tool.

Thanks go to Eli for the eac3to solution. If you are running Windows in your audio workflow, then this tool seems to be a fast one-step alternative.

Tagged with: audio, cubase, dts, ffmpeg, how-to, split, tsmuxer, wav

Amazon Echo: What is it?

Posted in Amazon, business, cloud computing by commorancy on June 20, 2015

What is Amazon Echo?

It’s an approximately 10″ flat black cylinder with reasonable quality speakers, an LED ring around the top, a voice recognition system and a remote. While that may seem a little simple, these are the fundamental pieces that matter.

If you’ve ever used a Roku, a smart TV, Amazon Fire TV or an Apple TV, then you pretty much know what Amazon Echo is (minus the speakers). Except, Amazon Echo is intended to be used with audio programs (i.e., news radio, podcasts, music, prime music, weather radio, audiobooks, speech synthesis for reading articles, etc). Anything you can imagine with audio, the Amazon Echo would be the perfect companion. What Apple TV is to movies, Amazon Echo is to audio programs.

In addition to the Alexa audio assistant (like Siri), with a web, tablet or phone app, you can completely control your Echo with the Echo companion app. There is so much that is required by the app, you really can’t get along without it. In fact, you need the app to hook up the Echo to the WiFi which also asks a series of questions about how it will be used. So, if you don’t have a phone, tablet or computer browser, good luck setting up your Echo.

And no, you don’t need to own an Amazon tablet. You can use an iPhone, iPad or any other Android tablet or phone. In fact, you can even use your computer’s browser. Because the Amazon Echo is hooked to the Amazon eco-system, you will also need an Amazon login and password. But, you likely already have this since you purchased the Echo with it. But, if you’re planning on giving it as a gift, the person you are giving it to will also be required to have all of the above. So, Amazon Echo is probably not the best gift idea for those who are not computer savvy or those who choose not to be connected. Remember that this is first and foremost a cloud player device. The faster the speed of the internet connection, the better Echo will work.

Is the Amazon Echo useful?

That’s a good question. If you’re someone who listens to radio programs or other audio programs like podcasts, then perhaps. Though, keep in mind there are some severe limitations in what you can do with the Amazon Echo. For example, the partners Amazon has chosen for its ‘audio channels’ are limited to Pandora, iHeartRadio, TuneIn and Audible. So, like Apple TV has limited video channels, Amazon Echo also has severely limited audio channels. Because of the audio partner limits, you really get a very small selection of content. For example, if Amazon had partnered with Sirius radio, there would be a whole lot more programming choices. Or, for that matter, by partnering with Muzak, Soundcloud, Rhapsody, YouTube audio, Last.fm or other audio partners, I would say there would be many more choices in audio. Until then, Echo is nice but somewhat a novelty.

Alexa vs Siri

Alexa clearly has a better voice than Siri. But, other than the voice choice, the functionality is about the same. Like Siri, Alexa has easter eggs, knows what she knows, but what she knows is very very limited. So, don’t expect to be able to ask Alexa complex questions. To activate Alexa, you simply say the key word ‘Alexa’, suffixed quickly by what you want her to do. For example, ‘Alexa, set volume to 5’. Alexa is always listening for the keyword. Once you say the keyword, Alexa will begin listening for your command.

Wording matters with your sentences or Alexa gets quickly confused as to what you’re asking. For example, there’s a difference between asking ‘Alexa, play Frank Sinatra Songs for Swingin Lovers’ or ‘Alexa, play Songs for Swingin Lovers by Frank Sinatra’ or ‘Alexa, play the album Songs for Swingin Lovers by Frank Sinatra’. The shorter you tend to phrase your request, the more likely Alexa is to do the wrong thing or become confused and do nothing. Echo sometimes hears a phantom keyword and activates.

There are many times when you ask Alexa to do something that, instead of responding with ‘Okay’ or some affirmative voice response, the LED ring at the top flashes in a ‘special way’. So, we’re left to try and decode the R2D2 LED responses from the Echo. Instead, I personally believe Alexa should affirmatively or negatively respond to every voice command. Unfortunately, she doesn’t.

Oh, and no, there is not yet a way to change the voice to a male or some other alternative voice. Though, you can change the wake-up word. So, it doesn’t have to be ‘Alexa’.

Alarms, ToDo Lists and Shopping

It is to be expected that you can shop for music through the Echo. So, if you ask Echo to play something that leads to samples, you can buy the song that’s playing. This will then be put into your library for future playback.

You can set only one alarm and one timer. This means you can set an alarm for wakeup, but you can’t have two alarms. So, if you have a spouse or partner, you can’t each have an alarm set for a separate time. That won’t work, yet. If you want to time two different things (important while cooking), you can’t do this either. It supports only one timer.

When the alarm or timer goes off, the audio noise it makes is limited to an internal sound only. Even though you have access to Prime music and radio, you cannot set the timer to use one of those audio sources. So… limited. There are also other limits.

There is a ToDo and Shopping list that you can ask Alexa to manage. You can say, ‘Alexa, add bananas to my shopping list’. When you open the Echo app, you will have your shopping list with you in the store. You can also remote control the Echo app as long as you have Internet on your phone. So, if you have a cat and you like to leave music playing, you can set up playlists, turn the volume up or down, change the music or shut it off.

Music

This is probably where the Echo shines its brightest. With its two speaker system, the audio is bright and vibrant. Not quite as nice as the Bose Soundlink Mini, but the sound is acceptably full and rich for the cylinder design. Unfortunately, it also has no stereo and it needs it. Amazon needs to offer a companion cylinder connected by bluetooth to offer full rich stereo sound. In fact, it could offer several BT connected cylinders to offer 5.1 or 7.1.

Beyond the sound quality issues, having access to Prime music is a necessity here. If you aren’t a Prime member, you really can’t take advantage of what Echo offers. If you do have Prime, then you get access to not only whatever you’ve purchased or uploaded to Amazon’s cloud player, you also get access to the full Prime music library. Still, Amazon’s Prime library is limited. It seems to have a lot of classic rock choices, but not all of it. So, while it has Fleetwood Mac and the Eagles, it doesn’t have Supertramp, for example.

Though, Autorip is your friend with Echo. If you buy a CD with Autorip, it automatically becomes available on the Echo as soon as you’ve paid. However, if you purchase a CD at Target and rip it, you’re limited to 250 uploaded songs unless you pay Amazon an additional $25 a year for 250,000 song uploads.

Audiobooks

If you are a big Audible.com consumer, then you have a distinct advantage with the Echo. You can listen to all of your audio books right on the Echo. If your library is vast, you’ll immediately have a lot of content available to you. In hindsight, I should have been buying audio books when offered with my Kindle purchases, but I never really had any way to play them. With Echo, that’s changed. I will definitely consider audio books in the future.

Kindle Support?

In short, no. There is no support for Alexa to read back Kindle book content. Alexa would be the perfect companion to the Kindles that do not offer audio voice playback. Considering this is an Amazon product and would be the perfect companion for the Kindle, it’s surprising that the integration between Kindle and Echo is non-existent.

Audiophile Quality?

Definitely not. You’re playing streaming music here, in mono no less. So, while the Echo is great for podcasts, news and incidental background music, don’t give up your audiophile gear. Much of the music streamed from Amazon prime has the telltale mpeg haziness. Echo never skips or stutters while playing Prime or library music, so its streaming IO seems quite robust, but it just doesn’t sound high quality. This is definitely not to be considered an HD quality device as it clearly isn’t. So, don’t go into an Amazon Echo thinking you’ll be getting a high quality music experience. The music does sound decent, but it’s not anywhere near perfect.

Though, for news, podcasts and other spoken word programs, the Amazon Echo is perfect for this use.

Speech Synthesis and Browsing

The voice for Alexa sounds great most of the time. However, when reading back a Wikipedia article synopsis, she doesn’t always do a great job. While music is Echo’s strongest area, the article reading is easily one of Echo’s weakest. Instead of becoming an audio web browser (which is what Echo should become), Alexa only offers page snippets of articles and then encourages you to crack open a browser or tablet and finish reading there. If Echo is going to do this, why bother using Alexa at all? If I can get better results by reading it myself, then Alexa is pointless for this purpose.

Instead, Alexa should provide full 100% article reading. Read me news, wiki articles or, indeed, any other page on the web. If I ask Alexa to browse to Yahoo News, Alexa needs to be able to read article headlines and let me choose which article to read back. Literally, Echo should become an audio based web browser. Echo should set the standard for audio web browsing so much so that Yahoo and Google optimize their pages for audio browsers much like they are now doing for mobile devices.

Kitchen Use?

Echo would be the perfect companion in the kitchen. Tablets and other touch devices are nowhere near the perfect device in the kitchen. They get dirty and must be touched by dirty or wet hands. Echo, on the other hand, is the perfect hands-free kitchen companion. ‘Alexa, how do I make Beef Stroganoff’. Seems like a simple recipe request, but no. Alexa has no knowledge of cooking, recipes or anything else to do with kitchen chores. This seems like a no-brainer, but Amazon made no effort here.

Problems and Crashing

After having unboxed Amazon Echo, it had already crashed within 10 minutes of using it. Not the app, but the actual Echo. The app lost connectivity to the Echo until it had rebooted. Though, I have also had the app crash. So, this first incarnation of the Echo is a little beta still. I’m guessing that’s why they cut the 50% off deal with those who were invited to pick them up for testing. Though, when the Echo works, it does work well.

Improvements

The Amazon Echo could benefit from a number of improvements including:

Battery backup
Full audio web browsing
Games (i.e., chess, checkers, etc)
Better interactive integration between Echo and its companion app
Satellite interfaces (to use Echo in every room)
Stereo audio / Multichannel audio (using multiple cylinders)
Audio playback to stereo BT devices (i.e., headphones and speakers)
Speakerphone
Remote control of Amazon devices
Check status of Amazon orders
Recipes and general kitchen helper
Alexa reading Kindle books
More audio channels such as:
- Sirius Radio
- Police Scanners
- Custom podcast URLs
- SoundCloud and similar sites
- YouTube Audio
- Last.fm
- Spotify
- MySpace Music
- Amie Street
- A much bigger selection of Internet radio stations
- Archives of pre-recorded news broadcasts

Limitations

This first incarnation of Amazon Echo is quite limited. Echo has about 1/10th of the feature set you would expect to offer a complete experience. For example, it should become an audio web browser. Audio is the next evolution in browsers. Sitting at a computer watching a screen is time consuming. But, using an audio web browser, you could browse the web and work on other things. It’s easy to listen and still focus on other tasks. We do it all the time.

In fact, Alexa needs to be imported into every Amazon device including the Fire phone, Fire tablet line and every other interactive device it makes. While Alexa needs to be on every Amazon device, the use case of Echo and all of the audio channels should still be limited to the Echo.

So, while Alexa exists and works as well as Siri, Alexa is simply the input and output device on the Echo out of necessity. The functionality of the Echo needs to firmly focus on all aspects of audio communication including podcasts, dictation, news programs, web browsing, audio books, cooking, music and more. Alexa shouldn’t be overlooked as the home helper, but not strictly on the Echo. I know that Amazon is planning on expanding the Echo to supporting home automation through such phrases as ‘Alexa, turn on the light’. But, that requires a home automation system that interfaces with the Echo. There are probably other uses just waiting to be explored.

In fact, if Amazon were to put Alexa on every device, you could have a unified Alexa system throughout your home. So, each device could learn the types of things you do regularly and share that among all of the Alexa systems. So, if you frequently ask for a specific type of music, Alexa could offer recommendations for new playlists.

Overall, it’s currently an okay device. Out of 10 stars, I’d give it 4 stars. Amazon compromised just a little too much in all aspects of this device to make it truly outstanding. In fact, the Echo should have had white LED lights on the unit so that it could illuminate the room. It also needs a battery backup so you can still use some of Alexa’s basic functions, like the alarm clock, if the power goes out. The next incarnation of the Echo will likely make up for its current shortcomings.

Tagged with: amazon, audio, control, home, music

Cinavia: Annoying? Yes. What is it?

Posted in botch, business, california by commorancy on February 23, 2014

If you’re into playing back movies on your PS3, you might have run into an annoying problem where your movie plays for about 20 minutes, then the audio suddenly drops out entirely with a warning message on the screen. This is Cinavia. Let’s explore.

What is Cinavia and how does it work?

Cinavia is an audio watermarking technology created by the company Verance where an audio subcode is embedded within digital audio soundtracks at humanly imperceptible levels, but at a level where a DSP or other included hardware chip can read and decode its presence. Don’t be fooled by the ad with smiling children on the Verance site. This has nothing to do with helping make audio better for the consumer. No, it is solely created for industry media protection.

This Cinavia watermark audio subcode seems to be embedded at a phase and frequency that can be easily isolated and extracted from an audio soundtrack, then processed and determined if it’s valid for the movie title being played back. Likely, it’s also an analog audio-based digital carrier subcode (like a modem tone) that contains data about the title being played.

How is Cinavia used in the film industry?

There are two types of known uses of Cinavia watermarking. The first use is to protect theatrical releases from being pirated. Because the watermark sits within the audible frequency range, yet at a level imperceptible to humans, it will be picked up by microphones (strictly because of the Hz range where the subcode is embedded). Keep in mind that just because the subcode cannot be heard by human ears, it doesn’t mean it can’t be heard and decoded by a specialty hardware chip. So, if a theatrical release is CAMed (i.e. recorded from the screen), the Cinavia watermarking will also be recorded in the audio. After all, what is a movie without audio?

The second use is to protect Blu-ray copies of films from being pirated. For the same reason as theatrical releases, Blu-ray films are also embedded with a subcode. But, that subcode is different from theatrical films. For this reason, films destined for theatrical release will never play in a consumer Blu-ray player (including players such as the PS3, PS4 or Xbox One). Commercial Blu-ray disks play because the audio track uses AACS with a key likely embedded within the subcode watermark. If the AACS key matches the value from the watermark, the check passes and the audio continues to play.

I have also read there is a third use emerging… to protect DVD releases. But, I have yet to confirm any DVDs currently using this technology. If you have run into any such releases, please leave a comment.

How would I be affected by this?

All consumer Blu-ray players manufactured after 2012-2013 are required to support Cinavia. If the Cinavia subcode is present and the AACS key does not match, the player will blank the audio track. This means hardware Blu-ray players from pretty much any manufacturer will be affected by Cinavia protection if the title uses it. CAM copies of theatrical releases will never play because the audio subcode is entirely different for theatrical films; the Blu-ray player will recognize that theatrical subcode and stop audio playback.
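The player behavior described above can be summarized as a small decision function. This is only a sketch of the logic as this article describes it, not Verance’s or any manufacturer’s actual firmware; the `Watermark` enum, the `playback_action` function and the returned strings are hypothetical names invented for illustration.

```python
from enum import Enum

class Watermark(Enum):
    """Kinds of subcode a player might find (hypothetical labels)."""
    NONE = "none"              # no Cinavia subcode detected
    THEATRICAL = "theatrical"  # subcode used on theatrical releases
    HOME_VIDEO = "home_video"  # subcode used on commercial Blu-ray releases

def playback_action(watermark, aacs_key_matches):
    """Return what the player does with the audio, per the article's description."""
    if watermark is Watermark.NONE:
        return "play"         # unprotected title: nothing to check
    if watermark is Watermark.THEATRICAL:
        return "stop_audio"   # theatrical subcode never belongs on consumer media
    # Home video subcode: audio continues only if the AACS check passes.
    return "play" if aacs_key_matches else "mute_audio"

print(playback_action(Watermark.HOME_VIDEO, aacs_key_matches=True))
print(playback_action(Watermark.HOME_VIDEO, aacs_key_matches=False))
print(playback_action(Watermark.THEATRICAL, aacs_key_matches=False))
```

The point of the sketch is that the watermark and the AACS key work as a pair: the watermark survives copying, while the key does not, so a copy ends up in the mismatch branch.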

Not all movie titles use Cinavia to protect their content, and not all players enforce Cinavia protection on all media types. For example, some Blu-ray players can play media from a variety of sources besides BD discs (e.g., USB drives, network servers, etc.). These alternative sources are not always under Cinavia protection even if the specific movie has an embedded subcode.

Since Sony is the biggest proponent and user of this technology, all Sony players, including the PS3 and PS4 along with their standalone Blu-ray players, will not play back Cinavia-protected material if it doesn’t continue to pass the subcode tests. For example, if you rip a Blu-ray disc protected by Cinavia and then burn it to a BD-R disc, the movie will stop playing audio at around the 20-minute mark and display a warning. If you attempt to stop and start the movie, it will play audio again for a few seconds and then stop playing with a warning.

How can you remove Cinavia protection?

In short, it’s not as easy as that may sound. Once the Cinavia protection is detected on the media, the hardware activates and continues to look for the information it needs to make sure the content is ‘legitimate’.

With that said, there are ways of getting around this on certain devices. As I explained, some players don’t check for Cinavia on certain types of media (e.g., USB or network streaming). Sony, however, does check all media types. The PS3, though, doesn’t seem to check for Cinavia if the playback is through the optical output port (i.e., when playing back through an optical receiver). That would make sense, as it would be left up to the receiver to blank the audio based on Cinavia. Since most receivers probably don’t support Cinavia, there should be no issue with playback.

Other technical methods include garbling the audio somewhat or using variable speed on the audio. Neither of these two methods is really acceptable to the ears when watching a movie. We all want our movies to both look and sound correct.

How can I avoid this problem?

You can easily avoid this issue by using a player that doesn’t support Cinavia protection, such as Windows Media Player or VLC. Most PC media players do not support Cinavia. Though, if you get a PC from Sony, expect the media player on any Sony product to support Cinavia (yes, even Windows Media Player might, as Sony may have loaded a system-wide Cinavia plugin). If you buy a PC from any manufacturer other than Sony, you likely won’t be affected by Cinavia.

This problem almost solely exists on Blu-ray standalone players. So, if you avoid playing movies on such consumer hardware players, you can usually avoid the Cinavia issue entirely. Though, there are some commercial PC media players that do support Cinavia.

A possible real solution?

Here is another method that I have not seen explored, so I have decided to propose it. With a film protected by Cinavia, the Cinavia subcode should exist within both silent and noisy portions, likely at the same volume. First, extract a length of silence (that contains the Cinavia subcode). Now, garble, stretch, warp and generally distort this subcode so that it cannot be recognized by a Cinavia decoder. Then duplicate the garbled ‘silence’ subcode to fill the length of the entire film. Extract the film’s audio soundtrack and mix in the new garbled full-length subcode throughout the entire film. Note that remixing a 7.1 or 5.1 track is a bit tricky, but it can be done. I would suggest inserting it on the subwoofer track or the center track, though it may be present on all of the tracks by design. After the audio track is remixed and remuxed into a resulting MP4 (or other format), the new garbled subcode should hopefully interfere just enough with the existing already-embedded subcode to prevent the Cinavia protection from getting a lock on the film’s original subcode.

The garbled subcode could cause one of two things to happen: 1) the Cinavia detection is rendered useless and the Cinavia hardware ignores the subcode entirely, or 2) the Cinavia detection recognizes the tampering and shuts down the audio track immediately. Since erring on the side of failure is a bad move in an industry already fraught with bad press around failed media protection schemes, I would more likely suspect scenario number 1. But, it’s probably worth a test. No, I have not yet had time to test my theory.

While this doesn’t exactly remove Cinavia, it should hopefully render it useless. But, it won’t recover the lost audio portions being used by the Cinavia subcode.

How would I go about doing this?

I wouldn’t attempt doing the above suggestion manually on films as it takes a fair amount of time demuxing audio, creating the garbled audio subcode, remixing the new track and remuxing it into the video. But an application capable of ripping could easily handle this task during the rip and conversion process if provided with a length of garbled subcode.

[Updated: 2018-01-06]

According to their literature, DVDFab has a way to rip and disable Cinavia protections. They have released the DVDFab DVD and Blu-ray Cinavia Removal tool. If you’re still having difficulties with Cinavia while watching your movies, it might be worth checking out this tool. Note, I have not personally used this tool, so I can’t vouch for its effectiveness. I am also not being sponsored by DVDFab in this article. I’m only pointing out this tool because I recently found it and because it seems to have a high rating. On the other hand, I do see some complaints that it doesn’t always recognize and remove Cinavia on some movies. So, caveat emptor. Even though it’s not an inexpensive product, it is on sale at the time of this update, for whatever that’s worth.

It seems that someone finally may have implemented my idea above. Good on them if they did… it only took around 4 years.

Tagged with: audio, cinavia, protection
