Random Thoughts – Randocity!

Rant Time: Adobe VoCo’s ethical dilemma

Posted in best practices, botch, business, california, ethics by commorancy on February 28, 2018

I have to wonder about Adobe’s business ethics at times. First, there’s Photoshop. While I can admit that photo editing has a legitimate purpose, such as correcting red eye or removing telephone lines or removing reflections of the camera man from a photo, there is the much seedier and ethically murky purpose for Photoshop. Now comes Adobe VoCo. It is a product idea that does for spoken audio what Photoshop does for images. Let’s explore this YouTube clip from 2016:

Skip to 3:18 for the meat of this video.

VoCo’s Use Cases and Ethics

Though, yes, I will concede that the demonstration above was funny and we all laughed, the demonstration has a deep-seated, ethically murky undertone once the laughing stops. In fact, that’s what prompted this blog article.

Unlike Photoshop, which has actual real world use cases (yes, other than making models thinner and glowier for the cover of Vogue), VoCo is one of those unnecessary tools that, while cool in theory, makes Adobe seem as though it’s now in the business of causing world disruption instead of actually solving creative problems. After the ethical problems created by Photoshop, Adobe has to know the ethical quandary it introduces by bringing the VoCo audio editing tool to market. Yet Adobe decided to go ahead with demoing this tool anyway. So much for business ethics. Instead, Adobe should have patented and shelved this product idea and never shown it off.

There’s no effective real world use case for this product other than for making someone say things that they actually didn’t say. The only use case where this technology might even be somewhat useful, depending on output quality, is in the voice over industry where an actor might be unavailable at a time when a line needs to be changed to fit continuity better. The voice over industry is the only industry where VoCo could have even the smallest glimmer of hope of a use case. That is a tiny niche market for a tool introduced with such public spectacle.

The only other use case would be to sample all of the audio from a particular dead actor or actress’s productions and then recreate lines of new spoken dialog based on that. Again, this is one of those entertainment areas that fits firmly into the uncanny valley, particularly if the spoken lines are attached to a CG actor. Again, this is not a substantial use case in my opinion and is most definitely creepy. It’s definitely not a big enough use case to warrant this public release spectacle. Do we really want to see Marilyn Monroe or Elvis brought back to life on the big screen using CG and VoCo dialog?

There is no other legitimate use case for this product. It’s like Adobe intentionally wants to flaunt its lack of ….

Business Ethics and Self-Editing

Businesses today have no ability to self-edit or recognize ethics. That is, they fail to stop ethically bad product ideas from making it to the market. Just thinking about this product and how it could possibly be used, it doesn’t have legitimate use cases (other than the voice over use case I mentioned above). However, there are perhaps thousands of illegitimate uses for this tool. Let’s list a few of them, shall we:

  • Falsifying a deposition to make the person being deposed say something they didn’t say
  • Falsifying a statement of non-confession to make a person confess to a crime when they didn’t actually confess
  • Falsifying a phone conversation
  • Changing any spoken words from non-incriminating to incriminating evidence

In legal circles, the use for this tool is ripe for abuse and has use cases as wide as the Grand Canyon and as deep as the Mariana Trench. In other words, while VoCo has no substantial legitimate use cases, it has thousands of illegitimate use cases. There is no way Adobe couldn’t see this. There is no way for Adobe to feign ignorance about this tool or the ethical problems it imposes if released.

Legal Evidence

Some have theorized that this tool would follow the same path as Photoshop. Basically, because evidence can now be manufactured in products like VoCo, audio evidence would no longer be easily admissible. While that idea has some soundness to it, the legal system is not always technically savvy and can sometimes move at a snail’s pace. Eventually, the courts and lawyers will be on board with this ‘manufactured evidence’ sound clip idea, but not before several someones are incriminated over manufactured evidence that isn’t caught in time.

Some have theorized that Adobe should watermark the sound clip. The difficulty with audio watermarking is that it ruins the audio. No one would buy a professional audio tool that intentionally makes the audio sound bad or introduces something that is audibly noticeable, strictly because Adobe wants to insert a watermark to legally cover their collective butts. No. No one would buy a tool that causes damage to the audio output. This means that only a silent kind of watermark could be introduced. Such a watermark would consist primarily of a tag within the saved audio clip file. Any tags introduced in a save file can easily be stripped away by converting the audio clip to a new format or by playing the audio clip back and recording it on analog equipment. In fact, a whole industry and set of tools would likely appear to strip out any watermarks imposed by Adobe onto the saved files.

Unless there is a substantial way to identify that the clip has been edited, and I don’t know how Adobe could even solve this problem fully, VoCo is a tool that would end up more abused than legitimately used.

Flawed Product Ideas

While this is somewhat of a cool technological advancement, it doesn’t need to exist. It doesn’t need to exist because it has basically one limited use case. I’d argue that as a production runner, you can just wait until the voice actor becomes available and ask them to re-record the lines you need. That is, instead of using a tool like this. A tool like VoCo might save you some time, but by demanding such a tool for your use, it means the rest of the world must also endure the consequences of a world full of falsified evidence. Is that the world you want to live in? Evidence that could even be used against you, the audio editor. No, thanks.

However, it’s clear that prototype code has been written based on the video above. This means that Adobe could release such a product into the wild in the future. Thankfully, as of this article in 2018, this product does not yet exist. Unfortunately, Adobe has already opened Pandora’s box. A working prototype means that any coder with leanings towards audio engineering could produce a similar tool and release it into the wild without the help of Adobe. Thanks Adobe.

It is as yet unclear when or if this product could ever be released. Note that this video segment apparently showcases experimental product ideas (products that may never see the light of day) and not actual products. After all, such a legally murky product would have to clear Adobe’s legal team before release. Considering the many negative use cases for such an audio editing product and the legal liability that Adobe might endure as a result, I’d hope that Adobe’s legal team has shelved this product idea permanently.

Agree or disagree? Please leave a comment below. Also, don’t miss any new Randocity articles by subscribing to this blog via clicking the blue follow button at the top right.

Audio Tip: How to decode 5.1 DTS / AC3 to 6 WAV files

Posted in audio engineering by commorancy on April 10, 2016

[Updated 02/21/2017] Please see the updated Alternative Solutions below. These don’t require Cubase.

For those of us who are hobbyist home audio engineers, here’s a tip that might come in handy when trying to extract 5.1 (6 channel) audio from DTS/AC3 to individual WAV (or more specifically, WAV 64) files. This technique may or may not work for 7.1 (8 channel) audio. Let’s explore.

What You Will Need

  • TSMuxer GUI (to demux the audio stream from the video container)
  • FFMpeg (to decode the DTS/AC3 stream to a 6 channel WAV / W64 file)
  • Cubase (to split the 6 channel file into individual channels)

First Step – TSMuxer GUI

Extract the DTS/AC3 stream from the video container using TSMuxer Gui. To do this…

  1. Load the *.m2ts, *.vob, *.mkv, etc file into TSMuxer Gui using File=>Open.
  2. Once the file is loaded, uncheck all other streams except the audio stream (DTS or AC3)
  3. Choose ‘Demux’ as the Output type
  4. Choose ‘Browse’ if you want to place the output file somewhere other than where the app has chosen
  5. Click ‘Start Demuxing’
  6. When completed, you will have a *.dts or *.ac3 file as output.

This step demuxes the audio from the full movie container.

Second Step – FFMpeg

Extract the DTS 5.1 audio to a single 6 channel WAV file via FFMpeg using the following:

ffmpeg -i 00000.track_4352.dts -acodec pcm_s24le output-file.w64

This will create a 6 channel w64 (wave 64) file. You’ll want to use *.w64 because of the 4GB max size of standard *.wav files. If you know your output file will be smaller than 4GB, you can use *.wav instead. Also, if you want to master in 32 bit or higher, you can choose the pcm output version that corresponds to the bit size you want to use. I’m using 24 bits for my remastering efforts. The more bits you use for mastering, the more likely you will need to use w64.
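
To get a rough idea of whether a decode will cross the 4GB line, you can estimate the raw PCM size before running ffmpeg. The sketch below is only an approximation: the function names pcm_bytes and pick_ext are made up for this example, header overhead is ignored, and the 5833 second duration is just a sample value (about 1 hour 37 minutes).

```shell
# Estimate the raw PCM payload size in bytes (container headers ignored).
# Usage: pcm_bytes CHANNELS SAMPLE_RATE_HZ BITS_PER_SAMPLE DURATION_SECONDS
# (pcm_bytes and pick_ext are hypothetical helper names for this sketch)
pcm_bytes() {
    echo $(( $1 * $2 * ($3 / 8) * $4 ))
}

# Suggest an extension: standard WAV stores sizes in 32-bit fields,
# so anything at or above 4 GiB (4294967296 bytes) needs Wave64.
pick_ext() {
    if [ "$1" -ge 4294967296 ]; then
        echo w64
    else
        echo wav
    fi
}

# Sample values (assumptions): 6 channels, 48000 Hz, 24 bits, 5833 seconds
size=$(pcm_bytes 6 48000 24 5833)
echo "$size bytes -> use .$(pick_ext "$size")"
```

With these sample numbers, 24 bits comes out at roughly 5.04 GB (so w64), while the same audio at 16 bits would be about 3.36 GB and would still fit in a standard wav.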

Third Step – Cubase

  1. Use File=>Import to bring output-file.w64 into Cubase
  2. When the small panel appears asking how you would like to import, choose Split Channels. You can number them if you like.
  3. It may take a little while to split them all out.

Note, this is the part that I do not know if Audacity supports. It may be able to perform Split Channels like Cubase, but you would need to test Audacity to find out whether it can and how. Cubase can definitely split the channels, though.

Fourth Step – Exporting WAV / W64 files

  1. From here, you can continue to use Cubase or Audacity to produce a remastered audio file or …
  2. You can save each individual channel as a separate WAV file for some other use. Note, you should use *.w64 (wave 64) if the files are expected to be larger than 4GB in size.

It’s up to you what you want to do with the resulting files.

Alternative Solutions

Using ffmpeg only.

Note, you will need to install the latest version of ffmpeg to ensure compatibility with this solution.

I have since found you can accomplish the extraction to individual WAVes using ffmpeg only. You won’t need Cubase or tsmuxer for this alternative solution. You extract your WAVes by setting up channel mappings, then assigning those mappings to each output file. Though, this solution is just a tad bit more complicated in that you need to know what channels your input audio offers, which channels to extract and what the two letter abbreviation for the channel is within ffmpeg.

For extracting 5.1, use the following command (Linux line break style shown):

ffmpeg -i infile \
-filter_complex "channelsplit=channel_layout=5.1[FL][FR][FC][LFE][BL][BR]" \
-map "[FL]" front_left.wav \
-map "[FR]" front_right.wav \
-map "[FC]" front_center.wav \
-map "[LFE]" lfe.wav \
-map "[BL]" back_left.wav \
-map "[BR]" back_right.wav

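To extract 7.1, use the following command. Note that ffmpeg’s 7.1 layout is FL+FR+FC+LFE+BL+BR+SL+SR, so the last two channels are side left and side right. If your source is 7.1(wide), set channel_layout=7.1(wide) and use FLC and FRC for the last two channels instead:

ffmpeg -i infile \
-filter_complex "channelsplit=channel_layout=7.1[FL][FR][FC][LFE][BL][BR][SL][SR]" \
-map "[FL]" front_left.wav \
-map "[FR]" front_right.wav \
-map "[FC]" front_center.wav \
-map "[LFE]" lfe.wav \
-map "[BL]" back_left.wav \
-map "[BR]" back_right.wav \
-map "[SL]" side_left.wav \
-map "[SR]" side_right.wav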

Where infile is your source file. The input can be a video file (i.e., vob, m2ts, mkv, etc) or a multitrack audio file (i.e., AC3, DTS, etc). If you are running Windows using a CMD command shell, you will need to type the command in without the line breaks shown above. So, copy and paste won’t directly work on Windows. You’ll need to use an editor to make the command Windows friendly.
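
Since the output labels have to follow the channel order of the layout, and Windows wants the whole thing on one line anyway, it can help to generate the command from a layout’s decomposition string. The following is a minimal sketch, not a polished tool: build_split_cmd is a hypothetical helper name, the output files are simply named after the channel abbreviation, and the function only prints the command rather than running it.

```shell
# Print a single-line ffmpeg channelsplit command for a given layout.
# Usage: build_split_cmd INFILE LAYOUT DECOMPOSITION
# DECOMPOSITION is the string shown by "ffmpeg -layouts", e.g. FL+FR+FC+LFE+BL+BR
# (build_split_cmd is a hypothetical helper name for this sketch)
build_split_cmd() {
    infile=$1
    layout=$2
    decomposition=$3
    pads=""
    maps=""
    old_ifs=$IFS
    IFS=+                       # split the decomposition on the plus signs
    for ch in $decomposition; do
        pads="$pads[$ch]"                      # output label for the filter
        maps="$maps -map \"[$ch]\" $ch.wav"    # one output file per channel
    done
    IFS=$old_ifs
    printf 'ffmpeg -i "%s" -filter_complex "channelsplit=channel_layout=%s%s"%s\n' \
        "$infile" "$layout" "$pads" "$maps"
}

# Example with a made-up input file name
build_split_cmd movie.mkv 5.1 FL+FR+FC+LFE+BL+BR
```

The printed line can be pasted into a Linux shell or a Windows CMD prompt as-is, since it contains no line continuations.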

Note that there are a lot of different possible mappings for various types of audio input files. Since there are now many formats of soundtrack audio available such as Dolby Atmos, DTS, AC3 and various others, you should first determine the format of the input audio with ffprobe (see below) to better understand how to map and extract the audio. The channels available for possible extraction in ffmpeg include the following:

Command:

ffmpeg -layouts -hide_banner

Output:

Individual channels:
NAME        DESCRIPTION
FL          front left
FR          front right
FC          front center
LFE         low frequency
BL          back left
BR          back right
FLC         front left-of-center
FRC         front right-of-center
BC          back center
SL          side left
SR          side right
TC          top center
TFL         top front left
TFC         top front center
TFR         top front right
TBL         top back left
TBC         top back center
TBR         top back right
DL          downmix left
DR          downmix right
WL          wide left
WR          wide right
SDL         surround direct left
SDR         surround direct right
LFE2        low frequency 2

Standard channel layouts:
NAME        DECOMPOSITION
mono        FC
stereo      FL+FR
2.1         FL+FR+LFE
3.0         FL+FR+FC
3.0(back)   FL+FR+BC
4.0         FL+FR+FC+BC
quad        FL+FR+BL+BR
quad(side)  FL+FR+SL+SR
3.1         FL+FR+FC+LFE
5.0         FL+FR+FC+BL+BR
5.0(side)   FL+FR+FC+SL+SR
4.1         FL+FR+FC+LFE+BC
5.1         FL+FR+FC+LFE+BL+BR
5.1(side)   FL+FR+FC+LFE+SL+SR
6.0         FL+FR+FC+BC+SL+SR
6.0(front)  FL+FR+FLC+FRC+SL+SR
hexagonal   FL+FR+FC+BL+BR+BC
6.1         FL+FR+FC+LFE+BC+SL+SR
6.1         FL+FR+FC+LFE+BL+BR+BC
6.1(front)  FL+FR+LFE+FLC+FRC+SL+SR
7.0         FL+FR+FC+BL+BR+SL+SR
7.0(front)  FL+FR+FC+FLC+FRC+SL+SR
7.1         FL+FR+FC+LFE+BL+BR+SL+SR
7.1(wide)   FL+FR+FC+LFE+BL+BR+FLC+FRC
7.1(wide-side)FL+FR+FC+LFE+FLC+FRC+SL+SR
octagonal   FL+FR+FC+BL+BR+BC+SL+SR
downmix     DL+DR

Using FFProbe

To determine the audio channels available to extract from your infile, use ffprobe as follows:

Linux / MacOS X

$ /path/to/ffprobe -i infile -hide_banner 2>&1 | egrep "^I|^ "

Windows

C:\path\to\ffprobe -i infile -hide_banner

The output will look something like

Input #0, mpegts, from 'my_movie.m2ts':
  Duration: 01:37:13.61, start: 11.650667, bitrate: 43437 kb/s
  Program 1
    Stream #0:0[0x1011]: Video: h264 (High) (HDMV / 0x564D4448), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 90k tbn, 47.95 tbc
    Stream #0:1[0x1100]: Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side), fltp, 1536 kb/s
    Stream #0:2[0x1200]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090)
    Stream #0:3[0x1201]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090)

The audio stream (Stream #0:1 above) is the stream you will need to examine. In the output example above, this audio contains 5.1(side) audio using the DTS format. For 5.1(side) extraction, the command would look like the following:

# From the above -layouts output
# input = 24 bits and preserve output at 24 bits
# 5.1(side)   FL+FR+FC+LFE+SL+SR

ffmpeg -i infile \
-filter_complex "channelsplit=channel_layout=5.1(side)[FL][FR][FC][LFE][SL][SR]" \
-acodec pcm_s24le \
-map "[FL]" front_left.wav \
-acodec pcm_s24le \
-map "[FR]" front_right.wav \
-acodec pcm_s24le \
-map "[FC]" front_center.wav \
-acodec pcm_s24le \
-map "[LFE]" lfe.wav \
-acodec pcm_s24le \
-map "[SL]" side_left.wav \
-acodec pcm_s24le \
-map "[SR]" side_right.wav
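
The layout name used in a command like this comes straight from the Audio line of the ffprobe output. If you’d rather not pick it out by eye, a small filter can do it. This is a sketch that assumes the ffprobe text format shown above, where the layout is the field right after the sample rate. The helper name audio_layouts is made up for this example, and the parsing may break on streams whose description is formatted differently.

```shell
# Read ffprobe's human-readable output on stdin and print the channel
# layout of each audio stream, one per line.
# Assumption: the layout is the comma-separated field that follows "<rate> Hz".
# (audio_layouts is a hypothetical helper name for this sketch)
audio_layouts() {
    awk -F', ' '/Audio:/ {
        for (i = 1; i < NF; i++)
            if ($i ~ / Hz$/) { print $(i + 1); break }
    }'
}

# Intended use (not run here):
#   ffprobe -i infile -hide_banner 2>&1 | audio_layouts
```

Whatever it prints (for example 5.1(side)) is the name to look up in the -layouts table to get the channel decomposition.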

Audio Output Format

If you don’t specify an audio output format, ffmpeg defaults to using pcm_s16le which creates a 16 bit WAV file for each channel. If you want to preserve the bit depth of the original audio, then you’ll want to specify the output format to be the same as the input for each WAV. This means that you can, if you want, specify 16 bit for some channels and 24 bit for others. For 24 bits, specify -acodec pcm_s24le just before each -map flag (see example above). For 32 bits, specify -acodec pcm_s32le.

Also note that while you can select the individual audio track within a movie container, you often don’t need to. In a plain conversion such as the Second Step command above, ffmpeg automatically selects the audio stream with the most channels. With -filter_complex, an unlabeled filter input is connected to the first unused audio stream in the container. So, if there are multiple 5.1 or 7.1 audio streams or there are multiple separate programs, you will need to tell ffmpeg which stream to decode into WAV files. Note that if you need the syntax for selecting a specific audio stream using ffmpeg, please leave a comment below and I’ll write up an example for you.
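
For reference, one way to pick a specific audio stream is to put a stream specifier in front of channelsplit inside the filter string. In ffmpeg’s stream specifier syntax, [0:a:1] means the second audio stream (counting from zero) of the first input. The helper below only builds the filter string and is a sketch: split_filter is a hypothetical name, and you still need to confirm with ffprobe which audio stream you actually want.

```shell
# Build a channelsplit filter string that reads from a chosen audio stream.
# Usage: split_filter AUDIO_INDEX LAYOUT PADS
#   AUDIO_INDEX  zero-based index among the audio streams of the first input
#   PADS         output labels in layout order, e.g. '[FL][FR][FC][LFE][BL][BR]'
# (split_filter is a hypothetical helper name for this sketch)
split_filter() {
    printf '[0:a:%s]channelsplit=channel_layout=%s%s\n' "$1" "$2" "$3"
}

# Intended use (not run here), taking the second audio stream:
#   ffmpeg -i infile \
#     -filter_complex "$(split_filter 1 5.1 '[FL][FR][FC][LFE][BL][BR]')" \
#     -map "[FL]" front_left.wav -map "[FR]" front_right.wav \
#     -map "[FC]" front_center.wav -map "[LFE]" lfe.wav \
#     -map "[BL]" back_left.wav -map "[BR]" back_right.wav
```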

Below is a list of PCM output formats available in ffmpeg version 3.2.4:

# Find these with the command

$ ffmpeg -formats | grep PCM

 DE alaw            PCM A-law
 DE f32be           PCM 32-bit floating-point big-endian
 DE f32le           PCM 32-bit floating-point little-endian
 DE f64be           PCM 64-bit floating-point big-endian
 DE f64le           PCM 64-bit floating-point little-endian
 DE mulaw           PCM mu-law
 DE s16be           PCM signed 16-bit big-endian
 DE s16le           PCM signed 16-bit little-endian
 DE s24be           PCM signed 24-bit big-endian
 DE s24le           PCM signed 24-bit little-endian
 DE s32be           PCM signed 32-bit big-endian
 DE s32le           PCM signed 32-bit little-endian
 DE s8              PCM signed 8-bit
 DE u16be           PCM unsigned 16-bit big-endian
 DE u16le           PCM unsigned 16-bit little-endian
 DE u24be           PCM unsigned 24-bit big-endian
 DE u24le           PCM unsigned 24-bit little-endian
 DE u32be           PCM unsigned 32-bit big-endian
 DE u32le           PCM unsigned 32-bit little-endian
 DE u8              PCM unsigned 8-bit

You’ll need to prefix pcm_ to the name of the format to use it on the command line. For example, if you want to use u16be, then you would specify that as -acodec pcm_u16be.
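
As a small illustration of the naming rule, here is a sketch that turns a bit depth into the matching signed little-endian codec name, which is the variant used in the WAV examples above. The function name pcm_codec is made up, and it deliberately handles only the three depths discussed in this article.

```shell
# Map a bit depth to the signed little-endian PCM codec name for -acodec.
# Only 16, 24 and 32 are handled; anything else is rejected.
# (pcm_codec is a hypothetical helper name for this sketch)
pcm_codec() {
    case $1 in
        16|24|32) echo "pcm_s${1}le" ;;
        *) echo "unsupported bit depth: $1" >&2; return 1 ;;
    esac
}

# Example: print the flag to use for 24 bit output
echo "-acodec $(pcm_codec 24)"
```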

Windows Only (eac3to)

Tool required: eac3to

Here is the command to use to extract all of the waves with eac3to:

# N: means the stream number in the input file
# You will need to determine which stream number to use
C:\path\to\eac3to "C:\path\to\infile" N: C:\path\to\output.wavs

Example:
  C:\bin\eac3to "C:\Movies Folder\mymovie.m2ts" 1: D:\Audio\mymovie.wavs

You’ll want to choose the proper drive for each of the paths instead of C:\. Also note, the .wavs (with the ‘s’) extension is important so that all of the WAVes will be exported from this tool.

Thanks go to Eli for the eac3to solution. If you are running Windows in your audio workflow, then this tool seems to be a fast one-step alternative.


Amazon Echo: What is it?

Posted in Amazon, business, cloud computing by commorancy on June 20, 2015

What is Amazon Echo?

It’s an approximately 10″ flat black cylinder with reasonable quality speakers, an LED ring around the top, a voice recognition system and a remote. While that may seem a little simple, these are the fundamental pieces that matter.

If you’ve ever used a Roku, a smart TV, Amazon Fire TV or an Apple TV, then you pretty much know what Amazon Echo is (minus the speakers). Except, Amazon Echo is intended to be used with audio programs (i.e., news radio, podcasts, music, prime music, weather radio, audiobooks, speech synthesis for reading articles, etc). Anything you can imagine with audio, the Amazon Echo would be the perfect companion. What Apple TV is to movies, Amazon Echo is to audio programs.

In addition to the Alexa audio assistant (like Siri), with a web, tablet or phone app, you can completely control your Echo with the Echo companion app. There is so much that is required by the app, you really can’t get along without it. In fact, you need the app to hook up the Echo to the WiFi which also asks a series of questions about how it will be used. So, if you don’t have a phone, tablet or computer browser, good luck setting up your Echo.

And no, you don’t need to own an Amazon tablet. You can use an iPhone, iPad or any Android tablet or phone. In fact, you can even use your computer’s browser. Because the Amazon Echo is hooked to the Amazon eco-system, you will also need an Amazon login and password. But, you likely already have this since you purchased the Echo with it. But, if you’re planning on giving it as a gift, the person you are giving it to will also be required to have all of the above. So, Amazon Echo is probably not the best gift idea for those who are not computer savvy or those who choose not to be connected. Remember that this is first and foremost a cloud player device. The faster the speed of the internet connection, the better Echo will work.

Is the Amazon Echo useful?

That’s a good question. If you’re someone who listens to radio programs or other audio programs like podcasts, then perhaps. Though, keep in mind there are some severe limitations in what you can do with the Amazon Echo. For example, the partners Amazon has chosen for its ‘audio channels’ are limited to Pandora, iHeartRadio, TuneIn and Audible. So, like Apple TV has limited video channels, Amazon Echo also has severely limited audio channels. Because of the audio partner limits, you really get a very small selection of content. For example, if Amazon had partnered with Sirius radio, there would be a whole lot more programming choices. Or, for that matter, partnering with Muzak, Soundcloud, Rhapsody, YouTube audio, Last.fm or other audio partners, I would say there would be many more choices in audio. Until then, Echo is nice but somewhat a novelty.

Alexa vs Siri

Alexa clearly has a better voice than Siri. But, other than the voice choice, the functionality is about the same. Like Siri, Alexa has easter eggs, knows what she knows, but what she knows is very very limited. So, don’t expect to be able to ask Alexa complex questions. To activate Alexa, you simply say the key word ‘Alexa’, suffixed quickly by what you want her to do. For example, ‘Alexa, set volume to 5’. Alexa is always listening for the keyword. Once you say the keyword, Alexa will begin listening for your command.

Wording matters with your sentences or Alexa gets quickly confused as to what you’re asking. For example, there’s a difference between asking ‘Alexa, play Frank Sinatra Songs for Swingin Lovers’ or ‘Alexa, play Songs for Swingin Lovers by Frank Sinatra’ or ‘Alexa, play the album Songs for Swingin Lovers by Frank Sinatra’. The shorter you tend to phrase your request, the more likely Alexa is to do the wrong thing or become confused and do nothing. Echo sometimes hears a phantom keyword and activates.

There are many times when you ask Alexa to do something that instead of responding with ‘Okay’ or some affirmative voice response, the LED ring at the top flashes in a ‘special way’. So, we’re left to try and decode the R2D2 LED responses from the Echo. Instead, I personally believe Alexa should affirmatively or negatively respond to every voice command. Unfortunately, she doesn’t.

Oh, and no, there is not yet a way to change the voice to a male or some other alternative voice. Though, you can change the wake-up word. So, it doesn’t have to be ‘Alexa’.

Alarms, ToDo Lists and Shopping

It is to be expected that you can shop for music through the Echo. So, if you ask Echo to play something that leads to samples, you can buy the song that’s playing. This will then be put into your library for future playback.

You can set up to 1 alarm and up to 1 timer. This means you can set an alarm for wakeup, but you can’t have two alarms. So, if you have a spouse or partner, you can’t have your own alarm and they have one set for a separate time. That won’t work, yet. If you want to time down two different things (important while cooking), you can’t do this either. It supports only one timer.

When the alarm or timer goes off, the audio noise it makes is limited to an internal sound only. Even though you have access to Prime music and radio, you cannot set the timer to use one of those audio sources. So… limited. There are also other limits.

There is a ToDo and Shopping list that you can ask Alexa to manage. You can say, ‘Alexa, add bananas to my shopping list’. When you open the Echo app, you will have your shopping list with you in the store. You can also remote control the Echo app as long as you have Internet on your phone. So, if you have a cat and you like to leave music playing, you can set up playlists, turn the volume up or down, change the music or shut it off.

Music

This is probably where the Echo shines its brightest. With its two speaker system, the audio is bright and vibrant. Not quite as nice as the Bose Soundlink Mini, but the sound is acceptably full and rich for the cylinder design. Unfortunately, it also has no stereo and it needs it. Amazon needs to offer a companion cylinder connected by bluetooth to offer full rich stereo sound. In fact, it could offer several BT connected cylinders to offer 5.1 or 7.1.

Beyond the sound quality issues, having access to Prime music is a necessity here. If you aren’t a Prime member, you really can’t take advantage of what Echo offers. If you do have Prime, then you get access to not only whatever you’ve purchased or uploaded to Amazon’s cloud player, you also get access to the full Prime music library. Still, Amazon’s Prime library is limited. It seems to have a lot of classic rock choices, but not all of it. So, while it has Fleetwood Mac and the Eagles, it doesn’t have Supertramp, for example.

Though, Autorip is your friend with Echo. If you buy a CD with Autorip, it automatically becomes available on the Echo as soon as you’ve paid. However, if you purchase a CD at Target and rip it, you’re limited to 250 uploaded songs unless you pay Amazon an additional $25 a year for 250,000 song uploads.

Audiobooks

If you are a big Audible.com consumer, then you have a distinct advantage with the Echo. You can listen to all of your audio books right on the Echo. If your library is vast, you’ll immediately have a lot of content available to you. In hindsight, I should have been buying audio books when offered with my Kindle purchases, but I never really had any way to play them. With Echo, that’s changed. I will definitely consider audio books in the future.

Kindle Support?

In short, no. There is no support for Alexa to read back Kindle book content. Alexa would be the perfect companion to the Kindles that do not offer audio voice playback. Considering this is an Amazon product and would be the perfect companion for the Kindle, the integration between Kindle and Echo is non-existent.

Audiophile Quality?

Definitely not. You’re playing streaming music here, in mono no less. So, while the Echo is great for podcasts, news and incidental background music, don’t give up your audiophile gear. Much of the music streamed from Amazon prime has the telltale mpeg haziness. Echo never skips or stutters while playing Prime or library music, so its streaming IO seems quite robust, but it just doesn’t sound high quality. This is definitely not to be considered an HD quality device as it clearly isn’t. So, don’t go into an Amazon Echo thinking you’ll be getting a high quality music experience. The music does sound decent, but it’s not anywhere near perfect.

Though, for news, podcasts and other spoken word programs, the Amazon Echo is perfect for this use.

Speech Synthesis and Browsing

The voice for Alexa sounds great most of the time. However, when reading back a synopsis of a Wikipedia article, she doesn’t always do a great job. While music is Echo’s strongest area, the article reading is easily one of Echo’s weakest. Instead of becoming an audio web browser (which is what Echo should become), Alexa only offers page snippets of articles and then encourages you to crack open a browser or tablet and finish reading there. If Echo is going to do this, why bother using Alexa at all? If I can get better results by reading it myself, then Alexa is pointless for this purpose.

Instead, Alexa should provide full 100% article reading. Read me news, wiki articles or, indeed, any other page on the web. If I ask Alexa to browse to Yahoo News, Alexa needs to be able to read article headlines and let me choose which article to read back. Literally, Echo should become an audio based web browser. Echo should set the standard for audio web browsing so much so that Yahoo and Google optimize their pages for audio browsers much like they are now doing for mobile devices.

Kitchen Use?

Echo would be the perfect companion in the kitchen. Tablets and other touch devices are nowhere near the perfect device in the kitchen. They get dirty and must be touched by dirty or wet hands. Echo, on the other hand, is the perfect hands-free kitchen companion. ‘Alexa, how do I make Beef Stroganoff’. Seems like a simple recipe request, but no. Alexa has no knowledge of cooking, recipes or anything else to do with kitchen chores. This seems like a no-brainer, but Amazon made no effort here.

Problems and Crashing

After having unboxed Amazon Echo, it had already crashed within 10 minutes of using it. Not the app, but the actual Echo. The app lost connectivity to the Echo until it had rebooted. Though, I have also had the app crash. So, this first incarnation of the Echo is a little beta still. I’m guessing that’s why they cut the 50% off deal with those who were invited to pick them up for testing. Though, when the Echo works, it does work well.

Improvements

The Amazon Echo could benefit from a number of improvements including:

  • Battery backup
  • Full audio web browsing
  • Games (i.e., chess, checkers, etc)
  • Better interactive integration between Echo and its companion app
  • Satellite interfaces (to use Echo in every room)
  • Stereo audio / Multichannel audio (using multiple cylinders)
  • Audio playback to stereo BT devices (i.e., headphones and speakers)
  • Speakerphone
  • Remote control of Amazon devices
  • Check status of Amazon orders
  • Recipes and general kitchen helper
  • Alexa reading Kindle books
  • More audio channels such as:
    • Sirius Radio
    • Police Scanners
    • Custom podcast URLs
    • SoundCloud and similar sites
    • YouTube Audio
    • Last.fm
    • Spotify
    • MySpace Music
    • Amie Street
    • A much bigger selection of Internet radio stations
    • Archives of pre-recorded news broadcasts

Limitations

This first incarnation of Amazon Echo is quite limited. Echo has about 1/10th of the feature set you would expect to offer a complete experience. For example, it should become an audio web browser. Audio is the next evolution in browsers. Sitting at a computer watching a screen is time consuming. But, using an audio web browser, you could browse the web and work on other things. It’s easy to listen and still focus on other tasks. We do it all the time.

In fact, Alexa needs to be imported into every Amazon device including the Fire phone, Fire tablet line and every other interactive device it makes. While Alexa needs to be on every Amazon device, the use case of Echo and all of the audio channels should still be limited to the Echo.

So, while Alexa exists and works as well as Siri, Alexa is simply the input and output device on the Echo out of necessity. The functionality of the Echo needs to firmly focus on all aspects of audio communication including podcasts, dictation, news programs, web browsing, audio books, cooking, music and more. Alexa shouldn’t be overlooked as the home helper, but not strictly on the Echo. I know that Amazon is planning on expanding the Echo to supporting home automation through such phrases as ‘Alexa, turn on the light’. But, that requires a home automation system that interfaces with the Echo. There are probably other uses just waiting to be explored.

In fact, if Amazon were to put Alexa on every device, you could have a unified Alexa system throughout your home. So, each device could learn the types of things you do regularly and share that among all of the Alexa systems. So, if you frequently ask for a specific type of music, Alexa could offer recommendations for new playlists.

Overall, it’s currently an okay device. Out of 10 stars, I’d give it 4 stars. Amazon compromised just a little too much in all aspects of this device to make it truly outstanding. In fact, the Echo should have had white LED lights on the unit so that it could illuminate the room. It also needs a battery backup so you can still use some of Alexa’s basic functions, like the alarm clock, if the power goes out. The next incarnation of the Echo will likely make up for its current shortcomings.


Cinavia: Annoying? Yes. What is it?

Posted in botch, business, california by commorancy on February 23, 2014

If you’re into playing back movies on your PS3, you might have run into an annoying problem where your movie plays for about 20 minutes, then the audio suddenly drops out entirely with a warning message on the screen. This is Cinavia. Let’s explore.

What is Cinavia and how does it work?

Cinavia is an audio watermarking technology created by the company Verance where an audio subcode is embedded within digital audio soundtracks at humanly imperceptible levels, but at a level where a DSP or other included hardware chip can read and decode its presence. Don’t be fooled by the ad with smiling children on the Verance site; this has nothing to do with helping make audio better for the consumer. No, it was created solely for industry media protection.

This Cinavia watermark audio subcode seems to be embedded at a phase and frequency that can be easily isolated and extracted from an audio soundtrack, then processed to determine whether it’s valid for the movie title being played back. Likely, it’s also an analog audio-based digital carrier subcode (like a modem tone) that contains data about the title being played.
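To make the general idea concrete, here is a toy sketch in Python. It is not Cinavia’s actual scheme, which isn’t described here. The sample rate, carrier frequency and level are all made-up values, and a single fixed tone stands in for what would really be a data-carrying signal. It only shows that a tone mixed in far below the level of the program audio can still be picked out by correlating against the known frequency.

```python
import math
import random

SAMPLE_RATE = 8000     # Hz; hypothetical, chosen to keep the example small
CARRIER_HZ = 1000      # hypothetical carrier frequency, not Cinavia's
CARRIER_LEVEL = 0.02   # carrier amplitude, far below the program audio


def make_program_audio(n, seed=0):
    """Stand-in for a soundtrack: uniform noise in [-0.5, 0.5]."""
    rng = random.Random(seed)
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]


def embed_tone(samples):
    """Mix the faint carrier tone into the audio."""
    step = 2 * math.pi * CARRIER_HZ / SAMPLE_RATE
    return [s + CARRIER_LEVEL * math.sin(step * i) for i, s in enumerate(samples)]


def tone_strength(samples):
    """Estimate the carrier's amplitude by correlating with sine and cosine."""
    step = 2 * math.pi * CARRIER_HZ / SAMPLE_RATE
    re = sum(s * math.cos(step * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(step * i) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


clean = make_program_audio(10 * SAMPLE_RATE)   # ten seconds of "soundtrack"
marked = embed_tone(clean)
print(f"clean:  {tone_strength(clean):.4f}")
print(f"marked: {tone_strength(marked):.4f}")
```

The marked track should report a strength close to the embedded level, while the clean track stays well below it. A longer stretch of audio gives the detector more to average over, which may be part of why a player needs to listen for a while before it reacts.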

How is Cinavia used in the film industry?

There are two types of known uses of Cinavia watermarking. The first use is to protect theatrical releases from being pirated. Because the audio watermark sits within the audible frequency range, yet at a level listeners can’t perceive, it will be picked up by microphones (strictly because of the Hz range where the subcode is embedded). Keep in mind that just because the subcode cannot be heard by human ears, it doesn’t mean it can’t be heard and decoded by a specialty hardware chip. So, if a theatrical release is CAMed (i.e. recorded from the screen), the Cinavia watermarking will also be recorded in the audio. After all, what is a movie without audio?

The second use is to protect Blu-ray copies of films from being pirated. For the same reason as theatrical releases, Blu-ray films are also embedded with a subcode. But, that subcode is different from the one in theatrical films. For this reason, audio carrying the theatrical subcode will never play in a consumer Blu-ray player (including players such as the PS3, PS4 or Xbox One). Commercial Blu-ray disks play because the audio track uses AACS with a key likely embedded within the subcode watermark. If the AACS key matches the value from the watermark, the check passes and the audio continues to play.

I have also read there is a third use emerging… to protect DVD releases. But, I have yet to confirm any DVDs currently using this technology. If you have run into any such releases, please leave a comment.

How would I be affected by this?

All consumer Blu-ray players manufactured after 2012-2013 are required to support Cinavia. If the Cinavia subcode is present, the player will blank the audio track if the AACS key is mismatched. This means hardware Blu-ray players from pretty much any manufacturer will be affected by Cinavia protection if the title supports it. CAM copies of theatrical releases will never play because the audio subcode is entirely different for theatrical films and the Blu-ray player will recognize that theatrical subcode and stop audio playback.
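The behavior described above boils down to a small decision rule. The Python sketch below models only what this article describes. The names (`Watermark`, `should_mute`, `watermark_key`, `disc_key`) are invented for illustration, and the key comparison reflects my guess about how the check works rather than anything published by Verance.

```python
from enum import Enum
from typing import Optional


class Watermark(Enum):
    """Kinds of subcode a player might detect (labels are hypothetical)."""
    NONE = "none"
    THEATRICAL = "theatrical"
    HOME_VIDEO = "home video"


def should_mute(watermark: Watermark,
                watermark_key: Optional[str] = None,
                disc_key: Optional[str] = None) -> bool:
    """Return True if the player would blank the audio, per the rules above."""
    if watermark is Watermark.NONE:
        return False   # no subcode present, nothing to enforce
    if watermark is Watermark.THEATRICAL:
        return True    # theatrical audio never plays on a consumer player
    # Home video subcode: play only if the disc's key matches the watermark.
    return disc_key is None or disc_key != watermark_key


print(should_mute(Watermark.NONE))                       # False
print(should_mute(Watermark.THEATRICAL))                 # True
print(should_mute(Watermark.HOME_VIDEO, "k1", "k1"))     # False
print(should_mute(Watermark.HOME_VIDEO, "k1", None))     # True
```

The last case corresponds to a ripped copy, where the subcode is still in the audio but there is no matching key to go with it.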

Not all movie titles use Cinavia to protect their content. Not all players support the Cinavia protections from all media types. For example, some Blu-ray players can play media from a variety of sources besides BD disks (e.g., USB drives, network servers, etc). These alternative sources are not always under Cinavia protection even if the specific movie has an embedded subcode.

Since Sony is the biggest proponent and user of this technology, all Sony players, including the PS3 and PS4 along with their standalone Blu-ray players, will not play back Cinavia protected material if it doesn’t continue to pass the subcode tests. For example, if you rip a Blu-ray disk protected by Cinavia and then burn it to a BD-R disk, the movie will stop playing audio at around the 20 minute mark and display a warning. If you attempt to stop and start the movie, it will play audio again for a few seconds and then stop playing with a warning.

How can you remove Cinavia protection?

In short, it’s not as easy as that may sound. Once the Cinavia protection is detected on the media, the hardware activates and continues to look for the information it needs to make sure the content is ‘legitimate’.

With that said, there are ways of getting around this on certain devices. As I explained, some players don’t check for Cinavia for certain types of media (i.e., USB or Network streaming). Sony, however, does check for all media types. The PS3, though, doesn’t seem to check for Cinavia if the playback is through the optical output port (i.e., when playing back through an optical receiver). That would make sense, though, as it would be left up to the receiver to blank the audio based on Cinavia. Since most receivers probably don’t support Cinavia, there should be no issue with playback.

Other technical methods include garbling the audio somewhat or using variable speed on the audio. Neither of these two methods is really acceptable to the ears when watching a movie. We all want our movies to both look and sound correct.

How can I avoid this problem?

You can easily avoid this issue by using a player that doesn’t support Cinavia protection, such as Windows Media Player or VLC. Most PC media players do not support Cinavia. Though, if you get a PC from Sony, expect the media player on any Sony product to support Cinavia (yes, even Windows Media Player might as Sony may have loaded a system-wide Cinavia plugin). If you buy a PC from any manufacturer other than Sony, you likely won’t be affected by Cinavia.

This problem almost solely exists on Blu-ray standalone players. So, if you avoid playing movies on such consumer hardware players, you can usually avoid the Cinavia issue entirely. Though, there are some commercial PC media players that do support Cinavia.

A possible real solution?

Here is another method that I have not seen explored, so I have decided to propose it. With a film protected by Cinavia, the Cinavia subcode should exist both within silence as well as noisy portions, likely at the same volume. First, extract a length of silence (that contains Cinavia subcode). Now, garble, stretch, warp and generally distort this subcode so that it cannot be recognized by a Cinavia decoder. Then duplicate the garbled ‘silence’ subcode to fill the length of the entire film. Extract the film’s audio soundtrack, then mix in the new garbled full length subcode throughout the entire film. Note that remixing a 7.1 or 5.1 track is a bit tricky, but it can be done. I would suggest inserting it on the subwoofer track or the center track, though it may be present on all of the tracks by design. After the audio track is remixed and remuxed into a resulting MP4 (or other format), the new garbled subcode should hopefully interfere just enough with the existing already-embedded subcode to prevent the Cinavia protection from getting a lock on the film’s original subcode.

The outcome of the garbled subcode could cause one of two things to happen. 1) The Cinavia detection is rendered useless and the Cinavia hardware ignores the subcode entirely or 2) The Cinavia detection realizes such tampering and shuts down the audio track immediately. While erring on the side of failure is really a bad move in an industry already fraught with bad press around failed past media protection schemes, I would more likely suspect scenario number 1. But, it’s probably worth a test. No, I have not yet had time to test my theory.

While this doesn’t exactly remove Cinavia, it should hopefully render it useless. But, it won’t recover the lost audio portions being used by the Cinavia subcode.

How would I go about doing this?

I wouldn’t attempt doing the above suggestion manually on films as it takes a fair amount of time demuxing audio, creating the garbled audio subcode, remixing the new track and remuxing it into the video. But an application capable of ripping could easily handle this task during the rip and conversion process if provided with a length of garbled subcode.

[Updated: 2018-01-06]

According to their literature, DVDFab apparently has a way to rip and disable Cinavia protections. They have released the DVDFab DVD and Blu-ray Cinavia Removal tool. If you’re still having difficulties with Cinavia while watching your movies, it might be worth checking out this tool. Note, I have not personally used this tool, so I can’t vouch for its effectiveness. I am also not being sponsored by DVDFab in this article. I’m only pointing out this tool because I recently found it and because it seems to have a high rating. On the other hand, I do see some complaints that it doesn’t always recognize and remove Cinavia on some movies. So, caveat emptor. Even though it’s not an inexpensive product, it is on sale at the time of this update for whatever that’s worth.

It seems that someone finally may have implemented my idea above. Good on them if they did… it only took around 4 years.
