Rival voice assistants were supposed to kill the App Store, but Apple’s Shortcuts app suddenly made Siri ultra powerful.
The Great Expectations of Alexa
Amazon’s Alexa voice assistant certainly involves some impressive technology. Alexa, Google Assistant, Microsoft’s Cortana, Samsung’s me-too Bixby, and a variety of similar voice services operating in China were all created in the model of Apple’s original Siri, with additional differentiating features intended to make them superior. In some ways, Alexa and other voice services made Siri look embarrassingly antiquated.
Alexa specifically aimed to muscle into Apple’s extremely valuable ecosystem of iOS and Mac users and peel away customers who would then be devoted to Amazon. Alexa originated at Amazon’s Lab126 R&D center as a foundational part of Amazon’s Fire Phone strategy, the Android-based smartphone project dreamt up and meticulously directed by Amazon’s founder Jeff Bezos. The project was reportedly started in response to Steve Jobs’ dramatic unveiling of iPhone 4 in 2010, but wasn’t ready to show until the summer of 2014.
Fire Phone was designed to replace Apple’s general purpose iPhone with more of a handheld shopping tool that could use its camera and microphone to recognize objects and then order them for you, from Amazon. A primary selling point was FireFly, a port of a camera shopping app Amazon had been offering for years on iOS without much attention.
After failing to sell Fire Phone as a camera-based shopping tool in 2014, Amazon’s Lab126 began offering Alexa Echo as an Internet microphone
After Fire Phone flopped, Lab126 scrambled to repurpose some of its expensive technology in a new form. By the end of 2014, Amazon was ready to ship its standalone Echo Internet appliance, which lacked either a camera or a display but could respond to voice requests using the company’s cloud-based analysis.
Echo and related products slowly gained traction before exploding into a supernova of media excitement in 2016 that proclaimed Alexa to be not just an interesting new technology, but the One True Way of the future and the glorious promised leader who would slay Apple and bring order to the Force. That’s hardly even an exaggeration.
In April 2016, Slate published a piece that called voice services the upcoming replacement of GUIs and multitouch and claimed that mobile apps were going away to make way for “virtual assistants, bots, and software agents.”
It even proclaimed “Alexa and software agents like it will be the prisms through which we interact with the online world.” Metaphorically, of course; Echo is actually a cylinder and fell short of literally splitting the Internet into rainbows, despite all the colorful prose that rained down on it from pundits.
Slate, for one, concocted some over-the-top prose to welcome voice agents like Alexa as our new overlords
Later that year Business Insider gleefully outlined the late 2016 forecast of Mizuho analyst Neil Doshi, who imagined that Alexa would be generating $11 billion in revenues for Amazon by 2020. Just $4 billion of that was expected to come from hardware sales (of 41.3 million annual units of ~$100 Amazon Alexa devices) while $7 billion would come from Amazon orders.
That math assumed that Amazon wouldn’t slash the price of its Echos to $30 or less, and that half of the user base would be ordering “$25 worth of products 5 times a year on average,” with the implication that these orders would be newly prompted by the presence of Alexa, not just the stuff people were already ordering.
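For a sense of scale, the forecast's figures can be run backward. The following is a minimal Python sketch using only the numbers quoted above. The analyst's actual model isn't described here, so treating order revenue as simply shoppers times order value times order frequency is an assumption, and the function names are purely illustrative.

```python
# Back-of-the-envelope check of the forecast figures quoted above.
# Assumption: order revenue = shoppers * order value * orders per year.

HARDWARE_UNITS = 41.3e6   # annual Alexa device units in the forecast
DEVICE_PRICE = 100.0      # approximate price per device, in dollars
ORDER_REVENUE = 7e9       # forecast revenue from Amazon orders, in dollars
ORDER_VALUE = 25.0        # dollars per order
ORDERS_PER_YEAR = 5       # orders per shopper per year
SHOPPER_SHARE = 0.5       # "half of the user base" places those orders


def hardware_revenue(units, price):
    """Revenue from device sales alone."""
    return units * price


def implied_shoppers(order_revenue, order_value, orders_per_year):
    """Number of shoppers needed to reach the order revenue."""
    return order_revenue / (order_value * orders_per_year)


def implied_user_base(shoppers, shopper_share):
    """Total users needed if only a share of them shop by voice."""
    return shoppers / shopper_share


hw = hardware_revenue(HARDWARE_UNITS, DEVICE_PRICE)
shoppers = implied_shoppers(ORDER_REVENUE, ORDER_VALUE, ORDERS_PER_YEAR)
user_base = implied_user_base(shoppers, SHOPPER_SHARE)

print(f"hardware revenue: ${hw / 1e9:.2f} billion")
print(f"implied voice shoppers: {shoppers / 1e6:.0f} million")
print(f"implied user base: {user_base / 1e6:.0f} million")
```

Under those assumptions, the hardware side does come to about $4.13 billion, while the $7 billion in orders requires 56 million people each spending $125 a year by voice, out of an active base of roughly 112 million users.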
“We believe,” Doshi wrote, “that the Alexa-enabled Echo and its family of product, coupled with transactions and apps, could provide a large revenue opportunity, and make Amazon a pivotal part of peoples’ everyday lives.”
That giddy prediction effectively suggested that Alexa would soon be bringing in about as much every year as Apple’s Services are now doing in a quarter. In other words, the boring, piddly snoozefest of non-hardware sales related to Apple’s App Store, which barely matters because it isn’t nearly as big as the iPhone, divided by four, was the large and exciting prospect driving unhinged excitement around the idea of Amazon’s internet microphones.
RBC Capital was among the many analysts who agreed, based on a survey it conducted that appeared to indicate that over one quarter of Alexa-powered device owners “said they made purchases ‘very’ or ‘somewhat’ often through voice shopping,” which was interpreted to be “an impressive number given Alexa just rose to prominence over the past 12 to 18 months.”
In 2017, Juniper Research wrote that “advertising is the biggest revenue opportunity for voice assistants,” imagineering a forecast that voice-assistant advertising would “reach nearly $19 billion globally by 2022.” Researcher James Moar hedged his monumental bet by acknowledging that “voice-based interaction presents less options [sic] than other forms of advertising, meaning less adverts [sic] are possible.”
Writing for Time in 2017, Lisa Eadicicco eagerly announced, “Amazon Is Already Winning the Next Big Arms Race in Tech!”
“The most convincing evidence of the Seattle-based giant’s advantage? Alexa, Amazon’s voice assistant, is dominating this year’s CES,” Eadicicco wrote.
Eadicicco certainly wasn’t alone in prematurely characterizing Amazon’s Alexa as the winner that was “already” revolutionizing the technology industry, mostly just because various companies had announced Alexa-compatible devices at trade shows like CES. However, there wasn’t much thought given to whether people were actually going to buy Alexa-powered shower heads or rush to replace their smoke detectors with Alexa-powered units that cost $250 each.
Instead of thinking up the fearsome doubts that Apple has to disprove, such as “who would pay $999 for a phone!?” or “who will keep paying $2 for apps, when there are so many available?” the journalists fawning over the premise of Alexa simply assumed Amazon’s incredible competency at managing global warehouse sweatshops would translate into launching a new hardware interaction model, a vibrant new App Store-killing “skills” market, tons of exciting new advertising, and definitely billions of dollars of new Amazon orders.
Everyone knows it’s child’s play to compete with Apple and just run it out of business. Just look at Surface, Nexus, Pixel, Moto, Essential, Galaxy, Fitbit, Swatch, and Fire. However, something went wrong on the way to Alexa destroying mobile apps and crushing Apple’s iOS platform.
Alexa and the Voice First Reality
Three years later, iOS mobile apps have grown dramatically rather than shriveling up into obscurity as Slate prognosticated, along with Recode, The Verge, TechCrunch, Quartz, and many others who wrote up their own treatises on the Bot War on Apps waged by those “virtual assistants, bots, and software agents” like Alexa.
Almost all of our Internet interactions still involve keyboards, trackpads, or multitouch gestures—with voice being only occasionally useful for tasks like dictation, setting an alarm, or asking for a specific song. The unhinged excitement about Alexa in a world of “ambient computing” turned out to be just about as “revolutionary” as the CueCat.
The primary intent of Internet microphones was to help drive Amazon’s online sales or further Google’s surveillance advertising efforts. Their progress turned out to be wildly exaggerated. Instead of “often” driving voice shopping, Alexa was actually found to be doing virtually nothing to spark Amazon orders. The Information reported that just 2 percent of Alexa users ever tried to make a purchase from Amazon in a year, and 90 percent of those who did never tried again. And that came despite the fact that early adopters of Echo were already Amazon’s Prime customers.
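The gap between the forecast and the reported behavior is easy to quantify. Here is a rough Python sketch using only the percentages cited above. The variable names are illustrative, and reading the Mizuho forecast's "half of the user base" as exactly 50 percent is an assumption.

```python
# Compare the reported voice-shopping behavior with the forecast's assumption.

TRIED_PURCHASE = 0.02          # share of Alexa users who tried a voice purchase
NEVER_AGAIN = 0.90             # share of those who never tried again
FORECAST_SHOPPER_SHARE = 0.50  # assumed reading of "half of the user base"

# Share of all Alexa users who came back for a second voice purchase.
repeat_buyers = TRIED_PURCHASE * (1 - NEVER_AGAIN)

# How many times larger the forecast's shopper share is than the reported one.
overestimate = FORECAST_SHOPPER_SHARE / TRIED_PURCHASE

print(f"repeat voice shoppers: {repeat_buyers:.1%} of Alexa users")
print(f"forecast shopper share vs. reported: {overestimate:.0f}x")
```

By that arithmetic, only about 0.2 percent of Alexa users ever returned for a second voice purchase, while the forecast counted on a shopper share 25 times larger than the share that reportedly tried even once.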
Rather than generating billions of additional shopping revenue as Mizuho and RBC analysts imagined a couple of years ago, Alexa has been as much of a gimmick as Amazon’s earlier efforts to promote online shopping with A9 Flow and Fire Phone’s FireFly service.
Additionally, consumers have bristled at the mere thought of Alexa or Assistant injecting advertising messages into their conversations. Both Amazon and Google have adamantly claimed that they are not even thinking about pushing voice advertising, based on the caustic receptions they get whenever they try to sneak in some advertising. There goes another “$19 billion” opportunity that was supposed to keep voice assistants flush with cash.
On top of that, the idea that third-party developers would at some point be earning App Store-like revenues from building voice agent “Alexa Skills” totally failed to materialize. This March, Bloomberg detailed a variety of Alexa developer experiences, noting that “the advent of the smartphone triggered an app gold rush. So far that hasn’t happened with Alexa.”
Amazon Alexa Skills didn’t exactly slay the App Store
Meanwhile, Apple’s App Store sales, search advertising, and subscriptions grew 19 percent over just the last year, driving its Services business into a $10.9 billion per quarter enterprise. Apple has even harnessed the installed base of Amazon Alexa as a new way to promote its own Services revenue via Apple Music. It’s as if Apple wins even when everyone says it’s losing.
Alexa’s Expensive Free Hardware Gambit
While the iOS app and subscription content paradigm continues to grow in influence and value, the core of Apple’s business is hardware—to the tune of well over $200 billion in revenues every year.
Amazon Alexa and Google Home were lavished with oversized attention for years over rather minor sales of their Internet microphone products that weren’t remotely profitable. During that time, Apple’s profits from hardware, virtually all of which featured Siri, were around $50 billion annually.
In other words, while Amazon and Google burned through tons of money across four years of just trying to establish an installed base of microphoned users with the hope of someday monetizing them, Apple earned $200 billion in hardware profits. Today, many of the same journalists who were excited about the promise of Alexa Voice monetization are now pouting that Apple is also working to sell additional new services to its vast installed base. It’s almost as if they have a horse in the advertising game.
That is exactly why voice needed to be chattered about as if it were going to materially hurt Apple. Amazon and Google had already proven that they can’t sell a phone, a tablet, a watch, a TV box or any other hardware at a profit, so they really needed a Big Lie. The suggestion that voice was going to topple apps (and, very clearly by extension, Apple) was the root of the whole fantasy erected to support the idea that suddenly there was a hardware category where Amazon and Google could win even while they were very clearly losing.
Neither Amazon nor Google has even attempted to earn any Apple-like hardware profits from their mostly loss-leader Internet microphone offerings that they effectively give away. Rather than being poised to sell another $4 billion in hardware next year as Mizuho analysts predicted, Amazon is now facing an increasingly saturated US market and intense competition from Google, which is copying its cheap device strategy. Both also have little hope of entering new markets like China where local companies have already rolled out their own voice alternatives. Internet microphones are pretty obviously not the next iPhone.
Amazon’s invention of dozens of new Alexa form factors, from stationary tablets to wall clocks to a microwave that orders popcorn for you, has not shifted where money is being made in tech hardware. Instead, Apple has dramatically increased the perceived value of its iOS offerings to the point where mainstream iPhone buyers now spend an average of nearly $800 on new iPhones, and rather loyally do so every two or three years. Apple’s growing base of customers has also kept buying tens of millions of premium iPads and Macs, and millions plunked down new money on the hottest product of the year: Siri-capable AirPods.
Rather than giving away hardware, Apple sells over $200 billion worth of gear annually that also supports Siri features, from HomePod to AirPods
Apple reportedly sold 35 million units of AirPods alone last year, a figure which would generate about $5.6 billion of revenue. AirPods also earn money, rather than just being a loss leader racing to deploy “microphone market share” capable of summoning Siri.
Apple’s HomePod sold about 3 million units across just the second half of last year, generating more than another billion in revenue at its debut. That’s equivalent to the revenues of 20 million units of Echo Dots (when they’re not on sale), but HomePod is profitable. Nobody predicted that Apple would be earning more money than Amazon from “smart speakers” back at the peak of Alexa hysteria.
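Those unit and revenue figures imply average selling prices that can be checked directly. This is a small Python sketch using only the numbers above. Treating "more than another billion" as exactly $1 billion is an assumption that makes the HomePod figure a lower bound, and the helper name is illustrative.

```python
# Implied average selling prices from the unit and revenue figures above.

def implied_price(revenue_usd, units):
    """Average revenue per unit, in dollars."""
    return revenue_usd / units


# AirPods: about 35 million units generating about $5.6 billion.
airpods_price = implied_price(5.6e9, 35e6)

# HomePod: about 3 million units generating at least $1 billion (assumed floor).
homepod_price_floor = implied_price(1e9, 3e6)

# Echo Dot: 20 million units said to match that same $1 billion.
echo_dot_price = implied_price(1e9, 20e6)

print(f"AirPods: about ${airpods_price:.0f} per unit")
print(f"HomePod: at least ${homepod_price_floor:.0f} per unit")
print(f"Echo Dot: about ${echo_dot_price:.0f} per unit")
```

The figures are internally consistent: about $160 per pair of AirPods, at least $333 per HomePod, and about $50 per full-price Echo Dot, so each HomePod sale brings in as much revenue as six or seven Dots.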
But beyond accomplishing little for online shopping, or advertising, or for developers, or drumming up any new hardware profits, there’s a larger problem for Alexa—the initiative is blowing through tons of money while doing all that nothing.
The idea that Apple fell behind in the voice-based assistant race that it effectively initiated back at the 2011 introduction of Siri has made for a compelling, truthy media narrative. Siri as a service was clearly being bested by various features and abilities of Alexa and Assistant. Even Apple’s most ardent fans were vocal in hoping the company would plough more money into Siri research so that it, too, could maintain more conversational requests and answer trivial questions at least as well as Alexa.
Amazon has invested spectacularly in Alexa research. In 2016, Bezos, the company’s chief executive, said that about 1,000 people were working on Alexa and the Echo. Toward the end of 2017, the company’s senior vice president of devices and services David Limp announced there were 5,000 working on Alexa.
The company is notoriously slippery with its numbers, but if that actually meant “5,000 employees” it would represent close to half the workforce capacity of the shiny new Apple Park. Imagine Apple devoting half of its new campus just to work on Siri features. If shareholders heard that, they’d have good reason to dump Apple’s stock because Siri itself isn’t a revenue generator. Alexa not only doesn’t generate revenue, but it doesn’t even pretend to help Amazon sell $200 billion of other hardware each year the way Siri does.
Imagine dedicating nearly half of Apple Park just to work on Siri
The idea that Apple reportedly once had 1,000 people working on some sort of automotive efforts in Project Titan, and laid off 200, was taken as a shocking pivot of wasted effort, even in a year where Apple had earned $59.5 billion in profit on sales of $266 billion in product. Critics similarly fretted that Apple Park’s supposedly $5 billion price tag was astronomical, arrogant and a portent of doom.
But for Amazon, the idea that 5,000 people were working on a loss leader novelty feature that wasn’t accomplishing anything across years of monumental investment was just plain exciting. Even more so was the fact that Amazon was also trying to hire hundreds more to work on its “Alexa engine” and “Alexa machine learning.” And even more exciting was the fact that Amazon was building its own urban campus in the middle of Seattle for $4 billion, featuring giant Spheres full of plants.
In 2018, Amazon reported $141.92 billion in product sales and earned all of $10 billion in profits. Dumping tons of money on Alexa was clearly part of the reason Amazon wasn’t nearly as profitable as Apple, year after year over the last half-decade. But all these years later, Amazon has effectively nothing to show for it.
Last year, Apple actually paid out more just in shareholder dividends—about $13 billion—than Amazon reported earning. Amazon doesn’t even pay dividends. Clearly, something would have to change for Alexa to be worth shoveling so much money at. Investors clearly must think so because Amazon is valued stratospherically, with a Price to Earnings ratio of 91.7, compared to just 16.7 for Apple. But when will Alexa start generating tens of billions of dollars, and how exactly?
Since 2014, Amazon’s voice efforts have only managed to create an “Echosystem” of 100 million devices, a number that includes every device shipped by anyone with some support for Alexa; last fall CIRP estimated the installed base of all the voice assistant Internet microphones ever sold as being just 50 million.
Apple has an active installed base of 900 million iPhones, 100 million Macs, and hundreds of millions of other devices—from iPad to AirPods—designed to work with Siri, not even counting all of the third party products that support HomeKit or other Made for iOS integration with Siri. Yet pundits kept talking about Alexa as if it has some unique “Amazon Everywhere” or “first mover” advantage.
Alexa was neither everywhere nor the first mover in voice assistants
Cloud competency vs user-driven automation
Alexa can certainly be used to do things Siri can’t. There are some devices (including peripherals from Amazon’s Ring subsidiary) that only work with Alexa and not Siri. Alexa can run a variety of third-party skills, even if there’s not much real value in those. In various ways, Siri has become a runner-up in terms of the value of its “cloud competency” compared to Alexa and Google Assistant.
But rather than devoting thousands of workers and shoveling tons of money at Siri to “catch up” to make Siri equally good at doing everything Amazon has worked to make Alexa capable of, Apple pursued an entirely different strategy focused on user-driven automation.
While Alexa aspires to understand and anticipate anything a user might ask, Apple focused on Siri performing a subset of specific, valuable tasks. This was frustrating for many users because there wasn’t any obvious way to know in advance if Apple had implemented a given feature. And in many cases, Apple hadn’t, such as in enabling multiple timers or other features that Alexa explicitly could. Feature comparisons of Siri against Alexa or Assistant were often unflattering.
At the same time, independent behavioral studies kept highlighting the reality that most people were only really using voice services to do a few common, memorable tasks, such as playing music or setting alarms. Certainly Apple knew this, given that it had access to massive usage data from the most frequently and broadly used voice service in the world.
A primary area where Apple focused attention was accessibility. At WWDC 2017, Apple presented Todd Stabelfeldt, a disability advocate and quadriplegic who outlined how Siri, paired with the alternative motor control features of iOS Switch Control, and the display and camera hardware that supports FaceTime and iOS 10’s new Home app, empowered him to live a productive life. He implored developers in the audience to make full use of the assistive technologies in Apple’s platforms.
Sarah Herrlinger, Apple’s senior manager of accessibility policy and initiatives, stated at the time, “We put a lot of time and effort into making sure our products are as accessible as possible for all users. For some people, doing something like turning on your lights or opening a blind or changing your thermostat might be seen as a convenience, but for others, that represents empowerment, and independence, and dignity.”
Beyond accessibility, Apple’s attention to meaningful enhancements to Siri as an integral part of its various platforms has also followed a user-driven focus, rather than just showing off frivolous entertainment features like the ability to recite jokes and maintain an ongoing, fake conversation with users.
As we predicted last year, Apple’s approach to making Siri more useful for real-world users involved acquiring Workflow and integrating its automation features into “Shortcuts,” complex task workflows that users can then invoke with their voice via Siri.
Siri Shortcuts shift voice assistance from being a genius in the cloud to a useful way to trigger common tasks on the devices you already have
You can build Siri Shortcuts to, for example, begin a workout, start a favorite music playlist, and put your phone in Do Not Disturb mode when you tell Siri “I’m at the gym,” a set of actions no cloud service could reasonably anticipate for you. Sometimes the best assistant is somebody who can do exactly what you specify, rather than trying to guess at what you might want.
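The idea is simple enough to model in a few lines. The following Python sketch is purely illustrative and is not Apple's Shortcuts API: the `Shortcut` and `ShortcutRegistry` classes and the three placeholder actions are hypothetical names, used only to show a user-chosen phrase mapped to an ordered list of actions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Shortcut:
    """A user-defined phrase paired with an ordered list of actions."""
    phrase: str
    actions: List[Callable[[], str]] = field(default_factory=list)

    def run(self) -> List[str]:
        # Run every action in the order the user arranged them.
        return [action() for action in self.actions]


class ShortcutRegistry:
    """Looks up shortcuts by spoken phrase (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._by_phrase: Dict[str, Shortcut] = {}

    @staticmethod
    def _normalize(phrase: str) -> str:
        # Ignore case and extra whitespace when matching phrases.
        return " ".join(phrase.lower().split())

    def add(self, shortcut: Shortcut) -> None:
        self._by_phrase[self._normalize(shortcut.phrase)] = shortcut

    def invoke(self, spoken: str) -> List[str]:
        shortcut = self._by_phrase.get(self._normalize(spoken))
        # An unknown phrase triggers nothing; there is no guessing.
        return shortcut.run() if shortcut else []


# Placeholder actions standing in for real device features.
def start_workout() -> str:
    return "workout started"


def play_gym_playlist() -> str:
    return "gym playlist playing"


def enable_do_not_disturb() -> str:
    return "do not disturb on"


registry = ShortcutRegistry()
registry.add(Shortcut(
    phrase="I'm at the gym",
    actions=[start_workout, play_gym_playlist, enable_do_not_disturb],
))

results = registry.invoke("i'm at the gym")
unknown = registry.invoke("order me a pizza")
print(results)
print(unknown)
```

The point of the design is that nothing is inferred: the phrase and the list of actions both come from the user, so the assistant does exactly what was specified and nothing else.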
At last year’s WWDC, Apple laid out an entire strategy for Siri automation that included third-party developers creating Shortcuts for users, an easy-to-use Shortcuts app that lets individuals set up their own Siri-enabled workflows, and even iOS using artificial intelligence to recommend Siri Shortcut actions individuals can activate with a spoken phrase, based on their common tasks, such as ordering an almond milk latte on the way to work every weekday, or checking where their next calendar appointment will be and getting directions to it.
Apple continues to maintain the advantages Siri originally had—including the largest installed base of users and connected devices in the most places around the globe, as well as support for by far the most languages worldwide. But it now also has Alexa’s ability to leverage third-party efforts to expand and customize its functionality, albeit in a much more individualized way, using sophisticated, personalized workflows.
Apple achieved this without having to hire thousands of employees and create dozens of different Echo-like variants to effectively give away at tremendous cost. So all these years later, rather than being a genius, step-ahead strategy destined to destroy Apple’s position and take away all of its assets, Alexa appears to have been mostly a vast, speculative waste of resources.
When all you have is a hammer, everything looks like a nail
The voice engineers at Amazon, Google, Microsoft, Samsung, and Facebook (remember “M”?) all racing to deliver a better Siri pretty clearly overestimated the value of voice assistants, certainly in the short term. But there’s nothing new about getting carried away with the perceived value of whatever technology you’re currently working on.
BlackBerry once thought that its BES messaging service and its physical keyboard would mount a solid defense against Apple’s iPhone. Microsoft once thought Windows was so valuable that it could slowly roll out tablets and phones and the enterprise would just wait around for years for it to catch up to iOS. And back in the day, Apple once thought its Mac desktop was so great that it could just tinker on wonky futuristic new ideas like QuickDraw GX, QuickDraw 3D, and PowerTalk, and its clients would adopt them.
In each case, journalists and analysts often agreed and supplied their own figures and logic to support ideas that were clearly not realistic, just as they all bandwagoned on the ridiculous idea that Internet microphone voice appliances were going to rid us of iPhones and apps so that instead of living our private lives staring at screens, we’d walk around shouting our private life into a series of speakers, one in each room and maybe one in our car. It all makes so much sense when you turn your brain off!
Apple didn’t rise to prominence by giving away free stuff. It became the global economic powerhouse of today by creating good products that people wanted to pay money for. The profits from those sales continued to fund new generations of iPods, iPhones, and iPads, as well as the chips to power them, the iOS upgrades to enhance them, and new generations of devices and form factors from Apple Watch to HomePod to Apple TV to AirPods.
Siri has served as a useful component of Apple’s products, but certainly has never become the company’s primary strategy. Apple’s years of alleged neglect of Siri appear to have caused no discernible problem in the long term. Apple is now making significant profits from its own voice-based music appliance, as well as with its Siri-capable wearables, mobile devices, its HomeKit ecosystem and with CarPlay integration.
Apple has continually expanded Siri’s relevance in ways that the tech media have ignored, simply because it didn’t fit their nutty narrative of Alexa killing the App Store. Siri’s support for more languages and its installed base on over a billion devices mean that meaningful enhancements, including last year’s DIY Siri Shortcuts, can be rolled out faster and cheaper than either Amazon or Google can design and shovel out tens of millions of nearly free new devices (this time with a camera and a display) at their own expense.
Apple’s increasingly large installed base of Apple Watch users can now effortlessly invoke Siri with Raise to Speak. And iPhones, which already use AI to categorize your photos by people, places, moments, categories, and various objects within them, can now search those photos via Siri: “show me my pictures of motorcycles in Mexico.”
watchOS 5 introduced even more fluid access to Siri
Who would entrust Amazon or Google to pull all their personal photos into the cloud for analysis and categorization, given their histories of misusing personal data or accidentally sharing it with the wrong person, or handing it to the authorities in whatever country demands it? One of Siri’s primary features is that much of the metadata it can make available to users is local to their devices and never leaves them.
The tightly integrated nature of Siri means that even a significantly better voice competitor wouldn’t be enough to gut and replace Apple’s $200 billion worth of annual hardware sales. Apple’s users can make use of Alexa, or Assistant or even Cortana without changing their hardware and without removing Siri functionality.
In fact, the majority of Alexa users are already Apple customers; they haven’t given up anything to play with Echos. But in the end, those customers are far more invested in Apple products than in cheap Amazon devices that can be replaced tomorrow with Google Home freebies, and then after that by the next wave of free hardware trying to promote the next fad, charging nothing and gaining nothing but a huge write-off and a temporary surge of non-commercial customer activity.
We’ve now witnessed just over four solid years of Alexa-based hardware sales that have added up to effectively nothing, even as Apple’s installed base of interconnected devices has radically expanded to create a valuable platform that actually sells billions of dollars worth of apps and other third-party investments, including HomeKit peripherals, service subscriptions from Google and Microsoft, and replacement Lightning cables by Amazon.
The Alexa voice fantasy isn’t the only media narrative that’s complete nonsense, but it’s a great example of how bamboozled analysts and journalists can be when some new technology is dangled in front of them with the premise that it will destroy what is currently the most valuable and productive tech company. Their credulity isn’t evidence of reality.