When Andy Rubin sold his startup Android to Google in 2005, the tech landscape looked vastly different. Our phones had clunky keypads, barely functional cameras, and mini screens that didn’t support touch input. They also couldn’t run the same apps. It’s hard to remember what that felt like in a world where Android-powered mobile devices are everywhere, each one built on the same software foundation.
More than a decade later, Rubin is trying to crack the next big platform war that’s brewing in the consumer tech space. “There were so many frickin’ operating systems for phones,” Rubin says. “And it reminds me of what’s happening in the home and Internet of Things [market] today.”
In May, Rubin gave the public a first look at what his newly founded company, Essential, has been up to. The firm recently launched a new Android phone that Rubin hopes will be future-proof: it has a magnetic connector for attaching accessories that can enable new features, like 360-degree video capture, without having to upgrade your device. And other crucial features, like the camera, are capable of being updated via software over time.
But Rubin’s goals are bigger than battling planned smartphone obsolescence. He wants to invent a new type of software that’s capable of connecting all of the devices in your life, be they smartphones, smart home appliances, or smartwatches. For now, he’s starting with the home. When Rubin unveiled Essential in May, he also announced another yet-to-be-released product called Home, an ambitious hub that runs on Essential’s new Ambient OS and can control all sorts of intelligent household gadgets. With this device, Rubin wants to give smart home devices an easy way to talk to one another, whether they run on Apple’s HomeKit or Samsung’s SmartThings, or answer to voice assistants like Apple’s Siri or Amazon’s Alexa.
Rubin spoke with TIME about his intentions for Ambient OS, how he sees artificial intelligence changing the way we use our phones, his response to criticism of the Essential Phone, and more. The transcript has been edited for length and clarity.
TIME: What’s the next big idea after the smartphone? Apple seems to think it’s augmented reality. Amazon is focusing on virtual assistants. What is it for you?
Andy Rubin: In the multi-device world, there are unaddressed segments. And I think something has to tie together all of the devices in your life, whether it’s your phone or a watch. Or maybe one day it’s a smaller phone that’s less capable. Maybe [the small phone] is really good at communicating, but you don’t need every app on the phone. And [when] you’re going out to dinner or going jogging or to a sporting event, you’re going to want to take the smaller phone because it’s more convenient than a phablet. And you might choose to take the phablet to work because it has a big screen. So I think consumers will have more choices in the future.
Carriers are already starting to support this scenario with electronic SIM cards, and all the carriers in the U.S. have a service where you can have one phone number that rings multiple SIM cards. So I think you’ll see these platforms emerging that allow you to basically manage that from a user interface perspective. Don’t think of it as a management burden where it’s yet another thing I have to do. Think of it as we can invent software technology that makes that invisible to you.
Let’s discuss Ambient OS for a bit. You’ve talked in the past about how you plan to make Home compatible with other smart home platforms, like Apple’s HomeKit, by plugging into their APIs. But what if a company like Apple decided to cut you out of their ecosystem? Have you thought about how Essential would address that?
You’re right. Technically one of these companies could say ‘OK, just for Andy Rubin’s company, we’re going to turn off our APIs.’ But I’ve run larger organizations before at public companies, and I think that’s anti-customer. It hurts your customer, and it hurts somebody else’s customer just because you don’t like a strategy, you don’t like a feud, or you don’t want to share. So I think those things don’t actually end up playing out in the market. I think people end up trying to figure out how to collaborate.
My reference point on that is Android. When Android came to market in 2008, there was Linux, Motorola had their own OS. There were so many frickin’ operating systems for phones, and it reminds me of what’s happening in the home and Internet of Things [market] today. Everybody’s building their own operating system. And what it takes is an operating system that goes horizontal. For the first time with Android, a Samsung phone could run the same app as a Motorola phone. That had never happened before. Going forward, I think there are derivatives of that that will help in this IoT home scenario.
Tell me more about what your approach will be for Ambient OS’ virtual assistant. Google, being a search company, has built an assistant that’s really good at answering everyday questions. With Alexa, Amazon has been really focused on encouraging developers to build “skills.” What’s going to be Essential’s angle?
We’re supporting more than one assistant on the OS and all the products that use the OS. I don’t feel the need to go and lock somebody into my assistant. It’s a multi-assistant world and a multi-device world. We should deal with that, not exclude it from the consumer’s menu of options.
And then there will be an Essential assistant as well. There will be certain cases where the consumer wants to choose that assistant to do certain types of things, [such as] the interoperation of the things in your home. There isn’t a good assistant that’s focused on that today. I think assistants are kind of yesterday’s technology. I had a quote a long time ago while I was at Google. It said I don’t want to talk to a fake person on my phone. My phone is a device that should basically be completely transparent to how I get information, and I shouldn’t have to have a conversation to get the answers I need.
The way you converse with an assistant, it’s kind of an OK input metaphor. I don’t want to be on a bus having a conversation with the assistant. Everyone can hear my questions, it just doesn’t make sense. You end up using the assistant in times when you’re alone. I also think that [for] input it’s a pretty good metaphor. But [for] output speech is a terrible metaphor. If I ask an assistant to give me driving directions to San Francisco, the assistant is going to read back the 13 turns I need to take probably before I’ve even gotten in my car. So a multi-modal A.I. is probably a better way to think about it. And you’re starting to see some of that.
And we’re not calling it next-generation assistants. This is the reason we have an ambient OS. Because these things become pervasive throughout the entire operating system. You’ve seen machine learning do a better job at helping you find a photo in Google Photos. You’ve seen machine learning presenting to you these two- or three-word reply buttons at the bottom of Gmail. And there’s a lot of stuff going on in the background to read that email, and understand the natural language, and then propose those responses for you. An interesting thing about that is if I send the same email to three people, those buttons on the bottom of Gmail are exactly the same. I think there’s a future where these operating systems will make some of this A.I. machine learning technology pervasive throughout the OS. So it’s not one team thinking about how to do it for Gmail. It’s a team thinking about how you do it for every app.
Can you give me a hypothetical example? What’s an example of something this operating system would be able to do that Android can’t today?
I’m purposely not going into great detail about this because we haven’t announced it yet. Like the Gmail example, where all of the buttons are the same for each user. If you had it [built into] the OS, those buttons could reply in the user’s voice of that device. You and I will have different mannerisms in how we communicate with our colleagues at work and with our friends and family. I might use a ton of emoji when I communicate with friends and family. But I might not do that in a work setting. None of that has even surfaced in what they’re doing today. So what I’d like to see is the A.I. customizing itself to be a little virtual version of you. So when things come into your device, and when things happen on your device, the little virtual version of you is the first one to see it.
One of the biggest things that differentiates Essential’s phone from most competitors is the accessory port on the back. Other phone makers like Motorola have taken a similar approach, but there isn’t really any proven consumer interest in this yet. Why did you decide to make that a selling point of your phone?
You’re faced with this long research and development cycle, where companies disappear. A year later they’ll come out with like a speed bump [improvement] to the phone, and then two years later they’ll come out with some platform improvement to the phone.
[The accessories] are made to fill those long upgrade cycles with innovation. And that might be our 360-degree camera. You might find that built into future generations of our phone. So it gives us that mechanism by which we can test the market, provide the innovation really quickly to the market, iterate as necessary, and bake it into the next phone.
An early criticism about the Essential Phone is that its camera doesn’t measure up to that of its rivals. What led you to the choices you made with the camera and how are you addressing those concerns?
Our camera was built with two different types of sensors, not lenses. One is a black and white sensor. Technically black and white [sensors] let in more light and the clarity is much better. But it’s black and white, and not everybody wants black and white photos. So then we apply the black and white pixels to the color image in the other camera and we do computational photography.
It’s a technical direction that we think is a sound technical direction. It’s all computational photography at this point. We’re on a software update cycle for the camera. That typically hasn’t been the case for these other approaches. You have a sensor in the phone, and once you get the image out of the sensor you’re done. You’ve taken the picture. Here, we’re actually applying algorithms to post-process the pictures to make the fusion of these two sensors create new functionality. When we gave the phone to reviewers, which was prelaunch, we hadn’t finished all the camera software. And we raced to get some of the over-the-air updates out there during the reviews, and we continue to do so.
The phone also shipped later than it was expected to. How do you plan to prevent that in the future?
If you were a fly on the wall in the Apple or Samsung boardroom, or anywhere else, they don’t publicly announce their delivery timeframes until they’re ready to announce the product. I do that. I’m a startup company. I like being transparent.
I want to make sure we’re shipping a quality product. I want to be relatively transparent. I don’t think secrecy helps garner any trust in a company. I want to have a more conversational relationship with my customers. And they have to understand that if I’m two or three weeks late, that’s not a big deal in the grand scale of we started designing these products two years ago. And I think the only difference between us and the rest of the industry is we told people.
So you think our phones and homes are the best places to start when implementing new types of intelligence. What’s next? Are you thinking about the car at all?
Our products are staged very carefully. There’s a means to an end kind of approach here. If you’re building a new operating system, I think it’s really hard to build a technology for technology’s sake. You kind of have to prove the worth of the technology in a consumer’s life. The first thing we did was build a great phone. This device is going to evolve. We’re also going to start adding some of these concepts into our releases for the phone. So some of this A.I. stuff will start seeping into the current product.
But as you know, A.I. is a different form of computer science. Instead of programmers writing linear lines of code, you have data scientists who are training systems to do things. This is why the companies that have the most data are the ones that are furthest ahead in A.I.: Google, Facebook, and Microsoft. There are two different approaches to [gathering] data. One approach is Google’s with Waymo. The problem with these things is that in order to make them safe, you need enough data and experience in the real world to make sure your A.I. has learned every little facet that you might encounter. So the companies that want to go from zero to level five autonomy have a chicken or egg problem. They don’t have enough deployment of the tech to get enough data to understand how the real world works. And without that data, they can’t do the deployment in the cars to make them operate safely. So you’re stuck.
So that’s one approach. Another approach is what Elon [Musk] does at Tesla. Elon goes out and he builds a really, really cool electric car. And you do one simple thing when you make those electric cars. You mandate that every car is going to be a connected car. Every car is going to have an LTE modem in it. And then while your happy users are driving around in their electric cars, that connected car is reporting its environment and all of those experiences back to Tesla. And as time goes on, Tesla learns from those experiences. And it keeps getting better and better.
So between the chicken or egg approach and the Tesla approach, which one would get to autonomy faster? Definitely the Tesla one. I don’t want to call it a Trojan horse, but it’s an iterative approach where you’re learning on the shoulders of your existing customers. If the industry could wrap its head around that, we’re going to get to bigger, more capable A.I. systems faster. If you put that in the context now of cellphones, and how you get to an artificially intelligent operating system, part of this is that one of my legs is firmly planted in what we do with smartphones today, and another is in what the future looks like. And literally we’re building a bridge to get to the future.