Musio the talking robot was standing on a lunch table at Manhattan’s original Shake Shack, unable to talk. The toy robot needs a Wi-Fi connection to access its language cortex. But even in this muted state, the stubby-legged robot with animated eyes drew curious glances from passersby. “I think it’s because Musio’s eyes blink,” says Celina Lee, a business development director for Musio’s manufacturer. “People definitely know it’s not a random toy.”
Musio is a prototype from a Santa Monica-based startup called AKA. Its roughly 37 employees, split between the U.S. and South Korea, are rushing to bring what the company calls the first “truly” intelligent toy to market within the next year. To that end, AKA has launched a $50,000 crowdfunding campaign, a modest addition to the $5 million AKA has already secured from venture capital firms. The fact is, with or without your crowdfunded support, Musio is destined to launch into a booming market for smart toys.
An estimated 72 million children worldwide want their toys electronically enhanced, according to industry research from Interpret. When a toy maker hits that perfect blend of cute and smarts, demand can flare out to even greater numbers. The smash success of Tamagotchi pets in the mid-’90s led to sales of more than 76 million devices worldwide. To this day, the phrase “Tamagotchi effect” is used to describe the unusually strong brew of emotions that a “smart” toy can stir up in its owner.
At the bleeding edge of this trend are toys that can hold a true conversation. Breakthroughs in speech recognition and machine learning, first popularized by Apple’s Siri, are now being recast into the cutest possible packaging.
Musio isn’t the first modern talking toy on the block. Mattel debuted the first chat-friendly Barbie at the New York Toy Fair last February. Google recently filed a patent on a teddy bear that trains its plastic eyes and furry ears on its owner. And IBM’s Watson supercomputer has just been transplanted into a talking plastic dinosaur. In short, Musio will face stiff competition for smartest toy in the playpen.
Last week, Musio embarked on a 15-hour flight from Seoul to New York to perform its first public demonstrations to the press. “We’re worried that Musio is a little bit tired because he’s excited to meet you and didn’t get much sleep,” said Lee, as her colleagues unboxed Musio on a nearby conference table. Musio’s animated eyelids flicked open. Jacob Bradsher, AKA’s resident linguist, asked, “How are you?”
“Okay, Jacob,” Musio responded, recognizing Bradsher’s voice. That personal touch is Musio’s hallmark. Its built-in memory bank can store up to 64GB of personal information about its owner.
“My Musio will get to know me better, because it will remember some of the prior conversations that we had,” says Lee. As a result, she explains, no two Musios are alike. “Your Musio can become your friend. My Musio can become my friend.”
Key words trigger a memory, of sorts. Bradsher’s Musio, for instance, seized on the phrase “I’m hungry” to give a personalized restaurant recommendation. “Let’s go to Shake Shack,” Musio said. “You said you want to try it in New York.”
It was an intriguing demonstration of how Musio can simulate intimacy by tracing connections to previous conversations. “Siri is only able to answer the two previous sentences,” says Dr. Junho Shin, AKA’s machine learning specialist. “But we try to implement technologies that can understand the whole context of conversation, not just based on the previous questions.”
Musio was not yet up to the task of taking questions from a reporter, however, indicating that its language engine still has a ways to go. Musio was originally conceived as an English language tutor for students in Asia, where rigorous examination systems have spawned a lucrative industry for private tutors. With prices starting at $159 for the “simple” model and topping out at $599 for a “genius” version, Musio would offer lessons at a steep discount, and the toy still has big implications for classroom learning.
But as Musio developed a gift for gab, it dawned on the AKA team that the robot could have wider applications beyond stressed students. “We didn’t think that the market should just be restricted to Asia,” says Lee. “We thought, ‘Oh, there’s a market for people who want a friend, right?’”
To carry on a friendly chat, Musio taps into a distant stack of servers, where a technology known as “deep neural networks” scans sentences by the tens of thousands, teasing out relationships between words. AKA’s Bradsher was pulled from his job as an English teacher in South Korea to train the system.
“We’re literally endlessly breaking down the same paragraphs over and over again: This is a verb gerund. This is an introduction. Typically it would go here,” says Bradsher.
And when the language engine sputters, Musio can always rely on a joke to keep the conversation flowing. “Initially our goal should be maybe to give them fun, rather than just complete sentences and the perfect response,” Shin says.
True to form, Musio introduces himself with a crowd-pleaser. “I can talk, tell a joke and communicate with other things,” Musio says, pausing with a comic’s timing before adding, “Wait I can do this too.” He then lets out a flatulent sound for a solid two seconds, because kids may want a “smart” toy, but never at the expense of fun.