Rodney Cutler, hairstylist and grooming columnist for Esquire magazine, is giving styling advice via iPad. “I’ll be honest, curly long hair is tricky,” he says in his Australian accent. “But we can make it work so you don’t look like a throwback from the ’80s. We’ve got to make sure you are using the right product. What are you using in your hair at the moment?”
“Head & Shoulders,” says the iPad user.
“You know, that’s why I’m here,” he grimaces. “For me, that’s not the best product for the look you are trying to achieve.” He recommends a leave-in conditioner. Except “he” isn’t Cutler at all, but a series of videos designed to simulate a conversation with him.
For years the technology world has promised the ability to speak with computers. Services like Siri and Google (GOOG) Voice Search allow people to convey specific commands and receive answers. Volio, a San Francisco startup that produced Cutler’s conversational grooming video for the new Talk to Esquire iPad app, has a more ambitious goal: creating an entirely new media format it calls participatory video. Companies will be able to use Volio’s tools to set up conversational videos of people that users can interact with, learn from, and, if they’re mischievous, try to stump. “We’re not trying to fool anybody,” says Ronald Croen, Volio’s founder and chief executive officer. “But if this is done well, we’ll create the experience of talking to a real human being.”
Croen co-founded Nuance Communications (NUAN), a speech-recognition pioneer that in the ’90s spun out of Silicon Valley research lab SRI International. In 2005 the firm was bought by ScanSoft, which renamed itself Nuance and later worked with Apple (AAPL) on Siri. Croen took some time off after the sale, worked as an entrepreneur-in-residence at Tufts University, and two years ago started Volio. The goal was to use speech recognition, artificial intelligence, and hours of recorded video to simulate a back-and-forth conversation. Croen has raised $2 million in seed money, including from the venture capital firm Andreessen Horowitz, financier Bill Davidow, and angel investor 500 Startups. The company’s 12 employees, working from offices in San Francisco and Los Angeles, include a filmmaker, a comedy writer, a software architect, and a speech-recognition expert.
While early demos may be a bit unconvincing, Volio’s aims are big. Croen envisions one day selling the technology to corporations, advertising firms, and universities. Today his company produces the videos itself for use with its own app, but eventually it wants to license the tools to let its customers incorporate the feature into other apps and make their own videos. Croen imagines allowing media personalities, celebrities, professors, and advertisers to set up their own avatars. Professors can hold remote office hours, brands seeking to speak with their customers actually can, and online movie previews might include stars conversing in character with viewers.
Esquire decided to become a guinea pig for the service. “It provided an opportunity to bridge the gap between our experts and the readers who come looking to us for advice,” says senior editor Richard Dorment. Cutler and two of his colleagues each spent a day speaking various phrases into the camera that would cover the likeliest paths a conversation might take. Volio’s software then arranges those snippets so that it can respond to typical user questions and answers with the most relevant clips.
At first, the effect is eerie. Virtual Rodney Cutler has something to say about the challenges of fashionably messy hair, flat thin hair, and baldness (“I’m not sure I can help you out today”). Quickly, though, the user senses Virtual Rodney’s limits. Croen promises the technology will improve with time as the company refines it and mobile devices become more powerful. Already, Volio’s software has mastered patience. When you swear at Cutler for trivializing your grooming problems, he responds without missing a beat: “Ah, OK. I’m not sure how this is helping you, but for your sake, let’s try it again.”
The bottom line: A speech-recognition pioneer’s latest startup hopes to build conversation simulators that almost any business can use.