A few years ago, data scientists didn’t exist. Now it seems everyone in Silicon Valley either is a data scientist, claims to be one, or wants to be one. And why not? These people get paid $200,000-plus per year because they’re viewed as the wizards who actually know what to do with all the information that companies stockpile. They’re the overlords of data manipulation and analysis software like Hadoop, Pig, and Hive that scare off mere mortals by their ridiculous names alone.
Well, screw that, says Ben Werther, a veteran data-analytics wonk. He’s started a company called Platfora in a bid to make data analysis easy, or at least easier, so people with such job titles as product manager, marketing manager, and business analyst can become as effective as data scientists.
In some ways, you can think of Platfora as the difference between a command line and a graphical user interface. Instead of typing strings of complex queries into software like Hadoop, you just open up Platfora and click on various menus that determine which data sets you want to manipulate and how you want to manipulate them. So instead of needing a software engineer to go rooting through a database, you can basically click around with a mouse and go to town.
To prove that Platfora works, Werther (or rather his demo assistant) jumps on the Internet and finds a publicly available data set from the city of Chicago. It covers permit applications over several decades and is basically a giant spreadsheet. The assistant sucks the database into a Hadoop data-analysis system and then fires up Platfora to begin working with the information.
Right off the bat, Platfora goes through the database and sorts the info into different categories—permit applicant, address, date—and then throws up a clickable menu. From there you can ask to look at, say, permits from the last 20 years compared by type and cost. Seconds later you get back a chart plotting all this information and showing, for example, that the average permit cost was $965 and applications dropped off big-time as the 2008 recession kicked in.
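For the curious, the kind of question described above (permits compared by type and cost over a span of years) boils down to a group-and-average job. Here is a rough sketch in Python of what that looks like when done by hand. The rows, field names, and the `summarize` helper are all made up for illustration; this is not the Chicago data set and not how Platfora does it internally.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; not the real Chicago permit data.
PERMITS = [
    {"year": 2006, "type": "renovation", "cost": 1200.0},
    {"year": 2006, "type": "renovation", "cost": 800.0},
    {"year": 2006, "type": "electrical", "cost": 300.0},
    {"year": 2007, "type": "renovation", "cost": 1500.0},
    {"year": 2009, "type": "electrical", "cost": 250.0},
]


def summarize(rows, since_year):
    """Group permits by (year, type) and report count and average cost."""
    groups = defaultdict(list)
    for row in rows:
        if row["year"] >= since_year:
            groups[(row["year"], row["type"])].append(row["cost"])
    return {
        key: {"count": len(costs), "avg_cost": mean(costs)}
        for key, costs in sorted(groups.items())
    }


summary = summarize(PERMITS, since_year=2006)
for (year, permit_type), stats in summary.items():
    print(year, permit_type, stats["count"], stats["avg_cost"])
```

The pitch, as Werther tells it, is that a product manager gets the same answer from a menu and a chart without writing any of this.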
If you want to send this chart to a co-worker, you click another button and off it goes. Your colleague can then annotate the chart and send it back to you with comments or see the source of the data and perform another analysis job. At that point, you’re officially a data scientist and can ask for a raise.
Under the covers, Platfora is solving a pretty interesting problem. Older data-analysis systems tried to make jobs go faster by requiring companies to set up rigid guidelines for what they’re looking for. Newer options are more flexible in that they collect just about everything and let people search for just about everything, but they’re sucking in so much data that it takes a while to process new queries. Platfora, by contrast, acts as a middle ground between these two methods. It determines what data sets you will need for particular queries and puts them aside from the total pool of data so that analysis jobs can run much quicker on the limited pool.
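The middle-ground idea can be sketched in a few lines of Python: keep everything in the big pool, but set aside a small, pre-counted subset containing only the fields a given family of queries needs, then answer those queries from the subset. This is only a toy illustration of the general approach described above. The events, field names, and the `build_subset` and `count_where` helpers are hypothetical and say nothing about Platfora's actual implementation.

```python
from collections import Counter

# Hypothetical "total pool": every raw record that has been collected.
RAW_EVENTS = [
    {"year": 2007, "type": "renovation", "applicant": "A", "note": "n/a"},
    {"year": 2007, "type": "renovation", "applicant": "B", "note": "n/a"},
    {"year": 2007, "type": "electrical", "applicant": "C", "note": "n/a"},
    {"year": 2009, "type": "renovation", "applicant": "A", "note": "n/a"},
]


def build_subset(events, fields):
    """Set aside only the fields a family of queries needs, pre-counted."""
    return Counter(tuple(event[f] for f in fields) for event in events)


def count_where(subset, fields, **conditions):
    """Answer a count query from the small subset, not the raw pool."""
    index = {f: i for i, f in enumerate(fields)}
    return sum(
        n
        for key, n in subset.items()
        if all(key[index[f]] == value for f, value in conditions.items())
    )


FIELDS = ("year", "type")
subset = build_subset(RAW_EVENTS, FIELDS)  # built once, ahead of time

renovations_2007 = count_where(subset, FIELDS, year=2007, type="renovation")
all_2007 = count_where(subset, FIELDS, year=2007)
print(renovations_2007, all_2007)
```

The trade-off is the one the article describes: the subset is built once up front, and later queries only scan its handful of distinct keys rather than every raw record. A question about a field that was left out, such as `applicant` here, would need a new subset.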
The company was founded in June 2011 and has been selling its product since March. It’s also raised close to $30 million from In-Q-Tel (the CIA’s venture capital arm), Battery Ventures, Andreessen Horowitz, and others. (Bloomberg LP, which owns Businessweek.com, is an investor in Andreessen Horowitz.)
You can expect to see more and more companies like this cropping up and promising to have made big data easy. The whole big data thing seems to have reached that stage where people are tired of hearing about the promise of the technology and would like to see more actual results.
“People have just been doing it wrong,” Werther says. Perhaps.