DataSift, the big data company tackling the web’s social feeds and making them compatible with other types of data streams, is unveiling a couple of new product offerings this week, aimed at making it easier to manipulate and work with the information it processes from sources like Twitter and Facebook. The new introductions are called Push and Query Builder, and they’re in keeping with a growing trend in the big data industry to make dealing with massive amounts of information easier for staff who wouldn’t necessarily have had the technical expertise required to do so before.
The Push feature makes it possible for companies to easily combine data from social sources with data from their own company, without the use of APIs. Users get control over when, how and where information culled from DataSift’s social network feeds will be pushed, and the feature is compatible with a long list of popular existing database and business intelligence solutions, including DynamoDB, Amazon S3, MongoDB, CouchDB and more.
In an interview, DataSift CTO and co-founder Nick Halstead explained why providing that kind of easy access made sense for DataSift, and how it reflects customer needs.
“Push is actually a technology that means a couple really core things,” he said. “Right now, we deliver streams in real-time to our customers using a technology called HTTP streaming, which Twitter popularized and we adopted. But actually a lot of big companies don’t like it because it’s a technology they have to learn from scratch, and it’s actually not the most reliable of technologies, since when a connection drops it’s very hard to tell when it dropped and what data was lost at that time.”
So what Push provides is scalability and reliability, while also making it easy to connect to existing cloud-based infrastructure: a company supplies its credentials and authorization tokens, and DataSift uses them to connect in-house data with data from the social web. It’s a change at a fundamental level, but one that should mean DataSift customers get more reliable tracking of real-time information, and simpler integration with their existing data stores and feeds.
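To make the reliability argument concrete, here is a minimal Python sketch of one common way push-style delivery can survive a dropped connection: each batch is sent as a discrete unit, retried until the destination acknowledges it, and stored idempotently so a retry never duplicates data. This is an illustration only, not DataSift’s implementation; the names `Destination`, `flaky` and `push_with_retry`, as well as the use of batch IDs and acknowledgments, are assumptions made for the example.

```python
class Destination:
    """Hypothetical storage endpoint that accepts pushed batches."""

    def __init__(self):
        self.batches = {}

    def deliver(self, batch_id, records):
        # Idempotent write: re-delivering a batch that is already stored is a no-op.
        self.batches.setdefault(batch_id, list(records))
        return True  # acknowledgment back to the sender


def flaky(deliver, failures):
    """Wrap a deliver function so its first `failures` calls simulate a dropped connection."""
    remaining = {"n": failures}

    def send(batch_id, records):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise ConnectionError("simulated dropped connection")
        return deliver(batch_id, records)

    return send


def push_with_retry(send, batch_id, records, max_attempts=3):
    """Send one batch, retrying until it is acknowledged; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            if send(batch_id, records):
                return attempt
        except ConnectionError:
            continue  # the sender knows exactly which batch failed, so it resends it
    raise RuntimeError(f"batch {batch_id} was not acknowledged")


destination = Destination()
send = flaky(destination.deliver, failures=1)
attempts_used = push_with_retry(send, batch_id=1, records=["tweet a", "tweet b"])
```

The contrast with an open stream is that the sender here always knows which batch was in flight when the connection failed, so nothing has to be guessed about what was lost.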
The Query Builder likewise simplifies a process that used to be somewhat complicated, by putting a graphical user interface on top of what previously had to be written in DataSift’s own proprietary programming language. Using the Query Builder, users can see exactly what kind of filters are applicable to any data set they’re monitoring, and easily compile a query from scratch. So, for instance, if you wanted to see only tweets from people with a Klout score of 50 or higher in a certain part of the world with the word “consultant” in their bio, that’s all done now through a point-and-click visual interface, whereas before you’d have to learn and write quite a few lines of code to accomplish the same result.
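For a sense of the logic such a query encodes, the example filter can be sketched in plain Python. This is not DataSift’s query language, and the `Tweet` fields, the `matches` function and the country code used here are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    author_bio: str
    author_klout: int      # hypothetical field holding the author's Klout score
    author_country: str    # hypothetical field holding a country code


def matches(tweet, min_klout=50, country="GB", bio_keyword="consultant"):
    """Return True when a tweet satisfies all three conditions from the example."""
    return (
        tweet.author_klout >= min_klout
        and tweet.author_country == country
        and bio_keyword.lower() in tweet.author_bio.lower()
    )


sample = [
    Tweet("Big data is everywhere", "Management Consultant, London", 62, "GB"),
    Tweet("Lunch time", "Consultant and coffee fan", 41, "GB"),
    Tweet("New blog post", "Freelance consultant", 70, "US"),
    Tweet("Hello world", "Software engineer", 80, "GB"),
]

selected = [tweet for tweet in sample if matches(tweet)]
```

Only the first sample tweet passes all three tests; each of the others fails exactly one of them, which is the kind of combination a point-and-click builder lets a non-programmer assemble.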
“We have a great programming language within DataSift, and a lot of people are learning it, it’s a great language with a huge amount of power, but we actually did want to lower the barrier to processing a huge amount of data,” Halstead explained. “So we spent the last several months building a visual Query Builder that allows you to build the same kind of queries in DataSift using a graphical form.”
DataSift has thus far been concerned with harnessing the latent power in real-time social feeds and making that available to companies, but these new introductions take that to a different level, by increasing the accessibility of the historical and live data the startup processes. It’s a smart move, and one that means more employees at any given company can get hands-on with DataSift, which is good for growth in the long term.