Exasol was founded 14 years ago with a mission in mind: to build an ultra-fast, highly scalable database for analytics. Having been in the business of crafting an in-memory database for nearly a decade and a half, Exasol is uniquely placed to offer insights into the data science ecosystem, and will be giving talks at our events in Berlin, Munich, and London. We recently caught up with Graham Mossman, Exasol’s Senior Solutions Architect, to discuss Exasol’s development, product, and his upcoming talk at Data Enthusiasts London.
Exasol has been around for 14 years. How has the product changed over time?
Over the past 14 years, our priority has been to stay true to our roots and be absolutely world-class when it comes to analytical SQL queries. We haven’t tried to build our OLTP capabilities on top of that. Rather, we have gone for being the best at our particular niche.
We have worked tirelessly on making this the easiest product to use on the market and, crucially, on doing so without compromising on speed. On top of this, our aim has been to make our product easier to integrate with other systems. Our software engineers have taken this product from strength to strength, and I think it shows in how quick, easy and adaptable our product is.
Speed is a differentiating factor for Exasol: you were recently named the world’s fastest database according to the TPC-H Benchmark 2014. What else makes you unique?
It’s undoubtedly the fact that there are very few tuning knobs on the Exasol system. We believe that the system should do as much of the work as possible. There are actually only two things you can change in an Exasol system; otherwise, it just calculates the best way of operating.
The system is really like a car engine: it does not rely on the user having an in-depth knowledge of engine mechanics to operate it. To put it quite simply, the system just works!
Exasol’s system is quite intuitive, but I’m wondering how you deal with customers who have very specific requirements.
We provide a very simple tool for a job and we set the system up in such a way that it is extremely easy to integrate. Again, to give the engine analogy: our system is like a Mercedes engine, which is extremely versatile. It fits easily into a Formula 1 car, a pickup truck, a cab, etc. Exasol’s engine is exactly like this. We make it easy for people to plug it into their particular architecture.
We do not choose sides when it comes to the ETL tools or the BI front end you use; we just make our system easy to talk to. That way, we provide you with maximum power, without making any significant compromises.
You offer Hadoop integration too. What is the basic distinction between what you would expect an enterprise to do with Hadoop and what you would expect them to do with a data warehouse?
This really depends on the way a particular enterprise is set up. If we talk about our client King, they have a very large Hadoop implementation with petabytes of data, and they use us to answer analytical queries. Their integration is really straightforward.
For example, if King wants to know about a particular month of data, rather than pull all the data they’ve collected over the past 3 years, they will only pull a copy of the data they need into the database. Once they’ve loaded that data into our solution, they are able to run some very challenging queries extremely quickly, queries that would have stymied their Hive solution. This is one integration we offer, which I call a “loose integration”, where you pull from Hadoop down into a database.
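Editor’s note: the “loose integration” pattern described here can be sketched in a few lines. This is only an illustration under loud assumptions, not Exasol’s or King’s actual setup: the `RAW_EVENTS` list stands in for a large Hadoop store, an in-memory SQLite database stands in for the analytical database, and the table, column, and function names are all hypothetical.

```python
import sqlite3
from datetime import date

# Hypothetical stand-in for years of event data held in Hadoop.
RAW_EVENTS = [
    {"day": date(2014, 8, 30), "player": "a", "score": 10},
    {"day": date(2014, 9, 1), "player": "a", "score": 25},
    {"day": date(2014, 9, 15), "player": "b", "score": 40},
    {"day": date(2014, 10, 2), "player": "b", "score": 5},
]


def extract_month(events, year, month):
    """Yield only the rows for the requested month (the 'pull' step)."""
    for event in events:
        if event["day"].year == year and event["day"].month == month:
            yield (event["day"].isoformat(), event["player"], event["score"])


def load_and_query(events, year, month):
    """Load one month into a fresh database, then run an analytical query."""
    # SQLite is an assumption here; it merely plays the role of the database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (day TEXT, player TEXT, score INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        extract_month(events, year, month),
    )
    rows = conn.execute(
        "SELECT player, SUM(score) FROM events GROUP BY player ORDER BY player"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Only September 2014 is copied into the database before querying.
    print(load_and_query(RAW_EVENTS, 2014, 9))
```

The point of the sketch is the ordering: the filter runs before the load, so the database only ever holds the slice that the query needs.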
What else do you provide with Hadoop?
We provide a Hadoop integration product, which uses 0MQ messaging and Google Protocol Buffers to build an infrastructure that allows the passing of messages between Hadoop and the database. This means that, as part of your SQL in our database, you can call an external function. This is actually a MapReduce job that will run on the Hadoop cluster, which is a much more intimate integration that is available to those who want to have both systems working closely together.
From our experience, however, what most people want is to have Hadoop and the database world separate so that both can be optimised to do a particular job, rather than compromising both worlds.
Can you give us a brief summary of the talk you will be giving in our Meetup in London on November 13th and why you chose this topic?
The title we’re going with is “a tool for a job” and what we’re going to be talking about is how analytics queries are imperfectly dealt with by traditional multipurpose databases. Equally, we are also going to touch upon the fact that Hadoop is not well equipped to deal with such analytical queries either.
Although Hadoop has a number of projects trying to do SQL at speed, I believe that these approaches lead you to a very impure kind of architecture. By being grafted onto something else, which is what people are trying to do with SQL and Hadoop, you lose the purity of having a product that is developed from the ground up for a particular job.
I’ve also got some nice pictures of me mowing the lawn and, if you come to the event, you will understand how they’re relevant!
This will be the backbone and I’m looking forward to seeing how people respond!
This interview was conducted by Furhaad Shah, a journalist at Dataconomy.
(Image source: Exasol)