Big data was once the domain of big iron and big-brand frameworks, but in recent years, data processing has shifted to farms of x86 servers built on CPU-driven, von Neumann architectures. Yet when it comes to real-time performance, these systems just don’t hold up. I/O bottlenecks and the need for time-consuming indexing and extract, transform, and load (ETL) processes are just a few of the ways these systems slow data analytics to a near crawl. With data growth exploding, today’s computing architectures need a significant shift in order to keep up.
Why?
The true value of an organization’s data is locked inside historical data, server logs, clickstream data, images and videos, social media feeds, news feeds, and more, with no end in sight. Today’s sequential-processing systems are forced to cluster in order to scale, which hampers performance. Another factor: current large-cluster approaches, using tools like Hadoop and Apache Spark, often require significant ETL efforts. ETL and indexing can take days or weeks in some cases, especially if multiple indexes are required. Making matters worse, as new data arrives and new questions need to be answered, data must often be re-indexed.
Anyone involved in big data analysis understands the pain of these delays and their impact on the time value of data. Customer transaction data constantly changes, as does network monitoring, surveillance, and trading data. Getting answers to big data questions after days or weeks is like getting a traffic report after you have already arrived late to a crucial business meeting: the data is essentially useless. That’s why every industry and sector responsible for big data analysis and insights needs agile, easy-to-spin-up infrastructure and analytics that deliver insights in a matter of seconds or minutes, not days or weeks.
And the need for fast answers is becoming even more important as data volumes and variety explode. The Internet of Things (IoT) is generating more data than ever through video, sensors, apps, and more, and making sense of it all is quickly becoming a competitive differentiator. Today’s clustered analytics environments can’t keep up with even the recent social media explosion, and to think they can handle IoT is hugely misguided. Now may be the time to consider FPGA-driven systems as a complement to x86-based clusters: they are the best platform for achieving high performance, ease of use with open APIs, and lower costs while still handling the IoT data explosion.
Using FPGAs to Turbocharge Analytics Performance Allows Enterprises to Unlock Value Trapped in Data Stores
When data agility and the speed of answers matter, FPGA-accelerated systems are the key to gaining value from data. While parallel workloads that aren’t time-sensitive are usually a reasonable fit for today’s x86 infrastructures, the instances where an organization’s data matches that description are dwindling. The growing trend of analyzing streams of social media, geo-location, and IoT data alongside vast stores of unstructured data has driven the need for a new way to glean value from data – and FPGA architectures have proven to be the answer.
Properly architected FPGA-based systems have several major benefits, including performance, ease of use, and lower total cost of ownership. Performance is the benefit most organizations care about first. For targeted applications, the gains achievable with an FPGA-based system easily reach two orders of magnitude (100X or more) compared to contemporary CPU- and GPU-based cluster nodes, with some scenarios delivering even greater gains.
Ease of use is an extremely important benefit that until recently has remained elusive for FPGA-based (and other hardware-based) systems, primarily because they required specialized parallel programming skills that are drastically different from the traditional sequential programming paradigms typically taught in colleges and universities. Recent open architectures and open APIs have made a simple programming model possible, including straightforward web-based interaction with these systems through open-source conventions such as RESTful APIs, so an end user doesn’t even have to know that FPGAs are present.
An FPGA-based system that is not easy to use probably won’t last long-term, no matter how well it performs. So the ability to abstract away the complex technical considerations of FPGA fabric is essential in any good design and, by virtue of such a design, becomes a major benefit. This is especially true when a single FPGA-enabled box can solve a problem that a more traditional Hadoop or Spark cluster might need a hundred nodes or more to tackle. The best FPGA-based systems fully abstract away the parallelism of the FPGA architecture that does the heavy lifting in algorithmic processing, so that the end user can focus on their business objectives.
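To make that abstraction concrete, here is a minimal sketch of what the end user’s side can look like. Everything in it is hypothetical: the appliance URL, the /api/v1/search endpoint, the payload fields, and the response shape are illustrative assumptions rather than any particular vendor’s API. The point is simply that the caller writes ordinary sequential code and never touches the FPGA fabric.

```python
# Hypothetical example: querying an FPGA-backed analytics appliance over a
# generic REST API. Endpoint, payload fields, and response shape are assumed,
# not taken from any specific product.
import requests

APPLIANCE_URL = "http://analytics-appliance.local:8080/api/v1/search"  # hypothetical endpoint

def fuzzy_search(pattern, dataset, max_distance=2):
    """Submit a fuzzy-match search; the appliance decides how (or whether)
    to map the work onto FPGA fabric. The caller never sees that detail."""
    payload = {
        "dataset": dataset,            # name of a dataset already loaded on the appliance
        "pattern": pattern,            # term to match
        "edit_distance": max_distance, # allowed fuzziness
    }
    response = requests.post(APPLIANCE_URL, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()             # e.g. a list of matching records with offsets

if __name__ == "__main__":
    results = fuzzy_search("transaction_id:48k3", dataset="clickstream-2015-q4")
    print(f"{len(results.get('matches', []))} matches returned")
```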
Performance and ease of use are critical for any emerging technology that hopes to compete against the status quo. But, as we all know, total cost of ownership (TCO) matters too, and the TCO math for most FPGA-enabled systems is very simple. Because a single FPGA-based system can handle the workload of 100 or more traditional clustered nodes for specific applications, the TCO numbers become very interesting, very quickly. If an organization has to configure and maintain a single 1U box instead of 50 or 100 traditional servers, the rack-space, power, cabling, networking, IT personnel, and ongoing software maintenance costs of the larger cluster add up fast.
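That arithmetic is easy to sketch. The snippet below is a deliberately rough, back-of-the-envelope model: every figure in it is a placeholder assumption, not a quote, and you would substitute your own hardware prices, power rates, and staffing costs. The takeaway is only that per-node costs multiply across a cluster while a single accelerated box pays them once.

```python
# Back-of-the-envelope 3-year TCO comparison: one 1U FPGA-accelerated box
# vs. a 100-node commodity cluster. All figures are hypothetical placeholders.

NODE_COUNT = 100

cluster = {
    "hardware": 6_000 * NODE_COUNT,           # per-node server cost (assumed)
    "power_cooling_3yr": 1_200 * NODE_COUNT,  # 3-year power + cooling per node (assumed)
    "rack_network_cabling": 40_000,           # racks, switches, cabling (assumed)
    "admin_3yr": 3 * 120_000,                 # roughly one admin FTE for 3 years (assumed)
    "software_3yr": 500 * NODE_COUNT * 3,     # per-node support/licensing (assumed)
}

fpga_box = {
    "hardware": 150_000,                      # single 1U accelerated system (assumed)
    "power_cooling_3yr": 3_600,
    "rack_network_cabling": 2_000,
    "admin_3yr": 0.1 * 3 * 120_000,           # fraction of an admin's time (assumed)
    "software_3yr": 30_000,
}

cluster_tco, fpga_tco = sum(cluster.values()), sum(fpga_box.values())
print(f"Cluster 3-year TCO:  ${cluster_tco:,.0f}")
print(f"FPGA box 3-year TCO: ${fpga_tco:,.0f}")
print(f"Ratio: {cluster_tco / fpga_tco:.1f}x")
```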
Since these systems bring true high performance, are easy to use, and are cost-favorable compared to large contemporary clustered solutions, many industries would benefit from FPGA systems – especially financial services, life sciences and disease research, national security, and likely many more. For instance, in the financial space, where large clusters are routinely deployed for use cases such as high-frequency trading and compliance, an FPGA-based system can provide extreme leaps in performance where every split-second (or split-nanosecond!) counts when it comes to making a trade, meeting compliance targets, and, of course, making money.
The Future of FPGAs
The computing world in general is coming to the realization that today’s sequential architectures aren’t cutting it. They don’t meet performance needs, they are too cumbersome to deploy, program, and maintain, and at scale they are too costly. In fact, we’ve known this for quite some time in a variety of industries, and many have already attempted – with mixed results – to move away from traditional architectures and try something new. A great example is the growing use of clustered GPUs. GPUs were originally designed to offload graphics-intensive operations from traditional software running on traditional CPU architectures, but have recently found their place in a variety of high performance computing applications, including some data analytics.
Yet they still leave much to be desired because, in the end, they too are sequential architectures at their cores. It’s quite clear that contemporary clustered systems can’t keep up with current data growth, so we can expect to see a greater migration away from these legacy platforms to new solutions that can deal with the data explosion. If history repeats itself, and it usually does, we will see workloads offloaded from those contemporary architectures onto FPGA-accelerated systems.
The early adopters of FPGA-enabled systems remain organizations that have true big data problems where performance matters ‘now’. There’s an old adage in the commercial real estate space that the top three things that matter are location, location, and location. Well, in the high performance data analytics space, traditional CPU clusters (and even GPU clusters) just can’t keep up with the data explosion, and so in the end what matters is performance, performance, and performance. FPGA-based systems, when developed and architected properly, deliver that performance in spades.
Early FPGA adoption in 2015 and into 2016 will likely stem from forward-thinking individuals throughout industry who see that the performance walls of traditional computing architectures are real and getting worse given the data explosion. In fact, in the financial services and national security industries, there is already wide acceptance of the idea that traditional computing cluster architectures are incapable of handling today’s real-time and near-real-time data processing requirements.
As adoption grows, an interesting milestone to watch for will be the marriage of traditional CPU and FPGA resources at the silicon level. FPGA vendors have already begun to combine Advanced RISC Machine (ARM) CPUs and FPGA fabric in the same silicon, and similar convergence has occurred in the GPU and even DSP spaces. Intel Corporation’s recent announcement of its acquisition of Altera, a well-known and well-respected FPGA vendor, is another great example of this inevitable and fundamental change in the nature of computing: leveraging the right hardware at the right time for the right problem.
Another interesting milestone to watch for is the direct support in Hadoop or Spark for hardware acceleration functions using open APIs for real-world problems. Some of this work has already been done for GPU-based offload, and there’s a good chance we’ll eventually see the same for FPGA-based offload.
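If and when that support appears, the integration could be as thin as a partition-level handoff. The sketch below is purely illustrative: the accelerator endpoint, its request/response format, and the HDFS path are all assumptions, and the glue code stands in for whatever native offload hooks Hadoop or Spark might eventually expose.

```python
# Hypothetical sketch: pushing a filter step out of Spark to an accelerator
# service behind a REST endpoint. The endpoint and its contract are assumed;
# today this glue would be hand-written, since no standard FPGA offload API
# exists in Spark.
import requests
from pyspark.sql import SparkSession

ACCEL_URL = "http://analytics-appliance.local:8080/api/v1/filter"  # hypothetical

def offload_partition(rows):
    """Send one partition's records to the accelerator and yield the matches."""
    batch = [row.asDict() for row in rows]
    if not batch:
        return
    resp = requests.post(ACCEL_URL,
                         json={"records": batch, "pattern": "ERROR"},
                         timeout=60)
    resp.raise_for_status()
    for match in resp.json().get("matches", []):
        yield match

spark = SparkSession.builder.appName("fpga-offload-sketch").getOrCreate()
logs = spark.read.json("hdfs:///logs/2015/*.json")   # hypothetical input path
matches = logs.rdd.mapPartitions(offload_partition).collect()
print(f"{len(matches)} matching records")
```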
Adoption of FPGAs for big data analytics is moving fast, as more big brands introduce products and users see the overwhelming benefits. There is no stemming the flow of data – social media, IoT, and more mean that organizations want answers and they want them immediately, not next week or next month. Until now, however, solutions haven’t allowed for fast, accurate data analysis, so decision makers have been forced to act on outdated and inaccurate data, which can harm the business. The development of fast, scalable, and easy-to-use FPGA-accelerated systems gives organizations the ability to glean faster value from their data through real-time, actionable intelligence.
Pat McGarry is Vice President of Engineering with Ryft Systems. He has over 20 years of hands-on experience in hardware and software engineering, computer networking, and managerial roles for a variety of technology-related disciplines, and bachelor’s degrees in both Computer Science (BSCS, Virginia Tech, ’93) and Electrical Engineering (BSEE, Virginia Tech, ’94). Ryft is a top provider of high performance data acceleration hardware to the world’s most sophisticated users of Big Data.
Image Credit: Travis Goodspeed / Chris’s FPGA Tic-Tac-Toe / CC BY 2.0