SAS – Dataconomy: Bridging the gap between technology and business

Keep it real — say no to algorithm porn! (22 May 2017)

For people in the know, machine learning is old hat. Even so, it’s set to become the data buzzword of the year — for a rather mundane reason. When things get complex, people expect technology to ‘automagically’ solve the problem. Whether it’s automated financial product consultation or shopping in the supermarket of the future — machine learning is the answer. Data scientists are jumping on the bandwagon, trying to outdo each other in the race for the coolest algorithm. But is algorithm porn bringing us progress, or just a lot of showboating?

SAS Forum Germany in Bonn: how machine learning is used in practice / special offer for Dataconomy readers

Machine learning is no magic bullet. In fact, what’s behind it is basically conventional analytics technology. Analytic models are trained on example data sets. The training can be supervised, for instance by specifying the desired output value, such as the risk class of a bank customer, alongside the input – in this case master data, demographics, and past transactions. Another example would be providing an error category as the output, with maintenance reports as the input. Unsupervised learning, in contrast, is used to find new patterns in data and learn to distinguish categories.

In other words, the system learns from examples and is able to generalize once the learning phase is complete. What happens here is not simple memorization of the examples, but the recognition of patterns and regularities in the example set. This allows the system to correctly assess previously unseen instances by transferring what it has learned.
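To make the supervised case concrete, here is a minimal sketch in Python with scikit-learn: a classifier is trained on a handful of labelled customer records and then scores a customer it has never seen. The feature set, data, and library choice are illustrative assumptions, not a description of SAS’ tooling.

    from sklearn.ensemble import RandomForestClassifier

    # Invented training records: [age, income in thousands, past defaults],
    # each labelled with the risk class the model should learn.
    X_train = [
        [25, 30, 2], [40, 85, 0], [35, 60, 1],
        [55, 120, 0], [22, 25, 3], [48, 95, 0],
    ]
    y_train = ["high", "low", "medium", "low", "high", "low"]

    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_train, y_train)

    # A previously unseen customer: the model transfers what it has learned
    # rather than looking the answer up - this is the generalization step.
    print(model.predict([[30, 40, 2]]))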

So machine learning helps us develop good models. But a data scientist is still needed to get those models ready for real-world use.

Consider, for example, the maintenance routine for a CT machine that needs to be optimized to reduce downtime. First, good models are needed that can take sensor data and event codes and predict the probability of a component failure with high accuracy and minimal false alarms. Machine learning can help here.

The next step is operationalizing, which involves business rules that pair analytic predictions with recommended actions. What should I do if the probability of the motorized patient bed failing is high? How fast do I need to respond if the customer has a premium service agreement? How does the procedure differ if the device is located in a hospital versus a radiology clinic?
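Such rules are straightforward to express in code. Here is a minimal sketch (in Python, with hypothetical thresholds, service tiers, and actions; not an actual SAS rule engine):

    # Hypothetical rule layer pairing a failure probability with an action.
    def recommend_action(failure_prob, service_tier, site):
        if failure_prob > 0.8 and service_tier == "premium":
            # a premium service agreement demands the fastest response
            return "dispatch technician within 4 hours"
        if failure_prob > 0.8:
            return "schedule a service visit this week"
        if failure_prob > 0.5:
            # a hospital bed out of action is costlier than one in a clinic
            return "stage spare part on site" if site == "hospital" else "monitor daily"
        return "no action"

    print(recommend_action(0.85, "premium", "hospital"))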

The application of the models and the rules must then be continuously monitored. This requires model governance, which ensures auditability and an efficient process for registering models. It also enables automatic accuracy evaluation of the statistical models and sends out an alert if an analytic model needs to be replaced. And that is something that the data scientist takes care of, not the device technician.
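As a rough illustration of that monitoring loop, the sketch below re-scores a deployed model on fresh labelled data and raises an alert when accuracy falls below a registered floor. The threshold value and the alert channel are assumptions for the example.

    from sklearn.metrics import accuracy_score

    ACCURACY_FLOOR = 0.90  # hypothetical threshold agreed at model registration

    def check_model(model, X_recent, y_recent, alert):
        # re-score the deployed model on fresh labelled data
        accuracy = accuracy_score(y_recent, model.predict(X_recent))
        if accuracy < ACCURACY_FLOOR:
            alert("model accuracy dropped to %.1f%%; consider replacement"
                  % (100 * accuracy))
            return False
        return True

    # e.g. check_model(model, X_recent, y_recent, alert=print)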

The procedure described above illustrates why machine learning by itself is no magic bullet. In the real world, what counts is the professional integration of analytics into business processes. The goal is not to have the coolest algorithm. Well, at least it’s not the only goal.

Keep it real — say no to algorithm porn!

At the 2017 SAS Forum Germany on June 29 in Bonn, machine learning will be one of the featured topics — presented from the perspective of both data scientists and the enterprise. Take a look at the program. Use the attractive discount to attend the conference for just €180 (instead of €380). Just select the “Special offer” when you register and then enter the code DCY-SF17 at the end of the registration process.

 

Register now!

 

Like this article? Subscribe to our weekly newsletter to never miss out!

Image: TxDonor

Top 10 Big Data Videos on YouTube (19 August 2015)

Whether you’re entirely new to the field of big data, or looking to expand your machine learning knowledge; whether you have 3 hours or 3 minutes; whether you want to know more about the technology, or the high-level applications – this list is a sample of the best YouTube has to offer on big data. Hit play, and enjoy a time- and cost-effective way to continue your big data exploration.

(A note on selecting the videos: we took the most viewed videos in the “Science and Tech” and “Education” categories on YouTube on the subjects of big data, data science, machine learning and analytics. We then used a formula that weighs rates of engagement (likes and dislikes) against views. Got a great video that you think is missing from the list below? Let us know in the comments.)

1. Kenneth Cukier: Big Data is Better Data


It’s no surprise that a video from the ludicrously popular TED Talks has topped our list. This high-level talk on the nature of big data was filmed in Dataconomy’s hometown of Berlin, and explains why big data is better. As Cukier says, “More data allows us to see new. It allows us to see better. It allows us to see different”. It also introduces some applications, such as cancer cell detection and predictive policing- a compelling and easily digestible introduction for the uninitiated.

2. The Future of Robotics and Artificial Intelligence (Andrew Ng, Stanford University, STAN 2011)


Machine learning expert, Stanford Professor and Coursera Founder Andrew Ng discusses his research in robotics, covering fields from computer vision to reinforcement learning (including an autonomous helicopter), and the dream of building a robot that could clean your house.

3. Google I/O 2012 – SQL vs NoSQL: Battle of the Backends


In this video, Google employees Ken Ashcraft and Alfred Fuller battle it out to decide whether SQL or NoSQL is better for your application. Topics battled out include queries, transactions, consistency, scalability and schema management. Recommended viewing if you’re unsure whether relational or non-relational models would be better for your application.

4. Introduction to NoSQL by Martin Fowler


Still intrigued by the prospect of NoSQL databases? This 50-minute talk by Martin Fowler of ThoughtWorks offers a rapid and comprehensive overview of NoSQL. This talk takes us through where NoSQL databases came from, why you should consider using them, and why SQL vs. NoSQL is not a ‘death or glory’ battle.

5. Big Data and Hadoop- Hadoop Tutorial for Beginners


This video is a free recording of a tutorial from a monetised, interactive course by Edureka! If you’re looking to delve a bit deeper beyond the high-level talks and into actually learning about the mechanisms of Hadoop, this video’s a pretty good place to start. You’ll learn about nodes in Hadoop, how to read from and write to HDFS, and data replication. If you like what you see, Edureka! have hours of free Hadoop tutorials available here.

6. Explaining Big Data Lecture 1 | Machine Learning (Stanford)


This video is the first part of Stanford’s incredibly popular machine learning video lecture course. These lectures formed the basis of Andrew Ng’s Coursera course on machine learning, and feature extra content which was omitted from the 10-week Coursera tutorial for the sake of brevity. If you want to look into machine learning without spending a penny, this is the video series for you.

7. Brains, Sex, and Machine Learning


Co-inventor of Boltzmann machines, backpropagation, and contrastive divergence and all-round machine learning god Geoffrey Hinton is one of the best people in the world to guide you through the world of neural networks. This talk focuses on how machine learning provides explanations for puzzling phenomena in biology- including the field of sexual reproduction, hence the intriguing title.

8. What is Hadoop?


Intricity attempt to answer in three minutes a question which has made people outside the realm of data science tremble in fear for years – what exactly is Hadoop? If you find this video enlightening, you may want to check out their other short videos, which demystify terms such as “data governance”, “OLAP” and “Metadata Management”.

9. Big Ideas: How Big is Big Data?


This list is certainly dominated by clips for beginners- and it doesn’t get more back-to-basics than this video from EMC. If you think “big data” is defined by size, hit play and prepare to be enlightened.

10. Big Data Analytics: The Revolution Has Just Begun


SAS presents a talk from Dr. Will Hakes of Link Analytics, giving an extensive, high-level talk on the way big data analytics is changing- and will continue to change- the business intelligence landscape.

If you have another great big data video tip, be sure to let us know in the comments!

(Featured image credit: TED)

15 IT & Big Data Titans Come Together to Establish the US Open Data Platform (27 February 2015)

IT companies and business heavyweights are joining forces in a shared industry effort focused on promoting and advancing the state of Apache Hadoop and Big Data technologies for the enterprise.

Introducing the Open Data Platform Initiative (ODP), which will, in essence, help enterprises enhance business value globally with Hadoop. Founding outfits Pivotal and Hortonworks are participating with GE Software, IBM, Infosys, SAS, AltiScale, Capgemini, CenturyLink, EMC, Splunk, Verizon Enterprise Solutions, Teradata, and VMware in this effort.

Gavin Sherry, VP of Engineering, Data, at Pivotal writes: “Our goal is to create a standardized core platform of compatible versions of Apache Software Foundation projects that provide a stable base against which Big Data solutions providers can qualify solutions. This will facilitate compatibility and ease interoperability across the Big Data ecosystem while enabling a new level of choice for enterprises.”

In line with ASF guidelines the effort will enable collaboration between vendors and end users of Big Data technology.

Going by the annual TechTarget/Computer Weekly IT Spending Priorities survey for 2015, about 30% of respondents globally said they were undertaking big data initiatives this year, compared to 17% in 2014. In Europe the fraction was 26%, and in the UK 21%, Computer Weekly pointed out.

Navin Budhiraja, head of architecture and technology at Infosys, noted that the ODP ecosystem would preserve “the rapid innovation cycles of open source software, while still providing the benefits of broad vendor support and interoperability.”

The announcement came last week through Pivotal’s blog post, where the initiative’s nuances are detailed further.


(Image credit: Andy Wilkinson, via Flickr)

MapR and SAS Collaborate Around Business Analytics and Enterprise Hadoop (19 December 2014)

MapR Technologies, Inc., the enterprise software provider that specialises in distribution for Apache™ Hadoop®, has entered into an alliance with business analytics software innovator SAS to combine powerful SAS analytics with production-ready Hadoop. MapR and SAS intend to carry out Hadoop integration, provide support for joint customers, and tap future go-to-market opportunities.

“SAS is committed to helping customers leverage the power of advanced analytics on their big data. The use of SAS analytics software with MapR will help data scientists and analysts gain new business insights from their Hadoop-based data,” explained Elishia Rousos, SAS senior manager, alliance management.

“Through this alliance, customers benefit from the ability to make better business decisions faster through the market-leading analytics of SAS and the advanced architecture and performance of MapR,” said Steve Campbell, director of business development, MapR Technologies.

Through this partnership, SAS’ powerful business analytics combined with MapR’s advanced distributed data platform will provide customer organizations with the valuable insights needed to make the best decisions possible for competitive advantage, explains the statement announcing the alliance.


(Image credit: MapR)

How Ford Uses Data Science: Past, Present and Future (18 November 2014)

Success stories of how data-driven practices can revitalise businesses are rife today, but there are few as compelling as the story of Ford. In 2006, the legendary car manufacturers were in trouble; they closed the year with a $12.6 billion loss, the largest in the company’s history. As we reported earlier in the year, through implementing a top-down data-driven culture and using innovative data science techniques, Ford was able to start turning profits again in just three years. I was recently lucky enough to speak with Mike Cavaretta, Ford’s Chief Data Scientist, who divulged the inside story of how data saved one of the world’s largest automobile manufacturers, as well as discussing how Ford will use data in the future.


As an overview, how do Ford use data science?
So at the moment, we’re primarily trying to break down our data silos. We have a number of projects that are using Hadoop, and we’re actually setting up our Big Data Analytics Lab, where we can run our own experiments and have a look at some of the more research questions.

Back in 2006/07, Ford was having a downturn. Since then, it’s dramatically turned things around. What role did data science play in this?

Thanks for that question, and thanks so much for phrasing it as “data science” and not “big data”. I think at this point in time, “big data” has come to mean so many things to so many people, I think it’s better to focus on the analytical techniques, and I think data science does a pretty good job of narrowing in on that.

So back to 2006-2007- that was around the time Alan Mulally was brought on. He brought with him this idea that important decisions within the company had to be based on data. He forged that from the very beginning, and from the top down. It really didn’t take a long time for people to realize that if the new CEO is asking, “Hey where is the data you are basing your decision on?”, you’d better go out and find the data, and have a good reason why that data matters to this particular decision.

So, it became apparent quickly that we needed people who could manipulate the data. We didn’t call it “data science” at the time, but being able to bring data to bear against different problems became of primary importance.

The idea was that the roadmap really needed to be based on the best data that we had at that time, and the focus was not only good data and analysis, but also being able to react to that analysis fast.

So an 80% solution would allow us to move quickly, and benefit the business more than a 95% solution where we missed the decision point. I think there were a lot of benefits to being able to bring these methods, ideas and data-driven decisions using good statistical techniques. This approach helps to build your credibility, as you’re able to bring great results with good timing- it just worked out well.

What technologies were you using?

At the time, the primary technologies we were using were really more on the statistical side, so none of the big data stuff- we were not using Hadoop. The primary database technologies were SQL-driven. Ford has a mix of a lot of different technologies from a lot of different companies- Microsoft, Teradata, Oracle… The database technologies allowed us to go to our IT partners and say “This is the data that is important, we need to be able to make a decision based on this analysis”- and we could do it. On the statistical side, we did a lot of stuff in R. We did some stuff with SAS. But it was much more focused on the statistical analysis stuff.

What technologies have you since added?

So I think the biggest change from our perspective is a recognition that the visualization tools have got much better. We are big fans of Tableau and big fans of Qlikview, and those are the two primary ones we use at Ford.

We’ve done a lot more with R and we’re currently evaluating Pentaho. So we’ve really moved from more point solutions for solving particular problems, to more of a framework and understanding different needs in different areas. For example, there may be certain times when SAS is great for analysis because we already have implementations, and it’s easier to get that into production. There are other times when R is a better choice because it’s got certain packages that makes that analysis a lot easier, so we’re working on trying to put all that together.


You’ve now begun to collect data from the cars themselves- what insights has this yielded?

So there’s a good amount of analytics that can be done on the data we collect. It’s all opt-in data- it’s all data that the customers have agreed to share with us. Primarily, they opt-in to find charging stations, and to better understand how their electric vehicles are working. A lot of the stuff we are looking at has to do with how people are using their vehicles, and how to make sure that the features are working correctly.


Ford use text mining and sentiment analysis to gauge public opinion of certain features and models; tell us more about that.

So a lot of the work that we’ve done to support the development of different features, and to figure out what feature should go on certain vehicles, is based on what we call very targeted social media. Our internal marketing customers will come to us and ask us, “We’re thinking about using this particular feature, and putting it on a vehicle”- the power liftgate of the Ford Escape is a good example, the three-blink turn signal on the Ford Fiesta is another one. In those circumstances, we will take a look at what most people think about the features on similar vehicles. What are they saying about what they would like to see? But we don’t pull in terabytes of Twitter and we don’t use Facebook- we go to other sources that we found to be good indicators what customers like. It’s not shotgun blasts, so to speak; it’s more like very specific rifle shots. This gives us not only quantitative understanding- this customer likes it and this customer doesn’t- but also stories that we can put against it. And these stories are usually when the customers are talking with each other. One great story is for the three-blink turn signal when one customer was describing, “So I got the vehicle. I got the three-blink turn signal and I’m not sure whether I like it or not.” And other people were chiming in saying “You know what, I kind of got the same impression, give it another couple of weeks and just think about how you’re using it on the highway and if you give it a couple of weeks you’ll like it.”

The first person signed back on a few days later and said “You know what, you were right, now that I understand how it works and where it should be used- I think I like it now!” It was actually kind of beautiful, and that story we can put in front of people and say “This is the way people are using it, these are some of the things they’re talking about”. So now, we’re not only getting the numbers, but also the story behind it. Which I think is very important.

What can we expect from Ford in the future?

I think the position that we’re in right now is really looking at instantiating the experiments we want to do in the analytics space, linking up the different analytics groups, and really focusing on the way that big data technologies allow us to break down data silos.

This company’s been around for over 100 years, and there’s data in different areas that we’ve used for different purposes. So we’ll start looking at that- and start providing value across the different spaces. We’ve put some good effort into that space and got some good traction on it. I can see that as an area that’s going to grow in volume and in importance in the future.


(Featured Image Credit: Hèctor Clivillé)

Data Scientists in UK Risk Crumbling Under Mounting Stress as Big Data Industry Thrives (10 October 2014)

The new but rapidly evolving field of data science seems to be taking its toll on the “gears” of its machinery. As more organisations focus on unlocking the insight held in ‘big data’, data scientists are under a lot of work-related stress, reveals a psychometric survey launched by SAS.

600 data scientists in the UK and Ireland were found to be “exhibiting high levels of work-related stress”, with 27 percent of male data scientists saying they were “mildly stressed” and 25 percent saying they were “heavily stressed”, reports Steve Ranger of ZDNet. The percentages rose to 28 and 30 percent, respectively, for female data scientists.

There is a gaping need for 69,000 additional big data specialists over the period 2012-2017, and 60 per cent of enterprises are struggling to hire people with data science skills. As the importance of big data analytics to organisations grows, pressure mounts on existing data scientists, some of whom are working in roles unsuited to their personality types and skills.

The survey categorises the following top six personality profiles:

  1. The Geeks made up the largest group of respondents (41%). They have a technical bias, strong logic and analytical skills. They focus on detail. Roles include: defining system requirements, designing processes, and programming.
  2. The Gurus (11%) are pre-disposed to scientific and technical subjects but are persuasive communicators with strong social skills. Roles include: promoting data science benefits to management.
  3. The Drivers (11%) are highly pragmatic and determined to realise their goals. They are self-confident and results-oriented. Roles include: project management and team leadership.
  4. The Crunchers (11%) like routine and display high technical competence. Roles include: technically oriented support roles such as data preparation and entry, statistical analysis and data quality control.
  5. The Deliverers (7%) have technical skills but also bring focus and momentum to ensure project success. Roles include: project and man management with a high level of technical knowledge.
  6. The Voices (6%) generate enthusiasm for data science at a conceptual level with their strong, inspiring communication skills. Roles include: presentation of results to senior business audiences.

“The fast evolution and demanding nature of the data science discipline, and the importance for businesses, highlights the need to build diverse, complementary and highly-skilled teams,” says the report.

“Better definition of roles within data science must be a priority to avoid an analytics talent burn out,” points out Peter Robertshaw, Marketing Director at SAS UK & Ireland.

“Unlocking insights from big data is the challenge of the 21st century and data scientists are a precious resource. Organisations must recognize the need to create teams that are technically proficient, mathematically gifted, business savvy as well as being great communicators,” he added.

Organisations need to address the causes of this stress and the individual characteristics of their data scientists in order to reduce it. Combined with an industry effort to develop the range of skills needed, and a push to attract more people into this vital, evolving discipline, this, Robertshaw believes, could be the solution.

Read more here

(Image Credit: Eamon Curry)

The History of BI: The 2000’s and Now (19 July 2014)

Our three part Business Intelligence series has looked at the key developments in BI from the 1960’s all the way to the late 1990’s.

In the first edition, we focused on the way data storage changed from hierarchical database management systems (DBMS), like IBM’s IMS in the 60’s, to network DBMSs and then to relational database management systems (RDBMS) in the late 70’s.

The second part of the series investigated the technological advancements through the 80’s and 90’s, predominantly mapping the evolution from mainframes to personal computers, from DBMSs to RDBMSs, and the emergence of new methods and tools like Data Warehousing, Extract Transform Load (ETL), and Online Analytical Processing (OLAP).

In this edition, we will take a brief look at how BI transitioned from a tool based, IT-centric activity, to one that is now accessible to technical and non-technical users alike.

The transformation of BI 1.0.

BI 1.0 refers to an era of BI that existed through the late 1990’s and early 2000’s. With the advent and development of data warehousing, SQL, ETL and OLAP, data was consolidated into a unified system and queries could be written to extract data from many tables at once, ultimately helping companies access and store their data more effectively.
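To see what that looked like in practice, here is a toy illustration in Python, with SQLite standing in for a real warehouse and invented tables and columns: a single SQL query joins and aggregates across several tables at once.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
        INSERT INTO orders VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 200.0);
    """)

    # One query pulls consolidated numbers from both tables at once.
    for region, total in conn.execute("""
        SELECT c.region, SUM(o.amount)
        FROM orders AS o JOIN customers AS c ON o.customer_id = c.id
        GROUP BY c.region
    """):
        print(region, total)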

At its core, BI 1.0. could be distilled into two components: data and reports, or aggregation and presentation. As Neil Raden, Principal Analyst at HiredBrains Research commented, “most of the effort in BI…[was] focused on data integration, data quality, data cleansing, data warehouse, data mart, data modelling, data governance, data stewardship.”

However, within this period, the major problem with BI projects was that they were still owned by IT departments, data was siloed, and reports often took extended periods of time to be delivered to management. BI solutions were predominantly designed for an analytics-trained minority and those who were already capable of understanding data models.

BI 2.0 and the Influence of Web 2.0: Mid-2000’s

The mid-late 2000’s marked a significant step forward for BI as it entered its acclaimed 2.0 phase; it went far beyond simple data and reporting by integrating near real-time processing, collaboration, self-service, discoverability, as well as offline and online access.

Whereas BI 1.0 centered mostly around the refinement of different tools – the aforementioned data warehouse, OLAP, and ETL technologies – BI 2.0 focused mostly on using the connectivity of the Web to create a BI environment that would encourage access, flexibility and getting the right data to the right people.

Many of these changes were influenced by the direction that the Web began to take in the early 2000’s (often dubbed “Web 2.0”) with social networking and web applications. One significant example was the arrival of platforms like Facebook, Twitter, and even Google, where the consumer became an important source of critique – anyone could exchange opinions on widely accessible sites, as well as gain instant information on competitors.

Businesses in the mid-2000’s therefore required access to immense amounts of real-time information in order to, among other things, track customers’ reactions to their products, what their competitors were offering, and, with the advancement of mobile and tablet technologies, the best interfaces on which to approach their consumers. In other words, the “new” Web environment demanded a simultaneous reconstitution of BI technologies that emphasized agility, dynamism, and immediacy.

BI 2.5 and the Democratization of Data: 2000’s – Current

After this explosion of data throughout the 2000’s, businesses in our current environment now require visualization tools – interactive dashboards, bar graphs, animations – to effectively analyse the information coming from inside and outside of the organization. BI is becoming jointly governed by IT and business users themselves, and is aimed at empowering the ‘Data Explorer’ through content delivery and creation. The emergence of visualisation tools and other techniques means that BI uptake across the organisation is rising, giving business users the ability to independently explore their data.

Huge IT companies like Oracle, IBM, SAP, SAS and Microsoft, as well as other companies like Tableau, Birst, Qlikview, Tibco Jaspersoft and SiSense (the list continues), are all competing to make data easier to store, more accessible across devices, and processable at a speed like never before.

As such, the battle between BI companies today is to provide speed, affordability, and high-capacity storage. With mobile technology and PCs generating such incredible amounts of data – estimated at 2.5 quintillion bytes a day – companies are no longer looking for access alone. Rather, the data has to be accessed at breakneck speeds across all devices, instantly analysable and stored in a cost-effective manner.

It is no surprise, therefore, that the BI market is expected to reach $20.8 billion by 2018, at an estimated CAGR of 8.3%, of which $4 billion is expected to come from cloud-based BI. The same goes for data visualisation, with forecasts suggesting the market will grow at a CAGR of 9.21% to reach $6.40 billion by 2019.

Whether BI is set to enter yet another phase – BI 3.0? – and what it will look like is as yet undetermined. But as Brian Gentile suggests, the fierce competition among BI vendors may already have reached its tipping point:

“We joke about it inside TIBCO Jaspersoft ‘here’s the new competitor of the week’. Everyone apparently thinks that they can do analytics because that’s what it looks like. While this will clearly generate some good ideas, not all these companies are going to make it. Many of them are going to fail, or be acquired, and so on along the path.”

With this in mind, the History of Business Intelligence series has come to a conclusion. However, we will now look at dispelling the myths around BI, compare vendors against one another and offer guidance as to which technologies are right for your specific business needs.


Furhaad Shah – Editor

Furhaad worked as a researcher/writer for The Times of London and is a regular contributor for the Huffington Post. He studied philosophy on a dual programme with the University of York (U.K.) and Columbia University (U.S.). He is a native of London, United Kingdom.

Email: furhaad@dataconomy.ru


Interested in more content like this? Sign up to our newsletter, and you won’t miss a thing!


(Image Credit: M.A. Cabrera Luengo)

PREVIOUS ENTRIES:

The History of BI: The 1960′s and 70′s

This is the first edition to a three part series and gives a brief overview of the history of business intelligence. Starting in the 1960’s and 70’s, the article looks at the advancements made in data storage, database management systems, and companies that were pioneering BI from the early stages.

The History of BI: The 1980′s and 90′s

The second edition of our business intelligence series takes a deeper look at the transition from DBMSs to RDBMSs, and the emergence of Data Warehousing, ETL, and OLAP. The 1980’s and 90’s were revolutionary in many aspects for BI, and ultimately transformed the way businesses extracted value from their data.

Statistical Language Wars [INFOGRAPHIC] (9 June 2014)

The use of statistical language tools like R, SAS and SPSS is growing at unprecedented rates. Forums are full of questions about which program to learn, whether one is easier than the others, how popular each is, and so on. But what is the current state of these programming languages? Which companies are using them? How easy is R to learn in comparison to SPSS or SAS? How marketable are they?

The infographic below gives a nice overview of these questions, comparing R, SAS and SPSS against one another.

(Infographic: R vs. SAS vs. SPSS – companies using them, ease of learning, and marketability)

 

(Image Credit: Tim Lucas)

SAS Expand Their Range of In-Memory Analytics for Hadoop (15 May 2014)

Over the past two months, SAS have been staking their claim for in-memory analytics for Hadoop. They have been providing data scientists and analysts with fast, powerful new tools for advanced analytics. First, there was SAS In-Memory Statistics for Hadoop, allowing users to build, explore and model data, all within the Hadoop framework. Now, they’re launching SAS Visual Statistics, which will allow users to build and modify predictive and prescriptive models on large volumes of data and visualise them, with SAS’ fast in-memory capabilities.

SAS’ speed lies in its in-memory framework, which bypasses the traditional (and much more time-consuming) MapReduce framework, and stamps out costly data movement. “The problem with MapReduce is one node can’t talk to another,” states Wayne Thompson, chief data scientist at SAS. “If I’m trying to do analytic computation, and I’m trying to minimize some kind of residual–some kind of mean square estimate or something–those nodes can work together to reach the solution, so that’s a big advantage to why we’re quicker.”

Comparisons have been drawn between SAS’ products and other offerings such as ApacheSpark, which also relies on In-Memory Processing. However, Thompson remains confident in SAS’ unique capabilities. “The advantage of SAS is we can do all that without moving outside of Hadoop,” he says. “We can read the data just once into memory and go through that full analytical lifecycle. No other vendor is doing that. There’s some open source work going on out in California with the Spark project that has some of these kinds of capabilities, but none of the other vendors are doing this.”
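The iterative point can be illustrated with the open-source stack mentioned above. The sketch below uses PySpark (explicitly not SAS’ engine) to cache a toy data set in memory once and reuse it on every pass of an iterative least-squares fit; this is exactly the kind of workload where re-reading data from disk on each MapReduce pass hurts.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("in-memory-sketch").getOrCreate()
    sc = spark.sparkContext

    # Toy data following y = 2x; cached so every iteration below reuses
    # the in-memory copy instead of re-reading it from disk.
    points = sc.parallelize([(i / 1000.0, 2.0 * i / 1000.0) for i in range(1000)])
    points.cache()

    w = 0.0
    for _ in range(25):
        # gradient of the mean squared error for the model y ~ w * x
        grad = points.map(lambda p: (w * p[0] - p[1]) * p[0]).mean()
        w -= 0.5 * grad

    print(w)  # approaches 2.0
    spark.stop()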

The Visual Statistics package has many similarities with SAS’ Visual Analytics package, released 18 months ago and already used by over 1,300 sites. SAS state the previous Visual Analytics package is tailored for data analysts, whilst the new Visual Statistics package is tailored for data scientists. Visual Statistics also features a new ‘Group By’ tool, which lets users build a second, separate model within seconds, grouped by chosen parameters (say, purchase affinity with a promotion in a particular store).
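To give a feel for the group-by modelling idea, here is a hedged sketch with pandas and scikit-learn standing in for Visual Statistics; the stores, discounts, and sales figures are invented. Each group gets its own model in a single pass over the data.

    import pandas as pd
    from sklearn.linear_model import LinearRegression

    # Invented data: sales observed at different discount levels in two stores.
    df = pd.DataFrame({
        "store":    ["A", "A", "A", "B", "B", "B"],
        "discount": [0.0, 0.1, 0.2, 0.0, 0.1, 0.2],
        "sales":    [100, 110, 125, 200, 230, 255],
    })

    # One separate model per group - the 'Group By' idea in miniature.
    for store, grp in df.groupby("store"):
        fit = LinearRegression().fit(grp[["discount"]], grp["sales"])
        print(store, fit.coef_[0])  # estimated promotion lift for this store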

SAS has been a leader in the field of advanced analytics since its foundation in 1976. It’s good to see older, more established companies innovating with newer technologies such as Hadoop and capitalizing on the in-memory gap in the market.

Read more here.
(Photo credit: Software Insider)

]]>
https://dataconomy.ru/2014/05/15/sas-expand-range-memory-analytics-hadoop/feed/ 1