Neo4J – Dataconomy

Neo Technology Proves Demand and Mainstream Adoption of Neo4j with Latest $20M Funding (16 January 2015)

Graph database innovator Neo Technology has raised $20 million in Series C funding, it announced yesterday.

The fresh funding will help the company further develop its graph database Neo4j, fuel growth in the open source community, and expand its outreach globally to address the ‘increasing demand for graph databases,’ according to its news release.

“We have followed Emil and the Neo team over many years as they have built their leading position in graph databases. We are thrilled and honored to now be on board this exciting journey,” explains Johan Brenner, General Partner at Creandum.

“There are two strong forces propelling our growth: one is the overall market’s increasing adoption of graph databases in the enterprise. The other is proven market validation of Neo4j to support mission-critical operational applications across a wide range of industries and functions,” said Emil Eifrem, Neo Technology’s CEO and co-founder.

The company believes demand for its graph database has increased, as evidenced by more than 500,000 downloads since the launch of Neo4j 2.0 last year, along with “thousands of production deployments, a thriving community of developers worldwide and record turnout for Neo’s GraphConnect San Francisco 2014 conference,” the firm said.

Both market veterans and startups use the product: Walmart, eBay, EarthLink, CenturyLink and Cisco, as well as startups such as Medium, CrunchBase, Polyvore and Zephyr Health, all incorporate Neo4j.

The investment was led by Creandum together with Dawn Capital, with participation from existing investors Fidelity Growth Partners Europe, Sunstone Capital and Conor Venture Partners. Neo Technology also noted that veteran venture capital investor Johan Brenner of Creandum will join its Board of Directors.


(Image credit: Screenshot of Neo4j’s graph viz by Karsten Schmidt)

Discovering the Power of Dark Data (15 December 2014)

‘Big data’ has become an industry buzzword amongst today’s business leaders. Corporate spending on infrastructure to capture and store diverse volumes of rapidly-changing data has risen significantly in recent years as organisations have scrambled to collect all of the consumer information they believe will help them stay ahead of the competition. However, it’s becoming clear that collecting data alone isn’t enough. Indeed, this year’s Gartner Hype Cycle Special Report cites big data as moving beyond the peak of inflated expectations, with businesses now beginning to see that data must not only be harvested and stored, but also analysed and mined efficiently for new insights if it is to be of any strategic value. However, with so much data now being collected, how do organisations know what is useful and what isn’t, and how does one make the insights actionable?

Typically, most organisations focus their data analysis efforts on transactional data – the information customers supply when they purchase a product or service – because they perceive it to be the most valuable. This typically includes names, addresses, credit card information etc. However, in the course of collecting transactional data, large amounts of additional customer information are also accumulated as a byproduct. This non-transactional data is commonly referred to as “dark data”, which Gartner defines as ‘information assets that organisations collect, process and store during regular business activities, but generally fail to use for other purposes.’

What is Dark Data?

Dark data can consist of various insights such as which marketing pieces a specific individual responded to, on which platform they answered a questionnaire, or what they’ve said about an organisation or brand on social media. Dark data can also include customer purchase history, frequency of website visits or geographical spread of customers etc.

While it can appear obscure and unhelpful, if approached in the correct way, dark data can reveal all kinds of patterns and insights that would otherwise have been missed. In short, it is information that can really make a difference if interpreted correctly.

One key to unlocking dark data’s secrets lies in the ability to understand the relationships between seemingly unrelated pieces of information. The way that data is stored plays a critical role in this. Traditional relational databases, and indeed even many big data technologies, simply aren’t designed to show relationships and patterns between data records. You may be able to unearth some connections at a very high level, but the results will be extremely slow and lack real definition. It’s the difference between understanding that two people living in one house are married, siblings or flatmates, and then going a step further to predict how those differences might influence their decisions.

Thanks to advances in technology over the last few years, deriving business value from dark data is now a real possibility. Broadly speaking, the recipe for doing this involves three steps:

  1. Discovering the hidden patterns – This requires technology infrastructure to store and crunch data, and data scientists to ask the questions. Key technologies here are your usual bulk analytic suspects: Hadoop (increasingly with Spark over MapReduce as a way of processing data), Splunk, SAS, etc.
  2. Developing hypotheses – Also in the domain of off-line analytics and data science, developing hypotheses combines various forms of forward (A/B) and back testing.
  3. Putting your newfound insights to use – As the new algorithms developed above make the rules more complex, older technologies lose the ability to execute in real time. As a result, new technologies such as graph databases like Neo4j are needed to run the algorithms at the appropriate junctures, and with the right level of timeliness, to have their effect on the business.

There is quite a lot of discourse aimed at the first two activities, which belong to the realm of data analysis. The third, it turns out, is the critical ingredient that makes insights actionable. And it’s not a negligible problem. The more intricate and subtle the algorithms coming out of the analytics processes become, the more pressure is exerted on the operational systems. Take the example of an e-commerce recommendation. The golden recommendation might very well require combining up-to-the-second information about the products in someone’s shopping cart, the products they’ve browsed and bought, and then examining what other people in their situation have bought in the past in similar product categories. These kinds of real-time, multi-hop recommendation algorithms are usually where relational databases either crumble or get unjustifiably expensive.
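As a rough illustration, such a multi-hop recommendation can be expressed as a single graph pattern. The Cypher sketch below assumes a hypothetical model in which customers BOUGHT products that are IN_CATEGORY of a category; the labels, relationship types and property names are illustrative, not a prescribed schema.

    // Hypothetical model: (:Customer)-[:BOUGHT]->(:Product)-[:IN_CATEGORY]->(:Category)
    // Recommend products that similar customers bought in the categories
    // this customer already buys in, excluding anything they already own.
    MATCH (me:Customer {id: 'c-42'})-[:BOUGHT]->(p:Product)-[:IN_CATEGORY]->(c:Category),
          (other:Customer)-[:BOUGHT]->(p),
          (other)-[:BOUGHT]->(rec:Product)-[:IN_CATEGORY]->(c)
    WHERE NOT (me)-[:BOUGHT]->(rec)
    RETURN rec.name AS recommendation, count(DISTINCT other) AS strength
    ORDER BY strength DESC
    LIMIT 5

Because the traversal simply follows relationships from node to node, a graph database can typically answer a query like this in real time, which is exactly the property the operational step of the recipe depends on.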

The result is an increasingly common phenomenon called “polyglot persistence”, where new database technologies such as graph databases (which are ideal for solving, among other things, the recommendation problem above) are used alongside existing systems to solve specific high-value problems.

Because many of the insights yet to be made rely on a deeper understanding of complex causalities (else they’d have already been discovered), it’s no surprise that new technologies are needed. More and more businesses are discovering graph databases as a powerful enabler of the real-time execution engine that brings insights in data, both light and dark, to life.

Dark data isn’t just useful for customer insights either. It can be equally useful when applied to employees. For example, Gate Gourmet, an airline industry catering provider, was struggling to lower an unusually high 50 per cent attrition rate amongst its one thousand employees at O’Hare Airport in Chicago. Using dark data already easily accessible in internal systems, such as demographics, salaries and transportation options, the company confirmed its suspicion that the attrition rate was directly related to the distance and transportation options from employees’ homes to the airport. This realisation enabled it to change the hiring process and reduce attrition by 27 per cent. Gate Gourmet did not need to invest a huge amount of money in collecting data to solve its attrition problem. Rather, it needed to look more closely at the data it already had, in a way that revealed patterns and connections between the employees who were staying with the company and those who were leaving.

Although many businesses are not yet leveraging their dark data, the example of Gate Gourmet demonstrates what can happen when they do. While companies will, and must, continue to actively collect data, it is essential not to neglect the information already available at no extra cost. There is a clear need to be more creative, asking new questions of the same old data to surface surprising and valuable results.

The key in monetizing dark data lies not only in gathering it, but in analysing it to discover hidden patterns, developing hypotheses, and then putting the insights to use. Doing this successfully requires a variety of different technologies, each suited to a particular job. By combining data science and number crunching on large-scale analytic technologies, with the real-time execution of complex algorithms by using a graph database, businesses can bring transformative insights to their operational decisions, and combine the latest technologies with their existing data and systems.


Emil Eifrem is CEO of Neo Technology and co-founder of the Neo4j project. Committed to sustainable open source, he guides Neo along a balanced path between free availability and commercial reliability. Emil is a frequent conference speaker and author on NOSQL databases.
 


(Image credit: Jason Eppink)

 

How the Internet of Things Can be Best Explored Using Graph Databases (5 November 2014)

Emil Eifrem is CEO of Neo Technology and co-founder of the Neo4j project. Committed to sustainable open source, he guides Neo along a balanced path between free availability and commercial reliability. Emil is a frequent conference speaker and author on NOSQL databases. In this article, Emil explains how the Internet of Things can be best explored using graph databases.


Morgan Stanley recently predicted there will be 75 billion connected devices in use worldwide by 2020, a number rapidly approaching the estimated 86 billion neurons in the human brain. We know that human intelligence comes not from the sum of neurons in our brains, but from the connections between them and the way they interact with each other. While one could easily take this analogy too far, two points are worth taking away. The first is that it’s the connections as much as the devices that will truly bring forth the latent possibilities in the Internet of Things (IoT). Devices in isolation will do very little. The second is that we’re not just speaking about billions of connections, the IoT will have many, many trillions of connections.

As a result, the Internet of Things should really be referred to as the ‘Internet of Connected Things’. This concept shifts the meaning of ‘product’ to encompass more than just individual tangible objects and enables businesses to consider not just what products are, but what they could become if connected in different ways. Understanding and managing these connections by using the right tools will ultimately be at least as important for businesses as understanding and managing the devices themselves.

Imagination is key to unlocking the value of connected things. For example, in a telecommunications or aviation network, the question “Which cell tower is experiencing problems?” or “Which plane is going to arrive late?” becomes “What is the impact of that problem on the rest of the network?”

The connections between devices and other entities can change faster than the data describing each thing. With telco data, for example, each time you call a new person or authorise a new device, you make a connection. The same is true in an industrial setting: when a new piece of equipment comes online, it will look for the relevant controllers or other devices that it needs to listen to or send data to.

Understanding connections is the key to understanding dependencies and uncovering cascading impacts. Such insight allows businesses to identify opportunities for new services and products that make the most of the IoT. To identify these opportunities, businesses need access to the tools that can show these connections in a quick and easy way.

This is where graph databases come in. Graph databases are essential to discovering, capturing and making sense of complex interdependencies and relationships, both for running an IT organisation more effectively and for building the next generation of functionality for businesses. They are designed to model and navigate networks of data easily and with extremely high performance, which is why they have already proved so popular with social networks such as Facebook and LinkedIn. Now they are increasingly being adopted by forward-thinking companies looking to extract maximum value from the Internet of Things.
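To make the earlier question about impact concrete, here is a minimal Cypher sketch of a dependency query; the Device label, the DEPENDS_ON relationship type and the property names are assumptions for illustration only.

    // Hypothetical model: (:Device)-[:DEPENDS_ON]->(:Device)
    // Everything downstream of a failing cell tower, whether it depends on it
    // directly or through a chain of up to five intermediate devices.
    MATCH (failed:Device {name: 'cell-tower-017'})<-[:DEPENDS_ON*1..5]-(affected:Device)
    RETURN DISTINCT affected.name AS impactedDevice

A variable-length traversal like this is the graph-native way of asking “what else breaks if this breaks?”, a question that is awkward to express as joins over a relational schema.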

Identifying the connections in the Internet of Things using graph databases can bring benefits to a variety of industries. For example, they can enable retail chains to see the waxing and waning of demand for products across geographies, and the devices consumers use to place orders, so that shipments can be re-routed to the stores or warehouses where a particular item is in demand and low in stock. Manufacturers, on the other hand, can use graph data to chart seasonal demand and online buying habits, projecting revenues for the next four quarters based on that data.

Additionally, an insurance provider can use a graph database to recognise the interrelationships between seemingly unconnected people involved in a rash of car accidents across a wide area. A deeper understanding of the data could reveal complex insurance fraud activity involving staged car accidents, which if identified quickly, can save the provider hundreds of thousands of dollars.
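A hedged sketch of what such a fraud-ring query might look like in Cypher, assuming an illustrative model in which people are INVOLVED_IN accidents and may share an address or phone number:

    // Illustrative labels and relationship types only.
    // Flag pairs of people who appear in different accidents yet share
    // personal details, a common signature of staged-collision rings.
    MATCH (p1:Person)-[:INVOLVED_IN]->(a1:Accident),
          (p2:Person)-[:INVOLVED_IN]->(a2:Accident),
          (p1)-[link:SHARES_ADDRESS|SHARES_PHONE]-(p2)
    WHERE a1 <> a2 AND p1 <> p2
    RETURN p1.name, p2.name, type(link) AS sharedDetail, a1.id AS accident1, a2.id AS accident2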

The answer to the complexity and interconnectedness of the IoT is to reduce the wash of data to its common denominators. Managing and understanding the connections is a particularly big challenge. Graphs are a natural way to represent these connections, making graph databases a natural choice for extracting real business value from the Internet of Things. After all, just like neurons in the human brain, true intelligence comes not from knowing the sum of devices, but from understanding the connections that link them together.

(Image Credit: Michael Coghlan)

10 Big Data Stories You Shouldn’t Miss this Week (4 November 2014)

“Where there is data smoke, there is business fire.” – Thomas Redman 

The idea of Data Smoke is quite a brilliant analogy by Thomas Redman. It adequately explains why so much emphasis has been placed on tools, personnel, and culture within companies over the past decade. To understand the quote, however, it’s important to note that this process works backwards: what companies are trying to do is extinguish an already existing fire, rather than prevent a fire from starting at all. Essentially, they are trying to control the incredible amounts of data they already have (the existing “fire”) by investing in new services and tools to reduce the “smoke”.

It’s no wonder that Intel Capital announced this week that it would invest $62 million in innovative technology companies, with a major chunk of this money going to companies working on Big Data and cloud infrastructure. Moreover, we saw an incredible partnership formed at the beginning of the week between IBM and Twitter to amalgamate Twitter’s vast data silos with IBM’s cloud-based analytics, customer engagement platforms, and consulting services. In other news, a new study revealed that big data jobs earn 24% more than other IT positions.

Aside from this, below we have selected a number of our favourite articles this week. We hope you enjoy reading them as much as we did writing them!

TOP DATACONOMY ARTICLES

Which Environment to Choose for Data Science?

In the past, R seemed like the obvious choice for Data Science projects. This article highlights some of the issues, such as performance and licensing, and then illustrates why Python with its eco-system of dedicated modules like Scikit-learn, Pandas and others has quickly become the rising star amongst Data Scientists.

Top Tips for Implementing a Big Data Strategy

Ali Rebaie is a Big Data & Analytics industry analyst and consultant of Rebaie Analytics Group. He provides organizations with a vendor-neutral selection of business intelligence & big data technologies and advice on big data and information management strategy and architecture. We picked his brain on big data in the Middle East, the future of BI, and his top tips for implementing a big data strategy.

How the Internet of Things Can be Best Explored Using Graph Databases

Emil Eifrem is CEO of Neo Technology and co-founder of the Neo4j project. Committed to sustainable open source, he guides Neo along a balanced path between free availability and commercial reliability. In this article, Emil explains how the Internet of Things can be best explored using graph databases.

TOP DATACONOMY NEWS

LinkedIn’s Veteran Data Science Team Splits Up to Enhance Productivity

LinkedIn, the social networking company with one of the world’s first pioneering data science teams, has split its crew across different departments. The data science team, which had worked within the product division, had consisted of two branches over the years: the product data science team, responsible for “new data-powered features” and for generating new data for analysis, and the decision sciences team, which tracks and monitors product metrics and usage data.

With New Tor Connector, Is There Now a Way to Hide from Facebook?

Following the rise of data breaches and of web users defending themselves against online traffic analysis, anonymity has become more important than ever. Facebook announced this week that users of the social network can now connect to it directly via the free anonymity software Tor.

Gridgain’s In-Memory Data Fabric Enters the Apache Software Foundation as Apache Ignite

Gridgain announced yesterday that their in-memory data fabric has been accepted into the Apache Software Foundation Incubator programme, under the name Apache Ignite. The Gridgain team hope the move will fuel greater adoption of in-memory computing technologies, and build a greater community around the data fabric.

UPCOMING EVENTS

9-12 November 2014: INFORMS Annual Meeting, San Francisco

INFORMS returns to the City by the Bay for its 2014 Annual Meeting with a rich and varied program, bridging data and decisions. Each year, the INFORMS meeting brings together experts from academia, industry, and government to consider a broad range of ORMS and analytics research and applications. In 2014, we’ll offer that program excellence in one of America’s most exciting cities. Join us for INFORMS 2014!

9-14 November 2014: IEEE VIS 2014, Paris

IEEE VIS 2014 is the premier forum for advances in visualization. The event-packed week brings together researchers and practitioners from academia, government, and industry to explore their shared interests in tools, techniques, and technology.


(Image credit: Bob Jagendorf)

22 October 2014: GraphConnect 2014, San Francisco

GraphConnect is the only conference that focuses on the rapidly growing world of graph databases and applications, and features Neo4j, the world’s leading graph database.

Join the hundreds of graphistas from startups to Global 2000 companies, and see how they are leveraging the power of the graph to solve their most critical connected data issues.

Held at the SF Jazz Centre, GraphConnect will feature talks from eBay, Polyvore and CrunchBase representatives, as well as training workshops for Neo4j.

A full programme and ticket information can be found here.

Making Big Data Meaningful with Graph Technology (11 July 2014)

Big Data – Bigger Challenge

The volume of net new data being created each year is growing exponentially, a trend that is set to continue for the foreseeable future. The higher the volume of data gets, the more complex that data becomes, and the more challenging it gets to generate insight and value from it. But increased volume isn’t the only force we are facing today: on top of this staggering growth in the volume of data, we are also seeing an increase in both the amount of semi-structure and the degree of connectedness present in that data.

Google, Facebook, Twitter, Adobe and American Express, among others, have turned to graph technologies to tackle this complexity at the heart of Big Data. A recent article by Dr. Roy Martsen outlines how Google started the modern graph analysis trend, using links between documents on the Web to understand their semantic context. Google has since continued to write history: its graph-centric approach has seen the company deliver innovation at scale and dominate not only its core search market, but also the wider information management space.

Graph Technology – Unlocking the Meaning of Big Data

Graphs are a new way of thinking for explicitly modelling the two factors that make today’s big data so complex: semi-structure and connectedness. In a nutshell, a graph database is an online transactional system that allows you to store, manage and query your data in the form of a graph; that is, it enables you to represent any kind of data in a highly accessible, elegant way using nodes and relationships, both of which may carry properties. The key thing about such a model is that it makes relationships first-class citizens of the data, rather than treating them as metadata. As real data points, they can be queried and understood in their variety, weight and quality.
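For instance, here is a tiny Cypher sketch of two nodes joined by a relationship, each carrying properties (the labels and property names are purely illustrative); note that the WORKS_AT relationship holds data of its own rather than being relegated to metadata:

    // Two nodes and one relationship, all three carrying properties.
    CREATE (alice:Person {name: 'Alice', city: 'Malmö'})
    CREATE (acme:Company {name: 'Acme AB', founded: 2001})
    CREATE (alice)-[:WORKS_AT {since: 2012, role: 'engineer'}]->(acme)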

Another very good thing about graph technology: it’s available to everyone, right off the shelf. For example, the Neo4j project is a mature open-source graph database used in production at all kinds of organisations, from Global 2000s like Walmart, Lufthansa and Cisco to innovative start-ups like FiftyThree, Medium and CrunchBase. Graph databases like Neo4j have risen to prominence: Neo Technology was recently listed as a “Cool Vendor” in Gartner’s Cool Vendors in DBMS 2014 report, and, as 451 Research analyst Matt Aslett notes, graphs are moving out of the general NOSQL umbrella into a category in their own right. Bearing this in mind, it comes as no surprise that Forrester Research estimates that over 25 percent of enterprises will be using graph databases by 2017.

Graphs are Eating The World

Graphs don’t only provide a competitive advantage in search. Twitter and Facebook use the social graph to dominate their markets, and Google’s Knowledge Graph and Facebook’s Graph Search are already geared up for the next wave of hyper-accurate, hyper-personal recommendations; beyond that, graphs are becoming very widely deployed in a host of other industries. One great example is eBay: owing to its recent acquisition of Shutl, eBay provides a same-day delivery service that uses graphs to compute fast, localised door-to-door delivery of goods between buyers and sellers, scaling its business to include the supply chain. Incidentally, eBay observed that before it turned to graphs, the latency of its longest query was higher than its shortest physical delivery, both around 15 minutes; that is no longer the case now that an average query is powered by a graph database and takes 1/50th of a second.
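As a rough sketch of the kind of query behind such routing, Cypher has a built-in shortest-path function; the road-network model below, with Location nodes joined by ROAD relationships, is purely illustrative, and a real system would weight routes by travel time with dedicated graph algorithms rather than by hop count.

    // Illustrative model: (:Location)-[:ROAD]->(:Location)
    // Find a route with the fewest hops between a seller and a buyer.
    MATCH (seller:Location {name: 'Shoreditch'}), (buyer:Location {name: 'Camden'}),
          route = shortestPath((seller)-[:ROAD*..20]-(buyer))
    RETURN [stop IN nodes(route) | stop.name] AS stops, length(route) AS hops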

The eBay example is not isolated. Organisations large and small are adopting and winning with graphs in retail, finance, telecoms, IT, gaming, real estate, healthcare, science, and dozens more areas.

The Power of Graph Technology

So how do these companies succeed with graphs? Well, just over a year ago, my colleague Max de Marzi undertook a little exercise to show just how easy it is to answer difficult questions with a graph. Max built a version of Facebook’s Graph Search that can answer even more questions than the original, over a single weekend, using a graph database as his backend! You can read the full story at: http://maxdemarzi.com/2013/01/28/facebook-graph-search-with-cypher-and-neo4j/

The point is that this story shows how far graph technology has matured in recent years: such powerful graph-based systems can now be built over a weekend. Even bringing physical objects into the mix is straightforward: with the burgeoning Internet of Things, it is easy to add nodes into the graph that represent physical assets and add spatial indexes (which are themselves graphs) to find their location.

The power of a graph database is exactly like having a mini-web inside your application. You crawl that “web” of nodes via named, directed relationships until you find your goal, and that goal can be anything: where exactly you put your keys, where your long-lost college buddy is working, evidence about the efficacy of a clinical trial, or access permissions for computer systems (all graph problems, by the way). The graph database’s role is to store that data safely, and to make querying it fast and easy. Using Neo4j, for example, we write a Cypher query that visually describes the graph structure we’re looking for (a pattern) and let the database find matches for that pattern amongst the network of data it holds.
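A minimal sketch of such a pattern query, reusing the illustrative Person/Company model from earlier to answer the college-buddy question; the STUDIED_WITH relationship type is an assumption:

    // “Where is my long-lost college buddy working?”
    MATCH (me:Person {name: 'Alice'})-[:STUDIED_WITH]-(buddy:Person),
          (buddy)-[:WORKS_AT]->(employer:Company)
    RETURN buddy.name AS buddy, employer.name AS company

The query is a direct transcription of the sentence: start at me, follow STUDIED_WITH to a buddy, follow WORKS_AT to their employer, and return what you find.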

We at Neo4j see that creating and analysing graphs will bring us to answers, and when we let data connect itself meaning will emerge. We also believe that our ability to understand graphs is greatly enhanced with the right tools, and we’re very excited about where graph technology is heading. You should be too!


Emil Eifrem is CEO of Neo Technology and co-founder of the Neo4j project. Before founding Neo, he was the CTO of Windh AB, where he headed the development of highly complex information architectures for Enterprise Content Management Systems. Committed to sustainable open source, he guides Neo along a balanced path between free availability and commercial reliability. Emil is a frequent conference speaker and author on NOSQL databases.


(Image Credit: stockholminnovation)

SQL vs. NoSQL – Know the Difference (1 July 2014)


‘SQL is outdated’. ‘RDBMS can no longer meet businesses’ data management needs’. ‘New database technologies like NoSQL are the solution for today’s enterprises’. We hear statements like these a lot, both inside and outside the database technologies industry. But are they accurate? Is SQL a thing of the past, and are NoSQL solutions the way forward?

In this article, we’ll outline the differences between SQL and NoSQL, the vast array of differences within NoSQL technologies themselves, and discuss if Relational Database Management Systems really are a thing of the past.

SQL vs. NoSQL: An Overview

Data storage
SQL: Stored in a relational model, with rows and columns. Rows contain all of the information about one specific entry/entity, and columns are all the separate data points; for example, you might have a row about a specific car, in which the columns are ‘Make’, ‘Model’, ‘Colour’ and so on.
NoSQL: The term “NoSQL” encompasses a host of databases, each with a different data storage model. The main ones are document, graph, key-value and columnar; more on the distinctions between them below.

Schemas and Flexibility
SQL: Each record conforms to a fixed schema, meaning the columns must be decided and locked before data entry, and each row must contain data for each column. This can be amended, but it involves altering the whole database and going offline.
NoSQL: Schemas are dynamic. Information can be added on the fly, and each ‘row’ (or equivalent) doesn’t have to contain data for each ‘column’.

Scalability
SQL: Scaling is vertical. In essence, more data means a bigger server, which can get very expensive. It is possible to scale an RDBMS across multiple servers, but this is a difficult and time-consuming process.
NoSQL: Scaling is horizontal, meaning across servers. These servers can be cheap commodity hardware or cloud instances, making scaling a lot more cost-effective than vertical scaling. Many NoSQL technologies also distribute data across servers automatically.

ACID Compliance (Atomicity, Consistency, Isolation, Durability)
SQL: The vast majority of relational databases are ACID compliant.
NoSQL: Varies between technologies, but many NoSQL solutions sacrifice ACID compliance for performance and scalability.

The Many Faces of NoSQL

Having heard the term “NoSQL”, you could be forgiven for thinking all technologies under this umbrella have the same data model. In fact, NoSQL refers to a whole host of technologies, which store and process data in different ways. Some of the main ways include:

Document Databases

An illustration from the document database CouchDB sums up the distinction between an RDBMS and a document database pretty well: instead of storing data in rows and columns in a table, data is stored in documents, and these documents are grouped together in collections. Each document can have a completely different structure. Document databases include the aforementioned CouchDB and MongoDB.

Key-Value Stores

Data is stored in an associative array of key-value pairs. The key is an attribute name, which is linked to a value. Well-known key value stores include Redis, Voldemort (developed by LinkedIn) and Dynamo (developed by Amazon).

Graph Databases

Used for data whose relations are represented well in a graph. Data is stored in graph structures with nodes (entities), properties (information about the entities) and lines (connections between the entities). Examples of this type of database include Neo4J and InfiniteGraph.
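For example, the car record from the comparison table above could be stored as a small graph; a hedged Cypher sketch, with all labels and property names assumed for illustration:

    // A person, a car, and an OWNS relationship connecting them.
    CREATE (ann:Person {name: 'Ann'})-[:OWNS {since: 2013}]->(:Car {make: 'Volvo', model: 'V60', colour: 'Red'})

    // Then ask: which cars does Ann own?
    MATCH (:Person {name: 'Ann'})-[:OWNS]->(car:Car)
    RETURN car.make, car.model, car.colour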

Columnar (or Wide-Column) Databases

Instead of ‘tables’, in columnar databases you have column families, which are containers for rows. Unlike in an RDBMS, you don’t need to know all of the columns up front, and each row doesn’t have to have the same number of columns. Columnar databases are best suited to analysing huge datasets; big names include Cassandra and HBase.

SQL vs. NoSQL: Which to Use?

The idea that SQL and NoSQL are in direct opposition and competition with each other is a flawed one, not least because many companies opt to use them concurrently. As with all of the technologies I’ve previously discussed, there really isn’t a ‘one-system-fits-all’ approach; choosing the right technology hinges on the use case. If your data needs are changing rapidly, you need high throughput to handle viral growth, or your data is growing fast and you need to be able to scale out quickly and efficiently, maybe NoSQL is for you. But if the data you have isn’t changing in structure and you’re experiencing moderate, manageable growth, your needs may be best met by SQL technologies. Certainly, SQL is not dead yet.

(Featured image source: InfoQ)


Eileen McNulty-Holmes – Editor


Eileen has five years’ experience in journalism and editing for a range of online publications. She has a degree in English Literature from the University of Exeter, and is particularly interested in big data’s application in humanities. She is a native of Shropshire, United Kingdom.

Email: eileen@dataconomy.ru


Jaspersoft and Talend Extend Partnership (2 April 2014)

Jaspersoft, the Intelligence Inside applications and business processes, and Talend, the global big data integration software leader, today announced the launch of Jaspersoft Extract, Transform, and Load (ETL) Expanded Big Data Edition. Powered by Talend, the Jaspersoft ETL Expanded Big Data Edition rounds out Jaspersoft’s suite of access technologies for Big Data sources, which now address virtually all reporting and analysis use cases.

Based on Talend’s award-winning Big Data integration technology, Jaspersoft ETL Expanded Big Data Edition joins Jaspersoft’s existing native and data federation connectors for Big Data to provide additional scalability, flexibility and improved resource management. Talend natively supports virtually all major Big Data platforms including Amazon EMR, Apache Hadoop (HBase, HDFS, and Hive), Cassandra, Couchbase, CouchDB, Cloudera, Google BigQuery, Greenplum/Pivotal HD, Hortonworks Data Platform, IBM PureData System for Hadoop, MapR, MongoDB, Neo4J, Riak, SAP HANA, Teradata, and Vertica. Further, Big Data Edition includes high availability and load balancing features to provide high reliability and performance for mission critical reporting and analysis needs.

Jaspersoft enables customers to easily transform their Big Data into more flexible database structures for advanced reporting and analytics. This reduces the burden on production systems and provides a scalable, intuitive way to blend Big Data with traditional data sources. For other types of use cases, Jaspersoft native connectors to Big Data leverage all of the underlying data source functionality, such as geospatial functions. Alternatively, Jaspersoft data federation enables real-time blending of data from Big Data and other data sources without moving the data. Unlike other vendors, Jaspersoft offers this full spectrum of techniques to encompass all typical Big Data reporting and analysis use cases.

Jim Bell, CMO at Jaspersoft said, “Jaspersoft ETL extends Jaspersoft’s reporting and analytics leadership in Big Data. It allows our customers’ business requirements to dictate how they access big data for visualization, reporting and analytics rather than having to conform to a vendor’s capabilities. With our native, data federation, and ETL connectors, customers now have a full range of choices to seamlessly incorporate Big Data into their information systems.”

Fabrice Bonan, Chief Product Officer at Talend said, “Jaspersoft ETL Extended Big Data Edition offers industry-leading capabilities for customers who can now combine all of their Big Data and traditional data sources with enterprise-grade performance and reliability. Talend has been partnering with Jaspersoft for several years and is excited to now team with Jaspersoft to provide high value to Big Data customers.”

(Image credit: Mervyn Chua)
