eBay Open Sources Pulsar to Analyse User Data in Real Time
Dataconomy, Thu, 26 Feb 2015 11:39:29 +0000
https://dataconomy.ru/2015/02/26/ebay-open-sources-pulsar-to-analyse-user-data-in-real-time/

E-commerce giant eBay has released an open-source, real-time analytics platform and stream processing framework, dubbed Pulsar. Pulsar collects and processes user and business events in real time, at a rate of a million events per second with high availability, to enable better interaction with users.

Ever-increasing buyer and seller traffic at eBay keeps generating new use cases that require events to be collected and processed in real time. Until now, user experience optimization and behaviour analysis were carried out on batch-oriented data platforms like Hadoop.

To enable better interaction and to “derive actionable insights and generate signals for immediate action,” eBay decided to develop its own distributed complex event processing (CEP) framework. Pulsar CEP provides a Java-based framework as well as tooling to build, deploy, and manage CEP applications in a cloud environment, explains the blog post making the announcement.

The post points out that Pulsar CEP includes the following capabilities:

  • Declarative definition of processing logic in SQL
  • Hot deployment of SQL without restarting applications
  • Annotation plugin framework to extend SQL functionality
  • Pipeline flow routing using SQL
  • Dynamic creation of stream affinity using SQL
  • Declarative pipeline stitching using Spring IOC, thereby enabling dynamic topology changes at runtime
  • Clustering with elastic scaling
  • Cloud deployment
  • Publish-subscribe messaging with both push and pull models
  • Additional CEP capabilities through Esper integration
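To make the difference between the push and pull messaging models concrete, here is a minimal in-memory sketch in Python. It is not Pulsar's API: the `Channel` class, its method names, and the sample events are all hypothetical and exist only to illustrate the two delivery styles.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional

Event = Dict[str, object]


class Channel:
    """Toy channel (hypothetical, not Pulsar's API) with push and pull consumers."""

    def __init__(self) -> None:
        self._push_subscribers: List[Callable[[Event], None]] = []
        self._pull_queues: Dict[str, Deque[Event]] = {}

    def subscribe_push(self, callback: Callable[[Event], None]) -> None:
        # Push model: the channel calls the consumer as soon as an event arrives.
        self._push_subscribers.append(callback)

    def subscribe_pull(self, consumer_id: str) -> None:
        # Pull model: events are buffered until the consumer asks for them.
        self._pull_queues.setdefault(consumer_id, deque())

    def publish(self, event: Event) -> None:
        for callback in self._push_subscribers:
            callback(event)
        for queue in self._pull_queues.values():
            queue.append(event)

    def poll(self, consumer_id: str) -> Optional[Event]:
        # Return the oldest buffered event, or None if the buffer is empty.
        queue = self._pull_queues[consumer_id]
        return queue.popleft() if queue else None


channel = Channel()
pushed: List[Event] = []
channel.subscribe_push(pushed.append)     # push consumer
channel.subscribe_pull("batch-loader")    # pull consumer (hypothetical name)

channel.publish({"type": "click", "user": "u1"})
channel.publish({"type": "view", "user": "u2"})

first_pulled = channel.poll("batch-loader")
```

In this sketch the push consumer has already seen both events by the time `publish` returns, while the pull consumer reads them at its own pace, which is the usual reason for offering a pull model to slower downstream systems.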

Built into Pulsar is a real-time analytics data pipeline that provides higher reliability and scalability through processes such as data enrichment, filtering and mutation, aggregation, and stateful processing.
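The pipeline stages named above can be pictured with a small Python sketch. This is a toy under stated assumptions, not Pulsar's implementation: the event fields, the `COUNTRY_BY_IP_PREFIX` lookup table, and every function name are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

Event = Dict[str, object]

# Hypothetical lookup table used for the enrichment stage.
COUNTRY_BY_IP_PREFIX = {"10.": "internal", "81.": "DE", "73.": "US"}


def enrich(events: Iterable[Event]) -> Iterator[Event]:
    """Enrichment: attach a country derived from the IP prefix."""
    for event in events:
        ip = str(event.get("ip", ""))
        country = next(
            (c for prefix, c in COUNTRY_BY_IP_PREFIX.items() if ip.startswith(prefix)),
            "unknown",
        )
        yield {**event, "country": country}


def drop_internal(events: Iterable[Event]) -> Iterator[Event]:
    """Filtering: discard traffic that originates from internal addresses."""
    for event in events:
        if event["country"] != "internal":
            yield event


def normalize_type(events: Iterable[Event]) -> Iterator[Event]:
    """Mutation: rewrite the event type into a canonical lower-case form."""
    for event in events:
        yield {**event, "type": str(event["type"]).lower()}


def count_by(events: Iterable[Event], key: str) -> Counter:
    """Aggregation with state: keep a running count per value of `key`."""
    counts: Counter = Counter()
    for event in events:
        counts[event[key]] += 1
    return counts


raw_events = [
    {"type": "CLICK", "ip": "81.2.3.4"},
    {"type": "View", "ip": "10.0.0.7"},
    {"type": "click", "ip": "73.9.9.9"},
    {"type": "click", "ip": "81.5.5.5"},
]

processed = list(normalize_type(drop_internal(enrich(raw_events))))
counts_by_type = count_by(processed, "type")
counts_by_country = count_by(processed, "country")
```

Chaining generators keeps each stage independent, so a stage can be added or removed without touching the others. A production system would distribute these stages across a cluster rather than run them in one process.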

The now open-sourced Pulsar has been deployed in production at eBay and is processing all user behavior events, the post says. A dashboard is under development that will make integration with metrics stores like Cassandra and Druid easier.


(Image credit: Justin Sullivan, Getty Images)
