Universal connectivity is fueling streams of event data from a variety of event sources. Increasingly, organizations are developing and deploying event-driven applications to harness the growing volumes of event data. IBM EventStore offers a scalable, integrated system for enterprises to ingest, persist, and analyze event data of any type. For an introduction, refer to part 1 of this 3-part blog series, Ingest and Analyze Streaming Event Data at Scale with IBM EventStore.
Use cases for event-driven applications span scenarios such as IoT, web analytics, gaming, and fraud detection. Part 2 of this 3-part blog series describes a popular web analytics use case and outlines the implementation details of ingesting web events into IBM EventStore.
Over the years, web applications have become ubiquitous in driving online commerce, and online sales now account for a significant share of revenue for many businesses. As web applications become the primary sales channel, understanding customers’ online behavior is critical for businesses to drive sales. Click Stream analysis tracks and persists the sequence of web clicks from all users, then analyzes the click data to better understand customer interests.
Click Stream analysis requires building out the following components.
In most implementations, the Tracker is JavaScript code embedded in web pages. The sections below outline how IBM EventStore is used to implement Click Stream analysis for a fictitious retail business. The implementation limits its scope to the Collector and EventStore components. The intent is to cover the steps associated with ingesting and analyzing web events using IBM EventStore.
CYBERSHOP is a retail business that sells merchandise across multiple product lines, including smartphones, computers, appliances, and electronics. Using Click Stream analysis, the business intends to understand customers’ browsing behavior. It seeks insight into which products are of interest to which customers and how much time a customer spends exploring different products. Leveraging these insights, the business intends to target customers with personalized offers in real time and drive sales.
The web application tracks multiple web events for every user:
Each web event includes the following details:
The web application uses embedded trackers to submit web events to a server-side collector. The server-side collector ingests the event data into IBM EventStore.
IBM EventStore offers multiple interfaces to ingest event data. The current build of the developer preview supports IBM Streams and the Scala API.
Here are the steps for ingesting event data.
To persist events with different schemas, EventStore requires separate tables, each with a matching schema definition. The web events in the Click Stream analysis for CYBERSHOP use a single schema.
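As a rough illustration of how a single schema can cover several kinds of web events, the sketch below models an event as one Scala case class with an event-type column. The field names, the example event types, and the `WebEventSchema` helper are assumptions made for illustration; they are not the CYBERSHOP schema and not part of the EventStore API.

```scala
// Hypothetical single-schema web event. The field names are illustrative
// assumptions, not the schema used in the original implementation.
final case class WebEvent(
  eventTime: Long,   // epoch milliseconds when the event occurred
  userId: String,    // identifier of the user or session
  eventType: String, // kind of event, e.g. "page_view" or "add_to_cart"
  productId: String  // product the user interacted with
)

object WebEventSchema {
  // Column names in declaration order, as a table definition would list them.
  val columns: Seq[String] = Seq("eventTime", "userId", "eventType", "productId")

  // Flatten an event into a row of values in the same order as `columns`.
  def toRow(e: WebEvent): Seq[Any] =
    Seq(e.eventTime, e.userId, e.eventType, e.productId)
}
```

Because the event type is carried as a column value, events of different types can share one table; a new table is only needed when the set of columns itself changes.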
The code uses a Spark DataFrame to hold a collection of event records. The records are inserted into EventStore in batch mode.
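The batch-mode idea can be sketched without Spark or EventStore on the classpath. In the minimal example below, `Click`, `ingestInBatches`, and the `insertBatch` callback are hypothetical names; `insertBatch` stands in for whatever batch insert call the store's API provides.

```scala
object BatchIngest {
  // Hypothetical record type standing in for one row of event data.
  final case class Click(userId: String, productId: String, eventTime: Long)

  // Split the records into fixed-size batches and hand each batch to
  // `insertBatch`, a caller-supplied function standing in for the store's
  // batch insert. Returns the total number of records submitted.
  def ingestInBatches(records: Seq[Click], batchSize: Int)
                     (insertBatch: Seq[Click] => Unit): Int = {
    require(batchSize > 0, "batchSize must be positive")
    records.grouped(batchSize).foldLeft(0) { (count, batch) =>
      insertBatch(batch)
      count + batch.size
    }
  }
}
```

Grouping records before the insert keeps the number of calls to the store low; the last batch may be smaller than `batchSize` when the record count is not an exact multiple.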
The Scala API for EventStore supports multiple modes of ingestion. Applications ingesting event data can choose between real-time and batch modes. Both modes support synchronous and asynchronous invocation. Typically, asynchronous invocation in batch mode achieves the highest ingestion rates; in lab tests, performance workloads reached 1 million records per second per node.
This is part 2 of a 3-part blog series (click here for part 1). The blog describes a web analytics use case with details on ingesting web events into IBM EventStore. The next part in the series will cover analyzing the web event data to track and collect insights on customers’ browsing interests.
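To make the synchronous versus asynchronous distinction concrete, here is a minimal sketch built on standard Scala futures. `ingestAsync`, `ingestSync`, and the `insertBatch` callback are hypothetical stand-ins, not EventStore API calls, and the sketch assumes the callback returns the number of records it stored.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object AsyncBatchIngest {
  // Asynchronous invocation: submit every batch without waiting for the
  // previous one to finish, then combine the per-batch counts into a single
  // Future holding the total number of records stored.
  def ingestAsync[A](batches: Seq[Seq[A]])(insertBatch: Seq[A] => Int)
                    (implicit ec: ExecutionContext): Future[Int] = {
    val inFlight: Seq[Future[Int]] = batches.map(b => Future(insertBatch(b)))
    Future.sequence(inFlight).map(_.sum)
  }

  // Synchronous invocation: the caller blocks until every batch has
  // completed or the timeout expires.
  def ingestSync[A](batches: Seq[Seq[A]], timeout: Duration)
                   (insertBatch: Seq[A] => Int)
                   (implicit ec: ExecutionContext): Int =
    Await.result(ingestAsync(batches)(insertBatch), timeout)
}
```

With the asynchronous form, several batches can be in flight at once, which is one plausible reason asynchronous batch ingestion tends to reach higher rates than waiting on each insert in turn.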