Gainsight’s Data Architecture Explained

As a multi-product system, Gainsight helps customers achieve Customer Success (CS) by making data-driven decisions that trigger appropriate actions.

The actions can happen through in-house applications or completely offline. It’s a dynamic system that changes with each individual’s needs. As you might imagine, a system of this nature needs to handle large data volumes with varied characteristics and application needs.

To help you understand how we execute this variety of functions in a completely frictionless way, we’ve outlined the data architecture of Gainsight. We hope this helps illustrate the depth and breadth of what is possible inside our solution. Let’s start with a basic overview.

Data Architecture Overview

The diagram below illustrates the major components of the architecture and their interplay.

Ingest Data and Reporting

Reports and meaningful dashboards are essential pillars of a system designed to help customers make decisions. The visualization of data makes the difference in product usage as well as adoption and expansion. Data without context is meaningless, and therefore not helpful to those who use your product. That is why we start with this as a top priority in our design process. 

The data must be modeled in a way that is easy to understand quickly. This design lets customers load various kinds of data (event, batched, and more) and build both standard and custom objects. The objects are also used to build reports. They reside in different data stores depending on their purpose, which determines access patterns and volume:

  1. Postgres for transactional data, called Low Volume Objects. These objects provide record-level CRUD operations. Most customer success use cases and dimensional data use this object category. They support dynamic joins with other Low Volume Objects.
  2. Redshift for big data, called High Volume Objects. These are typically used for bulk data operations, that is, bulk ingestion and aggregated reads. They support dynamic joins with other High Volume Objects. Core standard objects get a special provision, which is explained later in this blog.
  3. The Presto query engine with S3 storage. It is used to store snapshots of data retrieved or transformed from other stores (Low Volume, High Volume, or any other existing object) or from integrations.
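
To make the three categories concrete, here is a minimal sketch of how an object category could map to its backing store. The store names come from the list above; the `ObjectCategory` enum, the `store_for` helper, and the `can_join_dynamically` check are hypothetical names invented for illustration and are not Gainsight APIs. The join check models only the joins the list explicitly describes.

```python
from enum import Enum


class ObjectCategory(Enum):
    """Hypothetical labels for the three object categories listed above."""
    LOW_VOLUME = "low_volume"    # transactional data, record-level CRUD
    HIGH_VOLUME = "high_volume"  # bulk ingestion and aggregated reads
    SNAPSHOT = "snapshot"        # stored results of retrieval or transformation


# Store names follow the list above; the mapping itself is an assumption.
STORE_FOR_CATEGORY = {
    ObjectCategory.LOW_VOLUME: "postgres",
    ObjectCategory.HIGH_VOLUME: "redshift",
    ObjectCategory.SNAPSHOT: "presto_s3",
}


def store_for(category: ObjectCategory) -> str:
    """Return the backing store assumed for an object category."""
    return STORE_FOR_CATEGORY[category]


def can_join_dynamically(left: ObjectCategory, right: ObjectCategory) -> bool:
    """True only for the joins the list describes: low with low, high with high."""
    joinable = (ObjectCategory.LOW_VOLUME, ObjectCategory.HIGH_VOLUME)
    return left == right and left in joinable
```

In this sketch, a usage table loaded in bulk would be a `HIGH_VOLUME` object backed by Redshift, while an account record edited one row at a time would be a `LOW_VOLUME` object backed by Postgres.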

To build objects and schemas that help in reporting, Gainsight provides seamless extraction through out-of-the-box connectors (CRM and others) and rich transformation tools. With these capabilities, Gainsight can pull data or customers can push data into Gainsight. The transformation tools then help convert raw data into useful information in objects that can be reported on to help customers make decisions. In addition to these systems, Gainsight provides a sophisticated data designer that can be used to visualize objects from various sources and data stores.

Gainsight – Actionable System

Separate from the data that the customer ingests, Gainsight has many products that generate actionable data through their processes. This data is also available to customers for reporting with a uniform schema. Data from these objects can be exported, mined for intelligence through data science features, or transformed and reported on.

Application Persistence

In addition to the reportable data stores, Gainsight uses purpose-specific application data stores to complement specific product needs. The multi-product system uses a microservices architecture to bridge the interactions between the Reportable Tenant Data and the application-specific data stores. The application-specific data stores are abstracted by the microservices and also provide fixed reports to customers.

Blending Microservices Architecture and Multi-Tenancy

A typical microservice architecture employs data abstraction between services so that they do not share the same persistence layer. The Gainsight system employs a hybrid model: while the Application Persistence of one microservice is abstracted from the others, the Reportable Tenant Data is accessed by all microservices. The highlights of this model are:

  1. Each microservice’s own persistence is isolated through Application Persistence, which ensures autonomy for its evolution.
  2. The Tenant Data is accessed across microservices using consistent metadata. This shields the system from the pitfalls of coupling and allows each microservice to evolve independently of the others.
  3. The user sees a cohesive Tenant Data model of System and Standard objects. Beneath the surface, these objects are powered by multiple microservices.
  4. In a multi-product system, features and functionality tend to become cohesive over time. This makes the microservices chatty and creates unnecessary operational overhead. Having the metadata abstract the Tenant Data allows microservices to interact directly with the persistence layer.
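
The hybrid model above can be sketched roughly as follows: a shared metadata registry describes each Tenant Data object, and any microservice can plan a read against the persistence layer from that metadata instead of calling the owning service. Everything here is an assumption made for illustration (`ObjectMetadata`, `MetadataRegistry`, `ReportingService`, and the object and field names are hypothetical), since the post does not describe the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMetadata:
    """Hypothetical description of one Tenant Data object."""
    name: str
    store: str      # e.g. "postgres" or "redshift"
    fields: tuple   # field names exposed to every microservice


class MetadataRegistry:
    """Shared, consistent metadata that all microservices consult."""

    def __init__(self):
        self._objects = {}

    def register(self, meta: ObjectMetadata) -> None:
        self._objects[meta.name] = meta

    def describe(self, name: str) -> ObjectMetadata:
        return self._objects[name]


class ReportingService:
    """A microservice that reads Tenant Data without calling its owner service."""

    def __init__(self, registry: MetadataRegistry):
        self.registry = registry

    def plan_read(self, object_name: str, fields: list) -> dict:
        meta = self.registry.describe(object_name)
        unknown = [f for f in fields if f not in meta.fields]
        if unknown:
            raise KeyError(f"unknown fields for {object_name}: {unknown}")
        # The plan targets the store directly; no service-to-service call is made.
        return {"store": meta.store, "object": meta.name, "fields": list(fields)}


registry = MetadataRegistry()
registry.register(ObjectMetadata("company", "postgres", ("id", "name", "arr")))
plan = ReportingService(registry).plan_read("company", ["id", "name"])
```

The trade-off in this sketch is that services depend on the metadata contract rather than on each other, which is what keeps inter-service traffic down while each service keeps its own private Application Persistence.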

Data Volume

Data-driven decisions typically need both transactional and analytical data. Analytical data can be far larger than transactional data. For example, tracking usage of a website with 50 pages for 1,000 users involves approximately 18 million data points per year at day-level granularity (50 x 1,000 x 365 = 18,250,000). Extrapolating this to 100 customers yields about 1.8 billion data points (18,250,000 x 100 = 1,825,000,000).
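
The estimate above can be reproduced with a few lines of arithmetic. The function name and parameters are illustrative, and the calculation assumes one data point per page, per user, per day, as in the example.

```python
def usage_data_points(pages: int, users: int, days: int = 365, customers: int = 1) -> int:
    """Data points at day-level granularity: one per page, per user, per day."""
    return pages * users * days * customers


per_customer = usage_data_points(pages=50, users=1000)
all_customers = usage_data_points(pages=50, users=1000, customers=100)

print(per_customer)   # 18250000
print(all_customers)  # 1825000000
```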

Hence, in addition to Postgres, Gainsight employs two stores for large data: Redshift, and S3 queried through Presto.

Data Consistency 

Gainsight uses multiple data stores to power “Reportable” Tenant Data. The data designer provides intuitive mechanisms to fetch data dynamically and merge (join) it into a reportable dataset, while the high volume objects allow runtime joins between objects. To provide consistency in the high volume data store, Gainsight internally employs change data capture to make the core standard objects available in it. This helps in building essential dimensional-fact models effectively.
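
As a rough illustration of the change data capture idea, the sketch below replays an ordered stream of change events against a replica of a core standard object, so that the copy in the high volume store stays in step with the source. The event shape (`op`, `id`, `row`) and the `apply_changes` function are hypothetical; the post does not describe how Gainsight's internal mechanism is actually built.

```python
def apply_changes(replica: dict, changes: list) -> dict:
    """Apply ordered change events to a replica keyed by record id."""
    for change in changes:
        op, key = change["op"], change["id"]
        if op in ("insert", "update"):
            # Merge changed columns into the existing row, or create the row.
            replica[key] = {**replica.get(key, {}), **change["row"]}
        elif op == "delete":
            replica.pop(key, None)
        else:
            raise ValueError(f"unsupported operation: {op}")
    return replica


# Hypothetical events captured from the transactional store.
events = [
    {"op": "insert", "id": 1, "row": {"name": "Acme", "status": "active"}},
    {"op": "insert", "id": 2, "row": {"name": "Globex", "status": "active"}},
    {"op": "update", "id": 1, "row": {"status": "churned"}},
    {"op": "delete", "id": 2},
]
company_replica = apply_changes({}, events)
```

With the dimension rows replicated this way, fact tables in the high volume store can be joined to them at query time, which is the dimensional-fact pattern the paragraph above refers to.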

Because we know how important it is for our customers to be able to make the best decisions as quickly as possible, Gainsight’s data architecture will continue to be a top priority for us. To learn more about how we create our product and ensure value to all our customers, read more from our engineering team.