What is Simon Data?

Learn everything there is to know about Simon Data, including products, key features, real-time capabilities, and pros/cons.

Luke Kline

March 18, 2024

11 minutes

Before 2016, nobody had heard of Snowflake. Fast forward to today, and it’s everywhere. Now, every company wants to activate their data and sync it to their marketing tools to deliver personalized experiences, optimize ad performance, or simply move faster. If you’ve been looking for a way to move data out of Snowflake to empower your marketing teams, then there’s a good chance you’ve stumbled across Simon Data.

This blog post will cover everything there is to know about Simon Data, including:

  • What is Simon Data?
  • Core products and capabilities
  • Key features like data collection, data storage, data modeling, and audience management
  • Real-time capabilities
  • Reverse ETL
  • Security
  • Pros and cons

What is Simon Data?

Simon Data is a fully managed Customer Data Platform (CDP) offering that runs on top of Snowflake. The platform provides all of the traditional capabilities of a CDP and focuses on helping marketers build audiences, orchestrate campaigns, and sync data directly to downstream operational tools.

Overview of Simon Data Architecture

The company was founded by Jason Davis and Matt Walker in 2013 as a messaging platform, with the goal of bringing simplicity to the enterprise CDP space and helping marketers power personalization across channels. Since then, Simon Data has gone all in on Snowflake from an architectural standpoint and has raised a considerable amount of money as a platform primarily aimed at marketers.

Core Products and Capabilities

There are a lot of intricacies to the Simon Data platform, but fundamentally, you can break the company down into six core components:

  1. Simon Signal is the event tracking framework that helps you collect, process, and ingest events into Simon Data.
  2. Identity is Simon’s identity resolution feature that lets you build unified profiles. This product helps you resolve customer identities using known and anonymous identifiers so you can create a comprehensive identity graph.
  3. Segments is Simon Data’s no-code audience builder that helps you build audiences based on specific attributes or user-completed events that you’ve defined.
  4. Flows let you sync audiences or send messages to specific segments on a one-time, recurring, or triggered basis.
  5. Simon Mail is an email service platform (ESP) that allows you to send direct communications to your customers using all of the data available within the platform.
  6. Journeys lets you orchestrate multiple flows together so you can coordinate cross-channel engagement with your users.

Data Collection

Simon Data collects and ingests both batch and real-time data. This means the platform can collect data from databases, third-party systems, APIs, webhooks, and flat files via traditional ETL pipelines. For non-supported data sources, Simon Data integrates with Stitch so you can bring in data from additional data sources using managed ETL pipelines.

The platform also provides both client-side and server-side tracking so you can collect behavioral data about your users and understand how they are engaging with your brand. All event tracking is powered by Signal, which is Simon Data’s event collection framework. However, bear in mind that Simon Signal is an add-on product that is not included in the core CDP offering.

For client-side tracking, Simon Data offers a JavaScript pixel that you can deploy on your website using traditional tag managers. For server-side events, there’s an event ingestion API that lets you collect and send customer events directly to Simon Data using JavaScript or any other development language of your choice. Simon Signal can also ingest data from other event-tracking tools, which comes in handy if you’re already collecting behavioral data through another solution.
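
To make the server-side path more concrete, here is a minimal sketch of assembling an event payload before sending it to an ingestion endpoint. The field names (event_id, event, user_id, timestamp, properties) and the build_event helper are assumptions made for illustration; they are not Simon Signal's documented schema, so check the vendor's API reference for the real request format.

```python
import json
import uuid
from datetime import datetime, timezone


def build_event(event_name, user_id, properties=None):
    """Assemble a JSON-serializable event dictionary (hypothetical schema)."""
    if not event_name:
        raise ValueError("event_name is required")
    return {
        "event_id": str(uuid.uuid4()),  # lets the receiver de-duplicate retries
        "event": event_name,
        "user_id": user_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "properties": dict(properties or {}),
    }


event = build_event("order_completed", "user-123", {"total": 42.5})
body = json.dumps(event)  # the string a server would POST to an ingestion API
```

Generating the event ID and timestamp on the server keeps retries idempotent and avoids relying on client clocks.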

Data Storage

Architecturally, all of your data is stored within Snowflake. However, Simon Data provides two deployment options when implementing the platform:

  • Managed Deployment: Data is replicated from your warehouse and stored in a Snowflake instance managed by Simon Data.
  • Connected Deployment: Simon Data connects to your existing Snowflake deployment and runs within your environment.

Architecture Diagram of Managed Deployment vs. Connected Deployment

Architecturally, the managed deployment introduces complexity because you inevitably end up with two copies of your data and higher latency when syncing that data. Unless you’re using Snowflake, Simon Data will always store a copy of your data within its own infrastructure via a managed Snowflake instance.

If you are a Snowflake customer, the connected deployment eliminates the problem of data copy via a Snowflake data share, but this option is only available to current Snowflake customers. Composable CDPs like Hightouch are designed to run on top of any cloud infrastructure, which means you can leverage your existing data in its current structure, no matter where it’s stored.

Data Modeling

Regardless of which deployment you choose, you have to create datasets within the platform in order to actually use the data. These datasets are the models that define what data you can use to construct your audiences. Without these datasets, your marketers cannot leverage the platform’s no-code audience-building features.

In addition to tools for creating datasets, Simon Data also offers capabilities to manage your schema and perform identity resolution.

Schema Management

Simon Data’s schema structure is more flexible than traditional CDPs because your data lives in Snowflake, but it still has a few fundamental problems. You can’t just leverage your data models out of the box as they’re currently structured in your warehouse. They have to be re-configured to fit within Simon Data’s platform. Additionally, you must have a stable identifier or a unique primary key for your customer profiles in order to actually use Simon Data, and each profile can only have one identifier.

There is a schema builder to manage and join your datasets, but all of your properties have to maintain a 1:1 relationship with customers. Anything beyond this can prevent your marketing team from building segments. To work around these problems, many companies are choosing to leverage fully Composable CDP solutions like Hightouch because you can leverage any and all of your existing data models and tables as they’re currently formatted.
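
To illustrate what a 1:1 constraint means in practice, the sketch below collapses a one-to-many orders table into a single row of properties per customer. The table shape, the column names, and the flatten_orders helper are assumptions for illustration, not part of Simon Data's schema builder.

```python
from collections import defaultdict


def flatten_orders(orders):
    """Aggregate many order rows into exactly one property row per customer."""
    totals = defaultdict(lambda: {"order_count": 0, "total_spend": 0.0})
    for order in orders:
        row = totals[order["customer_id"]]
        row["order_count"] += 1
        row["total_spend"] += order["amount"]
    return dict(totals)


orders = [
    {"customer_id": "c1", "amount": 10.0},
    {"customer_id": "c1", "amount": 20.0},
    {"customer_id": "c2", "amount": 5.0},
]
customer_properties = flatten_orders(orders)
```

The individual orders are lost after aggregation, which is the trade-off a 1:1 model forces: anything marketers want to segment on has to be precomputed as a customer-level property.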

Identity Resolution

Simon Data’s basic identity service allows you to merge, update, and create profiles based on behavioral events using a unique identifier. Simon Data will also create a unique simon_id for each profile you manage. However, all identity rules are based on a predetermined list of identifiers, and there is no support for fuzzy matching. Defining your identity resolution rulesets requires you to write SQL, and if you want to update the underlying logic, you can’t do so without going through a Simon Data account manager.
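
As a rough illustration of rule-based matching on a predetermined list of identifiers, the sketch below merges records that share an exact identifier value and applies no fuzzy logic. The identifier keys, record shape, and resolve_identities function are hypothetical; Simon Data's actual rulesets are written in SQL and are not shown in this post.

```python
def resolve_identities(records, identifier_keys=("email", "phone", "device_id")):
    """Group record indexes that share any exact identifier value (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # (key, value) -> index of the first record carrying it
    for index, record in enumerate(records):
        for key in identifier_keys:
            value = record.get(key)
            if value is None:
                continue
            token = (key, value)
            if token in first_seen:
                union(index, first_seen[token])
            else:
                first_seen[token] = index

    groups = {}
    for index in range(len(records)):
        groups.setdefault(find(index), []).append(index)
    return sorted(groups.values())


records = [
    {"email": "a@x.com", "device_id": "d1"},
    {"device_id": "d1", "phone": "555"},
    {"email": "b@x.com"},
    {"phone": "555"},
]
profiles = resolve_identities(records)
```

Records 0, 1, and 3 end up in one profile because they are linked through a shared device ID and phone number, while record 2 stays separate. A near-miss such as "A@x.com" versus "a@x.com" would not match, which is the limitation of having no fuzzy matching.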

For more complicated use cases where you need to resolve anonymous identities, Simon Data integrates with an identity resolution provider called FullContact to make deterministic identity matches.

The problem with both of these features is that you don’t actually own the identity graph. It’s managed by Simon Data and is only available on the platform. With a Composable CDP, your identity graph is stored and available within your data warehouse, which means you own it from end to end and can use it for other use cases outside of marketing.

Audience Management

Simon Data’s segmentation feature allows you to define audiences using AND/OR conditions. You can create specific flows to send one-to-one personalized messages to your customers or even build more complex touchpoints in the platform’s journey builder. You can also create segments that reference other segments you’ve previously built.
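
Conceptually, an audience built from AND/OR conditions is a nested rule tree evaluated against each profile. The sketch below shows one way to express that; the rule format ("all", "any", "field", "op", "value") and the matches function are assumptions for illustration, not Simon Data's internal representation.

```python
import operator

OPS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}


def matches(profile, rule):
    """Return True if the profile satisfies a nested AND ("all") / OR ("any") rule."""
    if "all" in rule:
        return all(matches(profile, child) for child in rule["all"])
    if "any" in rule:
        return any(matches(profile, child) for child in rule["any"])
    value = profile.get(rule["field"])
    if value is None:
        return False  # a missing attribute never satisfies a condition
    return OPS[rule["op"]](value, rule["value"])


rule = {
    "all": [
        {"field": "country", "op": "eq", "value": "US"},
        {"any": [
            {"field": "total_spend", "op": "gt", "value": 100},
            {"field": "orders", "op": "gt", "value": 5},
        ]},
    ]
}
big_spender = {"country": "US", "total_spend": 150, "orders": 1}
casual = {"country": "US", "total_spend": 50, "orders": 2}
in_segment = [matches(p, rule) for p in (big_spender, casual)]
```

A segment that references another segment fits the same model: the referenced segment's rule tree simply becomes a child node.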

Overview of Simon Data's Audience Builder

One advantage here is that you can automatically enroll users into flows or journeys once they perform specific actions or meet specific audience criteria. There’s also an experimentation feature that allows you to perform multivariate testing with variants and holdout groups, as well as a metrics view where you can track campaign performance.
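
For readers unfamiliar with variants and holdout groups, here is a minimal sketch of one common way to assign users: hash the user ID into a stable bucket, reserve the first slice as a holdout that receives no campaign, and split the rest evenly across variants. The assign_group function and its parameters are hypothetical and do not describe how Simon Data implements experiments.

```python
import hashlib


def assign_group(user_id, experiment, variants, holdout_pct=0.1):
    """Deterministically map a user to "holdout" or one of the variants."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # stable value in [0, 1)
    if bucket < holdout_pct:
        return "holdout"
    # Rescale the remaining range to [0, 1) and split it evenly.
    scaled = (bucket - holdout_pct) / (1 - holdout_pct)
    index = min(int(scaled * len(variants)), len(variants) - 1)
    return variants[index]


group = assign_group("user-123", "spring_sale", ["A", "B", "C"])
```

Hashing on the experiment name plus the user ID keeps assignments stable across sends while still reshuffling users between experiments.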

The downside to these marketing features is that they’re not self-serve. In order to build audiences, you have to set up datasets (covered in the Data Modeling section above). Updating these datasets requires quite a lot of managed service support. For example, if you want to do something as simple as deleting a field, you’ll need to work with your account manager. Additionally, organization within the platform is almost solely dependent upon tags, so if you forget to tag your segments, flows, or journeys, it will be difficult to keep track of them.

Real-time Capabilities

Simon Data is relatively limited when it comes to supporting real-time use cases. Currently, the platform offers two real-time products: Audience API and Real-Time Content.

  • The Audience API allows you to programmatically pull and access customer profile data stored within Simon Data for use cases outside of the platform’s core capabilities. This feature is especially useful when you need to personalize content or serve dynamic product recommendations based on behavioral traits like total spend, location, purchase history, etc.
  • The Real-Time Content feature is an API client that allows you to pull data from the platform the moment you need it available so you can dynamically insert personalized content or offers in marketing messages across channels. This is especially helpful if you have product catalogs or promotional offers that are changing.
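
The general pattern behind both features is to look up profile data at send time and merge it into a message. The sketch below shows that pattern with Python's standard string templates; the profile dictionary stands in for a response from a profile lookup API, and the render_message helper and field names are assumptions, not Simon Data's API.

```python
from string import Template


def render_message(template_text, profile, defaults=None):
    """Fill $placeholders from profile data, falling back to defaults."""
    values = {**(defaults or {}), **profile}
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template_text).safe_substitute(values)


profile = {"first_name": "Ana"}  # stand-in for data fetched at send time
message = render_message(
    "Hi $first_name, you have $points points",
    profile,
    defaults={"points": 0},
)
```

Supplying defaults matters for real-time personalization because a profile fetched at send time may be missing attributes that the template expects.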

Other use cases like event forwarding (sending behavioral data directly to your destinations as it’s collected) and Streaming Reverse ETL (syncing data directly from your warehouse tables to your destinations at low latencies) are not supported today in the platform.

Reverse ETL

If you need to move data out of your warehouse and sync it to other operational tools via Reverse ETL, Simon Data provides some basic capabilities. The platform can sync data from Snowflake to downstream destinations. However, other data warehouses and databases are not supported natively, and the destination catalog is very limited. Data-centric product features like version control via git, support for dbt, and integrations with other modern data tooling are not available in the platform.
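
For context on what a Reverse ETL sync involves, a common general technique is to compare the current query result against the previously synced snapshot and send only the differences to the destination. The sketch below illustrates that idea with plain dictionaries keyed by primary key; it is a generic illustration under that assumption and does not describe Simon Data's or any other vendor's sync engine.

```python
def diff_rows(previous, current):
    """Compare two snapshots keyed by primary key and return what changed."""
    added = {k: v for k, v in current.items() if k not in previous}
    changed = {
        k: v for k, v in current.items()
        if k in previous and previous[k] != v
    }
    removed = [k for k in previous if k not in current]
    return added, changed, removed


previous = {"c1": {"plan": "free"}, "c2": {"plan": "pro"}}
current = {"c1": {"plan": "pro"}, "c3": {"plan": "free"}}
added, changed, removed = diff_rows(previous, current)
```

Sending only the added, changed, and removed rows keeps destination API usage proportional to what actually changed rather than to the size of the table.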

Sync performance and speed are directly dependent upon your Simon Data deployment. For example, with the managed deployment, data latency tends to be about one hour, but with the connected deployment of Simon Data, data latency is reduced to minutes.

Composable CDP solutions, which are built on top of an underlying Reverse ETL engine designed to sync data from any warehouse (not just Snowflake), are much more reliable and scalable for these types of use cases than traditional CDPs like Simon Data.

Security

From a security standpoint, Simon Data operates as a data processor and can be configured to meet GDPR, CCPA, and HIPAA requirements. The platform is also SOC 2 certified. All of your customer data is encrypted at rest and in transit, and Simon Data has features like multi-factor authentication and support for single sign-on (SSO). You can even set up SSH tunnels, rotate AWS access keys to manage access to your S3 bucket, and set up policies and procedures to manage user access based on the roles that you define.

However, unless you are a current Snowflake customer and can deploy Simon Data on top of your existing instance, your data will always be replicated and stored in Simon Data’s own managed Snowflake instance, which opens you up to some security risks. With a Composable CDP like Hightouch, you don’t have to replicate or store another copy of your data because it simply acts as an activation interface on top of your existing infrastructure.

Pros and Cons

Simon Data has a lot of neat marketing features, but the platform requires a substantial amount of legwork to get up and running, and it’s not the most appealing option for non-Snowflake customers. If you want every feature a traditional CDP has to offer, you’ll also be forced to pay for a number of add-ons. That being said, here’s a list of the biggest pros and cons to the platform:

Pros

  • Non-supported cloud sources can be ingested into Simon Data via Stitch ETL pipelines
  • Built-in email service provider (ESP) via Simon Mail
  • Some AI capabilities to build predictive data science models

Cons

  • Only integrates with Snowflake
  • Limited number of activation destinations
  • Some aspects of IDR are outsourced via FullContact

Closing Thoughts

If you’re a current Snowflake customer, Simon Data can be an appealing option to bridge the gap between your data and marketing teams. However, the platform still has a lot of the same drawbacks as traditional CDPs, such as data copy, structured models, and a lack of self-service.

Your decision to purchase a CDP should always be based on your specific requirements (e.g., your existing data assets and your current infrastructure) and your use cases. If you’re currently evaluating traditional CDPs vs. Composable CDPs, then book a demo with one of our solution engineers to learn how Hightouch can help you.
