Search documentation...

K
ChangelogBook a demoSign up

Amazon Kinesis

Ingest and process your data via a data stream

Overview

Hightouch integrates directly with Amazon Kinesis to support high-throughput, distributed, or asynchronous workloads, letting you build a custom connector to your internal systems.

Getting started

Connect to your AWS account

When setting up the Amazon Kinesis destination for the first time, you need to enter your AWS Credentials to give Hightouch access to your AWS account. Hightouch needs permission to send streaming data to Amazon Kinesis on your behalf.

Entering AWS Credentials in Hightouch

Hightouch believes in the principle of least privilege. We ask for no more permissions than necessary. For the Amazon Kinesis destination, Hightouch requires kinesis:PutRecords and kinesis:ListStreams.

When configuring your credentials in Hightouch, you have two options:

  • Setting up a cross-account role (recommended)
  • Providing an access key

Set up cross-account role

Cross-account roles are the most secure method for granting Hightouch access to Amazon Kinesis in your AWS account.

You need the Account ID and External ID in the Hightouch UI to set up a cross-account role.

Account ID and External ID in Hightouch

  1. In the AWS Console, navigate to IAMRoles.
  2. Click Create Role.
  3. Under Trusted entity type, choose AWS account.
  4. Select Another AWS account and also click Require external ID.
    AWS account configuration
  5. Copy/paste the Account ID from Hightouch into the Account ID field in AWS.
  6. Copy/paste the External ID from Hightouch into the External ID field in AWS.
  7. On the Add permissions screen, proceed to attach permissions to the role using any policy that includes at least kinesis:PutRecords and kinesis:ListStreams. Then create the role.
    AWS ARN
  8. Copy/paste the Role ARN from AWS into Hightouch. Click Create to finish setting up your cross-account role.
    Pasting the AWS ARN in Hightouch

Provide access key

If you don't want to create a cross-account role, you can create a regular IAM user and share a Access Key ID and Secret Access Key with Hightouch.

Entering Access Key ID and Secret Access Key into Hightouch

Be sure to attach the user to a permission policy that includes kinesis:PutRecords and kinesis:ListStreams for the specific resource you want to use via a Hightouch sync.

An example policy looks like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["kinesis:PutRecords", "kinesis:ListStreams"],
      "Resource": [
        "YOUR_KINESIS_ARN_1",
        "YOUR_KINESIS_ARN_2",
        "YOUR_KINESIS_ARN_3"
      ]
    }
  ]
}

Select your AWS region

In this step, you specify which region has the resources you want to sync to.

Syncing data

Once you've authenticated your AWS account in Hightouch and selected an AWS Region, you've completed setup for a Amazon Kinesis destination in your Hightouch workspace. The next step is to configure a sync that send messages whenever rows are added, changed, or removed in your model.

Configure your events trigger

Hightouch monitors your data model for added, changed, and removed rows. In this step, you specify which of these events should trigger message publishing.

Events to trigger message publishing

Choose your data stream

In this step, you choose which data stream to send messages to. Hightouch allows you to sync to existing data streams that are already in your Amazon Kinesis.

Customize your streaming data

JSON streaming data options in the Hightouch UI

In this step, you tell Hightouch how to build the JSON streaming data object using data from your model.

Configure partition key

This field is sent along with the streaming data and determines which shard in the stream the data record is assigned to. Amazon Kinesis Data Streams uses the partition key as input to a hash function that maps the partition key and associated data to a specific shard. Specifically, an MD5 hash function is used to map partition keys to 128-bit integer values and to map associated data records to shards. As a result of this hashing mechanism, all data records with the same partition key map to the same shard within the stream.

Configure partition key

This destination offers three methods of composing a JSON object:

Use JSON editor

Selecting the JSON editor method in the Hightouch UI

With the JSON editor, you can compose any JSON object using the Liquid template language. This is particularly useful for complex streaming data bodies containing nested objects and arrays, which can sometimes be difficult to model entirely in SQL.

Suppose your data model looks like this:

full_nameageemail_addressphone_number
John Doe30john@example.com+14158675309

And you want your streaming data like this:

{
  "name": "John Doe",
  "age": 30,
  "contact_info": [
    {
      "type": "email",
      "value": "john@example.com"
    },
    {
      "type": "phone",
      "value": "+14158675309"
    }
  ]
}

Your Liquid template should look like this:

{
  "name": "{{row.full_name}}",
  "age": {{row.age}},
  "contact_info": [
    {
      "type": "email",
      "value": "{{row.email_address}}"
    },
    {
      "type": "phone",
      "value": "{{row.phone_number}}"
    }
  ]
}

This makes it so you can reference any column using the syntax {{row.column_name}}. You can also use advanced Liquid features to incorporate control flow and loops into your dynamic streaming data.

When injecting strings into your JSON body, be sure to surround the Liquid tag in double quotes.

Use one column from model

Selecting using one column from the model as the JSON construction method in the Hightouch UI

If you're already storing JSON data in your source, or if you have the ability to construct a JSON object using SQL, you can select one column in your model that already contains the full streaming data.

This setting is commonly used when syncing web events that have already been collected and stored as JSON objects in your database.

Use multiple columns from model

Selecting using multiple columns from the model as the JSON construction method in the Hightouch UI

For the simplest use cases, Hightouch can construct a JSON object with key/value pairs based on multiple columns in your model.

Suppose your model looks like this:

emailfirst_namelast_name
alice.doe@example.comAliceDoe
bob.doe@example.comBobDoe
carol.doe@example.comCarolDoe

The field mapping in the screenshot above would generate the following streaming data for the first row:

{
  "customer_first_name": "Alice",
  "customer_last_name": "Doe",
  "customer_email": "alice.doe@example.com"
}

You can use the field mapper to rename fields. For example, first_name can be mapped to customer_first_name.

Configure optional message properties

Along with your row data in JSON format, you can also optionally include an explicit hash key to explicitly determine the shard the data record is assigned to.

ExplicitHashKey

This is a string field that should be a hash value and will override the partition key hash.

Configure initial sync behavior

In this step, you tell Hightouch how to handle rows present in your model results during the first sync run.

Certain workflows may require performing a backfill of all rows during the initial sync. For other use cases, you might only want to send messages in response to future data changes.

Declaring how to handle initial sync behavior in Hightouch

Tips and troubleshooting

Common errors

If you encounter an error or question not listed below and need assistance, don't hesitate to . We're here to help.

ProvisionedThroughputExceededException: Rate exceeded for shard.

This error refers to Amazon Kinesis’ quotas and limits. Manual intervention isn't required since our sync engine retries rejected rows during the following sync run.

Live debugger

Hightouch provides complete visibility into the API calls made during each of your sync runs. We recommend reading our article on debugging tips and tricks to learn more.

Sync alerts

Hightouch can alert you of sync issues via Slack, PagerDuty, SMS, or email. For details, please visit our article on alerting.

Ready to get started?

Jump right in or a book a demo. Your first destination is always free.

Book a demoSign upBook a demo

Need help?

Our team is relentlessly focused on your success. Don't hesitate to reach out!

Feature requests?

We'd love to hear your suggestions for integrations and other features.

Last updated: Apr 5, 2024

On this page

OverviewGetting startedConnect to your AWS accountSelect your AWS regionSyncing dataConfigure your events triggerChoose your data streamCustomize your streaming dataConfigure optional message propertiesConfigure initial sync behaviorTips and troubleshootingCommon errorsLive debuggerSync alerts

Was this page helpful?