How to Stream Data from SQL Server To ClickHouse

June 30, 2025
Introduction
As demand for real-time data insights grows, organizations need tools that can quickly move data from production applications into their data platforms. Streamkap is a powerful data streaming platform that enables teams to build real-time data pipelines through an intuitive interface.
This guide demonstrates how to use Streamkap to stream data from AWS RDS SQL Server to ClickHouse, two widely adopted technologies in modern data stacks. We'll walk through the configuration and setup required to build a robust, scalable pipeline that keeps your analytical environment continuously in sync with your production database. This eliminates the need for complex ETL workflows or batch processing delays.
Prerequisites
To follow along with this guide, ensure you have the following in place:
- Streamkap Account: To get started, you'll need an active Streamkap account with admin or data admin privileges. If you don't have one yet, no worries—you can sign up here or ask your admin to grant you access.
- Amazon AWS Account: To set up and manage an AWS RDS instance, you’ll need an active Amazon AWS account with core RDS permissions. These permissions are essential for creating and managing database instances, configuring networking, and ensuring proper security settings. If you don’t already have an AWS account, you can create one from here to get started.
- ClickHouse Cloud Account: Ensure you have an active ClickHouse Cloud account with admin privileges. These privileges are essential for managing databases and setting up data integrations. If you haven’t signed up yet, you can get started by creating a ClickHouse Cloud account here.
Setting up RDS SQL Server for Streamkap Compatibility
AWS RDS SQL Server is a scalable database solution designed for enterprise applications, offering strong performance and reliability trusted by organizations worldwide. Amazon has streamlined the RDS SQL Server experience, enabling users to set up an SQL Server instance from scratch in just a few minutes.
The next two sections walk through how to set up a new RDS SQL Server instance from scratch and how to prepare an existing RDS SQL Server instance for Streamkap connectivity.
Setting up a New RDS SQL Server from Scratch
Note: To set up a new AWS RDS SQL Server instance, you’ll need IAM permissions for creating and managing RDS instances, parameter groups, subnet groups, networking, security, monitoring, and optional encryption and tagging.
Step 1: Access the AWS Management Console
- Log in to your AWS Management Console and type “RDS” in the search bar.
- Click on “RDS” as shown below.

Step 2: Create a new RDS SQL Server Instance
- On the RDS dashboard, select the region where you want to create your database instance. Note that Streamkap is available in the following AWS regions:
Oregon (us-west-2)
North Virginia (us-east-1)
Europe Ireland (eu-west-1)
Asia Pacific - Sydney (ap-southeast-2)

- Navigate to the “Databases” section or select “DB Instances” under the “Resources” section.

- Click on "Create database" to start the setup process.

- In the "Choose a database creation method" section, choose "Standard create" option.
- Under the "Engine options" section, select "SQL Server" as the database engine.
- In the "Database management type" section, select "Amazon RDS" for a fully managed RDS service.

- Choose the “Standard Edition” for cost-effective, small-to-medium workloads with essential features, or opt for the “Enterprise Edition” for high scalability, advanced security, and complex workloads that require features like advanced analytics and high availability.
- Select “SQL Server version 2016 or higher” to ensure compatibility and optimal performance.
- If you are setting up for the first time, opt for a "Dev/Test environment" to ensure proper configuration.

- Use “source-database-instance” as your database identifier name to follow standard naming conventions for clarity and consistency. If you'd like something more personalized, feel free to choose a name that suits your style.
- Pick a “master username” that's easy for you to remember and aligns with your setup.
- Opt for a “Self-managed password”, enter a strong, secure password, and make sure to save it in a safe place; you’ll need it later.

- If you're just getting started, it's a good idea to gain some hands-on experience before going live. Choose "memory optimized classes (r and x classes)" and select "db.r5.large" for a cost-effective and suitable instance type for testing.
- Select "General Purpose SSD" for a balanced performance-to-cost ratio. If you have specific needs, you can choose a different storage option that best fits your requirements.

- In “Availability and Durability”, select “No” for redundancy if you don't need high availability or failover capabilities at this stage.
- Choose "Don’t connect EC2" to simplify your setup and focus only on the database instance.
- Select "Create new VPC," then set up a new subnet using the dropdown menu in the VPC Settings section for streamlined connectivity.
- Enable local access by clicking "Yes" for public access in the Connectivity section, which is ideal for easy management and access.

- Head to the VPC settings section, select "Choose existing VPC," and use the "default VPC" for a simple and hassle-free configuration.

- Uncheck “Performance Insights” to reduce complexity.
- Once you’ve reviewed all the settings and configurations, click “Create database” to initiate the creation process.

- Wait for the status to change to “Available,” as displayed below.

Step 3: Configuring Your SQL Server Instance for Streamkap Compatibility
Note: This query can be executed using a local client like DBeaver or SQL Server Management Studio.
To adhere to naming conventions, create the database with the name “source-database” using the following query:
```sql
CREATE DATABASE [source-database];
```
- Verify if the database has been created by running the query shown below. It will display the name of the database if it has been created successfully.
```sql
SELECT name FROM sys.databases WHERE name = 'source-database';
```
- To enable Change Data Capture (CDC) on the “source-database”, allowing you to track and capture changes for further analysis or replication purposes, run the following query:
```sql
EXEC msdb.dbo.rds_cdc_enable_db 'source-database';
```
- To verify whether CDC has been enabled or not, run the following query:
```sql
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'source-database';
```
- It should return an output that matches the one shown in the picture, indicating the status of CDC for the database.

Configuring an Existing RDS SQL Server for Streamkap Compatibility
If you already have an existing AWS RDS SQL Server instance, it might not be CDC-ready. To check, verify whether Change Data Capture (CDC) is enabled on the instance.
To check if CDC is enabled on your database, execute the query below:
Note: Make sure to replace “your-database-name" with the actual name of your database for all queries.
```sql
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'your-database-name';
```
If CDC is already enabled, the query returns “1” in the is_cdc_enabled column; you can skip the rest of this section and proceed with your setup.
Enable CDC on database
- To enable Change Data Capture (CDC) on your database, allowing you to track and capture changes for further analysis or replication purposes, run the following query:
```sql
EXEC msdb.dbo.rds_cdc_enable_db 'your-database-name';
```
- To verify whether CDC has been enabled or not, run the following query:
```sql
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'your-database-name';
```
- It should return an output that matches the one shown in the picture, indicating the status of CDC for the database.

Setting up ClickHouse Cloud for Streamkap Compatibility
ClickHouse is a fast and scalable columnar database management system that is widely used for real-time data analytics. It allows users to efficiently store and query large volumes of data, making it ideal for applications that require high-performance analytics. ClickHouse is known for its ability to handle complex queries with low latency, making it a popular choice for many cloud applications.
Creating a New ClickHouse Account
Now that the source database is set up and ready, the next step is to configure the destination data warehouse in ClickHouse.
As the endpoint of your Streamkap pipeline, ClickHouse will ingest real-time data. In this section, we’ll walk through setting up your ClickHouse instance to work with SQL Server on AWS RDS through Streamkap and handle incoming data streams.
- After signing up for ClickHouse and providing your personal details, enter the data warehouse details and select AWS as your provider. Select either a Production or Development deployment and click Create service.

- On the navigation menu, select Connect to obtain your connection details.

- The dialog presents all the details you need to set up your destination connection in Streamkap.

To create a database and a dedicated Streamkap user, open the SQL Console and execute queries like the following.
```sql
-- Create database
CREATE DATABASE streamkap_datawarehouse;
-- Create Streamkap role and user
CREATE ROLE STREAMKAP_ROLE;
GRANT SELECT, CREATE, SHOW, INSERT, ALTER ON streamkap_datawarehouse.* TO STREAMKAP_ROLE;
CREATE USER streamkap_user IDENTIFIED BY '{password}';
GRANT STREAMKAP_ROLE TO streamkap_user;
```
With both a source database and destination data warehouse, you now have both parts needed to set up a data stream.
Fetching credentials from existing Destination warehouse
If you already have a ClickHouse instance set up, fetch the credentials Streamkap needs to connect to it: host, port, username, and password. Without these credentials, the connection to your ClickHouse data warehouse will not be possible.
- Getting the correct credentials is a crucial step in the integration process, as Streamkap will stream real-time data updates directly into your ClickHouse instance.
- Select Connect from the ClickHouse navigation menu.

- The credentials and connection details will be visible in the dialog.

The above credentials will be used to set up the destination connector in Streamkap. You only need the hostname part of the URL, since https:// is implied and the port is specified separately.
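As a quick sanity check when filling in the destination form, here is a minimal Python sketch that strips the scheme and port from a ClickHouse endpoint URL. The endpoint below is a made-up example, not a real instance:

```python
from urllib.parse import urlparse

def clickhouse_hostname(url: str) -> str:
    """Return just the hostname from a ClickHouse endpoint URL.

    Streamkap's destination form expects the bare hostname; the
    https:// scheme is implied and the port is entered separately.
    """
    # urlparse only populates .hostname when a scheme is present
    if "://" not in url:
        url = "https://" + url
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"could not parse a hostname from {url!r}")
    return host

# Hypothetical instance name for illustration
print(clickhouse_hostname("https://abc123.us-west-2.aws.clickhouse.cloud:8443"))
# -> abc123.us-west-2.aws.clickhouse.cloud
```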
Connecting RDS SQL Server and ClickHouse using Streamkap
To connect Streamkap to SQL Server, we need to ensure the database is configured to accept traffic from Streamkap by safelisting Streamkap’s IP addresses.
Note: If your SQL Server instance accepts traffic from anywhere, you can skip ahead to the “Configuring RDS SQL Server for Streamkap Integration” section.
Safelisting Streamkap’s IP Addresses
Streamkap uses dedicated IP addresses for each region in which it operates. When signing up, Oregon (us-west-2) is set as the default region; if you require a different region, let us know. For more details about our IP addresses, please visit this link.
- To safelist one of our IPs, access your RDS SQL Server instance's security group with type “CIDR/IP - Inbound” and select “Edit inbound rules.”
- Click “Add rule,” choose “SQL Server-RDS” as the type, select “Custom” as the source, and enter any of our relevant IP addresses followed by “/32.” For example, to add an IP address for North Virginia you would enter “44.214.80.49/32” as shown in the image below.
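The “/32” suffix is CIDR notation for a network containing exactly one address, so each rule admits only that single Streamkap IP. A quick check with Python's standard library illustrates the difference:

```python
import ipaddress

# A /32 rule matches exactly one IPv4 address, so safelisting
# "44.214.80.49/32" admits only that single Streamkap IP.
rule = ipaddress.ip_network("44.214.80.49/32")
print(rule.num_addresses)                             # 1
print(ipaddress.ip_address("44.214.80.49") in rule)   # True

# By contrast, a /24 would open 256 addresses, far broader than needed.
print(ipaddress.ip_network("44.214.80.0/24").num_addresses)  # 256
```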

Configuring RDS SQL Server for Streamkap Integration
Access your RDS SQL Server database using a local client like DBeaver or SQL Server Management Studio.
- The following script creates a new schema and table and inserts a sample row, which we'll later use to verify a successful data transfer.
```sql
-- Switch to source-database
USE [source-database];
GO
-- Create the schema 'hr' if it doesn't exist
CREATE SCHEMA hr;
GO
-- Create table in 'hr' schema
CREATE TABLE hr.employees (
employee_id INT PRIMARY KEY,
first_name NVARCHAR(100) NOT NULL,
last_name NVARCHAR(100) NOT NULL,
department NVARCHAR(100),
salary DECIMAL(10, 2) NOT NULL
);
GO
-- Insert data into the table
INSERT INTO hr.employees (employee_id, first_name, last_name, department, salary)
VALUES (1, 'Paul', 'Dudley', 'FI', 85000.00);
GO
```
- The following script will create a new user with the necessary privileges for the connector.
- The new user will have the required access and permissions to interact with the database for streaming or integration tasks.
```sql
-- Create the login for streamkap_user with your specified password
CREATE LOGIN streamkap_user WITH PASSWORD = 'your_password_here';
GO
-- Create user for streamkap_user in the source-database
CREATE USER streamkap_user FOR LOGIN streamkap_user;
GO
-- Create a role named streamkap_role
CREATE ROLE streamkap_role;
GO
-- Add streamkap_user to the streamkap_role
ALTER ROLE streamkap_role ADD MEMBER streamkap_user;
GO
-- Grant SELECT permissions on the entire database to streamkap_role
GRANT SELECT ON DATABASE::[source-database] TO streamkap_role;
GO
-- Grant ALTER and SELECT permissions on the hr schema to streamkap_role
GRANT ALTER, SELECT ON SCHEMA::hr TO streamkap_role;
GO
-- Grant CREATE TABLE permissions to streamkap_role
GRANT CREATE TABLE TO streamkap_role;
GO
```
- To allow the Connector to perform "Snapshots" for data backfilling, you need to create a dedicated table for this process.
- Run the following query to create the required table:
Note: Ensure the table name is exactly "streamkap_signal", as created in the query below. Any variation will not be recognized.
```sql
-- Create the schema 'streamkap' if it doesn't exist
IF NOT EXISTS (SELECT * FROM sys.schemas WHERE name = 'streamkap')
BEGIN
EXEC('CREATE SCHEMA streamkap');
END;
-- Create the table 'streamkap_signal' within the 'streamkap' schema
CREATE TABLE streamkap.streamkap_signal (
id VARCHAR(255) PRIMARY KEY,
type VARCHAR(32) NOT NULL,
data VARCHAR(2000) NULL
);
-- Grant necessary privileges on the table to the role 'streamkap_role'
GRANT SELECT, UPDATE, INSERT ON streamkap.streamkap_signal TO streamkap_role;
```
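The signal table's three columns (id, type, data) carry on-demand snapshot requests. As a sketch of what a backfill request row might look like, here is a small Python helper; the id/type/data payload follows the Debezium signalling convention, which Streamkap's snapshot feature is assumed to build on, so confirm the exact format with Streamkap support before relying on it:

```python
import json
import uuid

def snapshot_signal_row(tables):
    """Build a row for the streamkap.streamkap_signal table that requests
    an incremental snapshot (backfill) of the given tables.

    The payload shape follows the Debezium signalling convention and is
    an assumption here, not a documented Streamkap contract.
    """
    return {
        "id": str(uuid.uuid4()),            # any unique identifier
        "type": "execute-snapshot",         # signal type for a backfill
        "data": json.dumps({
            "data-collections": tables,     # e.g. ["hr.employees"]
            "type": "incremental",
        }),
    }

row = snapshot_signal_row(["hr.employees"])
print(row["type"])  # execute-snapshot
```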
Note: Microsoft recommends storing CDC data files separately from primary database files. However, in RDS SQL Server, direct access to file paths is not available.
- To enable CDC on a table, execute the following query:
- Make sure to replace {schema} and {table} with the correct schema and table names. In our case, the schema is “hr”, and the table is “employees”.
```sql
EXEC sys.sp_cdc_enable_table
@source_schema = N'{schema}',
@source_name = N'{table}',
@role_name = N'streamkap_role',
@supports_net_changes = 0;
GO
```
- For enabling CDC on the STREAMKAP_SIGNAL table, use:
```sql
EXEC sys.sp_cdc_enable_table
@source_schema = N'streamkap',
@source_name = N'streamkap_signal',
@role_name = N'streamkap_role',
@supports_net_changes = 0;
GO
```
Adding RDS SQL Server Database as a Source Connector
Before adding SQL Server RDS as a source connector, ensure the following:
- IP Address Safelisting: Ensure that Streamkap’s IP addresses are added to the safelist, or configure your SQL Server RDS instance to allow traffic from Streamkap by default. This step is essential for establishing a secure connection between Streamkap and your RDS instance.
- SQL Configuration: Verify that you have executed all required SQL commands from the "Configuring RDS SQL Server for Streamkap Integration" section. This includes setting up users, roles, permissions, and enabling Change Data Capture (CDC) on the required tables.
Both steps are critical for establishing a seamless connection between AWS RDS SQL Server and Streamkap.
Note: If you're using SQL Server 2016 or later, you can integrate with Streamkap using CDC. For older versions, additional configuration may be required.
Step 1: Log in to Streamkap
- Sign in to Streamkap. This will direct you to your dashboard.
Note: You need admin or data admin privileges to proceed with the steps below.
Step 2: Set Up a SQL Server Source Connector
- On the left sidebar, click on “Connectors,” then select the “Sources” tab.
- Click the “+ Add” button to begin setting up your SQL Server source connector.

- Enter "SQL Server" in the search bar and select "SQL Server RDS."
- This will take you to the new connector setup page.

- You’ll be asked to provide the following details:
- Name: Choose a name for the SQL Server source in Streamkap.
- Endpoint: Copy the Endpoint from the “Connectivity & Security” tab of your SQL Server RDS instance.
- Port: The default port is 1433. If you’ve changed the port, update it accordingly.
- Signal Table Schema: Enter “streamkap” if you followed the previous step, or provide your own schema name.
- Username: Use “streamkap_user”, which was created earlier.
- Password: Enter the password for the user.
- Source Database: Use “source-database” (or specify a custom database name if necessary).

If you choose not to use an SSH Tunnel, click "Next" and proceed to step 3.
If you opt to connect to your AWS RDS SQL Server through an SSH Tunnel, ensure the following:
- Your bastion host / SSH tunnel server is connected to your SQL Server RDS instance.
- Your bastion host / SSH tunnel server allows traffic from any source or has the Streamkap IP addresses safelisted.
Log in to your Linux SSH Tunnel host and execute the commands below. If you are using a Windows server, refer to our SSH Tunnel guide for Windows.
```shell
# Create group Streamkap:
sudo groupadd Streamkap
# Create user Streamkap:
sudo useradd -m -g Streamkap Streamkap
# Switch to the Streamkap user:
sudo su - Streamkap
# Create the .ssh directory:
mkdir ~/.ssh
# Set permissions:
chmod 700 ~/.ssh
# Change to the .ssh directory:
cd ~/.ssh
# Create the authorized_keys file:
touch authorized_keys
# Set permissions:
chmod 600 authorized_keys
```
- After running the commands above, fetch the “SSH Public Key” from Streamkap’s SQL Server source as shown below.

- Replace <key> with the SSH Public Key copied earlier, then run the commands below.
```shell
# Using the key given to you by Streamkap, add it to the authorized_keys file:
echo "<key>" >> ~/.ssh/authorized_keys
# Open the SSH daemon configuration to allow port forwarding:
sudo nano /etc/ssh/sshd_config
# In sshd_config, ensure this line is present:
# AllowTcpForwarding yes
# Restart the SSH service:
sudo systemctl restart sshd
```
After the SSH service restarts, enter the public IP address of your SSH tunnel server as the SSH Host in Streamkap and click Next. You’ll then be directed to the schema and table selection page, like the one shown below.
Step 3: Plug in Schema and Table Names
If you've followed the guide, your SQL Server should now have an "employees" table within the "hr" schema.
- If you're using a different schema, plug in your own details.
- Once you've updated them, click Save to proceed.

- Once all the steps are completed correctly, you should be able to successfully create a SQL Server source connector as shown below.

In case you encounter any fatal errors, they could be caused by one of the following:
- Our IP addresses were not safelisted.
- Incorrect credentials.
- Incorrect database, schema, or table name.
- Insufficient permissions to query the table.
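To rule out the first cause before digging into credentials, a small pre-flight TCP check can tell you whether the instance is reachable at all. Note that this tests reachability from wherever the script runs, not from Streamkap's IPs, and the endpoint shown is hypothetical:

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    A quick pre-flight check: if this fails from a safelisted network,
    the connector's fatal error is more likely an IP-safelisting or
    security-group issue than bad credentials.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical endpoint; substitute your RDS instance's endpoint
print(tcp_reachable("source-database-instance.example.us-east-1.rds.amazonaws.com",
                    1433, timeout=1.0))
```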
Adding a Destination Connector
After you’ve set up the source database connector, next you need to set up the destination connector in Streamkap. This connector links Streamkap to your ClickHouse data warehouse and writes data from your SQL Server database to the correct tables in ClickHouse.
The destination connector is a key part of the puzzle that allows Streamkap to relay changes to your data warehouse in real time and maintain data integrity. While this sets up the connection to the destination, the full pipeline is not complete yet.
In this section we’ll walk you through setting up the destination connector and bring you one step closer to your real-time data pipeline.
- In the Connectors section of Streamkap, click on the Destinations tab.

- Click the Add button.

- Select ClickHouse from the list of available connectors.

- Enter your ClickHouse connection details and click Save.

Adding a Pipeline
- Navigate to “Pipelines” on the left side and then press “+Create” to create a pipeline between source and destination.

- Select source and destination and click on “Next” on the bottom right corner.

- Select all the schemas or the specific schema you wish to transfer.

- Type in the pipeline name and click “Save”.

- Once the pipeline is successfully created, it should appear like the screenshot below, with its status displayed as “Active.”

- Go to the ClickHouse Cloud console and navigate to “SQL Console”. Select “Tables” and select the “employees” table to preview the data in the table.

By design, your table in the destination will have the following meta columns in addition to the regular columns.
If you're looking to modify the schema of the table, follow SQL Server's schema evolution process as outlined below, or choose to skip it.
SQL Server Schema Evolution
Structural changes to a source table in SQL Server, such as adding or removing columns, do not automatically reflect within ClickHouse. To ensure that your ClickHouse tables reflect the updated schema, you'll need to follow a specific process. Here's a step-by-step guide to handle schema evolution.
Step 1: Modify the Source Table
- First, make the desired changes to your source table in SQL Server. For example, to delete the last_name column from the hr.employees table:
```sql
--delete last_name column from hr.employees table
ALTER TABLE hr.employees DROP COLUMN last_name;
```
Step 2: Refresh the Change Data Capture (CDC) Table
- Refresh the CDC table by running the following script. Replace the placeholders with relevant values.
Note: SQL Server allows at most two change tables (capture instances) per source table.
```sql
-- Replace {database}, {schema} and {table} placeholders
-- {database}: name of the CDC enabled database
-- {schema}: name of the schema with tables to refresh
-- {table}: name of the table to refresh
-- {refresh_table}: a unique name for the refreshed change table
USE {database};
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N'{schema}',
@source_name = N'{table}',
@role_name = N'streamkap_role',
@supports_net_changes = 0,
@capture_instance = N'{refresh_table}'
GO
```
Step 3: Verify Changes in ClickHouse
- Once you've refreshed the CDC table and confirmed with Streamkap Support that your SQL Server Source has started streaming from the refreshed change table, verify the changes in ClickHouse.
Step 4: Disable CDC on the Outdated Change Table
- Once the change has been confirmed on the new capture instance, disable CDC on the outdated change table using the following script.
```sql
-- Replace {database}, {schema}, {table} and {refresh_table} placeholders
-- {database}: name of the CDC enabled database
-- {schema}: name of the schema with the table refreshed earlier
-- {table}: name of the table refreshed earlier
-- {refresh_table}: name of the previous refresh table, usually {schema}_{table}
USE {database};
GO
EXEC sys.sp_cdc_disable_table
@source_schema = N'{schema}',
@source_name = N'{table}',
@capture_instance = N'{refresh_table}'
GO
```
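Steps 2 and 4 work as a pair: enable a fresh capture instance, then retire the old one once streaming is confirmed. The sketch below renders both statements from the placeholders; the capture-instance names are illustrative, not prescribed by Streamkap:

```python
ENABLE_TEMPLATE = """\
EXEC sys.sp_cdc_enable_table
@source_schema = N'{schema}',
@source_name = N'{table}',
@role_name = N'streamkap_role',
@supports_net_changes = 0,
@capture_instance = N'{capture}';
"""

DISABLE_TEMPLATE = """\
EXEC sys.sp_cdc_disable_table
@source_schema = N'{schema}',
@source_name = N'{table}',
@capture_instance = N'{capture}';
"""

def refresh_statements(schema, table, new_suffix="v2"):
    """Render the enable/disable pair for a CDC capture-instance refresh.

    SQL Server allows at most two change tables per source table, so the
    old instance (default name '{schema}_{table}') must be disabled once
    the new one is confirmed streaming. Identifiers are interpolated
    directly, so only pass trusted schema/table names.
    """
    old_capture = f"{schema}_{table}"                # default capture-instance name
    new_capture = f"{schema}_{table}_{new_suffix}"   # any unique name works
    enable = ENABLE_TEMPLATE.format(schema=schema, table=table, capture=new_capture)
    disable = DISABLE_TEMPLATE.format(schema=schema, table=table, capture=old_capture)
    return enable, disable

enable_sql, disable_sql = refresh_statements("hr", "employees")
print(enable_sql)
print(disable_sql)
```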
By following these steps, you can ensure that schema changes in your SQL Server source tables are properly reflected in your ClickHouse environment, maintaining data consistency and integrity across your systems.
