
Sink data from RisingWave to Cassandra or ScyllaDB

You can sink data from RisingWave to Cassandra. Because ScyllaDB works as a drop-in replacement for Cassandra, you can sink data from RisingWave to ScyllaDB in the same way.

This guide describes how to sink data from RisingWave to Cassandra or ScyllaDB using the Cassandra sink connector in RisingWave.

Beta Feature

The Cassandra sink connector in RisingWave is currently in Beta. Please contact us if you encounter any issues or have feedback.

Prerequisites

  • Ensure your Cassandra or ScyllaDB cluster is accessible from RisingWave.

  • If you are running RisingWave locally from binaries and intend to use the native CDC source connectors or the JDBC sink connector, make sure that JDK 11 or later is installed in your environment.

Syntax

To sink data to Cassandra or ScyllaDB, create a Cassandra sink in RisingWave using the syntax below:

CREATE SINK [ IF NOT EXISTS ] sink_name
[FROM sink_from | AS select_query]
WITH (
   connector = 'cassandra',
   type = '<type>',
   cassandra.url = '<node1>,<node2>,<node3>',
   cassandra.keyspace = '<keyspace>',
   cassandra.table = '<cassandra_table>',
   cassandra.datacenter = '<data_center>'
);

Once the sink is created, data changes will be streamed to the specified table.
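
To make the template above more concrete, here is a minimal sketch of an upsert sink. The names user_totals_mv, user_totals_sink, demo_ks, user_totals, and datacenter1, as well as the address 127.0.0.1:9042, are hypothetical placeholders. The sketch assumes that the target table already exists in Cassandra, that its primary key is user_id, and that the node is reachable on the default CQL port 9042.

```sql
-- Sketch only: every identifier and address here is a placeholder.
-- Assumes a materialized view user_totals_mv exists in RisingWave and
-- a table demo_ks.user_totals already exists in Cassandra or ScyllaDB.
CREATE SINK IF NOT EXISTS user_totals_sink
FROM user_totals_mv
WITH (
   connector = 'cassandra',
   type = 'upsert',
   primary_key = 'user_id',              -- needed because type is upsert
   cassandra.url = '127.0.0.1:9042',     -- assumed host:port of one node
   cassandra.keyspace = 'demo_ks',
   cassandra.table = 'user_totals',
   cassandra.datacenter = 'datacenter1'  -- assumed data center name
);
```

The primary_key and the other options used here are described under Parameters.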

Parameters

| Parameter name | Description |
| --- | --- |
| sink_name | Name of the sink to be created. |
| sink_from | A clause that specifies the direct source from which data will be output. sink_from can be a materialized view or a table. Either this clause or select_query must be specified. |
| AS select_query | A SELECT query that specifies the data to be output to the sink. Either this query or a sink_from clause must be specified. See SELECT for the syntax and examples of the SELECT command. |
| type | Required. Specifies whether the sink is upsert or append-only. If you create an upsert sink, you must specify a primary key. |
| primary_key | Optional for append-only sinks; needed for upsert sinks (see type). A comma-separated string of column names that specifies the primary key of the Cassandra sink. |
| force_append_only | If true, forces the sink to be append-only, even if the upstream data is not append-only. |
| cassandra.url | Required. The URL or IP address of the Cassandra or ScyllaDB cluster or node you want to connect to. |
| cassandra.keyspace | Required. The name of the keyspace in Cassandra or ScyllaDB where you want to store the data. A keyspace is a logical container for organizing data in Cassandra. |
| cassandra.table | Required. The name of the table in the specified keyspace where you want to insert or update the data. |
| cassandra.datacenter | Optional. If you are working with a multi-data center Cassandra setup, you may need to specify the name of the target data center where the data should be written. |
| cassandra.max_batch_rows | Optional. The number of rows sent in each batch. The value must be between 1 and 65535. The default value is 512. |
| cassandra.request_timeout_ms | Optional. The waiting time for each batch, in milliseconds. The default value is 2000. Reduce the batch size first before changing the waiting time. |

Note

The Cassandra sink in RisingWave provides at-least-once delivery semantics. Events may be redelivered in case of failures. We recommend using the upsert sink type to avoid duplicates.
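
As a rough illustration of the optional parameters, the sketch below creates an append-only sink and lowers the batch size while leaving the request timeout at its default, following the advice to reduce batch size first. The names events, events_sink, demo_ks, and events_log, and the address 127.0.0.1:9042, are hypothetical. Passing the option values as quoted strings is an assumption of this sketch.

```sql
-- Sketch only: every identifier and address here is a placeholder.
-- Assumes a RisingWave table named events and an existing Cassandra
-- table demo_ks.events_log with matching columns.
CREATE SINK IF NOT EXISTS events_sink
FROM events
WITH (
   connector = 'cassandra',
   type = 'append-only',
   force_append_only = 'true',               -- events may not be append-only
   cassandra.url = '127.0.0.1:9042',         -- assumed host:port of one node
   cassandra.keyspace = 'demo_ks',
   cassandra.table = 'events_log',
   cassandra.max_batch_rows = '256',         -- below the default of 512
   cassandra.request_timeout_ms = '2000'     -- default, shown for clarity
);
```

Because delivery is at-least-once, an append-only sink like this one can write the same event more than once after a failure. An upsert sink with a primary key avoids that.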

Data type mapping - RisingWave and Cassandra

| RisingWave data type | Cassandra data type |
| --- | --- |
| boolean | boolean |
| smallint | smallint |
| integer | int |
| bigint | bigint |
| numeric | decimal |
| real | float |
| double precision | double |
| character varying (varchar) | text |
| bytea | blob |
| date | date |
| time without time zone | time |
| timestamp without time zone | Unsupported. Convert timestamp to timestamptz in RisingWave before sinking. |
| timestamp with time zone | timestamp |
| interval | duration |
| struct | Unsupported |
| array | Unsupported |
| jsonb | Unsupported |
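
Since timestamp without time zone cannot be sunk directly, one way to handle it is to convert the column in the sink's SELECT query. The sketch below is an assumption-laden example: the table orders, its columns order_id and created_at, the sink name, the keyspace and table names, and the address are all hypothetical. It also assumes the stored timestamps should be interpreted as UTC.

```sql
-- Sketch only: every identifier and address here is a placeholder.
-- Assumes orders.created_at is a timestamp without time zone holding
-- UTC wall-clock values; AT TIME ZONE turns it into a timestamptz.
CREATE SINK IF NOT EXISTS orders_sink AS
SELECT
   order_id,
   created_at AT TIME ZONE 'UTC' AS created_at
FROM orders
WITH (
   connector = 'cassandra',
   type = 'upsert',
   primary_key = 'order_id',
   cassandra.url = '127.0.0.1:9042',     -- assumed host:port of one node
   cassandra.keyspace = 'demo_ks',
   cassandra.table = 'orders'
);
```

If the stored values represent a different local time, replace 'UTC' with the appropriate time zone name so the resulting instants are correct.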
