Getting "error 261: DML count in transaction exceeds config parameter" on using Kafka connect JDBC sink to write data

ruturaj · October 2, 2021, 11:36am

We have data in Kafka that we want to sink to ronDB.

we use Kafka JBDC sink heavily in out stack and thought to use the same here as well.

On setting up a Kafka connect jdbc sink with a batch size of 5000 we started seeing errors while writing data:

Caused by: java.sql.SQLException: java.sql.BatchUpdateException: Got error 261 'DML count in transaction exceeds config parameter MaxDMLOperationsPerTransaction/MaxNoOfConcurrentOp' from NDBCLUSTER

is there something we are missing here or some property we need to set or change…?

cc: @mikaelronstrom

mikaelronstrom · October 2, 2021, 11:52am

In RonDB 21.04 there is a maximum limit to the number of concurrently active operations
in a transaction thread. In RonDB 21.10 this will be replaced by a maximum limit on the size
of one transaction (MaxDMLOperationsPerTransaction).

This limit is set in RonDB 21.04 by setting MaxNoOfConcurrentOperations.

Another things to consider is that a batch size of 5000 is probably a bit too extensive.
My experience is that batch sizes between 100 and 500 are optimal a bit dependent on
the cluster setup.

So my advice is to decrease batch size and increase the setting of MaxNoOfConcurrentOperations.

You can change this setting using a rolling restart procedure if you need the cluster to stay
up during the change.

The reason for this max limit is to ensure that we can take over a transaction by another
transaction thread if the transaction coordinator fails.

mikaelronstrom · October 2, 2021, 12:01pm

The default setting for MaxNoOfConcurrentOperations is 32768. There is no specific limit to
its size, but one will need about 500 bytes per operation of memory to handle it.

ruturaj · October 2, 2021, 12:28pm

cool will take a look at your suggestions and get back

ruturaj · October 2, 2021, 1:16pm

@mikaelronstrom so is this property set at a global level? or at a table level?

as we will have multiple people/application writing(doing upsert operations) to multiple tables so wanted to understand the limitation here

mikaelronstrom · October 2, 2021, 2:40pm

Global level, it is set in the config.ini for the entire cluster.
It should have the same value on all nodes in the cluster.
Thus set in the section
[NDBD DEFAULT]
in the configuration file for the MGM server.

I looked into more details now, if you set it to e.g. 1M which is
a very high number and should suffice for most use cases, the transaction
thread will reserve around 160 MByte of memory space to handle the
node failure handling. The default setting 32k uses only 5 MByte for this
purpose, so the impact of increasing should be very small.

Having more than 1M concurrent operations is usually not a good idea
anyways, so setting it beyond that I don’t think is a good idea.

ruturaj · October 2, 2021, 7:00pm

thanks again @mikaelronstrom

reducing the batch size and increasing the MaxNoOfConcurrentOperations got rid of the error