Queries about RonDB

I have the following queries about RonDB:

  1. Consider a column in a table of type varchar(255) or varbinary(255). If we store less data, say 50 characters or 50 bytes, does the column still occupy the full 255 characters or 255 bytes?

  2. Is there a TTL-like capability in RonDB? What I’m looking for is some kind of table-level TTL, so that any row inserted or updated more than x seconds ago expires.

  3. Is there any approach to storing data in RonDB such that it is served from memory but also persisted on disk? That way, even if all of the data nodes go down, data can be restored from the last persisted state on disk.

  4. We experimented with the backup and restore commands in RonDB. We had a 4-data-node setup with 256 GB RAM and replication factor 2. On issuing the backup command, each node produced backup files of ~50 GB, and the process took around 2.5 hours. While performing the restore, we observed that the time taken for the data restore (ndb_restore --restore-data) varied across data nodes (1 node took 6 hours, 2 nodes took 9 hours, 1 node took close to 12 hours). Going through the documentation, we came across the MinDiskWriteSpeed and MaxDiskWriteSpeed parameters, which default to 2 MBps and 20 MBps. If we increase these disk speeds by 5x or 10x, what kind of impact would there be? Will it improve backup and restore times?

  1. 1 byte to store the length of the varchar (2 bytes if longer than 255 characters) + the actual length of the string. Each row also has a PK overhead, and data is word aligned; see the worked example after this list.
  2. Not yet. Coming to the Hopsworks feature store soon.
  3. Even in-memory tables are recovered if there is a cluster failure. Transaction redo and undo logs are synced to disk, and data nodes recover from these logs. You can also have columns stored on disk, and there is a disk page cache, so if your working set fits in memory, you will get close to in-memory read latencies.
  4. You can increase those values, but there is more at play here. You can just shut your cluster down and restart it, which will be much faster than backup and restore. On restart it first loads the database from the latest snapshot, then applies the redo/undo logs.
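A short worked example of the arithmetic in answer 1, assuming the alignment unit is a 4-byte word (treat that as an assumption to verify, not a documented spec):

    varchar(255) storing 50 characters:
      length prefix:  1 byte   (1 byte because the declared length is <= 255)
      string data:   50 bytes
      subtotal:      51 bytes
      word aligned:  52 bytes  (rounded up to the next multiple of 4)

So a mostly-empty varchar(255) costs close to its actual content size rather than 255 bytes, plus the per-row PK and row overhead mentioned above.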

Mikael may have more to add.

Addendum to the answer to Question 4:
MinDiskWriteSpeed and MaxDiskWriteSpeed govern the speed of backups, so increasing them decreases the time it takes to create a backup. Handling 100 MB of backup data takes about 1 second of CPU time. So with a setup with 16 CPUs we have about 8 CPUs handling the data; consuming 10-20% of this for backups should be OK in most cases.
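These parameters belong in the [ndbd default] section of the cluster configuration file; as a sketch, the values below are simply 5x the defaults cited in the question, for illustration rather than as a recommendation:

    [ndbd default]
    MinDiskWriteSpeed=10M
    MaxDiskWriteSpeed=100M

As a back-of-the-envelope check on the numbers in the question: ~50 GB per node in 2.5 hours is an average of roughly 5-6 MBps per node, while the 100 MB per CPU-second figure implies only about 500 CPU-seconds of work, which suggests the backup was throttled by the disk write speed settings rather than by CPU.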

To speed up the restore, there are parameters one can use with ndb_restore.
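One such parameter is --parallelism, which controls how many transactions ndb_restore runs in parallel (RonDB inherits this tool from MySQL NDB Cluster). A hedged sketch; the node id, backup id, parallelism value, and path are placeholders:

    ndb_restore --nodeid=1 --backupid=1 --restore-data \
                --parallelism=512 \
                --backup-path=/var/lib/rondb/BACKUP/BACKUP-1

It should also be possible to run one ndb_restore instance per backup node id concurrently; verify both options against the RonDB documentation for your version.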

The above answers have been helpful. Earlier I was under the impression that if all data nodes go down, the in-memory data is lost. That understanding now stands corrected.

I have a couple of new queries:

  1. The value of PartitionsPerNode is 2 by default. If we increase it to 4 or 8, will it improve write throughput?
  2. What is the default value of TransactionMemory in RonDB? Is it configured as part of the AutomaticMemoryConfig setting based on available RAM?

Answer on PartitionsPerNode:
Increasing PartitionsPerNode above 2 means you get more partitions that can update the table, and thus write throughput to a single table can increase. Obviously, for it to increase, the number of LDM threads must be higher than the number of partitions. So in a 2-replica, 2-node cluster you need more than 4 LDM threads to see any advantage from increasing PartitionsPerNode.
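As a sketch of where this would be set, assuming the standard cluster configuration file format (the value 4 is illustrative):

    [ndbd default]
    # Only pays off when the LDM thread count exceeds the number of
    # partitions a hot table already has (see above).
    PartitionsPerNode=4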

Thus it is mainly of interest to increase PartitionsPerNode if you have large data nodes combined
with a single table that receives most of the writes.

In managed RonDB the standard is 6 partitions per table per node group, since we always
create 3 nodes per node group even if the user starts out with only one or two data nodes.

Answer on TransactionMemory:
The value of TransactionMemory is calculated and reported in the node log as part of
AutomaticMemoryConfig.
However, TransactionMemory can also use memory from SharedGlobalMemory, so
it can use most of the memory in the data node except the memory assigned to
DataMemory and DiskPageBufferMemory (the disk page cache).
The default of TransactionMemory is 0, which means it is calculated. It can be set
by the user, in which case the configured value is used instead of the calculation. Even
when set, TransactionMemory can still use memory from SharedGlobalMemory. Thus the
TransactionMemory setting is not a hard limit on the amount of memory available for
operations and transactions.

It is correct that AutomaticMemoryConfig uses the available memory in the VM/machine,
unless TotalMemoryConfig is set, in which case that is used as the total memory size.
With AutomaticMemoryConfig=1 (the default) there is no need to set any memory
configuration parameter.
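To tie the memory discussion together, a minimal configuration sketch using the parameter names discussed above (the sizes are placeholders, not recommendations):

    [ndbd default]
    # Default behaviour: size memory pools from what the VM/machine has.
    AutomaticMemoryConfig=1

    # Optional cap on the total that the automatic calculation works from:
    # TotalMemoryConfig=64G

    # Optional explicit setting; as noted above, this is not a hard limit,
    # since TransactionMemory can still borrow from SharedGlobalMemory.
    # TransactionMemory=8G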