Skip to main content

Upserts with version awareness in a relational database - Part Two: MariaDB / MySQL

Insert, or Conditional Update 

I started off trying out MariaDB as a comparison to how PostgreSQL supports upserts with version awareness.  MariaDB started off as a fork of MySQL, so it makes sense that the same approach is also applicable to MySQL. 

From part one of this blog post series, we have a simple table definition:

CREATE TABLE event (id UUID PRIMARY KEY, 
name varchar(255) NOT NULL,
version bigint NOT NULL DEFAULT 0)

Unlike PostgreSQL, we don't have:

ON CONFLICT

as a clause to fall back from insert to update.

Instead there is:

ON DUPLICATE

that can be applied as follows (once again, the syntax is JDBC, where each '?' character represents a value that is passed in at runtime):

INSERT INTO event (id, name, version) VALUES (?, ?, ?)
ON DUPLICATE KEY UPDATE
name = (IF(version < VALUES(version), VALUES(name), name)),
version = (IF(version < VALUES(version), VALUES(version), version))

Stylistically it doesn't seem like a scalable approach that I would want to apply for a table that has more attributes, as we have to specify the version clause on each individual property. I would prefer it if we could specify a where clause to guard the update, but that is not available. 

An anomaly in rows affected? 

Normally when we use executeUpdate on a JDBC Statemenbt we receive back a response that is the row count of the number of rows that were added or altered by the statement. In this situation there is different behaviour, as documented on the MariaDB website. Basically an insert is represented by the value 1, and an update is represented by the value 2 (Sidenote: this is something ChatGPT does not seem to be aware of, as it takes a literal approach to the actual number of rows affected).

I thought that the rows affected values would be useful for being able to keep track of the relative frequencies of inserts versus updates, until I noticed something odd when running some tests...

It seems that when the update is only setting the existing values then the rows affected value that will be returned is 1, making that situation indistinguishable from the insert.

Update

Based on this documentation it seems that we have an explanation for why the update can return rows affected as being 1, "existing row is set to its current values, and the CLIENT_FOUND_ROWS is set".

If I change my JDBC Connection setup to specify the non-default value for useAffectedRows then we get a 1 when an insert occurs, a 2 when an update changes the existing record, and a 0 when the update is setting the values to be the same as the existing state.

Properties properties = new Properties();
properties.put("user", DB_USER);
properties.put("password", PASSWORD);

properties.put("useAffectedRows", true);

Connection connection = DriverManager.getConnection(jdbcUrl, properties);

Disclaimer

The approach shared in this post is not something that I am offering up as a recommended or optimal way to get conditional logic applied when upserting into MariaDB or MySQL.

It's more a case of me thinking aloud, because thinking is allowed.  

Comments

Popular posts from this blog

Having a go at learning some Kotlin

What's this about?  The year 2025 is almost over, so that means that it has been a bit over a decade since my old colleague Filippo gave a presentation to the development team of ScienceDirect covering the merits of the Kotlin programming language. So, it's about time that I had a proper go at using it. This blog post is intended to trace what the experience has been like, covering surprises that I encounter along the way. Getting started The programming language that I am most experienced with is Java, so I have chosen to try out implementing some functionality in Kotlin from a recent hobby project that I developed in Java involving spinning up a database in a Docker container and running some queries. JVM version support IntelliJ IDEA includes some automation for creating a new project, so I selected the relevant options to use the latest LTS version of the Java virtual machine with Spring Boot, Kotlin, Postgresql and Test containers. After a few seconds I had a new project i...

The Importance of Segmenting Infrastructure

Kafka for Logging I was recently poking around in the source code of a few technologies that I have been using for a few years when I came across KafkaLog4jAppender. It enables you to use Kafka as a place to capture application logs. The thing that caught my eye was the latest commit associated with that particular class, "KafkaLog4jAppender deadlocks when idempotence is enabled" . In the context of Kafka, idempotence is intended to enable the system to avoid producing duplicate records when a producer may need to retry sending events due to some - hopefully - intermittent connectivity problem between the producer and the receiving broker. The unfortunate situation that arises here is that the Kafka client code itself uses Log4j, so it can result in the application being blocked from sending its logs via a Kafka topic because the Kafka client Producer gets deadlocked waiting on transaction state. Kafka For Metrics - But Not For Kafka Metrics This reminded me of a similar scen...

2022 - A year in review

Just a look back over the last 12 months. January I moved back to Christchurch to live, after having spent a few months further south since moving back from London. Work was mainly around balancing other peoples' understanding and expectations around our use of Kafka. February I decided that it would be worthwhile to have a year's subscription for streaming Sky Sports, as some rugby matches that I would want to watch would be on at time when venues wouldn't be open. Having moved to Christchurch to be close to an office, now found myself working from home as Covid restrictions came back into effect across New Zealand. March Got back into some actual coding at work - as opposed to mainly reviewing pull requests for configuration changes for Kafka topics.  This became urgent, as the command line interface tool that our provisioning system was dependent on had been marked for deprecation. April   Had my first direct experience with Covid-19.  I only went for a test because ...