Skip to main content

Posts

Showing posts from June, 2022

Speeding up Software Builds for Continuous Integration

Downloading the Internet Can you remember the last time you started out on a clean development environment and ran the build of some software using Maven or Gradle for dependency management? It takes ages to download all of the necessary third party libraries from one or more remote repositories, leading to expression like, "Just waiting for Maven to download the Internet". Once your development environment has been used for building a few projects the range of dependencies that will need to be downloaded for other builds reduces down as the previously referenced onces will now be cached and found locally on your computer's hard drive. What happens on the Continuous Integration environment? Now consider what goes on when Jenkins or your other preferred Continuous Integration server comes to build your software. If it doesn't have a local copy of the libraries that have been referenced then it is going to pay the cost of that slow " download the Internet" p...

Docker SBOM - Software Bill Of Materials

In an earlier post on this blog I was curious about comparing Docker images to try to track down the differences that might be causing performance problems. Since then I have had a play with the sbom Docker command for listing out what is included in the image. Following the documentation at: https://docs.docker.com/engine/sbom/ Below is an example of the output of a run of a locally built app: > docker sbom hello-world-alpine-jlink:latest   Syft v0.43.0  ✔ Loaded image              ✔ Parsed image              ✔ Cataloged packages      [16 packages] NAME                    VERSION       TYPE          alpine-baselayout       3.2.0-r20 ...

The Importance of Segmenting Infrastructure

Kafka for Logging I was recently poking around in the source code of a few technologies that I have been using for a few years when I came across KafkaLog4jAppender. It enables you to use Kafka as a place to capture application logs. The thing that caught my eye was the latest commit associated with that particular class, "KafkaLog4jAppender deadlocks when idempotence is enabled" . In the context of Kafka, idempotence is intended to enable the system to avoid producing duplicate records when a producer may need to retry sending events due to some - hopefully - intermittent connectivity problem between the producer and the receiving broker. The unfortunate situation that arises here is that the Kafka client code itself uses Log4j, so it can result in the application being blocked from sending its logs via a Kafka topic because the Kafka client Producer gets deadlocked waiting on transaction state. Kafka For Metrics - But Not For Kafka Metrics This reminded me of a similar scen...

Docker Images - Size matters, But So Does Performance

Introduction I recently went through the exercise of re-building a Docker image based on what was supposed to be a stable, well-known application codebase. Along the way I observed an unexpected performance issue. The application contained within the Docker image was just a Java command line utility for parsing some yaml files to provision kafka resources on our hosted development clusters. The code had not been changed for several months, so this was supposed to just be a matter of setting up a local copy of the Docker image instead of pulling down a trusted third party's image from Dockerhub. The application was bundled within a Docker contrainer whose Dockerfile was alongside the code, so it should have been a simple matter of using that to produce the image and pushing it to our own repo, and then pulling that down for our runtime use. It's the same, so why's it different? We had been running with the existing third party Docker image for several months, so there was a ...