Abdelghani Roussi's Blog

Optimizing the size of Java-based Docker images

If you are a Java developer who uses Docker to package applications, you may have noticed that the final image can be quite large, even for a "hello world" kind of project. In this article we will cover some tips for reducing the Docker image size of Java applications.

We will use the same Spring web application that we built in the previous article, "Error handling in Spring web using RFC-9457 specification", to demonstrate the tips. Our application contains only 2 endpoints:

  • GET /api/users/{id}: to get a user by id
  • POST /api/users: to create a new user
UserController.java
@RestController
@RequestMapping("/api/users")
@RequiredArgsConstructor
public class UserController {

    private final UserService userService;

    @GetMapping("{id}")
    public User getUser(@PathVariable Long id){
        return userService.getUserById(id)
                .orElseThrow(() -> new UserNotFoundException(id, "/api/users"));
    }

    @PostMapping
    public User createUser(@Valid @RequestBody User user) {
        return userService.createUser(user);
    }
}

Not much, right? But as you will see, even the simplest Docker image for it (without any optimization) can be quite large.

The source code for this article can be found on GitHub.

Why should we care about the image size?

The image size can have a significant impact on your productivity, either as a developer or as an organization. Especially when you are working on large projects with multiple services, oversized images add up and can cost you a lot of money and time.

Some of the reasons why you should avoid large images are:

  • Disk space: Large images waste disk space in your Docker registry and on your production servers.
  • Slower builds: The larger the image, the longer it takes to build and push it.
  • Security: The larger the image, the more dependencies it contains and the larger its attack surface.
  • Bandwidth: The larger the image, the more bandwidth you consume when pulling it from and pushing it to the registry.

Using a straightforward Dockerfile

Base image matters ✌🏽 : Choosing the right base image

Before you even start thinking about optimization, you SHOULD ALWAYS be careful about the base image that you use to package your application. The base image you choose can have a significant impact on the size of the final image (as you will see below).

There are several base images that you can use to package your java application, some of them are:

  • JDK Alpine base images: These images are quite small, but they are not suitable for all applications: Alpine uses musl instead of glibc, so you may face compatibility issues with some libraries.
  • JDK Slim base images: These images are based on Debian or Ubuntu. They are smaller than the full JDK images, but they are still quite large.
  • JDK full base images: These images are quite large; they contain all the modules and dependencies that could be needed to run an application.

To give you an idea of the size of the base images, here is a comparison between the openjdk:17-jdk-slim (slim) and eclipse-temurin:17-jdk-alpine (alpine) images:

Keep in mind that the size of the application artifact (jar) is ~20MB.

To package our artifact in a Docker image, we need to define a Dockerfile in our application root directory as follows:

Using openjdk:17-jdk-slim as base image.

Dockerfile.base-openjdk
FROM openjdk:17-jdk-slim

# Set the working directory in the container
WORKDIR /app

# Create user
RUN addgroup --system spring && adduser --system spring --ingroup spring

# Change to user
USER spring:spring

COPY target/*.jar app.jar

EXPOSE 8080

CMD ["java", "-jar", "app.jar"]

After defining the Dockerfile, we can build the image using the following command:


docker build -t user-service -f Dockerfile.base-openjdk .

After this you should have a Docker image named user-service. As you can see, the image is quite large compared to the application artifact: about 674MB 🫣

Wait, what 🤯 !! This is only a small project with 2 endpoints and hardly any dependencies, so what about an application with dozens of dependencies and files !!
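To see where those megabytes come from, it can help to look at the image layer by layer. The helper below is only a sketch: the function name inspect_image_size is made up for this article, and it assumes the image was tagged user-service as in the build command above.

```shell
# Print the total size of an image, then the size of each layer,
# so you can see which instruction contributes the most.
# inspect_image_size is a hypothetical helper name; $1 is the image tag
# (defaults to the user-service tag used in this article).
inspect_image_size() {
  image="${1:-user-service}"
  # One line per matching tag: repository:tag and total size
  docker images --format '{{.Repository}}:{{.Tag}}  {{.Size}}' "$image"
  # One line per layer: layer size and the instruction that created it
  docker history --format '{{.Size}}  {{.CreatedBy}}' "$image"
}
```

Calling inspect_image_size user-service should show that most of the size comes from the base image layers rather than from the COPY of the jar.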

Using eclipse-temurin:17-jdk-alpine as base image.

Dockerfile.base-temurin
FROM eclipse-temurin:17-jdk-alpine

ARG APPLICATION_USER=spring
# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER && adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER

# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app

# Set the user to run the application
USER $APPLICATION_USER

# Copy the jar file to the container
COPY --chown=$APPLICATION_USER:$APPLICATION_USER target/*.jar /app/app.jar

# Set the working directory
WORKDIR /app

# Expose the port
EXPOSE 8080

# Run the application
ENTRYPOINT ["java", "-jar", "/app/app.jar"]

We can build the image using the following command:

docker build -t user-service:alpine -f Dockerfile.base-temurin . --platform=linux/amd64

🚨 Side note

Important note: If you are using a Mac with Apple Silicon, you may face the following issue while building the image:

 > [internal] load metadata for docker.io/library/eclipse-temurin:17-jdk-alpine:
------
Dockerfile:2
--------------------
   1 |     # First stage, build the custom JRE
   2 | >>> FROM eclipse-temurin:17-jdk-alpine AS jre-builder
   3 |     
   4 |     # Install binutils, required by jlink
--------------------
ERROR: failed to solve: eclipse-temurin:17-jdk-alpine: no match for platform in manifest: not found

To fix this issue, you can add this to your docker build command:

--platform=linux/amd64

or you can set the default platform to linux/amd64 by running the following command:

export DOCKER_DEFAULT_PLATFORM=linux/amd64
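If you are not sure which platform an image was pulled or built for, you can ask Docker directly. This is a small sketch: image_platform is a hypothetical helper name, and it assumes the image is already present locally.

```shell
# Print the OS/architecture of a local image, e.g. to confirm that
# --platform=linux/amd64 was applied.
# image_platform is a hypothetical helper; $1 is the image name or tag.
image_platform() {
  docker image inspect --format '{{.Os}}/{{.Architecture}}' "$1"
}

# To compare with the host architecture, run: uname -m
# (it reports arm64 on Apple Silicon Macs).
```

For example, image_platform user-service:alpine lets you check the result of the build command above.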

After building the image using eclipse-temurin:17-jdk-alpine as a base image, we got this:

Look at the size of both images: even without any tuning, the image based on eclipse-temurin:17-jdk-alpine is 180MB, which is 73% smaller than the 674MB image based on openjdk:17-jdk-slim.

Hands on optimization

Wait a minute, why can't we use a JRE image instead of a JDK image?

Good question! This is because, starting from Java 11, Oracle no longer offers a separate JRE; only the JDK is offered.

The most important part to take away from this is: "Users can use jlink to create smaller custom runtimes."

jlink is a tool that can be used to create a custom runtime image that contains only the modules that are needed to run your application.

👉 If your application doesn't interact with a database, you don't need to include the java.sql module in your image. If it doesn't use a desktop GUI, you don't need to include the java.desktop module, and so on.

It is kind of a replacement for the JRE image, but with more control over the modules that you want to include in your image.
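To get a feeling for how many modules a runtime carries, you can ask the java launcher to list them. The two helpers below are a sketch with made-up names (list_runtime_modules and count_runtime_modules); they only rely on the java --list-modules option.

```shell
# List the modules contained in a Java runtime.
# $1 is the runtime's home directory (defaults to $JAVA_HOME).
# Both function names are hypothetical helpers for this article.
list_runtime_modules() {
  "${1:-$JAVA_HOME}/bin/java" --list-modules
}

# Count the modules, e.g. to compare a full JDK with a jlink output directory.
count_runtime_modules() {
  list_runtime_modules "$1" | grep -c .
}
```

Running count_runtime_modules against a full JDK and then against a directory produced by jlink shows how much the runtime was trimmed.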

So, using jlink, here is what our Dockerfile should look like:

# First stage, build the custom JRE
FROM eclipse-temurin:17-jdk-alpine AS jre-builder

# Install binutils, required by jlink
RUN apk update &&  \
    apk add binutils

# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
         --verbose \
         --add-modules ALL-MODULE-PATH \
         --strip-debug \
         --no-man-pages \
         --no-header-files \
         --compress=2 \
         --output /optimized-jdk-17

# Second stage, Use the custom JRE and build the app image
FROM alpine:latest
ENV JAVA_HOME=/opt/jdk/jdk-17
ENV PATH="${JAVA_HOME}/bin:${PATH}"

# copy JRE from the base image
COPY --from=jre-builder /optimized-jdk-17 $JAVA_HOME

# Add app user
ARG APPLICATION_USER=spring

# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER &&  adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER

# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app

COPY --chown=$APPLICATION_USER:$APPLICATION_USER target/*.jar /app/app.jar

WORKDIR /app

USER $APPLICATION_USER

EXPOSE 8080
ENTRYPOINT [ "java", "-jar", "/app/app.jar" ]

So let's explain what we did here:

  • We have two stages: the first stage builds a custom JRE using jlink, and the second stage packages the application in a slim Alpine image.
  • In the first stage, we used the eclipse-temurin:17-jdk-alpine image. We then install binutils, which is required by jlink, and run jlink to build a small JRE that (for now) contains every module of the JDK, by using --add-modules ALL-MODULE-PATH.
  • In the second stage, we used the alpine image (which is quite small, about 3MB) as the base image to package our application. We then took the custom JRE from the first stage and used it as our JAVA_HOME.
  • The rest of the Dockerfile is the same as the previous one: copying the artifact and setting the entrypoint, running as a custom user (not root).

Then we can build the image using the following command:

docker build -t user-service:jlink-all-modules-temurin -f Dockerfile.jlink-all-modules.temurin .

If you run the command

docker images user-service

you will see that the new Docker image is now 85.3MB, which is ~95MB less than the image built on the eclipse-temurin base image 🎉🥳

And to be sure that the image is working as expected, you can run the following command:

docker run -p 8080:8080 user-service:jlink-all-modules-temurin

And you should see the application running as expected.
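A quick way to confirm that the trimmed runtime really serves requests is to call one of the endpoints. This is a sketch under assumptions: user_endpoint_status is a hypothetical helper, the container is published on localhost:8080 as in the docker run command above, and the user id 1 is arbitrary.

```shell
# Print the HTTP status code returned by GET /api/users/{id}.
# user_endpoint_status is a hypothetical helper;
# $1 = host port (default 8080), $2 = user id (default 1, an arbitrary choice).
user_endpoint_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    "http://localhost:${1:-8080}/api/users/${2:-1}"
}
```

Any HTTP status coming back (even an error status for an unknown user) tells you that the JVM started and the application is answering.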

This is not enough 🤌🏽

As good developers, we always want to improve our work, so let's see how we can reduce the image size even more.

The image is still large because --add-modules ALL-MODULE-PATH in the jlink command includes every module that ships with the JDK, and we surely don't need all of them. So let's see how to get a smaller image by including only the modules that are needed to run the application.

How to know which modules are needed to run the application?

We can use the jdeps tool that comes with the JDK. jdeps analyzes the dependencies of a jar file and can print the list of modules that are needed to run the application.

To do so, we can run the following at the root of our project:

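# Extract the jar so that BOOT-INF/lib/* exists for --class-path
jar xf target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar

jdeps --ignore-missing-deps -q \
      --recursive \
      --multi-release 17 \
      --print-module-deps \
      --class-path 'BOOT-INF/lib/*' \
      target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar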

This prints the list of modules that are needed to run the application. In our case it's:

java.base,java.compiler,java.desktop,java.instrument,java.management,java.naming,java.net.http,java.prefs,java.rmi,java.scripting,java.security.jgss,java.sql,jdk.jfr,jdk.unsupported

We can simply put this list of modules in place of ALL-MODULE-PATH in the jlink command:

Dockerfile.jlink-known-modules.temurin

# First stage, build the custom JRE
# Build on Alpine (musl) so the runtime matches the alpine final stage
FROM eclipse-temurin:17-jdk-alpine AS jre-builder

# Install binutils, required by jlink
RUN apk update &&  \
    apk add binutils

# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
         --verbose \
         --add-modules java.base,java.compiler,java.desktop,java.instrument,java.management,java.naming,java.net.http,java.prefs,java.rmi,java.scripting,java.security.jgss,java.sql,jdk.jfr,jdk.unsupported \
         --strip-debug \
         --no-man-pages \
         --no-header-files \
         --compress=2 \
         --output /optimized-jdk-17

# Second stage, Use the custom JRE and build the app image
FROM alpine:latest
ENV JAVA_HOME=/opt/jdk/jdk-17
ENV PATH="${JAVA_HOME}/bin:${PATH}"

# copy JRE from the base image
COPY --from=jre-builder /optimized-jdk-17 $JAVA_HOME

# Add app user
ARG APPLICATION_USER=spring

# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER &&  adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER

# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app

COPY --chown=$APPLICATION_USER:$APPLICATION_USER target/*.jar /app/app.jar

WORKDIR /app

USER $APPLICATION_USER

EXPOSE 8080
ENTRYPOINT [ "java", "-jar", "/app/app.jar" ]

Then we can build the image using the following command:

docker build -t user-service:jlink-known-modules-temurin -f Dockerfile.jlink-known-modules.temurin .

And here is the size of the image after building it:

We got a smaller image: 57.8MB instead of 85.3MB.
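To put the numbers measured so far side by side, here is a small sketch that computes the relative saving of each variant against the first 674MB image. The function name reduction_pct is made up; the sizes are the ones reported in this article.

```shell
# Percentage saved when going from size $1 to size $2 (same unit, here MB).
# reduction_pct is a hypothetical helper name.
reduction_pct() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

reduction_pct 674 180    # openjdk slim -> temurin alpine
reduction_pct 674 85.3   # openjdk slim -> jlink with all modules
reduction_pct 674 57.8   # openjdk slim -> jlink with the jdeps module list
```

This prints 73.3, 87.3 and 91.4, so the last image is roughly 91% smaller than the one we started with.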

This is good, but can't we automate this process instead of running the jdeps command manually and then copying the modules into the jlink command?
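Outside of Docker, the two manual steps can already be chained in a few lines of shell. The function below is only a sketch: build_custom_runtime is a hypothetical name, it reuses the jdeps and jlink options shown in this article, and it assumes jar, jdeps and jlink from a JDK 17 are on the PATH. Note that jar xf unpacks the jar into the current directory.

```shell
# Build a trimmed runtime for a Spring Boot fat jar in two steps:
#   1. ask jdeps which JDK modules the jar and its libraries use
#   2. pass that list to jlink
# build_custom_runtime is a hypothetical helper;
# $1 = path to the fat jar, $2 = output directory for the runtime.
build_custom_runtime() {
  jar_file="$1"
  output_dir="$2"

  # Unpack the fat jar so BOOT-INF/lib/* exists for --class-path
  jar xf "$jar_file" || return 1

  # Capture the comma-separated module list printed by jdeps
  modules="$(jdeps --ignore-missing-deps -q \
      --recursive \
      --multi-release 17 \
      --print-module-deps \
      --class-path 'BOOT-INF/lib/*' \
      "$jar_file")" || return 1

  # Feed the list straight into jlink
  jlink --add-modules "$modules" \
      --strip-debug --no-man-pages --no-header-files --compress=2 \
      --output "$output_dir"
}
```

For example: build_custom_runtime target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar ./optimized-jdk-17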

Automating the process inside the Dockerfile

Dockerfile.jlink-with-jdeps.temurin
# First stage, build the custom JRE
FROM eclipse-temurin:17-jdk-alpine AS jre-builder

RUN mkdir /opt/app
COPY . /opt/app

WORKDIR /opt/app

ENV MAVEN_VERSION=3.5.4
ENV MAVEN_HOME=/usr/lib/mvn
ENV PATH=$MAVEN_HOME/bin:$PATH

RUN apk update && \
    apk add --no-cache tar binutils


RUN wget http://archive.apache.org/dist/maven/maven-3/$MAVEN_VERSION/binaries/apache-maven-$MAVEN_VERSION-bin.tar.gz && \
  tar -zxvf apache-maven-$MAVEN_VERSION-bin.tar.gz && \
  rm apache-maven-$MAVEN_VERSION-bin.tar.gz && \
  mv apache-maven-$MAVEN_VERSION /usr/lib/mvn

RUN mvn package -DskipTests
RUN jar xvf target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar
RUN jdeps --ignore-missing-deps -q  \
    --recursive  \
    --multi-release 17  \
    --print-module-deps  \
    --class-path 'BOOT-INF/lib/*'  \
    target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar > modules.txt

# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
         --verbose \
         --add-modules $(cat modules.txt) \
         --strip-debug \
         --no-man-pages \
         --no-header-files \
         --compress=2 \
         --output /optimized-jdk-17

# Second stage, Use the custom JRE and build the app image
FROM alpine:latest
ENV JAVA_HOME=/opt/jdk/jdk-17
ENV PATH="${JAVA_HOME}/bin:${PATH}"

# copy JRE from the base image
COPY --from=jre-builder /optimized-jdk-17 $JAVA_HOME

# Add app user
ARG APPLICATION_USER=spring

# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER &&  adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER

# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app

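# Copy the jar built in the first stage (not from the host's target/ directory)
COPY --from=jre-builder --chown=$APPLICATION_USER:$APPLICATION_USER /opt/app/target/*.jar /app/app.jar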

WORKDIR /app

USER $APPLICATION_USER

EXPOSE 8080
ENTRYPOINT [ "java", "-jar", "/app/app.jar" ]

Then we can build the image using the following command:

docker build -t user-service:jlink-with-jdeps.temurin -f Dockerfile.jlink-with-jdeps.temurin . --platform=linux/amd64

Bonus

Before we finish, please note that you can use a .dockerignore file to exclude files and directories from the build context, so they are never sent to the Docker daemon or picked up by instructions such as COPY . in the intermediate stages. This speeds up the build and can reduce the size of those stages.
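As a starting point, here is a sketch that writes a minimal .dockerignore. The function name write_dockerignore and the entries themselves are assumptions to adapt to your repository; target/ is deliberately not excluded, because some of the Dockerfiles in this article copy target/*.jar from the build context.

```shell
# Write a starter .dockerignore into directory $1 (default: current directory).
# write_dockerignore is a hypothetical helper and the entries are assumptions;
# adjust them to your repository layout.
write_dockerignore() {
  cat > "${1:-.}/.dockerignore" <<'EOF'
.git
.gitignore
.idea
.vscode
*.iml
*.md
EOF
}
```

Run write_dockerignore at the root of the project, next to the Dockerfiles, before building.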

You should also be aware that picking a small base image is good, but make sure that it comes with good security policies and that it is compatible with your application.

Conclusion

I hope you found this article helpful. If you have any questions or comments, please feel free to contact me on Twitter or LinkedIn.
