Optimizing the size of Java-based Docker images
If you are a Java developer who uses Docker to package applications, you may have noticed that the final image can be quite large, even for a "hello world" kind of project. In this article we will cover some tips for reducing the size of Docker images for Java applications.
We will use the same Spring web application that we built in the previous article, Error handling in Spring web using RFC-9457 specification, to demonstrate the tips. Our application contains only 2 endpoints:
- GET /api/users/{id}: get a user by id
- POST /api/users: create a new user
@RestController
@RequestMapping("/api/users")
@RequiredArgsConstructor
public class UserController {

    private final UserService userService;

    @GetMapping("{id}")
    public User getUser(@PathVariable Long id) {
        return userService.getUserById(id)
                .orElseThrow(() -> new UserNotFoundException(id, "/api/users"));
    }

    @PostMapping
    public User createUser(@Valid @RequestBody User user) {
        return userService.createUser(user);
    }
}
Not much, right? But as you will see, even the simplest Docker image (without any optimization) can be quite large.
The source code for this article can be found on GitHub.
Why should we care about image size?
Image size can have a significant impact on your productivity, whether as a developer or as an organization. Especially when you are working on large projects with multiple services, oversized images can cost you a lot of money and time.
Some of the reasons why you should avoid large images are:
- Disk space: You are wasting disk space in your docker registry and in your production servers.
- Slower builds: The larger the image, the longer it takes to build and push the image.
- Security: The larger the image, the more dependencies it contains and the larger its attack surface.
- Bandwidth: The larger the image, the more bandwidth you consume when pulling and pushing the image from and to the registry.
Using a straightforward Dockerfile
Base image matters ✌🏽: choosing the right base image
Before you even start thinking about optimization, you should always be careful about the base image you use to package your application. The base image you choose can have a significant impact on the size of the final image (as you will see below).
There are several base images that you can use to package your java application, some of them are:
- JDK Alpine base images: These images are quite small in size, but they are not suitable for all applications, so you may face some compatibility issues with some libraries.
- JDK Slim base images: These images are based on Debian or Ubuntu and are smaller than the full JDK images, but they are still quite large.
- JDK full base images: These images are quite large in size, they contain all the modules and dependencies that are needed to run the application.
To give you an idea of the difference, here is a comparison between the sizes of the openjdk:17-jdk-slim (slim) and eclipse-temurin:17-jdk-alpine (alpine) images:
Keep in mind that the application artifact (jar) is only ~20MB.
To package our artifact in a Docker image, we need to define a Dockerfile in our application root directory as follows:
Using openjdk:17-jdk-slim as base image
Dockerfile.base-openjdk
FROM openjdk:17-jdk-slim
# Set the working directory in the container
WORKDIR /app
# Create user
RUN addgroup --system spring && adduser --system spring --ingroup spring
# Change to user
USER spring:spring
COPY target/*.jar app.jar
EXPOSE 8080
CMD ["java", "-jar", "app.jar"]
After defining the Dockerfile, we can build the image using the following command:
docker build -t user-service .
After this you should have a Docker image named user-service. As you can see, the image is quite large compared to the application artifact: about 674MB 🫣
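If you would rather read the size from a script than from the `docker images` table, here is a minimal sketch. The helper name `image_size_mb` is my own and not part of the Docker CLI. It relies on `docker image inspect` reporting `.Size` in bytes and converts that to decimal megabytes, so the result may differ slightly from the rounded value that `docker images` displays.

```shell
# Print the size of a local image in decimal megabytes.
# `image_size_mb` is a hypothetical helper name, not a Docker command.
image_size_mb() {
  # `docker image inspect` reports .Size in bytes
  bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 1
  awk -v b="$bytes" 'BEGIN { printf "%.1f\n", b / 1000000 }'
}
```

For example, `image_size_mb user-service` prints the size of the image we just built.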
Using eclipse-temurin:17-jdk-alpine as base image
Dockerfile.base-temurin
FROM eclipse-temurin:17-jdk-alpine
ARG APPLICATION_USER=spring
# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER && adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER
# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app
# Set the user to run the application
USER $APPLICATION_USER
# Copy the jar file to the container
COPY target/*.jar /app/app.jar
# Set the working directory
WORKDIR /app
# Expose the port
EXPOSE 8080
# Run the application
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
We can then build the image using the following command:
docker build -t user-service:alpine -f Dockerfile.base-temurin . --platform=linux/amd64
🚨 Side note
Important note: If you are using a Mac with Apple Silicon, you may run into the following error while building the image:
> [internal] load metadata for docker.io/library/eclipse-temurin:17-jdk-alpine:
------
Dockerfile:2
--------------------
1 | # First stage, build the custom JRE
2 | >>> FROM eclipse-temurin:17-jdk-alpine AS jre-builder
3 |
4 | # Install binutils, required by jlink
--------------------
ERROR: failed to solve: eclipse-temurin:17-jdk-alpine: no match for platform in manifest: not found
To fix this issue, you can add this flag to your docker build command:
--platform=linux/amd64
or you can set the default platform to linux/amd64 by running the following command:
export DOCKER_DEFAULT_PLATFORM=linux/amd64
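If the same script runs on both Intel and ARM machines, you may prefer to set the variable only where it is needed. The sketch below assumes that `uname -m` reports `arm64` (macOS on Apple Silicon) or `aarch64` (Linux on ARM). The helper name `is_arm64` is hypothetical.

```shell
# Return success when the given machine architecture is ARM64.
# `is_arm64` is a hypothetical helper name used for illustration.
is_arm64() {
  case "$1" in
    arm64|aarch64) return 0 ;;
    *) return 1 ;;
  esac
}

# Only force amd64 images when building on an ARM64 host.
if is_arm64 "$(uname -m)"; then
  export DOCKER_DEFAULT_PLATFORM=linux/amd64
fi
```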
After building the image using eclipse-temurin:17-jdk-alpine as a base image, we got this:
Look at the size of both images: even without any tuning, the image based on eclipse-temurin:17-jdk-alpine is 180MB, which is 73% smaller than the 674MB image based on openjdk:17-jdk-slim.
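The 73% figure is just the relative difference between the two sizes. Here is a small sketch of the arithmetic, using a helper name (`size_reduction_percent`) that I made up for illustration:

```shell
# Percentage by which the second size is smaller than the first,
# rounded to a whole number. Hypothetical helper name.
size_reduction_percent() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

# openjdk slim image vs. temurin alpine image, both in MB
size_reduction_percent 674 180   # prints 73
```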
Hands-on optimization
Wait a minute, why can't we use a JRE image instead of a JDK image?
Because the JRE is no longer available. The most important point to take from this is: "Users can use jlink to create smaller custom runtimes."
Build your own JRE image using jlink
jlink is a tool that creates a custom runtime image containing only the modules needed to run your application.
👉 If your application doesn't interact with a database, you don't need to include the java.sql module in your image. If you are not interacting with a desktop GUI, you don't need to include the java.desktop module, and so on.
It is a kind of replacement for the JRE image, but with more control over the modules that you want to include in your image.
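To see which modules a given runtime actually contains, you can ask the `java` launcher itself with `--list-modules`. The sketch below wraps that in two helpers whose names (`strip_module_versions`, `runtime_modules`) are my own. It assumes the runtime's home directory is passed as the first argument, or is available in `JAVA_HOME`.

```shell
# Remove the "@version" suffix from each line of `java --list-modules`,
# which prints entries such as "java.base@17.0.9".
# Hypothetical helper name.
strip_module_versions() {
  sed 's/@.*$//'
}

# Print the module names contained in a Java runtime, one per line.
# $1 is the runtime home directory (defaults to $JAVA_HOME).
runtime_modules() {
  "${1:-$JAVA_HOME}/bin/java" --list-modules | strip_module_versions
}
```

Running `runtime_modules` against a full JDK and against a jlink-built runtime is a quick way to compare what each one ships.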
So, using jlink, here is how our Dockerfile should look:
# First stage, build the custom JRE
FROM eclipse-temurin:17-jdk-alpine AS jre-builder
# Install binutils, required by jlink
RUN apk update && \
apk add binutils
# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
--verbose \
--add-modules ALL-MODULE-PATH \
--strip-debug \
--no-man-pages \
--no-header-files \
--compress=2 \
--output /optimized-jdk-17
# Second stage, Use the custom JRE and build the app image
FROM alpine:latest
ENV JAVA_HOME=/opt/jdk/jdk-17
ENV PATH="${JAVA_HOME}/bin:${PATH}"
# Copy the custom JRE from the first stage
COPY --from=jre-builder /optimized-jdk-17 $JAVA_HOME
# Add app user
ARG APPLICATION_USER=spring
# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER && adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER
# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app
COPY target/*.jar /app/app.jar
WORKDIR /app
USER $APPLICATION_USER
EXPOSE 8080
ENTRYPOINT [ "java", "-jar", "/app/app.jar" ]
So let's explain what we did here:
- We have two stages: the first stage builds a custom JRE image using jlink, and the second stage packages the application in a slim Alpine image.
- In the first stage, we used the eclipse-temurin:17-jdk-alpine image to build a custom JRE image using jlink. We install binutils, which is required by jlink, and then run jlink to build a small JRE image that, for now, contains every available module by using --add-modules ALL-MODULE-PATH.
- In the second stage, we used the alpine image (which is quite small, about 3MB) as the base image, then took the custom JRE from the first stage and used it as our JAVA_HOME.
- The rest of the Dockerfile is the same as the previous one: copying the artifact and setting the entrypoint using a custom user (not root).
Then we can build the image using the following command:
docker build -t user-service:jlink-all-modules-temurin -f Dockerfile.jlink-all-modules.temurin .
If you run the command
docker images user-service
you will see that the new image is now 85.3MB, which is ~95MB less than the image built directly on the eclipse-temurin base image 🎉🥳
And to be sure that the image is working as expected, you can run the following command:
docker run -p 8080:8080 user-service:jlink-all-modules-temurin
And you should see the application running as expected.
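If you want to check this from a script instead of watching the logs, here is a minimal sketch that polls the service with curl. The helper names (`http_status`, `wait_until_up`) are hypothetical. It treats any HTTP status as proof that the JVM started and is serving requests, so even a 404 for a missing user counts as "up".

```shell
# Print the HTTP status code returned by a URL ("000" when nothing answers).
# Hypothetical helper name.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Poll $1 until it answers with any HTTP status, printing that status.
# Gives up after $2 attempts (default 30), one second apart.
wait_until_up() {
  attempts=${2:-30}
  while [ "$attempts" -gt 0 ]; do
    status=$(http_status "$1") || status=000
    if [ -n "$status" ] && [ "$status" != "000" ]; then
      echo "$status"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}
```

With the container running, `wait_until_up http://localhost:8080/api/users/1` prints the status code as soon as the application responds.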
This is not enough 🤌🏽
As good developers, we always want to improve our work, so let's see how we can shrink the image even more.
The image is still large because --add-modules ALL-MODULE-PATH in the jlink command includes every module on the module path, and we certainly don't need all of them. So let's see how to get a smaller image by including only the modules that the application actually needs.
How do we know which modules are needed to run the application?
We can use the jdeps tool that comes with the JDK. jdeps analyzes the dependencies of a jar file and generates the list of modules needed to run the application.
To do so, we can run the following commands at the root of our project. The first one extracts the jar so that the BOOT-INF/lib directory referenced by --class-path exists:
jar xf target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar
jdeps --ignore-missing-deps -q \
    --recursive \
    --multi-release 17 \
    --print-module-deps \
    --class-path 'BOOT-INF/lib/*' \
    target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar
This prints the list of modules needed to run the application. In our case it's:
java.base,java.compiler,java.desktop,java.instrument,java.management,java.naming,java.net.http,java.prefs,java.rmi,java.scripting,java.security.jgss,java.sql,jdk.jfr,jdk.unsupported
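Once you have this list, it can be handy to verify that a custom runtime really contains every module in it. The sketch below assumes the required modules come as one comma-separated string (the format jdeps prints above) and that the modules available in the runtime arrive on standard input, one per line. The helper name `missing_modules` is hypothetical.

```shell
# Print every module from a comma-separated list that is absent from
# the list of available modules read from stdin (one module per line).
# $1: comma-separated required modules, as printed by jdeps.
# Hypothetical helper name.
missing_modules() {
  available=$(cat)
  printf '%s\n' "$1" | tr ',' '\n' | while read -r module; do
    printf '%s\n' "$available" | grep -Fqx "$module" || printf '%s\n' "$module"
  done
}
```

For example, `java --list-modules | sed 's/@.*$//' | missing_modules "java.base,java.sql"` prints nothing when both modules are present in the runtime that `java` resolves to.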
We can simply use this list instead of ALL-MODULE-PATH in the jlink command:
# First stage, build the custom JRE
# Build on Alpine so the runtime matches the musl-based final stage
FROM eclipse-temurin:17-jdk-alpine AS jre-builder
# Install binutils, required by jlink
RUN apk update && \
    apk add binutils
# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
--verbose \
--add-modules java.base,java.compiler,java.desktop,java.instrument,java.management,java.naming,java.net.http,java.prefs,java.rmi,java.scripting,java.security.jgss,java.sql,jdk.jfr,jdk.unsupported \
--strip-debug \
--no-man-pages \
--no-header-files \
--compress=2 \
--output /optimized-jdk-17
# Second stage, Use the custom JRE and build the app image
FROM alpine:latest
ENV JAVA_HOME=/opt/jdk/jdk-17
ENV PATH="${JAVA_HOME}/bin:${PATH}"
# Copy the custom JRE from the first stage
COPY --from=jre-builder /optimized-jdk-17 $JAVA_HOME
# Add app user
ARG APPLICATION_USER=spring
# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER && adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER
# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app
COPY target/*.jar /app/app.jar
WORKDIR /app
USER $APPLICATION_USER
EXPOSE 8080
ENTRYPOINT [ "java", "-jar", "/app/app.jar" ]
Then we can build the image using the following command:
docker build -t user-service:jlink-known-modules-temurin -f Dockerfile.jlink-known-modules.temurin .
We got a smaller image: 57.8MB instead of 85.3MB.
This is good, but can't we automate this process instead of running the jdeps command manually and then copying the modules into the jlink command?
Automating the process inside the Dockerfile
Dockerfile.jlink-with-jdeps.temurin
# First stage, build the custom JRE
FROM eclipse-temurin:17-jdk-alpine AS jre-builder
RUN mkdir /opt/app
COPY . /opt/app
WORKDIR /opt/app
ENV MAVEN_VERSION=3.5.4
ENV MAVEN_HOME=/usr/lib/mvn
ENV PATH=$MAVEN_HOME/bin:$PATH
RUN apk update && \
apk add --no-cache tar binutils
RUN wget http://archive.apache.org/dist/maven/maven-3/$MAVEN_VERSION/binaries/apache-maven-$MAVEN_VERSION-bin.tar.gz && \
tar -zxvf apache-maven-$MAVEN_VERSION-bin.tar.gz && \
rm apache-maven-$MAVEN_VERSION-bin.tar.gz && \
mv apache-maven-$MAVEN_VERSION /usr/lib/mvn
RUN mvn package -DskipTests
RUN jar xvf target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar
RUN jdeps --ignore-missing-deps -q \
--recursive \
--multi-release 17 \
--print-module-deps \
--class-path 'BOOT-INF/lib/*' \
target/spring-error-handling-rfc-9457-0.0.1-SNAPSHOT.jar > modules.txt
# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
--verbose \
--add-modules $(cat modules.txt) \
--strip-debug \
--no-man-pages \
--no-header-files \
--compress=2 \
--output /optimized-jdk-17
# Second stage, Use the custom JRE and build the app image
FROM alpine:latest
ENV JAVA_HOME=/opt/jdk/jdk-17
ENV PATH="${JAVA_HOME}/bin:${PATH}"
# Copy the custom JRE from the first stage
COPY --from=jre-builder /optimized-jdk-17 $JAVA_HOME
# Add app user
ARG APPLICATION_USER=spring
# Create a user to run the application, don't run as root
RUN addgroup --system $APPLICATION_USER && adduser --system $APPLICATION_USER --ingroup $APPLICATION_USER
# Create the application directory
RUN mkdir /app && chown -R $APPLICATION_USER /app
COPY target/*.jar /app/app.jar
WORKDIR /app
USER $APPLICATION_USER
EXPOSE 8080
ENTRYPOINT [ "java", "-jar", "/app/app.jar" ]
Then we can build the image using the following command:
docker build -t user-service:jlink-with-jdeps.temurin -f Dockerfile.jlink-with-jdeps.temurin . --platform=linux/amd64
Bonus
Before we finish, please note that you can use a .dockerignore file to exclude files and directories from the build context, so they are never sent to the Docker daemon or copied into the image. This speeds up builds and helps reduce the size of the intermediate stages.
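As a starting point, here is a small sketch that writes such a file. The entries are only assumptions about what a typical project does not need inside the build context, and the helper name `write_dockerignore` is hypothetical. Note that target/ is deliberately left out of the ignore list, because the Dockerfiles in this article copy target/*.jar from the build context.

```shell
# Write a starter .dockerignore into the given directory (default: current one).
# The entries are illustrative assumptions; adjust them to your project.
# target/ is NOT ignored because the Dockerfiles copy target/*.jar.
write_dockerignore() {
  cat > "${1:-.}/.dockerignore" <<'EOF'
.git
.idea
.vscode
*.md
EOF
}
```

Calling `write_dockerignore` at the project root creates the file next to your Dockerfiles.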
You should also be aware that picking a small base image is good, but make sure that it follows good security practices and is compatible with your application.
Conclusion
I hope you found this article helpful. If you have any questions or comments, please feel free to contact me on Twitter or LinkedIn.