Friday, August 4, 2017

Docker: Optimal Image for Microservices with Multi-Stage Build (and TomEE)

In an earlier post, I showed how TomEE builds a single executable jar and how that makes deployment easier.  Now I would like to build a Docker image containing a TomEE RESTful web service, but I noticed a problem with the size of the resulting image.  So I spent some time looking for a way to reduce the image size.  In Docker, the feature is called "multi-stage builds", and it was introduced in Docker 17.05.

Simple RESTful service

package com.jihwan.javaee.test.endpoint;

import javax.ws.rs.DefaultValue;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.QueryParam;

@Path("/hello")
public class HelloRest {
   @GET
   public String sayHello(@DefaultValue("Friend") @QueryParam("name") String name){
      return "Hello " + name + "!!";
   }
}

Maven pom.xml
This is the same pom.xml as in the earlier post, but simplified: it has no Hibernate-related jar files.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

  <modelVersion>4.0.0</modelVersion>
  <groupId>com.jihwan.javaee</groupId>
  <artifactId>test</artifactId>
  <version>0.1-SNAPSHOT</version>
  <packaging>war</packaging>

  <properties>
    <tomee.version>7.0.2</tomee.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <failOnMissingWebXml>false</failOnMissingWebXml>
    <maven.compiler.target>1.8</maven.compiler.target>
    <maven.compiler.source>1.8</maven.compiler.source>
  </properties>
  
  <dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.tomee/tomee-embedded -->
    <dependency>
      <groupId>org.apache.tomee</groupId>
      <artifactId>tomee-embedded</artifactId>
      <version>${tomee.version}</version>
      <scope>provided</scope>
      <exclusions>
         <exclusion>
            <groupId>org.apache.openjpa</groupId>
            <artifactId>openjpa</artifactId>
         </exclusion>
      </exclusions>
    </dependency>
  </dependencies>

  <build>
    <finalName>${project.artifactId}</finalName>
    <plugins>
      <plugin>
        <groupId>org.apache.tomee.maven</groupId>
        <artifactId>tomee-maven-plugin</artifactId>
        <version>${tomee.version}</version>
        <configuration>
          <tomeeVersion>${tomee.version}</tomeeVersion>
          <tomeeClassifier>plus</tomeeClassifier>

          <webapps>
            <webapp>${project.groupId}:${project.artifactId}:${project.version}?name=${project.artifactId}</webapp> 
          </webapps>
      
          <synchronization>
            <extensions>
              <extension>.class</extension>
            </extensions>
          </synchronization>

          <reloadOnUpdate>true</reloadOnUpdate>
          <systemVariables>
            <tomee.serialization.class.blacklist>-</tomee.serialization.class.blacklist>
            <openejb.system.apps>true</openejb.system.apps>
          </systemVariables>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project> 


Dockerfile
This 'DockerfileFull' file follows a regular pattern of the Docker build process.

FROM XXX-maven:3.5

WORKDIR /app

COPY . /app/

RUN mvn clean package tomee:exec

EXPOSE 8080

CMD ["./start.sh"]

XXX-maven:3.5 is the base image I used.  It was created by a private company and contains only an (Alpine) Linux OS, Java 8, and Maven.  Its size is 617MB.

TomEE also provides a Docker image, but using the TomEE Maven plugin, as shown here, is very simple.

Building an Image
     docker build -f DockerfileFull -t hellofull .

This builds a Docker image that runs without problems, but the image has a few issues.  Let's look at the image size first.

i31596-is7:test jihwan.kim$ docker images
REPOSITORY      TAG             IMAGE ID          CREATED            SIZE
hellofull       latest          caf4ff63ac96      2 minutes ago      971MB

The size is 971MB!  What happened?  The base image 'XXX-maven:3.5' is 617MB, and the test-exec.jar (explained in a previous post) built from the pom.xml used here is only 50MB.  So why is this image so large?

When you run the build command, you will notice that many jar files are downloaded at line 7 of DockerfileFull (the RUN mvn step).  During the Maven build, every required jar file is downloaded and saved in the local Maven repository inside the image.

Then a target directory is created, holding many build files in addition to the executable 'test-exec.jar'.  All of these files end up in the 'hellofull' image.

Another issue comes from line 5 of DockerfileFull (the COPY step).  It copies every file, including all of your source code, into the image.  Because a Docker image is built from layers, removing files in an instruction after line 7 does not shrink the image: the files still exist in the earlier layers.
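To see where the extra space went, one could measure the suspect directories inside the image.  This is only a minimal sketch: the helper name image_dir_sizes is hypothetical, the /root/.m2 path is an assumption based on the MAVEN_CONFIG value in the base image, and it assumes the image has no ENTRYPOINT that would swallow the du command.

```shell
#!/bin/bash
# Hypothetical helper: print the disk usage of directories inside an image.
# Usage: image_dir_sizes <image> <dir> [<dir> ...]
image_dir_sizes() {
  image="$1"
  shift
  # Overrides the image's CMD with `du`; assumes no ENTRYPOINT is set.
  docker run --rm "$image" du -sh "$@"
}

# Example call (needs Docker and the 'hellofull' image built above):
#   image_dir_sizes hellofull /app/target /root/.m2
```

If the assumptions hold, the two directories should account for most of the gap between the base image and the 971MB result.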

By the way, this is the 'start.sh' script used in the Docker CMD.

#!/bin/bash
java -jar target/test-exec.jar

Multi-Stage Build
To handle these issues, you could build your application on a build server and copy only the necessary files into the image, without building the app inside Docker.  That may be a reasonable solution if your company and systems are of a manageable size, but it doesn't always work that way.

"With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each 'FROM' instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image." (from the Docker documentation)

Now, let's write another Dockerfile.  I named it 'Dockerfile'

FROM XXX-maven:3.5 as builder

WORKDIR /app

COPY . /app/

RUN mvn clean package tomee:exec

FROM XXX-maven:3.5

COPY --from=builder /app/target/test-exec.jar /app/target/test-exec.jar

COPY --from=builder /app/start.sh /app/start.sh

EXPOSE 8080

CMD ["./start.sh"]

Then, build an image named 'hello':
     docker build -t hello .

The image size is now 667MB!  That is exactly the size of the base XXX-maven image (617MB) plus the test-exec.jar (50MB).

i31596-is7:test jihwan.kim$ docker images
REPOSITORY       TAG         IMAGE ID          CREATED            SIZE
hello            latest      96217cabf95d      3 minutes ago      667MB
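The size arithmetic can be checked with a few lines of shell.  This is just a sketch using the rounded MB figures from the listings in this post; the variable names are mine.

```shell
#!/bin/bash
# Sizes in MB, taken from the `docker images` listings in this post.
base_mb=617   # XXX-maven:3.5 base image
jar_mb=50     # test-exec.jar (rounded)
full_mb=971   # single-stage 'hellofull' image

multi_stage_mb=$((base_mb + jar_mb))
saved_mb=$((full_mb - multi_stage_mb))

echo "multi-stage image: ${multi_stage_mb}MB"
echo "left behind in the builder stage: ${saved_mb}MB"
```

So roughly 304MB of downloaded jars, build output, and source files stay in the builder stage and never reach the final image.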

Now, let's look at each layer of the image.  (Only the top part of the history is shown, to protect the history of the base image.)  It contains no layers from the first stage.

i31596-is7:test jihwan.kim$ docker history 96217cabf95d
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
96217cabf95d        9 minutes ago       /bin/sh -c #(nop)  CMD ["./start.sh"]           0B                  
96bb88e1568d        9 minutes ago       /bin/sh -c #(nop)  EXPOSE 8080/tcp              0B                  
568b079ef6d5        9 minutes ago       /bin/sh -c #(nop) COPY file:f79c59694e5f31...   42B                 
3e7db3746768        9 minutes ago       /bin/sh -c #(nop) COPY file:3fb1ab0cf1e09e...   50.1MB              
d2d465d1a5db        4 days ago          /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B                  
<missing>           4 days ago          /bin/sh -c #(nop) WORKDIR /app                  0B                  
<missing>           4 days ago          /bin/sh -c #(nop) COPY file:a9b17ab946a74d...   1.97kB              
<missing>           4 days ago          |2 BASE_URL=https://apache.osuosl.org/mave...   10.2MB              
<missing>           4 days ago          /bin/sh -c #(nop)  ENV MAVEN_CONFIG=/root/.m2   0B                  
<missing>           4 days ago          /bin/sh -c #(nop)  ENV MAVEN_HOME=/usr/sha...   0B                  
<missing>           4 days ago          /bin/sh -c #(nop)  ENV MAVEN_VERSION=3.5.0      0B                  
<missing>           4 days ago          /bin/sh -c #(nop)  ARG BASE_URL=https://ap...   0B 


Running the App

      docker run -p 9000:8080 hello
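Once the container is up, the service can be called through the mapped host port 9000.  The sketch below only builds the request URL; the helper name hello_url is hypothetical, and the /test context path is an assumption based on the webapp name in the pom.xml, so adjust it if your deployment differs.

```shell
#!/bin/bash
# Hypothetical helper: build the URL for the hello endpoint.
# The /test context path is an assumption (webapp name from pom.xml).
hello_url() {
  host_port="$1"
  name="$2"
  printf 'http://localhost:%s/test/hello?name=%s\n' "$host_port" "$name"
}

# Print the URL; with the container running, pass it to curl:
#   curl "$(hello_url 9000 Docker)"
hello_url 9000 Docker
```

If the path assumption holds, the sayHello method shown earlier would answer with "Hello Docker!!".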
