Good Practices for Dockerfile Layout

Translator's Introduction

As this article shows, the author, Steve, respects the long-established good practices of software engineering, which makes the article a useful reference for Dockerfile conventions.

Original author of this article: Steve Mushero
Original link: steve-mushero.medium.com/dockerfile-...

Docker is ubiquitous and there are many ways to write a Dockerfile, yet it is hard to find one that can serve as a model. Most examples are relatively simple, and the few well-written ones neither explain their reasoning nor provide the necessary conventions or guidelines.

This article therefore aims to provide useful conventions and guidelines for writing complex Dockerfiles. The sections below expand on my Dockerfile Annotated Example (steve-mushero.medium.com/dockerfile-...).

1. General Concepts

First come the general conventions: long-standing good practices that any software project should follow and that apply directly to Dockerfiles. They matter even more for a Dockerfile because it is a highly dynamic file. It is routinely modified from the start of the project through the maintenance phase, so keeping its quality consistently high takes deliberate effort.

2. Syntax Block

The syntax block, if you use one, must be placed on the first line. This pushes the title and comment block into second place, but there is no choice.

Few languages require a syntax block, but you may need one to enable Docker's various experimental options. The following is an example:

    # syntax = docker/dockerfile:experimental
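As a minimal sketch of where the directive sits in a file: the `docker/dockerfile:1` tag below is an assumption (the stable BuildKit frontend channel, rather than the `experimental` tag shown above), and the `alpine:3.19` base image is only a placeholder.

```dockerfile
# syntax=docker/dockerfile:1
# ^ Parser directive: it is only recognized before any other comment,
#   blank line, or instruction, which is why it has to be line 1.

# The title and comment block start here, after the directive.
FROM alpine:3.19
```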

3. Title & Comment Block

The body of the file should start with a section containing:

  • A real title;
  • Purpose;
  • Owner;
  • Any other standard file-header annotations.

A Dockerfile usually sits in the project's top-level directory and has to stand on its own, so in a large project its title and header information matter much more than in an ordinary source file. The header should also record assumptions, known issues, and explanations of complexity for future maintainers: for example, the required Docker version and how this file interacts with Composer, Kubernetes, and other systems.

You can also describe how the container built from this Dockerfile interacts with the larger system it belongs to, and note any key points about the development, test, and production environments.
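A header along these lines might look like the sketch below. Every name, address, and version requirement in it is a hypothetical placeholder rather than something taken from the original Dockerfile; only the base image matches the FROM line used later in this article.

```dockerfile
# ==========================================================
# Title:    ExampleApp web container        (hypothetical name)
# Purpose:  Runs the PHP application behind Apache
# Owner:    platform-team@example.com       (placeholder)
# Requires: Docker with BuildKit enabled    (assumed for illustration)
# Notes:    - Same file is used for dev, test, and production builds
#           - Deployed by Kubernetes; the manifests live elsewhere
#           - Known issue: image still contains build tools (see TODO)
# ==========================================================
FROM php:7.3.16-alpine
```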

4. TODO

Always keep a single, unified TODO section. You can still scatter small TODOs next to the code they concern, but there are usually larger or more meta-level TODOs that belong in one place.
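A unified TODO paragraph can be as simple as the comment fragment below. The individual entries are hypothetical examples, not items from the original file.

```dockerfile
# TODO (file-wide items; every entry here is a made-up example)
#  - Move to a multi-stage build so build tools never reach the final image
#  - Run the services as a non-root user with a fixed UID
#  - Pin package versions in the install sections
```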

5. Build-Time Arguments

Build-time arguments are not recommended. If you must use them, add a paragraph for them, commented out until it is needed, with an explanation that makes their purpose clear at a glance.
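Such a placeholder paragraph might look like the fragment below. `APP_VERSION` is a hypothetical argument name chosen only for illustration.

```dockerfile
# Build-time arguments (none used for now)
# ARG values can be read back from the image history, so never pass secrets here.
# If one becomes necessary, declare it in this paragraph and pass it with:
#   docker build --build-arg APP_VERSION=1.2.3 .
# ARG APP_VERSION=dev
```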

6. FROM

The FROM statement is the top priority.

More precisely, the FROM "paragraph" is the top priority: the statement together with its annotations, its history, and notes on known problems. If there is any special reason for choosing this image or version, it must be recorded.

In the following example we record the specific reason for the version choice, so that developers unaware of it do not change it arbitrarily.

    # Based on official PHP container
    # Note below on assumptions from that base
    # Use 7.3 for now as no mod_php yet via php7-apache2 on Alpine
    FROM php:7.3.16-alpine

7. Global Arguments

Use global arguments sparingly. If you suspect you will need them later, prepare a commented paragraph for them in advance, with an explanation, so that others know where Global Arguments may be added and what they are for (in this example, they must come after FROM).

    # Global Args from Docker/BuildKit must be added here after FROM
    ARG TARGETPLATFORM

8. ONBUILD

Don't use ONBUILD. If you think you might need it in the future, prepare a commented-out paragraph for it in advance, with an explanation, for the same reason as Global Arguments.

    # ONBUILD used to run things in downstream builds
    # Such as single layer copies
    # Not used for now
    # ONBUILD

9. Labels

Labels are very important and come in many odd varieties. They are used widely: in build, deployment, lifecycle management, and other processes.

In this example we restrict ourselves to OCI labels only, which makes them more portable and easier to manage.

    # OCI Annotations from https://github.com/opencontainers/image-spec/blob/master/annotations.md
    LABEL org.opencontainers.image.maintainer="Steve.Mushero@ELKman.io" \
          org.opencontainers.image.authors="Steve.Mushero@ELKman.io" \
          org.opencontainers.image.title="ELKman"
    # org.opencontainers.image.revision="" FROM git
    # org.opencontainers.image.created="2020-05-01T01:01:01.01Z"

10. Base Container Info

There is always a base image, and it usually comes with pre-installed software such as Apache, PHP, or Java, or is itself built on other containers. You should know exactly which assumptions you are making about the base image: which files you expect at which paths, and which preset environment variables you rely on. It is good practice to investigate these assumptions and record them in comments (even though they may change later). Then, when the file is modified, you can check whether the location and content of the change are correct. The more complex the base image, the more this matters, because later maintainers of the image easily get stuck on these issues.

You can also bypass the base image's presets and supply your own version of everything you use, so that nothing changes underneath you by accident. This may compromise the integrity of the original image, however, so it is not recommended.

    # Official PHP Apache container defaults & assumptions
    # From base Dockerfile:
    #   User: www-data
    #   WORKDIR: /var/www/html (Note we change this)
    #   php.ini: In /usr/local/etc/php (Note we update this via sed)
    #   Apache Conf: In /etc/apache2 (Note we update this via sed)
    #   Packages: LOTS of dev like gcc, g++, make, etc. Probably remove
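Recorded assumptions can also be checked at build time. The sketch below is an addition for illustration, not part of the original Dockerfile: it assumes the `php:7.3.16-alpine` base from the FROM section and tests two of the assumptions listed above, so that a changed base image fails the build early instead of surfacing as a confusing error later.

```dockerfile
FROM php:7.3.16-alpine

# Fail fast if the documented base-image assumptions stop holding:
#   - the php.ini directory is where the comments say it is
#   - the www-data user exists
RUN test -d /usr/local/etc/php && \
    id www-data
```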

11. ENV Variables

You can set the various users, paths, and so on here.

Be sure to define them as ENV variables rather than hard-coding them elsewhere, which would make the Dockerfile difficult to modify.

Changing even a single ENV value invalidates the build cache for every later step, but it is still recommended to use ENV wherever possible. Otherwise the file becomes prone to errors that are hard to troubleshoot, especially as people and circumstances change over time.

    ENV MAINWORKDIR /var/www
    ENV MAINUSER root
    ENV MAINGROUP root
    ENV APACHEUSER www-data
    ENV APACHEGROUP www-data
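To illustrate why this pays off, the sketch below shows later instructions referring only to the variables, so a path or user changes in exactly one place. The variable names are the ones used in this article; the base image is the one from the FROM section, and the choice of follow-up instructions is an assumption made for illustration.

```dockerfile
FROM php:7.3.16-alpine

ENV MAINWORKDIR /var/www
ENV MAINUSER root
ENV APACHEGROUP www-data

# Later steps use the variables, never the literal values
WORKDIR ${MAINWORKDIR}
RUN chown -R ${MAINUSER}:${APACHEGROUP} ./
```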

12. Install & Repo Setup

Use the package manager (yum, apt, and so on) to install only what you need; if necessary, you can set up a yum or apt cache. This paragraph may also include notes about the installer and its options, especially if you want to avoid caching and minimize space.

    # apk supports --virtual to group & later remove packages
    # RUN apk add --no-cache --virtual .build-deps gcc
    # RUN apk del --no-cache .build-deps
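When the virtual group is actually used, adding and deleting it in the same RUN instruction keeps the packages out of every layer. The following is a minimal sketch under assumptions: `alpine:3.19` is a placeholder base, and `gcc --version` merely stands in for a real compile step.

```dockerfile
FROM alpine:3.19

# Install build-only packages under one virtual name, use them,
# then delete the whole group in the same layer so they never persist.
RUN apk add --no-cache --virtual .build-deps gcc && \
    gcc --version && \
    apk del .build-deps
```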

13. ENV Install Tools

Putting the basic OS tools in an ENV variable like this makes the list easier to manage and tidier to modify later. To support troubleshooting and deployment, the list will be long at first and will gradually shrink as tools are removed (and, of course, grow again when necessary).

    # Lists of tools - will shrink over time to reduce size
    # Alpha order, please
    # Telnet not available on alpine
    ENV INSTALL_TOOLS \
        bash \
        busybox-extras \
        curl \
        less

14. Install Basic Tools

Install the basic tools from the variable. That way you never have to modify the line that actually performs the installation, and it is easier to keep the installer options correct without repeating and editing them:


15. Install Specialized Packages

Special packages, or those not installed through the distribution's package manager (for example, not installed through yum on CentOS), should go in a separate paragraph. They usually have their own installation order, process, and options, and a separate paragraph makes them obvious and easier to manage.

    # Install Specialized Packages
    # We need SQLite for Telescope & other uses
    ENV EXTRA_PACKAGES sqlite3
    RUN apk update --no-cache && \
        apk add --no-cache ${EXTRA_PACKAGES}

16. Remove Useless Stuff

Add a paragraph that deletes unused software. This makes the container smaller and, from a security perspective, reduces the attack surface. For example, many base images include gcc, which you will never use at runtime, so remove it. This assumes a multi-stage build or the squash option; either one flattens the final image so that it contains only the files still present after all layers are applied.

    # Stuff to remove for smaller size
    # Packages: Some images have dev stuff like gcc, g++, make, etc.
    ENV REMOVE_PACKAGES gcc
    # Note ${...} (variable expansion), not $(...) (command substitution)
    RUN apk del ${REMOVE_PACKAGES}
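The multi-stage assumption matters because deleting files in a later layer does not, by itself, shrink the layers beneath it. A rough sketch of the multi-stage shape follows; the stage name `build` and the elided build steps are hypothetical, and the paths follow the variables used in this article.

```dockerfile
# Stage 1: full toolchain available for building (stage name is hypothetical)
FROM php:7.3.16-alpine AS build
WORKDIR /var/www
COPY . .
# ... build steps that need gcc, npm, composer, etc. would go here ...

# Stage 2: start again from a clean base and copy in only the results,
# so nothing installed or deleted in stage 1 is carried into the final image
FROM php:7.3.16-alpine
WORKDIR /var/www
COPY --from=build /var/www /var/www
```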

17. Section Markers

Section markers are a good practice: they make the file structure clear and content easy to find, and they keep future changes from being added in random, wrong places. This is an important measure for keeping the Dockerfile tidy.

    ##### End of OS Items #####

18. Service Items Section

The services in a container can be very diverse, from Apache or Nginx to large code bases to big data systems such as MySQL or Elasticsearch. Each has its own requirements and complexity; most are fairly simple, but their deployment is generally complex.

Normally it is best to use their dedicated containers, but sometimes you need to include a service in your own container, such as Apache in a PHP Laravel application container. In that case there are still many details to handle. These parts are usually miniature versions of the sections above: package lists, installation files, and configuration files that are often copied in or edited in place.

The following is the Apache part. It starts with an ENV variable containing the list of packages to install, and then installs them.

    ##### Apache Items #####
    # Install Apache & PHP Modules
    # php7-apache2 installs much of PHP & Apache,
    ENV PHP_PACKAGES php7 php7-apache2 php7-json php7-phar php7-iconv \
        php7-openssl php7-curl php7-mbstring php7-fileinfo \
        php7-tokenizer php7-dom php7-session php7-pdo php7-pdo_sqlite \
        php7-xml php7-simplexml php7-xmlwriter php7-zip
    RUN apk update --no-cache && \
        apk add --no-cache --clean-protected ${PHP_PACKAGES}

19. Specialized Configurations

There are many ways to set up configuration files; keep them separate and document them clearly. In this example we want to keep almost all the default values, so rather than copying files in, it is better to make a few in-place modifications to the existing configuration files.

Basically, we set the ENV variables first and then run sed to make the modifications. Note that in the first part we originally copied in a file of our own, but later switched to the file shipped with the base image.

    # Using default Alpine Apache configs and modifying from there
    # Then we override, which lets us use unmodified official files
    ENV APACHECONFFILE /etc/apache2/httpd.conf
    ENV APACHECONFDDIR /etc/apache2/conf.d
    ENV APACHEVHOSTCONFFILE ${APACHECONFDDIR}/default.conf
    ENV APACHESECURITYFILE ${APACHECONFDDIR}/security.conf
    # Copy over PHP file from PHP-Apache
    # Skipping as seems the Alpine version has one: php7-module.conf
    # COPY /deploy/apache/docker-php.conf ${APACHECONFDDIR}/docker-php.conf
    RUN echo && \
        # Remove stuff we don't want nor need for security, etc.
        rm /etc/apache2/conf.d/userdir.conf && \
        rm /etc/apache2/conf.d/info.conf && \
        #
        # Apache main config overrides
        #
        sed -ri -e 's/^#ServerName.*$/ServerName elkman/g' ${APACHECONFFILE} && \
        sed -ri -e 's/^ServerTokens.*$/ServerTokens Prod/g' ${APACHECONFFILE} && \
        sed -ri -e 's/^ServerSignature.*$/ServerSignature Off/g' ${APACHECONFFILE}

20. Other Services

Next come the other services and their configuration, in this case PHP. PHP is already installed in the base image, so we only need to handle its configuration: copy files in and modify them in place, and clean up the unused parts of the base image so that there is no ambiguity.

    ##### PHP Items #####
    # PHP Configs - Complicated as there're two PHP on Alpine 7.3
    # Some PHP containers use date-specific extension dir in php.ini
    # On Alpine, careful of which php is used for CLI
    # vs. mod_php to verify their paths - Very confusing
    # Disable default php so can't get confused on configs, modules, etc.
    # Then the one we want works fine in path
    RUN mv /usr/local/bin/php /usr/local/bin/php.bad
    # For Alpine 7.3 we use /usr/bin/php and /usr/etc/php
    ENV PHP_INI_DIR /etc/php7
    ENV PHPEXTDIR "/usr/lib/php7/modules/"
    # Use the default prod configuration from php:7.4.4-apache (php.ini-development also exists)
    COPY deploy/php/php.ini-production $PHP_INI_DIR/php.ini
    # Copy overrides
    COPY deploy/php/php-override-prod.ini $PHP_INI_DIR/conf.d/
    COPY deploy/php/php-sourceguardian.ini $PHP_INI_DIR/conf.d/
    # Install composer & prestissimo for parallel downloads if needed
    RUN curl -sS https://getcomposer.org/installer | \
        php -- --install-dir=/usr/local/bin --filename=composer && \
        composer global require hirak/prestissimo --no-plugins --no-scripts

21. Add Your Code

With the services installed, it is time to add your own code, this time by copying it from the build environment.

You could also pull it from a git repository, install it as a package (an rpm, say), and so on. But our build environment has already pulled all the code, artifacts, build scripts, Dockerfile, and so on, so copying directly is the easiest. The COPY commands are very specific and have been extensively tested.

Note also that the team needs a thorough, shared understanding of .dockerignore, including its comments and how it interacts with permissions. A .dockerignore file is usually the result of many, many hours of work, so everyone needs to understand it clearly at all times.

    #### Add Code ####
    # Need to change WORKDIR as Apache default is /var/www/html
    WORKDIR ${MAINWORKDIR}
    # Copy files from VM
    # Copy App Directories - Not setting owners here, it's done later
    # Note will ignore the .dockerignore things, so tune that, too
    # Currently we depend on git to create/ignore all the dirs we need, especially in storage
    # We do this because later we want to git clone into container as part of build
    COPY app app
    COPY config config
    COPY resources resources
    COPY routes routes
    COPY bootstrap bootstrap
    COPY database database
    COPY storage storage
    COPY public public
    COPY tests tests
    # Copy Specific Files
    COPY artisan ./
    COPY composer.json ./
    COPY composer.lock ./
    COPY package.json ./
    COPY package-lock.json ./
    COPY webpack.mix.js ./

22. Building & Compiling Things

Once the code has been added, it usually needs to be built before it can be used. This is very common for JavaScript, and in our case running PHP Composer is also one of the container build steps.

As always, the documentation, purpose, and special issues here must be crystal clear, because this section is usually the result of days or weeks of work and testing. In this example we run PHP Composer during the container build to fetch and set up the required libraries. This is quite troublesome, so we also COPY in a cache from outside to improve performance. The result below came from a lot of trial and error and a lot of pitfalls.

    # Run Composer install
    # ENV COMPOSER_CACHE_DIR - Can set if needed, now using default
    # Cannot use RUN mount here as we need a cache dir, and mount only supports files (as far as I can tell)
    # Copy in composer cache, use and remove
    COPY /composer-cache/files /root/.composer/cache/files
    # Note: Have to run 'composer dump-autoload' for some reason here; seems install not fully doing it
    RUN composer install --no-dev --classmap-authoritative --no-ansi \
        --no-scripts --no-interaction --no-suggest && \
        composer dump-autoload && \
        rm -rf /root/.composer/cache
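The comments above note that a RUN mount did not fit the author's setup when the file was written. As a possible alternative, and not the author's method, BuildKit cache mounts accept a directory as their target. The sketch below assumes the official `composer:2` image and a hypothetical cache path set through `COMPOSER_CACHE_DIR`, and it has not been tested against the ELKman build.

```dockerfile
# syntax=docker/dockerfile:1
FROM composer:2

# Hypothetical cache location; Composer reads it from COMPOSER_CACHE_DIR
ENV COMPOSER_CACHE_DIR=/tmp/composer-cache
WORKDIR /app
COPY composer.json composer.lock ./

# The cache directory is kept by the builder between builds and is
# never written into an image layer, so no COPY-in or rm is needed.
RUN --mount=type=cache,target=/tmp/composer-cache \
    composer install --no-dev --no-scripts --no-interaction
```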

Then run npm to fetch Vue.js and all the other JavaScript code we need.

    # NPM Stuff & Webpack (part of dev script)
    # RUN npm install --no-optional
    # Moving to ci instead of install (ci uses lock file)
    RUN npm ci --no-optional
    RUN npm run prod

In this example we work around the problem for now while looking for a better solution. Managing JavaScript is very challenging, so we keep the build directory to avoid re-running a build process that is prone to crashing.

    # Move public artifacts to doc root - do this after npm run
    # Get .htaccess, too
    # We missing anything in the standard html?
    # Not moving as better to point Doc Root to our public
    # RUN mv public/* html/ && mv public/.htaccess html/

23. Data & Things

Once all the services and code are ready, we prepare the data. In this example that means creating an empty SQLite .db file (the tables are created and initialized in a later step, not here).

    # Move DB file from source tree to writable storage area
    # For now, touch empty file - we initialize this DB later
    # Later we can copy a default DB if we wish
    # RUN mv database/db.sqlite storage/database/
    RUN touch storage/database/db.sqlite

24. Environment Setup

Once all the services, code, and data are ready, we can set up the .env files, which are used at runtime and also in some of the following build steps.

    # .env File - Need to copy for production
    COPY .env.production .env
    # Copy dusk env for now for testing
    COPY .env.dusk.testing .env.dusk.testing

25. Setup System

Now it is time to set up the system itself. For Laravel (a PHP framework), this means running a series of Laravel commands to set up the PHP configuration, generate keys, and initialize the database. This part changes frequently, so good comments are very important.

    # Setup configs & code; may later do as other user, fixed UID, etc.
    # Generate a new key each time (though we also need on install)
    RUN php artisan key:generate
    # Optimize & cache; do before we migrate or run other artisan jobs
    RUN php artisan optimize
    # Seed tables, Telescope, etc. data into DB
    # Run after keygen, before other artisan cmds
    RUN php artisan migrate
    # Update DB version to app code version; this for container's initial DB only
    RUN php artisan elkman:update

26. Remove Logs

Remove the logs generated by all the steps above so that the image is cleaner and smaller. Be sure to clear every log created during the build, because they may contain sensitive information that you don't want users to see.

    # Remove log file so we start clean (and with right log file owner)
    RUN rm -rf storage/logs/*

27. File Permissions

Because of the earlier COPY instructions and commands, file permissions in a Docker image can easily become a mess, especially for a complex runtime environment like Laravel. Decide in advance which approach to use, and write clear notes or documentation. I say this after countless tests that gave us a deep understanding of the problem. In the following example all permissions are set once, in a single place. The advantages are that they are easy to find, easy to modify, and easy to extend with special cases.

    # Permissions carefully managed here
    # Set all directory permissions
    # Set global owner & read perms, then set for writable, exec, etc.
    ENV READPERM 440
    ENV WRITEPERM 660
    RUN chown -R ${MAINUSER}:${APACHEGROUP} ./ && \
        chmod -R ${READPERM} ./ && \
        chmod -R ${WRITEPERM} storage && \
        # Set all dirs to be executable so we can get into them
        # Do after any chmods above
        find ./ -type d -print0 | xargs -0 chmod ug+x

28. Final Purging

Add a final cleanup paragraph to reduce the image size.

    ### Data Purge
    # Need to purge & cleanup
    # rm composer & caches
    # rm npx & caches
    # rm any man pages, etc.
    # vendor cleanup
    RUN rm -rf /tmp/*
    # End of apk installs, we can clean
    # As apk cache clean seems useless
    RUN rm -rf /var/cache/apk/*

At this point, a pretty good Dockerfile is born.