Querying Remote Container Repository Metadata with Skopeo

July 7, 2020 Rob Hernandez

Discover a much-needed alternative to the docker CLI.

The container image solves many of the challenges that come up when teams ship and run software. One of the container’s big wins is providing a consistent packaging format for an application and its dependencies. While containers certainly make build and installation easier, it’s often difficult to query container metadata without pulling the image down to the local machine. This metadata contains valuable information about the container, including Digests, Layers, Tags, and Labels, among other items that users commonly need to query.

One of the most popular tools in the ecosystem, docker, has pretty limited functionality for querying information about container images that reside in remote repositories (also called container registries). Traditional package managers have long made this kind of metadata available, but in docker the equivalent functionality is just plain missing.

Since traditional package managers (apt, yum, etc.) don’t ship container images, what’s a viable solution to docker's lack of functionality?

I was asking myself this very question, and I’ve been very impressed with the capabilities of a newer tool called skopeo. Skopeo solves a real need of container users by providing the ability to query OCI and Docker repositories via the CLI. Let’s get started by discussing docker's shortcomings, then come back to skopeo and cover the specifics of how it fills the metadata gap.

Docker's Missing Query Functionality

Why is docker unable to query a given repository for information about a container image using the docker CLI? How are users discovering this information today? It would appear there are three options if you stick strictly to the docker ecosystem:

  1. Look up the repository separately in a browser.
  2. Leverage an outside tool like curl to query specific parts of the repository API.
  3. Know the repository and tags ahead of time, pull the image down, then query it locally with docker images.

Initially I thought that I must be missing a subcommand that docker provides to query remote repositories and images. While looking at the docker CLI documentation, I noticed there are two subcommands related to interacting with a remote repository:

  1. docker registry: to the best of my knowledge, is only available in the Docker enterprise offering… Setting the experimental=true config flag in .docker/config.json did not result in the docker registry subcommand being available on version 19.03.08.

  2. docker search: for basic, high-level search of images. But this only appears to work on Docker Hub and only allows you to query for repository names and apply a small number of filters. There is no way to drill down into the Tags or Layers of a given repository:

$ docker search --filter is-official=true nginx
NAME                DESCRIPTION                STARS               OFFICIAL            AUTOMATED
nginx               Official build of Nginx.   13276               [OK]

Great, I guess the path of least resistance is to pull latest and hope for the best? I’d really rather not.

I figured that metadata querying of images must be an often-requested feature in docker’s public GitHub repo. Surely, with the docker project having been around for 7 years, someone in the community would find this lack of query functionality problematic. Yet looking at the open issues in the Moby org on GitHub (where the docker engine is hosted), I could find only one issue raised about this; it was opened in 2015 and remains open.

Since docker does not seem to see the value in image metadata query functionality, it appears that we’ll have to explore alternate solutions. Coming back to my earlier example, what’s also interesting is that this exact type of query functionality has existed for a long time in traditional package managers.

Traditional Package Managers

For example, if we wanted to query the nginx package in apt:

Find all packages in the configured apt repos with the name beginning with the string nginx:

$ apt-cache search -n "^nginx"
nginx - small, powerful, scalable web/proxy server
nginx-common - small, powerful, scalable web/proxy server - common files
nginx-core - nginx web/proxy server (standard version)
nginx-doc - small, powerful, scalable web/proxy server - documentation
nginx-confgen - nginx configuration file macro language and preprocessor
nginx-extras - nginx web/proxy server (extended version)
nginx-full - nginx web/proxy server (standard version)
nginx-light - nginx web/proxy server (basic version)

Show specific versions of nginx that are available:

$ apt-cache madison nginx
     nginx | 1.17.10-0ubuntu1 | http://archive.ubuntu.com/ubuntu groovy/main amd64 Packages

Or let's look at what dnf on Red Hat-based distros is capable of querying:

$ dnf search nginx
Failed to set locale, defaulting to C.UTF-8
Last metadata expiration check: 0:00:02 ago on Wed Jun  3 00:24:42 2020.
================================================= Name Exactly Matched: nginx ==================================================
nginx.x86_64 : A high performance web server and reverse proxy server
================================================ Name & Summary Matched: nginx =================================================
nginx-mod-mail.x86_64 : Nginx mail modules
nginx-mod-stream.x86_64 : Nginx stream modules
nginx-mod-http-perl.x86_64 : Nginx HTTP perl module
nginx-mod-http-xslt-filter.x86_64 : Nginx XSLT module
nginx-mod-http-image-filter.x86_64 : Nginx HTTP image filter module
nginx-filesystem.noarch : The basic directory layout for the Nginx server
pcp-pmda-nginx.x86_64 : Performance Co-Pilot (PCP) metrics for the Nginx Webserver
nginx-all-modules.noarch : A meta package that installs all available Nginx modules

Show the specific versions of nginx that are available in the enabled repositories:

$ dnf repoquery nginx
Last metadata expiration check: 0:04:44 ago on Wed Jun  3 00:24:42 2020.
nginx-1:1.14.1-9.module_el8.0.0+184+e34fea82.x86_64

Solution

So when it comes to querying container repositories and images, what are our options for deep diving into the repository and specific image details?

Worst

One can leverage the aforementioned docker search:

$ docker search --filter is-official=true nginx
NAME                DESCRIPTION                STARS               OFFICIAL            AUTOMATED
nginx               Official build of Nginx.   13276               [OK]

Then open the browser and navigate to Docker Hub to look at the versions of nginx that are available. This browser-first mentality is something that docker seems to deem acceptable even though it’s quite inconvenient and doesn’t lend itself to automation. It reminds me of Docker's relatively brief (2018 to 2020) attempt to force users to log in to download the Windows and Mac versions of its docker client: https://github.com/docker/docker.github.io/issues/6910

I would think most people find themselves in this workflow when working with docker. But if you’re like me, this gets old pretty quick.

Slightly better

There are many posts online about people querying the repository API with curl or a collection of small shell scripts. This seems “ok”, but these scripts are brittle and not well versioned or maintained. Regardless, it’s a good exercise to see what’s possible via the REST API in a remote repository.

Let's again start with docker search:

$ docker search --filter is-official=true nginx
NAME                DESCRIPTION                STARS               OFFICIAL            AUTOMATED
nginx               Official build of Nginx.   13276               [OK]

Drill down into the nginx repository using curl:

$ curl -Ss https://registry.hub.docker.com/v1/repositories/nginx/tags | jq -r '.[].name' | tail -n 14
1.9.7
1.9.8
1.9.9
alpine
alpine-perl
mainline
mainline-alpine
mainline-alpine-perl
mainline-perl
perl
stable
stable-alpine
stable-alpine-perl
stable-perl

I'm opting to use the v1 API instead of v2 here, since v2 requires a token and would further complicate this example.

This works, but it's a bit awkward to have to construct the URL. The user would also need to curl a separate part of the API to get individual image tag info like Digest, Labels, etc.
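For comparison, here is a rough sketch of what the token-based v2 flow could look like against Docker Hub. The helper names (auth_url, tags_url, list_tags_v2) are made up for illustration, and the hosts (auth.docker.io for an anonymous pull token, registry-1.docker.io for the registry itself) are my assumptions about Docker Hub's public setup rather than something covered in this post; other registries use different hosts and auth schemes. It also assumes curl and jq are installed.

```shell
# Sketch only: hypothetical helpers for listing tags via the v2 registry API.
# The hosts below are assumed to be Docker Hub's public token and registry endpoints.

# Build the URL that hands out an anonymous pull token for a repository,
# e.g. "library/nginx" for the official nginx image.
auth_url() {
  printf 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:%s:pull\n' "$1"
}

# Build the v2 tags endpoint for the same repository.
tags_url() {
  printf 'https://registry-1.docker.io/v2/%s/tags/list\n' "$1"
}

# Fetch a token, then print one tag per line.
list_tags_v2() {
  local repo token
  repo="$1"
  token="$(curl -fsS "$(auth_url "$repo")" | jq -r '.token')" || return 1
  curl -fsS -H "Authorization: Bearer ${token}" "$(tags_url "$repo")" | jq -r '.tags[]'
}
```

Running list_tags_v2 library/nginx | tail -n 14 should give a tag list comparable to the one above, at the cost of an extra request and a short-lived token to manage.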

So the repository API makes this information available to clients, but why is this not possible in the docker CLI? Let’s see what skopeo is capable of doing with these exposed API endpoints provided by the repository.

Better

If you recall, at the start of this post I mentioned a CLI tool named skopeo that offers more capabilities for grabbing remote container metadata. So why not try ditching the docker CLI in favor of a tool that allows the manipulation, investigation, and propagation of container images? If you require a container runtime, you’ll still need to leverage docker or an alternative container runtime like podman.

Let’s see how skopeo compares to docker's previous options for querying remote image metadata of nginx:

At this time skopeo is unable to search the entire image catalog like docker search, but we’ll stick to our trusty nginx image and drill down into its metadata:

$ skopeo -v
skopeo version 1.0.0
$ skopeo list-tags docker://docker.io/nginx
{
    "Repository": "docker.io/library/nginx",
    "Tags": [
        ...
        "1.17",
        "1.18-alpine-perl",
        "1.18-alpine",
        "1.18-perl",
        "1.18.0-alpine-perl",
        "1.18.0-alpine",
        "1.18.0-perl",
        "1.18.0",
        "1.18",
        "1.19-alpine-perl",
        "1.19-alpine",
        "1.19-perl",
        "1.19.0-alpine-perl",
        "1.19.0-alpine",
        "1.19.0-perl",
        "1.19.0",
        "1.19",
        ...
    ]
}

If you are not on linux, override the arch and os by adding: “--override-arch=amd64 --override-os=linux”

Woah, that’s a big list! I truncated the complete list of tags with ... since there are a lot of them for this repository.
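A list that long is easier to handle with a small filter. As a minimal sketch, the hypothetical latest_semver_tag helper below reads tag names one per line on stdin, keeps only plain MAJOR.MINOR.PATCH tags (so variants like 1.19.0-alpine or stable are ignored, which is my assumption about what counts as a release), and prints the highest one. It relies on sort -V, a GNU coreutils extension.

```shell
# Sketch only: pick the highest plain x.y.z tag from a list of tags on stdin.
latest_semver_tag() {
  grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | tail -n 1
}
```

With jq installed, it can be fed from JSON shaped like the output above (a Repository name plus a Tags array):

$ skopeo list-tags docker://docker.io/nginx | jq -r '.Tags[]' | latest_semver_tag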

Let’s drill even deeper into information about a single nginx image tag 1.18:

$ skopeo inspect docker://docker.io/nginx:1.18
{
    "Name": "docker.io/library/nginx",
    "Digest": "sha256:159aedcc6acb8147c524ec2d11f02112bc21f9e8eb33e328fb7c04b05fc44e1c",
    "RepoTags": [
        ...
        "1.17",
        "1.18-alpine-perl",
        "1.18-alpine",
        "1.18-perl",
        "1.18.0-alpine-perl",
        "1.18.0-alpine",
        "1.18.0-perl",
        "1.18.0",
        "1.18",
        "1.19-alpine-perl",
        "1.19-alpine",
        "1.19-perl",
        "1.19.0-alpine-perl",
        "1.19.0-alpine",
        "1.19.0-perl",
        "1.19.0",
        "1.19",
        ...
    ],
    "Created": "2020-06-09T16:58:54.881675136Z",
    "DockerVersion": "18.09.7",
    "Labels": {
        "maintainer": "NGINX Docker Maintainers \u003cdocker-maint@nginx.com\u003e"
    },
    "Architecture": "amd64",
    "Os": "linux",
    "Layers": [
        "sha256:8559a31e96f442f2c7b6da49d6c84705f98a39d8be10b3f5f14821d0ee8417df",
        "sha256:9a38be3aab21dba75ef4792b9aa3bbe55828efd4eb00d2f36e3d45f60db85396",
        "sha256:522e5edd83fa5d31730131804cc6fdb72ea4cae671457b0e4533b5ce2e3c6fcd",
        "sha256:2ccf5a90baa65f9d8bbe042d9c5da9970bf74700d6819ef5758c5a560ed376a7"
    ],
    "Env": [
        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "NGINX_VERSION=1.18.0",
        "NJS_VERSION=0.4.0",
        "PKG_RELEASE=1~buster"
    ]
}

If you are not on linux, override the arch and os by adding: “--override-arch=amd64 --override-os=linux”

Skopeo enables the user to get Digest, Tags, Labels, Layers, Architecture, Env, etc. with minimal effort. Plus, the user gets all of this metadata without having to download the image locally! This makes it easy to follow container deployment best practices and leverage image Tags and Digests instead of defaulting to latest. Using a Digest (i.e. the SHA256 of the image) guarantees getting a specific build of an image, which is very important in some production environments. Skopeo also gets us closer to the functionality that a traditional package manager provides, which is what I was initially looking for.
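To make the Digest pinning concrete, here is a small sketch. pin_ref and resolve_pinned are hypothetical helper names, not part of skopeo: the first checks that a value looks like a sha256 digest and joins it to the image name, and the second assumes skopeo and jq are installed and reads the Name and Digest fields from inspect output like that shown above.

```shell
# Sketch only: turn an image name plus digest into a pinned NAME@DIGEST reference.
pin_ref() {
  local name digest hex
  name="$1"
  digest="$2"
  hex="${digest#sha256:}"
  # Require the "sha256:" prefix followed by exactly 64 lowercase hex characters.
  if [ "$hex" = "$digest" ] || [ "${#hex}" -ne 64 ] ||
     printf '%s' "$hex" | grep -q '[^0-9a-f]'; then
    echo "not a sha256 digest: ${digest}" >&2
    return 1
  fi
  printf '%s@%s\n' "$name" "$digest"
}

# Resolve a tagged image (e.g. docker.io/nginx:1.18) to its pinned reference.
resolve_pinned() {
  local json name digest
  json="$(skopeo inspect "docker://$1")" || return 1
  name="$(printf '%s' "$json" | jq -r '.Name')"
  digest="$(printf '%s' "$json" | jq -r '.Digest')"
  pin_ref "$name" "$digest"
}
```

Running resolve_pinned docker.io/nginx:1.18 would then print a reference of the form NAME@sha256:DIGEST, ready to hand to docker run or to record in a deployment manifest. Validating the digest first means a missing or malformed field fails loudly instead of silently producing an unpinned reference.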

Let's bring this full circle and run the specific nginx image using the Digest that we just queried:

$ docker run --rm nginx@sha256:159aedcc6acb8147c524ec2d11f02112bc21f9e8eb33e328fb7c04b05fc44e1c nginx -v
Unable to find image 'nginx@sha256:159aedcc6acb8147c524ec2d11f02112bc21f9e8eb33e328fb7c04b05fc44e1c' locally
sha256:159aedcc6acb8147c524ec2d11f02112bc21f9e8eb33e328fb7c04b05fc44e1c: Pulling from library/nginx
8559a31e96f4: Pull complete
9a38be3aab21: Pull complete
522e5edd83fa: Pull complete
2ccf5a90baa6: Pull complete
Digest: sha256:159aedcc6acb8147c524ec2d11f02112bc21f9e8eb33e328fb7c04b05fc44e1c
Status: Downloaded newer image for nginx@sha256:159aedcc6acb8147c524ec2d11f02112bc21f9e8eb33e328fb7c04b05fc44e1c
nginx version: nginx/1.18.0

Perfect, the Digest was successfully used to get the container image from Docker Hub, and the nginx -v command confirmed the version of nginx that is running inside that container.

Conclusion

It’s odd that the container metadata query functionality is not in the open source docker CLI tool and not prioritized by the developers within Docker or the community. Worse, it makes for a weird set of CLI ergonomics when you have to factor in leveraging a browser to view this metadata.

Skopeo solves a real need of container users: being able to query OCI and Docker repositories via the CLI, just like you would with a traditional package management tool. It allows for programmatic filtering of large amounts of image metadata to get the exact container image a user needs, browser free.

Skopeo can do a lot more than just query repository and image information. It can do most of the administrative operations required by an engineer or team working with container images. This includes repository-to-repository image copying, image format transformation, and image deletion, to name a few. Skopeo recently hit v1.0.0; you can check out the Skopeo GitHub repository for more information.

We’ve helped several clients define and iterate on these software engineering best practices and expertise across critical container deployments. If you have more questions around container adoption or best practices, feel free to reach out to us.

Insight Authors

Rob Hernandez, CTO