How to retrieve a file that has been added then removed in a Docker image?












Assume you have a working directory like this:



$ tree .
.
├── Dockerfile
└── file.txt


And the Dockerfile contains:



FROM debian:9

WORKDIR /usr/src/foo

COPY file.txt .

RUN echo Some random command involving file.txt \
 && rm -f file.txt


And you build and push the corresponding image to a given Docker registry:



$ docker build -t foo/bar .
$ docker login #…
$ docker push foo/bar


Is there a way (or several ways) to retrieve from the image, the contents of file.txt that was added then removed in an intermediate layer? Does the answer depend on the choice of the WORKDIR?















    A comment not an answer, because I think so... It's my understanding that yes there is an intermediate overlay layer with the file. However, if the Dockerfile uses the new multi-stage builds it might not... I'm still learning those myself!

    – Aaron D. Marasco
    Feb 10 at 0:07























docker
















asked Feb 9 at 23:53 by ErikMD


















1 Answer















Is there a way (or several ways) to retrieve from the image, the contents of file.txt that was added then removed in an intermediate layer?




Yes!




Does the answer depend on the choice of the WORKDIR?




No. WORKDIR only sets the working directory for the instructions that follow it (creating the directory if it does not exist); it has no effect on whether earlier layers are kept.





When you build an image from a Dockerfile, each instruction produces a new intermediate image, and instructions that change the filesystem (such as COPY and RUN) add a new layer. An "image" is just a stack of layers that are combined to form the container filesystem when you run a container. Removing a file in a later layer only hides it; the earlier layer still contains it. Each of those layers is stored separately on your disk under /var/lib/docker. For example, let's say I build an image using this Dockerfile:



FROM debian:9
COPY file.txt /root/file.txt
RUN rm -f /root/file.txt


In the same directory as the Dockerfile, I have a file named file.txt that contains the text:



hello world


If I run docker build -t erikmd ., I see:



Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM debian:9
---> d508d16c64cd
Step 2/3 : COPY file.txt /root/file.txt
---> Using cache
---> 6f06029c1cca
Step 3/3 : RUN rm -f /root/file.txt
---> Using cache
---> a2dc62c823c9
Successfully built a2dc62c823c9
Successfully tagged erikmd:latest


Each step in the build process generates a new layer and prints an image ID for the intermediate image that results from all the Dockerfile instructions up to that point. Given the above output, I can run:



$ docker run --rm 6f06029c1cca cat /root/file.txt


And see the contents of the file:



hello world
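
If the build log is long, the intermediate image IDs can be pulled out of it mechanically. The following is a minimal sketch, assuming the classic builder's "Step N/M" / "---> <id>" log format shown above (BuildKit prints a different log); step_images is a hypothetical helper name, not a Docker command:

```shell
# Print "<image-id><TAB><instruction>" for each step of a classic-builder
# log read from stdin. Assumes the "Step N/M : ..." / "---> <id>" format.
step_images() {
  awk '
    /^Step [0-9]+\/[0-9]+ : / {
      sub(/^Step [0-9]+\/[0-9]+ : /, "")   # keep only the instruction text
      instr = $0
      next
    }
    # A lone hex token after the arrow is an image ID; lines such as
    # "---> Using cache" or "---> Running in ..." do not match.
    /^ *---> [0-9a-f]+$/ && instr != "" {
      print $2 "\t" instr
      instr = ""
    }
  '
}
```

Piping the build output through it (docker build -t erikmd . | step_images) should list one ID per instruction; the ID next to the COPY line is the one to pass to docker run.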


But what if I didn't just build the image, and so no longer have that output? In that case, I would start by using the docker image inspect command to look at the list of layers that comprise the image:



$ docker image inspect erikmd | jq '.[0].RootFS.Layers'
[
"sha256:13d5529fd232cacdd8cd561148560e0bf5d65dbc1149faf0c68240985607c303",
"sha256:41494b03ef195ce6db527bd68b89cbebdace66210b4c142e95f8553fcb0bf51e",
"sha256:1948a4bd00b6f1712667bb2c68d1fe6eb60fbbcdf8bad62653208c23bf2602a5"
]


In the above, jq is just a tool for querying JSON data. You could just visually inspect the output of docker image inspect for the same information if you don't happen to have jq handy.



Assuming a default Docker configuration using the overlay2 storage driver, you will find these identifiers in /var/lib/docker/image/overlay2/layerdb/sha256/*/diff. So, for example:



# grep -l 13d5529fd232cacdd8cd561148560e0bf5d65dbc1149faf0c68240985607c303 \
    /var/lib/docker/image/overlay2/layerdb/sha256/*/diff
/var/lib/docker/image/overlay2/layerdb/sha256/13d5529fd232cacdd8cd561148560e0bf5d65dbc1149faf0c68240985607c303/diff


This first layer is the debian:9 image. We can confirm that by running:



$ docker image inspect debian:9 | jq '.[0].RootFS.Layers'
[
"sha256:13d5529fd232cacdd8cd561148560e0bf5d65dbc1149faf0c68240985607c303"
]


...so we'll ignore it. Let's find the second layer:



# grep -l 41494b03ef195ce6db527bd68b89cbebdace66210b4c142e95f8553fcb0bf51e \
    /var/lib/docker/image/overlay2/layerdb/sha256/*/diff
/var/lib/docker/image/overlay2/layerdb/sha256/14347a192896a59fdf5c1a9ffcac2f93025433c66136d3531d7bbb3aec53efc7/diff
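
Note that the directory name found here differs from the layer ID we searched for, whereas for the base layer the two were identical. As far as I know, layerdb directories are named after the layer's chain ID, which the OCI image specification derives from the diff IDs. A minimal sketch of that rule, assuming sha256sum from coreutils is available; chain_id is a hypothetical helper name:

```shell
# Print the chain ID (hex only) of the topmost layer, given diff IDs ordered
# base-first and written with their "sha256:" prefix.
# Assumed rule (OCI image spec): the first chain ID equals the first diff ID;
# each later one is the sha256 of "<parent chain ID> <diff ID>".
chain_id() {
  chain=$1
  shift
  for diff in "$@"; do
    digest=$(printf '%s %s' "$chain" "$diff" | sha256sum | cut -d' ' -f1)
    chain="sha256:$digest"
  done
  # Strip the prefix so the result can be compared with a directory name.
  printf '%s\n' "${chain#sha256:}"
}
```

If the rule holds for your Docker version, calling chain_id with the first two entries of RootFS.Layers should print the directory name that grep reported, which would avoid the search altogether.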


Inside the same directory as that diff file, we'll find a file called cache-id:



# cat /var/lib/docker/image/overlay2/layerdb/sha256/14347a192896a59fdf5c1a9ffcac2f93025433c66136d3531d7bbb3aec53efc7/cache-id
118b1e4a401873e1db8849c0821d0280b4cf9ef621ccb70cf14fe672dc74ef75


That cache-id identifies the directory into which the layer has been extracted; we can find it under /var/lib/docker/overlay2/<id>:



# ls /var/lib/docker/overlay2/118b1e4a401873e1db8849c0821d0280b4cf9ef621ccb70cf14fe672dc74ef75
diff/ link lower work/


We're interested in the contents of the diff/ directory:



# find /var/lib/docker/overlay2/118b1e4a401873e1db8849c0821d0280b4cf9ef621ccb70cf14fe672dc74ef75/diff/
/var/lib/docker/overlay2/118b1e4a401873e1db8849c0821d0280b4cf9ef621ccb70cf14fe672dc74ef75/diff/
/var/lib/docker/overlay2/118b1e4a401873e1db8849c0821d0280b4cf9ef621ccb70cf14fe672dc74ef75/diff/root
/var/lib/docker/overlay2/118b1e4a401873e1db8849c0821d0280b4cf9ef621ccb70cf14fe672dc74ef75/diff/root/file.txt


And there it is!
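
The lookup above (diff ID, then layerdb entry, then cache-id, then diff/ directory) can be wrapped in a small function. This is only a sketch under the same overlay2 assumptions; layer_diff_dir is a hypothetical name, and it needs to run as root against the real /var/lib/docker:

```shell
# Print the overlay2 diff/ directory for a layer, given the Docker data root
# and the layer's diff ID as listed under RootFS.Layers.
layer_diff_dir() {
  root=$1
  diff_id=$2
  # Each layerdb entry holds a "diff" file naming its diff ID; if several
  # entries share one diff ID, the first match is used.
  meta=$(grep -l -F -- "$diff_id" "$root"/image/overlay2/layerdb/sha256/*/diff 2>/dev/null | head -n 1)
  [ -n "$meta" ] || return 1
  # "cache-id" names the directory under overlay2/ holding the extracted layer.
  cache_id=$(cat "$(dirname "$meta")/cache-id") || return 1
  printf '%s\n' "$root/overlay2/$cache_id/diff"
}
```

With the IDs from this example, layer_diff_dir /var/lib/docker 41494b03ef195ce6db527bd68b89cbebdace66210b4c142e95f8553fcb0bf51e would be expected to print the diff/ directory listed above, and file.txt can then be read from root/file.txt beneath it.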





NB All of the above assumes that you're using the overlay2 storage driver (which is the default on most if not all platforms these days). If you're using a different driver, the layout on disk is going to be different.
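
A route that does not depend on the storage driver is to export the image with docker save and search the layer archives inside the resulting tarball. The sketch below assumes the image has already been saved and unpacked into a directory; the layout of that directory varies between Docker versions, so it simply tries every file. find_in_layers is a hypothetical helper name:

```shell
# List every layer archive under DIR (an unpacked `docker save` tarball) that
# contains PATH_IN_LAYER, a path without a leading slash such as root/file.txt.
find_in_layers() {
  dir=$1
  path_in_layer=$2
  find "$dir" -type f | while IFS= read -r candidate; do
    # Not every file is a tar archive (manifests and configs are JSON);
    # tar fails on those and they are skipped. Entry names may or may not
    # carry a leading "./", so both spellings are accepted.
    if tar -tf "$candidate" 2>/dev/null |
        grep -x -F -e "$path_in_layer" -e "./$path_in_layer" >/dev/null; then
      printf '%s\n' "$candidate"
    fi
  done
}

# Example (requires Docker; image name and path taken from the question):
#   docker save -o image.tar foo/bar
#   mkdir image && tar -xf image.tar -C image
#   find_in_layers image usr/src/foo/file.txt
```

Once a layer archive is printed, tar -xOf <layer> <path> writes the file's contents to standard output.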
















answered Feb 10 at 3:04 by larsks





























