Deploy using git archive

When deploying code, I by now almost always use git archive, no matter whether it is library code or actual product code that gets deployed onto a (web) server.

Recently I had a chat with someone who hadn’t heard of it so far, so I realized that perhaps it’s time to write about it.

What is it

git archive is a tool that exports the content of a git repository into an archive file. While doing so, it also uses the information in a .gitattributes file to decide whether to include a file in the archive and whether to modify a file on the way.
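As a minimal sketch of the basic call, the following snippet builds a throwaway repository and exports its HEAD commit into a tarball. The demo identity, the file names and the output name release.tar.gz are assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the example is self-contained.
workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir src
echo '<?php echo "hello";' > src/index.php
git add .
git commit -q -m "Initial commit"

# Export the tree of HEAD into a gzip-compressed tarball.
git archive --format=tar.gz --output="$workdir/release.tar.gz" HEAD

# List what ended up in the archive.
tar -tzf "$workdir/release.tar.gz"
```

Only content that is committed ends up in the archive; untracked files in the working directory are not exported.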

A lot of people already know about the .gitattributes file, as it makes sure that certain files are removed from the archive that GitHub creates for a release.

The documentation for .gitattributes has (amongst a huge amount of other information and awesome things that can be done via that file – but that’s for some other time) a special part about creating archives that talks about two attributes:

export-ignore

export-ignore removes files that are tracked in the git-repository but are not to be part of the distribution archive.

Why is that of interest? As a PHP developer I also run composer install --no-dev --prefer-dist when I deploy code, as I want no development tools in my production setup and I prefer the distribution-ready code of my dependencies. That fetches the archive files of the respective releases and adds them to my vendor folder. By removing files that are not relevant for production from my archive, I can reduce the overall size of a deployment.

So what I usually add to my .gitattributes file are export-ignore entries for things like the test folder and the config files of CI tools.
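A minimal sketch of such a file, wrapped in a shell snippet so that the effect can be checked right away. The paths (.github, tests, phpunit.xml.dist) are assumptions for a typical PHP project and not a definitive list.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical project layout: production code plus development-only files.
mkdir -p src tests .github/workflows
echo '<?php echo "hello";' > src/index.php
echo '<?php // test code'  > tests/IndexTest.php
echo 'name: ci'            > .github/workflows/ci.yml
echo '<phpunit/>'          > phpunit.xml.dist

# Paths marked export-ignore stay tracked in git but are left out of archives.
cat > .gitattributes <<'EOF'
/.github          export-ignore
/tests            export-ignore
/phpunit.xml.dist export-ignore
/.gitattributes   export-ignore
EOF

# git archive reads the attributes from the exported commit,
# so the .gitattributes file has to be committed as well.
git add .
git commit -q -m "Add code and export rules"

git archive --format=tar --output="$workdir/dist.tar" HEAD
tar -tf "$workdir/dist.tar"
```

The listing should only contain the src folder and its content.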

All these files and folders with the export-ignore attribute will not be part of the archive that is created and will therefore not be deployed to production.

export-subst

The other attribute that we can set is export-subst. It took me quite a while to discover this one. But having come from SVN, where substitutions were *a thing*, it was easy to understand.

Whenever we run an export, git will replace certain placeholders in files with respective values. For that to work, the file with the placeholder will need to have the export-subst attribute set though.

Wait.. What?

Imagine you want to tell your users which version of your code they are actually using. That is easy when they cloned your git repo, as the information is readily available. But not so when they downloaded an archive from your repo. We lose that information.

Therefore it can make sense to have a file that contains the commit hash and the commit date along with the name of the committer. That way people know which commit in git this archive maps to.

Then you can add the following into a file (for example your README.md):

$Format:%h$ - committed on $Format:%cD$ by $Format:%cN$

On export, this will be replaced with something like this:

fb235d2 - committed on Thu, 18 Jul 2024 22:25:22 +0200 by Andreas Heigl
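To try this out, here is a minimal sketch that creates a throwaway repository, marks README.md with export-subst and prints the exported file. The demo identity and the directory names are assumptions. Each placeholder is wrapped as $Format:PLACEHOLDERS$, which is the form given in the gitattributes documentation.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Single quotes keep the shell from touching the $Format:...$ placeholders.
printf '%s\n' '$Format:%h$ - committed on $Format:%cD$ by $Format:%cN$' > README.md

# Only files carrying the export-subst attribute get their placeholders expanded.
echo 'README.md export-subst' > .gitattributes

git add .
git commit -q -m "Add README with placeholders"

# Export into a separate directory and show the substituted file.
mkdir "$workdir/export"
git archive --format=tar --output="$workdir/export.tar" HEAD
tar -xf "$workdir/export.tar" -C "$workdir/export"
cat "$workdir/export/README.md"
```

The file in the repository keeps its placeholders; only the exported copy carries the hash, date and committer name.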

For more information on which placeholders can be used, check out the placeholders listed in the pretty formats section of the git log docs.

Caveat

One thing that bugs me extremely is that one can actually use $Format:%(describe)$, which will then output either the tag associated with the current commit or the last tag, the number of commits since that tag and the current short hash.

So it’s either 0.1.0 when the current commit is associated with a tag or it’s 0.1.0-2-g7265c97 meaning the current hash 7265c97 is 2 commits further than the last tag 0.1.0.

This is awesome! I can use that to actually add the release version to all my files on exporting! 🎉

Well. No! As its calculation might be resource-hungry (it has to walk back through previous commits), someone made the decision to expand only one %(describe) placeholder per archive. The gitattributes documentation gives the avoidance of denial-of-service attacks as the reason.

Which means, if you want that in multiple files, you will have to come up with some really ugly bash-scripting code in your CI pipeline when deploying. More on that in a moment. And that workaround won’t work out of the box when using the archive that is automatically created by GitHub or GitLab.
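One way such a workaround could look is to ask git describe for the version once and to patch the exported files afterwards. This is only a sketch under assumptions: the %release-tag% token is made up (git knows nothing about it), and so are the file names.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Two files that should both carry the release version.
# "%release-tag%" is a hypothetical token, chosen for this example only.
mkdir src
echo 'VERSION=%release-tag%' > src/version.txt
echo 'Release %release-tag%' > README.md
git add .
git commit -q -m "Initial commit"
git tag 0.1.0

# Ask git once for the version (--tags also considers lightweight tags).
version=$(git describe --tags)

# Export first, then patch every exported file that contains the token.
mkdir "$workdir/export"
git archive --format=tar --output="$workdir/export.tar" HEAD
tar -xf "$workdir/export.tar" -C "$workdir/export"

grep -rl '%release-tag%' "$workdir/export" | while IFS= read -r file; do
    # Go through a temporary file, as sed -i is not portable.
    sed "s|%release-tag%|$version|g" "$file" > "$file.tmp"
    mv "$file.tmp" "$file"
done

cat "$workdir/export/README.md"
```

Because the replacement happens after the export, it only works in a pipeline that builds the archive itself, not for archives that a hosting platform generates on its own.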

But all in all a pretty cool feature!

Deployments

But what has all that to do with deployments, you might ask yourself.

A lot!

When I deploy code I do not want to deploy unnecessary code. On the one hand to reduce the amount of data transferred. On the other hand to reduce the number of exploitable files.

git archive can help me with that, as it allows me to specify which files I actually want to have in a distribution. So I can remove all the config files for CI tools, my complete test folder and whatever else might be unnecessary directly from my production distribution by adding all those files to the .gitattributes file and then using git archive to get the files I want to deploy.

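My deployment script usually looks something like this: export the code with git archive, install the production dependencies with composer and pack everything into one archive.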
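A minimal sketch of such a script, assuming a Composer-based PHP project. The function name build_release, the tag 1.0.0 and the demo repository are hypothetical, and the transfer to the server is left out.

```shell
#!/bin/sh
set -eu

# build_release REF OUTFILE: export REF, install production dependencies
# and pack the result into OUTFILE. Must be run inside the repository.
build_release() {
    ref=$1
    outfile=$2
    build=$(mktemp -d)
    export_tar=$(mktemp)

    # 1. Export the tracked files, honouring export-ignore and export-subst.
    git archive --format=tar --output="$export_tar" "$ref"
    tar -xf "$export_tar" -C "$build"
    rm -f "$export_tar"

    # 2. Install production dependencies into the exported tree.
    #    Skipped if there is no composer.json or composer is not installed.
    if [ -f "$build/composer.json" ] && command -v composer >/dev/null 2>&1; then
        composer install --no-dev --prefer-dist --working-dir="$build"
    fi

    # 3. Pack code and vendor folder into one archive for the server.
    tar -czf "$outfile" -C "$build" .
    rm -rf "$build"
}

# Demo on a throwaway repository (all names made up for illustration).
workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir src tests
echo '<?php echo "hello";' > src/index.php
echo '<?php // test code'  > tests/IndexTest.php
printf '%s\n' '/tests export-ignore' '/.gitattributes export-ignore' > .gitattributes
git add .
git commit -q -m "Initial commit"
git tag 1.0.0

build_release 1.0.0 "$workdir/release-1.0.0.tar.gz"
tar -tzf "$workdir/release-1.0.0.tar.gz"
```

Running composer inside the exported tree rather than in the working copy keeps development dependencies out of the final archive.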

Now I have a (hopefully) fully working production environment with replaced placeholders and all dependencies required for production in one archive! All that is left for me to do is move that archive to the production server and extract it there.

In most of my projects I either have a script deploy.sh that does all that, along with actually moving the files onto the production server and finalizing the deployment, or I have a CI pipeline that does that whenever I push a tag.

In either case I do not have to worry about “How do I do a deployment again?” I just tag and push. And then check whether there is a deploy.sh. If so I run that…

Pros

  • No unnecessary files on production (No phpunit for example)
  • small file to transfer to production
  • No git required on production (some people use it to check out there)
  • No composer required on production
  • Replace placeholders with metadata from the git repository
  • Replace tags within multiple files

Cons

  • requires a bit more work initially
  • … ? 🤷
  • what else?
