
Can Git LFS scale for screenshot tests?

Arnold Noronha
Founder, Screenshotbot

So you're trying out screenshot tests/snapshot tests! You now need to make a decision: where do you store your screenshots?

You have a few options:

  • Store them in Git
  • Store them in Git LFS
  • Build some custom tooling using Amazon S3
  • Use an integrated tool such as Screenshotbot

We will explain each of these options in detail. In particular, we want to demonstrate that Git LFS is not an optimal solution for your screenshot tests: there are easier alternatives that scale much better and can be set up much more quickly.

Limitations of storing in Git

The path of least resistance for screenshot tests appears to be just storing your screenshots in Git. However, this hits issues very quickly.

A single screenshot is about 50kB (for a full-screen feature screenshot). If you have a thousand screenshots, that's 50MB: not bad. (By the way, it's really easy to get to a thousand screenshots: between dark mode/light mode, adaptive layouts, phones/tablets, languages, and window insets, the number of variants quickly multiplies.)

So we're at 50MB for screenshots... not bad, but remember that Git stores the entire history of screenshots. If you have 100 commits that change almost all of the screenshots (say a font or color change), you'll soon be using 100 × 50MB = 5GB of storage!
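
The arithmetic above can be sketched in a few lines of Python. This is only a rough upper-bound estimate: it assumes every one of those commits rewrites every screenshot and that Git cannot meaningfully compress the changed images against their previous versions. The function name `history_size_mb` is ours, purely for illustration.

```python
def history_size_mb(num_screenshots, avg_size_kb, full_rewrites):
    """Rough estimate of screenshot storage, in MB.

    Returns (size of one full set of screenshots, size of the whole history),
    assuming each of `full_rewrites` commits replaces every screenshot.
    """
    snapshot_mb = num_screenshots * avg_size_kb / 1000
    return snapshot_mb, snapshot_mb * full_rewrites


# Numbers from the example above: 1000 screenshots, 50kB each, 100 commits.
snapshot_mb, history_mb = history_size_mb(1000, 50, 100)
print(f"one snapshot: {snapshot_mb:.0f} MB, history: {history_mb / 1000:.0f} GB")
```

Plug in your own screenshot count and commit frequency to see how quickly your repository would grow.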

In fact, GitHub imposes a limit of 2GB for your repositories (for the free plan), so this is clearly not going to scale. In addition, you'll have to download all 5GB of screenshots each time you clone the repository, slowing down both your CI jobs and your developers.

Limitations of storing in Git LFS

Git LFS seems to be the most popular option when storing screenshot tests, and it makes sense since it's the easiest migration option from vanilla Git.

Most providers have high limits for how much you can store in Git LFS, so you're not going to hit any limits in practice.

In addition, you no longer have to store the entire history of screenshots as part of your repository, so you're not pulling the entire history of screenshots each time you clone. So from the previous example, if you had 5GB of screenshots in the history you're only cloning 50MB of screenshots per clone.

Many larger companies go a step further and automate the process of screenshot generation, just handling storage of screenshots in Git LFS.

This certainly solves the primary bottlenecks, but some others remain, and some new ones appear:

Needs to clone screenshots on every CI job

Each CI job needs to fetch all of the current screenshots. This slows down the clone step, which blocks CI for all your developers (whether or not they are making UI changes).

Adds sources for networking flakiness

Each time your CI job accesses the network, you increase the chance of a networking failure. This is usually the source of complaints about Git LFS being flaky. The more screenshots you have and the more CI jobs you run, the more flakiness you'll see. Ideally we should be using file-hashing to work out which files need to be downloaded or uploaded, and only transfer files that haven't been seen before.

Fetching image history is slow

An important aspect of screenshot testing is looking at the history of screenshots to bisect regressions. If your screenshots are in Git LFS, the history is going to be slow to fetch, which means developers are unlikely to actually use this ability.

DevOps overhead

Many teams have dedicated engineers just to manage Git LFS. Smaller teams that don't have experience with Git LFS are unlikely to be able to set this up and maintain it.

Developer tooling

You have to make sure every developer knows how to set up and use Git LFS.

How can Amazon S3 help?

A more scalable solution is to use file-hashing to prevent network transfers as much as possible.

Suppose you have an S3 bucket, and all your screenshots are stored in it. Let's name the files /{SHA256}.png.

In your Git repository you no longer store the actual screenshots (either with Git LFS or Git). Instead, you store a JSON mapping from the name of the test to the SHA256 of the generated image. The actual image will be stored in the S3 bucket.
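
As a minimal sketch of that mapping, the snippet below hashes every PNG in a directory and writes the result as JSON. The layout is an assumption on our part: we take the test name to be the file name without its extension, and the helper names (`sha256_of_file`, `build_manifest`, `write_manifest`) are hypothetical, not part of any particular library.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path):
    """Return the hex SHA256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(screenshot_dir):
    """Map test name -> SHA256 for every PNG in screenshot_dir.

    Assumption: the test name is the file name without the .png extension.
    """
    return {
        png.stem: sha256_of_file(png)
        for png in sorted(Path(screenshot_dir).glob("*.png"))
    }


def write_manifest(manifest, manifest_path):
    """Write the mapping as stable, diff-friendly JSON to commit to Git."""
    text = json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    Path(manifest_path).write_text(text)
```

Sorting the keys keeps the JSON file stable between runs, so a commit only shows the lines for screenshots that actually changed.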

Your CI job first runs the test, which generates the image. It then compares the hash of the generated image against the hash recorded in the repository: if the hash hasn't changed, the test passes. If the hash has changed, it uploads the new screenshot to S3 and fails the test.
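
The CI step might look roughly like the sketch below. To keep it self-contained, the actual upload is passed in as a callable: `upload` and `check_screenshot` are hypothetical names, and in a real setup `upload` would wrap whatever S3 client you use.

```python
import hashlib
from pathlib import Path


def check_screenshot(test_name, image_path, manifest, upload):
    """Compare a freshly generated screenshot against the committed hash.

    manifest: dict mapping test name -> expected SHA256 (loaded from the JSON
              file in the repository).
    upload:   hypothetical callable taking (key, data); for example, a thin
              wrapper around your S3 client's upload call.
    Returns True if the test passes, False if the screenshot changed.
    """
    data = Path(image_path).read_bytes()
    new_hash = hashlib.sha256(data).hexdigest()

    if manifest.get(test_name) == new_hash:
        # Unchanged: no image data is sent over the network.
        return True

    # Changed (or new) screenshot: store it under its hash and fail the test.
    upload(f"{new_hash}.png", data)
    return False
```

Because files are keyed by their hash, uploading the same image twice is harmless, and a wrapper could skip the upload entirely when the key already exists in the bucket.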

Note that in this new scheme image data is very rarely transferred over the network. This improves clone times, and increases build reliability by reducing the possibility of network flakiness.

However, you do lose the image-comparison tools provided by GitHub. When you create a Pull Request, you won't be able to review the screenshots within your Pull Request. Instead, you'll have to build some kind of custom dashboard to review the screenshots by pulling the PNG files from S3.

Using a dedicated tool such as Screenshotbot

A tool such as Screenshotbot (which is completely open-source) essentially does the previous optimization, but handles it all for you automatically, with tight integrations with most of the well-known screenshot testing libraries on iOS, Android, Flutter and more.

Since it plugs into your existing pipelines, it spares you the developer tooling and DevOps overhead of building a custom integration with S3.

Screenshotbot also pre-computes resized and compressed versions of images, which means network traffic can be significantly reduced when looking at the history of screenshots (and it has some powerful tools for bisecting regressions).

Summary

Whichever solution you end up going with, we're glad you're doing screenshot testing! It's one of the easiest ways to increase your test coverage, while improving your developer productivity.

As you scale up your screenshot tests, you're going to need to build tooling around it. In this article, we've explained what you need to know about scaling concerns, but also demonstrated that it can be solved quite simply with an integrated tool such as Screenshotbot.
