One in every 600 websites has .git exposed

For web developers, exposing your .git folder to the world is a novice mistake. It allows anyone to download your entire source code repository, which often includes database passwords, salts, hashes, and third-party API keys or usernames and passwords.

Over the years, for another personal project, I’ve built up a database of 1.5m reasonably respected domains. These are all either authority sites (such as the BBC or Guardian, or government, educational or military domains), or are domains that have inbound links from at least one of those types of sites.

Out of those 1.5m sites, 2,402 have their .git folder exposed and downloadable. That’s 1 in 600 decent respectable sites, or 0.16% of the internet, that is dangerously exposed.

Some of these .git repositories are harmless, but from a random sample many contain dangerous information that provides a direct vector to attack the site. Hundreds listed database passwords, or included API keys for services such as Amazon AWS or Google Cloud. Others included FTP details to their own web server. Many contained database backups in .SQL files, or the contents of hidden folders that are meant to be restricted.

One prominent human rights group exposed every single person who had signed up to a gay rights campaign (including their home address and email addresses) in a CSV file in their Git repository, publicly downloadable from their website. One company that sold digital reports provided its entire database of reports free of charge to anyone who wanted to download their .git folder.

So developers, please, please check that your .git folder is not visible on your website at http://www.yourdomain.com/.git/. If it is, lock it down immediately. Ideally delete the folder and find a better way to deploy your code, or at least make sure access is forbidden using an .htaccess rule. Then assume that someone has downloaded everything already and work out what they could have seen. What passwords, salts, hashes or API keys do you need to change? What data could they have accessed? What could they have done to alter or impair your service?
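
As a rough illustration of that check, here is a minimal Python sketch that requests the HEAD file inside the .git folder and tests whether the response looks like a real Git HEAD file rather than an error page. It asks for /.git/HEAD instead of the folder listing because a server can refuse to list the directory while still serving the files inside it. The function names, the 200-byte read limit and the timeout are my own assumptions for illustration, and the placeholder domain is the one used above; only point this at sites you are responsible for.

```python
import http.client
import re
import urllib.error
import urllib.request

# A .git/HEAD file normally holds either a symbolic ref ("ref: refs/heads/...")
# or a bare commit hash (40 hex characters for SHA-1, 64 for SHA-256).
_HEAD_PATTERN = re.compile(r"^(?:ref: refs/\S+|[0-9a-f]{40}|[0-9a-f]{64})\s*$")


def looks_like_git_head(body: str) -> bool:
    """Return True if the text has the shape of a .git/HEAD file."""
    return bool(_HEAD_PATTERN.match(body))


def git_folder_exposed(base_url: str, timeout: float = 10.0) -> bool:
    """Fetch <base_url>/.git/HEAD and report whether it looks like a HEAD file."""
    url = base_url.rstrip("/") + "/.git/HEAD"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # HEAD files are tiny, so a short read is enough to classify them.
            body = response.read(200).decode("ascii", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # 403/404 responses, DNS failures, timeouts and malformed URLs land here.
        return False
    return looks_like_git_head(body)


# Example call (hypothetical placeholder domain):
# git_folder_exposed("http://www.yourdomain.com")
```

A result of False only means this one file was not served in a recognisable form. It does not prove the folder was never exposed, so the advice about rotating credentials still applies.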

And then please spread the word among other developers too – because right now this must be one of the biggest holes in the internet.

38 thoughts on “One in every 600 websites has .git exposed”

  1. Great post and thoughts!
    I’ve found a very useful tool for the case where you discover that keys were exposed somewhere in your repo’s history: the BFG Repo-Cleaner (https://rtyley.github.io/bfg-repo-cleaner/), which rewrites your repo’s history *way* faster than the regular git approach and lets you keep that history while no longer exposing your secrets.

    Still, step #1 should always be to reset-regenerate keys of anything you were using and *always* use env variables for keys instead of files.

    1. You’re missing the point: if it’s ever been exposed, you should assume at least one person has already downloaded it. Rewriting your repo after that is pointless unless you have proof positive that nobody has already caught you out. Even then, you should consider if anything in your git repo would give them enough access to hide their tracks…

  2. I discovered almost the exact same problem in a very large production site once, except it was the .svn directory. I wondered how bad the problem was, but never got around to testing it. You’d think this would be a standard check in Nessus & the like.

  3. For an article pleading with users to hide their .git repos, you don’t provide even a .htaccess or nginx rule showing how.

    1. nginx doesn’t expose .git folders by default. The solution is to stop using apache with the insane directory listing default.

    2. A total stranger points out something useful, and suddenly he owes you even more information? Crawl back into your cave, please.

    3. If you can’t Google something simple like that, maybe you should not be messing with the configuration of your servers at all. Just call somebody that knows what he/she is doing and let them do it.

    4. Thanks for your thoughts! I did think about that, but I’d really rather you didn’t do it in .htaccess or with an nginx rule, but instead just deleted the .git folder and found a different way to deploy.

    5. If you must have your .git folder on your web server, you can also have it one directory up and therefore outside of the public_html/www directory.
      Thanks for the article. Very interesting. I was just discussing this with some other devs a few weeks ago!

  4. If one still wants to use git, it should be better to move the .git out of the web-root completely and specify --git-dir and --work-tree when working with the repository. This way the .git cannot be exposed even if the .htaccess gets deleted or something similar.

  5. Well nothing secure should be in your GIT repo.

    I personally put my entire website repo on GitHub freely available. The key is ensuring nothing secure is in your REPO or previous revisions.

  6. The internet and the world wide web are related, but not the same, and not even interchangeable. You too can make one fewer rookie misteak here.

  7. # Block access to all hidden files and directories with the exception of
    # the visible content from within the `/.well-known/` hidden directory.
    #
    # These types of files usually contain user preferences or the preserved
    # state of an utility, and can include rather private places like, for
    # example, the `.git` or `.svn` directories.
    #
    # The `/.well-known/` directory represents the standard (RFC 5785) path
    # prefix for “well-known locations” (e.g.: `/.well-known/manifest.json`,
    # `/.well-known/keybase.txt`), and therefore, access to its visible
    # content should not be blocked.
    #
    # https://www.mnot.net/blog/2010/04/07/well-known
    # https://tools.ietf.org/html/rfc5785

    RewriteEngine On
    RewriteCond %{REQUEST_URI} "!(^|/)\.well-known/([^./]+./?)+$" [NC]
    RewriteCond %{SCRIPT_FILENAME} -d [OR]
    RewriteCond %{SCRIPT_FILENAME} -f
    RewriteRule "(^|/)\." - [F]

    via “https://github.com/h5bp/server-configs-apache”

  8. .git, .svn or any other directories created by version control systems must be deleted. In most cases, the metadata in these directories can easily be used to reconstruct the whole source code, and possibly the sensitive information in that code. There are tools across the internet (it’s easy to write one) that rip all the information based on the standard structure of VCS tools (such as .git/config in the case of .git). So, disabling directory browsing is not enough. Of course, you can make your rules more strict, but my golden rule is to wipe .git as part of the deployment process.

  9. Interesting find. In addition to not exposing your .git directory on your website, developers should also avoid storing sensitive information in git in the first place. If you go to github and do a code search for certain patterns you’ll easily find ssh keys, api keys and even passwords in thousands of repos.

    1. Yes that’s how statistics works generally. You take a large enough sample of a population and infer something about the whole population from that sample. This sample size is huge, maybe 100 times what was needed for a 99% degree of confidence.

  10. Some devops practices to prevent this:
    1. never check code out directly to your public web server
    2. script or use tools to build packages containing your code: rpm, deb, docker, zip, tar, anything
    3. script/develop a release and deployment pipeline to push your packages to the server
    4. keep some packages around to roll back
    5. if you don’t want to use packages, rsync your code over and have it ignore these dirs

  11. To block access to the .git directory of your application, add the line below to your .htaccess file.

    RewriteRule "(^|/)\." - [F]

  12. I’d been meaning to ask why a spider with your domain as the user-agent keeps looking for .git on all my work domains. I’ve got it set up to email me when someone tries to access .git on my servers, with a view to feeding it into fail2ban at some point 😉

    On a similar note, I get scanned a lot by people looking for PHP Composer packages with known exploits in the path /vendor/* on our sites too (again, a rewrite rule to stop direct access to that directory fixes that, or better yet, place the whole vendor directory above the webroot).
