In a non-profit that I collaborate with [1] we use private repos to keep the ser...

cloverich · on May 11, 2016

You shouldn't be keeping API keys or other sensitive information in git at all. And please note -- if you do remove it from git, it will still be available in your git history, so that needs to be taken care of as well (should the repo ever become public -- a common "exploit").

geofft · on May 11, 2016

Git is just a format for storing data with a record of how that data changed. Saying you shouldn't store it in git seems rather like saying you shouldn't store it in btrfs. It's true that if your btrfs disk image becomes public, the data is recoverable, and it's hard to reliably scrub deleted files from btrfs, but that doesn't mean it's the wrong tool for a filesystem (or a repo) that stays entirely internal.

Saying that you shouldn't keep it on GitHub is different, and I might be more inclined to agree with that, but it still seems like it's not a 100% rule.

Xylakant · on May 11, 2016

> Saying that you shouldn't keep it on GitHub is different

I'd be willing to argue about that, but for my New Relic API key, a private GitHub repo is sufficiently safe - even if I'd prefer that nobody starts having their servers report to my account.

neoeldex · on May 11, 2016

A git repo is usually shared across multiple machines/developers, so the chance of someone publishing it is larger. The entire history is also usually copied everywhere.

geofft · on May 11, 2016

All of these machines and developers have legitimate access to the secret in question, though. Hence my framing of git as just a file storage format—any other mechanism provides the technical means for any of these machines or developers to publicize it. (And a few other simple mechanisms, like "scp the secret from another machine" or "copy/paste it with your terminal", have an increased risk of doing so by accident. Accidentally making a git repo public is generally unlikely.)

Xylakant · on May 11, 2016

Why would that be a general rule? I track my personal passwords in git (using pass). I'm the only person with access to that repo. I just like to have a history - and the convenient way of moving files around and merging changes.

rdancer · on May 11, 2016

> Saying you shouldn't store it in git seems rather like saying you shouldn't store it in btrfs.

Best practice is to avoid storing secrets in plaintext, or sharing secrets between users/roles. Yours isn't an argument in favour of git, it is an argument against btrfs.

(I don't have any problem with storing passwords in that way, I'm just pointing out why it's not the best practice.)

geofft · on May 11, 2016

> Best practice is to avoid storing secrets in plaintext

How do you store them, then? If they're encrypted with a password, how do you store that secret?

I'm pretty sure best practice is in fact to store things like SSL private keys, cookie HMAC secrets (e.g. Django's SECRET_KEY), and so forth on local disk unencrypted, protected only by filesystem permissions (and with the host OS as a whole protected by standard means). In fact I'm not even sure it's possible to store OpenSSH private keys encrypted.
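As a rough illustration of "protected only by filesystem permissions", here is a minimal Python sketch, assuming a POSIX system. The helper names `write_secret` and `is_owner_only` are hypothetical and not taken from any tool mentioned in this thread.

```python
import os
import stat
import tempfile


def write_secret(path: str, secret: str) -> None:
    """Create `path` with owner-only permissions, then write the secret to it."""
    # O_EXCL refuses to overwrite an existing file; 0o600 is applied at creation
    # time, so the file is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret)


def is_owner_only(path: str) -> bool:
    """Return True if neither group nor others have any permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A deployment script could call `is_owner_only` before starting the service and refuse to run if the check fails, which is similar in spirit to how OpenSSH rejects private key files with overly open permissions.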

> or sharing secrets between users/roles.

There's only one role here: the application that has an API key. There are multiple developers of that application, and possibly multiple instances of that application, but it's a single role.

nhaehnle · on May 11, 2016

OpenSSH client private keys can be stored encrypted - that's what ssh-agent is for: it allows you to enter the key passphrase only once and then remember it for the rest of your desktop session.

OpenSSH server private keys, on the other hand - I don't think that makes a whole lot of sense. Unless you have a threat model that forces you to encrypt the entire server disk, but then adding private key encryption on top of that doesn't make much sense either.

geofft · on May 11, 2016

Right, exactly. (I did mean to say "server", thanks.) It sounds like the secrets in question are essentially analogous to OpenSSH private keys: they allow a server / service to prove its own identity to others, and the servers should be able to launch automatically at boot so there's not a reasonable place to enter a passphrase.

rdancer · on May 11, 2016

TPM.

It's just one role, but multiple users.

geofft · on May 11, 2016

Do you have any tools you recommend for that? I love TPMs, but this seems wildly impractical for a small project with developers who aren't excited about becoming TPM experts.

Also, does this rule out hosting on clouds that don't offer vTPM support? (Are there any that do?)

rdancer · on May 11, 2016

There are dedicated discrete HSMs that can be installed. That's what I would do. Or, rather, wouldn't. I agree with you that it would be very impractical, unless the platform has a first-class API:

Chrome OS uses TPM heavily[1], and iOS has the Secure Enclave. The standard TPM API is PKCS#11, so any hardware that speaks it can be used with any software that speaks it.

Problem with TPM is that the whole hardware and software stack needs to be secure, which in practice means it needs to be designed top-down with awareness of the TPM, and audited. The secrets must not be cached, written to file system, kept in memory, leaked over network. There are implementations such as Trousers[2], but it's more or less just a proof of concept; it may provide additional security, but most likely you're just using a very complex lock, and leaving the key under the mat.

[1] https://www.chromium.org/developers/design-documents/tpm-usa...
[2] http://trousers.sourceforge.net/man/tpmtoken_setpasswd.1.htm...

Xylakant · on May 11, 2016

care to explain why? I need to keep my API keys somewhere so I can roll them out to the machine. Keeping them in git is as good as any storage - what would you propose instead? A shared dropbox account?

uxp · on May 11, 2016

The newfangled approach is something like HashiCorp's Vault, which is a dream when you're looking at more than half a dozen systems with similar roles. A different approach that I like to use for single or smaller cluster systems is Ansible's Vault and rolling out config files based on templates per environment. All actual config files are gitignored so I don't have to deal with conflicts on the server if I use a git-pull style deployment, and Ansible itself can back up/version them whenever they change.
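The "config files based on templates per environment" idea can be sketched in a few lines. This is not Ansible or Ansible Vault; it uses Python's `string.Template` as a stand-in, and the template fields, environment names, and placeholder values are all assumptions made up for illustration.

```python
from string import Template

# The template is safe to commit; it contains no secret values.
CONFIG_TEMPLATE = Template(
    "environment = $environment\n"
    "newrelic_license_key = $newrelic_license_key\n"
)

# Hypothetical per-environment values. In a real setup these would come from
# an encrypted store, not from a literal in a tracked source file.
SECRETS = {
    "staging": {"newrelic_license_key": "staging-placeholder"},
    "production": {"newrelic_license_key": "production-placeholder"},
}


def render_config(environment: str) -> str:
    """Fill the template with the values for one environment."""
    values = {"environment": environment, **SECRETS[environment]}
    # substitute() raises KeyError if the template names a missing value,
    # so an incomplete environment fails loudly instead of rendering blanks.
    return CONFIG_TEMPLATE.substitute(values)
```

The rendered output is what would be written to the gitignored config file on the server, while only the template lives in the repo.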

Additionally, git does keep that history (as it's supposed to), so if you just delete the key from a private repo as you're trying to make the repo public, it's trivial for someone to walk the commit history looking for historical API keys that might not have been rotated. In order to purge that information from git, you then have to go re-write the commit graph from the point of the key's insertion (with it removed) all the way to the present. It's not impossible to do, it's just a major pain.
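To make "walk the commit history looking for historical API keys" concrete, here is a minimal sketch that scans the text produced by `git log -p` for added lines that look like key assignments. The function name and the regular expression are illustrative assumptions; real scanners use far more thorough rules.

```python
import re
from typing import List, Tuple

# Illustrative pattern only: "api_key" or "secret", then ":" or "=", then a
# token of 16 or more characters. Real keys come in many other shapes.
KEY_PATTERN = re.compile(
    r"(api[_-]?key|secret)\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{16,}",
    re.IGNORECASE,
)


def find_added_secrets(patch_text: str) -> List[Tuple[str, str]]:
    """Return (commit, line) pairs for added lines that look like secrets."""
    hits = []
    commit = "unknown"
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # `git log` starts each entry with "commit <hash>".
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            # "+" marks an added line; "+++" is the file header of a diff.
            if KEY_PATTERN.search(line):
                hits.append((commit, line[1:].strip()))
    return hits
```

Feeding it the output of `git log -p --all` lists every commit that ever added a matching line, including ones whose key was later deleted from the working tree.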

Xylakant · on May 11, 2016

I'm aware of the implications concerning the history, but sorry, the machine park is two machines. Setting up vault would just be total overkill. The people that have access to that repo change like once every few years. The repo will never go public. Let's keep the solution at least somewhat tailored to the problem.

uxp · on May 11, 2016

Hey, I'm not arguing one way or the other. I like using Ansible for configuration in the way I work. I can trust that I can show my best friend and my worst enemy my project and they won't have the capability of making my life hell. Rock on though. Use the simplest solution for the problem at hand. If you're just managing two boxes, I'd have a hard time coming up with an argument for adding more complexity to the setup only to leave it essentially unchanged.

Xylakant · on May 11, 2016

It's all Chef-based so we could be using encrypted data bags, but as anybody with access to the repo has root on the machines anyway, there's little to gain there as well, especially given the very limited security implications. I'd be more worried about somebody adding their account to the sudoers list than about them stealing the secret data. But hey, things were that way when I joined and there are better places to spend my time to improve security.

neoeldex · on May 11, 2016

All keys are tracked in git's history, so it's a possible attack vector for hackers. You could use https://github.com/sobolevn/git-secret. But beware: whichever tool you use, you should regenerate all the stored secrets whenever someone's access is revoked.

woah · on May 11, 2016

What the hell? Are you defending a decision to keep keys unencrypted in a git repo?

geofft · on May 11, 2016

It's a perfectly defensible decision. The standard cryptographer's reply at this point would be, what is your threat model?

If "A developer could have their GitHub account broken into" or "Someone could break into GitHub deeply enough that they could access private repos" are in your threat model, you shouldn't be using GitHub at all for anything, including code, because it would be straightforward to use that access to subvert your site in other ways. Which is to say, especially for small sites, that's not a useful threat model.

If "You might do a git commit to remove them, then push the repo somewhere" is in your threat model, then the answer is just "Don't do that" (or more precisely, "Make sure everyone on the team understands that can't be done without precautions"). The easiest way to don't-do-that is to have them in a separate git repo from your code. But either way, as projects grow, there's going to be stuff in your git history you don't want to be public (like, oh, git commit -m "Implementing this stupid feature because this customer is stupid") because human error happens sometimes. So if you want to publish a previously-private codebase, the only robust approach is to copy all the files into a fresh directory with no git history and make a new initial commit there.
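The "copy the files, leave the history behind" step might look like the following minimal sketch. The function name is hypothetical, and it only drops the `.git` directory: any secret still present in the current working tree would be copied along, so those files need to be removed or excluded first.

```python
import os
import shutil
import tempfile


def export_without_history(src: str, dst: str) -> None:
    """Copy the working tree at `src` into a new directory `dst`, skipping `.git`."""
    # ignore_patterns is applied at every directory level, so nested `.git`
    # directories are skipped as well. `dst` must not exist yet.
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(".git"))
```

Running `git init` and making a first commit in the new directory then yields a repo whose history starts from the published state.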

And the other part of the cryptographer's reply is, where else are you going to store the secrets and what are its security properties?

Xylakant · on May 11, 2016

Yes, indeed I am. I'm all in favor of keeping the tools used at a level where the effort matches the value of the goods being protected. I totally could lock up my New Relic API key in a bank safe, double encrypted with two people's 4096-bit GPG keys, but that would be a little overkill, wouldn't it? Do you do that? I'd be moderately annoyed if somebody started pushing false metrics to my NR account, but that's about all the damage they could do with the information in that repo. So what level of effort would you propose?

lox · on May 11, 2016

Agreed, we do this as well at some scale. The vast majority of application configuration falls into this category. The advantage of storing them in a git repo (we use a different git repo to the main codebase) is that you can re-use the same access control mechanisms (note that is not the same as giving the same people access to the different repos) and you get strong change history.

imdsm · on May 11, 2016

> we use private repos ... that contain sensitive information (user data).

Wait, what?

Xylakant · on May 11, 2016

email addresses and (account) names of people reporting bugs in private. Some people prefer it that way. Nothing "sensitive sensitive". Sorry for being unclear.

therealmarv · on May 11, 2016

Never store sensitive data like API keys in a repository. Or you can do that, but encrypt it so that nobody who can view your repo (even if it's private) can use that data immediately. It's like storing passwords in plaintext in a DB. Every (DB) admin will tell you: never do that.

I recently had to deal with this problem myself while doing server setup with a private repo. I'm using Ansible and Ansible Vault to encrypt sensitive data, and the encryption key itself is accessible only to certain members of our team (via a password safe): http://docs.ansible.com/ansible/playbooks_vault.html

Xylakant · on May 11, 2016

See, the whole repo is accessible to the members of the team that are allowed to see the secret - basically the two folks that have root on the machine anyway. There's very limited use in encrypting the repo. There are no SSL keys or any secrets that would require tight security. It's basically our New Relic key and some other API keys for reporting services. Even if that repo were breached, you could only start sending fake data to those services.

I'm more concerned about someone hacking the machine than someone hacking github to access the repo and retrieve the newrelic key from there.

therealmarv · on May 11, 2016

ok, agree. That is not that critical.

thomasahle · on May 11, 2016

And you share the passwords with enough volunteers that per-user pricing becomes a problem?

Xylakant · on May 11, 2016

I don't handle the account in that case, so I can't even say if it's free or not. I was just replying to the implied question "why would a nonprofit org with an OS project need private repos?"

pc86 · on May 11, 2016

Fix your security.

rdancer · on May 11, 2016

Or... else?