Docker, CoreOS, and the entire ecosystem are moving really fast; as a developer it's a bit hard for me to keep up. For some projects we are creating containers with Docker, and we will need to get some things into production as soon as Docker 1.0 is declared stable.
As much as I want to switch to CoreOS, I also want to wait just a little while for blogs, tutorials, and tools to catch up to where CoreOS and Docker are, and become usable in production by people who are not professional devops.
The most exciting thing for me in this post is CoreOS CloudInit. It seems to be one of those tools that a small shop could use. It looks a bit like fig's YAML, but it is something that could be used in production as well. At the moment I have been trying to solve everything on my own with Makefiles: Makefiles for custom variables and Makefiles for commands that can be run against each container. That works great for dev, but I could never really see how to run the containers in prod; the Makefiles never seemed like a prod solution. CoreOS CloudInit looks like it could work.
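For anyone who hasn't looked at it yet, a minimal cloud-config is just YAML handed to the machine at boot. A rough sketch of the shape (the discovery token, unit name, and image are made-up placeholders, not anything from a real setup):

```yaml
#cloud-config
coreos:
  etcd:
    # Placeholder token -- generate your own at https://discovery.etcd.io/new
    discovery: https://discovery.etcd.io/<token>
  units:
    - name: app.service
      command: start
      content: |
        [Unit]
        Description=Example app container
        After=docker.service
        Requires=docker.service

        [Service]
        ExecStart=/usr/bin/docker run --rm --name app myorg/app
```

The nice part compared to Makefiles is that the same file describes both the cluster wiring (etcd) and the containers to run, and CoreOS applies it on every boot.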
We also skipped boot2docker and use Vagrant with SSH, so seeing that there is a Vagrant box that runs just like an EC2 box is pretty exciting. I would be excited if Digital Ocean started supporting CoreOS too.
> Quite a few people would like CoreOS on Digital Ocean
I'm one of those, and currently trying to figure out a hacky workaround to make it possible.
I'm already doing this[1] to get a current Ubuntu kernel running on DO. The same principle would seem to apply: boot into the CoreOS kernel, but with an initrd that mounts a different btrfs subvolume than the "bootstrap" OS.
This would require DigitalOcean to have a btrfs-formatted image, though, because they don't offer any re-partitioning support... maybe having the initrd mount a loopback image containing a btrfs filesystem would work?
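The loopback idea is easy to prototype on any Linux box. A rough sketch, assuming GNU coreutils and btrfs-progs; the path and size are illustrative, and the mkfs/mount steps need root so they're left commented:

```shell
# Create a sparse file to back a loopback btrfs filesystem (illustrative).
IMG=$(mktemp /tmp/coreos-root.XXXXXX.img)
truncate -s 10G "$IMG"     # sparse: allocates no real blocks up front
stat -c '%s' "$IMG"        # apparent size in bytes: 10737418240
# The remaining steps need root and btrfs-progs:
# mkfs.btrfs "$IMG"
# mount -o loop "$IMG" /mnt/newroot
```

An initrd would do the mkfs/mount part once at first boot and then pivot_root into the loopback filesystem, which sidesteps the missing re-partitioning support.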
I'm getting to the point, though, where it might be less effort to just start my own CoreOS-centered hosting service than to continue with DO...
I looked at this and had the impression you can contact support at DO and they'll mount a rescue-mode image where you can set up btrfs manually. The idea of having to depend on support just to boot an arbitrary image didn't really appeal to me, though, so I wound up using Linode and creating the root filesystems from their Finnix image, which worked fine.
Last I heard, btrfs was still not considered stable enough for a main production filesystem. Has this changed recently? Even as late as last year I had not heard of production 'success' stories with it.
Unless something has majorly changed, the idea of using btrfs for the main FS is a bit scary.
I've been a big proponent of ZFS, but recently btrfs has been making a lot of good progress with features. Most interesting to me is the ability to rebalance and switch redundancy levels. I've had a 20TB btrfs array in production as a backup target for a couple of months now and it's been doing OK. I've even switched it from raid6 to raid10 and back. There are still a lot of rough edges, though, which is why I'm only using it as a backup.
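For reference, the redundancy-level switch described above is a single (long-running, online) command. Assuming the array is mounted at /mnt/backup, the transcript looks roughly like:

```
$ sudo btrfs balance start -dconvert=raid10 -mconvert=raid10 /mnt/backup
$ sudo btrfs balance status /mnt/backup
```

The `-dconvert`/`-mconvert` flags rewrite the data and metadata chunks to the new profile while the filesystem stays mounted, which is the part ZFS famously can't do for vdev layouts.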
RHEL7 will support btrfs as a technology preview, which means they'll likely support it fully in a point release. That is a HUGE vote of confidence coming from their QA team.
Do any of you know why btrfs is being pushed rather than [the more mature and super flexible] zfs?
There is lots of debate if you search Google for comparisons, though they usually seem to end up favoring ZFS, which is why I'm a little perplexed here.
Notably, I also ran into a show-stopping kernel bug [0] with btrfs within a day or two of the first production rollout of ShipBuilder [1] (it's an open-source self-hosted PaaS Heroku clone). Since switching to zfs there have been zero file-system related issues with any ShipBuilder production or staging environment so far as I am aware.
Are you sure? I thought that ZFS for Linux can be distributed, but only in source code form. So you would have to distribute the source code, then build the ZFS module during or after installation.
As I understand it, the ZFS core (as necessary for GRUB to read the filesystem) is available under the GPL. While not enough on its own, this could have been a starting point.
However, the main issue is that ZFS is also patent-encumbered, so re-implementing it (even with a clean-room approach) may mean a meeting with Oracle's lawyers.
That's sad, considering btrfs still suffers from the internal fragmentation issue that Edward Shishkin demonstrated in 2010.
That's pretty much it, except that nobody big has actually tried doing that (distributing an install CD which specifically builds ZFS as part of the standard install process). So it has never been tested in court, it's not proven legally safe to do, and anyone larger than a small business is therefore unlikely to build on top of it.
Take a look at the video/demo of the Snapper toolset built on top of btrfs. Might give you a better feel for maturity level and the tooling that it enables: http://snapper.io/
I'm not, but I'm watching several of the Github repos and doing fresh EC2 installs and tests whenever they release a new version. I'm very much a fan of the project, but I would not use it in production at this point.
CoreOS is still alpha, as this release exemplifies: it requires a fresh install and isn't upgradeable from the previous release with their auto-update feature.
Also, the team is still working out where to put things on the filesystem, and how best to integrate etcd, networkd, systemd, etc., with their fleet system for a fully automated, auto-scaling, auto-discovering, self-healing clustering system. All that stuff has come a long way in recent weeks, and I'm very glad to see a distro geared towards both Docker and clustering.
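To make the fleet part concrete: fleet schedules ordinary systemd units across the cluster, with an extra [X-Fleet] section for placement constraints. A toy unit (the name and image here are made up for illustration) might look like:

```ini
[Unit]
Description=Hello World container
After=docker.service
Requires=docker.service

[Service]
ExecStart=/usr/bin/docker run --rm --name hello busybox /bin/sh -c "while true; do echo hello; sleep 1; done"
ExecStop=/usr/bin/docker stop hello

[X-Fleet]
Conflicts=hello@*.service
```

You submit it with `fleetctl start hello.service`, and the Conflicts line keeps two instances from landing on the same machine; if a machine dies, fleet reschedules the unit elsewhere.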
I want to but I'm waiting until they at least hit "beta".
"During our alpha period, Chaos Monkey (i.e. random reboots) is built in and will give you plenty of opportunities to test out systemd." <- https://coreos.com/docs/quickstart/
You may want to clarify in the docs that those steps will stop the Chaos Monkey. As I read it, I understood it to mean it will keep me on a specific version of CoreOS, not stop the Chaos Monkey. Am I misunderstanding?
The "random" reboots are the update process completing. If there is no update available, your machine will not reboot. Following those instructions will still let the update download, but it won't be applied until you choose.
Our release cadence through the alpha has been about once a week, so 1 "random" reboot a week.
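If memory serves, the reboot behavior can also be set declaratively at provisioning time in cloud-config rather than by poking at units after boot. Treat this as a sketch of the shape, not gospel:

```yaml
#cloud-config
coreos:
  update:
    # "off": download updates but never auto-reboot to apply them
    reboot-strategy: off
```

Other strategies exist (e.g. taking an etcd lock so only one machine in the cluster reboots at a time), which is the sane default for production clusters.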
I guess I might be weird, but I love playing with HA distributed-failover tech for its own sake. fleetd means never having to say "oops, it crashed."
I almost think it's worth it to switch the Chaos Monkey's toggle around: disabled in dev, but enabled in test and prod, to ensure you're using http://12factor.net/ principles in your Docker containers. (I think Heroku does something similar by spontaneously killing containers every once in a while.)