This is probably preaching to the choir, but hosting your own FOSS chat is nowadays a very viable way to avoid being dependent on a centralised service like Slack. Your options include:
* Riot.im / Matrix.org (decentralised global network; e2e encryption; open protocol)
* Rocket.Chat (Meteor-based; focus on UX and features)
* MatterMost.com (clone of Slack UI; open core license)
Uptime aside, you also have to consider the effect of a single point of failure when you self-host. If your in-house communication hub goes down when your site does, it's going to make firefighting that much worse and you'll pay for it in a longer outage.
Here at FB, a lot of day-to-day coordination takes place via FB products. But production and release engineering communication happens over IRC, especially during major outages. The fallback factor is critical to keeping the plane in the air.
People like to jump on one bandwagon or the other, but the real answer is: it depends.
With Slack, the application itself is probably pretty tough, but for a lot of businesses their infrastructure and connectivity TO Slack (i.e. internet/WAN) is probably not very resilient. So for a lot of smaller outfits I'd say that Slack is the better option.
But if you're a large org and your infrastructure is very resilient and diverse, then you're probably better off self-hosting - assuming you can leverage your existing infrastructure to do so.
The biggest benefit of Slack for me is their search. All messages are indexed, ready to be searched: code snippets, images, Giphy, attachments, bots. It’s a whole ecosystem, not easily replicable with IRC.
1. Self hosting doesn't have to operate at the scale of slack, so there's a whole slew of issues avoided. Pushing text messages around really isn't that difficult when you aren't serving millions of customers.
2. You can perform maintenance outside of office hours, with SaaS you don't get to decide when an upgrade (and potential outage) happens. I don't care about 99% uptime, I care about having 99% uptime while I'm working.
Such as? If you've got fewer than 1000 users then you only need an extremely basic server; a Raspberry Pi should more than suffice. Then you've just got a little bit of manual (or automated) administration, mostly software updates and backups.
I really didn't expect my post to be so controversial. Is the HN crowd really so terrified of running their own hardware?
I'm guessing that you're being downvoted because there's a lot more to consider. I agree that it doesn't take much hardware these days (most single-board computers would work perfectly well) to service <1k simultaneous chat users with efficient server-side software (e.g. UnrealIRCd or ejabberd). However, to make it as reliable as Slack (99.99% monthly uptime is their SLA) for the price they offer it ( https://www.slack.com/plans ) would likely take considerable engineering effort. Sure, you could set it up, toss it in a closet, and it might have 100% uptime for a year...until it doesn't. If chat is business-critical, there are chat companies that have profit motive to deliver a good service. If chat is a nice-to-have at a company (and you e.g. don't have to worry about data retention laws / compliance stuff), maybe it's fine to run it on an rPi / t2.micro (free) AWS instance.
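For what it's worth, a big chunk of the "until it doesn't" risk can be covered with even a crude external liveness probe. A minimal Python sketch, where the hostname, port, and alerting hook are all placeholders rather than anything specific to the setups above:

    #!/usr/bin/env python3
    """Crude liveness probe for a self-hosted chat server (IRC, XMPP, etc.)."""
    import socket
    import sys

    HOST = "chat.example.internal"  # placeholder hostname
    PORT = 6667                     # plaintext IRC; e.g. 5222 for XMPP client connections
    TIMEOUT = 5                     # seconds

    def is_up(host: str, port: int, timeout: float) -> bool:
        """Return True if a TCP connection to host:port succeeds within the timeout."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    if __name__ == "__main__":
        if not is_up(HOST, PORT, TIMEOUT):
            # Hook whatever alerting you like here: email, SMS, a different chat network...
            print(f"ALERT: {HOST}:{PORT} is unreachable", file=sys.stderr)
            sys.exit(1)
        print(f"OK: {HOST}:{PORT} is accepting connections")

Run it from cron on a box that isn't the Pi, otherwise the probe shares the single point of failure it's supposed to catch.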
Luckily, there are a ton of great free and paid options out there these days!
For $6670 a month (the price for 1000 users), I’m pretty sure most people here can spin up two VMs in two different colos and set up IRC servers or whatever.
99.99% uptime means it can be down for a few minutes a month, so all it needs to do is fail over properly. In practice, it will probably have many more than 4 9’s.
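To put numbers on "a few minutes a month", here's the quick downtime-budget arithmetic, assuming a 30-day month:

    # Downtime allowed per 30-day month at a few uptime targets.
    MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

    for uptime in (0.99, 0.999, 0.9999):
        allowed_down = MINUTES_PER_MONTH * (1 - uptime)
        print(f"{uptime:.2%} uptime -> {allowed_down:.1f} minutes of downtime allowed")

    # 99.00% uptime -> 432.0 minutes of downtime allowed
    # 99.90% uptime -> 43.2 minutes of downtime allowed
    # 99.99% uptime -> 4.3 minutes of downtime allowed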
I think the real reason Slack does well is ease of client + service setup, the brain-dead-simple UI, lots of feature creep that a few people care about, mobile clients, etc, etc.
I’m not a huge fan, but it could be worse. At least they didn’t leak everyone’s password like HipChat did.
In most of the world, that will buy you at least a decent mid-level developer, and in many places a great senior or two. Even if it's below market, if this were my pet OSS project I'd happily take a pay cut to get more job satisfaction.
Generally yes, given the reasons others have said. Other than that, at the very least, outages can be dealt with more proactively when you have your own setup. Third parties won't have the same priorities that your company does.
Since Slack's main business is chat, they have a pretty good incentive to get everything working again ASAP. Here's their SLA for the Plus and Enterprise plans:
"Our Plus plan Service Level Agreement (SLA) guarantees a 99.99% monthly uptime. We’ve designed our SLA to be simple and transparent, based directly on the information we make publicly available on Slack’s System Status page. If we fall short of our 99.99% uptime guarantee, we’ll refund customers on the Plus plan 100 times the amount your workspace paid during the period Slack was down."
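To make that refund clause concrete: using the ~$6670/month for 1000 users figure quoted elsewhere in this thread, and assuming "the amount your workspace paid during the period" is prorated per minute (my assumption, not Slack's published method), a half-hour outage works out roughly like this:

    # Hedged back-of-the-envelope for the 100x refund clause.
    MONTHLY_BILL = 6670.0             # USD for 1000 users (figure quoted in this thread)
    MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200

    downtime_minutes = 30             # hypothetical half-hour outage
    paid_during_outage = MONTHLY_BILL / MINUTES_PER_MONTH * downtime_minutes
    credit = 100 * paid_during_outage
    print(f"~${paid_during_outage:.2f} paid during the outage -> ~${credit:.2f} credit")
    # ~$4.63 paid during the outage -> ~$463.19 credit

So the 100x multiplier sounds generous, but for a short outage the credit is still small next to the monthly bill.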
Chat is a commodity these days. For most businesses, it probably makes more sense to just let the companies in the business of offering paid chat services do their thing.
Don't see why you were voted down on this, since it's true. Slack working to get things running again doesn't mean they're prioritising your company's particular instance or region. They're likely to be making sure their own region and their own stuff is up and fixed first, so anyone away from the east coast of America is likely to get seen to after that. It would be stupid to do it any other way, since Slack employees are likely affected as well and they're the ones trying to fix it. Downvoting someone pointing that out is pretty fanboy-esque or really naive.
Pretty much, if you don't own the service, you don't get to decide where in the queue you are for a fix.
I run an XMPP server for my friends. We use Conversations [0,1] on Android and BBOS, and Zom [2] on iOS. We use OMEMO [3] for encrypting most of our conversations, and while it isn't perfect, it usually stays out of the way.
Generally, the experience with the mobile clients has been quite good. Conversations and Zom are stable, attractive, and featureful. The biggest issues are some interoperability problems with desktop clients (displaying messages that should be hidden) and some things which I believe are server-side configuration issues.
Zom hides some useful configuration features (in the name of being dead simple to use), so I'm trying to convince one of my iPhone-owning friends to try ChatSecure [4].
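If anyone wants to poke at their own server from a script, sending a test message is only a few lines with the slixmpp library. The JID, password, and recipient below are placeholders, and this sends a plain unencrypted message, so OMEMO isn't involved:

    # Minimal sketch: send one message through your own XMPP server via slixmpp.
    from slixmpp import ClientXMPP

    class OneShotSender(ClientXMPP):
        def __init__(self, jid, password, recipient, body):
            super().__init__(jid, password)
            self.recipient = recipient
            self.body = body
            self.add_event_handler("session_start", self.on_start)

        async def on_start(self, event):
            self.send_presence()
            await self.get_roster()
            self.send_message(mto=self.recipient, mbody=self.body, mtype="chat")
            self.disconnect()

    if __name__ == "__main__":
        xmpp = OneShotSender("alice@chat.example.org", "not-a-real-password",
                             "bob@chat.example.org", "ping from the test script")
        xmpp.connect()
        xmpp.process(forever=False)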
I run my own Mattermost server and the mobile version looks very much like the Slack one (I only use chat, so I don't know whether Slack has functionality that Mattermost lacks).
Pretty sure they provide an IRC interface, but almost certainly don't use IRC internally. There's almost no way they could support any of their fancier features using IRC. Reactions etc would be horrible to implement.
Yeah and the IRC interface has been getting worse recently. When they added the shared channels across teams they completely broke being able to '@' a user from the IRC gateway. Support said something along the lines of 'yup and we're not planning on fixing it'.
I'm expecting them to completely turn off the IRC gateway in the next year or two.
I've switched to using TwistApp (https://www.twistapp.com) with my team. Unlike Slack where you have channels where everyone talks about everything, TwistApp bases conversations around threads. Every problem that's being worked on has its own thread. Once it's completed, I close and archive the threads. Very effective for getting things done as every task is isolated in a separate thread and discussions don't overlap.
On the other hand, makes it sound more likely to be a routing/reverse proxy issue instead of (say) a database issue. Those sound easier to deal with via a rollback vs something like "oops we dropped a critical index on the `messages` table".
Yeah, it alternates between loading a page telling me everything is fine and just giving me an nginx 500 error. Seems like the status page should be hosted differently so it can stay up even when other things are having issues.
It returned for me when I was looking, but took ~30 seconds to do so. They should host their status page on a separate domain and put something like CloudFlare in front of it to help with sudden spikes in traffic. Another alternative is to use Twitter / Facebook as the status page and let them deal with the traffic spikes, or just serve static HTML.
I'm hoping they publish a public post-mortem. Learning from this kind of outage is some of the best experience an engineering team can get - though it's far better when only staging goes down and not prod.
Then why did the Slack Status page have so many problems at the same time? Half the time loading it would give a 500 Internal Server Error, 45% of the time you'd get broken resources (images and/or CSS), and only 5% of loads would give you the full working page.
Maybe because it's under a lot more load during an outage and they haven't upsized the status page infrastructure to handle their ever increasing user base.
And of course today's the first day we're using Slack for audience Q&A at a conference. 360 folks in a room now have to...raise their hands! So barbaric.
Slack is currently down and I've realized, for better or worse, what Slack has really done: it's created an expectation of immediacy. I thought about sending my question to someone via email but then just thought, "I'll wait for Slack to be back up, it'll be faster anyhow".
If slack being down means you lose all insight into your build process and code management, you seriously need to introduce a secondary option immediately.
OP didn't say "all insight". There's a difference between being unable to see a stream of change events and not being able to see the current state of the system. The latter is completely unacceptable, whereas the former is just annoying.
I'm sure they have fallbacks, but when their ecosystem (apparently) evolved around Slack, the fallbacks are less effective. Polling Jenkins to see when your job is done is more time consuming than receiving a Slack message.
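For anyone improvising a fallback today: polling Jenkins directly is clunky but only takes a few lines against its JSON API. The server URL, job name, and credentials below are placeholders:

    # Quick-and-dirty fallback while the chat notification bot is down:
    # poll a Jenkins job's last build until it finishes.
    import time
    import requests

    JENKINS = "https://jenkins.example.internal"   # placeholder server URL
    JOB = "my-app-deploy"                          # placeholder job name
    AUTH = ("ci-user", "api-token")                # Jenkins username + API token

    def last_build():
        url = f"{JENKINS}/job/{JOB}/lastBuild/api/json"
        data = requests.get(url, auth=AUTH, timeout=10).json()
        return data["building"], data.get("result")  # result is None while still building

    if __name__ == "__main__":
        building, result = last_build()
        while building:
            time.sleep(30)
            building, result = last_build()
        print(f"{JOB} finished: {result}")  # e.g. SUCCESS, FAILURE, ABORTED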
Reminder of why your status page should be hosted in a very different way from your regular infrastructure... so you're much less likely to end up with issues on both at the same time.
I like how statuspage.io even has metastatuspage.com in case their primary domain/DNS/TLD has issues.
Reminder that you should check things first before commenting. Slack's status page is on different infrastructure; it appears to be hosted on DigitalOcean while slack.com uses AWS.
Just a simple typo. The keys are pretty close together if your finger slips, and I imagine they have enough problems distracting them from proper spellchecking at the moment. :-)
Looks like it. When I first loaded HN 2 minutes after this outage started, the story was #2. Then I refreshed after 3 minutes and it wasn't on the front page at all. Used the search tool to find it and then upvoted.
If you see something like this and you think it's in error, you can let them know and they'll likely be able to respond more quickly. There's a contact link in the footer.
No, I'm not. In my experience the mods are quite responsive, and have explained site behavior on more than a few occasions. They've also adjusted flags and weights of submissions if they identify an issue.
Slack has a nice market share, but also many competitors, many of them near-100% ripoffs with the same features (to name a few, Atlassian HipChat and MS Teams... not to mention open source products).
Slack has been experiencing service degradation often lately, so I would not be surprised if people start switching.
In our team we already started looking for an alternative.
* Riot.im / Matrix.org (decentralised global network; e2e encryption; open protocol)
* Rocket.Chat (Meteor-based; focus on UX and features)
* MatterMost.com (clone of Slack UI; open core license)
* Zulip.org (all about threads!)
* ...or indeed IRC or XMPP.
(Disclaimer: I work on Matrix.)