Channels, permissions and DMs are more powerful in Discord.
But there's no separation of workspaces like in Slack. Rather, Discord has servers. You log into Discord with one email, and all the servers associated with it are shown. So unlike Slack, you can't have different servers, each with a different email address to log in with.
How does this work in light of how some people, including myself, use Slack:
Work Slack
N side-project-team Slacks
Personal Slack
---
I can't use my work-Slack email address to log in to my personal Slack, which is just for me. Further, I need no connections between my various side-project/side-interest Slacks.
I also participate in Discords, but not actively enough to be as fluent in using Discord...
Amit's blogs get posted quite a bit, but they're a pleasure to go through every time. My favorites are his posts about noise and map/terrain generation with noise -
Is there a guide to help set up a self-hosted version of this app? I'd assume there would be a server which does the actual parsing and syncing, along with a web app which is used to render onto a web page?
I absolutely love the idea by the way, can't wait to try it!
With deepfakes this was being discussed. When a group of redditors managed to create a network architecture to put celebs in porn with surprisingly decent results (in some cases), I think a sponsored research group can get better results.
I'm glad we're again concentrating on newer language models.
Curious how it'll perform compared to fasttext when used as encoding network in larger tasks. I can't help but notice the trend of going back to simpler models with smarter optimizations and regularization to achieve better results.
This is a frequent question of mine, which I ask to everyone using RNNs - what do you think of the idea that CNNs will be able to replace RNNs for sequence tasks [0]? CNNs are less computationally expensive too, so there's a definite benefit of switching to them if the performance is on par.
fasttext is just an encoding of the first layer of a model (the word embeddings - or subword embeddings). Full multi-layer pre-trained models are able to do a lot more. For instance, on IMDb sentiment our method is about twice as accurate as fasttext.
As to whether CNNs can replace RNNs in general, the jury is still out. Over the last couple of years there have been some sequence tasks where CNNs are state of the art, some where RNNs are. Note that with stuff like QRNNs the assumption that CNNs are less computationally expensive is no longer necessarily true: https://github.com/salesforce/pytorch-qrnn
I'd be surprised if CNNs win out in the end for tasks that require long-term state (like sentiment analysis on large docs), since RNNs are specifically designed to be stateful - especially with the addition of an attention layer.
> For instance, on IMDb sentiment our method is about twice as accurate as fasttext.
Seeing as fasttext accuracy is 90%+, does this mean your method achieves 180%?
I'm nitpicking of course, but lately I've seen claims like "20% improvement in accuracy", where on closer inspection, the authors mean error rate dropped from 5% to 4%.
Which is not bad of course, but in the grand scheme of things, 1% absolute improvement may not be such a game-changer, especially if it comes at the cost of other relevant metrics like model complexity, developer sanity or performance.
(haven't read your paper yet, just a general sigh/rant)
This generally is the metric you care about - a difference of one percentage point can be an improvement of twenty percent, as that means that the total number of "bad events" that you expect to get when running the system is decreased by 20%. And it's quite reasonable to assume that here, as in almost all other domains, "x% improvement" means the percentage difference (multiplicative), not the percentage point difference (subtractive). For pretty much every percentage quantity, things like defect ratios, recidivism rates or financial interest rates, "20% increase" never means an increase of 20 percentage points but an increase by 20 percent of the starting value. If we're nitpicking, "1% absolute improvement" is an inaccurate statement; the improvement should be described as 1pp (or 20%), not 1%.
Especially for more well defined problems, going from 98.5% to 99.5% is "just" 1pp absolute improvement, but the fact that you make only a third as many mistakes can well justify a more complex model that requires ten times more hardware. The metric that you'd actually care about would often be something like "number of hours required to correct the mistakes" or "number of lost sales due to mistakes", all of which would get modified by the relative percentage change.
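To make the arithmetic concrete, here is a minimal Python sketch of the two ways of measuring the same change. The function names are my own and purely illustrative; the inputs are the accuracy figures mentioned in this thread (95% to 96%, and 98.5% to 99.5%).

```python
def percentage_point_gain(acc_before, acc_after):
    """Absolute (subtractive) change in accuracy, in percentage points."""
    return (acc_after - acc_before) * 100.0


def relative_error_reduction(acc_before, acc_after):
    """Relative (multiplicative) drop in error rate, as a fraction of the old error."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


# Accuracy pairs taken from the discussion above.
for before, after in [(0.95, 0.96), (0.985, 0.995)]:
    print(
        f"{before:.1%} -> {after:.1%}: "
        f"{percentage_point_gain(before, after):.1f} pp gained, "
        f"{relative_error_reduction(before, after):.0%} fewer errors"
    )
```

Both cases are a gain of one percentage point, but the share of errors removed differs a lot (about a fifth in the first case, about two thirds in the second), which is why the relative number is usually the one tied to downstream cost.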
Your note on "more well defined problems" is spot on. Chasing single percent improvements and SOTA is indeed the name of the game there.
But defining the problem in the first place, figuring out the cost matrix and solution constraints, is typically the bigger challenge in highly innovative projects. Once you know what to chase, 80% of the job is done.
Disclosure: building commercial ML systems for the past 11 years, using deep learning and otherwise. What you call "metric you care about" is often not the metric you care about. This is why people coming from academia are sometimes taken by surprise that logistic regression, linear models, or heck, even rule-based systems (!) are still so popular. Model simplicity, developer sanity and performance do matter, too.
> but in the grand scheme of things, 1% absolute improvement may not be such a game-changer, especially if it comes at the cost of other relevant metrics like model complexity, developer sanity or performance
fasttext makes errors about 10% of the time, and our approach makes errors about 5% of the time. It's certainly fair to say (although nitpicky) that "accuracy" isn't quite the right term here (I should have said "half the error").
But as for your general sigh/rant... absolute improvement is very rarely the interesting measure. Relative improvement tells you how much your existing systems will change. So if your error goes from 5% to 4% then you have 20% fewer errors to deal with than you used to.
An interesting example: the Kaggle Carvana segmentation competition had a lot of competitors complaining that the simple baseline models were so accurate that the competition was pointless (it was very easy to get 99% accuracy). The competition administrator explained however that the purpose of the segmentation model was to do automatic image pasting into new backgrounds, where every mis-classified pixel would lead to image problems (and in a million+ pixels, that's a low error rate!)
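As a rough illustration of why 99% was not "good enough" there, here is a small Python sketch that converts per-pixel accuracy into a count of wrong pixels. The image size of exactly 1,000,000 pixels and the 99.9% comparison figure are my assumptions for the example, not numbers from the competition; the function name is hypothetical.

```python
def misclassified_pixels(num_pixels, pixel_accuracy):
    """Expected number of wrongly labelled pixels in one image."""
    return round(num_pixels * (1.0 - pixel_accuracy))


# Assumed image size: one million pixels (illustrative only).
NUM_PIXELS = 1_000_000

for accuracy in (0.99, 0.999):
    wrong = misclassified_pixels(NUM_PIXELS, accuracy)
    print(f"{accuracy:.1%} pixel accuracy -> {wrong} wrong pixels per image")
```

Even at 99% accuracy that is on the order of ten thousand bad pixels per image, so each extra fraction of a percent removes a visible amount of damage along the cut-out edge.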
Radim: please consider incorporating this into gensim. It really is superior to simpler classification models running on top of word/BPE/wordpiece embeddings and to classic machine learning algorithms used for text classification and topic modeling like HDP, LDA, LSI/LSA, etc. (You can see for yourself how well this works out-of-the-box with a simple exercise: grab a pretrained model from fast.ai, run a bunch of documents through it, grabbing and saving each time the last hidden-layer representation of each document, and then map these representations to a two-dimensional plot with, say, t-SNE.)
I realize that outside of Silicon Valley and other technology centers, most established companies are far -- far -- from adopting deep learning for any application of importance, due partly to the current unavailability of developers with AI expertise, and partly to deep learning's so-called "unexplainability" (i.e., the inability of many corporate executives and machine learning practitioners to reason about it, and their resulting discomfort with it). But it's only a matter of time before Corporate America starts following the lead of companies like Google and Facebook, which today are aggressively using state-of-the-art AI in lots of important applications.
Why not get ahead of this multi-decade trend?
PS. For those who don't know, Radim is the creator of gensim, a popular, friendly Python library for text classification and topic modeling.[a]
Generally, % change in failure rate is what you care about in most fields. E.g. if something "increases your chance of getting cancer by 50%", that doesn't mean it increases the risk to 1 in 2. It just means the risk goes from, say, 1% to 1.5%.