
pixelHD · on Sept 18, 2018

Just finished publishing my blog using gatsby 1.8! Congratulations and great job!

pixelHD · on Aug 27, 2018

Channels, permissions and DMs are more powerful in discord.

But there's no separation of workspaces like in Slack. Instead, Discord has servers. You log into Discord with one email, and all the servers associated with it are shown. So unlike Slack, you can't have different servers each with a different email ID to log in with.

dvtrn · on Aug 27, 2018

Discord does however let you very trivially change your 'nick' for each server you're in. I'll take this over having multiple emails for servers.

https://support.discordapp.com/hc/en-us/articles/219070107-S...

_9jgl · on Aug 27, 2018

Although you can change your nick, you can't have a per-server profile picture, which may or may not be a problem for business use.

samstave · on Aug 27, 2018

How does this work in light of how some people, including myself, use slacks:

Work slack

N Side-project-team slacks

Personal Slack

---

I can't use my work-Slack email address to log in to my personal Slack, which is just for me. Further, I need no connections between my various side-project/side-interest Slacks.

I also participate in Discords, but not actively enough to be as fluent in using Discord...

pixelHD · on Aug 16, 2018

Amit's blogs get posted quite a bit, but they're a pleasure to go through every time. My favorites are his posts about noise and map/terrain generation with noise -

https://www.redblobgames.com/articles/noise/introduction.htm...

https://www.redblobgames.com/maps/terrain-from-noise/

amitp · on Aug 17, 2018

Thank you! I'm working on a new map project using some of those techniques.

benrbray · on Aug 16, 2018

He updates them every so often too! I hadn't seen them in a few years and I was impressed by all the new visualizations when I checked recently!

pixelHD · on July 21, 2018

Is there a guide to help set up a self-hosted version of this app? I'd assume there would be a server which does the actual parsing and syncing, along with a web app which renders the result onto a web page?

I absolutely love the idea by the way, can't wait to try it!

pixelHD · on July 12, 2018

Aren't they supposed to be the same GPUs as before, but with a little overclock? Hence the _x_ branding in the name.

pixelHD · on July 9, 2018

With deepfakes this was being discussed. When a group of redditors managed to create a network architecture to put celebs in porn with surprisingly decent results (in some cases), I think a sponsored research group can get better results.

- https://www.brookings.edu/blog/order-from-chaos/2018/05/25/t...

- https://www.msnbc.com/hallie-jackson/watch/fake-obama-warnin...

- https://techcrunch.com/2018/06/04/forget-deepfakes-deep-vide...

pixelHD · on July 5, 2018

Wow! I had to implement a small 3 layer network in C for a class, and it was interesting. It was a ton of work though, but I learnt a lot.

Will go through the source code for kicks. Thanks!

pixelHD · on May 17, 2018

True! The fact that you'd be able to use pretty much any GPU - even integrated/AMD/NVidia would really help.

Speaking of which, doesn't the new Vulkan API include compute too?

grovesNL · on May 17, 2018

Yes, Vulkan includes compute capabilities. For example, here is a relevant section about compute pipelines in the Vulkan specification: https://www.khronos.org/registry/vulkan/specs/1.0/html/vkspe...

pixelHD · on May 17, 2018

Wow, would be nice if this was integrated into the OpenAI environments

mooneater · on May 17, 2018

If you look at the GitHub README, it is OpenAI Gym compatible.

pixelHD · on May 15, 2018

I'm glad we're again concentrating on newer language models.

Curious how it'll perform compared to fasttext when used as encoding network in larger tasks. I can't help but notice the trend of going back to simpler models with smarter optimizations and regularization to achieve better results.

This is a frequent question of mine, which I ask everyone using RNNs - what do you think of the idea that CNNs will be able to replace RNNs for sequence tasks [0]? CNNs are less computationally expensive too, so there's a definite benefit to switching to them if the performance is on par.

[0]: https://twitter.com/lmthang/status/989261575482560513

jph00 · on May 15, 2018

fasttext is just an encoding of the first layer of a model (the word embeddings - or subword embeddings). Full multi-layer pre-trained models are able to do a lot more. For instance, on IMDb sentiment our method is about twice as accurate as fasttext.

As to whether CNNs can replace RNNs in general, the jury is still out. Over the last couple of years there have been some sequence tasks where CNNs are state of the art, some where RNNs are. Note that with stuff like QRNNs the assumption that CNNs are less computationally expensive is no longer necessarily true: https://github.com/salesforce/pytorch-qrnn

I'd be surprised if CNNs win out in the end for tasks that require long-term state (like sentiment analysis on large docs), since RNNs are specifically designed to be stateful - especially with the addition of an attention layer.
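One way to make the "long-term state" point concrete, as a rough sketch and not something taken from the paper: a stack of stride-1 1D convolutions only sees a fixed window of tokens, whereas a recurrent state can in principle carry information forward from any earlier step. The helper name and the layer settings below are made up for illustration.

```python
def conv_receptive_field(kernel_size, num_layers, dilations=None):
    """Number of input positions visible to one output of a stack of
    stride-1 1D convolutions (hypothetical helper for illustration)."""
    if dilations is None:
        dilations = [1] * num_layers  # plain, undilated convolutions
    if len(dilations) != num_layers:
        raise ValueError("need one dilation value per layer")
    # Each layer widens the window by (kernel_size - 1) * dilation.
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Four plain layers with kernel size 3 see a short window of tokens.
print(conv_receptive_field(3, 4))

# Doubling the dilation per layer widens the window much faster,
# but it is still a fixed size chosen in advance.
print(conv_receptive_field(3, 4, dilations=[1, 2, 4, 8]))
```

For a document with thousands of tokens, either the stack has to be deep or dilated enough to cover it, or something else (pooling, attention) has to bridge the gap.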

Radim · on May 15, 2018

> For instance, on IMDb sentiment our method is about twice as accurate as fasttext.

Seeing as fasttext accuracy is 90%+, does this mean your method achieves 180%?

I'm nitpicking of course, but lately I've seen claims like "20% improvement in accuracy", where on closer inspection, the authors mean error rate dropped from 5% to 4%.

Which is not bad of course, but in the grand scheme of things, 1% absolute improvement may not be such a game-changer, especially if it comes at the cost of other relevant metrics like model complexity, developer sanity or performance.

(haven't read your paper yet, just a general sigh/rant)

PeterisP · on May 15, 2018

This generally is the metric you care about - a difference of one percentage point can be an improvement of twenty percent, as that means that the total number of "bad events" that you expect to get when running the system is decreased by 20%. And it's quite reasonable to assume that here, as in almost all other domains, "x% improvement" means the percentage difference (multiplicative), not the percentage point difference (subtractive). For pretty much every percentage quantity, things like defect ratios, recidivism rates or financial interest rates, "20% increase" never means an increase of 20 percentage points but an increase by 20 percent of the starting value. If we're nitpicking, "1% absolute improvement" is an inaccurate statement, the improvement should be described as 1pp (or 20%), not 1%.

Especially for more well-defined problems, going from 98.5% to 99.5% is "just" a 1pp absolute improvement, but the fact that you have a third as many mistakes may well justify a more complex model that requires ten times more hardware. The metric that you'd actually care about would often be something like "number of hours required to correct the mistakes" or "number of lost sales due to mistakes", which would all be modified by the relative percentage change.
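A quick sketch of the arithmetic in this comment, using the two examples from the thread (error 5% to 4%, and accuracy 98.5% to 99.5%). The function name is illustrative only.

```python
def compare_error_rates(old_error, new_error):
    """Return (absolute drop in percentage points, relative drop in percent).

    Both arguments are error rates as fractions, e.g. 0.05 for 5%.
    """
    absolute_pp = (old_error - new_error) * 100
    relative_pct = (old_error - new_error) / old_error * 100
    return absolute_pp, relative_pct


# Error rate dropping from 5% to 4%.
abs_pp, rel_pct = compare_error_rates(0.05, 0.04)
print(f"{abs_pp:.1f}pp absolute, {rel_pct:.1f}% relative")

# Accuracy 98.5% -> 99.5% means error 1.5% -> 0.5%.
abs_pp, rel_pct = compare_error_rates(0.015, 0.005)
print(f"{abs_pp:.1f}pp absolute, {rel_pct:.1f}% relative")
```

Both cases are the same 1pp absolute change, but the relative change differs a lot: a fifth of the errors disappear in the first case and two thirds in the second.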

Radim · on May 16, 2018

Yes, that's what I was getting at.

Your note on "more well defined problems" is spot on. Chasing single percent improvements and SOTA is indeed the name of the game there.

But defining the problem in the first place, figuring out the cost matrix and solution constraints, is typically the bigger challenge in highly innovative projects. Once you know what to chase, 80% of the job is done.

Disclosure: building commercial ML systems for the past 11 years, using deep learning and otherwise. What you call "metric you care about" is often not the metric you care about. This is why people coming from academia are sometimes taken by surprise that logistic regression, linear models, or heck, even rule-based systems (!) are still so popular. Model simplicity, developer sanity and performance do matter, too.

jph00 · on May 15, 2018

> but in the grand scheme of things, 1% absolute improvement may not be such a game-changer, especially if it comes at the cost of other relevant metrics like model complexity, developer sanity or performance

fasttext makes errors about 10% of the time, and our approach makes errors about 5% of the time. It's certainly fair to say (although nitpicky) that "accuracy" isn't quite the right term here (I should have said "half the error").

But as for your general sigh/rant... absolute improvement is very rarely the interesting measure. Relative improvement tells you how much your existing systems will change. So if your error goes from 5% to 4% then you have 20% fewer errors to deal with than you used to.

An interesting example: the Kaggle Carvana segmentation competition had a lot of competitors complaining that the simple baseline models were so accurate that the competition was pointless (it was very easy to get 99% accuracy). The competition administrator explained however that the purpose of the segmentation model was to do automatic image pasting into new backgrounds, where every mis-classified pixel would lead to image problems (and in a million+ pixels, that's a low error rate!)
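To put rough numbers on that, here is a small sketch. It assumes an image of exactly one million pixels (the comment only says "a million+") and a hypothetical helper name.

```python
def misclassified_pixels(total_pixels, accuracy):
    """Approximate count of wrongly labelled pixels at a given pixel accuracy."""
    return round(total_pixels * (1 - accuracy))


# Assumed image size: one million pixels.
for accuracy in (0.99, 0.995, 0.999):
    wrong = misclassified_pixels(1_000_000, accuracy)
    print(f"{accuracy:.1%} accuracy leaves about {wrong} wrong pixels")
```

Even at 99% accuracy, roughly ten thousand pixels per image are wrong under this assumption, which is plenty to leave visible artifacts when a cut-out is pasted onto a new background.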

cs702 · on May 16, 2018

Radim: please consider incorporating this into gensim. It really is superior to simpler classification models running on top of word/BPE/wordpiece embeddings and to classic machine learning algorithms used for text classification and topic modeling like HDP, LDA, LSI/LSA, etc. (You can see for yourself how well this works out-of-the-box with a simple exercise: grab a pretrained model from fast.ai, run a bunch of documents through it, grabbing and saving each time the last hidden-layer representation of each document, and then map these representations to a two-dimensional plot with, say, t-SNE.)

I realize that outside of Silicon Valley and other technology centers, most established companies are far -- far -- from adopting deep learning for any application of importance, due partly to the current unavailability of developers with AI expertise, and partly to deep learning's so-called "unexplainability" (i.e., the inability of many corporate executives and machine learning practitioners to reason about it, and their resulting discomfort with it). But it's only a matter of time before Corporate America starts following the lead of companies like Google and Facebook, which today are aggressively using state-of-the-art AI in lots of important applications.

Why not get ahead of this multi-decade trend?

PS. For those who don't know, Radim is the creator of gensim, a popular, friendly Python library for text classification and topic modeling.[a]

[a] https://radimrehurek.com/gensim | https://github.com/RaRe-Technologies/gensim

taneq · on May 16, 2018

Generally, % change in failure rate is what you care about in most fields. E.g., if something "increases your chance of getting cancer by 50%", that doesn't mean it increases the risk to 1 in 2. It just means the risk goes from, say, 1% to 1.5%.

pixelHD · on May 15, 2018

Just remembered I already asked you about the CNNs, my bad!

Oh wow, didn't realize that these were multi-layer pre-trained models.

Also, started going through the QRNNs. They mention they've updated the AWD-LSTM language model to use QRNNs, which is what your paper uses!

antirez · on May 16, 2018

Relevant to RNNs vs CNNs -> https://openreview.net/pdf?id=rk8wKk-R-