I think that's true of almost all tech books in paper form.
Personally, I think these work great because I can add cells to inspect data or try experiments easily as I'm reading, which helps me understand what's going on.
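For example, a throwaway cell like this (my own sketch with made-up tensor shapes, not code from the book) is usually all it takes to check what a batch actually looks like before reading on:

    import torch

    # Stand-in for a batch from one of the book's examples (shapes are hypothetical)
    xb = torch.randn(64, 3, 224, 224)   # images: batch x channels x height x width
    yb = torch.randint(0, 10, (64,))    # integer class labels

    # Quick sanity checks: shapes, value statistics, and a peek at the labels
    print(xb.shape, xb.mean().item(), xb.std().item())
    print(yb[:8])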
These sorts of books have existed since the dawn of notebooks. While I think them useful for demonstrations, I don't think there is much pedagogical value beyond what a more static medium offers, in all honesty. Well-worked examples a student can consult while attempting to solve problems on their own are still of paramount importance.
I recommend that everybody who hasn't checked those out do so.
Although I'm not really interested in ML, I do all the most popular available courses to keep up, and I really liked how fastai doesn't just teach you ready-made, well-known models, but also how to compose differentiable building blocks to design NNs yourself.
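As a rough illustration of what that composing looks like in plain PyTorch (a minimal sketch of my own, not code taken from the course or the book), you stack small differentiable modules and let autograd handle the gradients:

    import torch
    from torch import nn

    class TinyBlock(nn.Module):
        """A small differentiable building block: linear layer followed by a nonlinearity."""
        def __init__(self, n_in, n_out):
            super().__init__()
            self.lin = nn.Linear(n_in, n_out)
            self.act = nn.ReLU()

        def forward(self, x):
            return self.act(self.lin(x))

    # Compose blocks into a network; backprop flows through the whole composition.
    model = nn.Sequential(TinyBlock(784, 128), TinyBlock(128, 64), nn.Linear(64, 10))

    x = torch.randn(32, 784)              # a fake batch of flattened 28x28 inputs
    loss = model(x).pow(2).mean()         # dummy loss, just to drive backward()
    loss.backward()
    print(model[0].lin.weight.grad.shape) # every composed block received gradients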
This is not intended to minimize in the slightest the amazing work that Jeremy does; I am a huge fan.
But Fast.ai has TWO co-founders, and somehow Rachel doesn't seem to get any credit in these discussions (not the book specifically; I'm talking about the overall enterprise). Not quite sure why; a lot of the content on the website is written by her, and it's clear she adds a lot of value to the endeavor as a whole.
Thank you for mentioning Rachel! :) She is working as the Founding Director of the Center for Applied Data Ethics nowadays, which is a very full-time job. So she hasn't been involved much in fastai v2 or the book (other than chapter 3, of which she's a co-author).
She created and taught the NLP and Computational Linear Algebra courses, and has written most of the material on the fast.ai blog, and of course (as noted) co-founded fast.ai. Overall, I'd agree that she doesn't get as much credit as she deserves. That's perhaps partly due to her increasing focus on ethics issues, which aren't generally discussed much on HN (sadly).
I also would say that Sylvain Gugger doesn't get as much credit as he should -- he has been an equal partner with me in creating the book and fastai library.
(I discussed this response with Rachel prior to posting it.)
Another smart move from fast.ai: this book is going to be the state-of-the-art reference for 2020 and a classic anthology of algorithmic techniques for the medium term.
> There are also "agglutinative languages", like Polish, which can add many morphemes together to create very long "words" which include a lot of separate pieces of information. [1]
Polish does not work this way. Source: I am Polish. Perhaps jph00 meant Turkish. Issue filed.
Yes you're right, in our NLP course we used Turkish as our example.
But for the book I mentioned Polish due to this paper: https://arxiv.org/abs/1810.10222 . As you say, though, the word "agglutinative" isn't technically correct. I'm actually not sure what the right word is to describe languages that have lots of big compounds with no spaces (which is the key issue here, and why we need subword tokenization techniques).
Polish would be a subtype of synthetic languages called fusional/inflected, which means things need to be adjusted to fit together; agglutinative languages are those that mainly use agglutination, where morphemes are stuck together as-is:
Since it's a spectrum / a categorization based on features, all languages will show these features to various degrees. E.g. the famous "anti|dis|establish|ment|ari|an|ism" in English and "anty|samo|u|bez|przedmiot|owia|nie" as a similar example in Polish (both from https://pl.wikipedia.org/wiki/Aglutynacyjno%C5%9B%C4%87 ), or the more humble "houseboat" or "bitwise".
There are also polysynthetic languages, which is the name for the extreme end of this spectrum, but there are no familiar examples of these (Mayan languages, Ainu, Inuit, and Aleut are the only ones I recognize from those mentioned on Wikipedia).
The term you are looking for may be "highly inflected".
Side note: IMHO, you are exaggerating the ability of Polish to form long compounds. Dissecting the "Bezbarwne zielone idee wściekle śpią" example from https://arxiv.org/pdf/1810.10222.pdf#page=3 reveals no words longer than 4 morphemes:
bez-BARW-n-e ZIEL-on-e IDE-e WŚCIEK-l-e ŚP-ią, where I put word roots in uppercase and bound morphemes in lowercase.
The longest sequences of morphemes (for a loose definition of morpheme) I can think of are conditional-mood forms of verbs with double prefixes, like po-wy-CHODZI-ł-y-by-ście. However, the sequences of bound morphemes in those forms, which may look complex to you, form a finite-state language that admits just a few sequences.
It's not about the number of letters in the compounds, but about the number of morphemes.
Your "powychodziłybyście" example could be translated as "you (feminine, plural) would have been going out". With the word tokenization, you get (ignoring comma and brackets) 8 tokens in English and one token in Polish. Now you can have three persons, two genders, two numbers, an imperfective or perfective verb, etc. resulting in combinatorial growth of word tokens in Polish. If you have all word forms for "go out" and you want to add "go in", in English you would add a single token "in", and in Polish you add all the tokens with "-wy-" replaced by "-w-". As a result in Polish you end up with much bigger vocabulary. Additionally you need bigger training corpus as you cannot learn the tokens independently. For example, if you know the meaning of "he ate" and "she wrote", you should be able to guess the meaning of "he wrote", as you've seen all of the tokens. In Polish it's "Zjadł", "Napisała" and "Napisał" - all of the word tokens are different.
Using subword tokenization instead of word-level tokenization is kind of similar to using a normalized database instead of an unnormalized one. It's not about one form being more complex than the other, as they're equivalent. After all, would written English be much more complex if we removed all whitespace? :)
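To make the vocabulary point concrete, here is a tiny toy sketch of mine (the Polish forms and the hand-made subword split are only illustrative; in practice the pieces would be learned by something like BPE or SentencePiece):

    # Word-level tokens: adding "go in" on top of "go out" costs English one new
    # token ("in"), but costs Polish a whole new set of word forms.
    english_out = {"he", "she", "they", "went", "out"}
    english_in  = english_out | {"in"}                        # +1 token

    polish_out = {"wyszedł", "wyszła", "wyszli"}              # he/she/they went out
    polish_in  = polish_out | {"wszedł", "weszła", "weszli"}  # +3 tokens, nothing shared

    print(len(english_in) - len(english_out),                 # 1
          len(polish_in) - len(polish_out))                   # 3

    # Subword tokenization factors the Polish forms into reusable pieces
    # (split by hand here just to show the idea):
    pieces = {"wyszedł": ("wy", "szedł"), "wszedł": ("w", "szedł"),
              "wyszła":  ("wy", "szła"),  "weszła": ("we", "szła"),
              "wyszli":  ("wy", "szli"),  "weszli": ("we", "szli")}
    shared = {p for parts in pieces.values() for p in parts}
    print(sorted(shared))  # stems are shared again; only the short prefixes differ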
I agree with what you wrote. I did not object to the subword tokenization that let you(?) win the competition. I objected to the GP's assertion that one can add many morphemes together to create very long "words" in Polish, which makes casual readers think of stringing morphemes together like German compounds, while the number of morphemes in Polish words is bounded by 7, maybe 8.
Not really. "German grammar allows for the construction of long compounded noun phrases which are expressed as one word in written language. Compounding is not really the same as agglutination.": https://www.quora.com/Is-German-considered-a-true-agglutinat...
It always amazes me how bad some technical people are at basic promotion:
What is Fastai?
Why do I need it?
Something as basic as an elevator pitch that introduces your product on your GitHub page and in the book intro can mean 10x or 100x more sales.
If you force people to go searching for what it is themselves, you have already lost most of them.
This author writes as if you already knew everything about Fastai, but if you did, you would not need this book in the first place.
This happens a lot with technical writers: they have spent years thinking about a topic, so they cannot put themselves in the shoes of someone who hasn't.
It's sad to see this perspective. With the AI hype, there are so many people spending all their time and money on marketing AI materials while their actual product is all smoke and mirrors.
By contrast, Jeremy and team have proven that "build it and they will come" is not dead. They built high quality courses and quickly became authoritative with no marketing and with full transparency and openness in everything they do.
This book draft looks great. Everyone else is talking about "democratising AI" - this is actually doing it.
We're building and improving our internal machine learning platform, and we recently decided to support the fast.ai courses. You get everything you need (notebooks, object storage, data, parameter and metrics tracking, and deployment). Our colleague teaches at a university, and we're opening the internal platform to about 30 of her students this week to prepare their final master's projects.
They don't have access to compute power (GPUs) or the bandwidth to download datasets of hundreds of gigabytes, which they'll find right there, so this should help them: they don't need powerful machines or to worry about experiment tracking.
We also have a Publish option that turns a notebook into an application in one click, with a form for the training parameters generated automatically behind the scenes, so they can write scripts and instrument model training.
The fast.ai course will also help current or future members, and other students. It's important for us to make it even easier for people to enter the field.
You are being downvoted by fanboys, but you are exactly right. I am surrounded by researchers working in DL, and I have to say at least 40% of them have never heard of FastAI or Jeremy Howard. However, folks who are active on Twitter, listening to popular podcasts, popular media, HN, etc. would be very familiar with the name Jeremy Howard and what FastAI is, and need no introduction. In the research world, an astonishing number of good researchers have little to no online presence. They have little to no time for anything other than keeping track of research papers in their sub-field. It also surprises me when authors sweat for months to churn out hundreds of polished pages but couldn't spend 15 minutes to write a paragraph of introduction in the README.
I guess it depends a bit on which field exactly they work in. I'd be rather surprised if rigorous DL researchers in NLP hadn't heard of him, because I expect "Universal language model fine-tuning for text classification" (and tbh also "Fine-tuned language models for text classification", due to the universality of the idea) to show up in any half-decent literature review of the field.
Most DL researchers I know also have a pretty good knowledge of available libraries and make it a habit to check them pretty often.
Fast.ai really democratizes bleeding-edge research for the masses, though; that's why it's popular among outcasts and outsiders. In general, I would be more wary of people working within closed environments and organizations than of people making everything they do public and open to review.
fastai is popular among practitioners as well as many researchers, rather than just outsiders or outcasts. I've personally learned a lot from it, and it is an amazing contribution. However, there is still a large population that is unaware of it, and it would be great to have a quick intro paragraph in the readme so they know what all the fuss is about.
Few people here need to look it up; the basic promotion has already been done very effectively for the target audience. Besides, the intro chapter explains very clearly what the book is about and who it's for.
This is a draft. I expect that, when the first finished version is ready, the authors will promote it effectively (IMO they are very good at promoting their courses at fast.ai).
Wow, O'Reilly lawyers are determined to screw this up. The thing is GPLv3 licensed, which means I can't copy any of the book code into my closed-source product, into competitions, or even into MIT-licensed code. The readme says I cannot make copies of this material but it's ok to fork. Huh?
No, the readme says you can make copies for personal use.
If you want to use code in the book under a non GPL license, then you could just buy the book when it comes out. That doesn't seem like an unreasonable burden.
PS: none of this is anything to do with O'Reilly or their lawyers.
Wait... so if you buy the book then it ceases to be GPLed? This is quite confusing. For DL research, most code is MIT licensed, and legal folks at many industrial labs would be quite hesitant to permit use of code from this repo, which feels like a legal minefield with different restrictions spread over multiple places, including the LICENSE, the README, the fastai website, and perhaps the printed book. I would highly recommend converting to one simple MIT license and calling it a day (except for the markdown cells).
I don't get this perspective about the GPL. Look, they are giving you something for free, including the source code and the right to build upon it and publish modified versions. You can do basically whatever you want with it, as long as you pass on the freedoms that were granted to you. Is that unfair? Enjoying getting freedoms but not passing them on is not nice.
I get the GPL and fully appreciate its philosophy. The problem happens when you actually use it in practice. Because of its viral nature, anyone with different licensing must convert to GPL if they use your code. For many scenarios this is simply not possible: not just because of commercial secrets, but because of the potential for exposing security vulnerabilities when you don't have the resources to deal with them, or competitions where you must keep code secret for some time, or simply because you have dependencies on other code that are very expensive to get rid of. Due to this reason, many companies forbid the use of GPL-licensed software as well as releasing anything under it (because then you can't use your own code!). Many other companies simply don't want the headache of checking their mess of legacy codebases, with a myriad of dependencies, that would be hard to untangle into a GPL-compatible open-source release. The legal and economic overhead when you use or release GPLed code is non-trivial. For this reason, the vast majority of open-source code released by big tech companies on GitHub is MIT/BSD licensed, which ironically is "freer" than the GPL.
> The problem happens when you actually use it in practice.
I think it's fair for any code publisher to require that the freedoms they give with their code never get taken away, and that it or its modified versions can never get locked up or used in opposition to the wishes and interests of the users (i.e., the users retain ultimate control over the behavior, by being able to modify the code).
> Due to this reason, many companies forbid the use of GPL-licensed software as well as releasing anything under it (because then you can't use your own code!)
It seems you have a misunderstanding here, and I think it's a common one. You can use your own code in any way you want. You own the copyright, you decide the rules. And you don't need any agreements with yourself. Further, you can release your code to multiple people, each with any license you want. You can also sell proprietary licenses to companies that prefer it, while also releasing the same code under a GPL license to the public.
Definitely excited to check this out; thanks Jeremy and Sylvain!