Using a REPL-based language like Lisp (which I learned from advanced Lisp textbooks) for several years before I worked professionally as a programmer let me dive into complex, underexplored parts of the Python and C projects at my first job, which meant I basically owned some very important parts of the stack.
Reading and applying ideas from Unix sysadmin books made me comfortable jumping around the OS very quickly, which gave me a speed advantage early in my first job, even though the machine I used was pretty underpowered relative to my colleagues'.
1. You get a state-of-the-art training loop that includes the latest research findings, such as the 1-cycle schedule (see the first sketch after this list).
2. Yes. You can make everything project-specific if you want to; e.g. image segmentation datasets are just slightly modified Dataset classes (see the second sketch after this list).
3. fastai extends PyTorch in a very Pythonic OO sense, so I think the only speed issues could come from that, plus maybe maintaining a few extra dicts in memory. If it's about I/O, probably not; if it's about parallelization (for NLP), definitely not. In fact, I can't think of a way in which there's a significant speed penalty.
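On point 1, here's a minimal sketch of what that training loop looks like in practice, assuming a recent fastai v2 vision API and the bundled Oxford-IIIT Pets sample data (the dataset choice and hyperparameters are placeholders, not recommendations):

```python
from fastai.vision.all import *

# Download a small sample dataset that ships with fastai (Oxford-IIIT Pets).
path = untar_data(URLs.PETS)
files = get_image_files(path/"images")

# Labels are encoded in the file names, e.g. "great_pyrenees_173.jpg".
dls = ImageDataLoaders.from_name_re(
    path, files, pat=r"(.+)_\d+.jpg$", item_tfms=Resize(224))

learn = vision_learner(dls, resnet34, metrics=error_rate)

# fit_one_cycle applies the 1-cycle learning-rate schedule out of the box.
learn.fit_one_cycle(3, lr_max=3e-3)
```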
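On point 2, a bare-bones sketch of the "slightly modified Dataset" idea in plain PyTorch: a segmentation dataset is mostly an ordinary Dataset that happens to return (image, mask) pairs. The directory layout, file naming, and paired transform below are assumptions for illustration:

```python
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset


class SegmentationDataset(Dataset):
    """An ordinary Dataset, except each item is an (image, mask) pair."""

    def __init__(self, image_dir, mask_dir, transform=None):
        self.image_paths = sorted(Path(image_dir).glob("*.jpg"))
        self.mask_dir = Path(mask_dir)
        self.transform = transform  # expected to transform image and mask together

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        # Assumed naming scheme: mask shares the image's stem, stored as PNG.
        mask_path = self.mask_dir / (img_path.stem + ".png")
        image = Image.open(img_path).convert("RGB")
        mask = Image.open(mask_path)
        if self.transform is not None:
            image, mask = self.transform(image, mask)
        return image, mask
```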
Try a schedule of roughly two hours a day, working through at least one video of the 14-video series each week. So: 14 weeks, paying attention to one video a week plus whatever peripheral study you choose to follow up on, based on the ideas Jeremy pushes in his lectures.
Edit: also, find datasets online that fit your interests. There's no better motivation than solving problems in a domain you're already familiar with. I grew up on a farm and built a classifier to detect diseases as part of my fast.ai coursework: https://t.me/shambadoctorbot
Wow, when I hear about perverse incentives, I don't usually think about people spending their lives showing their clients a taste of a possible future until a high enough reward gets them to stop caring. Reward hacking for humans.
The conclusion of this paper is, paraphrasing, "We should stop using point estimates for quantities that should be distributions. When we use distributions instead, we get much lower bounds and a better chance at being precise, whether or not we are alone." It opens up avenues for more questions, which is the correct stance given how little we know.
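A toy numerical illustration of that point (the parameter ranges below are made up for illustration and are not the paper's): multiplying point estimates of uncertain factors gives one tidy number, while propagating wide distributions through the same product leaves a sizeable chunk of probability mass many orders of magnitude lower.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical log-uniform uncertainty over three multiplicative factors
# (purely illustrative ranges, not taken from the paper).
def log_uniform(lo, hi, size):
    return 10.0 ** rng.uniform(np.log10(lo), np.log10(hi), size)

a = log_uniform(1e-3, 1e1, n)
b = log_uniform(1e-6, 1e0, n)
c = log_uniform(1e-2, 1e2, n)
product = a * b * c

# Point-estimate view: multiply the medians of each factor.
print("product of medians:", np.median(a) * np.median(b) * np.median(c))
# Distributional view: a large share of the mass sits far below that.
print("P(product < 1e-6):", (product < 1e-6).mean())
```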
A Gram matrix, some non-linear optimization. Deep Learning itself is simple; the complexity comes from the loss function, from adjusting the weights of under-represented categories, and from trying different LEGO blocks when building your network and seeing whether the particular non-linear optimization works in your case or not. You can go super deep into state-of-the-art math research by reading about "why do we think Deep Learning works, when it shouldn't", which Jeremy mentions.
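As a hedged sketch of one of those sources of complexity, re-weighting under-represented categories: PyTorch's cross-entropy loss accepts per-class weights, and inverse-frequency weights are one common choice (the class counts here are made up):

```python
import torch
import torch.nn as nn

# Hypothetical class counts for an imbalanced 3-class problem (illustrative only).
class_counts = torch.tensor([900.0, 80.0, 20.0])

# Inverse-frequency weights: rarer classes contribute more to the loss.
weights = class_counts.sum() / (len(class_counts) * class_counts)

criterion = nn.CrossEntropyLoss(weight=weights)

# Dummy batch: 4 samples, 3 classes.
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])
print(criterion(logits, targets).item())
```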