>“Here at Google, sometimes we just throw stuff together,” Dr. Norvig said, during a meeting of the Google Trips team, in Mountain View, Calif. “But other times, if you’re serving billions of users, it’s important to do that efficiently. A 10-per-cent improvement in efficiency can work out to billions of dollars, and in order to get that last level of efficiency, you have to understand what’s going on all the way down.”
I'm interested in this low-level systems optimization. What type of jobs hire for this work? What areas in graduate school should one focus on? Is this type of skill set still employable?
It's not necessarily low level optimization. You get better optimization results if you're willing to change things at all levels of the software stack (including user expectations); so it's really about understanding what's going on all the way down and all the way up.
As with all things, the way to get experience and knowledge is to try things. Take a look at some software you use that does something slowly, and try to make it faster; it's easier if you have the source available. Lots of applications these days are slow to start; startup time is easy to measure, and it's usually easy to bucket the work into things that don't need to be done at all and things that can be deferred. If you interact with any long data-processing loops, those are often good candidates for optimization as well.
This skill set is most employable at companies that have scale; when you have ten servers, being able to turn off one of them because of efficiency gains doesn't help that much, but it's nice to turn off ten percent of your 10,000-node cluster. It also leads to nice claims on resumes ("made efficiency gains in X, leading to Y% cost savings on servers"), and since you're likely to spend a bunch of time on a single issue, you'll be able to have a good discussion when it comes to 'discuss some of your interesting projects' in interviews.
The only downside is you'll be super frustrated about everyone else's software being slow. You'll likely gravitate towards classic games because they don't have load times; I hope you like pixel art. :D
There's a position like that in my company, open to new hires (only bachelor's needed). The goal is to optimize algorithms and data analysis models that run on our HPC cluster. It basically boils down to changing compiler settings and seeing what runs the fastest.
For example, you could read the C++ spec forwards and backwards for a year and know the big-O complexity of every data structure just as well, but neither will ever tell you that inserting into the middle of a vector is faster than inserting into the middle of a linked list in a great many common cases.
Sure, ideally you'd measure, but you need to have some idea of what options to even compare with your measurements.
I may be missing something here, but your example seems to be a textbook case for big-O analysis. Writing in a vector is O(1), while linked list is O(n), isn’t it?
It's about inserting in the middle of a vector, which requires shifting every element after that point (so, O(n)); for a linked list it can be done by allocating a new node, which, if you already have a pointer to the place where you want to insert, is O(1) - really, just a few operations.
However, in practice, the O(n) operation will often be faster - doing a hundred reads and a hundred writes to shift a hundred numbers within a few cache lines is much faster than a single read from out-of-cache memory in the linked-list case; the processor can do a lot of work while waiting for new data from RAM.
Writing into a vector is O(n), writing into a LL is O(1), assuming you have an iterator to the location to write to. If you don't, then it becomes O(n), and writing into a vector can be O(1) if you're writing to the end.
In the C++ STL, linked-list insertion is always O(1) given an iterator to the position, and inserting into the middle of a vector is O(n). There are plenty of caveats to this, and the documentation should be referenced.
He's right btw that insert into a vector is generally faster in the C++ stl than insert into a LL. I don't recall off the top of my head why that is, just remember it's the case.
Data locality is a big reason. Sure, it's O(N) writes, but those writes are neatly lined up in such a way that the likelihood of a cache miss is low. As opposed to a linked list, where just traversing to the middle may entail multiple round trips to main memory.
Working in any of the big cloud services (AWS, Azure, GCP) will involve this sort of work. Some services probably have more interesting scaling problems than others. (e.g. I bet AWS DynamoDB is more interesting than AWS CloudTrail).