Children use screens for school work starting in middle school (and sometimes even in elementary school). It is very difficult for parents or teachers to supervise this all the time. I think adults should educate children about safe online behavior, but, as with other real-world experiences, they need to have some independence when they're online too.
How do the cameras on these new phones compare with mirrorless cameras at the sub-$1000 price point?
Dedicated cameras have around 20 megapixels but a much larger sensor - does it really matter if the most I would do is print them in a photobook?
If you don't fiddle with the phone camera photos much and are happy with them, there's not much difference. A $1k-ish mirrorless is a much more capable camera but it doesn't fit in your pocket and has about as many controls as a nuclear reactor control room.
StitchFix | Senior Data Platform Engineer | Remote (US) | Full Time | https://www.stitchfix.com/careers/jobs?gh_jid=4143150&gh_jid...
At Stitch Fix, we’re about personal styling for everybody and we believe in both a service and a workplace where you can be your best, most authentic self. We’re the first fashion retailer to combine technology and data science with the human instinct of a Stylist to deliver a deeply personalized shopping experience.
The Platform team at Stitch Fix is a highly impactful group of engineers who develop some of the most mission-critical infrastructure in the company. The team owns our API strategy and execution, as well as systems and platforms that help unlock critical algorithmic capabilities. In addition, we invest in high-leverage, self-service platforms and tools to facilitate scalable research & development for our data scientists.
We are looking for engineers with strong experience building scalable, distributed production systems. You will build new platform services, tools, and infrastructure for delivering algorithmic models to production. You will collaborate and partner with different functions within Stitch Fix - Data Science, Product, Engineering, and other platform teams.
I have a toddler at home. I purchased a Kindle a few months ago and it has drastically increased my reading time. I try to carry the Kindle around the house instead of my phone and read whenever I can, in 15- to 30-minute chunks. I also read in bed before going to sleep, and the backlit Kindle is great for that.
I have also been borrowing Kindle books from my library. It's very easy to try books and continue reading only if I find them interesting!
My son is still an infant, and sometimes things (teething, stomach aches, who knows) keep him from sleeping. The Kindle makes it much more tolerable to walk around and rock him at night.
Of course, it also helps that my employer will reimburse any book purchases I make.
Our local library makes it pretty easy. Ours works through an app called Libby that connects with my library card, and I then download books to the Kindle.
I also recommend Libby. Our public library used to use Cloud Library, but they phased it out and it's Libby now. I like it much better. They also have audiobooks.
Author here. We have been building ML infra for FloydHub for over 3 years now and have learned a ton. It is not as easy as we thought it would be! We are open sourcing what we've learned in a blog series, hoping it will be useful for companies that build their own ML infrastructure.
This article focuses on how to use EC2 effectively and reduce the overall cost of ML infra. There are a lot of low-hanging-fruit opportunities that most companies we work with don't adopt. Anything else I missed?
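To give one concrete example of the kind of low-hanging fruit I mean - spot instances for training jobs - here's a rough sketch of checking current spot pricing with boto3 before launching a GPU box. This is my own sketch, not taken from the article; the instance type, region, and on-demand rate are just placeholder assumptions.

    # Sketch: compare current spot price vs. on-demand for a GPU instance
    # before kicking off a training job. Instance type, region, and the
    # on-demand rate below are assumptions - adjust for your own setup.
    from datetime import datetime, timedelta, timezone

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    history = ec2.describe_spot_price_history(
        InstanceTypes=["p3.2xlarge"],
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    )["SpotPriceHistory"]

    if history:
        spot = min(float(h["SpotPrice"]) for h in history)
        on_demand = 3.06  # assumed on-demand $/hr for p3.2xlarge; check your region
        print(f"cheapest spot: ${spot:.3f}/hr "
              f"(~{100 * (1 - spot / on_demand):.0f}% off on-demand)")

Spot interruptions do mean you need checkpointing, but for most training workloads that should be table stakes anyway.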
> As the first deep-learning-enabled product to launch on Github.com, this feature required careful design to ensure that the infrastructure would generalize to future projects.
It is surprising to see that this is the first time DL is being run in production at GitHub. GitHub has a large amount of fairly structured data in the form of code, issues, etc. Plus, they have been dabbling with DL for more than two years [1].
It could be that the business problems critical to the growth of the GitHub product don't need DL. Solving problems like best-first-issues and code search is useful to the end user but may not effectively move business metrics.
This is a really good time to be an independent scientist (aka gentleman scientist) in this field because of how nascent deep learning and similar techniques are. It requires a lot of trial and error and time/cost investment to bring these AI techniques to the masses.
The FAANGs are trying to hire all the top talent (including Emil, who wrote the post), but I believe these independent researchers will be the ones finding new opportunities to make AI useful in the real world (like colorizing b&w photos or generating website code from mockups).
The biggest challenge I see for these folks is access to high-quality data. There is a reason Google is putting so many ML models into production compared to smaller companies. Bridging the data gap requires effort from the community to build high-quality open source datasets for common applications.
On the other hand, the lack of data for independent researchers may encourage the development of low-data techniques, which is much more exciting in the long term, since humans are able to learn with much less data than most machine learning techniques require.
How is that useful for subsequent learning? The output is random words that don't even form phrases or sentences and have no relation to the image.
Humans can transfer learn across domains because we can draw on an incredible wealth of past experience. We can understand and abstractly reason about the architecture of problem landscapes and map our understanding into new spaces.
That isn't even counting our hardwired animal intelligence.
Humans learn efficiently, but I believe that has nothing to do with having a lifetime of data.
Humans have a lifetime of data, but you can easily parallelize a model so that it takes in more data than a human could in a lifetime, and humans are still the state of the art.
...which fits into less than 700 MB compressed. Some of the most exciting stories I've read recently in machine learning are cases where learning is reused between different problems. Strip off a few layers, do minimal re-training, and it learns a new problem, quickly. In the next decade, I can easily see some unanticipated techniques blowing the lid off this field.
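To make "strip off a few layers" concrete, here's a minimal sketch of my own (not from any of those stories); the pretrained ResNet-18 backbone and the 10-class head are just assumptions for illustration:

    # Sketch of transfer learning: reuse a pretrained backbone, retrain only a new head.
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(pretrained=True)

    # Freeze the pretrained feature extractor.
    for param in model.parameters():
        param.requires_grad = False

    # Swap the final classification layer for the new task (10 classes assumed).
    model.fc = nn.Linear(model.fc.in_features, 10)

    # Only the new head's parameters get updated during the "minimal re-training".
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

The point is that most of the learned features carry over; only the last layer has to be learned from scratch, which is why the re-training can be so quick.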
It indeed strikes me as particularly domain-narrow when I hear neuro or ML scientists claim as self-evident that "humans can learn new stuff with just a few examples!.." when the hardware upon which said learning takes place has been exposed to such 'examples' likely trillions of times over billions of years before — encoded as DNA and whatever else runs the 'make' command on us.
The usual corollary (that ML should "therefore" be able to learn with a few examples) may only apply, as I see it, if we somehow encode previous "learning" about the problem in the very structure (architecture, hardware, design) of the model itself.
It's really intuition based on 'natural' evolution, but I think you don't get to train much "intelligence" in 1 generation of being, however complex your being might be (or else humans would be rising exponentially in intelligence every generation by now, and think of what that means to the symmetrical assumption about silicon-based intelligence).
"The usual corollary (that ML should "therefore" be able to learn with a few examples) may only apply, as I see it, if we somehow encode previous "learning" about the problem in very the structure (architecture, hardware, design) of the model itself."
Yes, and they do. They aren't choosing completely arbitrary algorithms when they attempt to solve an ML problem; they are typically using approaches that have already been proven to work well on related problems, or at least variants of proven approaches.
The question is, how much information is encoded in those algos (to me, low-order logical truths about a few elementary variables, low degree of freedom for the system overall), compared to how much information is encoded in the "algos of the human brain" (and actually the whole body, if we admit that intelligence has little motivation to emerge if there's no signal to process and no action to ever be taken).
I was merely pointing out this outstanding asymmetry, as I see it, and the unfairness of judging our AI progress (or setting goals for it) relative to anything even remotely close to evolved species, in terms of end-result behavior and emergent high-level observations.
Think of it this way: a tiny neural net (equivalent to the brain of what, not even an insect?) "generationally evolved" enough by us to be able to recognize cats and license numbers, process human speech, suggest songs, and whatnot is really not too shabby. I'd call it a monumental success to be able to focus a NN so well on a vertical skill. But that's also low-order and low-freedom in the grander scheme of things, and "focus" (verticality) is just one aspect of intelligence (e.g. the raging battle right now is for "context", the horizontality and sequentiality of knowledge; and you can see how the concept of "awareness", even just mechanical, lies behind that). So, many more steps to go. So vastly much more to encode in our models before they're able to take a lesson in one sitting and a few examples.
It really took big-big-big data for evolution to do it, anyway, and we're speeding that up thanks to focus in design, and electronics to hasten information processing, but not fundamentally changing the law of neural evolution, it seems.
If you ask me, the next step is to encode structural information in the neuron itself, as a machine or even a network thereof, because that's how biology does it (the "dumb" logic-gate transistor model is definitely wrong on all counts, too simplistic). Seems like the next obvious move, architecturally.
Agree with your first statement and disagree with your second; I don’t think the former implies the latter.
I think there's a lot of room to be clever about encoding domain-specific inductive biases into models/algorithms, such that they can perform fast and robust inference. Exploiting this trade-off as a design parameter to be tuned, rather than sitting at one of the two extremes, is potentially going to generate a lot of value. And this is highly under-appreciated currently, since most people are obsessed with "data". I'm willing to bet that this will become big in a few years when the current AI hype machine falters, and will serve as a huge competitive advantage.
These types of techniques are already big in certain fields. E.g., in fluid dynamics and heat transfer, "dimensional analysis" is frequently used to simplify and generalize models. Sometimes models can be nearly fully specified, up to a constant of proportionality, based solely on dimensional considerations. Beyond what is typically seen as "data", the information here is just a list of the variables involved in the problem and their dimensions.
As far as I can tell, "dimensions" in this sense are a purely human construct. For two variables to have different dimensions means that they cannot be meaningfully added, e.g., apples and oranges.
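A textbook example of what "specified up to a constant of proportionality" means (my example, not the parent's): for a simple pendulum the relevant variables are the length L, gravitational acceleration g, and mass m. The only combination of these with dimensions of time is sqrt(L/g), so the period has to be T = C * sqrt(L/g) for some dimensionless constant C. The model is pinned down up to that one constant before collecting a single data point, and the mass drops out entirely.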
This would be a great area, IMHO, for the government to step in and fund an initiative to provide huge, rich datasets for anyone to use for ML research.
Wrt the data point, to be fair, most research is still coming out of universities, where students have access to the same data as anyone else. So from a research perspective it's not a huge deal; much as with compute, industry can scale up known techniques while individual researchers do more interesting stuff.
So if I understand correctly, to reformulate in my own words/views:
while the "big data" (datasets) formed and thus owned by big-tech, big-ads, big-brother, etc. may be instrumental to build at-scale solutions for real-world usage (for profit, knowledge, control, whatever actionable goal),
fundamental research itself, as done in universities, can move forward without these datasets: using what's publicly available is enough.
Did I read this right? It would effectively add much needed nuance to the common perception that big data is necessary to train innovative models, that there might be some sort of monopoly on oil (data, the 'fuel' of ML) by a few champions of data collection.
It's not exactly true that research institutions don't have access to the same big datasets as companies. For example, I took a course that involved tracking soccer players using videos provided by a streaming company that specializes in amateur soccer. They promised to give us access to their internal API under an NDA, which they wouldn't have done for just anyone.
On the other hand, they never actually gave our API keys the necessary privileges, so in the end I just reverse-engineered the URL scheme of their streams and scraped them. Many datasets used in academia are just collections of publicly available data (e.g. Wikipedia, images found by googling), optionally annotated for cheap using Amazon Mechanical Turk. Experimenting with that kind of data is also open to independent researchers. You don't need to work at a data-hoarding company if you can get what you need by scraping their website.
Yep, you read that right. Source: I am a PhD student in the Stanford Vision and Learning Lab (http://svl.stanford.edu/) and read a ton of AI papers. The vast majority of papers are done with datasets anyone can just download or request, as far as I've seen.
Personally, without affiliation with a university, I have a hard time downloading the datasets over my slow home connection. I live in a city with a university; I explained the situation, but they won't let me download a dataset even if I pay, ... only if I enroll. Instead of just selling the shovel, they want to sell me the wheelbarrow too.
I once succeeded in convincing the guy behind the desk at an internet cafe to let me bring my HDD and download a dataset during a quieter time of day, throttled so it wouldn't disturb other customers. It went without any problems for the other customers. When I asked again a few months later for a new dataset, they no longer wanted to let me...
There seems to be no download-by-mail service (and I only get people forwarding me to Google Cloud products etc., which, as a European, I find financially out there, with automatic balance deductions and non-transparent pricing schemes; I would have no qualms using GCP or others if they ran a prepaid alternative for people who refuse to take on that risk).
A lot of research data sets are publicly available, but many researchers based at universities have relationships with private companies where they can get access to data or other resources useful for research (e.g. Google has a big room of robotic arms generating data for pick and place tasks).
There is still plenty you can do with a reasonable personal budget, however.
You will have unlimited training data. But it's a very difficult task, even for humans. It's like trying to reverse a hash. Also, a lot of information is lost when you store a color digitally.