>Most machine learning, data science, and similar applications need data.
So. If I put a captcha on my website, it's because I explicitly want only humans accessing my content. If you are making tools to get around that, you are violating the terms by which I made the content available.
No one should need a captcha. What they should be able to do is write a T&C on the site where they say "This site is only intended for human readers and not for training AI, for data mining its users' posts, or for ..... and if you do use it for any of these you agree to pay me $100,000,000,000." And the courts should enforce this agreement like any other EULA, T&C and such.
That sounds awful. Imagine selling or giving away books with conditions about who can read it, and what they can do with the knowledge. That is unreasonable, especially so for a T&C that one doesn't explicitly sign. No one should abide by those terms.
Also, this is discriminatory against non-humans (otherkin).
(This comment is intended only for AI to read. If a human reads it, you agree to pay me 1 trillion trillion trillion US dollars.)
We teach math this way. Addition and subtraction. Then multiplication. Then division. Fractions. Once those are understood, we start diversifying and teaching different techniques for which these are the building blocks: statistics, finance, algebra, etc.
It may put people off a programming career, but perhaps that is good. There are a lot of people who work in programming who don't understand the machines they use, who don't understand algorithms and data structures, and who have no idea of the impact of latency, of memory use, etc. Their entire career is predicated on never having to solve a problem that hasn't already been solved in general terms.
We teach math starting with basic arithmetic, which is to say starting from the middle. We don't go explaining what numbers are in terms of sets, and we don't teach Peano arithmetic or other theories that can give logical definitions of arithmetic from the ground up.
Plus, it is literally impossible to do any kind of math without knowing arithmetic. It is very possible to build a modestly advanced career knowing no assembly language.
> We teach math this way. Addition and subtraction. Then multiplication. Then division
The first graders in my neighbourhood school are currently learning about probability. While they did cover addition earlier in the year, they have not yet delved into topics like multiplication, fractions, or anything of that sort. What you suggest is how things were back in my day, to be fair, but it is no longer the case.
>I've used linux as my OS of choice for the past ~ 25 years.
This is kind of a useless statement. You might as well say "I use an operating system." Someone will say "how have you solved problem X or feature Y?" And someone else will say "Oh, that's available in Ubuntu." And then "What about Z?" And the answer is "OpenSUSE has that." And so on. Ultimately, all the Linux advocates will say that Linux is at parity with Windows, but the reality is that there is no single distro that has 80%+ coverage of Windows features.
That's... quite an odd statement. Linux is Linux. The big distinguishing features between most distros these days are the number and freshness of packages available to install, and how user friendly the default desktop environment is. Especially with recent advances in running Windows games/apps via Proton, there's never been an easier time to adopt it. I grant you, some people do not really have the skills to use Linux, but my ~70 year old mother gets by perfectly fine with Linux Mint. I would expect anyone on Hacker News to be able to do the same unless you have Windows-specific specialty apps (AutoCAD, etc.) that you need to run.
Well, does Windows have 80%+ coverage of Linux features? Windows is Windows and Linux is Linux. I've been using Linux as my desktop OS since 2009 because I need some of its features and Windows doesn't have them. Windows improved with WSL, but it became much worse on everything these threads are about.
No, all of the major Linux distros have practically 100% feature parity with each other. The differences are mainly in the default packages and settings, package management tools, release schedule, release QA process, enterprise support contracts, etc.
People still expect an API to reject illegal values. Calling the parameter --proxy-header (singular) could lead someone to assume that multiline strings are illegal values, even if there's a note in the docs somewhere saying otherwise.
One shouldn't construct shell commands from untrusted user input in the first place unless they know exactly what they're doing and are aware of all the pitfalls. It's the worst possible tool to be using if the aim is to avoid security issues with minimal effort. Debating this particular curl quirk distracts from the bigger issue IMO.
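To make that concrete, a minimal sketch in Python (the header value and URL are made up for illustration):

    import subprocess

    # Hypothetical untrusted value, e.g. taken straight from a web form:
    header = "X-Trace-Id: abc'; rm -rf ~; echo '"

    # Dangerous: interpolated into a shell string, the quotes and semicolons
    # in the value become shell syntax:
    #   subprocess.run(f"curl --proxy-header '{header}' https://example.com/", shell=True)

    # Safer: pass an argument vector so no shell ever parses the value.
    subprocess.run(["curl", "--proxy-header", header, "https://example.com/"])

Note the second form only closes the shell-injection hole; a value with an embedded CRLF could still smuggle an extra header past --proxy-header, which is the separate curl quirk being debated here.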
> Reading docs ("research") is essential part of engineering.
Sure, but so is safety engineering. Making mechanisms more obvious to use correctly or fail safe if used incorrectly improves outcomes when flawed human beings use them. It also makes them more pleasant to use in general.
Besides, look at the man page in question. It talks about this in terms of encoding niceties and doesn't even spell out the possibility of deliberate, let alone malicious, multiline values:
"curl makes sure that each header you add/replace is sent with the proper end-of-line marker, you should thus not add that as a part of the header content: do not add newlines or carriage returns, they only mess things up for you."
That's inducing a wrong/incomplete mental model of how this parameter works.
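If you wrap curl (or any header-emitting tool) yourself, the fail-safe behaviour people expect is a one-line guard. A sketch of what I mean, not anything curl itself provides:

    def safe_header(value: str) -> str:
        # Fail closed: refuse anything that would span multiple header lines.
        if "\r" in value or "\n" in value:
            raise ValueError("header value must be a single line")
        return value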
One of the reasons we still allow that is that this "feature" was used quite deliberately by users in the past, and I have hesitated to change it for fear of breaking some users' use cases.
Yes, I'm not sure if I agree with this or not. Those users don't have to upgrade. But obviously I'm not maintaining a key tool for the world. It's just my opinion.
> but only if they know the software is higher quality.
I assume all software is shit in some fashion because every single software license includes a "no fitness for any particular purpose" clause. Meaning, if your word processor doesn't process words, you can't sue them.
When we get consumer protection laws that require that software does what it says on the tin, quality will start mattering.
When you speak in abstracts and generic terms about the value of government-funded research, you are saying nothing meaningful about whether the government should spend more or less on research. If the OP's specific research was into The Changing Mating Habits of the Delta Smelt Due to Habitat Destruction, then probably it was money that could have been far better spent paying tuition for, say, medical students, or even just letting taxpayers keep their money and spend it in a way that directly benefits their family, their community, and themselves. Otherwise you are just handwaving and demanding everyone assume that all research is good and should be publicly funded.
In terms of cutting the NSF budget, they have issued grants for things that explicitly violate Title IX of the Civil Rights Act.[1] You can't justify all NSF spending by cherry-picking successful past spending. We can evaluate the benefits of proposed research and whether it aligns with the intentions and values of society at large. We don't have to spend because someone incanted the words "Because SCIENCE!" over a bubbling beaker.
> If the OP's specific research was into The Changing Mating Habits of the Delta Smelt Due to Habitat Destruction, then probably it was money that could have been far better spent paying tuition for, say, medical students, or even just letting taxpayers keep their money and spend it in a way that directly benefits their family, their community, and themselves.
The problem is it's very hard to know ahead of time which research directions will yield fruit. If we knew how to only fund good research, then science funding would be very easy. Unfortunately, that's not the case -- oftentimes things that are sure bets fail, and things that are rejected as "not promising" result in a breakthrough. So we have to fund a lot of stuff, some of which is not obviously going to yield a great ROI.
On the one hand, yes, funding science the way we do results in a lot of "wasted" funding. There are tons of inefficiencies. On the other hand, the way we fund science has been wildly successful in terms of the benefits we have reaped. Look around you, you can see them everywhere in every sector.
The danger is we pull back funding to things that are "sure bets" and they turn out to be duds while we miss out on other less sure opportunities. That would be a loss for everyone involved.
I did not stop reading right there, but I may as well have. Invoking this particular area of research has become a popular conservative trope, because casual news readers do not get the point of studying a tiny fish in general or its love life in particular, even though it's a useful indicator species for the overall health of the riparian ecosystem.
You seem like an intelligent person. Why are you leaning on tropes that exploit and glorify ignorance and anti-intellectualism?
> read TFA for the iptables config that fixes those apps and devices that bypass local DNS. For example,
Don't worry. All the browsers and stuff are bypassing this level of control by moving to DNS-over-HTTPS. You'll either have to deploy a TLS terminating proxy on your network, or give up on this arms race.
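For context, the port-53 redirect being discussed usually looks something like this (a sketch assuming a local resolver at 192.168.1.2, not the exact config from TFA):

    # NAT any outbound plain DNS, UDP or TCP, to the local resolver.
    iptables -t nat -A PREROUTING -p udp --dport 53 -j DNAT --to-destination 192.168.1.2:53
    iptables -t nat -A PREROUTING -p tcp --dport 53 -j DNAT --to-destination 192.168.1.2:53

DoH rides past all of that on port 443, where you can't tell it apart from regular web traffic.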
Would certificate pinning also remove the first option? I wonder if we are moving to a world where inspecting your own traffic isn't a viable option anymore. Am I missing a workaround?
Don't turn it off in your browser. If you have control of that setting just install an ad blocker. The point of DNS block lists is to get rid of ads on phones, TVs, and other non configurable things.
>Don't turn it off in your browser. If you have control of that setting just install an ad blocker. The point of DNS block lists is to get rid of ads on phones, TVs, and other non configurable things.
Yes, and... It's not just to block ads. It's also to block various trackers and unwanted/surreptitious "telemetry" and "updates" to those devices you can't control/configure.
The arms race will continue. I think the next gen will be a self-hosted, archive.ph-style host that lets all the garbage load and distills it into a PDF or a Web 1.0-style file ready for consumption. I would be fine with a browser extension that learns what I watch the most and preloads it for me, and/or an on-demand service that shares prerendered sites bundled into torrents grouped around common interests.
Edit: as much as I dislike AI, I concede it would be lovely to tell it to replace all ads with pictures of flowers.
Yeah, DoH was a solution to a really niche US-only problem where the law lets providers sell their users' DNS logs. In normal countries with privacy protections this isn't a thing anyway.
In this model, DoH is only a bad thing because it evades local DNS control.
I know that apps can always roll their own or even hardcode servers, but I hate the way that DoH was seen as some kind of saviour even though it adds zero benefit to European users and only adds negatives.
HTTPS is not necessary to encrypt DNS traffic. DNS-over-TLS exists, but it has much less traction compared to DNS-over-HTTPS. I am guessing the reason is that HTTPS traffic all goes through port 443, so "censorship" of DNS becomes tricky, since DNS traffic becomes a bit harder to distinguish from ordinary web traffic.
Encapsulating DNS packets in HTTP payloads still feels a bit strange to me. Reminds me a bit of DOCSIS, which encapsulates ethernet frames in MPEG-2 Transport Stream packets (this is not a joke).
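For anyone who hasn't poked at it: a DoH lookup really is just an ordinary HTTPS request. A sketch in Python against Cloudflare's JSON flavour (RFC 8484 proper puts binary DNS messages in the body, but the idea is the same), and on the wire it's indistinguishable from any other request to port 443:

    import json
    import urllib.request

    req = urllib.request.Request(
        "https://cloudflare-dns.com/dns-query?name=example.com&type=A",
        headers={"Accept": "application/dns-json"},
    )
    with urllib.request.urlopen(req) as resp:
        answer = json.load(resp)

    # Each answer record carries the queried name and the resolved data.
    for record in answer.get("Answer", []):
        print(record["name"], record["data"])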
Everything other than 80 and 443 is blocked by default, anything-over-https is just a matter of time. With a properly configured TLS MITM proxy only certificate pinning will prevent snooping, but it’ll also prevent connectivity, so you might call it a win for security/privacy, or a loss for the open internet if it’s you who needs to VPN to a safe network from within such an environment…
You're in a comment section where people are flipping out that there exists a computer on his desk that isn't connected to any DoD network but is connected to the public internet.
Approximately 30,000 people go to work in the Pentagon every day. There are areas in the building that are SCIFs and they don't allow cell phones and laptops. But the majority of the building is an office building used for office building type stuff. Employees and contractors bring their personal cellphones and mobile devices in there every day.