Hacker News | pornel's comments

I wonder if we'll get some malicious workarounds for this, like re-releasing the same phone under different SKUs to pretend they were different models, each on sale only for a short time.


Maybe, but the regulation tries to prevent this by separating "models" from "batches" from "individual items", and it defaults to "model" when determining compatibility. It's also worth noting that each new model requires a separate filing for EcoDesign as well as for other certifications like CE, which should help discourage workarounds like model-number inflation.


We commonly use hardware like LCDs and printers that render a sharp transition between pixels without the Gibbs phenomenon. CRT scanlines were close to an actual 1D signal (though not directly controlled by the pixels, which video cards still tried to make square-ish), but AFAIK we've never had a display that behaves like the continuous 2D signal assumed in image processing.

In signal processing you have a finite number of samples of an infinitely precise continuous signal, but in image processing you have a discrete representation mapped to a discrete output. It's continuous only when you choose to model it that way. Discrete → continuous → discrete conversion is a useful tool in some cases, but it's not the whole story.
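
As a rough sketch of that discrete → continuous → discrete round trip (1D only, and the triangle kernel is an arbitrary choice for illustration, not the "right" one):

    // Toy 1D resampling: discrete samples -> continuous reconstruction -> discrete samples.
    fn reconstruct(samples: &[f32], x: f32) -> f32 {
        // Value of the "continuous" signal at position x, as a kernel-weighted sum.
        samples
            .iter()
            .enumerate()
            .map(|(i, &s)| {
                let d = (x - i as f32).abs();
                s * (1.0 - d).max(0.0) // triangle (linear) kernel, radius 1
            })
            .sum()
    }

    fn resample(samples: &[f32], new_len: usize) -> Vec<f32> {
        let scale = (samples.len() - 1) as f32 / (new_len - 1) as f32;
        (0..new_len).map(|i| reconstruct(samples, i as f32 * scale)).collect()
    }

    fn main() {
        let pixels = [0.0, 1.0, 0.0, 1.0];
        // Sample the reconstructed signal at 7 positions instead of 4.
        println!("{:?}", resample(&pixels, 7));
    }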

There are images designed for very specific hardware, like sprites for CRT monitors, or font glyphs rendered for LCD subpixels. More generally, nearly all bitmap graphics assume that pixel alignment is meaningful (and that was true even in the CRT era, before the pixel grid could be aligned with the display's subpixels). Boxes and line widths, especially in GUIs, tend to be designed for integer multiples of pixels. Fonts have/had hinting for aligning to the pixel grid.

Lack of grid alignment, an equivalent of a phase shift that wouldn't matter in pure signal processing, is visually quite noticeable at resolutions where the hardware pixels are little squares to the naked eye.


I think you are saying there are other kinds of displays which are not typical monitors and those displays show different kinds of images - and I don’t disagree.


I'm saying "digital images" are captured by and created for hardware that has the "little squares". This defines what their pixels really are. Pixels in these digital images actually represent discrete units, and not infinitesimal samples of waveforms.

Since the pixels never were a waveform, never were sampled from such a signal (even light in camera sensors isn't sampled along these axes), and don't get displayed as a 2D waveform, the pixels-as-points model from the article at the top of this thread is just an arbitrary abstraction, not an accurate representation of what pixels are.


If Apple is so bad at this that they have to charge 30%, they should have failed in the free market to a competitor that can do the same or better for 3%. However, Apple has prevented that, not by being better or cheaper, but by implementing DRM that locks users out from having a choice (and the market as a whole ended up being a duopoly with cartel-like pricing).

Whether Apple can be cheaper isn't really the point (they should be, digital services are a very high margin business). It's that they're anti-competitive to the point that the market for paid apps and in-app payments became inefficient (in a financial sense).


Rust has such open extensibility through traits. The prime example is Itertools, which already adds a bunch of extra pipelining helper methods.
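
A minimal sketch of how that works (the trait and method names here are made up for illustration, not Itertools' real API): declare a trait with default methods and blanket-implement it for every iterator, and the methods show up on all of them.

    // A minimal extension trait in the style of Itertools (hypothetical
    // names, not the real Itertools API).
    trait PipelineExt: Iterator + Sized {
        /// Yields only every `n`-th item of the underlying iterator.
        fn every_nth(self, n: usize) -> impl Iterator<Item = Self::Item> {
            self.enumerate()
                .filter(move |(i, _)| i % n == 0)
                .map(|(_, item)| item)
        }
    }

    // Blanket impl: every iterator gets the extra method "for free".
    impl<I: Iterator> PipelineExt for I {}

    fn main() {
        let picked: Vec<_> = (0..10).every_nth(3).collect();
        println!("{:?}", picked); // [0, 3, 6, 9]
    }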


It is due to the risk of a leak.

Laundering of data through training makes it a more complicated case than a simple data theft or copyright infringement.

Leaks could be accidental, e.g. due to an employee logging in to their free-as-in-labor personal account instead of a no-training Enterprise account. It's safer to have a complete ban on providers that may collect data for training.


Their entire business model is based on taking other people's stuff. I can't imagine someone willingly drowning with a sinking ship whose entire cargo hold is filled with lifeboats, just because they promised they would.


How can you be sure that AWS will not use your data to train their models? They hold an enormous amount of data, probably the most in the world.


Being caught doing that would be wildly harmful to their business - billions of dollars harmful, especially given the contracts they sign with their customers. The brand damage would be enormously expensive too.

There is no world in which training on customer data without permission would be worth it for AWS.

Your data really isn't that useful anyway.


> Your data really isn't that useful anyway

? A single random document, maybe, but in aggregate? I understood some parties were trying to scrape indiscriminately - the "big data" way. And if some of that input is sensitive and gets stored somewhere in the NN, it may come out in an output - in theory...

Admittedly I never researched the details of the potential phenomenon - that anything personal may be stored (not just George III but Random Randy) - but it seems possible.


There's a pretty common misconception that training LLMs is about loading in as much data as possible no matter the source.

That might have been true a few years ago but today the top AI labs are all focusing on quality: they're trying to find the best possible sources of high quality tokens, not randomly dumping in anything they can obtain.

Andrej Karpathy said this last year: https://twitter.com/karpathy/status/1797313173449764933

> Turns out that LLMs learn a lot better and faster from educational content as well. This is partly because the average Common Crawl article (internet pages) is not of very high value and distracts the training, packing in too much irrelevant information. The average webpage on the internet is so random and terrible it's not even clear how prior LLMs learn anything at all.


Obviously the training data should preferably be high quality - but there you run into the (pseudo-) problem of "copyright" (pseudo-, as I've insisted elsewhere, citing the right to have read whatever is in any public library).

If there is some advantage to quantity though, then achieving high quality raises questions about tradeoffs and workflows - sources where authors are "free participants" could let odd data seep in.

And whether such data may be reflected in outputs remains an open question (probably tackled by work I have not read... Ars longa, vita brevis).


There is a stark contrast in usability between self-contained/owning types and types that are temporary views bound by the lifetime of the place they borrow from. But this is an inherent problem for all non-GC languages that allow saving pointers to data on the stack (Rust doesn't need lifetimes for by-reference heap types). In languages without lifetimes you just don't get any compiler help in finding the places that may be affected by dangling pointers.

This is similar to creating a broadly-used data structure and realizing that some field has to be optional. Option<T> will require you to change everything touching it, and virally spread through all the code that wanted to use that field unconditionally. However, that's not the fault of the Option syntax, it's the fault of the semantics of optionality. In languages that don't make this "miserable" at compile time, this problem manifests as a whack-a-mole of NullPointerExceptions at run time.
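
A tiny illustration of that viral spread (the field and function names are hypothetical): once the field becomes Option, every place that read it unconditionally has to say what None means for it.

    // Hypothetical example: `email` used to be a plain String, and every caller
    // just read it. Making it Option<String> forces each call site to decide
    // what "no email" means for it.
    struct User {
        name: String,
        email: Option<String>,
    }

    fn send_newsletter(user: &User) {
        // The compiler won't let this read the field unconditionally anymore.
        match &user.email {
            Some(addr) => println!("sending to {addr}"),
            None => println!("{} has no email, skipping", user.name),
        }
    }

    fn main() {
        let u = User { name: "Randy".into(), email: None };
        send_newsletter(&u);
    }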

With experience, I don't get this "oh no, now there's a lifetime popping up everywhere" surprise in Rust any more. Whether something is going to be a temporary view or permanent storage can be known ahead of time, and if it can be both, it can be designed with Cow-like types.
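
For example (a minimal sketch), std's Cow lets the same field hold either a borrowed view or owned data, so code using the type doesn't have to commit to one or the other:

    use std::borrow::Cow;

    // A label that can either borrow its text or own it - the caller decides.
    struct Label<'a> {
        text: Cow<'a, str>,
    }

    fn print_label(label: &Label<'_>) {
        // Code using the type doesn't care which variant it got.
        println!("{}", label.text);
    }

    fn main() {
        let borrowed = Label { text: Cow::Borrowed("temporary view") };
        let owned = Label { text: Cow::Owned(String::from("owned, lives as long as needed")) };
        print_label(&borrowed);
        print_label(&owned);
    }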

I also got a sense for when using a temporary loan is a premature optimization. All data has to be stored somewhere (you can't have a reference to data that hasn't been stored). Designs that try to be ultra-efficient by allowing only temporary references often force data to be stored in a temporary location first and then borrowed, which doesn't avoid any allocations, it only adds dependencies on external storage. Instead, the design can support moving or collecting data into owned (non-temporary) storage directly. It can then keep it for an arbitrary lifetime without lifetime annotations, and hand out temporary references to it whenever needed. The run-time cost can be the same, but the semantics are much easier to work with.
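
A sketch of the contrast, with made-up names: the view-only design ties the struct to external storage via a lifetime, while the owning design takes the data by move, stores it without any lifetime annotations, and can still hand out temporary borrows.

    // View-only design: the index borrows from storage the caller must
    // keep alive elsewhere for as long as the index exists.
    struct BorrowedIndex<'a> {
        words: Vec<&'a str>,
    }

    // Owning design: the data is moved in once, stored without lifetime
    // annotations, and temporary references are handed out on demand.
    struct OwnedIndex {
        words: Vec<String>,
    }

    impl OwnedIndex {
        fn from_text(text: &str) -> Self {
            // Collects into owned storage up front; nothing external to depend on.
            OwnedIndex { words: text.split_whitespace().map(String::from).collect() }
        }
        fn get(&self, i: usize) -> Option<&str> {
            self.words.get(i).map(String::as_str)
        }
    }

    fn main() {
        let text = String::from("owned storage is easier to pass around");
        let view = BorrowedIndex { words: text.split_whitespace().collect() };
        let owned = OwnedIndex::from_text(&text);
        // `view` is tied to `text`; `owned` can outlive it and still lend out &str.
        println!("{:?} {:?}", view.words.first(), owned.get(1));
    }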


Browsers check the identity of the certificates every time. The host name is the identity.

There are lots of issues with trust and social and business identities in general, but for the purpose of encryption the problem can be simplified to checking the host name (it's effectively an out-of-band check that the destination you're talking to is the same destination that independent checks saw, so you know your connection hasn't been intercepted).
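
As a toy illustration of "the host name is the identity" (not a real verifier - clients delegate the full hostname-matching rules to their TLS library):

    // Toy illustration only. Real clients also handle IP addresses, name
    // constraints, internationalized names, etc.
    fn name_matches(cert_name: &str, host: &str) -> bool {
        if let Some(suffix) = cert_name.strip_prefix("*.") {
            // A wildcard covers exactly one extra label: "*.example.com"
            // matches "www.example.com" but not "example.com" or "a.b.example.com".
            match host.split_once('.') {
                Some((label, rest)) => !label.is_empty() && rest.eq_ignore_ascii_case(suffix),
                None => false,
            }
        } else {
            cert_name.eq_ignore_ascii_case(host)
        }
    }

    fn main() {
        let cert_names = ["example.com", "*.example.com"];
        let host = "www.example.com";
        let ok = cert_names.iter().any(|n| name_matches(n, host));
        println!("certificate valid for {host}: {ok}");
    }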

You can't have effective TLS encryption without verifying some identity, because you're encrypting data with a key that you negotiate with the recipient on the other end of the connection. If someone inserts themselves into the connection during key exchange, they will get the decryption key (key exchange is cleverly done so that a passive eavesdropper can't get the key, but it can't protect against an active eavesdropper — other than by verifying that the active participant is "trusted" in a cryptographic sense, not in a social sense).


I copy the same certbot account settings and private key to all servers and they obtain the certs themselves.

It is a bit funny that LetsEncrypt has non-expiring private keys for their accounts.


DANE is TLS with too-big-to-fail CAs that are tied to the top-level domains they own and can't be replaced.

Separation between CAs and domains allows browsers to get rid of incompetent and malicious CAs with minimal user impact.


DANE lets the domain owner manage the certificates issued for the domain.


This delegation doesn't play the same role as CAs in WebPKI.

Without DNSSEC's guarantees, the DANE TLSA records would be as insecure as self-signed certificates in WebPKI are.

It's not enough to have some certificate from some CA involved. It has to be a part of an unbroken chain of trust anchored to something that the client can verify. So you're dependent on the DNSSEC infrastructure and its authorities for security, and you can't ignore or replace that part in the DANE model.


Mixing colors in an "objective" way, like blur (lens focus), is a physical phenomenon and should be done in a linear color space.

Subjective things, like color similarity and perception of brightness, should be evaluated in perceptual color spaces. This includes sRGB (it's not very good at it, but it's trying).

Gradients are weirdly in the middle. Smoothness and matching of colors are very subjective, but color interpolation is mathematically dubious in most perceptual color spaces, because √((a+b)/2) ≠ (√a + √b)/2.
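
A small sketch of why the space matters, using the standard sRGB transfer functions: the midpoint of a black-to-white gradient comes out as a visibly different gray depending on whether you average the encoded values or average in linear light.

    // Midpoint of a gradient between two gray levels, computed two ways:
    // by interpolating the sRGB-encoded values directly, and by converting
    // to linear light, interpolating there, and encoding back.
    fn srgb_to_linear(c: f32) -> f32 {
        if c <= 0.04045 { c / 12.92 } else { ((c + 0.055) / 1.055).powf(2.4) }
    }

    fn linear_to_srgb(c: f32) -> f32 {
        if c <= 0.0031308 { c * 12.92 } else { 1.055 * c.powf(1.0 / 2.4) - 0.055 }
    }

    fn main() {
        let (a, b) = (0.0_f32, 1.0_f32); // black and white, as sRGB values

        // Naive: average the encoded values.
        let mid_encoded = (a + b) / 2.0;

        // Physical: average in linear light, then encode for display.
        let mid_linear = linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) / 2.0);

        // ~0.5 vs ~0.735: the two "midpoints" are clearly different shades of gray.
        println!("encoded avg = {mid_encoded:.3}, linear avg = {mid_linear:.3}");
    }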

