
This is interesting, but I'm not sure it actually makes much difference for realistic workloads. ffmpeg probably has the most link-time dynamic linking of any Linux package, and ffmpeg -loglevel quiet takes only ~40 ms on Alpine and ~60 ms on Debian. Other programs using ffmpeg tend to either statically link it (Chromium) or link it at run-time (Firefox, most video editors), neither of which would be improved by this optimization.

Would it be nice to shave 60 ms off of every ffmpeg/mpv invocation? In isolation, sure, but considering the maintenance burden and potential inconsistencies, I don't think it's worth it. Nix is supposed to ensure that the dependencies are always the same, but currently, if something breaks somehow, either the wrong version will be loaded or an error will be emitted, whereas with this optimization it will crash or silently invoke the wrong functions, which seems extremely difficult to debug.
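
(For anyone who wants to reproduce a rough version of those startup numbers, something along these lines is what I mean; hyperfine is just one way to do it, and the exact flags are only an example:)

  # warm-cache startup time of the ffmpeg binary itself
  hyperfine --warmup 3 'ffmpeg -loglevel quiet -version'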


my thinking:

40 ms might matter if you consider how many times a process restarts, how many instances you run, and how many hosts you run them on. That could add up to considerable wasted cycles in aggregate (e.g. for a cloud provider).

Also, it's not only a function of how many shared libraries are involved, but of how many symbols each one exports, and how long those symbol names are.
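
(A quick way to get a feel for those numbers, if you're curious; the library path is just an example and varies by distro:)

  # number of dynamic symbols exported by one of ffmpeg's libraries
  nm -D --defined-only /usr/lib/libavcodec.so | wc -l
  # shared libraries an ffmpeg binary pulls in at load time
  ldd "$(command -v ffmpeg)"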


> pre-diabetes [...] is rather close to the median level

but according to the NIH, 30.7% of American adults are overweight and another 42.4% are obese, so the median American is at least overweight. given that, it doesn't seem a stretch to claim that the median American is also pre-diabetic? I don't know whether it's true or not, but your evidence seems a bit thin.


According to the CDC, 38.0% of the adult US population is pre-diabetic. 11.6% of the total population have actual diabetes. It's very close to half.

https://www.cdc.gov/diabetes/data/statistics-report/index.ht...


> it doesn't seem a stretch to claim that the median American is also pre-diabetic

No, it's your evidence that's thin. You seem to have started from the premise ("Americans are unhealthy") and derived a pre-diabetes level from that.


You've literally provided zero evidence beyond hearsay from an alleged statistician friend.

The prediabetes level is drawn from where you see a huge uptick in the risks for developing diabetes mellitus.

You want data? There is plenty of it out there. If you're gonna try to dispute the established understanding, you need to bring evidence or you are just wasting people's time.


this is basically the "computer scientist" view of the world in https://ansuz.sooke.bc.ca/entry/23, especially the Monolith bit. spoofing UA could be legal in general or when done for interoperability but illegal when done with intent to bypass access control.


Right, the UA spoofing browsers perform is historically done to bypass sites not wanting to serve them content.


Well the UA spoofing I perform is done to bypass sites not wanting to serve me content.


Also, 4.7 seconds to read 1345 MB in 81k files is suspiciously slow. On my six-year-old low/mid-range Intel 660p with Linux 6.8, tar -c /usr/lib >/dev/null with 2.4 GiB in 49k files takes about 1.25s cold and 0.32s warm. Of course, the sales pitch has no explanation of which hardware, software, parameters, or test procedures were used. I reckon tar was tested with cold cache and pack with warm cache, and both are basically benchmarking I/O speed.
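
(For reference, the cold/warm distinction I mean is roughly this; dropping the page cache needs root, and the directory is just whatever large tree you have handy:)

  # cold cache: flush dirty pages and drop the page cache first
  sync; echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null
  time tar -c /usr/lib >/dev/null
  # warm cache: simply run it again, now served from RAM
  time tar -c /usr/lib >/dev/null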


The footnotes at the bottom say

> Development machine with a two-year-old CPU and NVMe disk, using Windows with the NTFS file system. The differences are even greater on Linux using ext4. Value holds on an old HDD and one-core CPU.

> All corresponding official programs were used in an out-of-the-box configuration at the time of writing in a warm state.


My apologies, the text color is barely legible on my machine. Those details are still minimal though; what versions of the software? How much RAM is installed? Why is 7-Zip set to maximum compression but zstd is not? Why is tar.zst not included, for a fair comparison of the Pack-specific (SQLite) improvements on top of the standard solution?


The machine has 32 GB of RAM, but that is far more than these tools need.

7-Zip was used like the others: I just gave it a folder to compress. No configuration.

As requested, here are some numbers on tar.zst of the Linux source code (the test subject in the note):

  tar.zst: 196 MB, 5420 ms (out-of-the-box config with -T0 to use all cores; without -T0 it would be 7570 ms)
  Pack:    194 MB, 1300 ms

Slightly smaller size, and more than 4X faster. (Again, that is on my machine; you need to try it for yourself.)

Honestly, ZSTD is great. Tar is slowing it down (because of its old design and being single-threaded), and it is done in two steps: first creating the tar, then compression. Pack does all the steps (read, check, compress, and write) together, and this interleaving is what achieves this speed and random access.


This sounds like a Windows problem, plus compression settings. Your wlog is 24 instead of 21, meaning decompression will use more memory. After adjusting those for a fair comparison, pack still wins slightly but not massively:

  Benchmark 1: tar -c ./linux-6.8.2 | zstd -cT0 --zstd=strat=2,wlog=24,clog=16,hlog=17,slog=1,mml=5,tlen=0 > linux-6.8.2.tar.zst
    Time (mean ± σ):      2.573 s ±  0.091 s    [User: 8.611 s, System: 1.981 s]
    Range (min … max):    2.486 s …  2.783 s    10 runs
   
  Benchmark 2: bsdtar -c ./linux-6.8.2 | zstd -cT0 --zstd=strat=2,wlog=24,clog=16,hlog=17,slog=1,mml=5,tlen=0 > linux-6.8.2.tar.zst
    Time (mean ± σ):      3.400 s ±  0.250 s    [User: 8.436 s, System: 2.243 s]
    Range (min … max):    3.171 s …  4.050 s    10 runs
   
  Benchmark 3: busybox tar -c ./linux-6.8.2 | zstd -cT0 --zstd=strat=2,wlog=24,clog=16,hlog=17,slog=1,mml=5,tlen=0 > linux-6.8.2.tar.zst
    Time (mean ± σ):      2.535 s ±  0.125 s    [User: 8.611 s, System: 1.548 s]
    Range (min … max):    2.371 s …  2.814 s    10 runs
   
  Benchmark 4: ./pack -i ./linux-6.8.2 -w
    Time (mean ± σ):      1.998 s ±  0.105 s    [User: 5.972 s, System: 0.834 s]
    Range (min … max):    1.931 s …  2.250 s    10 runs
   
  Summary
    ./pack -i ./linux-6.8.2 -w ran
      1.27 ± 0.09 times faster than busybox tar -c ./linux-6.8.2 | zstd -cT0 --zstd=strat=2,wlog=24,clog=16,hlog=17,slog=1,mml=5,tlen=0 > linux-6.8.2.tar.zst
      1.29 ± 0.08 times faster than tar -c ./linux-6.8.2 | zstd -cT0 --zstd=strat=2,wlog=24,clog=16,hlog=17,slog=1,mml=5,tlen=0 > linux-6.8.2.tar.zst
      1.70 ± 0.15 times faster than bsdtar -c ./linux-6.8.2 | zstd -cT0 --zstd=strat=2,wlog=24,clog=16,hlog=17,slog=1,mml=5,tlen=0 > linux-6.8.2.tar.zst
Another machine has similar results. I'm inclined to say that the difference is probably mainly related to tar saving attributes like creation and modification time while pack doesn't.

> it is done in two steps: first creating tar and then compression

Pipes (originally from Unix, later copied by MS-DOS) run the producer and consumer in parallel on Unix-like systems, not sequentially. That lets them process arbitrarily large files in a small amount of memory without slow intermediate buffering.
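
(To be concrete, the "two steps" only happen if you materialize an intermediate tar file; the usual piped invocation streams everything in one pass. Filenames below are just an example:)

  # two-step: writes linux.tar to disk, then compresses it
  tar -cf linux.tar ./linux-6.8.2 && zstd -T0 linux.tar
  # piped: tar and zstd run concurrently, nothing hits the disk in between
  tar -c ./linux-6.8.2 | zstd -cT0 > linux.tar.zst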


Thank you for the new numbers. Sure, results can differ between machines, especially on full systems. For me, on Linux with ext4, Pack finishes the Linux code base in just 0.96 s.

Anyway, I do not expect an order of magnitude difference between tar.zst and Pack; after all, Pack is using Zstandard. What makes Pack fundamentally different from tar.zst is Random Access and other important factors like user experience. I shared some numbers on it here: https://news.ycombinator.com/item?id=39803968 and you are encouraged to try them for yourself. Also, by adding Encryption and Locking to Pack, Random Access will be even more beneficial.


HDD for testing is a pretty big caveat for modern tooling benchmarks. Maybe everything holds the same if done on a SSD, but that feels like a pretty big assumption given the wildly different performance characteristics between the two.


the opposite is the case. from your link: "Archive.is’s owner is intentionally blocking 1.1.1.1 users"


The archive owner wants takedown requests to have to cross borders, so he wants to know where each request comes from in order to serve it from a server in a different country.

Cloudflare's resolver doesn't pass along the DNS extension (EDNS Client Subnet) that allows that. If you don't care, you can set your DNS to bypass Cloudflare for those domains only.
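
(e.g. with dnsmasq, a per-domain override looks something like this; the upstream resolver is just an example:)

  # /etc/dnsmasq.conf: send only the archive.today domains to a non-Cloudflare resolver
  server=/archive.is/8.8.8.8
  server=/archive.today/8.8.8.8
  server=/archive.ph/8.8.8.8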


I’ve known this for a while now but it never ceases to amaze me. This must be up there with “no copyright intended” in terms of misguided compliance strategies.


It's always appeared to be a matter of perspective, to me.

That being said, by "Cloudflare has issues with archive.is" I very literally meant that they have issues with the DNS records served to them by Archive.is. (i.e. They do not support EDNS.)


is >99% of all food consumed daily privately owned and sold by a single company?


> 25% of US/Canada trade depends on a single privately owned bridge

Where are you taking this analogy?


For the region it is 99% of the traffic; nobody will drive for half a day to avoid the bridge. But when the alternative opens (and it is not a toll bridge), you can expect traffic on the privately owned bridge to drop like a stone, and it will likely eventually become public property.


Why? The toll rates don't seem out of line with similar publicly financed bridges. https://www.ambassadorbridge.com/auto-toll-rates/


They are if they are the result of an artificial monopoly and there is an alternative on the drawing board with better rates. In general, infrastructure should be free to use; tolls are friction on commerce and development.


your idea also doesn't work with live streaming, and may not work with inter-frame filters (depending on implementation). nonetheless, this already exists with those limitations: av1an and, I believe, vapoursynth work more or less the way you describe, except you don't actually need to load every chunk into memory, only the current frames. as I understand it, this isn't a major priority for mainstream encoding pipelines because gop/chunk threading isn't massively better than intra-frame threading.
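
(if anyone wants to play with the chunked approach by hand, here's a crude sketch of roughly what av1an automates; caveats: -c copy can only split at existing keyframes, and the encoder/flags here are just examples:)

  # split the source into ~4-second chunks at keyframes, without re-encoding
  ffmpeg -i input.mkv -map 0:v -c copy -f segment -segment_time 4 chunk_%03d.mkv
  # encode the chunks in parallel, one encoder process per core (audio dropped for simplicity)
  ls chunk_*.mkv | xargs -P "$(nproc)" -I{} ffmpeg -i {} -c:v libx264 -an enc_{}
  # stitch the encoded chunks back together
  for f in enc_chunk_*.mkv; do echo "file '$f'"; done > list.txt
  ffmpeg -f concat -i list.txt -c copy output.mkv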


It can work with live streaming; you just need to add N keyframes of latency. With low-latency livestreaming, keyframes are often close together anyway, so adding, say, 4s of latency to get 4x encoding speed may be a good tradeoff.


Well, you don't add 4s of latency for 4x encoding speed though. You add 4s of latency for very marginal quality/efficiency improvement and significant encoder simplification, because the baseline is current frame-parallel encoders, not sequential encoders.

Plus, computers aren't quad cores any more; people with powerful streaming rigs probably have 8 or 16 cores, and key frames aren't every second. Suddenly you're in this hellish world where you have to balance latency, CPU utilization and encoding efficiency. 16 cores at a not-so-great 8 seconds of extra latency means terrible efficiency with a key frame every 0.5 seconds. 16 cores at good efficiency (say, 4 seconds between key frames) means a terrible 64 seconds of extra latency.
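
(Back-of-the-envelope version of that tradeoff, under the one-chunk-per-core assumption above:)

  # extra latency = cores you want to keep busy x keyframe interval
  cores=16; keyint=4   # seconds between key frames
  echo "$((cores * keyint)) seconds of extra latency"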


You can pry vp8 out of my cold dead hands. I'm sorry, but if it takes more than 200ms including network latency, it is too slow; and video encoding is extremely CPU intensive, so exploding your cloud bill is easy.


4s of latency is not acceptable for applications like live chat


As I said, "may be". "Live" varies hugely with different use cases. Sporting events are often broadcast live with 10s of seconds of latency. But yes, if you are talking to a chat in real-time a few seconds can make a huge difference.


Actually, not only does it work with live streaming, it's not an uncommon approach in a number of live streaming implementations*. To be clear, I'm not talking about low latency stuff like interactive chat, but e.g. live sports.

It's one of several reasons why live streams of this type are often 10-30 seconds behind live.

* Of course it also depends on where in the pipeline they hook in - some take the feed directly, in which case every frame is essentially a key frame.


> except you don't actually need to load every chunk into memory, only the current frames.

That's a good point. In the general case of reading from a pipe you need to buffer it somewhere. But for file-based inputs the buffering concerns aren't relevant, just the working memory.


> dishwasher soap used to be a lot more effective before they removed phosphates

was it? https://www.consumerreports.org/media-room/press-releases/20... says that several phosphate-free formulations were "Very Good".

> LED bulbs have been associated with migraine symptoms, and are all around terrible for reading.

some cheap LED bulbs have flicker issues, but they're generally better than fluorescent bulbs, which actually contain a horrible chemical (mercury).

> Let's not forget the clusterfuck that is lead-free solder and just ask yourself why it's banned from aircraft and space applications where reliability is of utmost importance.

is it? a Google search for "avionics lead solder" finds Boeing saying in 2005 that "consumer electronic industry trends will force aerospace to adapt to an evolving lead-free transition", a "US Tech" saying "The global aircraft and aerospace market is moving toward 100 percent lead-free solder", an "AIM Solder" which "offers many tin-lead & lead-free RMA products suitable for the military & aerospace sector", and one source saying "Tin-lead alloy solder [...] has been used to assemble the avionics of every aircraft currently flying", by... "Lead Matters".

while environmentally-friendly replacements sometimes have downsides, categorically painting them as a lot/infinitely/terribly/clusterfuck worse is just "gubmint takin away our freedoms". yes, lead-free gasoline did require some engine design improvements, but it would be insane and downright inhumane to keep using leaded gasoline for a small increase in octane rating.


at time of writing, this says NO, but according to the text, "uBlock Origin could still work on YouTube. Not all YT updates are targeted against uBO's solutions." so shouldn't it say MAYBE?


Just under that it says

> A different ID doesn't always mean the detection will occur.

So that definitely seems to be the case, according to everything except the "NO".


you could also say that "supports linux" allowed them to discover and remedy this vulnerability, which would likely have still existed without linux support, thus making the device stronger for both windows and linux users.

