I once again feel that a comparison to humans is fitting.
We are also "trained" on a huge amount of input over a large amount of time.
We will also try to guess the most natural continuation of our current prompt (setting). When asked about things, I can at times hallucinate things I was certain were true.
It seems very natural to me that large advances in reasoning and logic in AI should come at the expense of output predictability and absolute precision.
The comparison is flawed though in that humans and LLMs make mistakes for different reasons.
Humans forget things. Humans make errors. But humans' train of thought isn't impacted by an errant next token in the statement they're making. We have thoughts which exist, complete, prior to our "emitting" them, just as a multilingual speaker does not have thoughts exclusive to the language they're speaking in (even if that language gives them tools to think a certain way).
This is obvious if you consider different types of symbolic languages, such as sign language. Children can learn sign language before they are verbal. The ideas they already have are not affected by the next sign they make: children actually know things independent of the symbolic representation they choose to use.
How is this a Debian maintainer going rogue?
Maintainers making changes to default package build flags is quite normal and should be expected.
These changes were even requested in the public bug tracker.
I never claimed it was robust (I made this project in an hour after a beer), just that it worked.
Mana costs, both on the card and in the rules text (e.g. Ward 2 should be Ward {2}), seem to be an issue, and I'm curious as to why. I may have to experiment more with few-shot examples.
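To make that concrete, here's roughly what I mean by few-shot examples -- a minimal sketch assuming an OpenAI-style chat API; the model name, prompt wording, and card text are placeholders, not what the project actually uses:

```python
# Hypothetical few-shot prompt nudging the model to brace mana costs
# everywhere, including inside rules text (Ward 2 -> Ward {2}).
from openai import OpenAI

client = OpenAI()

FEW_SHOT = [
    {"role": "system",
     "content": "Transcribe the Magic card. Wrap every mana cost in braces, "
                "both in the cost line and inside rules text."},
    {"role": "user", "content": "Example Creature, 3WW, Flying. Ward 2"},
    {"role": "assistant", "content": "Example Creature, {3}{W}{W}, Flying. Ward {2}"},
]

def transcribe(card_text: str) -> str:
    # Send the few-shot examples followed by the card to normalize.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=FEW_SHOT + [{"role": "user", "content": card_text}],
    )
    return resp.choices[0].message.content
```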
This actually sounds like a solid move. There is obviously a huge market for a porn-, nudity- and kink-related community, as proven by the decline of Tumblr. I guess most US companies are too prudish to capitalize on the opportunity.
I guess the hard part is compartmentalizing the content spheres and public image perception.
There is no money there. Advertisements won't be shown next to those posts -- or at least non-adult brands (Apple, Microsoft, BMW, etc.) won't want their ads shown next to NSFW content (if they are still doing business there at all). Twitter/X is creating infrastructure and providing bandwidth for content that will barely give them any revenue in return.
Plus, companies will think more carefully before showing ads on Twitter/X -- even if their ads aren't shown next to NSFW content, if the platform is lax on moderation (non-consensual content etc) they likely don't want their brands to be associated with such a website.
Well, unless people who view a lot of NSFW content are also otherwise active Twitter users who would click on "normal" ads. But I don't think that is going to happen.
I strongly dislike swap on servers. I can understand some use cases on laptops and in one-off situations.
I would much rather have an application get killed by the OOM killer than have it swap. Swapping absolutely kills performance. Not having enough RAM is a faulty state, and swapping hides that from admins, resulting in hard-to-debug issues. The OOM killer leaves handy logs; swapping just degrades your service and needs to be correlated with RAM usage metrics.
My experience is also that swap will be used no matter how low (or was it high?) you set the swappiness number if the memory throughput is high enough, even if there is enough RAM available.
We live in a world where you are charged per megabyte of RAM you allocate to your VMs. Sure, they have the occasional peak that lasts for a few seconds, but if you provision RAM for that peak it's costing you money. The cheap way out is to give them swap.
My rule of thumb is that on an average load there should be no swapping, meaning that vmstat or whatever should be showing mostly 0's in the swap columns. That doesn't mean it has 0 bytes in swap; in fact it probably is using some swap. It means nothing in swap is in the working set. For example, the server I'm looking at now is showing 0 swap activity, has 2GB of RAM and is using 1.3GB of swap. When a peak hits you will get some small delays of course, but it's likely no one will notice.
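For what it's worth, here's that rule of thumb in script form -- a rough sketch reading /proc/vmstat (pswpin/pswpout are the standard Linux counters; the 10-second window is arbitrary):

```python
#!/usr/bin/env python3
"""Rough check of swap *activity* (pages swapped in/out per interval),
which matters more than how many bytes happen to sit in swap."""
import time

def swap_counters():
    counters = {}
    with open("/proc/vmstat") as f:
        for line in f:
            key, value = line.split()
            if key in ("pswpin", "pswpout"):
                counters[key] = int(value)
    return counters

before = swap_counters()
time.sleep(10)
after = swap_counters()

swapped_in = after["pswpin"] - before["pswpin"]
swapped_out = after["pswpout"] - before["pswpout"]

# On an average load both deltas should be ~0 even if `free` shows
# gigabytes sitting in swap; sustained non-zero values mean the working
# set no longer fits in RAM.
print(f"pages swapped in: {swapped_in}, out: {swapped_out} over 10s")
```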
Swap (paging) can also help performance. It exists for a reason. Having metrics on paging helps you tune your application, so it is also good for observability. It is a feature that can be misused, but it is not a good recommendation to turn it off without knowledge of the specific situation.
None of my laptops/desktops/gaming rigs or servers have swap. Swapping, combined with a managed language and garbage collection, is next to indistinguishable from an application crash.
It is; try running a stop-the-world type of garbage collection along with pages being swapped in and out, etc.
> cache impacted by those applications’ file I/O
Which cache? The disk cache depends on the available memory, with pretty much all free memory being used as disk cache. In the case of swapping, there is effectively no disk cache left.
That is not how paging works. The swap area is also a cache in every sense of the word. And the kernel will swap out pages that are clearly accessed less frequently than other pages, even if those other pages belong to the buffer cache. Used like that, swapping is a way of getting more disk cache.
Generally speaking, systems do not swap out pages only under memory pressure. That design would be ineffective. When memory pressure is high enough, you've already lost.
I can see the benefit of not having swap in a server scenario, but to offer a counterpoint: it seems like IT likes to under-allocate servers by something like 25%, so if you have a server with 256GB of RAM, by design it should never use more than 192GB. That’s a lot of RAM going to waste for the off chance that usage jumps above 75-80%.
I think I would rather have the server’s SSD be an Optane drive (or some other high-endurance flash memory) with a swap partition, and use some other means of monitoring and being alerted to high memory usage.
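Something along these lines is what I have in mind for the monitoring side -- just a sketch reading /proc/meminfo; the 80% threshold and the print-as-alert are placeholders for whatever alerting stack you already run:

```python
#!/usr/bin/env python3
"""Alert when memory usage crosses a threshold instead of relying on swap
to hide it. Threshold and alerting mechanism are placeholders."""

def meminfo():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":")
            info[key] = int(value.split()[0])  # values are in kB
    return info

m = meminfo()
used_fraction = 1 - m["MemAvailable"] / m["MemTotal"]

if used_fraction > 0.80:  # arbitrary threshold
    # Replace with whatever you use for alerting (Prometheus, email, ...).
    print(f"WARNING: memory usage at {used_fraction:.0%}")
```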
A lot of swap on servers is bad, but a little bit is fine and can actually be helpful in certain scenarios. I've seen servers with 256GB of RAM configured with <1GB of swap, so in case of a runaway process the swap fills up very quickly and doesn't delay the OOM killer, but it still helps the Linux memory manager run more optimally.
A process randomly dying is a faulty state. Making malloc fail and having programs respond appropriately is far preferable to just randomly trashing processes.
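To illustrate "respond appropriately" -- a Python stand-in for checking malloc's return value in C; purely a sketch, and with Linux overcommit enabled the failure may not surface at allocation time anyway:

```python
def allocate_buffer(n_bytes: int) -> bytearray | None:
    """Return None on allocation failure so the caller can degrade
    (smaller batch, spill to disk) instead of being killed at random."""
    try:
        return bytearray(n_bytes)
    except MemoryError:
        return None

buf = allocate_buffer(64 * 1024**3)  # deliberately huge
if buf is None:
    print("allocation failed, falling back to a smaller batch size")
```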
Replace browser with operating system or computer and expand extensions to user installable programs and it mostly still rings true.
I believe users should be empowered to modify their installed applications as they see fit.
It doesn't ring true for installed software anymore — "virus scanners" have gotten to the point where they just work for most people, desktop software is more difficult to develop (for your average hacker wannabe), more difficult to get users to install, and has far less valuable data to go after.
I actually very much like Apple's approach to browser extensions: forcing them to be truly installed software, within the purview of the tools that protect the rest of the system.
The Chrome browser extension ecosystem is perfectly fine in theory but suffers from reinventing installed software without taking any of the lessons we've learned about OS software. Nice cautionary tale but the web is different.
On a typical PC, installed software has even more permissions than a browser extension, and all any malware author has to do is write their own keylogger or upload the browser cookie database. Sure, it's a little more effort, but I think the only real advantage that malicious browser extensions have over native programs is the discoverability and auto-update Google and Mozilla give them "for free".
I don't know; it would be simple enough to catch, but it would also flag access by file managers. Probably the only way to know is to test. Generally I've found that writing malware from scratch is enough to get it past AV, but I only tested on what I had installed.
> It doesn't ring true for installed software anymore — "virus scanners" have gotten to the point where they just work for most people
... by allowing software from big corporations no matter how user-hostile it is, while randomly flagging/deleting harmless software made by individuals/smaller groups who have not paid the protection racket.
The AV industry is a scam.
> desktop software is more difficult to develop (for your average hacker wannabe)
Desktop software can be written in the same languages as webshit and more.
> and has far less valuable data to go after
All data available in browsers is also available to native programs running alongside them.
Cool project. Very happy that you took the time to package it. I never use CLI tools I can't install with apt because I log into a lot of servers. Hope I still remember this tool once it leaves sid and gets to stable.