I have a moderately sized legacy project where I need to migrate tests from Enzyme to React Testing Library (RTL). Probably 150+ test files, each containing upwards of 10 test cases.
While I'm not using Copilot, I have a GPT-4o assistant with a system prompt, refined through trial and error, that converts a given test from Enzyme to RTL. There are certain scenarios where a given test cannot actually exist in RTL due to a difference in testing philosophy between the two frameworks, and there I have to make some judgment calls, but overall this is probably 10x faster than refactoring these tests by hand.
One of the important aspects of this, though, is that when I encounter a repeated failure from the LLM, I update the system prompt going forward. Even though this is a simple 1-shot approach, it still works well for a task like this.
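For a sense of the kind of conversion involved, here's a sketch (not from the original post; the `Counter` component and its markup are hypothetical, and this assumes a Jest project with `enzyme` and `@testing-library/react` installed):

```javascript
// Before (Enzyme): reaches into component internals.
import { mount } from 'enzyme';
import Counter from './Counter'; // hypothetical component

test('increments count', () => {
  const wrapper = mount(<Counter />);
  wrapper.find('button.increment').simulate('click');
  expect(wrapper.state('count')).toBe(1); // asserts on internal state
});

// After (RTL): asserts only on what the user can observe.
import { render, screen, fireEvent } from '@testing-library/react';

test('increments count', () => {
  render(<Counter />);
  fireEvent.click(screen.getByRole('button', { name: /increment/i }));
  expect(screen.getByText(/count: 1/i)).toBeInTheDocument();
});
```

The state assertion in the "before" version is an example of the philosophy gap: RTL deliberately offers no `state()` equivalent, so the test has to be reframed around observable output, or dropped.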
They don't think superintelligence will "always" be destructive to humanity. They believe that we need to ensure that a superintelligence will "never" be destructive to humanity.
I'm on the fence with this because it's plausible that some critical component of achieving superintelligence might be discovered more quickly by teams that, say, have sophisticated mechanistic interpretability incorporated into their systems.
A point of evidence in this direction is that RLHF was developed originally as an alignment technique and then it turned out to be a breakthrough that also made LLMs better and more useful. Alignment and capabilities work aren't necessarily at odds with each other.
Read through this last night. Loved the article! Blending the story of your personal experience with a loose, high-level tutorial was really interesting.
Strategic timing for the release of this paper. As of last week, OpenAI looks weak in its commitment to _AI Safety_, having lost key members of its Superalignment team.
I found using Nix package manager on my current daily-driver OS was a great way to break the ice. After translating my dotfiles to Nix and figuring out my project-specific development workflow I had given myself a strong foundation for NixOS.
Jumping into the deep end and going straight to daily-driving NixOS is certainly also a good option.
Are you able to access Claude 3 via AWS Bedrock or GCP Vertex AI? I haven't used Vertex AI, but I know that several US regions have Claude 3 access through Bedrock.
I asked GPT-4 to rewrite the refrain from Eminem's "Forgot about Dre" but change it to "Forgot about Tay" and make it all about chatbots... this is the best one it came up with:
Nowadays, every bot wanna chat
Like they got something to say, but these LLMs
Are too toxic to use, just a waste of GPUs
And the programmers act like they forgot about Tay.
> In a world of horrendously complex software developed by myriads of authors, be smart, use Nix
I mean, Nix is pretty complex software, and is an added layer of abstraction in many contexts. Framing Nix as a solution to complexity seems to be a tenuous claim.
What Nix can help with, imo, is reducing toil. And a good abstraction maintained by a team can reduce toil for a lot of others.
> Framing Nix as a solution to complexity seems to be a tenuous claim.
Disagree strongly. Building a large and varied set of software packages in a generalised and reproducible way is a fundamentally complex endeavour, and nixpkgs is the most successful effort I've encountered to simplify it.
In contrast, build & packaging systems that try to present apparent "simplicity" are usually ignoring a lot of the subtle difficulties in building software in a reproducible and robust manner, leading to extended and vague debugging sessions. See for instance almost any non-trivial Dockerfile-based build procedure.
I use Nix every day. I love it, but I'd be lying if I claimed it makes things less complex. I don't think that is very controversial. To build software using Nix you still need to understand how that software builds without Nix, plus you need to know some amount of Nix. If the abstraction were airtight, then I'd agree, but currently it is a very leaky abstraction. That doesn't mean it's bad, just a trade-off to consider.