The top comment in this thread [1] highlights (potential) problems with virtual threads, referring to this PDF [2]. Does anyone know whether these problems actually manifest in virtual threads as they are implemented?
Your second link is from 2018, and would indeed need a case study for Loom. For example, the concerns about thread-local storage are addressed, and the base overhead is understood (as in: don't use them for CPU-bound workloads; they fare better on IO-bound workloads).
Also, the case studies in that paper are platform-level, whereas I believe the language has to be involved, since a runtime has more information when resuming at a suspension point. The paper even acknowledges Go as a successful implementation, albeit with the caveat of call overhead for C compatibility. In the Java world, the vast majority of programs stay within the Java language, so I'd say that paper would list Loom as the best implementation (and maybe revise their recommendation; it would be useful to have that author's opinion of Loom in 2023).
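To make the IO-bound point concrete, here is a minimal sketch, assuming JDK 21 or later, where virtual threads are a final feature. The class name, method name, and task counts are made up for illustration, and Thread.sleep only stands in for a blocking IO call.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; assumes JDK 21+.
class VirtualThreadSketch {

    // Starts one virtual thread per task, each blocking for sleepMillis,
    // and returns how many tasks ran to completion.
    static int runBlockingTasks(int tasks, long sleepMillis) {
        AtomicInteger done = new AtomicInteger();
        // try-with-resources calls close(), which waits for submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        // Stand-in for blocking IO: the virtual thread is parked
                        // and its carrier thread is free to run other tasks.
                        Thread.sleep(Duration.ofMillis(sleepMillis));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    done.incrementAndGet();
                });
            }
        }
        return done.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int completed = runBlockingTasks(10_000, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(completed + " tasks in " + elapsedMs + " ms");
    }
}
```

Swapping the sleep for a tight CPU loop is the case the advice above warns about: there is nothing to park on, so the tasks just compete for the carrier threads.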
This is relevant to environments where user code is statically compiled into native code that depends on a thin runtime library and interacts more or less directly with C code. With a full-blown VM, many of these problems are less significant, since the internal thread state representation is non-native anyway and you can dynamically instrument pretty much whatever you want. One remaining issue is system calls and system library functions that have no non-blocking equivalent, but that can be handled either by running such calls on a separate OS-level thread or by simply ignoring the issue; real implementations use some mix of the two approaches.
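The "separate OS-level thread" option can be sketched at the application level with plain java.util.concurrent. This is only an illustration of the idea, not how any particular runtime does it internally. The pool size, the names, and legacyBlockingLookup are hypothetical stand-ins.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical demo class illustrating offloading of blocking calls.
class BlockingOffloadSketch {

    // Small pool of platform (OS-level) threads reserved for calls that
    // have no non-blocking equivalent. The size 4 is an arbitrary assumption.
    private static final ExecutorService BLOCKING_POOL =
            Executors.newFixedThreadPool(4, r -> {
                Thread t = new Thread(r, "blocking-offload");
                t.setDaemon(true); // do not keep the JVM alive for this pool
                return t;
            });

    // Runs the blocking call on the dedicated pool and returns a future,
    // so the caller's own thread is not tied up by the call itself.
    static <T> CompletableFuture<T> offload(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, BLOCKING_POOL);
    }

    // Stand-in for a blocking system library function.
    static String legacyBlockingLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key + " handled on " + Thread.currentThread().getName();
    }

    public static void main(String[] args) {
        String result = offload(() -> legacyBlockingLookup("host")).join();
        System.out.println(result);
    }
}
```

The "simply ignoring the issue" option corresponds to making the call directly and accepting that the underlying OS thread stays blocked for its duration.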
[1] https://www.reddit.com/r/rust/comments/xrrjec/virtual_thread...
[2] https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p13...