OK, the tradeoff they're making that I missed is that it's not a compacting collector. So eventually your heap can fragment to the point where allocation gets expensive or impossible. Unusual design choice.
Unlike Java, Go has first-class value types, and developers can control memory layout. That means far fewer objects on the heap and more compact layouts, both of which lead to far less fragmentation. As you can see here, Go apps use quite a bit less memory than Java:
https://benchmarksgame.alioth.debian.org/u64q/go.html
Unfortunately it's impossible to reliably measure the memory usage of Java that way because the JVM will happily prefer to keep allocating memory from the OS rather than garbage collect. It makes a kind of sense: GC has a CPU cost that gets lower the more memory is given to the heap, so if you have spare memory lying around, may as well deploy it to make things run faster.
Of course that isn't always what you want (e.g. desktop apps) ... sometimes you'd rather spend the CPU and minimise the heap size. The latest Java versions on some platforms will keep track of total free system RAM and if some other program is allocating memory quickly, it'll GC harder to reduce its own usage and give back memory to the OS.
In the benchmarks game I suspect there aren't any other programs running at the same time, so Java will go ahead and use all the RAM it can get. Measuring it therefore won't give reasonable results as the heap will be full of garbage.
Value types don't have much to do with fragmentation. If anything they make it worse, because embedding a value type into a larger container type means larger allocations, which are harder to satisfy once fragmentation gets serious. But ultimately a similar amount of data is going to end up in the heap no matter what. Yes, you can save some pointers and some object headers, so it'll be a bit less. But not so much that it solves fragmentation.
You can't really compare total memory usage of a JIT to total memory usage of an AOT compiler that way if what you're trying to show is that value types reduce memory usage.
Also, I suspect that the fact that JVMs use a generational GC (and a compacting GC) blows everything else out of the water when it comes to fragmentation. There's no way a best-fit malloc implementation can possibly beat bump allocation in the nursery for low fragmentation.
Those default memory use measurements are just a way to check whether a particular 100-line toy benchmark program has been written to exploit a time/space trade-off.