Setting aside GC, Nailgun (JDK <= 8?) and Drip already solved the short-running-VM problem. This is often how CLI tools like JRuby, Ant, Maven, sbt, etc. are sped up.
The post misleads readers into thinking that the JVM runs the GC before exit. It does not.
When I was writing the Epsilon JEP, I meant that it might be futile to have a hundreds-of-ms-long GC cycle when the program exits very soon anyway, and the heap would be abandoned wholesale. The important bit of trivia is that GC might be invoked long before 'the whole memory' is exhausted. There are several reasons to do this: learning the application profile to size up generations or collection triggers, minimizing the startup footprint, etc. A GC cycle can then be seen as an upfront cost that pays off in the future. With an extremely short-lived job, that future never comes.
Contrived example:
$ cat AL.java
import java.util.*;

public class AL {
    public static void main(String... args) throws Throwable {
        List<Object> l = new ArrayList<>();
        for (int c = 0; c < 100_000_000; c++) {
            l.add(new Object());
        }
        System.out.println(l.size());
    }
}
$ javac AL.java
Ooof, 12.5 seconds to run, and about 2 CPU-minutes taken with Parallel:
$ time jdk11.0.5/bin/java -XX:+UnlockExperimentalVMOptions -Xms3g -Xmx3g -XX:+UseParallelGC -Xlog:gc AL
[0.015s][info][gc] Using Parallel
[0.988s][info][gc] GC(0) Pause Young (Allocation Failure) 768M->469M(2944M) 550.699ms
...
[12.281s][info][gc] GC(3) Pause Full (Ergonomics) 1795M->1615M(2944M) 7660.045ms
100000000
real 0m12.464s
user 1m53.618s
sys 0m1.087s
Much better with G1, but we still took 11 cycles that accrued enough pauses to affect the end-to-end timing. Plus GC threads took some of our precious CPU.
$ time jdk11.0.5/bin/java -XX:+UnlockExperimentalVMOptions -Xms3g -Xmx3g -XX:+UseG1GC -Xlog:gc AL
[0.031s][info][gc] Using G1
[0.452s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 316M->314M(3072M) 124.119ms
...
[2.518s][info][gc] GC(11) Pause Young (Normal) (G1 Evacuation Pause) 2321M->2324M(3072M) 79.496ms
100000000
real 0m2.953s
user 0m16.880s
sys 0m0.872s
Now Epsilon: whoosh, 1.5s end-to-end, and less than 1s of user time, which is probably just the single running Java thread itself, plus some OS memory management on the allocation path.
$ time jdk11.0.5/bin/java -XX:+UnlockExperimentalVMOptions -Xms3g -Xmx3g -XX:+UseEpsilonGC -Xlog:gc AL
[0.004s][info][gc] Using Epsilon
...
[1.387s][info][gc] Heap: 3072M reserved, 3072M (100.00%) committed, 2731M (88.93%) used
real 0m1.480s
user 0m0.830s
sys 0m0.699s
You might think fully concurrent GCs would solve this, and they partially do, by avoiding large pauses. But they still eat CPU. For example, while Shenandoah is close to Epsilon, doing the whole thing in about 1.7s of wall-clock time, it still takes quite significant CPU time. That benefit is only there because the machine has spare CPUs to offload the work to.
$ time jdk11-shenandoah/bin/java -XX:+UnlockExperimentalVMOptions -Xms3g -Xmx3g -XX:+UseShenandoahGC -Xlog:gc AL
[0.009s][info][gc] Using Shenandoah
...
[0.913s][info][gc] Trigger: Learning 3 of 5. Free (1651M) is below initial threshold (2150M)
[0.913s][info][gc] GC(2) Concurrent reset 1265M->1267M(3072M) 0.689ms
[0.914s][info][gc] GC(2) Pause Init Mark 0.111ms
[1.276s][info][gc] GC(2) Concurrent marking 1267M->1925M(3072M) 361.985ms
[1.306s][info][gc] GC(2) Pause Final Mark 0.465ms
[1.306s][info][gc] GC(2) Concurrent cleanup 1924M->1748M(3072M) 0.171ms
real 0m1.761s
user 0m5.688s
sys 0m0.633s
Perhaps there are objects that depend on the finalizer callback for correctness. I have seen people use finalizers to do things like close file handles, and if close is never called, there is presumably no guarantee the data is persisted.
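To make that concrete, here is a minimal sketch of that anti-pattern (`LazyLog` is a made-up class, for illustration): the close, and therefore the flush of buffered data, happens only in `finalize()`, which a GC that never runs will never trigger.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical example of the anti-pattern: cleanup deferred to finalize().
// Under Epsilon the GC never runs, so finalize() is never called, the
// buffer is never flushed, and the data may never reach disk before exit.
public class FinalizerClose {
    static class LazyLog {
        private final OutputStream out;

        LazyLog(String path) throws IOException {
            out = new BufferedOutputStream(new FileOutputStream(path));
        }

        void write(String line) throws IOException {
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
        }

        @Override
        protected void finalize() throws Throwable {
            out.close(); // flushes the buffer -- but only if the GC ever runs
        }
    }

    public static void main(String[] args) throws IOException {
        LazyLog log = new LazyLog("lazy.log");
        log.write("hello");
        // No explicit close: persistence silently depends on GC behavior.
    }
}
```

With any real collector this is merely fragile; with Epsilon it is guaranteed broken, since `finalize()` can never be invoked.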
It's not an issue? It's one of the cases where it does make sense to use Epsilon, as the heap is reclaimed anyway on program exit.
From the post:
> There is a strong temptation to use Epsilon on deployed programs, rather than to confine it to performance tuning work. As a rule, the Java team discourages this use, with two exceptions. Short-running programs, like all programs, invoke the garbage collector at the end of their run. However, as JEP 318 explains, “accepting the garbage collection cycle to futilely clean up the heap is a waste of time, because the heap would be freed on exit anyway.”
Memory might need to be cleaned up if the program was being run embedded in something else (it's not unheard of to embed JVMs inside e.g. C++ applications, and it's very common in scripting languages to do this).
Additionally, global destructors, while not guaranteed, can be very helpful if you let them run rather than just exiting and letting the system clean up file descriptors. For example, a clean disconnect from a database is often faster overall (on the database side, e.g. freeing up a connection slot) than a dirty "client hasn't phoned in for a while/received unexpected FIN" disconnect via hard exit.
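That clean disconnect doesn't need the GC at all, though: a JVM shutdown hook runs deterministically on normal exit, regardless of which collector (if any) is in use. A minimal sketch, where `Db` is a made-up stand-in for a real database client:

```java
// Sketch: deterministic cleanup on normal exit via a shutdown hook,
// independent of GC behavior. "Db" is a hypothetical stand-in for a
// real database client.
public class CleanExit {
    static class Db {
        void disconnect() { System.out.println("clean disconnect"); }
    }

    public static void main(String[] args) {
        Db db = new Db();
        // Hook runs on normal exit and on SIGTERM, but not on SIGKILL.
        Runtime.getRuntime().addShutdownHook(new Thread(db::disconnect));
        System.out.println("work done");
    }
}
```

The hook fires as the JVM shuts down, so the disconnect happens exactly once per run rather than whenever (if ever) a finalizer gets around to it.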
> Memory might need to be cleaned up if the program was being run embedded in something else
Just unmap the heap pages. Don't run the GC!
> global destructors, while not guaranteed, can be very helpful if you let them run
If you want them to run on exit then you want Runtime.runFinalizersOnExit, not the GC. Finalizers are non-deterministic, asynchronous, and would take an indefinite number of GC cycles to run them for all objects.
I think the concern is that those resources might be external, and not cleaning them up correctly leaves them in an inconsistent state. Not saying this is best practice, but I've seen it done.
Finalisers are not guaranteed to be called by the GC in theory, and in practice they run asynchronously even when they are going to be called, so they aren't likely to have run if you GC and then exit.
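This is easy to demonstrate: even a System.gc() right before exit gives no guarantee that pending finalizers have actually run, because finalization is queued and executed asynchronously on a separate finalizer thread. A minimal sketch:

```java
// Demonstrates that finalization is asynchronous: after System.gc(),
// the finalizer for the unreachable object may still be sitting in the
// finalization queue, unexecuted, when we reach the println.
public class FinalizeRace {
    static volatile boolean finalized = false;

    static class Tracked {
        @Override
        protected void finalize() {
            finalized = true;
        }
    }

    public static void main(String[] args) {
        new Tracked();   // immediately unreachable
        System.gc();     // requests a collection; finalization is still async
        // Often still false here: the finalizer thread may not have run yet.
        System.out.println("finalized yet? " + finalized);
    }
}
```

Whether the flag is set by the time of the println is timing-dependent, which is exactly the point: a GC-then-exit sequence gives no ordering guarantee for finalizers.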
I agree with you that it's not reliable. I just remember the Rust community going through the same kerfuffle not too long ago, over their Drop trait not being guaranteed to run.