Interesting. It looks like the join isn't pipelined: the entire right-hand side is evaluated synchronously, so the join operator has to wait for the complete right-hand result set instead of consuming it as it streams in concurrently. I'm surprised anyone would do it this way in Java, which has good support for concurrency.
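To make "pipelined" concrete, here is a rough sketch of the shape I have in mind. It is not taken from elasticsearch-sql, and every name in it (PipelinedJoinSketch, Row, join) is made up for illustration. It assumes the left side is small enough to hash in memory and that the right side arrives in batches from some iterator. A producer thread pushes batches into a bounded queue while the caller probes the hash table as each batch arrives, so the join never waits for the full right-hand result set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PipelinedJoinSketch {
    /** Hypothetical row type: an integer join key plus a payload. */
    public static final class Row {
        final int key;
        final String value;
        public Row(int key, String value) { this.key = key; this.value = value; }
    }

    // Sentinel batch marking the end of the right-hand stream (compared by identity).
    private static final List<Row> END = new ArrayList<>();

    /** Equi-join on key; emits "leftValue|rightValue" for each match. */
    public static List<String> join(List<Row> left, Iterator<List<Row>> rightBatches)
            throws Exception {
        // Build phase: hash the (assumed smaller) left side.
        Map<Integer, List<String>> table = new HashMap<>();
        for (Row row : left) {
            table.computeIfAbsent(row.key, k -> new ArrayList<>()).add(row.value);
        }

        // Bounded queue: the producer blocks when the consumer falls behind,
        // so only a few batches are buffered at any time.
        BlockingQueue<List<Row>> queue = new ArrayBlockingQueue<>(4);
        ExecutorService producer = Executors.newSingleThreadExecutor();
        Future<Object> fetch = producer.submit(() -> {
            try {
                while (rightBatches.hasNext()) {
                    queue.put(rightBatches.next());
                }
            } finally {
                queue.put(END); // always unblock the consumer
            }
            return null;
        });

        // Probe phase: runs concurrently with the fetch above.
        List<String> out = new ArrayList<>();
        try {
            for (List<Row> batch = queue.take(); batch != END; batch = queue.take()) {
                for (Row r : batch) {
                    for (String l : table.getOrDefault(r.key, Collections.<String>emptyList())) {
                        out.add(l + "|" + r.value);
                    }
                }
            }
            fetch.get(); // rethrows any failure from the producer
        } finally {
            producer.shutdownNow();
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        List<Row> left = Arrays.asList(new Row(1, "a"), new Row(2, "b"));
        List<List<Row>> right = Arrays.asList(
                Arrays.asList(new Row(1, "x"), new Row(2, "y")),
                Arrays.asList(new Row(3, "q"), new Row(1, "z")));
        System.out.println(join(left, right.iterator())); // prints [a|x, b|y, a|z]
    }
}
```

The bounded queue is the part that matters: it provides backpressure, so memory use on the client is capped at a handful of batches no matter how large the right-hand side is.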
Edit: Actually, the file you linked to was a test file. The hash join code is here [1]; it uses ES' scrolling feature to join incrementally, though it still isn't pipelined. I'm not sure scrolling is entirely appropriate for this, since it can potentially hold an unpredictable amount of memory on the server side.
[1] https://github.com/NLPchina/elasticsearch-sql/blob/5cd6ab639...