PRQL is a similar idea, and it compiles to SQL: https://prql-lang.org/ from invo...

snthpy · 2025-02-13T16:20:54 1739463654

SQL pipe syntax is a great step in the right direction but what I think sets PRQL apart (in my very biased opinion, contributor) is the ability to define functions.

Here's a simple example:

    # define the "take_smallest" function
    let take_smallest = func n col tbl<relation> -> (
     from tbl
     sort col
     take n
     )
    
    # find smallest 3 tracks by milliseconds
    from tracks
    take_smallest 3 milliseconds

You can try this now in the online playground: https://prql-lang.org/playground/

That's simple enough and there's not that much gained there but say you now want to find the 3 smallest tracks per album by bytes?

That's really simple in PRQL and you can just reuse the "take_smallest" function and pass a different column name as an argument:

    from tracks
    group album_id (
     take_smallest 3 bytes
     )

snthpy · 2025-02-13T16:23:50 1739463830

Here's a more complex example for sessionizing user events:

    # specify the target SQL dialect
    prql target:sql.duckdb
    
    # define the "sessionize" function
    let sessionize = func user_col date_col max_gap:365 tbl<relation> -> (  
      from tbl
      group user_col (
        window rows:-1..0 (
          sort date_col
          derive prev_date=(lag 1 (date_col|as date))
          )
      )
      derive {
        date_diff = (date_col|as date) - prev_date,
        is_new_session = case [date_diff > max_gap  || prev_date==null => 1, true => 0],
        }
      window rows:..0 (
        group user_col (
          sort {date_col}
          derive user_session_id = (sum is_new_session)
        )
        sort {user_col, date_col}
        derive global_session_id = (sum is_new_session)
      )
      select !{prev_date, date_diff, is_new_session}
    )
    
    # main query
    from invoices
    select {customer_id, invoice_date}
    sessionize customer_id invoice_date max_gap:365
    sort {customer_id, invoice_date}

You can also try that in the playground: https://prql-lang.org/playground/

jshute00 · 2025-02-14T05:43:32 1739511812

SQL with pipe syntax can also do functions like that. An sequence of pipe operators can be saved as a table-valued function, and then reused by invoking in other queries with CALL.

Example:

  CREATE TEMP TABLE FUNCTION ExtendDates(input ANY TABLE, num_days INT64)
  AS
  FROM input
  |> EXTEND date AS original_date
  |> EXTEND max(date) OVER () AS max_date
  |> JOIN UNNEST(generate_array(0, num_days - 1)) diff_days
  |> SET date = date_add(date, INTERVAL diff_days DAY)
  |> WHERE date <= max_date
  |> SELECT * EXCEPT (max_date, diff_days);

  FROM Orders
  |> RENAME o_orderdate AS date, o_custkey AS user_id
  |> CALL ExtendDates(7)
  |> LIMIT 10;

from the script here: https://github.com/google/zetasql/blob/master/zetasql/exampl...

snthpy · 2025-02-14T07:23:03 1739517783

Oh nice! Thanks for setting that straight. Apologies, I must have either missed that or forgotten about it.

aoeusnth1 · 2025-02-14T16:49:50 1739551790

BigQuery has table-valued functions already, which can be used with pipes with a CALL clause.

gorkaerana · 2025-02-13T14:58:10 1739458690

I'm tempted to give it a try just for the "For HackerNews enthusiasts" section of their landing page, which states "The PRQL compiler is written in Rust" and "We talk about “orthogonal language features” a lot". Love me some banter in technical documentation :D

benrutter · 2025-02-13T15:06:11 1739459171

I love the design behind PRQL, was a little dissapointed when I tried using it with an MSSQL server and found there was no decent way to filter for `LIKE '%something%'`.

(PRQL uses regex for stuff like this, which sounds great but then isn't supported on some databases like MSSQL)

default-kramer · 2025-02-13T15:42:34 1739461354

I'm very surprised to learn that PRQL does not natively support `like`, but you can add it yourself: https://github.com/PRQL/prql/issues/1123#issuecomment-135385...

promiseofbeans · 2025-02-13T17:11:24 1739466684

Huh, it's super cool they have a built-in way of extending to support missing engine features!

benrutter · 2025-02-14T14:44:54 1739544294

Oh man that's awesome!

throwaway127482 · 2025-02-13T14:42:36 1739457756

I think I like this new pipe syntax a lot better than PRQL. Feels like it has the potential to become officially supported by DBs like MySQL since it's a relatively small change to existing SQL syntax.

digikata · 2025-02-13T18:46:38 1739472398

IIRC the SQL pipe google paper references PRQL. I think both make sense in their own context. SQL pipe is a smaller leap for users familiar with SQL and prql sort of takes on more change. I do wonder if the smaller leap might cause more deja-vu displacement initially than a bigger shift like PRQL. I don't know the answer other than look at users of both over time and see how they do with the different syntaxes.

mritchie712 · 2025-02-13T14:41:29 1739457689

there's an duckdb extension[0] for it too

0 - https://duckdb.org/community_extensions/extensions/prql.html

e1g · 2025-02-13T15:34:24 1739460864

“Malloy” is one one in the same spirit, but probably more comprehensive https://github.com/malloydata/malloy

ashu1461 · 2025-02-13T16:37:32 1739464652

In general in languages like PRQL / Pipe syntax, though conciseness is a big advantage, do they help in query optimisation in any regard ?