PL/Rust 1.0: now a trusted language for Postgres

ttfkam · on April 9, 2023

This looks amazing! This appears to perfectly fit the huge gulf between pl/pgsql and C in the Postgres stack.

Pl/pgsql, while not exactly elegant, is well suited to bridging that gap between imperative and set-oriented business logic within the database.

On the other hand, pl/pgsql is far from optimal for defining custom types and implementing operators for those types. For that, we've needed C.

But now the possibility opens up to create efficient custom types, operators, and other functions that can run at machine speed without depending on what a vendor decides to bundle or the mostly unvetted quality of some 3rd party's C code extension.

Potentially huge!

kiwicopple · on April 9, 2023

this is an exciting development for Postgres. Since PL/Rust is now a trusted language, it means that cloud providers like RDS and Supabase will be able to provide it.

This means that you can write your database functions in rust, as an alternative to pgplsql / plv8.

(disclosure: i work at supabase)

techdragon · on April 9, 2023

Can… but how long until they actually do provide it?

I’m a big fan of PostgreSQL and it’s constantly an annoyance to find cool new capabilities provided by extensions I can never use since I’m not going to manage my own database in a critical environment for a lot of reasons… I’ve done it before, I know how hard it is to do well, and I don’t want this to be my job anymore… so when I find cool stuff like vector search or graph traversals but can never use them it’s just a constant disappointment.

Does this “trusted” state actually translate into greater adoption by cloud providers or is it just something the developers behind this effort hope will happen?

kiwicopple · on April 9, 2023

We are launching something with the RDS team this week that will make it very easy to install your own extensions that are written in Trusted Languages. We will share more on Friday.

In terms of pl/rust becoming available on RDS and supabase: months. It’s going through security audits now

ttfkam · on April 9, 2023

It will be very hard for other database engines to keep up if this gets implemented by cloud vendors. MySQL would get marginalized further and further not just in the relative lack of SQL features, but all the easily available add-ons the community will inevitably write for Postgres with no viable answer from its competition.

techdragon · on April 10, 2023

Wonderful! I’ll be checking for news on Friday. Thanks for replying, otherwise I’d probably not have noticed till a couple months from now when I next check the AWS service feature release news.

rywalker · on April 9, 2023

pl/rust will be available on CoreDB when we launch, along many other popular and interesting Postgres extensions. We’re a new company, closed beta - but join our waitlist @ https://coredb.io if you’d like a shot at early access

kiwicopple · on April 9, 2023

for those who want to know the difference between "trusted" and "untrusted":

https://tcdi.github.io/plrust/trusted-untrusted.html

_Normally, PL/Rust is installed as a "trusted" programming language named plrust. In this setup, certain Rust and pgx operations are disabled to preserve security. In general, the operations that are restricted are those that interact with the environment. This includes file handle operations, require, and use (for external modules). There is no way to access internals of the database server process or to gain OS-level access with the permissions of the server process, as a C function can do. Thus, any unprivileged database user can be permitted to use this language._

Languages like pl_python are "untrusted" and give too much access to the file system, which is why cloud providers never support them on their platforms

zabzonk · on April 9, 2023

cannot rust functions do anything a c function can?

ekimekim · on April 9, 2023

Normally yes, but it looks like the trusted PL/Rust being discussed here is limited to some subset of Rust. They specifically note that `unsafe` code is not allowed, which means you can't (for example) implement your own syscalls or construct a pointer into postgres internals memory.

However, they make it clear that this is not intended to be your only defence against an attacker:

> Note that this is done on a best-effort basis, and does not provide a strong level of security — it's not a sandbox, and as such, it's likely that a skilled hostile attacker who is sufficiently motivated could find ways around it

IshKebab · on April 9, 2023

I don't think it's really any defence at all against an attacker. I don't think Rust's `unsafe` was ever intended to be a security boundary.

Surely all a "sufficiently motivated" attacker would need to do is peruse the unsound bugs on GitHub?

https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Ais...

Those aren't considered to be security issues. Makes me wonder what the point of banning `unsafe` is at all. You're going to need some other system anyway...

feike · on April 9, 2023

The trust is not just banning unsafe, it is using a limited std:

> The "trusted" version of PL/Rust uses a unique fork of Rust's std entitled postgrestd when compiling LANGUAGE plrust user functions.

https://github.com/tcdi/postgrestd

IshKebab · on April 10, 2023

That doesn't matter. If you can use `unsafe` you can make syscalls directly.

ekimekim · on April 10, 2023

A large number of unsoundness bugs only work if you have access to the stdlib, because they're flaws in stdlib types and functions that use `unsafe` internally, and are supposed to present a safe interface around it.

If you a) don't have access to unsafe, and b) don't have access to the stdlib that lets you do powerful things without unsafe, then you're very limited in what you can do.

https://smallcultfollowing.com/babysteps/blog/2016/10/02/obs... discusses this further. Conceptually, you can think of "entirely Safe Rust" to be a very limited language, which you then progressively add "capabilites" to by exposing safe interfaces implemented with unsafe code. For example, Vec and Box (which require unsafe) grant safe code the ability to do heap allocations.

It's true that this is not designed as a security boundary. As I note in my comment above, the PL/Rust devs also make that clear. That doesn't mean it has no value as part of a defence in depth strategy.

workingjubilee · on April 11, 2023

The rustc driver for trusted PL/Rust prevents using the subset of the Rust language required to trigger those issues. Most of them are things that would have a hard time traversing the Postgres procedure call boundary, anyways, in a legitimate use-case, so this isn't expected to meaningfully affect actual user code.

vlovich123 · on April 10, 2023

So you have to give them your rust code to compile on your behalf then?

lexicality · on April 9, 2023

only when using `unsafe` which this forbids

tyingq · on April 9, 2023

I don't know much about rust, so it seemed odd to me that just forbidding unsafe would make for a reasonable sandbox. At least for the postgres concept of "trusted language", that's supposed to mean things like closing off access to the filesystem. Not just typical file io calls either, but more obscure stuff like sendfile().

I do see that rust access to sendfile() would be via a syscall, which is in the unsafe category...so perhaps that's not the best example.

But it does make me curious how comprehensive a sandbox PL/Rust is providing, beyond just forbidding unsafe.

iudqnolq · on April 9, 2023

pl/rust is its own target platform, so they provide their own standard library*. The IO stuff all panics, for example. (In Rust a panic should be used to indicate invariant failure. pl/rust catches panics and converts them to postgres errors).

In Rust you'd normally be able to link c code, but calling c requires unsafe because you have to manually ensure the c code upholds any relevant Rust invariants.

> But it does make me curious how comprehensive a sandbox PL/Rust is providing, beyond just forbidding unsafe.

They also hook the compiler and try and detect shenanigans. It's not perfect, but it's pretty thought out.

*Technically the Rust standard library builds on top of a lower-level io module, which is all you have to replace.

gpm · on April 9, 2023

More than just forbidding unsafe, but not enough to make this secure against competent adversaries by their own admission. They argue postgres itself isn't secure against competent adversaries so this doesn't matter too much.

https://tcdi.github.io/plrust/plrust.html#what-about-rust-co...

Rust keeps a list of soundness bugs via a tag on github - they're pretty common:

https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Ais...

yarg · on April 10, 2023

Read the article, they specify that they use a compiler target that restricts system access.

As I read it, it seemed that the unsafe prohibition was more about the safety of the code and not the security of the system.

chrismorgan · on April 9, 2023

This implementation blocks file system access and is thus not vulnerable, but note that in Rust in general you can actually violate safety on some platforms without any `unsafe` by modifying magic files like /proc/self/mem. This is a known issue but considered unfixable (because the technical solution of marking opening a file as unsafe would cause far more trouble than it could ever hope to solve).

IshKebab · on April 9, 2023

It's more of an interesting fact than an issue to be fixed.

formalLogicGirl · on April 9, 2023

And there is also an alternative implementation to Rust std that blocks, among others, filesystem access via std::fs https://github.com/tcdi/postgrestd

Cyphase · on April 9, 2023

Thank you for saying "disclosure" instead of "disclaimer".

robertlagrant · on April 9, 2023

Disclaimer: I know what I'm talking about.

mike_hearn · on April 11, 2023

But bear in mind, those providers generally don't provide all trusted languages out there already. PL/Java has existed for a long time, has a trusted mode, and isn't supported by Supabase for example.

jefflynn · on April 10, 2023

PL/Scheme - Code is data

https://www.rockdata.net/blog/2022/10/26/plscheme/

SahAssar · on April 10, 2023

"postgrestd" might be a slighty unfortunate name since there already exists postgrest, a daemon for providing a http interface to postgresql.

workingjubilee · on April 11, 2023

It's just the name of the repo: the actual target is ${arch}-postgres-linux-gnu.

pornel · on April 10, 2023

This looks risky to me. Rust is not a sandbox language. It's designed to catch accidental mistakes, not intentionally malicious code.

They've taken extra precautions to plug known holes, but it is using Rust beyond what it was designed for.

systems · on April 9, 2023

why would anyone want to use PL/Rust over PL/PQSL?

what is the use case?

mkl95 · on April 9, 2023

> PL/Rust is a loadable procedural language that enables writing PostgreSQL functions in the Rust programming language

Use to write PostgreSQL functions in Rust. Also

> The top advantages of PL/Rust include writing natively-compiled functions to achieve the absolute best performance, access to Rust's large development ecosystem, and Rust's compile-time safety guarantees.

samanator · on April 9, 2023

A crucial facet of optimization for computational triggers in PostgreSQL pertains to the implementation of event triggers, which enable operations to be executed in bulk (per DML statement) rather than on a per-row basis. It appears that, at present, PL/Rust has not incorporated support for event triggers. According to the documentation:

> Event Triggers and DO blocks are not (yet) supported by PL/Rust.

ttfkam · on April 9, 2023

Event triggers fire on DDL changes, not DML. You're thinking of statement-level triggers.

As far as event triggers and DO-blocks, that omission seems fine to me. Especially DO-blocks, which are essentially an inline code, one-off escape hatch in the middle of other SQL. Rust would not be helping any performance-sensitive critical paths in those cases.

samanator · on April 9, 2023

You're right! My mistake. I was thinking of statement triggers.

https://www.postgresql.org/docs/current/sql-createtrigger.ht...

> The REFERENCING option enables collection of transition relations

I don't see any examples of statement triggers...

ttfkam · on April 10, 2023

https://stackoverflow.com/a/72397774

Note the use of new_table and old_table as aliases. Instead of single records in NEW and OLD, you can select against new and old sets of records.

samanator · on April 12, 2023

I understand, and I've used the old and new table aliases. But I meant that I don't see a way to use those in the PL/Rust docs.

doctor_eval · on April 9, 2023

True, but it’s sometimes useful for experimenting, REPL style.

systems · on April 9, 2023

As someone who have done a lot of database development, none of these sound advantageous

Using a text oriented language like Perl with a good regexp engine might

DB performance, comes from indexes , table partitioning and in-memory tables and to compile query execution plans, so you save some time the very first you run a procedure

ixfo · on April 9, 2023

Doing in-database computation can be very advantageous for some applications, and writing those functions in Rust would be fantastic for some uses, not least for the library ecosystem. I did some work on video similarity search with in-DB search which would've certainly benefited.

ttfkam · on April 9, 2023

DB performance also comes from size efficiency of user-defined data types, user-defined operator functions (typically for use with those user-defined data types), etc.

Smaller, more efficient types directly translate to less disk usage and smaller indexes, both of which measurably improve database performance.

pjmlp · on April 10, 2023

I assume PL/pg is not behind the curve in relation to PL/SQL and T-SQL for UDTs.

doctor_eval · on April 9, 2023

> DB performance, comes from indexes , table partitioning and in-memory tables and to compile query execution plans, so you save some time the very first you run a procedure

The network round trip to the database can also be a pretty significant performant penalty, especially when iterating over large sets.

pjmlp · on April 10, 2023

Stored procedures are compiled to native code in any respectfull RDMS (Oracle, SQL Server, DB2,...), and I assume same applies to PostgreSQL.

zkirill · on April 10, 2023

Wouldn't it be easier to write and run unit tests in Rust?

tempaccount420 · on April 9, 2023

PLPSQL is an awful language for anything less than the highest level glue code

cobythedog · on April 9, 2023

Can you give some examples of why you think this? I'm sincerely curious as someone who uses PLPSQL nearly every day and knows it is not perfect, but surprised to hear it is "awful".

ttfkam · on April 9, 2023

I concur. Pl/pgsql isn't exactly elegant to be sure, but if you're already in a set-oriented mindset but need to add a sprinkling of imperative logic, it's well suited to the job.

asah · on April 9, 2023

no seriously, PL/pgsql is pretty horrible and obscure. But aside from subjective comments, there's very few algorithms available for it, approximately 0% of engineers know it and it's not taught in school, has little tooling compared with a first class programming language, you can't run pl/pgsql code outside of PostgreSQL, and (tell me when to stop)

PL/PGSQL is fine for "a bit more than a SELECT statement" and for very simple algorithms of <50 LOC. Anything more and please use a first class language like plrust, plv8, etc.

ttfkam · on April 10, 2023

The decision between pl/pgsql and something like plv8 isn't LOC. It's whether the solution best fits a set-oriented model or a procedural model. Both are valid, just different use cases.

There are a lot of cases where plv8 will thrash back and forth between the internals of Postgres and C and its v8 engine. These are usually the cases where set theory dominates the solution space.

On the flip side, if you're doing a lot of filter/map/reduce on large JSON payloads, plv8 is demonstrably better than pl/pgsql.

Right tool. Right job.

pjmlp · on April 10, 2023

The same engineers that use to rave about NoSQL a couple of years ago?

danielheath · on April 10, 2023

Rather than "Awful", I'd say it's showing its age.

The things I notice when working in PLPSQL:

  * Ample boilerplate that needs to be correct when it could be inferred.
  * Lack of a language server (doesn't help that PLPSQL is often embedded in strings in other files)
  * Papercuts like procedures vs functions having different call syntaxes
  * No/limited support for encapsulation
  * No/limited package management
  * Most new languages have syntactic sugar, like implicit returns / everything is an expression

doctor_eval · on April 9, 2023

I don’t think it’s awful but it’s a bit verbose. But are there any other trusted languages that are as directly connected to SQL?

pjmlp · on April 10, 2023

Nah, it does the job.