Postgres is great as a queue, but this post doesn't really get into the features...

gavanm · on Feb 10, 2024

SQL Server has included a message queue capability for a while now. It’s called SQL Server Service Broker:

https://learn.microsoft.com/en-us/sql/database-engine/config...

I haven’t had the opportunity to use it in production yet - but it’s worth keeping in mind.

I’ve helped fix poor attempts of “table as queue” before - once you get the locking hints right, polling performs well enough for small volumes - from your list above, the only thing I can’t recall there being in sql server is a LISTEN - but I’m not really an expert on it.

halfcat · on Feb 10, 2024

Also Azure is adding SQL Server trigger support

https://learn.microsoft.com/en-us/azure/azure-functions/func...

dagss · on Feb 10, 2024

This stuff has a latency measured in minutes though, limiting the usecases a lot.

EvanAnderson · on Feb 10, 2024

Came here to mention Service Broker. I've used it in production in multi-server configurations for a number of years. It works really well but it's terribly obscure. Nobody seems to know it's even there.

The learning curve is steep and there are some easy anti-patterns you can fall into. Once you grok it, though, it really is very good.

The LISTEN functionality is absolutely there. Your activation procedure is invoked by the server upon receipt of records into the queue. It's very slick. No polling at all.

Deukhoofd · on Feb 10, 2024

> * use LISTEN to be notified of rows that have changed that the backend needs to take action on (so you're not actively polling for new work)

> * use NOTIFY from a trigger so all you need to do is INSERT/UPDATE a table to send an event to listeners

Could you explain how that is better than just setting up Event Notifications inside a trigger in SQL Server? Or for that matter just using the Event Notifications system as a queue.

https://learn.microsoft.com/en-us/sql/relational-databases/s...

> * you can select using SKIP LOCKED (as the article points out)

SQL Server can do that as well, using the READPAST table hint.

https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...

> * you can use partial indexes to efficiently select rows in a particular state

SQL Server has filtered indexes, are those not the same?

https://learn.microsoft.com/en-us/sql/relational-databases/i...

killingtime74 · on Feb 10, 2024

It's better because it's not SQL Server

michaelcampbell · on Feb 11, 2024

> Could you explain how that is better than just setting up Event Notifications inside a trigger in SQL Server.

Why? The article wasn't about that. Hear me out here, but there's value in having multiple implementations for the same idea.

Deukhoofd · on Feb 14, 2024

The argument of the OOP I was responding to was about how Postgres was better than other SQL solutions due to 4 reasons, with SQL Server being explicitly named. I was merely wondering whether his reasoning actually considered the abilities of SQL Server.

diek · on Feb 10, 2024

Admittedly I used SQL Server pretty heavily in the mid-to-late-2000s but haven't kept up with it in recent years so my dig may have been a little unfair.

Agree on READPAST being similar to SKIP_LOCKED, and filtered indexes are equivalent to partial indexes (I remember filtered indexes being in SQL Server 2008 when I used it).

Reading through the docs on Event Notifications they seem to be a little heavier and have different deliver semantics. Correct me if I'm wrong, but Event Notifications seem to be more similar to a consumable queue (where a consumer calling RECEIVE removes events in the queue), whereas LISTEN/NOTIFY is more pubsub, where every client LISTENing to a channel gets every NOTIFY message.

taspeotis · on Feb 10, 2024

I agree that SQL Server has similar functionality but Service Broker is pretty clunky compared to LISTEN.

go_prodev · on Feb 10, 2024

Thanks for the links, I was wondering if SQL Server supports similar features.

jakjak123 · on Feb 10, 2024

Using the INSERT/UPDATES is kind of limiting for your events. Usually you will want richer event (higher level information) than the raw structure of a single table. Use this feature very sparingly. Keep in mind that LISTEN should also ONLY be used to reduce the active polling, it is not a failsafe delivery system, and you will not get notified of things that happened while you were gone.

diek · on Feb 10, 2024

For my use cases the aim is really to not deal with events, but deal with the rows in the tables themselves.

Say you have a `thing` table, and backend workers that know how to process a `thing` in status 'new', put it in status 'pending' while it's being worked on, and when it's done put it in status 'active'.

The only thing the backend needs to know is "thing id:7 is now in status:'new'", and it knows what to do from there.

The way I generally build the backends, the first thing they do is LISTEN to the relevant channels they care about, then they can query/build whatever understanding they need for the current state. If the connection drops for whatever reason, you have to start from scratch with the new connection (LISTEN, rebuild state, etc).

halfcat · on Feb 10, 2024

> Usually you will want richer event (higher level information) than the raw structure of a single table.

JSONB fields in Postgres are pretty awesome for this. You can query the JSON fields, index them, and all that.

Is that what you mean?

Rapzid · on Feb 10, 2024

I use a generic subsystem modeled loosely after SQS and Golang River.

I have a visible_at field which indicates when the "message" will show up in checkout commands. When checked out or during a heartbeat from the worker this gets bumped up by a certain amount of time.

When a message is checked out, or re-checked out, a key(GUID) is generated and assigned. To delete the message this key must match.

A message can be checked out if it exists and the visible_at field is older or equal to NOW.

That's about it for semantics. Any further complexity, such as workflows and states, are modeled in higher level services.

If I felt it mattered for perf and was worth the effort I might model this in a more append-only fashion taking advantage of HOT updates and etc. Maybe partition the table by day and drop partitions older than longest supported process. Use the sparse index to indicate deleted.. Hard to say though with SSDs, HOT, and the new btree anti-split features..

leontrolski · on Feb 10, 2024

Thanks for the comprehensive reply, does the following argument stand up at all? (Going on the assumption that LISTEN is one more concept and one less concept is a good thing).

If I have say 50 workers polling the db, either it’s quiet and there's no tasks to do - in which case I don't particularly care about the polling load. Or, it's busy and when they query for work, there's always a task ready to process - in this case the LISTEN is constantly pinging, which is equivalent to constantly polling and finding work.

Regardless, is there a resource (blog or otherwise) you'd reccomend for integrating LISTEN with the backend?

diek · on Feb 10, 2024

In a large application you may have dozens of tables that different backends may be operating on. Each worker pool polling on tables it may be interested on every couple seconds can add up, and it's really not necessary.

Another factor is polling frequency and processing latency. All things equal, the delay from when a new task lands in a table to the time a backend is working on it should be as small as possible. Single digit milliseconds, ideally.

A NOTIFY event is sent from the server-side as the transaction commits, and you can have a thread blocking waiting on that message to process it as soon as it arrives on the worker side.

So with NOTIFY you reduce polling load and also reduce latency. The only time you need to actually query for tasks is to take over any expired leases, and since there is a 'lease_expire' column you know when that's going to happen so you don't have to continually check in.

As far as documentation, I got a simple java LISTEN/NOTIFY implementation working initially (2013?-ish) just from the excellent postgres docs: https://www.postgresql.org/docs/current/sql-notify.html

mulmen · on Feb 10, 2024

What happens when the backend worker dies while processing?

0cf8612b2e1e · on Feb 10, 2024

Usual way is you update the table with a timestamp when the task was taken. Have one periodic job which queries the table looking for tasks that have outlived the maximum allowed processing time and reset the status so the task is available to be requeued.

foofie · on Feb 10, 2024

Excellent comment. Thank you for taking the time to write it.

eddd-ddde · on Feb 10, 2024

There's actually an extension called "tcn" or trigger change notification that provides such a trigger put of the box.