
Re NAMEDATALEN specifically - you at least acknowledge it eventually needs fixing. On the mailing list there were a lot of important Postgres people who didn't seem to agree.

Agreeing that there is something worthy of fixing is a first step. It should have happened with this NFS patch and imo some other stuff. The considerations for how and when should be dealt with separately.




> Agreeing that there is something worthy of fixing is a first step. It should have happened with this NFS patch and imo some other stuff. The considerations for how and when should be dealt with separately.

But there were multiple people agreeing that it needs to be changed, including the first two responses that the thread got. And there were legitimate questions around how interrupts need to be handled, how errors ought to be signaled when writes partially succeed, and so on. Do you really expect us to integrate patches without thinking about that? And then the author vanished...


> Do you really expect us to integrate patches without thinking about that?

Whoa! You made a huge leap there. At what point was it suggested that patches be recklessly applied?

That didn't happen. Your quote actually suggests a reasonable progression and at no point is there any suggestion, implied or otherwise, that changes be integrated without due consideration.

Not irrationally dismissing criticism != abandoning sound design and development.


Multiple people on this thread have expressed the opinion that the patch was "too perfect" or that the only reason that the patch wasn't simply accepted was that the maintainers "feel weird when there's nothing to criticize".


Ugh. I don't know a single PostgreSQL committer with that attitude. Maybe my patches are too crap to be in that situation, not sure.


"Multiple people" were left with nothing better to which to attribute maintainer decisions. See the dysfunction there? Don't address the problem, or the proposed solution; just leave it to fester and create discontent.

That's not how well run projects function.


Well, that's the thing - changing NAMEDATALEN is a seemingly small change, but it'll require much more work than just increasing the value. Increasing the value does not seem like a great option, because (a) how long before people start complaining about the new one and (b) it wastes even more memory. So I assume we'd switch to variable-length strings, which however affects memory management, changes a lot of other stuff from fixed-length to variable-length, etc. So testing / benchmarking is needed and all of that.

Which is why people are not enthusiastic about changing it, when there are fairly simple workarounds (assuming keeping the names short is considered to be a workaround).


> (a) how long before people start complaining about the new one

Very likely many years, or even never. People don't use long names because they like them; they always prefer short ones.

How much memory are we talking about?


> Very likely many years, or even never. People don't use long names because they like them; they always prefer short ones.

Well, we don't exactly have a barrage of complaints about the current limit either.

> How much memory are we talking about?

Good question.

The thing is - it's not just about table names. NameData is used for any object name, so it affects pretty much any system catalog storing names. A simple grep on the repo says it affects about 40 catalogs (out of 80), including pg_attribute, pg_class, pg_operator, pg_proc, pg_type (which tend to be fairly large).

So the amount of additional memory may be quite significant, because all of this is cached in various places.


Yea, I think pg_attribute is likely to be the main issue here. For one, it obviously exists many times per table, and there are workloads with a lot of tables. But also importantly it's included in all tuple descriptors, which in turn get created during query execution in a fair number of places. It's currently ~140 bytes, with ~64 bytes of that being the column name - just doubling that would increase the overhead noticeably, and we already have plenty of complaints about pg_attribute. I think it'd be fairly useless to just choose another fixed size, we really ought to make it variable length.


Is it ~140 bytes? pahole says it's 112 (without CATALOG_VARLEN).

The impact of doubling NameData size would be quite a bit worse, though, thanks to doubling of chunk-size in allocset. At the moment it fits into a 128B chunk (so just ~16B wasted), but by doubling NameData to 128B the struct would suddenly be 176B, which requires 256B chunk (so 80B wasted). Yuck.


> Is it ~140 bytes? pahole says it's 112 (without CATALOG_VARLEN).

Well, but on-disk varlena data is included. pg_column_size() averages 144 bytes for pg_attribute on my system.

> The impact of doubling NameData size would be quite a bit worse, though, thanks to doubling of chunk-size in allocset. At the moment it fits into a 128B chunk (so just ~16B wasted), but by doubling NameData to 128B the struct would suddenly be 176B, which requires 256B chunk (so 80B wasted). Yuck.

I'm not sure that actually matters that much. Most attributes are allocated as part of TupleDescData, but that allocates all attributes together.


> Well, but on-disk varlena data is included. pg_column_size() averages 144 bytes for pg_attribute on my system.

Sure, but I thought we're talking about in-memory stuff as you've been talking about tuple descriptors. I don't think the on-disk size matters all that much, TBH, it's likely just a tiny fraction of data stored in the cluster.

> I'm not sure that actually matters that much. Most attributes are allocated as part of TupleDescData, but that allocates all attributes together.

Ah. Good point.


> I don't think the on-disk size matters all that much, TBH, it's likely just a tiny fraction of data stored in the cluster.

I've seen pg_attribute take up a very significant fraction of the database numerous times, so I do think the on-disk size can matter. And there are plenty of places, e.g. the catcache, where we store the full on-disk tuple (rather than just the fixed-length prefix), so the on-disk size is actually quite relevant for the in-memory part too.



