[Bucardo-general] Design of new txntime field

Mon Sep 11 15:31:13 UTC 2017

On Sun, 10 Sep 2017, Greg Sabino Mullane wrote:

>> That won't sort unless you pad txid_current(), right? As a stringified 
>> integer with varying digits, it'll depend on number of digits.
>>
>> And actually the timestamp won't sort right as a string either, since 
>> it will have fewer digits sometimes when the microseconds end with 0 
>> and are truncated, leaving fewer string digits.
>
> Excellent points, I'm glad you spotted that. I've been playing with this 
> today, and we can run now() back through to_char to get the full 
> microseconds each time like this:
>
> select to_char(now(), 'YYYY-MM-DD HH24:MI:SS.US');

Yes, that's better.

> Then I started looking at formatting the txnid_current() output. Problem 
> is, it's a bigint, which means quite a lot of padding for most, if not 
> all, standard installations. Then I had another idea - what about the 
> PID? Those are distributed sequentially or randomly, but all that really 
> matters is that two backends who happen to have the exact same now() 
> have diferent PIDs - which should always happen. Then the question of 
> how big to make the string - PIDs can be forced quite high with some 
> effort, but for purposes of coliision avoidance, we probably only need 
> some of the digits. So I came up with:
>
> select to_char(right(pg_backend_pid()::text, 4)::int, 'FM00000');
>
> While PIDs can be negative on some systems, that should not be a problem 
> either - it still makes a good tie breaker.
>
> We can also compress things a little - we want to keep the timestamp 
> human readable, but we don't need all the whitespace that now() gives by 
> default. This, the final proposal becomes:
>
> select to_char(now(), 'YYYYMMDD.HH24MISS.US.')::text || to_char(right(pg_backend_pid()::text, 5)::int,'FM00000');
>
> The output looks like this:
>
> 20170910.143526.395762.04482
>
> That's 28 characters, which is not too bad overall. And it is still 
> human readable, and sortable.

If we're going for space savings, it'd be just as well to leave out the 
unnecessary dots, wouldn't it? It's still pretty easily readable without 
them if you know what and how wide the fields are.

> So how does it do in the criteria I made for the new txntime?
>
> * No collisions
> The addition of the last five of the current PID should avoid collisions.
> In theory, there may still be collisions - which we can reduce further by
> going to 6 or 7 digits, but is that really worth it? On a system using
> 6 or more digits, you'd have to have two backends created exactly
> 100,000 PIDs apart (if sequential), or happening to have the same last
> five digits (if random), a 9 in 10,0000 chance, or 99 in 10,000.
> Think I just talked myself into 6 digits. :)

We could use the same approach with txnid_current() and keep only the last 
few digits, right? That would be a lot more consistent across OSes than 
PID allocations are.

> * Human readable
> Not quite as intuitive as the existing timestamp, but fairly obvious and
> easy to read. And we can document the right-hand-side of it. Might be
> worth it to force it all to GMT as well.

+1 to using UTC.

> * Disk space
> We want the delta tables to be as small as possible. Right now, this
> solution is a lot more space than a plain timestamp, but I don't
> see an easy way around this problem right now.

It could probably be converted to 1 or 2 bigints, which would more 
efficiently use the space available, and sort more quickly, at the cost of 
no easy readability of the source inputs.

> On rereading this, maybe we don't need the microseconds anymore either, as
> the PID chunk should be enough to distinguish things? Or at least we could
> do milliseconds, and save a few characters.

That's probably true.

Jon