[Bucardo-general] Status on Multi-Column Primary Key Support

Ben Allen bsa at lanl.gov
Fri Oct 26 21:51:43 UTC 2007


On Oct 26, 2007, at 3:13 PM, Greg Sabino Mullane wrote:

> As promised, an estimate of multi-column primary keys in Bucardo. I've
> been working on the problem of smoothly adding and removing target
> databases and keeping bucardo_delta clean at the same time, but have
> spent a little thought on the key problem.
>
> Basically, we can either add new columns to goat, or overload the
> existing pkey and pkeytype columns to hold multiple fields, such as
> "abc|def". Then we create new versions of bucardo_delta, for each  
> number
> of primary keys. So bucardo_delta2 would get created as needed,  
> etc. In
> reality, keys with more than 2 primary keys are unlikely, so I'm not
> convinced that we need to be so flexible as to hold any number of  
> keys.
> Perhaps three would be sufficient?
In most cases I would think three would be sufficient. For my current  
project, the max I have is a three column primary keys and only in  
one table. Although, I would hate to have someone run into an imposed  
limit. Depending on how much difficulty adding  
'unlimited' (significantly high enough, i.e. 10-20+) number of  
primary keys would be compared to a limited number, I would say lean  
towards goal of unlimited primary keys. Otherwise if you want to hard  
code the max number I would say an extra safe top end would be 5 or 6  
columns (can't really think why anyone would do that but who knows).


> The other big thing that would need
> changing would be the main join against the bucardo_delta table from
> within the kid. I've got some early prototypes, but I'm still  
> undecided
> on which of these to do:
>
> 1. goat.pkey = 'id1' goat.pkeytype = 'int'; goat.pkey2 = 'id2'
> goat.pkeytype2 = 'text';
>
> 2. goat.pkey = 'id1|id2'; goat.pkeytype = 'int|text';
>
> 3. goat.pkey = fk to a separate table.
>
> Number three is just my instinctual desire to normalize things, but it
> not needed as the data is not going to change often, and we'd have to
> parse it apart anyway. Number two is nice because we don't have to
> change the schema at all, it supports any number, and a quick split  
> gets
> you the needed information. Number one prevents any changes to  
> existing
> goat tables, keeps things very simple, but hardcodes some more columns
> into the goat table, which is not pretty. Now that I've written it  
> out,
> I'm leaning towards number two strongly, but any input is welcome. I'd
> estimate a working prototype could arrive in about a week from now,
> based on how much time I'm able to give it between now and then.
Hmm... 3 would be nice and normalized, but you are right 2 would work  
fine. Just make sure you use a character for the separator that can  
never be in a column name or data-type (I guess thats probably going  
to be a little hard, "|" will likely work and looks logical). Would 2  
be any harder to validate on insert/update then 3? Number 1 is ugly  
and harder to expand in the future if the need arises. My vote is for  
number 2.

>
> -- 
> Greg Sabino Mullane greg at endpoint.com
> End Point Corporation 610-983-9073
>

Thanks for the status update.

Regards,
Ben Allen



More information about the Bucardo-general mailing list