[Bucardo-general] Status on Multi-Column Primary Key Support
Ben Allen
bsa at lanl.gov
Fri Oct 26 21:51:43 UTC 2007
On Oct 26, 2007, at 3:13 PM, Greg Sabino Mullane wrote:
> As promised, an estimate of multi-column primary keys in Bucardo. I've
> been working on the problem of smoothly adding and removing target
> databases and keeping bucardo_delta clean at the same time, but have
> spent a little thought on the key problem.
>
> Basically, we can either add new columns to goat, or overload the
> existing pkey and pkeytype columns to hold multiple fields, such as
> "abc|def". Then we create new versions of bucardo_delta, for each
> number
> of primary keys. So bucardo_delta2 would get created as needed,
> etc. In
> reality, keys with more than 2 primary keys are unlikely, so I'm not
> convinced that we need to be so flexible as to hold any number of
> keys.
> Perhaps three would be sufficient?
In most cases I would think three would be sufficient. For my current
project, the max I have is a three column primary keys and only in
one table. Although, I would hate to have someone run into an imposed
limit. Depending on how much difficulty adding
'unlimited' (significantly high enough, i.e. 10-20+) number of
primary keys would be compared to a limited number, I would say lean
towards goal of unlimited primary keys. Otherwise if you want to hard
code the max number I would say an extra safe top end would be 5 or 6
columns (can't really think why anyone would do that but who knows).
> The other big thing that would need
> changing would be the main join against the bucardo_delta table from
> within the kid. I've got some early prototypes, but I'm still
> undecided
> on which of these to do:
>
> 1. goat.pkey = 'id1' goat.pkeytype = 'int'; goat.pkey2 = 'id2'
> goat.pkeytype2 = 'text';
>
> 2. goat.pkey = 'id1|id2'; goat.pkeytype = 'int|text';
>
> 3. goat.pkey = fk to a separate table.
>
> Number three is just my instinctual desire to normalize things, but it
> not needed as the data is not going to change often, and we'd have to
> parse it apart anyway. Number two is nice because we don't have to
> change the schema at all, it supports any number, and a quick split
> gets
> you the needed information. Number one prevents any changes to
> existing
> goat tables, keeps things very simple, but hardcodes some more columns
> into the goat table, which is not pretty. Now that I've written it
> out,
> I'm leaning towards number two strongly, but any input is welcome. I'd
> estimate a working prototype could arrive in about a week from now,
> based on how much time I'm able to give it between now and then.
Hmm... 3 would be nice and normalized, but you are right 2 would work
fine. Just make sure you use a character for the separator that can
never be in a column name or data-type (I guess thats probably going
to be a little hard, "|" will likely work and looks logical). Would 2
be any harder to validate on insert/update then 3? Number 1 is ugly
and harder to expand in the future if the need arises. My vote is for
number 2.
>
> --
> Greg Sabino Mullane greg at endpoint.com
> End Point Corporation 610-983-9073
>
Thanks for the status update.
Regards,
Ben Allen
More information about the Bucardo-general
mailing list