
Unique keys #10

Open
pereferrera opened this issue Jan 29, 2013 · 1 comment

Comments

@pereferrera
Contributor

It can be useful to allow the API users to indicate that there are unique constraints when building a Table.

For example, one may want to use Splout for generating a SQL store out of a set of registers that were appended to a big file (e.g. HDFS). Sometimes it could happen that a register with the same "id" was appended twice or more times. In this case, the user probably doesn't want to have duplicates in the final table.

The open question is how this should be handled. I see two possible ways:

  1. Letting SQLite handle the unique constraints, and catching the resulting exception.
  2. Handling the unique constraint before inserting into SQLite. This could probably be done using the specific OrderBy (InsertionOrderBy) that can now be used to sort the data before appending it to SQLite, but the details are unclear.
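As a rough illustration of option 1, the sketch below uses Python's built-in `sqlite3` module (Splout itself is Java, and the `registers` table and column names here are hypothetical). SQLite raises a constraint violation on the duplicate insert, which the loader catches and skips, so the first occurrence of each "id" survives:

```python
import sqlite3

# Hypothetical table with a unique "id" column; SQLite enforces the constraint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registers (id INTEGER PRIMARY KEY, payload TEXT)")

# Input with a duplicated id, as when the same register was appended twice.
rows = [(1, "first"), (2, "second"), (1, "appended again")]
for row in rows:
    try:
        conn.execute("INSERT INTO registers (id, payload) VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        pass  # duplicate id: drop it, so the earlier row survives

print(conn.execute("SELECT id, payload FROM registers ORDER BY id").fetchall())
# → [(1, 'first'), (2, 'second')]
```

SQLite can also resolve this without an exception round-trip via its conflict clauses (`INSERT OR IGNORE` to keep the first row, `INSERT OR REPLACE` to keep the last), which may matter for bulk-loading performance.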
@ivanprado
Contributor

For these cases, something important is which version should survive. The user
probably would like a particular version to survive, based on some policy.
Possible policies would be:

  1. Insertion order. The last (or first) one survives. Easy to handle by using
    InsertionOrderBy.
  2. Based on some kind of timestamp... but maybe the user doesn't want to
    insert the timestamp in the final table (some kind of hidden column).
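The timestamp policy above can be sketched as a pre-insertion deduplication pass: pick the survivor per "id" by timestamp, then strip the timestamp so it never reaches the final table (the record layout and field names here are hypothetical, not Splout's API):

```python
# Hypothetical input registers; "ts" is the ordering field the user does
# NOT want in the final table ("some kind of hidden column").
registers = [
    {"id": 1, "payload": "old", "ts": 100},
    {"id": 1, "payload": "new", "ts": 200},
    {"id": 2, "payload": "only", "ts": 150},
]

# Policy: the register with the latest timestamp survives. For the
# insertion-order policy, one would instead keep the last (or first) seen.
survivors = {}
for r in registers:
    if r["id"] not in survivors or r["ts"] > survivors[r["id"]]["ts"]:
        survivors[r["id"]] = r

# Drop the timestamp before handing the rows to SQLite.
rows = [(r["id"], r["payload"])
        for r in sorted(survivors.values(), key=lambda r: r["id"])]
print(rows)  # → [(1, 'new'), (2, 'only')]
```

In a MapReduce setting this keep-first/keep-last choice maps naturally onto sorting by (id, ts) with something like InsertionOrderBy and emitting only the first row of each id group.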



2 participants