
Unique keys #10

Open
pereferrera opened this issue Jan 29, 2013 · 1 comment

Comments

@pereferrera
Contributor

It can be useful to allow the API users to indicate that there are unique constraints when building a Table.

For example, one may want to use Splout for generating a SQL store out of a set of registers that were appended to a big file (e.g. HDFS). Sometimes it could happen that a register with the same "id" was appended twice or more times. In this case, the user probably doesn't want to have duplicates in the final table.

The open question is how this should be handled. I see two possible ways:

  1. Letting SQLite handle the unique constraints, and catching the resulting exception.
  2. Handling the unique constraint before inserting into SQLite. This could probably be done using the specific OrderBy (InsertionOrderBy) that can now be used to sort the data before appending it to SQLite, but the details are unclear.
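As a rough illustration of option 1, the sketch below uses Python's built-in `sqlite3` module (Splout itself is Java, and the `registers` table and column names here are hypothetical). SQLite raises a constraint violation on the duplicate insert, which the loader catches and skips, so the first occurrence of each "id" survives:

```python
import sqlite3

# Hypothetical table with a unique "id" column; SQLite enforces the constraint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registers (id INTEGER PRIMARY KEY, payload TEXT)")

# Input with a duplicated id, as when the same register was appended twice.
rows = [(1, "first"), (2, "second"), (1, "appended again")]
for row in rows:
    try:
        conn.execute("INSERT INTO registers (id, payload) VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        pass  # duplicate id: drop it, so the earlier row survives

print(conn.execute("SELECT id, payload FROM registers ORDER BY id").fetchall())
# → [(1, 'first'), (2, 'second')]
```

SQLite can also resolve this without an exception round-trip via its conflict clauses (`INSERT OR IGNORE` to keep the first row, `INSERT OR REPLACE` to keep the last), which may matter for bulk-loading performance.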
@ivanprado
Contributor

For these cases, something important is which version should survive. The user
probably would like a particular version to survive, based on some policy.
Possible policies would be:

  1. Insertion order. The last (or first) one survives. Easy to handle by using
    InsertionOrderBy.
  2. Based on some kind of timestamp... but maybe the user doesn't want to
    insert the timestamp in the final table (some kind of hidden column).
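The timestamp policy above can be sketched as a pre-insertion deduplication pass: pick the survivor per "id" by timestamp, then strip the timestamp so it never reaches the final table (the record layout and field names here are hypothetical, not Splout's API):

```python
# Hypothetical input registers; "ts" is the ordering field the user does
# NOT want in the final table ("some kind of hidden column").
registers = [
    {"id": 1, "payload": "old", "ts": 100},
    {"id": 1, "payload": "new", "ts": 200},
    {"id": 2, "payload": "only", "ts": 150},
]

# Policy: the register with the latest timestamp survives. For the
# insertion-order policy, one would instead keep the last (or first) seen.
survivors = {}
for r in registers:
    if r["id"] not in survivors or r["ts"] > survivors[r["id"]]["ts"]:
        survivors[r["id"]] = r

# Drop the timestamp before handing the rows to SQLite.
rows = [(r["id"], r["payload"])
        for r in sorted(survivors.values(), key=lambda r: r["id"])]
print(rows)  # → [(1, 'new'), (2, 'only')]
```

In a MapReduce setting this keep-first/keep-last choice maps naturally onto sorting by (id, ts) with something like InsertionOrderBy and emitting only the first row of each id group.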



2 participants