Uhugeint #2

Closed
wants to merge 135 commits into from

nickgerrets
Owner

No description provided.

nickgerrets pushed a commit that referenced this pull request Sep 27, 2023
nickgerrets pushed a commit that referenced this pull request Jan 10, 2024
commit 13fb9e2
Author: Tmonster <[email protected]>
Date:   Mon Dec 18 11:37:06 2023 -0800

    PR cleanup #2

commit 066f3cc
Author: Tmonster <[email protected]>
Date:   Mon Dec 18 11:21:07 2023 -0800

    fix dereference nullptr

commit 094db53
Author: Tmonster <[email protected]>
Date:   Mon Dec 18 10:43:15 2023 -0800

    PR cleanup

commit c9a1ecd
Merge: 2893c0c 6258996
Author: Tmonster <[email protected]>
Date:   Mon Dec 18 10:22:20 2023 -0800

    Merge remote-tracking branch 'upstream/main' into reservoir_sampler_Vectors

commit 2893c0c
Author: Tmonster <[email protected]>
Date:   Thu Dec 14 13:10:25 2023 +0100

    make format fix. Get compiler ready

commit 80b5f13
Merge: e30b726 c29eb0c
Author: Tmonster <[email protected]>
Date:   Thu Dec 14 12:34:18 2023 +0100

    Merge branch 'main' into reservoir_sampler_Vectors

commit e30b726
Author: Tmonster <[email protected]>
Date:   Thu Dec 14 12:33:03 2023 +0100

    remove all parallelism. will do it in the next iteration

commit e8e088d
Author: Tmonster <[email protected]>
Date:   Thu Dec 14 11:52:27 2023 +0100

    still failing a test. Merging samples collected in parallel is difficult, and probably doesnt provide much benefit. Going to leave it for later

commit 96bfa1c
Merge: 45fa9a5 3237244
Author: Tom Ebergen <[email protected]>
Date:   Wed Dec 13 17:02:31 2023 +0100

    Merge remote-tracking branch 'upstream/main' into reservoir_sampler_Vectors

commit 45fa9a5
Author: Tom Ebergen <[email protected]>
Date:   Wed Dec 13 14:36:50 2023 +0100

    make format-fix

commit 049327b
Author: Tom Ebergen <[email protected]>
Date:   Wed Dec 13 14:31:22 2023 +0100

    try to fix this parallel issue

commit a5b290d
Merge: 21d4120 8849f97
Author: Tom Ebergen <[email protected]>
Date:   Tue Dec 12 11:18:52 2023 +0100

    Merge remote-tracking branch 'upstream/main' into reservoir_sampler_Vectors

commit 21d4120
Merge: 795c454 e117c34
Author: Tom Ebergen <[email protected]>
Date:   Mon Dec 11 12:43:43 2023 +0100

    Merge branch 'main' into reservoir_sampler_Vectors

commit c29eb0c
Merge: 6bf31e1 25906f3
Author: Tmonster <[email protected]>
Date:   Thu Dec 7 15:23:06 2023 +0100

    Merge remote-tracking branch 'upstream/main'

commit 6bf31e1
Author: Elliana May <[email protected]>
Date:   Mon Dec 4 22:21:30 2023 +0800

    fix warning

commit a521081
Author: Elliana May <[email protected]>
Date:   Mon Dec 4 21:58:50 2023 +0800

    add test for streaming extracted statements

commit 5ee902a
Author: Elliana May <[email protected]>
Date:   Mon Dec 4 21:15:30 2023 +0800

    add some tests of duckdb_execute_prepared_streaming

commit 58b6664
Author: Elliana May <[email protected]>
Date:   Mon Dec 4 21:02:48 2023 +0800

    chore(docs): update docs for duckdb_execute_prepared_streaming

commit a8e49b1
Author: Hannes Mühleisen <[email protected]>
Date:   Tue Dec 5 11:31:21 2023 +0100

    add test case, apparently from snowflake

commit a7ee1dd
Author: Hannes Mühleisen <[email protected]>
Date:   Tue Dec 5 11:25:51 2023 +0100

    enable implicit fallthrough warning for /src and fixed a few instances

commit c6bf4c6
Author: Hannes Mühleisen <[email protected]>
Date:   Tue Dec 5 11:02:54 2023 +0100

    supporting more physical types of parquet time columns with time zone info

commit baf670f
Author: Jacob <[email protected]>
Date:   Mon Dec 4 09:05:56 2023 -0800

    make BufferPool members protected

commit 878e7d2
Author: Yves <[email protected]>
Date:   Mon Dec 4 12:00:49 2023 -0500

    Mark BufferPool getters const

commit a7ddb87
Author: Gabor Szarnyas <[email protected]>
Date:   Mon Dec 4 16:22:44 2023 +0100

    Capitalize URL in httpfs extension flags

commit 795c454
Author: Tom Ebergen <[email protected]>
Date:   Wed Dec 6 13:23:29 2023 +0100

    removing reservoir type checks

commit 6e0e431
Author: Tom Ebergen <[email protected]>
Date:   Wed Dec 6 11:25:50 2023 +0100

    make format fix

commit 236825b
Author: Tom Ebergen <[email protected]>
Date:   Wed Dec 6 10:23:57 2023 +0100

    remove unused code

commit 34902e9
Author: Tmonster <[email protected]>
Date:   Tue Dec 5 21:20:09 2023 +0100

    should pass make format fix

commit 42d3fb8
Author: Tmonster <[email protected]>
Date:   Tue Dec 5 18:04:36 2023 +0100

    percentage is still global, but rows is local

commit d378cc7
Author: Tmonster <[email protected]>
Date:   Tue Dec 5 15:37:40 2023 +0100

    some debugging statements

commit 4ad877c
Author: Tmonster <[email protected]>
Date:   Tue Dec 5 14:16:25 2023 +0100

    some changes. Have a lot of bugs solved. but still not great

commit ad79d30
Author: Tom Ebergen <[email protected]>
Date:   Mon Dec 4 17:41:37 2023 +0100

    have figured out why percentage wasnt working. but it requires a big rework

commit 04d4c0d
Author: Tom Ebergen <[email protected]>
Date:   Mon Dec 4 14:10:26 2023 +0100

    reservoir sample works. but for large cardinalities and high percentages no

commit ddcea54
Author: Tom Ebergen <[email protected]>
Date:   Mon Dec 4 12:33:16 2023 +0100

    remove std::couts

commit 4e12d15
Author: Tom Ebergen <[email protected]>
Date:   Wed Nov 29 17:47:16 2023 +0100

    ok, have the proper output for reservoir sampling. need to understand when to add local sample or global sample

commit 43e72a4
Author: Tom Ebergen <[email protected]>
Date:   Wed Nov 29 15:23:55 2023 +0100

    compiles. Now I want to figure out where I left off last time

commit 450655c
Merge: 3639e4c 3f96a90
Author: Tom Ebergen <[email protected]>
Date:   Wed Nov 29 15:00:08 2023 +0100

    Merge branch 'main' into reservoir_sampler_Vectors

commit 3639e4c
Merge: c10b3a4 5bc0773
Author: Tom Ebergen <[email protected]>
Date:   Wed Nov 29 14:56:39 2023 +0100

    Merge branch 'main' into reservoir_sampler_Vectors

commit c10b3a4
Author: Tom Ebergen <[email protected]>
Date:   Mon Jan 23 10:04:20 2023 +0100

    this should work now for sampling a set amount of rows. Still need to work on percentage sampling

commit 7147e2a
Author: Tom Ebergen <[email protected]>
Date:   Wed Jan 18 16:38:02 2023 +0100

    it is starting to work, but need to look into why it is still slow

commit 2255424
Author: Tom Ebergen <[email protected]>
Date:   Wed Jan 18 11:21:15 2023 +0100

    working for normal blocking sample, but not for percentage

commit 8a01b32
Author: Tom Ebergen <[email protected]>
Date:   Mon Jan 16 17:02:34 2023 +0100

    intermediate commit, will fix other spots later

commit 3676737
Author: Tom Ebergen <[email protected]>
Date:   Mon Jan 16 09:03:52 2023 +0100

    intermediate work, will be fixing later

commit 904d220
Author: Tom Ebergen <[email protected]>
Date:   Fri Jan 13 13:50:43 2023 +0100

    collecting samples in parallel now, now I need to figure out how to combine them in a proper uniform and weighted manner

commit 01c4b89
Author: Tom Ebergen <[email protected]>
Date:   Tue Jan 10 15:22:28 2023 +0100

    minor code cleanup

commit b5c6d61
Author: Tom Ebergen <[email protected]>
Date:   Tue Jan 10 11:51:32 2023 +0100

    get rid of 4 spaces

commit 1dc807f
Merge: 605520f 7e1a307
Author: Tom Ebergen <[email protected]>
Date:   Tue Jan 10 11:50:14 2023 +0100

    Merge branch 'reservoir_sampler_Vectors' of github.com:Tmonster/duckdb into reservoir_sampler_Vectors

commit 7e1a307
Author: Tmonster <[email protected]>
Date:   Wed Jan 4 11:39:30 2023 -0800

    make format-fix

commit 750c1e3
Author: Tmonster <[email protected]>
Date:   Wed Jan 4 11:37:27 2023 -0800

    small syntax updates

commit fa2ac9c
Author: Tmonster <[email protected]>
Date:   Wed Jan 4 11:36:33 2023 -0800

    small syntax updates

commit cd232c6
Author: Tmonster <[email protected]>
Date:   Tue Dec 27 14:54:08 2022 -0800

    Revert "mostly adding debugging statements for help. Still trying to figure out how to know if parallelizing is sequential or not"

    This reverts commit 0f08574.

commit 0f08574
Author: Tmonster <[email protected]>
Date:   Wed Dec 21 15:08:52 2022 -0800

    mostly adding debugging statements for help. Still trying to figure out how to know if parallelizing is sequential or not

commit f4f5834
Author: Tmonster <[email protected]>
Date:   Wed Dec 21 13:43:00 2022 -0800

    remove iostream

commit bee57ae
Author: Tmonster <[email protected]>
Date:   Wed Dec 21 12:37:02 2022 -0800

    make format fix

commit ce950a1
Author: Tmonster <[email protected]>
Date:   Wed Dec 21 09:10:23 2022 -0800

    ok added test over reservoir threshold

commit 29fb39a
Author: Tmonster <[email protected]>
Date:   Wed Dec 21 09:01:21 2022 -0800

    ok it's all in a datachunk, now I can try and parallelize it

commit 8f05514
Author: Tmonster <[email protected]>
Date:   Tue Dec 20 17:03:25 2022 +0100

    remove pragma threads

commit 98f9897
Author: Tmonster <[email protected]>
Date:   Tue Dec 20 17:02:58 2022 +0100

    no more memory errors

commit da234b7
Author: Tmonster <[email protected]>
Date:   Tue Dec 20 12:24:37 2022 +0100

    no more errors when running count(*) on samples greater than the basic vector size

commit 3fc7214
Author: Tmonster <[email protected]>
Date:   Tue Dec 20 10:04:13 2022 +0100

    fix error

commit e302946
Author: Tmonster <[email protected]>
Date:   Tue Dec 20 10:03:30 2022 +0100

    still errors

commit 7ea6405
Author: Tom Ebergen <[email protected]>
Date:   Mon Dec 19 21:15:38 2022 +0100

    its getting better but still getting memory errors

commit 8f3c597
Author: Tom Ebergen <[email protected]>
Date:   Fri Dec 16 16:48:12 2022 +0100

    add some functionality, but mostly making reservoir sampler use datachunk chunkcollection

commit 605520f
Author: Tom Ebergen <[email protected]>
Date:   Mon Dec 19 21:15:38 2022 +0100

    its getting better but still getting memory errors

commit 97a491e
Author: Tom Ebergen <[email protected]>
Date:   Fri Dec 16 16:48:12 2022 +0100

    add some functionality, but mostly making reservoir sampler use datachunk chunkcollection