Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lightning: support prepared statement and client stmt cache in logical import mode #55482

Merged
merged 29 commits into from
Sep 21, 2024

Conversation

dbsid
Copy link
Contributor

@dbsid dbsid commented Aug 17, 2024

What problem does this PR solve?

Issue Number: close #54850

Problem Summary:

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

add doc for logical-import-prep-stmt

Test logical import on a table with 1200 columns

Baseline
image

With this PR
image

image

image

set db properties in DBFromConfig

increase minDeliverBytes to 10m

add log to show session variable

support prepared statement in tidb lightning mode

fix bugs

try to reduce the inuse memory

try to avoid oom

Update import.go

change lru cache to kvcache

add configuration LogicalImportPrepStmt

add logical logicalImportPrepStmt config

add initial test

test single statement

fix the bug in test
@ti-chi-bot ti-chi-bot bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Aug 17, 2024
Copy link

ti-chi-bot bot commented Aug 17, 2024

Hi @dbsid. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot bot added needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 17, 2024
Copy link

tiprow bot commented Aug 17, 2024

Hi @dbsid. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@lance6716
Copy link
Contributor

/ok-to-test

@ti-chi-bot ti-chi-bot bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Aug 19, 2024
Copy link

codecov bot commented Aug 19, 2024

Codecov Report

Attention: Patch coverage is 82.41758% with 16 lines in your changes missing coverage. Please review.

Project coverage is 73.6674%. Comparing base (2651b77) to head (8869768).
Report is 1 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #55482        +/-   ##
================================================
+ Coverage   73.4071%   73.6674%   +0.2603%     
================================================
  Files          1600       1631        +31     
  Lines        443627     447086      +3459     
================================================
+ Hits         325654     329357      +3703     
+ Misses        98033      97618       -415     
- Partials      19940      20111       +171     
Flag Coverage Δ
integration 43.9952% <0.0000%> (?)
unit 72.4941% <82.4175%> (+0.0239%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.9567% <ø> (ø)
parser ∅ <ø> (∅)
br 45.7885% <ø> (-0.0910%) ⬇️

@lance6716
Copy link
Contributor

/cc @D3Hunter @lance6716

@lance6716
Copy link
Contributor

please merge master to fix lightning CI @dbsid

pkg/lightning/config/config.go Show resolved Hide resolved
pkg/lightning/backend/tidb/tidb.go Outdated Show resolved Hide resolved
@@ -309,6 +334,11 @@ type tidbBackend struct {
// affecting the cluster too much.
maxChunkSize uint64
maxChunkRows int
// implement stmtCache to improve performance, especially when the downstream is TiDB
stmtCache *kvcache.SimpleLRUCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as long as the table is importing, the prepared stmt is needed, no need such LRU

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason to use LRU here is that the batch insert prepared statement may not be the same. where the insert/replace statement is constrained by logical-import-batch-size , the actual SQL executed might be longer or shorter, depending on the actual content imported.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depends on row size is unstable for prepared stmt, in worse case, every prepared stmt is different. we can ignore batch-size in this case, and only honors batch-rows, cc @lance6716

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default logical-import-batch-rows is 65536, usually this batch size is too large, should we lower the default value of logical-import-batch-rows to some reasonable number like 128/256/512?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can ask the user that request this feature what's the root cause #46607 . If the prepared statement can solve that problem, we can ignore these configurations. @D3Hunter and I are busy this week, please help to contact the issue author.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the author of #46607, but we've got customer which database is unable to handle 96 KB transactions thus requiring the limit to be lowered.

The -batch-size does not need to be very precise. It is mainly used to ensure (number of rows) × (size of row) to be maintained at a constant limit. You should not ignore the batch-size entirely because the same limit is shared among all tables and the constraint should be reasonable for both wide and narrow tables.

pkg/lightning/backend/tidb/tidb.go Outdated Show resolved Hide resolved
@D3Hunter
Copy link
Contributor

/hold

need confirm from PM

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 21, 2024
pkg/lightning/common/util.go Outdated Show resolved Hide resolved
@@ -309,6 +334,11 @@ type tidbBackend struct {
// affecting the cluster too much.
maxChunkSize uint64
maxChunkRows int
// implement stmtCache to improve performance, especially when the downstream is TiDB
stmtCache *kvcache.SimpleLRUCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depends on row size is unstable for prepared stmt, in worse case, every prepared stmt is different. we can ignore batch-size in this case, and only honors batch-rows, cc @lance6716

pkg/lightning/backend/tidb/tidb.go Outdated Show resolved Hide resolved
pkg/lightning/backend/tidb/tidb.go Outdated Show resolved Hide resolved
pkg/lightning/backend/tidb/tidb.go Outdated Show resolved Hide resolved
pkg/lightning/config/config.go Show resolved Hide resolved
pkg/lightning/config/config.go Show resolved Hide resolved
@D3Hunter
Copy link
Contributor

please prepare a doc pr to this feature

@dbsid
Copy link
Contributor Author

dbsid commented Sep 17, 2024

please prepare a doc pr to this feature

pingcap/docs#18902
pingcap/docs-cn#18627

@hawkingrei
Copy link
Member

/retest

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Sep 18, 2024
Copy link

ti-chi-bot bot commented Sep 18, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-09-09 09:10:55.080272051 +0000 UTC m=+261124.820695992: ☑️ agreed by lance6716.
  • 2024-09-18 07:00:07.956322065 +0000 UTC m=+1030877.696746004: ☑️ agreed by D3Hunter.

@hawkingrei
Copy link
Member

/retest

Copy link

@yudongusa yudongusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

ti-chi-bot bot commented Sep 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, lance6716, qw4990, yudongusa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Sep 19, 2024
@dbsid
Copy link
Contributor Author

dbsid commented Sep 21, 2024

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 21, 2024
@dbsid
Copy link
Contributor Author

dbsid commented Sep 21, 2024

/test pull-integration-ddl-test

@dbsid
Copy link
Contributor Author

dbsid commented Sep 21, 2024

/test unit-test

Copy link

tiprow bot commented Sep 21, 2024

@dbsid: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test fast_test_tiprow
  • /test tidb_parser_test

Use /test all to run all jobs.

In response to this:

/test pull-integration-ddl-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link

tiprow bot commented Sep 21, 2024

@dbsid: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test fast_test_tiprow
  • /test tidb_parser_test

Use /test all to run all jobs.

In response to this:

/test unit-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@lance6716
Copy link
Contributor

/retest

Copy link

tiprow bot commented Sep 21, 2024

@dbsid: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
fast_test_tiprow 8869768 link true /test fast_test_tiprow

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@lance6716
Copy link
Contributor

/retest

1 similar comment
@purelind
Copy link
Contributor

/retest

@ti-chi-bot ti-chi-bot bot merged commit 2eb4dc8 into pingcap:master Sep 21, 2024
23 of 24 checks passed
winoros pushed a commit to winoros/tidb that referenced this pull request Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm ok-to-test Indicates a PR is ready to be tested. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
8 participants