add: refactoring, attributes to parse linear trees as well #50

Merged
merged 5 commits into from
May 26, 2023

Conversation

YYYasin19
Contributor

@YYYasin19 YYYasin19 commented May 16, 2023

This is a new PR that incorporates some of the changes in #15.
The other PR can stay open (maybe as a draft) for reference, though I'm not optimistic about its chances of success, since lzma seems to be doing most of the optimization already.


changes here include

  • some light refactoring
  • start on supporting linear trees in lgbm
  • (maybe) some refactoring on the benchmarks

@github-actions

github-actions bot commented May 16, 2023

(benchmark 5088954335 / attempt 1)
Base results / Our results / Change

| Model | Size | Dump Time | Load Time |
|---|---|---|---|
| sklearn rf 20M | 20.8 MiB / 3.0 MiB / 6.87 x | 0.02 s / 0.04 s / 2.17 x | 0.02 s / 0.03 s / 1.77 x |
| sklearn rf 20M lzma | 6.5 MiB / 2.0 MiB / 3.26 x | 14.26 s / 1.41 s / 0.10 x | 0.67 s / 0.22 s / 0.33 x |
| sklearn rf 200M | 212.3 MiB / 30.6 MiB / 6.94 x | 0.16 s / 0.38 s / 2.46 x | 0.19 s / 0.34 s / 1.75 x |
| sklearn rf 200M lzma | 47.4 MiB / 14.6 MiB / 3.24 x | 117.01 s / 21.14 s / 0.18 x | 5.19 s / 1.71 s / 0.33 x |
| sklearn rf 1G | 1157.5 MiB / 166.8 MiB / 6.94 x | 1.02 s / 1.87 s / 1.83 x | 1.17 s / 1.60 s / 1.36 x |
| sklearn rf 1G lzma | 258.1 MiB / 98.1 MiB / 2.63 x | 616.50 s / 129.68 s / 0.21 x | 28.34 s / 10.27 s / 0.36 x |
| sklearn gb 2M | 2.2 MiB / 1.1 MiB / 2.08 x | 0.04 s / 0.35 s / 8.48 x | 0.04 s / 0.20 s / 4.48 x |
| sklearn gb 2M lzma | 0.6 MiB / 0.2 MiB / 3.80 x | 1.20 s / 0.54 s / 0.45 x | 0.11 s / 0.17 s / 1.61 x |
| lgbm gbdt 2M | 2.6 MiB / 1.0 MiB / 2.78 x | 0.12 s / 0.31 s / 2.64 x | 0.01 s / 0.18 s / 12.26 x |
| lgbm gbdt 2M lzma | 0.9 MiB / 0.5 MiB / 1.90 x | 1.98 s / 0.65 s / 0.33 x | 0.09 s / 0.22 s / 2.35 x |
| lgbm gbdt 5M | 5.3 MiB / 1.9 MiB / 2.81 x | 0.23 s / 0.61 s / 2.67 x | 0.03 s / 0.37 s / 12.00 x |
| lgbm gbdt 5M lzma | 1.7 MiB / 0.8 MiB / 1.96 x | 4.85 s / 1.38 s / 0.28 x | 0.18 s / 0.45 s / 2.45 x |
| lgbm gbdt 20M | 22.7 MiB / 7.6 MiB / 3.00 x | 0.91 s / 2.48 s / 2.74 x | 0.13 s / 1.50 s / 11.41 x |
| lgbm gbdt 20M lzma | 6.3 MiB / 3.0 MiB / 2.09 x | 25.14 s / 6.30 s / 0.25 x | 0.73 s / 1.77 s / 2.43 x |
| lgbm gbdt 100M | 101.1 MiB / 33.0 MiB / 3.06 x | 3.96 s / 11.09 s / 2.80 x | 0.62 s / 147.74 s / 239.18 x |
| lgbm gbdt 100M lzma | 25.6 MiB / 10.6 MiB / 2.41 x | 112.24 s / 30.10 s / 0.27 x | 2.97 s / 142.08 s / 47.86 x |
| lgbm rf 10M | 10.9 MiB / 3.2 MiB / 3.46 x | 0.46 s / 0.83 s / 1.82 x | 0.05 s / 0.67 s / 12.74 x |
| lgbm rf 10M lzma | 0.7 MiB / 0.4 MiB / 1.85 x | 2.47 s / 1.15 s / 0.47 x | 0.14 s / 0.68 s / 4.72 x |
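
For readers who want to reproduce numbers of this kind locally, here is a minimal sketch of how size, dump time, and load time could be measured with and without lzma. It is not the benchmark script behind the table above; the helper names `measure_dump` and `measure_load` are hypothetical, and a plain Python list stands in for a real model.

```python
import lzma
import pickle
import time


def measure_dump(obj, compress=False):
    """Pickle obj (optionally lzma-compressed); return (payload, seconds)."""
    start = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    if compress:
        payload = lzma.compress(payload)
    return payload, time.perf_counter() - start


def measure_load(payload, compressed=False):
    """Restore an object from payload; return (obj, seconds)."""
    start = time.perf_counter()
    if compressed:
        payload = lzma.decompress(payload)
    obj = pickle.loads(payload)
    return obj, time.perf_counter() - start


# Stand-in for a model: a repetitive list of floats (an assumption for the demo).
data = [float(i % 100) for i in range(50_000)]

raw_payload, raw_dump_s = measure_dump(data)
lzma_payload, lzma_dump_s = measure_dump(data, compress=True)
restored, lzma_load_s = measure_load(lzma_payload, compressed=True)

print(f"raw:  {len(raw_payload)} bytes, dump {raw_dump_s:.4f} s")
print(f"lzma: {len(lzma_payload)} bytes, dump {lzma_dump_s:.4f} s, load {lzma_load_s:.4f} s")
```

Timings from a single run like this are noisy; repeating each measurement and taking the median would give more stable figures.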

@YYYasin19
Contributor Author

YYYasin19 commented May 16, 2023

According to this file in the LightGBM code, the attributes required for linear trees are:

  • leaf_const
  • num_features
  • leaf_features
  • leaf_coeff

all of which we should have covered :)
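
To illustrate how the four attributes listed above fit together, here is a rough sketch that parses them from a tree block and evaluates one linear leaf. It rests on assumptions not stated in this thread: that each attribute appears as a `key=value` line with space-separated values, that `num_features` holds one count per leaf, and that `leaf_features` and `leaf_coeff` are flat concatenations of the per-leaf lists. The function names are hypothetical and this is not the code from `slim_trees/lgbm_booster.py`; missing-value handling is left out.

```python
def split_per_leaf(flat, counts):
    """Split a flat list into consecutive chunks with the given sizes."""
    if sum(counts) != len(flat):
        raise ValueError("per-leaf counts do not match the number of values")
    chunks, start = [], 0
    for n in counts:
        chunks.append(flat[start:start + n])
        start += n
    return chunks


def parse_linear_attrs(tree_block):
    """Parse the linear-tree attributes from 'key=value' lines (assumed layout)."""
    fields = {}
    for line in tree_block.strip().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.split()  # split() tolerates repeated spaces
    counts = [int(v) for v in fields["num_features"]]
    return {
        "leaf_const": [float(v) for v in fields["leaf_const"]],
        "num_features": counts,
        "leaf_features": split_per_leaf([int(v) for v in fields["leaf_features"]], counts),
        "leaf_coeff": split_per_leaf([float(v) for v in fields["leaf_coeff"]], counts),
    }


def predict_leaf(attrs, leaf, x):
    """Linear leaf output: constant plus coefficient-weighted feature values."""
    pairs = zip(attrs["leaf_features"][leaf], attrs["leaf_coeff"][leaf])
    return attrs["leaf_const"][leaf] + sum(coeff * x[feat] for feat, coeff in pairs)


# Made-up example block with two leaves.
block = """
leaf_const=0.5 1.0
num_features=2 1
leaf_features=0 3  1
leaf_coeff=2.0 -1.0  0.5
"""
attrs = parse_linear_attrs(block)
row = [1.0, 2.0, 0.0, 4.0]
print(predict_leaf(attrs, 0, row), predict_leaf(attrs, 1, row))
```

Splitting by the per-leaf counts, rather than relying on whitespace to separate leaves, keeps the sketch robust when a leaf has no linear terms at all.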

@YYYasin19 YYYasin19 marked this pull request as ready for review May 16, 2023 16:43
@YYYasin19 YYYasin19 requested a review from pavelzw as a code owner May 16, 2023 16:43
Contributor

@jonashaag jonashaag left a comment

Thanks!

slim_trees/lgbm_booster.py (outdated, resolved)
slim_trees/lgbm_booster.py (outdated, resolved)
tests/test_lgbm_compression.py (outdated, resolved)
Member

@pavelzw pavelzw left a comment

Thanks @YYYasin19!

@pavelzw pavelzw merged commit 10320b5 into main May 26, 2023
@pavelzw pavelzw deleted the lgbm-refactoring branch May 26, 2023 08:46
@pavelzw pavelzw added the enhancement New feature or request label May 27, 2023
Labels
enhancement New feature or request
3 participants