[Dev] Disable Block reduction for int8 by default #140

LeiWang1999 · 2024-08-13T07:28:55Z

This pull request updates a submodule and adds a conditional check in the bitblas module to improve functionality and compatibility.

Submodule Update:

3rdparty/tvm: Updated the submodule commit to the latest version.

Code Enhancement:

bitblas/gpu/matmul_analysis.py: Modified the check_last_trait function so that, when it checks for the dequantize_info attribute, only the float16 data type is supported.
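
For readers who want a concrete picture of this kind of check, here is a minimal sketch of a dtype gate on the dequantize_info attribute. It is not the code from this PR: the helper name `passes_dequantize_dtype_gate`, its parameters, and the use of a plain mapping for function attributes are all assumptions made for illustration. The description does not spell out how the check relates to the block reduction named in the title, so the sketch shows only the gate itself.

```python
from typing import Any, Mapping


def passes_dequantize_dtype_gate(attrs: Mapping[str, Any], in_dtype: str) -> bool:
    """Hypothetical helper (not BitBLAS's actual code).

    Returns True only when the function attributes carry `dequantize_info`
    and the input dtype is float16, mirroring the condition described above.
    """
    # Without dequantize_info there is nothing to gate in this sketch.
    if "dequantize_info" not in attrs:
        return False
    # Assumed rule: float16 passes, every other dtype (e.g. int8) does not.
    return in_dtype == "float16"


if __name__ == "__main__":
    attrs = {"dequantize_info": {}}  # placeholder attribute payload
    print(passes_dequantize_dtype_gate(attrs, "float16"))
    print(passes_dequantize_dtype_gate(attrs, "int8"))
```

Under these assumptions, an int8 input fails the gate even when dequantize_info is present, which fits the "by default" wording in the title.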

LeiWang1999 added 30 commits July 5, 2024 08:54

Refactor BatchMatMulEmitter and BatchMatMulSelector for improved readability and maintainability

d8884e6

Refactor import statements for improved readability and maintainability

fc84173

Refactor import statements for improved readability and maintainability

02f64de

disable failure email for ci

397eee6

remove email notifications.

20f6ad1

move relax pass from testing to mlc_llm

b93c394

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

ba6a6df

Refactor scripts with se check_eual_ref_scripts_with_emitter function

257693a

Lint Fix

9bb7f49

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

39e7614

Refactor scripts with se check_eual_ref_scripts_with_emitter function

93eb5a5

bug fix in test

aa66a90

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

ae14a53

lint fix.

79b08e4

test cuda i4 kernel

86fd036

Refactor copyright notice in i4matmul.hpp

6b73a21

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

0ba90c1

Refactor BitBLASLinear test module for improved readability and maintainability

086d208

refactor test as version below python 3.9 cannot handle int32 overflow.

47a3abd

format lint for test

024b247

Refactor test_int4b_fp16_convert.py for improved readability and maintainability

bfedeaa

remove unused design file

e672a23

move tile device from package to base

21e5430

dummy impl for codegen

fd11940

Refactor file structure for ladder_permutate module

9ccfa85

Refactor backend class and fix typos in comments

7c7d73e

Deep refactor Lib related code.

47d5fc5

remove ci pull.

53dd0dd

LintFix

d58ac43

refactor builder for whl build

37cb07c

LeiWang1999 and others added 29 commits August 6, 2024 17:32

lint fix

907d434

case fix

e7c805a

bug fix

7a16e5a

fix for legalize

8c666ac

bug fix

ce79943

chore: Clear global operator cache before running tests

e6c5eec

revert optimize_stratety into SingleBatchDecodeOnly

4e8ab23

typofix

8177e2f

update benchmark scripts

113e485

chore: Refactor benchmark scripts and fix typos

46b2d7d

fix for testing

9ecbb48

lint fix

fb128ed

fix import.

9b5e106

typo

54b4044

operator benchmark

23f88b7

optimize

bcffc49

always with shared.dyn

8b5f083

optimize cache.

54b5d3f

dsl fix

4289d7b

Merge branch 'main' into update_transform

72a1863

tqdm

2327e14

Merge branch 'update_transform' of https://github.com/LeiWang1999/MSBitBLAS into update_transform

40a9b04

chore: Add serialize_results method to benchmark_matmul_strategies.py

a16ee62

fix performance issue for dynamic async copy

63fc654

chore: Refactor benchmark_matmul_strategies.py for improved performance and code readability

d37d09f

bug fix

d4d4603

update readme

1816526

disable block reduce for int8

b81a3a8

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into update_transform

9cde970

LeiWang1999 merged commit fcf3f63 into microsoft:main Aug 13, 2024
6 checks passed
