[tools] Implement temporary block quantization tool #13830

Open
hseok-oh opened this issue Aug 29, 2024 · 1 comment

What

Let's implement and maintain a weight quantization tool for the Q4_0 and Q8_0 data types, to be used for making test and example models.
It is a temporary tool and is not intended for the compiler module implementation.

Why

To help development of onert's LLM support feature, we need a tool that generates weight-block-quantized models from fp32 circle test models.
It will also help with the PoC for updating the circle schema to support LLM models.

  • Type
    • Q4_0
    • Q8_0
  • Target operand
    • Gather's params
    • FullyConnected's weight
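
For reference, here is a minimal sketch of what block quantization to these two types could look like. It assumes that Q4_0 and Q8_0 refer to the ggml-style formats (blocks of 32 weights sharing one fp16 scale), which the issue does not state explicitly. All function names here are hypothetical and are not taken from the tool in this repository. The row-wise splitting in `quantize_rows` is also an assumption about how Gather's params and FullyConnected's weight would be blocked.

```python
import struct

BLOCK_SIZE = 32  # assumption: ggml-style block of 32 consecutive weights


def to_fp16(x):
    """Round a float to the nearest fp16 value (the scale is assumed to be stored as fp16)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_q8_0(block):
    """Return (scale, codes) for one block; codes are signed 8-bit values."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("block must hold BLOCK_SIZE values")
    amax = max(abs(v) for v in block)
    scale = to_fp16(amax / 127.0)
    if scale == 0.0:
        return 0.0, [0] * BLOCK_SIZE
    # Clamp because rounding the scale to fp16 can push a ratio slightly past 127.
    return scale, [max(-127, min(127, round(v / scale))) for v in block]


def dequantize_q8_0(scale, codes):
    return [scale * c for c in codes]


def quantize_q4_0(block):
    """Return (scale, codes) for one block; codes are unsigned 4-bit values (0..15)."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("block must hold BLOCK_SIZE values")
    # Take the element with the largest magnitude, keeping its sign,
    # and map it to code 0 (that is, to -8 after removing the offset of 8).
    vmax = max(block, key=abs)
    scale = to_fp16(vmax / -8.0)
    if scale == 0.0:
        return 0.0, [8] * BLOCK_SIZE
    return scale, [min(15, max(0, int(v / scale + 8.5))) for v in block]


def dequantize_q4_0(scale, codes):
    return [scale * (c - 8) for c in codes]


def quantize_rows(weights, quantize_block):
    """Quantize a 2-D weight (list of rows) block by block along each row."""
    out = []
    for row in weights:
        if len(row) % BLOCK_SIZE != 0:
            raise ValueError("row length must be a multiple of BLOCK_SIZE")
        out.append([quantize_block(row[i:i + BLOCK_SIZE])
                    for i in range(0, len(row), BLOCK_SIZE)])
    return out
```

Under the same assumption, one block would occupy 2 + 32 = 34 bytes for Q8_0 and 2 + 16 = 18 bytes for Q4_0 once two 4-bit codes are packed per byte. The byte layout actually used in the circle model may differ.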
@hseok-oh hseok-oh self-assigned this Aug 29, 2024

hseok-oh commented Aug 29, 2024

Tool: #13758
It will not be merged.

@hseok-oh hseok-oh added this to the ONERT LLM Milestone 1 milestone Sep 2, 2024