
[ Tensor ] more types of Tensors are required. #2733

Open
EunjuYang opened this issue Sep 12, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@EunjuYang
Contributor

Our awesome TensorV2 has been successfully activated.
As of September 12th, 2024, we support the following tensor types:

FloatTensor : fp32
HalfTensor : fp16
CharTensor : qint8
ShortTensor : uint16

To further expand our on-device training capabilities, we also need tensors backed by unsigned 32-bit integers (uint32) and unsigned 8-bit integers (uint8).
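One natural way to cover the new unsigned types is a single class template instantiated per element type, in the spirit of generalizing the existing ShortTensor (uint16). The sketch below is purely illustrative; the names (UIntTensor, setValue, getValue) are assumptions, not the actual nntrainer API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: one template covering uint8 / uint16 / uint32
// element types instead of one hand-written class per type.
template <typename T> class UIntTensor {
public:
  explicit UIntTensor(std::size_t len) : data_(len, T{0}) {}
  void setValue(std::size_t i, T v) { data_[i] = v; }
  T getValue(std::size_t i) const { return data_[i]; }
  std::size_t size() const { return data_.size(); }

private:
  std::vector<T> data_;
};

// The three requested tensor types fall out as aliases.
using UInt8Tensor = UIntTensor<uint8_t>;
using UInt16Tensor = UIntTensor<uint16_t>;
using UInt32Tensor = UIntTensor<uint32_t>;
```

With this shape, adding a further unsigned width later is a one-line alias rather than a new class.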

@EunjuYang EunjuYang added the enhancement New feature or request label Sep 12, 2024
@taos-ci
Collaborator

taos-ci commented Sep 12, 2024

:octocat: cibot: Thank you for posting issue #2733. The person in charge will reply soon.

EunjuYang added a commit to EunjuYang/nntrainer that referenced this issue Sep 13, 2024
- This commit resolves nnstreamer#2733
- This commit implements a UIntTensor template based on the ShortTensor class.
- Based on the template, this commit supports UInt8 / UInt16 / UInt32
- Implement UIntTensor template
- Implement UInt8Tensor / UInt16Tensor / UInt32Tensor (ShortTensor is replaced)

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
@skykongkong8
Member

skykongkong8 commented Sep 23, 2024

Thanks for pointing this issue out!

On top of that, I believe we should discuss the format of the quantization parameters, along with which quantization method we are going to use.

For example, quantization parameters like the zero point and scale factor are NYI (not yet implemented) as far as I know.
They might be given as a scalar value, a channel-wise vector, or even a row- / column-wise matrix (in a 4-D tensor).
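For the scalar case, the standard affine scheme maps a float x to q = clamp(round(x / scale) + zero_point). A minimal sketch for uint8, assuming this common formulation (the scheme nntrainer will actually adopt is still under discussion in this thread):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Affine quantization with a scalar scale factor and zero point.
// q = clamp(round(x / scale) + zero_point, 0, 255)
inline uint8_t quantize_u8(float x, float scale, uint8_t zero_point) {
  int q = static_cast<int>(std::lround(x / scale)) + zero_point;
  return static_cast<uint8_t>(std::clamp(q, 0, 255));
}

// Inverse mapping back to float: x ≈ (q - zero_point) * scale
inline float dequantize_u8(uint8_t q, float scale, uint8_t zero_point) {
  return (static_cast<int>(q) - zero_point) * scale;
}
```

The channel-wise-vector and matrix variants mentioned above would carry one (scale, zero_point) pair per slice instead of a single scalar.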

EunjuYang added a commit to EunjuYang/nntrainer that referenced this issue Sep 23, 2024
- This commit resolves nnstreamer#2733
- This commit implements a UIntTensor template based on the ShortTensor class.
- Based on the template, this commit supports UInt8 / UInt16 / UInt32
- Implement UIntTensor template
- Implement UInt8Tensor / UInt16Tensor / UInt32Tensor (ShortTensor is replaced)
- Unit tests for UInt8 and UInt32 are added

The `uint_tensor.cpp` file is written to be imported into `uint_tensor.h`. Since `uint_tensor.h` declares the class prototype as a template, all of its implementation must be visible from one file. However, for better readability, I split the code into `uint_tensor.cpp`. For this reason, that file should not be compiled on its own but only used for inclusion.

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
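The header-inclusion pattern described in that commit message works because template member definitions must be visible at the point of instantiation. A self-contained sketch, with both "files" inlined here for illustration (the real split would place the two halves in `uint_tensor.h` and `uint_tensor.cpp`):

```cpp
#include <cassert>
#include <cstdint>

// --- uint_tensor.h: class template declaration only ---
template <typename T> class UIntTensor {
public:
  explicit UIntTensor(T v);
  T value() const;

private:
  T v_;
};

// --- uint_tensor.cpp: member definitions, #include'd at the end of the
// header rather than compiled as a separate translation unit, purely for
// readability ---
template <typename T> UIntTensor<T>::UIntTensor(T v) : v_(v) {}
template <typename T> T UIntTensor<T>::value() const { return v_; }
```

Because the definition file is pulled in by the header, compiling it separately would only duplicate the template bodies; excluding it from the build targets, as the commit message notes, is the right call.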
@EunjuYang
Contributor Author

Great point!
We may consider the quantization scheme as well [ref]. That reference uses different quantization parameters depending on the scheme: either a simple float / int, or a list of floats, a list of ints, and an axis.
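The list-plus-axis variant corresponds to per-axis (per-channel) quantization: one scale and zero point per slice along a chosen axis. A hedged sketch, with illustrative names that are not an nntrainer API:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-axis quantization parameters: one (scale, zero_point) pair per slice
// along `axis`, matching the "list of floats, list of ints, and axis" scheme.
struct PerAxisQParams {
  std::vector<float> scales;
  std::vector<int32_t> zero_points;
  int axis; // tensor dimension the lists index into
};

// Quantize a value belonging to slice `c` along the quantized axis.
inline uint8_t quantize_at(float x, const PerAxisQParams &p, std::size_t c) {
  int q = static_cast<int>(std::lround(x / p.scales[c])) + p.zero_points[c];
  if (q < 0)
    q = 0;
  if (q > 255)
    q = 255;
  return static_cast<uint8_t>(q);
}
```

The same value can land on different quantized codes depending on its channel, which is the whole point of carrying a list instead of a scalar.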

@skykongkong8
Member

skykongkong8 commented Sep 24, 2024

I was planning to bring up this issue after implementing a draft version of a qint-using GEMM algorithm, but here is an interim one that I am using:

template <typename T> struct ScalarIOQuantInfo {
  ...
  // input quantization parameters
  T activation_zero_point;
  float activation_scale_factor;

  // input information to compute on-the-fly (dynamic) quantization during GEMM
  T activation_min_val;
  T activation_max_val;

  // output information to compute static quantization during GEMM
  T output_zero_point;
  float output_scale_factor;

  // actual quantization parameters to quantize the GEMM output into u8
  T resultant_zero_point;
  float resultant_scale_factor;
  ...
};

template <typename T> struct VectorIOQuantInfo {
  ...
  /*
   Information required from the kernel side:
   0. a list of T or float, obviously
   1. axis (per channel? row? column?)
   2. vector length (maybe using std::vector<T> will do)
   3. quantization algorithm (AFAIK some add the zero point before multiplying
      by the scale factor, and some do vice versa)
  */
  ...
};

...

typedef ScalarIOQuantInfo<uint8_t> QInfoU8;
typedef ScalarIOQuantInfo<int8_t> QInfoS8;

Please let me know if you have any suggestions or feedback.
Also, just an idea: many projects use u8, s8, u16, s16 as abbreviations for unsigned/signed int8/int16. I think that terminology is more intuitive, because I find the term qint8 quite unclear. Does it mean signed int8? Or a signed int8 tensor with quantization parameters? And if so, does a uint16 tensor have qParams too?
