@mcr229 - Is it possible to skip the default constructor (i.e. xnn_create_weights_cache_with_size) and write a custom constructor that populates struct xnn_weights_cache_provider with your own methods?
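To make the suggestion above concrete, here is a minimal, self-contained sketch of the general pattern: a provider struct holding a context pointer and callbacks, filled in by a custom constructor. Everything prefixed with `sketch_` is hypothetical and does not reproduce the real layout of struct xnn_weights_cache_provider; keying entries by the address of the unpacked weights is also just an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for XNN_CACHE_NOT_FOUND (hypothetical value). */
#define SKETCH_NOT_FOUND SIZE_MAX
#define SKETCH_MAX_ENTRIES 8

/* Hypothetical key: identifies unpacked weights by pointer identity. */
struct sketch_key {
  const void* kernel;
};

/* Hypothetical provider: a context plus callbacks.
 * This is NOT the real xnn_weights_cache_provider definition. */
struct sketch_provider {
  void* context;
  size_t (*look_up)(void* context, const struct sketch_key* key);
  size_t (*insert)(void* context, const struct sketch_key* key, size_t offset);
};

/* Backing store for the custom callbacks: a tiny fixed-size table. */
struct sketch_table {
  struct sketch_key keys[SKETCH_MAX_ENTRIES];
  size_t offsets[SKETCH_MAX_ENTRIES];
  size_t count;
};

/* Custom look-up: returns the stored offset, or SKETCH_NOT_FOUND on a miss. */
static size_t sketch_look_up(void* context, const struct sketch_key* key) {
  const struct sketch_table* table = context;
  for (size_t i = 0; i < table->count; i++) {
    if (table->keys[i].kernel == key->kernel) {
      return table->offsets[i];
    }
  }
  return SKETCH_NOT_FOUND;
}

/* Custom insert: records where the packed weights for this key live. */
static size_t sketch_insert(void* context, const struct sketch_key* key, size_t offset) {
  struct sketch_table* table = context;
  if (table->count == SKETCH_MAX_ENTRIES) {
    return SKETCH_NOT_FOUND;  /* table full */
  }
  table->keys[table->count] = *key;
  table->offsets[table->count] = offset;
  table->count++;
  return offset;
}

/* Custom "constructor": wires the callbacks above into the provider. */
void sketch_provider_init(struct sketch_provider* provider, struct sketch_table* table) {
  memset(table, 0, sizeof(*table));
  provider->context = table;
  provider->look_up = sketch_look_up;
  provider->insert = sketch_insert;
}
```

With this shape, a first `look_up` for a given key misses, and after an `insert` the same key resolves to the stored offset. Whether XNNPACK's operators would honor such a provider end to end is something to verify against the actual headers.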
I've recently been experimenting with XNNPACK's weight cache to reduce load time by caching packed weights and also reduce memory pressure for repeated weights across the same kernels.
I was experimenting with the fully-connected operator and found that the weights cache was never being hit. I noticed that when using the APIs to create the xnn_weights_cache_t, we set the look-up function to be xnn_internal_weights_cache_look_up (XNNPACK/src/runtime.c, line 148 in 85071b8).
Looking at this function, it looks like a placeholder that always returns XNN_CACHE_NOT_FOUND; its only comment says that the default implementation does not support this query (XNNPACK/src/cache.c, lines 491 to 496 in 85071b8).
Now, when I use the weights cache to create a runtime_t with only a fully-connected operator, the operator's creation flow looks up the cache to see whether the weights have been packed before, using xnn_weights_cache_look_up (XNNPACK/src/operators/fully-connected-nc.c, lines 154 to 157 in 85071b8).
However, this just uses the placeholder function above, returning XNN_CACHE_NOT_FOUND (XNNPACK/src/cache.c, lines 530 to 534 in 85071b8).
As a result, every look-up returns XNN_CACHE_NOT_FOUND, so the weights have to be repacked and memory has to be allocated for the newly packed weights each time (XNNPACK/src/operators/fully-connected-nc.c, lines 159 to 179 in 85071b8).
Am I looking at this incorrectly? Or is this a feature that is still a WIP? Or is this a bug that is meant to be fixed in the future?
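For anyone who wants to reason about the behavior without building XNNPACK, the flow described above can be mimicked with a small stand-alone sketch. All `mock_` names are hypothetical and only model the look-up step before packing; they do not reproduce XNNPACK's code or any insertion logic that may run after packing.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for XNN_CACHE_NOT_FOUND (hypothetical value). */
#define MOCK_CACHE_NOT_FOUND SIZE_MAX

/* Hypothetical miniature of a weights cache: one look-up hook plus a counter. */
struct mock_weights_cache {
  size_t (*look_up)(const void* kernel);
  int pack_count;  /* how many times weights were packed */
};

/* Mirrors the placeholder described in the issue: every query is a miss. */
static size_t mock_default_look_up(const void* kernel) {
  (void) kernel;
  return MOCK_CACHE_NOT_FOUND;
}

/* Mimics operator creation: look up first, pack only on a miss. */
static void mock_create_fully_connected(struct mock_weights_cache* cache,
                                        const void* kernel) {
  const size_t offset = cache->look_up(kernel);
  if (offset == MOCK_CACHE_NOT_FOUND) {
    cache->pack_count++;  /* stands in for "allocate and pack the weights" */
  }
}

/* Creates the same operator n times against one cache and
 * reports how many times packing ran. */
int mock_count_packs(int n) {
  static const float kernel[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  struct mock_weights_cache cache = {mock_default_look_up, 0};
  for (int i = 0; i < n; i++) {
    mock_create_fully_connected(&cache, kernel);
  }
  return cache.pack_count;
}
```

Because the default hook never reports a hit, `mock_count_packs(3)` packs three times even though the weights are identical on every call, which matches the symptom reported here.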