Releases · ROCm/MIOpen

08 Jul 17:30

daniellowell

2.0.0

326bf22

MIOpen v2.0.0

Notes:

This release contains several new features including an immediate mode for selecting convolutions, bfloat16 support, new layers, modes, and algorithms.
MIOpenDriver, a tool for benchmarking and developing kernels, is now shipped with MIOpen.
BFloat16 is now supported in HIP and requires an updated rocBLAS as a GEMM backend.
Immediate mode API now provides the ability to quickly obtain a convolution kernel.
MIOpen now contains HIP source kernels and implements the ImplicitGEMM kernels. This is a new feature and is currently disabled by default. Use the environment variable "MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=1" to activate this feature. ImplicitGEMM requires an up-to-date HIP version of at least 1.5.9211.
A new "loss" category of layers has been added, of which CTC loss is the first. See the API reference for more details.
2.0 is the last release of active support for gfx803 architectures. In future releases, MIOpen will not actively debug and develop new features specifically for gfx803.
The System Find-Db in-memory cache is disabled by default. Please see the build instructions to enable this feature.
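
As a minimal sketch of the ImplicitGEMM opt-in described above, the Python snippet below builds a child-process environment with the variable set and checks a HIP version string against the stated minimum. The helper names (implicit_gemm_env, hip_version_at_least) are hypothetical and not part of MIOpen, and the purely numeric dotted version format is an assumption.

```python
import os

MIN_HIP_VERSION = "1.5.9211"  # minimum HIP version stated in the notes above


def implicit_gemm_env(base=None):
    """Return a copy of the environment with ImplicitGEMM switched on."""
    env = dict(os.environ if base is None else base)
    env["MIOPEN_DEBUG_CONV_IMPLICIT_GEMM"] = "1"
    return env


def hip_version_at_least(version, minimum=MIN_HIP_VERSION):
    """Compare dotted numeric versions component by component.

    Assumption: the version string contains only digits and dots.
    """
    def parse(v):
        return tuple(int(part) for part in v.split("."))
    return parse(version) >= parse(minimum)
```

The returned dictionary can be passed as the env argument of subprocess.run when launching an application that links against MIOpen, so the setting does not leak into the parent shell.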

Changes:

Added support for bfloat16 datatype in convolutions
Added softmax channel mode and new softmax version 2 API
Added fast / accurate / log softmax algorithms
Added new implicit GEMM convolution algorithm for forward and backwards data passes, disabled by default
Added int32 datatype support for output tensors in int8 convolutions
Added immediate mode for finding the best convolution kernel for a given configuration
Added a Find-Db infrastructure which stashes results of find on a user's system
Added a shipped System Find-Db containing offline run Find() results
Added an additional, faster batch norm assembly kernel for fp16
Added CTC loss layer
Added MIOpenDriver as a default component in MIOpen's build #34
Fixed C compatibility for boolean types in C API #103
Fixed incorrect calculation in per-activation batch norm backwards pass #104
Fixed bug #95 with asm batch norm ISA
Fixed IsApplicable bug in Conv3x3Asm for group convolutions
Improved performance of 1x1 stride 2 fp32 convolutions in the forward and backwards data passes
Improved 3-D convolution stability
Improved applicability of direct convolution backwards weights for 2x2, 5x10, and 5x20 filter sizes
Improved maintainability in kernels and cpp code
Updated rocBLAS minimum version to branch master-rocm-2.6
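
To illustrate the fast / accurate / log softmax algorithms listed above, here is a reference sketch in plain Python. It assumes that "fast" means the direct exp/sum formula, "accurate" means subtracting the maximum before exponentiating, and "log" means the logarithm of the softmax; these are textbook formulas, not MIOpen's kernels, and the function names are hypothetical.

```python
import math


def softmax_fast(xs):
    """Direct exp / sum; overflows for large inputs."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_accurate(xs):
    """Subtract the maximum first so every exponent is <= 0."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(xs):
    """log(softmax(x)) computed as x - max - log(sum(exp(x - max)))."""
    m = max(xs)
    log_total = math.log(sum(math.exp(x - m) for x in xs))
    return [x - m - log_total for x in xs]
```

The trade-off is one extra pass over the data to find the maximum in exchange for results that stay finite on large inputs.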

03 May 22:01

daniellowell

1.8.1

0bce818

MIOpen v1.8.1

Notes:

This release contains minor bug fixes and additional performance database improvements.

Changes:

Fixed accuracy issue with backwards weights
Fixed issue with name parsing for newer architectures
Added narrow workaround for 5x10 and 5x20 filter performance regression
Improved support in performance database for Radeon VII

12 Apr 04:33

daniellowell

1.8.0

917304e

MIOpen v1.8.0

Notes:

This release contains full 3-D convolution support and int8 support for inference.
Additionally, there are major updates in the performance database for major models including those found in Torchvision.
An assortment of bugs have been resolved in this release.

Changes:

Fixed various issues in assembly kernels
Fixed issues #92 and #79 for miopenOpTensor
Fixed issue #88 for bzip2
Fixed issue #77 algorithm mismatch
Added Winograd support for fp32 backwards weights
Added pooling inclusive mode
Added tuning for direct group convolution algorithms
Added additional kernel support for group convolutions
Added API for 3-D convolutions
Added support for int8 inference convolutions
Added integer selection for pooling indexing
Added minimum dependencies support
Added RNN fp16 support on the MIOpen-HIP backend
Added 1x1 convolution + bias + activation fusions
Added workaround for issue #84 GPU memory access fault
Added performance tuning for direct backwards weights
Improved performance database coverage
Improved internal quality by reducing redundant code
Improved build instructions in README.md
Improved performance database coverage for fusions
Updated Docker components and requirements
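
The pooling inclusive mode listed above can be pictured with a small 1-D sketch. It assumes that "inclusive" means zero-padded positions count toward the averaging divisor, while the non-inclusive average divides only by the number of real elements in the window. The function avg_pool_1d is hypothetical, uses stride 1, and expects pad to be smaller than window.

```python
def avg_pool_1d(xs, window, pad, inclusive):
    """Stride-1 average pooling over a zero-padded 1-D sequence.

    Assumption: pad < window, so every window covers at least one
    real element.
    """
    n = len(xs)
    out = []
    for start in range(-pad, n + pad - window + 1):
        # Padded positions contribute zero, so only real elements are summed.
        valid = [xs[i] for i in range(start, start + window) if 0 <= i < n]
        divisor = window if inclusive else len(valid)
        out.append(sum(valid) / divisor)
    return out
```

For xs = [2.0, 4.0] with window 2 and pad 1, the two modes differ only at the borders, where the window overlaps the padding.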

Known Issues:

RNNs do not support fp16 on the MIOpen-OpenCL backend
OpenCL backend does not support GEMM convolutions in fp16

06 Feb 16:01

daniellowell

1.7.1

6054829

MIOpen v1.7.1

Notes:

This release contains minor bug fixes and performance improvements.

Changes:

Fixed corrupt and obsolete performance database entries
Fixed issue #70
Fixed issue #72
Fixed issue #77
Removed default dependency of RNNs on rocBLAS
Added a workaround for softmax fp16 correctness issue
Added check to only make MIOpen with static boost libraries
Improved performance database coverage

Known Issues:

RNNs do not support fp16
OpenCL backend does not support GEMM convolutions in fp16
Layer fusions for convolution 1x1 fp16 are not supported
Layer fusions for large image 1x1 convolutions may cause an exception instead of a warning during compile phase if plan is not supported

19 Dec 18:24

daniellowell

1.7.0

7cb5f5f

MIOpen v1.7.0

Notes:

This release contains general bug fixes and an updated performance database
Group convolutions backwards weights performance has been improved
Logging across the library has been improved
Performance database has been updated

Changes:

Fixed logging issues with group convolution and pooling
Fixed sphinx version issue in document generation
Fixed issues with corrupt entries in performance database
Removed external dependency on libSSL and libCrypto
Added support for large image backwards weights in direct convolution
Added fp16 support for RNNs on the HIP backend
Improved performance database coverage

Known Issues:

RNNs do not support fp16
OpenCL backend does not support GEMM convolutions in fp16
Layer fusions for convolution 1x1 fp16 are not supported
Layer fusions for large image 1x1 convolutions may cause an exception instead of a warning during compile phase if plan is not supported

19 Nov 03:16

daniellowell

1.6.0

ffedda8

MIOpen v1.6.0

Notes:

Training in fp16 (half precision) including mixed-precision is now fully supported
Batch Normalization in fp16 (half precision) including mixed-precision is now available
Performance improvements for 3x3 and 1x1 single-precision convolutions
Layer fusions for BatchNorm+Activation are now available
Layer fusions with convolutions now support varying strides and padding configurations

Changes:

rocBLAS is now used as the default BLAS library for the HIP backend (minimum version 14.3.0)
Fixed various bugs in convolution kernels
Fixed issues with bad references in layer fusion
Fixed gfx803 assembly issues
Added support for fp16 Winograd convolutions
Added support for fp16 pooling
Improved error reporting for convolutions and layer fusions
Improved documentation

Known Issues:

RNNs do not support fp16
OpenCL backend does not have full fp16 support
Layer fusions for convolution 1x1 fp16 are not supported

14 Sep 23:06

daniellowell

1.5.0

e3fb49c

MIOpen v1.5.0

Notes:

A new kernel fusion API is now available for inference for convolution, bias, batch normalization, and activations.
This release includes new features and bug fixes
Group and Depthwise convolutions are now available
3D Batch Normalization has been implemented for fully packed tensors
Dilation for convolutions has been implemented

Changes:

Fixed bugs in direct convolutions
Fixed issue with paths when $HOME variable is not set
Fixed padding issues with 1x1 convolutions
Added incremental support for fp16
Added fused kernels for Winograd and direct with bias and activations
Added a getting started guide for kernel fusion
Added group and depthwise API for convolutions
Added 3-D batch normalization support with 5-D tensors
Improved max pooling performance
Improved debug and error reporting information
Improved documentation for convolutions

Known Issues:

RNNs do not support fp16
Training with CNNs does not support fp16

30 Jul 18:59

daniellowell

1.4.2

d0ae7a6

MIOpen v1.4.2

Notes:

This release is a hot-fix to enable ICNet and PSPNet

Known Issues:

RNNs do not support fp16
Training with CNNs does not support fp16
Users may encounter a warning that their performance database is out of date. The performance database can be updated by setting the environment variable for just the initial run of an application: MIOPEN_FIND_ENFORCE=search
For more information on the performance database, see: https://rocmsoftwareplatform.github.io/MIOpen/doc/html/perfdatabase.html#
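
The "just the initial run" advice above can be sketched as a small launcher helper in Python. The function first_run_env and its already_tuned flag are hypothetical; how an application records that the first run has happened is left to the caller.

```python
import os


def first_run_env(already_tuned, base=None):
    """Build the environment for launching an MIOpen application.

    On the initial run, MIOPEN_FIND_ENFORCE=search is set so the
    performance database is refreshed; on later runs the variable
    is dropped so the normal behaviour applies.
    """
    env = dict(os.environ if base is None else base)
    if already_tuned:
        env.pop("MIOPEN_FIND_ENFORCE", None)
    else:
        env["MIOPEN_FIND_ENFORCE"] = "search"
    return env
```

Passing the result as the env argument of subprocess.run keeps the setting scoped to that one launch.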

19 Jul 20:49

daniellowell

1.4.1

dd6e79c

MIOpen v1.4.1

Notes:

This release includes a bug fix for 3x3 convolutions
Updated README file configuration instructions

Known Issues:

RNNs do not support fp16
Training with CNNs does not support fp16
Users may encounter a warning that their performance database is out of date. The performance database can be updated by setting the environment variable for just the initial run of an application: MIOPEN_FIND_ENFORCE=search
For more information on the performance database, see: https://rocmsoftwareplatform.github.io/MIOpen/doc/html/perfdatabase.html#

06 Jul 15:05

daniellowell

1.4.0

3afe80a

MIOpen v1.4.0

Notes:

This release includes a number of performance improvements and bug fixes
New features have been added to convolutions for auto-tuning kernels
Activations now have new modes available
Documentation has been updated and corrected

Changes:

Fixed documentation errors
Fixed bug in activations with pass-through mode
Fixed performance database locking issues
Fixed Winograd kernel behavior for stride 2 backwards data
Fixed a bug in OpTensor layer
Fixed a timing issue with batch normalization inline assembly
Fixed issue with an unnecessary binary creation in assembly bug detection
Fixed issue with disk program cache directory not being created
Fixed a bug with convolution+bias
Extended performance database functionality
Added leaky-ReLU, clipped, and exponential-ReLU modes to activation
Added documentation for performance database usage
Added support for 1x1 convolutions with non-zero padding
Added API for printing status codes as strings
Added auto-tuning feature for convolutions
Improved LSTM and GRU backwards pass performance
Improved debug and error reporting information
Improved performance of batch normalization spatial mode
Improved find stage for convolutions
Improved readability for user database file
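
For readers unfamiliar with the activation modes added in this release, the following Python sketch gives scalar reference formulas. It assumes the conventional definitions with a single alpha parameter (slope for leaky-ReLU, upper bound for clipped ReLU, scale for exponential-ReLU); the function names are hypothetical and the code is not taken from MIOpen.

```python
import math


def leaky_relu(x, alpha):
    """x for positive inputs, alpha * x otherwise."""
    return x if x > 0 else alpha * x


def clipped_relu(x, alpha):
    """ReLU with the output capped at alpha."""
    return min(alpha, max(0.0, x))


def elu(x, alpha):
    """x for positive inputs, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)
```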

Known Issues:

RNNs do not support fp16
Training with CNNs does not support fp16
