[Feature] Support Pure Python style Configuration File #1071

HAOCHENYE · 2023-04-12T05:49:18Z

Motivation

Add Pure Python style Configuration File

Current pure text style configuration files can satisfy most of our development needs and some module aliases can greatly simplify the configuration files (e.g. ResNet can refer to mmcls.models.ResNet). However, there are also some disadvantages:

In the configuration file, the type field is specified as a string, so an IDE cannot jump directly to the corresponding class definition, which makes the code harder to read and navigate.
Configuration file inheritance is also specified as a string, so an IDE cannot jump directly to the inherited file. When the inheritance structure is complex, the configuration files become hard to read and navigate.
The inheritance rules are relatively implicit. Beginners find it difficult to understand how fields with the same name are merged, and the rules require special syntax such as _delete_, which raises the learning cost.
It is easy for users to forget to register a module, which causes module-not-found errors.
In cross-codebase inheritance (not discussed so far), the introduction of scope makes the inheritance rules even more complicated and harder for beginners to understand.

In summary, although pure text style configuration files provide the same syntax rules for python, json, and yaml format configurations, they become inadequate once the configuration files grow complex. Therefore, we provide a pure Python style configuration file, i.e., the lazy import mode, which can fully utilize Python's syntax rules to solve the above problems. At the same time, the pure Python style configuration file also supports exporting to json and yaml formats.

Basic Syntax

This section briefly describes the syntax differences between pure Python style and pure text style configuration files.

Module Construction

We use a simple example to compare pure Python style and pure text style configuration files:

Registration for pure Python Style and current pure text style:

Pure Python style

# No need for registration

Pure text style

from torch.optim import SGD
from mmengine.registry import OPTIMIZERS
OPTIMIZERS.register_module(module=SGD, name='SGD')

Configuration file writing for pure Python style and current pure text style:

Pure Python style

# Configuration file writing
from torch.optim import SGD
optimizer = dict(type=SGD, lr=0.1)

Pure text style

# Configuration file writing
optimizer = dict(type='SGD', lr=0.1)

Module construction for pure Python style and current pure text style:

The same for pure Python style and pure text style

import torch.nn as nn
from mmengine.config import Config
from mmengine.registry import OPTIMIZERS
cfg = Config.fromfile('optimizer.py')
model = nn.Conv2d(1, 1, 1)
cfg.optimizer.params = model.parameters()
optimizer = OPTIMIZERS.build(cfg.optimizer)

From the above example, we can see that the difference between pure Python style and pure text style configuration files is:

Pure Python style configuration files do not require module registration.
In pure Python style configuration files, the type field is no longer a string but directly refers to the module. Correspondingly, import syntax needs to be added in the configuration file.

Note that the OpenMMLab algorithm libraries still retain the registration process when adding modules. When users build their own projects on MMEngine with pure Python style configuration files, registration is not required. You may wonder: if torch is not installed in the current environment, the sample configuration file cannot be parsed, so can it still be called a configuration file? Don't worry, this is explained later.
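The difference between the two styles at build time can be sketched with a toy registry. This is only an illustration under assumptions: ToyRegistry is a hypothetical, simplified stand-in and not MMEngine's Registry, and OrderedDict is used in place of an optimizer class so the sketch runs without torch.

```python
from collections import OrderedDict  # stands in for any third-party class


class ToyRegistry:
    """Hypothetical, simplified registry; not MMEngine's Registry."""

    def __init__(self):
        self._modules = {}

    def register_module(self, module, name=None):
        self._modules[name or module.__name__] = module

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is left untouched
        obj_type = args.pop('type')
        if isinstance(obj_type, str):
            # Pure text style: the string must have been registered first.
            if obj_type not in self._modules:
                raise KeyError(f'{obj_type} is not registered')
            obj_type = self._modules[obj_type]
        # Pure Python style: `type` is already the class itself.
        return obj_type(**args)


registry = ToyRegistry()

# Pure Python style works without any registration.
built_from_class = registry.build(dict(type=OrderedDict, a=1))

# Pure text style fails until the module is registered.
try:
    registry.build(dict(type='OrderedDict', a=1))
    text_style_failed = False
except KeyError:
    text_style_failed = True

registry.register_module(module=OrderedDict, name='OrderedDict')
built_from_string = registry.build(dict(type='OrderedDict', a=1))
```

In this sketch the string form depends on a lookup table that someone has to fill in, while the class form carries everything needed to build the object.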

Inheritance

The inheritance syntax of pure Python style configuration files is slightly different:

Pure text style:

_base_ = ['./optimizer.py']

Pure Python style:

if '_base_':
    from .optimizer import *

Pure Python style configuration files use import syntax to achieve inheritance. The advantage is that an IDE can jump directly to the inherited configuration file, which makes it easier to read. The rules for inheriting variables (adding, deleting, modifying, and looking up) are completely aligned with Python syntax. For example, to modify the learning rate of the optimizer defined in the base configuration file:

if '_base_':
    from .optimizer import *

# optimizer is a variable defined in the base configuration file
optimizer.update(
    lr=0.01,
)

Of course, if you are already accustomed to the inheritance rules of pure text style configuration files and the variable is of the dict type in the _base_ configuration file, you can also use merge syntax to achieve the same inheritance rule as pure text style configuration files:

if '_base_':
    from .optimizer import *

# optimizer is a variable defined in the base configuration file
optimizer.merge(
    _delete_=True,
    lr=0.01,
    type='SGD'
)

# The equivalent Python style writing is as follows, completely consistent with Python's import rules
# optimizer = dict(
#     lr=0.01,
#     type='SGD'
# )

Compared with pure text style configuration files, the inheritance rules of pure Python style configuration files are completely aligned with Python's import syntax, which is easier to understand and supports jumping between configuration files. You may wonder: since both inheritance and module imports use import syntax, why is an if '_base_' statement needed for inheriting configuration files? On the one hand, it improves readability by making inherited configuration files more prominent. On the other hand, it is required by the rules of lazy_import, which are explained later.
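To make the two inheritance behaviours concrete, here is a minimal sketch of a text-style merge rule with _delete_ support. The helper toy_merge is hypothetical and written only for illustration; it is not the merge method added in this PR, and the momentum field is an assumed example value.

```python
import copy


def toy_merge(base, override):
    """Hypothetical helper mimicking the text-style merge rule."""
    override = dict(override)
    if override.pop('_delete_', False):
        # Discard every key inherited from the base configuration.
        return copy.deepcopy(override)
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested dicts are merged field by field.
            merged[key] = toy_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base_optimizer = dict(type='SGD', lr=0.1, momentum=0.9)

# Without _delete_, untouched fields such as momentum are inherited.
updated = toy_merge(base_optimizer, dict(lr=0.01))

# With _delete_=True, only the fields given in the override survive.
replaced = toy_merge(base_optimizer, dict(_delete_=True, type='SGD', lr=0.01))
```

The second call behaves like plain reassignment in Python, which is why the pure Python style does not need a special keyword for it.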

What is Lazy Import

You may find that pure Python style configuration files look like ordinary Python code, and conclude that you do not need a configuration class at all and could simply import the configuration files as Python modules. If you have that feeling, it is worth celebrating, because this is exactly the effect we want.

As mentioned earlier, parsing a configuration file requires the third-party libraries referenced in it to be installed. This is quite unreasonable. For example, suppose I trained a model with MMagic and want to deploy it with the onnxruntime backend of MMDeploy. The deployment environment lacks torch, but torch is needed to parse the configuration file, so I cannot conveniently reuse the MMagic configuration file as the deployment configuration. To solve this problem, we introduce the concept of lazy_import.

Discussing the specific implementation of lazy_import is a complex task, so here we only briefly introduce what it does. The core idea of lazy_import is to delay the execution of the import statements in the configuration file until the imported objects are actually accessed, rather than executing them while the configuration file is parsed. This avoids the dependency problem caused by import statements in the configuration file. During parsing, the code executed by the Python interpreter is equivalent to the following:

Original configuration file:

from torch.optim import SGD

optimizer = dict(type=SGD)

Code actually executed by the python interpreter through the configuration class:

lazy_obj = LazyObject('torch.optim', 'SGD')

optimizer = dict(type=lazy_obj)

As an internal type of the Config module, the LazyObject cannot be accessed directly by users. When accessing the type field, it will undergo a series of conversions to convert LazyObject into the actual torch.optim.SGD type. In this way, parsing the configuration file will not trigger the import of third-party libraries, while users can still access the types of third-party libraries normally when using the configuration file.

To access the internal LazyObject itself, use the Config.to_dict interface:

cfg = Config.fromfile('optimizer.py').to_dict()
print(type(cfg['optimizer']['type']))
# mmengine.config.lazy.LazyObject

At this point, the type accessed is the LazyObject type.
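The deferred import can be sketched roughly as follows. ToyLazyObject is a hypothetical stand-in written for illustration and is not the LazyObject class in mmengine.config.lazy; the module name package_that_is_not_installed is likewise made up to simulate a missing dependency.

```python
import importlib


class ToyLazyObject:
    """Hypothetical placeholder that remembers what to import later."""

    def __init__(self, module, name):
        self.module = module
        self.name = name

    def build(self):
        # The real import happens only here, when the object is needed.
        return getattr(importlib.import_module(self.module), self.name)


# "Parsing" a config that refers to a missing package succeeds, because
# nothing has been imported yet.
missing = ToyLazyObject('package_that_is_not_installed', 'SGD')
optimizer = dict(type=missing, lr=0.1)

# The dependency problem only shows up when the object is resolved.
try:
    missing.build()
    import_failed = False
except ImportError:
    import_failed = True

# For an installed module, resolving yields the real class.
fraction_cls = ToyLazyObject('fractions', 'Fraction').build()
```

With this shape, an environment without the dependency can still read and export the plain fields of the configuration, as long as it never resolves the placeholder.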

However, the lazy import strategy cannot be applied to the inheritance (import) of base files: the parsed configuration file must include the fields defined in the base configuration file, so the import has to actually be executed. Therefore, base files must be imported inside the if '_base_' code block.
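One way a parser could tell the two kinds of import apart is to inspect the syntax tree of the configuration file. The following is a minimal sketch using the standard ast module; classify_imports is a hypothetical helper, and the actual implementation in this PR may work differently.

```python
import ast

CONFIG_SOURCE = """
from torch.optim import SGD

if '_base_':
    from .optimizer import *

lr = 0.01
"""


def classify_imports(source):
    """Return (lazy_imports, base_imports) found at the top level."""
    lazy, base = [], []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ImportFrom):
            # Ordinary import: a candidate for lazy handling.
            lazy.append((node.module, [alias.name for alias in node.names]))
        elif (isinstance(node, ast.If)
              and isinstance(node.test, ast.Constant)
              and node.test.value == '_base_'):
            for child in node.body:
                if isinstance(child, ast.ImportFrom):
                    # level > 0 marks a relative import such as `from .x`.
                    module = '.' * child.level + (child.module or '')
                    base.append((module, [a.name for a in child.names]))
    return lazy, base


lazy_imports, base_imports = classify_imports(CONFIG_SOURCE)
```

Only the source text is parsed here, so the sketch runs even when torch is not installed.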

Limitations

Functions and classes cannot be defined in the configuration file.
The configuration file name must comply with the naming convention of Python modules: it can only contain letters, numbers, and underscores, and cannot start with a number.
When importing variables from a base configuration file, such as from ._base_.alpha import beta, alpha must be a module name, i.e., a Python file, rather than the name of a package containing __init__.py.
Importing multiple variables simultaneously in an absolute import statement, such as import torch, numpy, os, is not supported. Multiple import statements need to be used instead, such as import torch; import numpy; import os.
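Some of these limitations can be checked statically before a configuration file is loaded. The sketch below is a hypothetical helper, check_config, which is not part of MMEngine; it covers the file name rule, the ban on function and class definitions, and the multi-module import rule. The module-versus-package rule needs file system access and is left out.

```python
import ast
import re
from pathlib import Path

MODULE_NAME = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')


def check_config(filename, source):
    """Return a list of human-readable violations (empty if none found)."""
    problems = []
    if not MODULE_NAME.fullmatch(Path(filename).stem):
        problems.append('file name is not a valid Python module name')
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            problems.append(f'line {node.lineno}: definition of {node.name!r}')
        elif isinstance(node, ast.Import) and len(node.names) > 1:
            problems.append(
                f'line {node.lineno}: multiple modules in one import')
    return problems


good = check_config('optimizer.py', 'import torch\nimport numpy\nlr = 0.1\n')
bad = check_config('1st-config.py',
                   'import torch, numpy\n\ndef f():\n    pass\n')
```

Here good is an empty list, while bad reports the file name, the combined import on line 1, and the function defined on line 3.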

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

open-mmlab/mmdetection#10366
open-mmlab/mmyolo#787
open-mmlab/mmrazor#539
open-mmlab/mmpose#2390
open-mmlab/mmpretrain#1567

Checklist

Pre-commit or other linting tools are used to fix the potential lint issues.
The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMCls.
The documentation has been modified accordingly, like docstring or example tutorials.

ly015 · 2023-06-14T10:18:17Z

LGTM

codecov · 2024-08-23T15:01:45Z

Codecov Report

Attention: Patch coverage is 82.94011% with 94 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@3715fea).

Files	Patch %	Lines
mmengine/config/config.py	80.90%	40 Missing and 19 partials ⚠️
mmengine/config/utils.py	91.66%	5 Missing and 3 partials ⚠️
mmengine/runner/runner.py	33.33%	5 Missing and 3 partials ⚠️
mmengine/utils/package_utils.py	55.55%	4 Missing and 4 partials ⚠️
mmengine/config/lazy.py	93.58%	3 Missing and 2 partials ⚠️
mmengine/utils/misc.py	87.50%	2 Missing and 1 partial ⚠️
mmengine/registry/registry.py	77.77%	2 Missing ⚠️
mmengine/runner/loops.py	0.00%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1071   +/-   ##
=======================================
  Coverage        ?   77.90%           
=======================================
  Files           ?      140           
  Lines           ?    11974           
  Branches        ?     2464           
=======================================
  Hits            ?     9328           
  Misses          ?     2205           
  Partials        ?      441

Flag	Coverage Δ
unittests	`77.90% <82.94%> (?)`


HAOCHENYE force-pushed the new_config_type branch from 16ac0f7 to ea5a049 Compare April 20, 2023 02:46

HAOCHENYE marked this pull request as ready for review April 25, 2023 09:58

HAOCHENYE requested review from zhouzaida, RangiLyu and C1rN09 as code owners April 25, 2023 09:58

HAOCHENYE force-pushed the new_config_type branch from cb40f97 to 8d19fa5 Compare April 25, 2023 10:00

HAOCHENYE added 24 commits April 25, 2023 18:07

experimental test

1270fb4

experimental test

001950a

support remove ml_collections

9aa3aac

Support configure torch.backends.cudnn.deterministic when CUDA>=10.2

a85e2ce

Fix: built duplicated instance

322f807

tmp save

3290f04

Support training mmrazor

ccbe7ad

Refactor lazy config

41b49e3

add config

98719d0

add config

9df438e

tmp save

d3817d2

tmp save

59a9699

support training on mmdet

df2d117

inherit by import 'base'

153d7c2

remove lazy_config

654f90a

remove experimental runner

dccdafc

add docstring and rename

48d0eb9

minor refine

0c0e380

Hidden the LazyObject to user

cab091d

Fix is_builtin_module

4280160

rename unwrap_lazy and add docstring

d2165b9

1. Support define None default scope 2. Fix get callable functions from registry

1ca12b0

Automatically deduce lazy_import

5661cb1

Fix setitem and update

8c246f1

HAOCHENYE added 2 commits June 14, 2023 16:41

Refine unit test as comments

85a9f5b

Fix code order in config docs

b559200

ly015 reviewed Jun 14, 2023

docs/en/advanced_tutorials/config.md

add beta to Pure python style config

bf8eebc

zhouzaida reviewed Jun 14, 2023

tests/data/config/lazy_module_config/_base_/default_runtime.py

zhouzaida reviewed Jun 14, 2023

mmengine/registry/registry.py

zhouzaida reviewed Jun 14, 2023

mmengine/runner/runner.py

zhouzaida reviewed Jun 14, 2023

mmengine/runner/runner.py

HAOCHENYE added 4 commits June 14, 2023 21:25

Update unit test and fix small bugs

e38c133

update test resources

825168f

Fix default scope will not be set if it is None

066581e

rename locate to get_object_from_string

4779185

zhouzaida reviewed Jun 14, 2023

mmengine/registry/registry.py

refine comments

0753c7b

zhouzaida reviewed Jun 14, 2023

mmengine/registry/registry.py

HAOCHENYE added 2 commits June 14, 2023 23:44

refine docstring

dbe90b6

refine docstring

7fde891

zhouzaida reviewed Jun 15, 2023

docs/en/advanced_tutorials/config.md

HAOCHENYE added 4 commits June 15, 2023 11:32

Refine docs

6602044

Update unittest for get_install_path and is_install

7f56367

Fix submodule_search_locations is not subscritable in Python 3.7

add21c1

Fix ut

78a75ce

zhouzaida changed the title ~~New config type~~ [Feature] Support Pure Python style Configuration File Jun 15, 2023

HAOCHENYE added 2 commits June 15, 2023 18:11

Fix ut

12a8df9

Fix ut

18eb7c4

zhouzaida approved these changes Jun 16, 2023

zhouzaida merged commit 6ece63e into open-mmlab:main Jun 16, 2023

okotaku mentioned this pull request Sep 2, 2023

DiffEngine Plan in 2023 okotaku/diffengine#30

Open

15 tasks
