mmrazor.apis¶
mmcls¶
- mmrazor.apis.mmcls.set_random_seed(seed, deterministic=False)[source]¶
Set random seed.
- Parameters
seed (int) – Seed to be used.
deterministic (bool) – Whether to set the deterministic option for the CUDNN backend, i.e., set torch.backends.cudnn.deterministic to True and torch.backends.cudnn.benchmark to False. Default: False.
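Example (a minimal usage sketch; the seed value is arbitrary):
>>> from mmrazor.apis.mmcls import set_random_seed
>>> # Fix the Python/NumPy/PyTorch RNGs; deterministic=True also makes
>>> # cuDNN deterministic, trading training speed for reproducibility.
>>> set_random_seed(42, deterministic=True)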
- mmrazor.apis.mmcls.train_model(model, dataset, cfg, distributed=False, validate=False, timestamp=None, device='cuda', meta=None)[source]¶
Copied from mmclassification with some modifications.
This is an ugly implementation that will be deprecated in the future: eventually there will be a single train API, with no distinction between mmclassification, mmsegmentation and mmdetection.
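Example (a hedged end-to-end sketch; the config path is a placeholder, and build_algorithm/build_dataset are assumed from the wider mmrazor/mmcls APIs rather than documented on this page):
>>> from mmcv import Config
>>> from mmcls.datasets import build_dataset
>>> from mmrazor.models import build_algorithm  # assumed helper
>>> from mmrazor.apis.mmcls import train_model
>>> cfg = Config.fromfile('configs/example_cls.py')  # placeholder path
>>> model = build_algorithm(cfg.algorithm)
>>> datasets = [build_dataset(cfg.data.train)]
>>> train_model(model, datasets, cfg, distributed=False, validate=True,
...             device='cuda')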
mmdet¶
mmseg¶
mmrazor.core¶
hooks¶
- class mmrazor.core.hooks.DistSamplerSeedHook[source]¶
Data-loading sampler for distributed training.
In distributed training, this hook is only useful in conjunction with EpochBasedRunner, while IterBasedRunner achieves the same purpose with IterLoader.
- class mmrazor.core.hooks.DropPathProbHook(max_prob, interval=- 1, by_epoch=True, **kwargs)[source]¶
Set drop_path_prob periodically.
- Parameters
max_prob (float) – The max probability of dropping.
interval (int) – The update period. If by_epoch=True, interval indicates epochs, otherwise it indicates iterations. Default: -1, which means “never”.
by_epoch (bool) – Whether to update drop_path_prob by epoch or by iteration. Default: True.
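Example (a hedged config sketch: in mmcv-style configs this hook is typically registered through custom_hooks; the max_prob value is arbitrary):
>>> custom_hooks = [
...     dict(
...         type='DropPathProbHook',
...         max_prob=0.2,   # final drop-path probability
...         interval=1,     # every epoch, since by_epoch=True
...         by_epoch=True)
... ]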
- class mmrazor.core.hooks.SearchSubnetHook(interval=- 1, by_epoch=True, out_dir=None, max_keep_ckpts=- 1, save_last=True, **kwargs)[source]¶
Save searched subnets periodically.
- Parameters
interval (int) – The saving period. If by_epoch=True, interval indicates epochs, otherwise it indicates iterations. Default: -1, which means “never”.
by_epoch (bool) – Saving checkpoints by epoch or by iteration. Default: True.
out_dir (str, optional) – The directory to save checkpoints. If not specified, runner.work_dir will be used by default.
max_keep_ckpts (int, optional) – The maximum checkpoints to keep. In some cases we want only the latest few checkpoints and would like to delete old ones to save disk space. Default: -1, which means unlimited.
save_last (bool) – Whether to force the last checkpoint to be saved regardless of interval.
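Example (a hedged config sketch: saving the searched subnet every epoch and keeping only the three most recent files):
>>> custom_hooks = [
...     dict(
...         type='SearchSubnetHook',
...         interval=1,
...         by_epoch=True,
...         out_dir=None,        # fall back to runner.work_dir
...         max_keep_ckpts=3,
...         save_last=True)
... ]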
optimizer¶
- mmrazor.core.optimizer.build_optimizers(model, cfgs)[source]¶
Build multiple optimizers from configs. If cfgs contains several dicts for optimizers, then a dict of constructed optimizers, one per config, will be returned. If cfgs only contains one optimizer config, the constructed optimizer itself will be returned.
For example,
1) Multiple optimizer configs:
optimizer_cfg = dict(
    model1=dict(type='SGD', lr=lr),
    model2=dict(type='SGD', lr=lr))
The return dict is dict('model1': torch.optim.Optimizer, 'model2': torch.optim.Optimizer).
2) Single optimizer config:
optimizer_cfg = dict(type='SGD', lr=lr)
The return is torch.optim.Optimizer.
- Parameters
model (nn.Module) – The model with parameters to be optimized.
cfgs (dict) – The config dict of the optimizer.
- Returns
The initialized optimizers.
- Return type
dict[torch.optim.Optimizer] | torch.optim.Optimizer
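Example (a sketch based on the docstring above; ToyModel is a hypothetical two-part model):
>>> import torch.nn as nn
>>> from mmrazor.core.optimizer import build_optimizers
>>> class ToyModel(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.model1 = nn.Linear(8, 8)
...         self.model2 = nn.Linear(8, 8)
>>> model = ToyModel()
>>> # Multiple configs -> a dict of optimizers keyed by submodule name.
>>> optimizers = build_optimizers(model, dict(
...     model1=dict(type='SGD', lr=0.1),
...     model2=dict(type='SGD', lr=0.01)))
>>> # A single config -> a single torch.optim.Optimizer.
>>> optimizer = build_optimizers(model, dict(type='SGD', lr=0.1))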
runners¶
- class mmrazor.core.runners.MultiLoaderEpochBasedRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, max_iters=None, max_epochs=None)[source]¶
Multi Dataloaders EpochBasedRunner.
There are three differences from EpochBasedRunner: 1) support loading data from multiple dataloaders; 2) support freezing some optimizers' lr updates when the runner has multiple optimizers; 3) add the search_subnet API.
- register_lr_hook(lr_config)[source]¶
Register a hook for setting the learning rate.
- Parameters
lr_config (dict) – Config for setting learning rate.
- search_subnet(out_dir, filename_tmpl='epoch_{}.yaml', create_symlink=True)[source]¶
Search the best subnet.
- Parameters
out_dir (str) – The directory that subnets are saved.
filename_tmpl (str, optional) – The subnet filename template, which contains a placeholder for the epoch number. Defaults to ‘epoch_{}.yaml’.
create_symlink (bool, optional) – Whether to create a symlink “latest.yaml” to point to the latest subnet. Defaults to True.
- class mmrazor.core.runners.MultiLoaderIterBasedRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, max_iters=None, max_epochs=None)[source]¶
Multi Dataloaders IterBasedRunner.
There are three differences from IterBasedRunner: 1) support loading data from multiple dataloaders; 2) support freezing some optimizers' lr updates when the runner has multiple optimizers; 3) add the search_subnet API.
- register_lr_hook(lr_config)[source]¶
Register a hook for setting the learning rate.
- Parameters
lr_config (dict) – Config for setting learning rate.
- run(data_loaders, workflow, max_iters=None, **kwargs)[source]¶
Start running.
- Parameters
data_loaders (list[DataLoader]) – Dataloaders for training and validation.
workflow (list[tuple]) – A list of (phase, iters) to specify the running order and iterations. E.g., [('train', 10000), ('val', 1000)] means running 10000 iterations for training and 1000 iterations for validation, iteratively.
max_iters (int) – Specify the max iters.
- search_subnet(out_dir, filename_tmpl='epoch_{}.yaml', create_symlink=True)[source]¶
Search the best subnet.
- Parameters
out_dir (str) – The directory that subnets are saved.
filename_tmpl (str, optional) – The subnet filename template, which contains a placeholder for the epoch number. Defaults to ‘epoch_{}.yaml’.
create_symlink (bool, optional) – Whether to create a symlink “latest.yaml” to point to the latest subnet. Defaults to True.
searcher¶
- class mmrazor.core.searcher.EvolutionSearcher(algorithm, dataloader, test_fn, work_dir, logger, candidate_pool_size=50, candidate_top_k=10, constraints={'flops': 330000000.0}, metrics=None, metric_options=None, score_key='accuracy_top-1', max_epoch=20, num_mutation=25, num_crossover=25, mutate_prob=0.1, resume_from=None, **search_kwargs)[source]¶
Implementation of evolution search.
- Parameters
algorithm (torch.nn.Module) – Algorithm to be used.
dataloader (torch.utils.data.DataLoader) – PyTorch data loader.
test_fn (function) – Test API used for evaluation.
work_dir (str) – Working directory to save search results and logs.
logger (logging.Logger) – Logger used in the search stage.
candidate_pool_size (int) – The length of candidate pool.
candidate_top_k (int) – Specify top k candidates based on scores.
constraints (dict) – Constraints to be used for screening candidates.
metrics (str) – Metrics to be used for evaluating candidates.
metric_options (str) – Options to be used for metrics.
score_key (str) – Used to specify one metric from the evaluation results.
max_epoch (int) – Specify max epoch to end evolution search.
num_mutation (int) – The number of candidates obtained by mutation.
num_crossover (int) – The number of candidates obtained by crossover.
mutate_prob (float) – The probability of mutation.
resume_from (str) – Specify the path of a saved .pkl file for resuming searching.
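Example (a hedged sketch of driving the searcher; algorithm, dataloader and test_fn are placeholders built elsewhere, and the search() entry point is assumed rather than documented on this page):
>>> import logging
>>> from mmrazor.core.searcher import EvolutionSearcher
>>> searcher = EvolutionSearcher(
...     algorithm=algorithm,        # built mmrazor algorithm (placeholder)
...     dataloader=val_dataloader,  # placeholder DataLoader
...     test_fn=test_fn,            # evaluation function (placeholder)
...     work_dir='./search_work_dir',
...     logger=logging.getLogger(),
...     constraints=dict(flops=330 * 1e6),
...     max_epoch=20)
>>> searcher.search()               # assumed entry point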
- class mmrazor.core.searcher.GreedySearcher(algorithm, dataloader, target_flops, test_fn, work_dir, logger, max_channel_bins, min_channel_bins=1, metrics='accuracy', metric_options=None, score_key='accuracy_top-1', resume_from=None, **search_kwargs)[source]¶
Search with the greedy algorithm.
We start with the largest model and compare the network accuracy among the architectures where each layer is slimmed by one channel bin. We then greedily slim the layer with minimal performance drop. During the iterative slimming, we obtain optimized channel configurations under different resource constraints. We stop until reaching the strictest constraint (e.g., 200M FLOPs).
- Parameters
algorithm (torch.nn.Module) – A task-specific algorithm implemented in MMRazor, e.g. AutoSlim.
dataloader (torch.utils.data.DataLoader) – PyTorch data loader.
target_flops (list) – The target FLOPs of the searched models.
test_fn (callable) – Tests a model with samples from a dataloader and returns the test results.
work_dir (str) – Directory to save output results.
logger (logging.Logger) – Logger used in the search stage.
max_channel_bins (int) – The maximum number of channel bins in each layer. Note that each layer is slimmed by one channel bin.
min_channel_bins (int) – The minimum number of channel bins in each layer. Default to 1.
metrics (str | list[str]) – Metrics to be evaluated. Default value is accuracy.
metric_options (dict, optional) – Options for calculating metrics. Allowed keys are ‘topk’, ‘thrs’ and ‘average_mode’. Defaults to None.
score_key (str) – The metric to judge the performance of a model. Defaults to accuracy_top-1.
resume_from (str, optional) – Specify the path of saved .pkl file for resuming searching. Defaults to None.
utils¶
- mmrazor.core.utils.broadcast_object_list(object_list, src=0)[source]¶
Broadcasts picklable objects in object_list to the whole group. Note that all objects in object_list must be picklable in order to be broadcast.
- Parameters
object_list (List[Any]) – List of input objects to broadcast. Each object must be picklable. Only objects on the src rank will be broadcast, but each rank must provide lists of equal sizes.
src (int) – Source rank from which to broadcast object_list.
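Example (a hedged sketch: rank 0 broadcasts a searched-subnet record to all ranks; every rank must pass a list of the same length):
>>> import torch.distributed as dist
>>> from mmrazor.core.utils import broadcast_object_list
>>> rank = dist.get_rank()  # assumes an initialized process group
>>> if rank == 0:
...     object_list = [dict(subnet='best_subnet.yaml')]
... else:
...     object_list = [None]
>>> broadcast_object_list(object_list, src=0)
>>> # object_list[0] now holds rank 0's dict on every rank.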
mmrazor.models¶
algorithms¶
- class mmrazor.models.algorithms.AutoSlim(num_sample_training=4, input_shape=(3, 224, 224), bn_training_mode=False, **kwargs)[source]¶
AutoSlim: Towards One-Shot Architecture Search for Channel Numbers.
Please refer to the paper https://arxiv.org/abs/1903.11728 for details.
- Parameters
num_sample_training (int) – In each iteration we train the model at the smallest width, the largest width and (num_sample_training − 2) random widths. It should be no less than 2. Defaults to 4.
input_shape (tuple) – Input shape used for calculating the FLOPs of the supernet.
bn_training_mode (bool) – Whether to set BN to training mode when the model is set to eval mode. Note that in slimmable networks, accumulating different numbers of channels results in different feature means and variances, which further leads to inaccurate statistics of the shared BN. Set bn_training_mode to True to use the feature means and variances in a batch.
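Example (a hedged config sketch for the algorithm section of an AutoSlim run; the architecture and pruner sub-configs are elided placeholders):
>>> algorithm = dict(
...     type='AutoSlim',
...     num_sample_training=4,      # min width + max width + 2 random widths
...     input_shape=(3, 224, 224),  # used for FLOPs calculation
...     bn_training_mode=True,      # use per-batch BN statistics in eval
...     architecture=...,           # placeholder
...     pruner=...)                 # placeholder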
- class mmrazor.models.algorithms.Darts(unroll, **kwargs)[source]¶
- train_step(data, optimizer)[source]¶
The iteration step during training.
This method defines an iteration step during training, except for back propagation and optimizer updating, which are done in an optimizer hook. Note that in some complicated cases or models, the whole process including back propagation and optimizer updating is also defined in this method, such as GAN.
- Parameters
data (dict) – The output of dataloader.
optimizer (torch.optim.Optimizer | dict) – The optimizer of the runner is passed to train_step(). This argument is unused and reserved.
- Returns
It should contain at least 3 keys: loss, log_vars, num_samples. loss is a tensor for back propagation, which can be a weighted sum of multiple losses. log_vars contains all the variables to be sent to the logger. num_samples indicates the batch size (when the model is DDP, it means the batch size on each GPU), which is used for averaging the logs.
- Return type
dict
- class mmrazor.models.algorithms.GeneralDistill(with_student_loss=True, with_teacher_loss=False, **kwargs)[source]¶
General Distillation Algorithm.
- Parameters
with_student_loss (bool) – Whether to use student loss. Defaults to True.
with_teacher_loss (bool) – Whether to use teacher loss. Defaults to False.
- class mmrazor.models.algorithms.SPOS(input_shape=(3, 224, 224), bn_training_mode=False, **kwargs)[source]¶
Implementation of SPOS (Single Path One-Shot NAS).
architectures¶
- class mmrazor.models.architectures.MMClsArchitecture(**kwargs)[source]¶
Architecture based on MMCls.
distillers¶
- class mmrazor.models.distillers.SelfDistiller(components, **kwargs)[source]¶
Transfer knowledge inside a single model.
- Parameters
components (dict) – The details of the distillation. It usually includes the module names of the teacher and the student, and the losses used in the distillation.
- exec_student_forward(student, data)[source]¶
Forward computation of the student.
- Parameters
student (torch.nn.Module) – The student model to be used in the distillation.
data (dict) – The output of the dataloader.
- exec_teacher_forward(teacher, data)[source]¶
Forward computation of the teacher.
- Parameters
teacher (torch.nn.Module) – The teacher model to be used in the distillation.
data (dict) – The output of the dataloader.
- prepare_from_student(student)[source]¶
Registers a global forward hook for each teacher module and student module to be used in the distillation.
- Parameters
student (torch.nn.Module) – The student model to be used in the distillation.
- class mmrazor.models.distillers.SingleTeacherDistiller(teacher, teacher_trainable=False, teacher_norm_eval=True, components=(), **kwargs)[source]¶
Distiller with single teacher.
- Parameters
teacher (dict) – The config dict for teacher.
teacher_trainable (bool) – Whether the teacher is trainable. Default: False.
teacher_norm_eval (bool) – Whether to set teacher’s norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only. Default: True.
components (dict) – The details of the distillation. It usually includes the module names of the teacher and the student, and the losses used in the distillation.
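Example (a hedged sketch of a components config; the student_module/teacher_module/losses keys are assumptions based on typical mmrazor distillation configs, not guaranteed by this page):
>>> distiller = dict(
...     type='SingleTeacherDistiller',
...     teacher=...,                # teacher model config (placeholder)
...     teacher_trainable=False,
...     teacher_norm_eval=True,
...     components=[
...         dict(
...             student_module='head.fc',   # hypothetical module name
...             teacher_module='head.fc',   # hypothetical module name
...             losses=[dict(type='KLDivergence', tau=1.0, loss_weight=5.0)])
...     ])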
- build_align_module(cfg)[source]¶
Build align_module from the cfg.
align_module is needed when the number of channels output by the teacher module is not equal to that of the student module, or for some other reasons.
- Parameters
cfg (dict) – The config dict for align_module.
- exec_student_forward(student, data)[source]¶
Execute the student’s forward function.
After this function, the student’s feature maps will be saved in student_outputs.
- exec_teacher_forward(data)[source]¶
Execute the teacher’s forward function.
After this function, the teacher’s feature maps will be saved in teacher_outputs.
- prepare_from_student(student)[source]¶
Registers a global forward hook for each teacher module and student module to be used in the distillation.
- Parameters
student (torch.nn.Module) – The student model to be used in the distillation.
- student_forward_output_hook(module, inputs, outputs)[source]¶
Save the module’s forward output.
- Parameters
module (torch.nn.Module) – The module on which the hook is registered.
inputs (tuple) – The input of the module.
outputs (tuple) – The output of the module.
losses¶
- class mmrazor.models.losses.ChannelWiseDivergence(tau=1.0, loss_weight=1.0)[source]¶
PyTorch version of Channel-wise Distillation for Semantic Segmentation: https://arxiv.org/abs/2011.13256.
- Parameters
tau (float) – Temperature coefficient. Defaults to 1.0.
loss_weight (float) – Weight of loss. Defaults to 1.0.
- class mmrazor.models.losses.KLDivergence(tau=1.0, reduction='batchmean', loss_weight=1.0)[source]¶
A measure of how one probability distribution Q is different from a second, reference probability distribution P.
- Parameters
tau (float) – Temperature coefficient. Defaults to 1.0.
reduction (str) – Specifies the reduction to apply to the loss: 'none' | 'batchmean' | 'sum' | 'mean'. 'none': no reduction will be applied; 'batchmean': the sum of the output will be divided by the batch size; 'sum': the output will be summed; 'mean': the output will be divided by the number of elements in the output. Default: 'batchmean'.
loss_weight (float) – Weight of loss. Defaults to 1.0.
- forward(preds_S, preds_T)[source]¶
Forward computation.
- Parameters
preds_S (torch.Tensor) – The student model prediction with shape (N, C, H, W) or shape (N, C).
preds_T (torch.Tensor) – The teacher model prediction with shape (N, C, H, W) or shape (N, C).
- Returns
The calculated loss value.
- Return type
torch.Tensor
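Example (a minimal sketch: distilling student logits toward teacher logits; the shapes are arbitrary):
>>> import torch
>>> from mmrazor.models.losses import KLDivergence
>>> criterion = KLDivergence(tau=4.0, reduction='batchmean', loss_weight=1.0)
>>> preds_S = torch.randn(8, 100)  # student logits, shape (N, C)
>>> preds_T = torch.randn(8, 100)  # teacher logits, shape (N, C)
>>> loss = criterion(preds_S, preds_T)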
- class mmrazor.models.losses.WSLD(tau=1.0, loss_weight=1.0, num_classes=1000)[source]¶
PyTorch version of Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective.
- Parameters
tau (float) – Temperature coefficient. Defaults to 1.0.
loss_weight (float) – Weight of loss. Defaults to 1.0.
num_classes (int) – Defaults to 1000.
- forward(student, teacher)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
mutables¶
- class mmrazor.models.mutables.DifferentiableEdge(with_arch_param, **kwargs)[source]¶
Differentiable Edge.
Search the best module from choices by learnable parameters.
- Parameters
with_arch_param (bool) – Whether to build learnable architecture parameters.
- forward(prev_inputs, arch_param=None)[source]¶
Forward function.
In some algorithms, several MutableModules share the same architecture parameters, so the architecture parameters are passed in as arguments.
- Parameters
prev_inputs (list[torch.Tensor]) – Each choice's inputs.
arch_param (torch.nn.Parameter) – Architecture parameters.
- class mmrazor.models.mutables.DifferentiableOP(with_arch_param, **kwargs)[source]¶
Differentiable OP.
Search the best module from choices by learnable parameters.
- Parameters
with_arch_param (bool) – Whether to build learnable architecture parameters.
- forward(x, arch_param=None)[source]¶
Forward function.
In some algorithms, several MutableModules share the same architecture parameters, so the architecture parameters are passed in as arguments.
- Parameters
x (torch.Tensor) – The input tensor.
arch_param (torch.nn.Parameter) – Architecture parameters.
- class mmrazor.models.mutables.GumbelEdge(**kwargs)[source]¶
Gumbel Edge.
Search the best module from the choices via the Gumbel trick.
- class mmrazor.models.mutables.GumbelOP(tau=1.0, hard=True, **kwargs)[source]¶
Gumbel OP.
Search the best module from the choices via the Gumbel trick.
- class mmrazor.models.mutables.MutableEdge(choices, **kwargs)[source]¶
Mutable Edge. In some NAS algorithms (Darts, AutoDeeplab, etc.), the connections between modules are searchable, such as the connections between a node and its previous nodes in Darts. MutableEdge has N modules to process N inputs respectively.
- Parameters
choices (torch.nn.ModuleDict) – Unlike MutableOP, the modules in choices have already been created.
- class mmrazor.models.mutables.MutableModule(space_id, num_chosen=1, init_cfg=None, **kwargs)[source]¶
Base class for MUTABLES. A searchable module for building a searchable architecture in NAS. It mainly consists of a module and a mask, and achieves searchability by handling the mask.
- Parameters
space_id (str) – Used to index a Placeholder; it is the one and only index for each Placeholder.
num_chosen (int) – The number of chosen OPS in the MUTABLES.
init_cfg (dict) – Init config for BaseModule.
- build_choice_mask()[source]¶
Generate the choice mask for the choices of MUTABLES.
- Returns
Init choice mask. Its elements’ type is bool.
- Return type
torch.Tensor
- abstract build_choices(cfg)[source]¶
Build all chosen OPS used to combine MUTABLES, and the choices will be sampled.
- Parameters
cfg (dict) – The config for the choices.
- build_space_mask()[source]¶
Generate the space mask for the search spaces of MUTATORS.
- Returns
Init space mask. Its elements’ type is float.
- Return type
torch.Tensor
- property choice_modules¶
The choices’ modules.
- Returns
The values of the choices.
- Return type
tuple
- property choice_names¶
The choices’ names.
- Returns
The keys of the choices.
- Return type
tuple
- export(chosen)[source]¶
Delete the OPS in the choices that are not chosen.
- Parameters
chosen (list[str]) – Names of the chosen OPS.
- abstract forward(x)[source]¶
Forward computation.
- Parameters
x (tensor | tuple[tensor]) – x could be a torch.Tensor or a tuple of torch.Tensor, containing the input data for forward computation.
- property num_choices¶
The number of the choices.
- Returns
the length of the choices.
- Return type
int
- class mmrazor.models.mutables.MutableOP(choices, choice_args, **kwargs)[source]¶
An important type of MUTABLES, which inherits from MutableModule.
- Parameters
choices (dict) – The configs for the choices, the chosen OPS used to combine MUTABLES.
choice_args (dict) – The args used to set the chosen OPS.
- build_choices(cfgs, choice_args)[source]¶
Build all chosen OPS used to combine MUTABLES, and the choices will be sampled.
- Parameters
cfgs (dict) – The configs for the choices.
choice_args (dict) – The args used to set the chosen OPS.
- Returns
A ModuleDict consisting of the chosen OPS built from the arg cfgs.
- Return type
torch.nn.ModuleDict
- class mmrazor.models.mutables.OneShotOP(**kwargs)[source]¶
A type of MUTABLES for one-shot NAS.
- forward(x)[source]¶
Forward computation for the chosen OPS; in one-shot NAS, the number of chosen OPS can only be one.
- Parameters
x (tensor | tuple[tensor]) – x could be a torch.Tensor or a tuple of torch.Tensor, containing the input data for forward computation.
- Returns
The result of forward.
- Return type
torch.Tensor
mutators¶
- class mmrazor.models.mutators.DifferentiableMutator(**kwargs)[source]¶
A mutator for differentiable NAS, which mainly provides some core functions for changing the structure of ARCHITECTURES.
- build_arch_params(supernet)[source]¶
This function builds the arch params, which are generally used in differentiable search algorithms such as the Darts series. Each space_id corresponds to an arch param, so the Mutables with the same space_id share the same arch param.
- Parameters
supernet (torch.nn.Module) – The architecture to be used in your algorithm.
- Returns
The arch params obtained after traversing the supernet.
- Return type
torch.nn.ParameterDict
- class mmrazor.models.mutators.OneShotMutator(**kwargs)[source]¶
A mutator for one-shot NAS, which mainly provides some core functions for changing the structure of ARCHITECTURES.
- static crossover(subnet_dict1, subnet_dict2)[source]¶
Crossover used in evolution search.
- Parameters
subnet_dict1 (dict) – Records the information to build the subnet from the supernet; its keys are the space_id properties of placeholders in the mutator's search spaces, and its values are masks.
subnet_dict2 (dict) – Records the information to build the subnet from the supernet; its keys are the space_id properties of placeholders in the mutator's search spaces, and its values are masks.
- Returns
A new subnet_dict after crossover.
- Return type
dict
- static get_random_mask(space_info, searching)[source]¶
Generate a random mask for random sampling.
- Parameters
space_info (dict) – Records the information of the space to sample from.
searching (bool) – Whether it is in the search stage.
- Returns
The generated random mask.
- Return type
torch.Tensor
- mutation(subnet_dict, prob=0.1)[source]¶
Mutation used in evolution search.
- Parameters
subnet_dict (dict) – Records the information to build the subnet from the supernet; its keys are the space_id properties of placeholders in the mutator's search spaces, and its values are masks.
prob (float) – The probability of mutation.
- Returns
A new subnet_dict after mutation.
- Return type
dict
- static reset_in_subnet(m, in_subnet=True)[source]¶
Reset the module’s attributes.
- Parameters
m (torch.nn.Module) – The module in the supernet.
in_subnet (bool) – If the module is in the subnet, set in_subnet to True, otherwise set it to False.
- sample_subnet(searching=False)[source]¶
Randomly sample a subnet via random masks.
- Parameters
searching (bool) – Whether it is in the search stage.
- Returns
Records the information to build the subnet from the supernet; its keys are the space_id properties of placeholders in the mutator's search spaces, and its values are the generated random masks.
- Return type
dict
- set_chosen_subnet(subnet_dict)[source]¶
Set the chosen subnet in the search_spaces after the searching stage.
- Parameters
subnet_dict (dict) – Records the information to build the subnet from the supernet; its keys are the space_id properties of placeholders in the mutator's search spaces, and its values are masks.
- set_subnet(subnet_dict)[source]¶
Set the subnet in the supernet based on the result of sample_subnet by changing the in_subnet flag, which makes it easy to implement operations on the subnet, such as forward, FLOPs calculation and so on.
subnet_dict (dict) – Records the information to build the subnet from the supernet; its keys are the space_id properties of placeholders in the mutator's search spaces, and its values are masks.
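Example (a hedged sketch of the one-shot sampling loop built from the methods above; the mutator itself is a placeholder built elsewhere):
>>> subnet_dict = mutator.sample_subnet()     # space_id -> random mask
>>> mutator.set_subnet(subnet_dict)           # activate it in the supernet
>>> # During evolution search:
>>> mutated = mutator.mutation(subnet_dict, prob=0.1)
>>> child = mutator.crossover(subnet_dict, mutated)
>>> mutator.set_chosen_subnet(child)          # fix the final choice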
ops¶
- class mmrazor.models.ops.DartsDilConv(kernel_size, use_drop_path=False, norm_cfg={'type': 'BN'}, **kwargs)[source]¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmrazor.models.ops.DartsPoolBN(pool_type, kernel_size=3, norm_cfg={'type': 'BN'}, use_drop_path=False, **kwargs)[source]¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmrazor.models.ops.DartsSepConv(kernel_size, use_drop_path=False, norm_cfg={'type': 'BN'}, **kwargs)[source]¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmrazor.models.ops.DartsSkipConnect(use_drop_path=False, norm_cfg={'type': 'BN'}, **kwargs)[source]¶
Reduce the feature map size by factorized pointwise convolution (stride=2).
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmrazor.models.ops.DartsZero(**kwargs)[source]¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmrazor.models.ops.Identity(conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg=None, **kwargs)[source]¶
Base class for searchable operations.
- Parameters
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).
act_cfg (dict) – Config dict for activation layer. Default: None.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmrazor.models.ops.MBBlock(kernel_size, expand_ratio, se_cfg=None, conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg={'type': 'ReLU'}, drop_path_rate=0.0, with_cp=False, **kwargs)[source]¶
MobileNet block for searchable backbones.
- Parameters
kernel_size (int) – Size of the convolving kernel.
expand_ratio (int) – The input channels’ expand factor of the depthwise convolution.
se_cfg (dict, optional) – Config dict for se layer. Defaults to None, which means no se layer.
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).
act_cfg (dict) – Config dict for activation layer. Default: dict(type=’ReLU’).
drop_path_rate (float) – Stochastic depth rate. Defaults to 0.
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.
- Returns
The output tensor.
- Return type
Tensor
- class mmrazor.models.ops.ShuffleBlock(kernel_size, conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg={'type': 'ReLU'}, with_cp=False, **kwargs)[source]¶
InvertedResidual block for the searchable ShuffleNetV2 backbone.
- Parameters
kernel_size (int) – Size of the convolving kernel.
stride (int) – Stride of the convolution layer. Default: 1
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).
act_cfg (dict) – Config dict for activation layer. Default: dict(type=’ReLU’).
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.
- Returns
The output tensor.
- Return type
Tensor
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class mmrazor.models.ops.ShuffleXception(conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg={'type': 'ReLU'}, with_cp=False, **kwargs)[source]¶
Xception block for ShuffleNetV2 backbone.
- Parameters
conv_cfg (dict, optional) – Config dict for convolution layer. Defaults to None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Defaults to dict(type=’BN’).
act_cfg (dict) – Config dict for activation layer. Defaults to dict(type=’ReLU’).
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Defaults to False.
- Returns
The output tensor.
- Return type
Tensor
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
pruners¶
- class mmrazor.models.pruners.RatioPruner(ratios, **kwargs)[source]¶
A random ratio pruner.
Each layer can adjust its own width ratio randomly and independently.
- Parameters
ratios (list | tuple) – The width ratio of each layer can be chosen from ratios randomly. The width ratio is the ratio between the number of reserved channels and that of all channels in a layer. For example, if ratios is [0.25, 0.5], there are 2 cases for us to choose from when we sample from a layer with 12 channels: one is sampling the very first 3 channels in this layer, the other is sampling the very first 6 channels in this layer. Defaults to None.
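Example (a hedged config sketch mirroring the ratios example above; with these ratios a 12-channel layer may keep its first 3, 6, 9 or 12 channels):
>>> pruner = dict(
...     type='RatioPruner',
...     ratios=[0.25, 0.5, 0.75, 1.0])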
- convert_switchable_bn(module, num_bns)[source]¶
Convert normal nn.BatchNorm2d to SwitchableBatchNorm2d.
- Parameters
module (torch.nn.Module) – The module to be converted.
num_bns (int) – The number of nn.BatchNorm2d in a SwitchableBatchNorm2d.
- Returns
The converted module. Each nn.BatchNorm2d in this module has been converted to a SwitchableBatchNorm2d.
- Return type
torch.nn.Module
- sample_subnet()[source]¶
Randomly sample a subnet via random masks.
- Returns
Records the information to build the subnet from the supernet; its keys are the space_id properties in the pruner’s search spaces, and its values are the corresponding sampled out_masks.
- Return type
dict
- switch_subnet(channel_cfg, subnet_ind=None)[source]¶
Switch the channel config of the supernet according to channel_cfg.
If we train more than one subnet together, we need to switch the channel_cfg from one to another during one training iteration.
- Parameters
channel_cfg (dict) – The channel config of a subnet. Key is space_id and value is a dict which includes out_channels (and in_channels if exists).
subnet_ind (int, optional) – The index of the current subnet. If we replace normal BatchNorm2d with SwitchableBatchNorm2d, we should switch the index of SwitchableBatchNorm2d when switching subnets. Defaults to None.
- class mmrazor.models.pruners.StructurePruner(except_start_keys=['head.fc'])[source]¶
Base class for structure pruning. This class defines the basic functions of a structure pruner. Any pruner that inherits this class should at least define its own sample_subnet and set_min_channel functions. This part is being continuously optimized, and there may be major changes in the future.
Reference to https://github.com/jshilong/FisherPruning
- Parameters
except_start_keys (List[str]) – Modules whose names start with a string in except_start_keys will not be pruned.
- build_channel_spaces(name2module)[source]¶
Build channel search space.
- Parameters
name2module (dict) – A mapping between module_name and module.
- Returns
The channel search space. The key is space_id and the value is the corresponding out_mask.
- Return type
dict
- concat_backward_parser(grad_fn, module2name, var2module, cur_path, result_paths, visited)[source]¶
Parse the backward of a concat operation.
Example
>>> conv = nn.Conv2d(3, 3, 3)
>>> pseudo_img = torch.rand(1, 3, 224, 224)
>>> out1 = conv(pseudo_img)
>>> out2 = conv(pseudo_img)
>>> out = torch.cat([out1, out2], dim=1)
>>> print(out.grad_fn.next_functions)
((<ThnnConv2DBackward object at 0x0000020E405F24C8>, 0),
 (<ThnnConv2DBackward object at 0x0000020E405F2648>, 0))
>>> # the length of ``out.grad_fn.next_functions`` being two means
>>> # ``out`` is obtained by concatenating two tensors
- conv_backward_parser(grad_fn, module2name, var2module, cur_path, result_paths, visited)[source]¶
Parse the backward of a conv layer.
Example
>>> conv = nn.Conv2d(3, 3, 3)
>>> pseudo_img = torch.rand(1, 3, 224, 224)
>>> out = conv(pseudo_img)
>>> print(out.grad_fn.next_functions)
((None, 0),
 (<AccumulateGrad object at 0x0000020E405CBD88>, 0),
 (<AccumulateGrad object at 0x0000020E405CB588>, 0))
>>> # op.next_functions[0][0] is None means this ThnnConv2DBackward
>>> # op has no parents
>>> # op.next_functions[1][0].variable is the weight of this Conv2d module
>>> # op.next_functions[2][0].variable is the bias of this Conv2d module
- find_make_group_parser(node_name, name2module)[source]¶
Find the corresponding make_group_parser according to the node_name.
- find_node_parents(paths)[source]¶
Find the parent node of a node.
A node in the paths can be a module name or an operation name such as concat_140719322997152. Note that the string of numbers following concat does not have a particular meaning; it just makes the operation name unique.
- Parameters
paths (list) – The traced paths.
- get_max_channel_bins(max_channel_bins)[source]¶
Get the max number of channel bins of all the groups which can be pruned during searching.
- Parameters
max_channel_bins (int) – The max number of bins in each layer.
- get_space_id(module_name)[source]¶
Get the corresponding space_id of the module_name.
The modules which share the same space_id will share the same out_mask. If the module is the output module (there is no other nn.Module whose input is its output), this function will return None, as the output module cannot be pruned. If the input of this module is the concatenation of the outputs of several nn.Modules, this function will return a dict object. If this module is in one of the groups, this function will return the group name, as the modules in the same group should share the same space_id. Otherwise, this function will return the module_name as the space_id.
- Parameters
module_name (str) – The name of an nn.Module.
- Returns
The corresponding space_id of the module_name.
- Return type
str or dict or None
- linear_backward_parser(grad_fn, module2name, var2module, cur_path, result_paths, visited)[source]¶
Parse the backward of a linear layer.
Example
>>> fc = nn.Linear(3, 3, bias=True)
>>> input = torch.rand(3, 3)
>>> out = fc(input)
>>> print(out.grad_fn.next_functions)
((<AccumulateGrad object at 0x0000020E405F75C8>, 0),
 (None, 0),
 (<TBackward object at 0x0000020E405F7D48>, 0))
>>> # op.next_functions[0][0].variable is the bias of this Linear module
>>> # op.next_functions[1][0] is None means this AddmmBackward op
>>> # has no parents
>>> # op.next_functions[2][0] is the TBackward op, and
>>> # op.next_functions[2][0].next_functions[0][0].variable is
>>> # the transpose of the weight of this Linear module
- make_same_out_channel_groups(node2parents, name2module)[source]¶
Modules that have the same child should be in the same group.
- abstract sample_subnet()[source]¶
Sample a subnet from the supernet.
- Returns
Records the information to build the subnet from the supernet; its keys are the space_id properties in the pruner’s search spaces, and its values are the corresponding sampled out_masks.
- Return type
dict
- set_channel_bins(channel_bins_dict, max_channel_bins)[source]¶
Set subnet according to the number of channel bins in a layer.
- Parameters
channel_bins_dict (dict) – The number of bins in each layer. Key is the space_id of each layer and value is the corresponding mask of channel bin.
max_channel_bins (int) – The max number of bins in each layer.
- set_subnet(subnet_dict)[source]¶
Modify the in_mask and out_mask of modules in supernet according to subnet_dict.
- Parameters
subnet_dict (dict) – the key is space_id and the value is the corresponding sampled out_mask.
- trace_bn_conv_links(grad_fn, module2name, var2module, bn_conv_links, visited)[source]¶
Get the convolutional layer placed before a bn layer in the model.
Example
>>> conv = nn.Conv2d(3, 3, 3)
>>> bn = nn.BatchNorm2d(3)
>>> pseudo_img = torch.rand(1, 3, 224, 224)
>>> out = bn(conv(pseudo_img))
>>> print(out.grad_fn.next_functions)
((<ThnnConv2DBackward object at 0x0000020E40639688>, 0),
 (<AccumulateGrad object at 0x0000020E40639208>, 0),
 (<AccumulateGrad object at 0x0000020E406398C8>, 0))
>>> # op.next_functions[0][0] is ThnnConv2DBackward means
>>> # the parent of this NativeBatchNormBackward op is ThnnConv2DBackward
>>> # op.next_functions[1][0].variable is the weight of this bn module
>>> # op.next_functions[2][0].variable is the bias of this bn module