TorchSeg is an actively maintained and up-to-date fork of the Segmentation Models PyTorch (smp) library.
Features
The main features of this library are:
- High-level API (just two lines to create a neural network)
- 9 model architectures for binary and multi-class segmentation (including the legendary Unet)
- 124 available encoders (and 500+ encoders from timm)
- All encoders have pre-trained weights for faster and better convergence
- Popular losses for training routines
Example Usage
A segmentation model is just a PyTorch nn.Module, which can be created as easily as:
```python
import torchseg

model = torchseg.Unet(
    encoder_name="resnet34",     # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
    encoder_weights="imagenet",  # use `imagenet` pre-trained weights for encoder initialization
    in_channels=1,               # model input channels (1 for gray-scale images, 3 for RGB, etc.)
    classes=3,                   # model output channels (number of classes in your dataset)
)
```
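For a quick sanity check, the created model can be applied to a dummy batch matching the in_channels above (a minimal sketch):

```python
import torch

# dummy batch: (batch, channels, height, width), with channels matching in_channels=1 above
x = torch.randn(4, 1, 256, 256)
mask = model(x)  # expected shape: (4, 3, 256, 256), i.e. one channel per class
```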
The following is a list of supported encoders in TorchSeg. Select the appropriate family of encoders, then pick a specific encoder and its pre-trained weights (the encoder_name and encoder_weights parameters).
ResNet
| Encoder | Weights | Params, M |
|---|---|---|
| resnet18 | imagenet / ssl / swsl | 11M |
| resnet34 | imagenet | 21M |
| resnet50 | imagenet / ssl / swsl | 23M |
| resnet101 | imagenet | 42M |
| resnet152 | imagenet | 58M |
ResNeXt
| Encoder | Weights | Params, M |
|---|---|---|
| resnext50_32x4d | imagenet / ssl / swsl | 22M |
| resnext101_32x4d | ssl / swsl | 42M |
| resnext101_32x8d | imagenet / instagram / ssl / swsl | 86M |
| resnext101_32x16d | instagram / ssl / swsl | 191M |
| resnext101_32x32d | instagram | 466M |
| resnext101_32x48d | instagram | 826M |
ResNeSt
| Encoder | Weights | Params, M |
|---|---|---|
| timm-resnest14d | imagenet | 8M |
| timm-resnest26d | imagenet | 15M |
| timm-resnest50d | imagenet | 25M |
| timm-resnest101e | imagenet | 46M |
| timm-resnest200e | imagenet | 68M |
| timm-resnest269e | imagenet | 108M |
| timm-resnest50d_4s2x40d | imagenet | 28M |
| timm-resnest50d_1s4x24d | imagenet | 23M |
Res2Ne(X)t
| Encoder | Weights | Params, M |
|---|---|---|
| timm-res2net50_26w_4s | imagenet | 23M |
| timm-res2net101_26w_4s | imagenet | 43M |
| timm-res2net50_26w_6s | imagenet | 35M |
| timm-res2net50_26w_8s | imagenet | 46M |
| timm-res2net50_48w_2s | imagenet | 23M |
| timm-res2net50_14w_8s | imagenet | 23M |
| timm-res2next50 | imagenet | 22M |
RegNet(x/y)
| Encoder | Weights | Params, M |
|---|---|---|
| timm-regnetx_002 | imagenet | 2M |
| timm-regnetx_004 | imagenet | 4M |
| timm-regnetx_006 | imagenet | 5M |
| timm-regnetx_008 | imagenet | 6M |
| timm-regnetx_016 | imagenet | 8M |
| timm-regnetx_032 | imagenet | 14M |
| timm-regnetx_040 | imagenet | 20M |
| timm-regnetx_064 | imagenet | 24M |
| timm-regnetx_080 | imagenet | 37M |
| timm-regnetx_120 | imagenet | 43M |
| timm-regnetx_160 | imagenet | 52M |
| timm-regnetx_320 | imagenet | 105M |
| timm-regnety_002 | imagenet | 2M |
| timm-regnety_004 | imagenet | 3M |
| timm-regnety_006 | imagenet | 5M |
| timm-regnety_008 | imagenet | 5M |
| timm-regnety_016 | imagenet | 10M |
| timm-regnety_032 | imagenet | 17M |
| timm-regnety_040 | imagenet | 19M |
| timm-regnety_064 | imagenet | 29M |
| timm-regnety_080 | imagenet | 37M |
| timm-regnety_120 | imagenet | 49M |
| timm-regnety_160 | imagenet | 80M |
| timm-regnety_320 | imagenet | 141M |
GERNet
| Encoder | Weights | Params, M |
|---|---|---|
| timm-gernet_s | imagenet | 6M |
| timm-gernet_m | imagenet | 18M |
| timm-gernet_l | imagenet | 28M |
SE-Net
| Encoder | Weights | Params, M |
|---|---|---|
| senet154 | imagenet | 113M |
| se_resnet50 | imagenet | 26M |
| se_resnet101 | imagenet | 47M |
| se_resnet152 | imagenet | 64M |
| se_resnext50_32x4d | imagenet | 25M |
| se_resnext101_32x4d | imagenet | 46M |
SK-ResNe(X)t
| Encoder | Weights | Params, M |
|---|---|---|
| timm-skresnet18 | imagenet | 11M |
| timm-skresnet34 | imagenet | 21M |
| timm-skresnext50_32x4d | imagenet | 25M |
DenseNet
| Encoder | Weights | Params, M |
|---|---|---|
| densenet121 | imagenet | 6M |
| densenet169 | imagenet | 12M |
| densenet201 | imagenet | 18M |
| densenet161 | imagenet | 26M |
Inception
| Encoder | Weights | Params, M |
|---|---|---|
| inceptionresnetv2 | imagenet / imagenet+background | 54M |
| inceptionv4 | imagenet / imagenet+background | 41M |
| xception | imagenet | 22M |
EfficientNet
| Encoder | Weights | Params, M |
|---|---|---|
| efficientnet-b0 | imagenet | 4M |
| efficientnet-b1 | imagenet | 6M |
| efficientnet-b2 | imagenet | 7M |
| efficientnet-b3 | imagenet | 10M |
| efficientnet-b4 | imagenet | 17M |
| efficientnet-b5 | imagenet | 28M |
| efficientnet-b6 | imagenet | 40M |
| efficientnet-b7 | imagenet | 63M |
| timm-efficientnet-b0 | imagenet / advprop / noisy-student | 4M |
| timm-efficientnet-b1 | imagenet / advprop / noisy-student | 6M |
| timm-efficientnet-b2 | imagenet / advprop / noisy-student | 7M |
| timm-efficientnet-b3 | imagenet / advprop / noisy-student | 10M |
| timm-efficientnet-b4 | imagenet / advprop / noisy-student | 17M |
| timm-efficientnet-b5 | imagenet / advprop / noisy-student | 28M |
| timm-efficientnet-b6 | imagenet / advprop / noisy-student | 40M |
| timm-efficientnet-b7 | imagenet / advprop / noisy-student | 63M |
| timm-efficientnet-b8 | imagenet / advprop | 84M |
| timm-efficientnet-l2 | noisy-student | 474M |
| timm-efficientnet-lite0 | imagenet | 4M |
| timm-efficientnet-lite1 | imagenet | 5M |
| timm-efficientnet-lite2 | imagenet | 6M |
| timm-efficientnet-lite3 | imagenet | 8M |
| timm-efficientnet-lite4 | imagenet | 13M |
MobileNet
| Encoder | Weights | Params, M |
|---|---|---|
| mobilenet_v2 | imagenet | 2M |
| timm-mobilenetv3_large_075 | imagenet | 1.78M |
| timm-mobilenetv3_large_100 | imagenet | 2.97M |
| timm-mobilenetv3_large_minimal_100 | imagenet | 1.41M |
| timm-mobilenetv3_small_075 | imagenet | 0.57M |
| timm-mobilenetv3_small_100 | imagenet | 0.93M |
| timm-mobilenetv3_small_minimal_100 | imagenet | 0.43M |
DPN
| Encoder | Weights | Params, M |
|---|---|---|
| dpn68 | imagenet | 11M |
| dpn68b | imagenet+5k | 11M |
| dpn92 | imagenet+5k | 34M |
| dpn98 | imagenet | 58M |
| dpn107 | imagenet+5k | 84M |
| dpn131 | imagenet | 76M |
VGG
| Encoder | Weights | Params, M |
|---|---|---|
| vgg11 | imagenet | 9M |
| vgg11_bn | imagenet | 9M |
| vgg13 | imagenet | 9M |
| vgg13_bn | imagenet | 9M |
| vgg16 | imagenet | 14M |
| vgg16_bn | imagenet | 14M |
| vgg19 | imagenet | 20M |
| vgg19_bn | imagenet | 20M |
Mix Vision Transformer
Backbone from SegFormer pretrained on ImageNet! Can be used with other decoders from the package; you can combine Mix Vision Transformer with Unet, FPN, and others (see the sketch after the table below)!
Limitations:
- encoder is not supported by Linknet and Unet++
- encoder is supported by FPN only for encoder depth = 5
| Encoder | Weights | Params, M |
|---|---|---|
| mit_b0 | imagenet | 3M |
| mit_b1 | imagenet | 13M |
| mit_b2 | imagenet | 24M |
| mit_b3 | imagenet | 44M |
| mit_b4 | imagenet | 60M |
| mit_b5 | imagenet | 81M |
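For instance, combining a Mix Vision Transformer encoder with the FPN decoder might look like this (a sketch; the default encoder depth of 5 satisfies the limitation above):

```python
import torchseg

# MiT encoders work with FPN only at encoder depth 5 (the default), per the limitation above
model = torchseg.FPN(
    encoder_name="mit_b0",
    encoder_weights="imagenet",
    in_channels=3,
    classes=2,
)
```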
MobileOne
Apple's "sub-one-ms" backbone pretrained on ImageNet! Can be used with all decoders.
Note: in the official GitHub repo, the s0 variant has additional num_conv_branches, leading to more params than s1.
| Encoder | Weights | Params, M |
|---|---|---|
| mobileone_s0 | imagenet | 4.6M |
| mobileone_s1 | imagenet | 4.0M |
| mobileone_s2 | imagenet | 6.5M |
| mobileone_s3 | imagenet | 8.8M |
| mobileone_s4 | imagenet | 13.6M |
* ssl, swsl - semi-supervised and weakly-supervised learning on ImageNet (repo).
Timm Encoders
PyTorch Image Models (a.k.a. timm) provides many pretrained models and an interface that allows these models to be used as encoders in torchseg; however, not all models are supported:
- not all transformer models have the features_only functionality implemented that is required for the encoder
- some models have inappropriate strides
Total number of supported encoders: 549
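As a rough sketch of how such an encoder is selected (the encoder name below is illustrative, and it assumes the plain timm model name can be passed directly as encoder_name):

```python
import torchseg

# any timm backbone with features_only support can, in principle, serve as the encoder
model = torchseg.Unet(
    encoder_name="resnet18d",    # illustrative timm model name
    encoder_weights="imagenet",
    in_channels=3,
    classes=2,
)
```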
Models API
- model.encoder - pretrained backbone to extract features of different spatial resolutions
- model.decoder - depends on the model's architecture (Unet/Linknet/PSPNet/FPN)
- model.segmentation_head - last block to produce the required number of mask channels (also includes optional upsampling and activation)
- model.classification_head - optional block which creates a classification head on top of the encoder
- model.forward(x) - sequentially passes x through the model's encoder, decoder, and segmentation head (and the classification head if specified)
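As a small illustration of this API (a sketch; it assumes, per the description above, that the encoder returns a list of multi-scale feature maps):

```python
import torch
import torchseg

model = torchseg.Unet("resnet34", encoder_weights="imagenet", classes=3)
x = torch.randn(2, 3, 256, 256)

# encoder alone: feature maps at decreasing spatial resolutions (assumption)
features = model.encoder(x)
print([f.shape for f in features])

# full forward pass: encoder -> decoder -> segmentation head
mask = model(x)
print(mask.shape)  # expected (2, 3, 256, 256)
```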
Input channels
The in_channels parameter allows you to create models that process tensors with an arbitrary number of channels.
If you use pretrained weights from ImageNet, the weights of the first convolution will be reused. For the
1-channel case it would be a sum of the weights of the first convolution layer; otherwise channels would be
populated with weights like new_weight[:, i] = pretrained_weight[:, i % 3] and then scaled with new_weight * 3 / new_in_channels.
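For instance, a single-channel (grayscale) model that reuses the ImageNet weights as described above (a minimal sketch):

```python
import torch
import torchseg

# the first-conv weights are adapted from the 3-channel imagenet weights as described above
model = torchseg.Unet("resnet34", encoder_weights="imagenet", in_channels=1, classes=2)

x = torch.randn(8, 1, 224, 224)  # batch of single-channel images
mask = model(x)                  # expected shape: (8, 2, 224, 224)
```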
All models support the aux_params parameter, which is set to None by default.
If aux_params = None, the auxiliary classification output is not created; otherwise the
model produces not only a mask, but also a label output with shape (N, C).
The classification head consists of GlobalPooling->Dropout(optional)->Linear->Activation(optional) layers, which can be
configured by aux_params as follows:
```python
aux_params = dict(
    pooling="avg",         # one of 'avg', 'max'
    dropout=0.5,           # dropout ratio, default is None
    activation="sigmoid",  # activation function, default is None
    classes=4,             # define number of output labels
)
model = torchseg.Unet("resnet34", classes=4, aux_params=aux_params)
mask, label = model(x)
```
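Assuming a batch of 3-channel inputs for the model above, the two outputs then have the following shapes (a sketch):

```python
import torch

x = torch.randn(2, 3, 256, 256)
mask, label = model(x)
print(mask.shape)   # (2, 4, 256, 256) -- one channel per segmentation class
print(label.shape)  # (2, 4)           -- one score per classification label
```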
Depth
The encoder_depth parameter specifies the number of downsampling operations in the encoder, so you can make
your model lighter by specifying a smaller depth.
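For example, a lighter Unet could be configured as below; this is a sketch that assumes, as in upstream smp, a decoder_channels tuple whose length matches encoder_depth:

```python
import torchseg

# encoder_depth=3 keeps only the first three downsampling stages of the backbone
model = torchseg.Unet(
    "resnet34",
    encoder_weights="imagenet",
    encoder_depth=3,
    decoder_channels=(64, 32, 16),  # one entry per decoder stage (illustrative values)
    classes=2,
)
```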