Abstract With recent advances in semantic segmentation, encoder-decoder architectures such as U-Net have become the most widely used approach to biomedical image segmentation. To improve on the existing U-Net, we propose a novel architecture called the Multi-Scale Dilated Fusion Network (MSDFNet). We use a pre-trained ResNet50 as the encoder, whose previously learned features the decoder uses to generate the binary mask. In addition, we use skip connections to transfer features directly from the encoder to the decoder, because some of these features would otherwise be lost due to the depth of the network. The main component of the decoder is the Multi-Scale Dilated Fusion block, which fuses the multi-scale features and then applies dilated convolutions to them. We trained both U-Net and the proposed architecture on the Kvasir-Instrument dataset, where the proposed architecture achieves a gain of 3.701 % in F1 score and 4.376 % in Jaccard index over U-Net. These results show an improvement over the existing U-Net model.
Field: Engineering
Journal Type: International
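
The abstract reports gains in F1 score and Jaccard index for binary masks. As a minimal sketch of how these two metrics are commonly computed from pixel counts, the following may help. The function name `f1_and_jaccard`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; they are not taken from the paper, whose evaluation code is not shown here.

```python
def f1_and_jaccard(pred, target):
    """Return (F1, Jaccard) for two equally sized binary masks.

    Masks are nested lists of 0/1 values (hypothetical format for this sketch).
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # foreground predicted and present
            elif p and not t:
                fp += 1          # foreground predicted but absent
            elif t and not p:
                fn += 1          # foreground missed

    f1_denominator = 2 * tp + fp + fn
    jaccard_denominator = tp + fp + fn

    # Assumption: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_denominator if f1_denominator else 1.0
    jaccard = tp / jaccard_denominator if jaccard_denominator else 1.0
    return f1, jaccard


# Toy 2x3 masks, not data from the paper.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
f1, jaccard = f1_and_jaccard(pred_mask, true_mask)
```

For a single pair of masks the two metrics are tied by Jaccard = F1 / (2 - F1), so they always rank two predictions the same way. Averages over a dataset need not follow that identity, which is one reason both figures are usually reported.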