Revisiting Image Deblurring with an Efficient ConvNet

Ruan, L., Bemana, M., Seidel, H.-P., Myszkowski, K., & Chen, B. (2023). Revisiting Image Deblurring with an Efficient ConvNet. Retrieved from https://arxiv.org/abs/2302.02234.

Files

arXiv:2302.02234.pdf (Preprint), 52MB
Name: arXiv:2302.02234.pdf
Description: File downloaded from arXiv at 2023-03-17 08:30
OA-Status: Not specified
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -

Creators

Creators:
Ruan, Lingyan (1), Author
Bemana, Mojtaba (1), Author
Seidel, Hans-Peter (1), Author
Myszkowski, Karol (1), Author
Chen, Bin (1), Author
Affiliations:
(1) Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047

Content

Free keywords: Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract: Image deblurring aims to recover the latent sharp image from its blurry counterpart and has a wide range of applications in computer vision. Convolutional Neural Networks (CNNs) have performed well in this domain for many years; recently, an alternative architecture, the Transformer, has demonstrated even stronger performance. Its superiority can be attributed to the multi-head self-attention (MHSA) mechanism, which offers a larger receptive field and better adaptability to input content than CNNs. However, because MHSA incurs computational costs that grow quadratically with input resolution, it becomes impractical for high-resolution image deblurring. In this work, we propose a unified lightweight CNN that features a large effective receptive field (ERF) and achieves comparable or even better performance than Transformers at a lower computational cost. Our key design is an efficient CNN block, dubbed LaKD, equipped with a large-kernel depth-wise convolution and a spatial-channel mixing structure, attaining a comparable or larger ERF than Transformers with a smaller parameter scale. Specifically, we achieve +0.17 dB / +0.43 dB PSNR over the state-of-the-art Restormer on defocus / motion deblurring benchmark datasets with 32% fewer parameters and 39% fewer MACs. Extensive experiments demonstrate the superior performance of our network and the effectiveness of each module. Furthermore, we propose a compact and intuitive ERFMeter metric that quantitatively characterizes the ERF and correlates strongly with network performance. We hope this work inspires the research community to further explore the pros and cons of CNN and Transformer architectures beyond image deblurring tasks.
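The abstract names the general decomposition behind LaKD only at a high level; the exact block design is in the paper itself. As an illustration of that decomposition — a large-kernel depth-wise convolution (spatial filtering, one kernel per channel) followed by channel mixing (a 1×1 convolution) — here is a minimal plain-Python sketch. All names, shapes, and kernel sizes are illustrative assumptions, not the paper's implementation.

```python
def depthwise_conv2d(x, kernels):
    """Depth-wise 2-D convolution: each channel is filtered by its own
    kernel, with no mixing across channels ('same' zero padding, stride 1).
    x: C x H x W nested lists; kernels: C kernels, each k x k (k odd)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    k, pad = len(kernels[0]), len(kernels[0]) // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c in range(C):
        for i in range(H):
            for j in range(W):
                s = 0.0
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < H and 0 <= jj < W:
                            s += x[c][ii][jj] * kernels[c][di][dj]
                out[c][i][j] = s
    return out


def pointwise_mix(x, weight):
    """1x1 (point-wise) convolution: mixes information across channels at
    each spatial position. weight[o][c] maps input channel c to output o."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[sum(weight[o][c] * x[c][i][j] for c in range(C))
              for j in range(W)] for i in range(H)]
            for o in range(len(weight))]


# Toy demo: a single constant channel through a 3x3 all-ones kernel.
feat = [[[1.0] * 5 for _ in range(5)]]
dw = [[[1.0] * 3 for _ in range(3)]]
y = depthwise_conv2d(feat, dw)
print(y[0][2][2])  # 9.0 at the centre (full 3x3 window of ones)
```

The parameter-count argument for this split is standard: for C channels and a k×k kernel, depth-wise plus point-wise costs C·k² + C² weights, versus C²·k² for a dense convolution — e.g. with C = 64 and k = 31, roughly 65.6k versus 3.9M — which is what makes very large kernels affordable and the resulting ERF comparable to attention at a smaller parameter scale.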

Details

Language(s): eng - English
Dates: 2023-02-04, 2023
Publication Status: Published online
Pages: 30 p.
Publishing info: -
Table of Contents: -
Rev. Type: -
Identifiers: arXiv: 2302.02234
URI: https://arxiv.org/abs/2302.02234
BibTex Citekey: ruan2023revisiting
Degree: -
