Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract:
Image deblurring aims to recover the latent sharp image from its blurry
counterpart and has a wide range of applications in computer vision.
Convolutional Neural Networks (CNNs) have performed well in this domain for
many years, but recently an alternative architecture, the Transformer, has
demonstrated even stronger performance. Its superiority can be attributed to
the multi-head self-attention (MHSA) mechanism, which offers a larger
receptive field and better adaptability to input content than CNNs.
However, the computational cost of MHSA grows quadratically with the input
resolution, making it impractical for high-resolution image deblurring. In
this work, we propose a unified lightweight CNN that features a large
effective receptive field (ERF) and achieves comparable or even better
performance than Transformers at a lower computational cost. Our key design
is an efficient CNN block, dubbed LaKD, that combines a large-kernel
depth-wise convolution with a spatial-channel mixing structure, attaining an
ERF comparable to or larger than that of Transformers with fewer parameters.
Specifically, we achieve +0.17 dB / +0.43 dB PSNR over the state-of-the-art
the state-of-the-art Restormer on defocus / motion deblurring benchmark
datasets with 32% fewer parameters and 39% fewer MACs. Extensive experiments
demonstrate the superior performance of our network and the effectiveness of
each module. Furthermore, we propose a compact and intuitive ERFMeter metric
that quantitatively characterizes the ERF and shows a high correlation with
network performance. We hope this work can inspire the research community to
further explore the pros and cons of CNN and Transformer architectures beyond
image deblurring tasks.
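
The abstract describes the LaKD block only at a high level; the paper's actual implementation is not given here. As a rough, hypothetical illustration of what a large-kernel depth-wise convolution with spatial-channel mixing could look like, here is a minimal PyTorch sketch. The class name, kernel size (31), normalization choice, and 1x1 mixing layers are all assumptions for illustration, not the authors' LaKD design.

```python
# Hypothetical sketch of a large-kernel depth-wise block with
# spatial-channel mixing. Names and hyperparameters are illustrative
# assumptions, not the paper's LaKD implementation.
import torch
import torch.nn as nn

class LargeKernelBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 31):
        super().__init__()
        # Depth-wise conv: one large spatial filter per channel, so
        # parameters grow with kernel area * channels rather than
        # quadratically in channels as in a dense convolution. The wide
        # kernel is what enlarges the effective receptive field.
        self.spatial_mix = nn.Conv2d(
            channels, channels, kernel_size,
            padding=kernel_size // 2, groups=channels,
        )
        # Point-wise (1x1) convs mix information across channels,
        # complementing the purely spatial depth-wise step.
        self.channel_mix = nn.Sequential(
            nn.Conv2d(channels, channels, 1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 1),
        )
        # GroupNorm(1, C) acts like a per-sample LayerNorm over channels.
        self.norm = nn.GroupNorm(1, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connections around spatial mixing, then channel mixing.
        x = x + self.spatial_mix(self.norm(x))
        x = x + self.channel_mix(x)
        return x

# A single 31x31 depth-wise layer already gives each output pixel a
# wide spatial context.
feats = torch.randn(1, 48, 128, 128)
out = LargeKernelBlock(48)(feats)
print(out.shape)  # torch.Size([1, 48, 128, 128])
```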
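The abstract likewise does not define how ERFMeter quantifies the ERF. A common generic procedure for visualizing an ERF, in the spirit of Luo et al. (2016), is to back-propagate from a central output activation to the input and inspect the spatial spread of the gradient magnitudes. The sketch below shows that generic procedure only; it should not be read as the paper's ERFMeter metric.

```python
# Generic ERF visualization via input gradients of the center output
# pixel (after Luo et al., 2016). NOT the paper's ERFMeter definition,
# which the abstract does not specify.
import torch

def effective_receptive_field(model: torch.nn.Module,
                              in_shape=(1, 3, 128, 128)) -> torch.Tensor:
    x = torch.randn(in_shape, requires_grad=True)
    y = model(x)
    h, w = y.shape[-2] // 2, y.shape[-1] // 2
    # Gradient of the central activation (summed over channels)
    # with respect to every input pixel.
    y[..., h, w].sum().backward()
    # Aggregate absolute gradients over batch and channels into an
    # H x W map; its spatial spread indicates how far away input
    # pixels still influence the center output.
    return x.grad.abs().sum(dim=(0, 1))

# Example: a single 31x31 conv already yields a broad gradient footprint.
erf_map = effective_receptive_field(torch.nn.Conv2d(3, 3, 31, padding=15))
print(erf_map.shape)  # torch.Size([128, 128])
```

A scalar summary of such a map (e.g., the radius containing most of the gradient mass) would correlate with how "large" the ERF is, which is presumably the kind of quantity ERFMeter compacts into a single number.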