hide
Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV,Computer Science, Artificial Intelligence, cs.AI
Abstract:
Current efficient LiDAR-based detection frameworks are lacking in exploiting
object relations, which naturally present in both spatial and temporal manners.
To this end, we introduce a simple, efficient, and effective two-stage
detector, termed as Ret3D. At the core of Ret3D is the utilization of novel
intra-frame and inter-frame relation modules to capture the spatial and
temporal relations accordingly. More Specifically, intra-frame relation module
(IntraRM) encapsulates the intra-frame objects into a sparse graph and thus
allows us to refine the object features through efficient message passing. On
the other hand, inter-frame relation module (InterRM) densely connects each
object in its corresponding tracked sequences dynamically, and leverages such
temporal information to further enhance its representations efficiently through
a lightweight transformer network. We instantiate our novel designs of IntraRM
and InterRM with general center-based or anchor-based detectors and evaluate
them on Waymo Open Dataset (WOD). With negligible extra overhead, Ret3D
achieves the state-of-the-art performance, being 5.5% and 3.2% higher than the
recent competitor in terms of the LEVEL 1 and LEVEL 2 mAPH metrics on vehicle
detection, respectively.