  From Pixels to People

Omran, M. (2021). From Pixels to People. PhD Thesis, Universität des Saarlandes, Saarbrücken. doi:10.22028/D291-36605.


Basic

Genre: Thesis
Subtitle: Recovering Location, Shape and Pose of Humans in Images

Files


Locators

Description:
-
OA-Status:
Green

Creators

Creators:
Omran, Mohamed (1, 2), Author
Schiele, Bernt (1), Advisor
Gall, Jürgen (3), Referee
Affiliations:
(1) Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society, ou_1116547
(2) International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551
(3) Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047

Content

Free keywords: -
Abstract:
Humans are at the centre of a significant amount of research in computer vision.
Endowing machines with the ability to perceive people from visual data is an immense
scientific challenge with a high degree of direct practical relevance. Success in automatic
perception can be measured at different levels of abstraction, depending on
which intelligent behaviour we aim to replicate: the ability to localise persons in
an image or in the environment, understanding how persons are moving at the skeleton
and at the surface level, interpreting their interactions with the environment including
with other people, and perhaps even anticipating future actions. In this thesis we tackle
different sub-problems of the broad research area referred to as "looking at people",
aiming to perceive humans in images at different levels of granularity.
We start with bounding box-level pedestrian detection: We present a retrospective
analysis of methods published in the decade preceding our work, identifying various
strands of research that have advanced the state of the art. With quantitative experiments,
we demonstrate the critical role of developing better feature representations
and having the right training distribution. We then contribute two methods based
on the insights derived from our analysis: one that combines the strongest aspects of
past detectors and another that focuses purely on learning representations. The latter
method outperforms more complicated approaches, especially those based on
hand-crafted features. We conclude our work on pedestrian detection with a forward-looking
analysis that maps out potential avenues for future research.
We then turn to pixel-level methods: Perceiving humans requires us to both separate
them precisely from the background and identify their surroundings. To this end, we
introduce Cityscapes, a large-scale dataset for street scene understanding. This has since
established itself as a go-to benchmark for segmentation and detection. We additionally
develop methods that relax the requirement for expensive pixel-level annotations, focusing
on the task of boundary detection, i.e. identifying the outlines of relevant objects and
surfaces. Next, we make the jump from pixels to 3D surfaces, from localising and
labelling to fine-grained spatial understanding. We contribute a method for recovering
3D human shape and pose, which marries the advantages of learning-based and
model-based approaches.
We conclude the thesis with a detailed discussion of benchmarking practices in
computer vision. Among other things, we argue that the design of future datasets
should be driven by the general goal of combinatorial robustness besides task-specific
considerations.

Details

Language(s): eng - English
Dates: 2021-12-15, 2021
Publication Status: Issued
Pages: 252 p.
Publishing info: Saarbrücken : Universität des Saarlandes
Table of Contents: -
Rev. Type: -
Identifiers: BibTex Citekey: Omranphd2021
DOI: 10.22028/D291-36605
URN: nbn:de:bsz:291--ds-366053
Other: hdl:20.500.11880/33466
Degree: PhD
