Abstract:
Background: Automated species identification is a long-term research subject. Unlike flowers and fruits, leaves
are available throughout most of the year. Because their margin and texture help characterize a species, they are
the most studied organ for automated identification. As machine learning techniques have matured substantially,
they require more training data (i.e., leaf images). Researchers as well as enthusiasts lack guidance on how to
acquire suitable training images efficiently.
Methods: In this paper, we systematically study nine image types and three preprocessing strategies. Image types
vary in terms of in-situ image recording conditions: perspective, illumination, and background, while the preprocessing
strategies compare non-preprocessed, cropped, and segmented images to each other. For each combination of image
type and preprocessing strategy, we also quantify the manual effort it requires. We extract image features
using a convolutional neural network, classify species using the resulting feature vectors, and discuss classification
accuracy in relation to the required effort per combination.
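
The difference between the cropped and segmented preprocessing strategies can be illustrated with a minimal sketch. It assumes a grayscale image stored as nested lists and a binary leaf mask that is already available (for example, drawn by hand); the names `bounding_box`, `crop`, and `segment` are hypothetical and do not come from the study.

```python
from typing import List, Tuple

Image = List[List[int]]   # grayscale pixel rows (assumed representation)
Mask = List[List[bool]]   # True where a pixel belongs to the leaf


def bounding_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right), all inclusive, of the leaf region."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no leaf pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop(image: Image, mask: Mask) -> Image:
    """Cropped strategy: keep the leaf's bounding box, background included."""
    top, left, bottom, right = bounding_box(mask)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def segment(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Segmented strategy: crop, then replace background pixels with `fill`."""
    top, left, bottom, right = bounding_box(mask)
    return [
        [image[r][c] if mask[r][c] else fill for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


# Toy 4x4 image: background value 9, leaf pixels 5, 6, and 7.
image = [
    [9, 9, 9, 9],
    [9, 5, 6, 9],
    [9, 9, 7, 9],
    [9, 9, 9, 9],
]
mask = [
    [False, False, False, False],
    [False, True,  True,  False],
    [False, False, True,  False],
    [False, False, False, False],
]

cropped = crop(image, mask)       # [[5, 6], [9, 7]]
segmented = segment(image, mask)  # [[5, 6], [0, 7]]
```

Cropping needs only the four bounding-box coordinates, whereas segmentation needs a pixel-accurate mask, which is where the additional manual effort arises.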
Results: The most effective non-destructive way to record herbaceous leaves is to take an image of the leaf’s top
side. We obtain the highest classification accuracy using destructive back light images, i.e., holding the plucked leaf
against the sky for image acquisition. Cropping the image to the leaf’s boundary substantially improves accuracy,
while precise segmentation yields similar accuracy at a substantially higher effort. Whether a flash is always used
or never used has a negligible effect. Imaging the typically more strongly textured back side of a leaf does not
result in higher accuracy, but notably increases the acquisition cost.
Conclusions: The way in which leaf images are acquired and preprocessed has a substantial effect on the accuracy
of the classifier trained on them. For the first time, this study provides a systematic guideline that allows
researchers to spend available acquisition resources wisely while achieving the best possible classification accuracy.