Classification of Two Comic Books based on Convolutional Neural Networks

  • Miki Ueno
    Toyohashi University of Technology ueno[at]imc.tut.ac.jp
  • Toshinori Suenaga
    Toyohashi University of Technology
  • Hitoshi Isahara
    Toyohashi University of Technology

Abstract

Non-photographic images are powerful representations that can describe a wide variety of situations. Understanding intellectual products such as comics and picture books is therefore an important topic in the field of artificial intelligence. To this end, a stepwise analysis of comic stories was pursued, covering features of parts of an image, information features, features relating to continuous scenes, and so on. Four-scene comics are of particular interest because their length and the content of each scene are limited so as to ensure a clear interpretation. In this study, as a first step in this direction, we focus on the problem of classifying two four-scene comics by the same artists. Several classifiers were constructed using a Convolutional Neural Network (CNN), and the classification results of a human annotator were compared with those of the computational method. These experiments show that a CNN is an efficient way to classify non-photographic grayscale images, and they identify characteristic features of the images that were classified incorrectly.
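To make the kind of classifier described in the abstract concrete, the following is a minimal sketch of a CNN forward pass that maps a grayscale image to probabilities over two classes (for example, the two comics). It is an illustration only: the layer sizes, the 8x8 input, the random untrained weights, and all function names are assumptions made here, not the architecture or code used by the authors.

```python
import math
import random


def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    return [[max(fmap[i + a][j + b]
                 for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(image, kernels, weights, biases):
    """conv -> ReLU -> max-pool per kernel, then one linear layer + softmax."""
    features = []
    for kernel in kernels:
        pooled = max_pool(relu(conv2d(image, kernel)))
        features.extend(v for row in pooled for v in row)
    scores = [sum(w * f for w, f in zip(ws, features)) + b
              for ws, b in zip(weights, biases)]
    return softmax(scores)


# Hypothetical toy input: an 8x8 grayscale patch with values in [0, 1).
rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]

# Two 3x3 kernels with random (untrained) weights, for illustration only.
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(2)]

# 8x8 -> 3x3 conv -> 6x6 -> 2x2 pool -> 3x3, for each of the 2 kernels.
n_features = 2 * 3 * 3
weights = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(2)]
biases = [0.0, 0.0]

probs = forward(image, kernels, weights, biases)  # [P(comic A), P(comic B)]
```

In practice the weights would be learned from labeled comic images, and the predicted class (the larger of the two probabilities) could then be compared against a human annotator's labels, as the abstract describes.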

How to cite

Ueno, M., Suenaga, T., & Isahara, H. (2017). Classification of Two Comic Books based on Convolutional Neural Networks. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 6(1), 5–12. https://doi.org/10.14201/ADCAIJ201761512
