
Visual Information Processing (Matsuyama Lab)

Human beings are endowed with highly flexible visual perception capabilities to recognize objects and understand dynamic situations. The goal of our research is to analyze human intelligence and to realize image-understanding systems as capable as human beings. We study a wide spectrum of hardware and software technologies for image processing, recognition, understanding, and generation. Currently, we are conducting two projects: 1) the 3D Video project, for capturing, editing, and visualizing real 3D motion pictures, and 2) the Human Communication project, for developing information systems that can communicate with humans naturally.

Academic Staff


Hiroaki KAWASHIMA, Associate Professor (Graduate School of Informatics)

Research Interests

Video processing and recognition / Temporal-pattern recognition / Multimedia integration / Human-computer Interaction (Human communication) / Hybrid dynamical system


Engineering Bld. 3, Room I303
TEL: +81-75-753-3327
FAX: +81-75-753-3327


Shohei NOBUHARA, Junior Associate Professor (Graduate School of Informatics)

Research Interests

Computer vision / 3D shape and motion estimation / 3D video


Engineering Bld. 3, Room I302
TEL: +81-75-753-5883

Introduction to R&D Topics

Development of 3D Video

3D video is an advanced image medium that records dynamic visual events in the real world as they are: it captures time-varying 3D object shape together with high-fidelity surface properties (i.e., color and texture). Its applications cover a wide variety of personal and social human activities: entertainment (e.g., 3D games and 3D TV), education (e.g., 3D animal picture books), sports (e.g., sport performance analysis), medicine (e.g., 3D surgery monitoring), culture (e.g., 3D archives of traditional dances), and so on. Our research topics include (but are not limited to):

[Real-time Full 3D Shape and Motion Reconstruction from Multi-Viewpoint Video]
A group of network-connected active cameras surrounding a person captures multi-viewpoint video data, from which the 3D shape and motion of the person are computed in real time. When the person moves around, the cameras automatically track him/her to keep capturing multi-viewpoint video.
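The shape-from-silhouette principle underlying this kind of reconstruction can be illustrated with a toy voxel-carving sketch. This is a deliberately simplified version with orthographic views along the coordinate axes; the actual system uses calibrated perspective cameras and real-time parallel processing.

```python
import numpy as np

def visual_hull(silhouettes, n):
    """Carve a voxel grid: keep only the voxels whose projection lies
    inside every silhouette (toy orthographic shape-from-silhouette)."""
    grid = np.ones((n, n, n), dtype=bool)
    for axis, mask in silhouettes.items():
        # Broadcasting the 2-D mask along the projection axis removes
        # every voxel that falls outside this view's silhouette.
        grid &= np.expand_dims(mask, axis=axis)
    return grid

# Toy scene: a centered sphere observed from three orthographic views.
n = 32
coords = np.indices((n, n, n)) - n // 2
sphere = (coords ** 2).sum(axis=0) <= (n // 4) ** 2
silhouettes = {axis: sphere.any(axis=axis) for axis in range(3)}
hull = visual_hull(silhouettes, n)
```

The hull always contains the true object (it is an over-approximation of the shape); adding more viewpoints tightens it.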

[3D Video Generation and Visualization]
3D video is generated by mapping the captured video data onto the surface of the reconstructed 3D object. We can then interactively view the object in action by changing the viewpoint, and/or display its real 3D image on a 3D display.
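One common decision in this kind of texture mapping can be sketched as follows, under a simplified assumption: for each mesh face, select the camera that views it most frontally. A real system would also blend multiple views and handle visibility/occlusion.

```python
import numpy as np

def best_camera(face_normal, cam_dirs):
    """Return the index of the camera whose viewing direction is most
    aligned with the face normal (view-dependent texture selection)."""
    face_normal = face_normal / np.linalg.norm(face_normal)
    # Cosine between the outward face normal and each object-to-camera
    # direction; the largest cosine means the most frontal view.
    return int(np.argmax(cam_dirs @ face_normal))

# Hypothetical rig: three cameras on the +x, +y, and +z axes.
cam_dirs = np.eye(3)
front_face = np.array([0.9, 0.1, 0.0])  # faces mostly toward camera 0
```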

[Lighting Environment Estimation]
In general, captured images of an object change depending on the lighting environment (i.e., light source types and their spatial arrangement). We are studying methods to estimate the lighting environment from captured image data. Once the lighting environment is correctly estimated, we can generate object images under different lighting environments (e.g., object images lit by moving candlelight).
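Under a simple Lambertian shading assumption (observed intensity = surface normal dotted with the light direction), a single distant light direction can be recovered from shading by least squares. The sketch below uses synthetic normals; real scenes additionally require handling shadows, inter-reflections, and multiple light sources.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic Lambertian surface: random unit normals, one distant light.
normals = rng.normal(size=(200, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
light_true = np.array([0.3, 0.5, 0.81])
light_true /= np.linalg.norm(light_true)
intensity = normals @ light_true
lit = intensity > 0  # keep only points the light actually reaches
# Least-squares fit of I = n . l recovers the light direction.
light_est, *_ = np.linalg.lstsq(normals[lit], intensity[lit], rcond=None)
light_est /= np.linalg.norm(light_est)
```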

[3D Video Compression]
Since 3D video data is huge, we have to develop efficient coding methods to transmit and/or store it, and we are developing such 3D video coding algorithms. While the algorithms themselves are quite different from MPEG coding for 2D video, they can be implemented as pre- and post-processing stages of an MPEG codec, so they can easily be used in ordinary digital TVs and video recorders.
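The pre/post-processing idea can be sketched as follows, assuming per-frame geometry has been resampled onto a regular grid: vertex coordinates are quantized into an 8-bit image that a standard 2D codec can carry, and dequantized after decoding. This is a toy round-trip only; the lab's actual coding algorithms are more elaborate.

```python
import numpy as np

def pack(vertices, lo, hi):
    """Pre-processing: quantize grid-sampled 3-D vertex positions into
    an 8-bit 'geometry image' an ordinary 2-D codec can compress."""
    norm = (vertices - lo) / (hi - lo)
    return np.round(norm * 255).astype(np.uint8)

def unpack(img, lo, hi):
    """Post-processing: map the decoded 8-bit image back to coordinates."""
    return img.astype(np.float64) / 255.0 * (hi - lo) + lo

rng = np.random.default_rng(1)
verts = rng.uniform(-1.0, 1.0, size=(16, 16, 3))  # a 16x16 vertex grid
lo, hi = verts.min(), verts.max()
recon = unpack(pack(verts, lo, hi), lo, hi)
```

The round-trip is lossy but bounded: the reconstruction error is at most half a quantization step.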

[Archiving Intangible Cultural Assets]
With 3D video, we can record and archive intangible cultural assets such as traditional dances and Olympic sports.


Development of Human Communication Systems

We promote human-machine interaction research by shifting the focus from traditional reactive (command-response) systems to proactive frameworks. Proactive interaction systems actively understand the meaning of human behaviors, and autonomously respond and act at the proper timing. Dynamic interaction mechanisms will play a crucial role in facilitating natural and smooth communication in this framework.

Our research includes (but is not limited to) the following topics.

[Measurement and Analysis of Human Behaviors using Multi-Camera Systems]
With 3D video technologies, we measure human postures and actions (e.g., pointing gestures and facial expressions) using multi-camera systems, and analyze 3D human behaviors to estimate their intentional/unintentional meanings (e.g., agreement and disagreement) and internal mental states (e.g., degree of interest and concentration).

[Algorithm Development for Dynamic Pattern Recognition]
In order to understand human behaviors such as facial expressions, gaze and lip motion, utterance patterns, and gestures, we are developing algorithms that recognize and extract the dynamic characteristics of human behaviors from multimedia signals (e.g., video and audio data).
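As one illustration of temporal-pattern comparison (a standard textbook technique, not necessarily the lab's own hybrid dynamical system model), dynamic time warping measures the similarity between behaviors performed at different speeds:

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences: aligns
    them non-linearly in time before summing pointwise differences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

slow = np.sin(np.linspace(0, 2 * np.pi, 40))   # same "gesture", half speed
fast = np.sin(np.linspace(0, 2 * np.pi, 20))
other = np.cos(np.linspace(0, 2 * np.pi, 20))  # a different pattern
```

Because the warping absorbs speed differences, the slow and fast versions of the same pattern score as far more similar than two genuinely different patterns.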

[Analysis and Design of Dynamic Human Interaction]
Characterizing interaction dynamics (i.e., the spatio-temporal patterns and structures among utterances and actions) in human-human communication is a crucial step toward realizing natural human-machine interaction. We study a variety of human-human communication scenarios and have identified interesting dynamic characteristics, based on which several human-machine interaction systems have been developed.

[Development of Real-world Interaction Systems]
By integrating the fundamental technologies above, we are developing proactive interaction systems. They actively estimate the meaning of unintentional behaviors and the internal mental states of human beings (e.g., drivers), and autonomously present context-sensitive information and respond at the proper timing. We have prepared several research environments: a multimedia presentation display for information navigation, and a driving simulator with versatile sensors such as cameras, microphones, polygraphs, and near-infrared spectroscopy (NIRS) for brain-activity measurement.

[Figures: driving simulator; display system]