The rapid technological development of recent years in
the acquisition, processing, transmission and storage of
audiovisual data has driven an ever-growing production
of data in several formats and domains (e.g. broadcasting and the Internet). In practice, the
usefulness and value of any audiovisual data depend on how easily
one can find it, which makes this an important problem both for consumers who
want to find content and for vendors who want to sell it. In this context, where it is easier to acquire and code content than
it is to find content that is supposed to exist (possibly on-line), a project emerged
under the name of Multimedia
Content Description Interface, or MPEG-7.
This project aims to establish a standard for describing and
retrieving audiovisual data quickly and efficiently, standardizing four things:
- a set of descriptors;
- a set of description schemes;
- a language to define description schemes and probably descriptors; and
- one or more methods to code the descriptions.
The project's goal is to produce descriptions
based only on audiovisual characteristics, such as shape, color, texture, motion and spatial
relations; it does not seek new solutions for textual descriptions. However, MPEG-7 will
consider existing solutions for the description of text documents.
The work I have been developing in this
context contributes to solving the problem presented here. It
consists of two projects: an "Automatic Video Description Application"
and a study of "Description and Retrieval of Video using Shape Descriptors".
"Automatic Video Description
Application"
This first project, now concluded, was a Windows
application developed in Visual C++ that automatically described video using audiovisual
characteristics, such as shape, color, motion and spatial relations, together with a pre-defined
generic description model. The user of the application could adapt this model
to define a domain-specific description model. Figures 1-5 show examples of
the representation of some of the descriptors used to describe the audiovisual data
analysed by the application.
Figure 1: Color descriptor - a) Stefan.qcif frame 0, and b) its color histogram.
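
A color histogram such as the one in Figure 1b counts how many pixels fall into each quantized color bin. The following is a minimal Python sketch, not the application's actual implementation: it assumes 8-bit RGB pixels and a uniform quantization controlled by a hypothetical `bins_per_channel` parameter, since the color space and bin layout used by the application are not stated here.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per quantized RGB bin.

    pixels: iterable of (r, g, b) tuples with 8-bit values (0-255).
    Returns a list of bins_per_channel**3 counts.
    Assumption: uniform quantization of each channel (illustrative only).
    """
    n = bins_per_channel
    hist = [0] * (n ** 3)
    for r, g, b in pixels:
        # Map each 0-255 value to a bin index in the range 0..n-1.
        ri, gi, bi = r * n // 256, g * n // 256, b * n // 256
        hist[(ri * n + gi) * n + bi] += 1
    return hist
```

Two frames can then be compared by a distance between their histograms; which distance to use is a separate design choice not covered by this sketch.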
Figure 2: Color descriptor - a) Cyclamen.sif frame 0, and b) its color set back projection.
Figure 3: Motion descriptor - a) Coastguard.qcif frame 195, and b) its motion vector field for frames
195-198.
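
One common way to obtain a motion vector field like the one in Figure 3b is block matching. The sketch below is an assumption for illustration only: the estimation method actually used by the application is not stated here. It performs an exhaustive search that minimizes the sum of absolute differences (SAD) over grayscale frames stored as lists of rows. The function name and the `size` and `search` parameters are hypothetical.

```python
def block_motion_vector(prev, curr, top, left, size=2, search=1):
    """Return (dy, dx) so that the block of `curr` at (top, left) best
    matches the block of `prev` at (top + dy, left + dx).

    Assumption: exhaustive block matching with the SAD criterion.
    """
    h, w = len(prev), len(prev[0])
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = top + dy, left + dx
            # Skip candidate blocks that fall outside the previous frame.
            if y0 < 0 or x0 < 0 or y0 + size > h or x0 + size > w:
                continue
            sad = sum(
                abs(curr[top + i][left + j] - prev[y0 + i][x0 + j])
                for i in range(size)
                for j in range(size)
            )
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best
```

Under this convention the vector points from the block in the current frame to its best match in the previous frame; repeating the search for every block of the frame yields the field.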
Figure 4: Shape descriptor - a) Max chord of the object cyclamen_obj.sif frame 142, and b) its equivalent
circular diameter.
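
The two shape measures in Figure 4 can be sketched from a binary object mask. This sketch assumes the usual definitions: the max chord is the largest distance between two points of the object, and the equivalent circular diameter is the diameter of the circle with the same area as the object, d = 2 * sqrt(A / pi). Distances are measured between pixel centres, and the brute-force pair search is only meant for small masks; the application's own implementation may differ.

```python
import math
from itertools import combinations


def equivalent_circular_diameter(mask):
    """Diameter of the circle whose area equals the object's pixel count."""
    area = sum(1 for row in mask for v in row if v)
    return 2.0 * math.sqrt(area / math.pi)


def max_chord(mask):
    """Largest distance between two object pixels (centre to centre).

    Brute force over all pixel pairs: O(n^2), illustrative only.
    """
    pts = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    return max((math.dist(p, q) for p, q in combinations(pts, 2)), default=0.0)
```

Restricting the pair search to boundary pixels gives the same max chord at a lower cost, since the two farthest points of a region always lie on its boundary.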
Figure 5: Spatial
relation descriptor - a) Coastguard.qcif frame 50, b) its objects' masks, and c) their
relative positions, to be described with a 2D-String.
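
A 2D-String encodes spatial relations by projecting the objects onto the two image axes and recording their order along each one. The sketch below is a simplified illustration, not the application's code: it assumes each object is reduced to a single reference point (for example its centroid) and uses only the "<" and "=" operators. The object names in the usage note are hypothetical.

```python
def two_d_string(objects):
    """Build a simplified 2D-String (u, v) from object reference points.

    objects: dict mapping an object name to its (x, y) position.
    u orders the objects along x, v along y; "<" separates objects at
    increasing coordinates and "=" joins objects at the same coordinate.
    """
    def project(axis):
        # Sort by coordinate, then by name so that ties are deterministic.
        ordered = sorted(objects.items(), key=lambda kv: (kv[1][axis], kv[0]))
        parts, prev = [], None
        for name, pos in ordered:
            if prev is not None:
                parts.append("=" if pos[axis] == prev else "<")
            parts.append(name)
            prev = pos[axis]
        return " ".join(parts)

    return project(0), project(1)
```

For example, `two_d_string({"A": (10, 30), "B": (50, 30), "C": (20, 5)})` returns `("A < C < B", "C < A = B")`. The ordering follows increasing coordinate values, so with image coordinates the second string runs from top to bottom.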
"Visual Content's Description
and Retrieval using Shape Descriptors"
As seen before, some of today's most important trends in
audiovisual technologies are interactivity, personalization and
universal access to content, all of which are closely related to an
object-oriented representation model. An object is visually represented by
means of its texture (luminance and chrominance) and shape information, as shown in Figure 6. Shape
information is new with respect to the representation, coding or
description of objects, since texture information was already used in
models based on rectangular frames.
Figure 6: Bream -
frame 1, a) object's texture information, and b) object's shape information.
Given the importance and novelty of the concepts of object and
shape in the audiovisual world, the work presented here
is a study carried out in close connection with the work being done on MPEG-7's XM.
The objects analysed are of two types: simple objects consisting of a single region (a set of
connected pixels) and complex objects consisting of a set of regions.
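
The distinction between simple and complex objects can be illustrated by counting the connected regions of a binary object mask. The following is a minimal sketch under stated assumptions: it uses 4-connectivity (the connectivity rule is not specified here) and a breadth-first flood fill, and the function names are hypothetical.

```python
from collections import deque


def count_regions(mask):
    """Count 4-connected regions of non-zero pixels in a binary mask."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    regions = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # New region found: flood-fill it so it is counted only once.
            regions += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions


def object_type(mask):
    """Classify an object as "simple" (one region) or "complex" (several)."""
    n = count_regions(mask)
    if n == 0:
        raise ValueError("empty mask: no object pixels")
    return "simple" if n == 1 else "complex"
```

With 8-connectivity, diagonally touching pixels would be merged into one region, so the choice of connectivity can change how an object is classified.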
The objectives of the work presented here are therefore to:
- compare the available shape descriptors;
- implement MPEG-7's shape descriptors and compare them with alternative descriptors;
- develop a description and retrieval mechanism with special emphasis on shape characteristics; and
- evaluate the performance of the shape descriptors for a given dataset under strict experimental conditions
(using MPEG-7's evaluation method).