Smart Multimedia Information Retrieval

Classic information retrieval (IR) refers to the processes for searching and finding information in text collections. These methods are mature: they can index, distribute and archive text collections, compute metrics and implement retrieval models (e.g. ad hoc retrieval, filtering). Multimedia Information Retrieval (MMIR) applies these techniques to provide access to multimedia objects of any kind (images, audio, video). For this purpose, multimedia features are extracted: small units of information that represent certain properties of a multimedia object. The methods of classical IR can then be applied on the basis of these features.
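The idea of applying classical IR to extracted features can be illustrated with a minimal sketch. It assumes that a feature-extraction step has already produced a list of feature terms per object; the object names, the terms and the functions `build_inverted_index` and `search` are hypothetical and not part of any particular framework.

```python
from collections import defaultdict

# Hypothetical output of a feature-extraction step: object id -> feature terms.
extracted_features = {
    "img_001.jpg": ["dog", "ball", "grass"],
    "img_002.jpg": ["cat", "sofa"],
    "clip_003.mp4": ["dog", "beach", "ball"],
}


def build_inverted_index(features_by_object):
    """Map each feature term to the set of objects that contain it."""
    index = defaultdict(set)
    for object_id, terms in features_by_object.items():
        for term in terms:
            index[term].add(object_id)
    return index


def search(index, query_terms):
    """Rank objects by the number of matching query terms (ties sorted by id)."""
    scores = defaultdict(int)
    for term in set(query_terms):
        for object_id in index.get(term, ()):
            scores[object_id] += 1
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


index = build_inverted_index(extracted_features)
results = search(index, ["dog", "ball"])
print(results)
```

Once objects are described by feature terms, the inverted index and the ranking work exactly as they would for words in text documents, regardless of whether the object is an image or a video.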

However, multimedia collections differ in many respects from pure text collections and must therefore be treated specifically. Smart Multimedia Information Retrieval comprises MMIR concepts that are characterized by particular efficiency (fast), effectiveness (good results), explainability (expressive and comprehensible) and scalability (large objects and many of them). An overview and further information can be found here:

The concepts of Smart Multimedia Information Retrieval have been implemented as part of a general framework which is available for research and teaching purposes and also forms the basis for a number of projects.

Algorithms and metrics for highly scalable multimedia processing

Multimedia objects such as images, videos or graphics are becoming more and more detailed, and their resolutions keep increasing. At the same time, more and more multimedia objects are being recorded. We are therefore dealing not only with larger objects but also with significantly more of them, and their richness of detail is a challenge for any type of multimedia application. As part of various research projects, we have developed algorithms and metrics for such collections that scale linearly rather than exponentially and can therefore provide substantial performance gains, especially in large projects.

The Generic Multimedia Analysis Framework (GMAF)

The Generic Multimedia Analysis Framework (GMAF) and Multimedia Feature Graphs (MMFG) are applications and data structures that unify the representation of multimedia features and standardize their analysis and extraction through numerous plugins. Graph Codes serve as the central index structure. They are highly efficient on large amounts of data and are therefore well suited to rapidly growing, large multimedia collections.
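The text above does not spell out how a feature graph becomes an index entry. The following sketch illustrates one possible matrix-style encoding and is not the GMAF implementation. It assumes that feature terms sit on the diagonal and that relationship types are stored as integer codes off the diagonal; the names `encode_graph`, `vocabulary_overlap` and `relation_ids`, as well as the example graph, are hypothetical.

```python
# Hypothetical feature graph: nodes are feature terms, edges carry a relation type.
nodes = ["person", "hat", "dog"]
edges = [("person", "hat", "wears"), ("person", "dog", "holds")]

# Assumed integer codes for relation types (0 means "no relation").
relation_ids = {"wears": 2, "holds": 3}


def encode_graph(nodes, edges, relation_ids):
    """Encode a feature graph as (vocabulary, square matrix).

    Diagonal cells mark that a term is present; the off-diagonal cell
    [source][target] holds the code of the relation between two terms.
    """
    vocabulary = sorted(nodes)
    position = {term: i for i, term in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0] * size for _ in range(size)]
    for term in vocabulary:
        matrix[position[term]][position[term]] = 1
    for source, target, relation in edges:
        matrix[position[source]][position[target]] = relation_ids[relation]
    return vocabulary, matrix


def vocabulary_overlap(query_vocabulary, candidate_vocabulary):
    """Share of query terms that also occur in the candidate."""
    if not query_vocabulary:
        return 0.0
    shared = set(query_vocabulary) & set(candidate_vocabulary)
    return len(shared) / len(query_vocabulary)


vocabulary, matrix = encode_graph(nodes, edges, relation_ids)
similarity = vocabulary_overlap(["dog", "person", "car"], vocabulary)
print(vocabulary, matrix, similarity)
```

Under these assumptions, comparing two objects reduces to set operations and cell lookups instead of graph traversal, which is one plausible reason why a matrix-based index can remain fast as a collection grows.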

Theses and/or projects can be carried out on the basis of GMAF. This page therefore serves as a starting point for familiarizing yourself with the concepts and mechanisms of GMAF, MMFG and Graph Codes. It is best to start with the following publications:

GMAF Wiki and Github Repository:

Quick start instructions