Untangling the world wide web

A new project is trying to develop AI systems that can analyse multimedia content and make sense of it

Making sense of the media content available on the internet is a daunting task. It would take 24 hours just to watch the videos that are added to YouTube every sixty seconds.

For organisations that need to keep track of what is being said about them and about their products, this is a big headache.

Among those working on a potential solution are researchers on the Boemie project. The name is an acronym for the less-than-snappy Bootstrapping Ontology Evolution with Multimedia Information Extraction.

Turning the “ore” available on the internet into “gold” is how they describe their aim.

The Boemie team are building highly structured “knowledge bases” that can automatically – or, for now, semi-automatically – identify, analyse and index almost any multimedia content.

Video, text and audio content from a multitude of sources can be categorised, labelled, indexed, searched and retrieved as needed.

The system needs a bit of human intervention to get started. Someone with knowledge of a particular topic – sport is the one they’ve been experimenting with – defines a few key concepts. They might define “tennis match” as a type of sporting event, and the concept “Wimbledon finals” as an example of a tennis match.

The computer then takes over. It builds an ontology – a formal way of linking concepts together – and uses it to extract useful information from a variety of multimedia sources.
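For readers who like to see the idea in code, here is a minimal sketch of what those hand-defined starting concepts might look like. It is an illustration only, not Boemie's actual data model: the `Ontology` class and its method names are invented for this example, and only the tennis concepts come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology (hypothetical, not Boemie's format)."""
    # concept -> parent concept ("is a type of")
    subclass_of: dict = field(default_factory=dict)
    # instance -> concept ("is an example of")
    instance_of: dict = field(default_factory=dict)

    def add_concept(self, name, parent):
        self.subclass_of[name] = parent

    def add_instance(self, name, concept):
        self.instance_of[name] = concept

    def ancestors(self, concept):
        """Walk up the 'is a type of' links, nearest parent first."""
        chain, seen = [], {concept}
        while concept in self.subclass_of:
            concept = self.subclass_of[concept]
            if concept in seen:  # guard against accidental cycles
                break
            seen.add(concept)
            chain.append(concept)
        return chain

    def is_a(self, instance, concept):
        """True if the instance belongs to the concept or any of its subtypes."""
        direct = self.instance_of.get(instance)
        if direct is None:
            return False
        return direct == concept or concept in self.ancestors(direct)


# The expert's starting definitions from the tennis example.
sport = Ontology()
sport.add_concept("tennis match", parent="sporting event")
sport.add_instance("Wimbledon finals", concept="tennis match")

print(sport.is_a("Wimbledon finals", "sporting event"))
```

Because the links are explicit, the system can infer that the Wimbledon finals are a sporting event even though nobody said so directly.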

As it learns more about tennis, the computer suggests new concepts to add to the ontology, which an operator who knows about sport can accept, reject or modify.

What makes Boemie special is that once it has built up some knowledge about tennis, it goes back and re-analyses all of its information in the light of what it has learned.

It can repeat this process again and again.
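The loop of suggesting, reviewing and re-analysing can be sketched in a few lines. This is a deliberately crude toy, assuming plain-text documents, single-word concepts and a simple word-count heuristic for suggestions; the function names, the sample sentences and the `review` callback standing in for the human operator are all made up here, and Boemie's real extraction methods are not described in this article.

```python
from collections import Counter


def annotate(docs, concepts):
    """Index each document by the known concepts it mentions."""
    return [concepts & set(doc.lower().split()) for doc in docs]


def suggest(docs, index, concepts, min_count=2):
    """Propose words that recur in already-indexed documents but are not yet concepts."""
    counts = Counter()
    for doc, found in zip(docs, index):
        if found:  # only learn from documents we already understand
            counts.update(set(doc.lower().split()) - concepts)
    return sorted(word for word, n in counts.items() if n >= min_count)


def bootstrap(docs, seed, review, max_rounds=10):
    """Grow the concept set, re-analysing every document after each round."""
    concepts = set(seed)
    index = annotate(docs, concepts)
    for _ in range(max_rounds):
        accepted = {w for w in suggest(docs, index, concepts) if review(w)}
        if not accepted:
            break  # nothing new was learned, so stop
        concepts |= accepted
        index = annotate(docs, concepts)  # go back over everything
    return concepts, index


docs = [
    "tennis final decided by a late ace",
    "second ace wins the tennis set",
    "an ace on match point ends it",
    "the weather was fine",
]

# The operator who knows about sport accepts "ace" and rejects everything else.
concepts, index = bootstrap(docs, seed={"tennis"}, review=lambda w: w == "ace")
print(sorted(concepts))
print(index[2])
```

The third sentence never mentions tennis, so the first pass leaves it unindexed. Once "ace" has been accepted, the second pass over the same material picks it up, which is the point of going back and looking again.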

The Boemie project has significant commercial potential, says George Paliouras, its technical manager.

“Without semantic indexing, it’s very difficult to retrieve multimedia content,” he says.

“Boemie offers a new approach to do this at a large scale and with high precision. It can speed and improve the analysis, categorisation, indexing and retrieval of almost any kind of multimedia content.”

We might finally be able to untangle the web.