Technologies of Vision: semantic image labelling
Imagine walking through an alpine meadow, looking up to see the mountain peaks all around you and thinking, "I wonder what that peak is called?" or "How far away is that ridge over there?". Or maybe seeing a rifugio or malga across the valley and wondering "What time do they serve food?" or "Do they have a bed free for tonight?".
We are developing solutions to answer exactly these questions, and we are working towards a prototype device that offers visitors detailed, real-time knowledge of their surroundings. To achieve this, we need to understand what is being seen through the lens of a camera. This is not a simple task: it requires integrating skills from many disciplines, ranging from cartographic projection and advanced 3D computer graphics to cutting-edge machine vision algorithms.
Each day, millions of geo-referenced photographs are captured worldwide and shared via websites such as Flickr and Panoramio, often within minutes of capture. For example, the Flickr API tells us that over 800,000 geo-referenced photos were taken in the Alps alone in 2008. There is a clear trend: geo-tagging photos for the sole purpose of placing them 'on the map', either automatically (via integrated GPS) or manually (via GUIs), is becoming increasingly popular. Other resources such as webcams can also provide geo-referenced imagery, especially in Trentino, where hundreds are installed in refuges and hotels, often in rugged and remote locations.
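To illustrate why a geo-reference is useful, here is a minimal sketch of how a question like "How far away is that ridge?" could be approached once the camera position is known. It is not the project's method, only an illustration under simplifying assumptions: the Earth is treated as a sphere, elevation and terrain occlusion are ignored, and the function name and coordinates are hypothetical placeholders rather than real peaks.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Return (great-circle distance in km, initial bearing in degrees).

    Inputs are decimal degrees. The bearing is measured clockwise from
    north, as seen from the first point (e.g. the camera position).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine formula for the great-circle distance.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    distance = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

    # Initial bearing from point 1 towards point 2.
    y = math.sin(dlambda) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlambda))
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance, bearing


# Hypothetical example: a photo taken at (46.0 N, 11.0 E) and a
# feature of interest at (46.1 N, 11.0 E), i.e. due north of the camera.
d_km, bearing_deg = distance_and_bearing(46.0, 11.0, 46.1, 11.0)
print(f"distance: {d_km:.2f} km, bearing: {bearing_deg:.1f} deg")
```

Distance and bearing alone do not say whether the feature is actually visible in the photograph; that would additionally require the camera orientation and a terrain model, which this sketch does not attempt.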