
Technologies of Vision: semantic image labelling

Content Creation

Once a photo has been accurately aligned with the planet, each pixel can be assigned latitude, longitude, altitude and a distance from the camera. This in effect enables us to see inside a photo and helps us to reason about its content.
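As a rough illustration of what such a per-pixel record might look like, the sketch below stores latitude, longitude and altitude for a pixel and derives its distance from the camera. The names `GeoPixel`, `ground_distance_m` and `distance_from_camera_m` are hypothetical, and the spherical-Earth (haversine) model is an assumption made here for simplicity, not a description of the actual alignment pipeline.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


@dataclass(frozen=True)
class GeoPixel:
    """Geographic position attached to one photo pixel (hypothetical record)."""
    lat: float  # degrees
    lon: float  # degrees
    alt: float  # metres above sea level


def ground_distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_from_camera_m(camera: GeoPixel, pixel: GeoPixel) -> float:
    """Straight-line distance, combining ground distance and altitude difference."""
    horizontal = ground_distance_m(camera.lat, camera.lon, pixel.lat, pixel.lon)
    vertical = pixel.alt - camera.alt
    return math.hypot(horizontal, vertical)
```

Combining the horizontal and vertical components with `math.hypot` ignores Earth curvature along the line of sight, which is a reasonable simplification only for the distances typical of a landscape photo.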

From an aligned photo we can therefore derive the land coverage it shows (its viewshed). Comparing this viewshed with a geo-database, such as OpenStreetMap or GeoNames, tells us whether villages, mountain peaks, etc. are visible in the photo.
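The comparison with a geo-database can be sketched as a proximity test: a feature counts as visible if at least one viewshed point lies within some tolerance of it. The function names, the 100 m tolerance and the sample coordinates below are all illustrative assumptions, and the distance uses a flat-Earth (equirectangular) approximation that is only adequate over short ranges.

```python
import math

METRES_PER_DEGREE = 111_195.0  # approximate length of one degree of latitude


def approx_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation of the distance in metres (short ranges only)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = (lon2 - lon1) * math.cos(mean_lat) * METRES_PER_DEGREE
    dy = (lat2 - lat1) * METRES_PER_DEGREE
    return math.hypot(dx, dy)


def visible_features(viewshed, features, tolerance_m=100.0):
    """Return the names of features within tolerance_m of any viewshed point.

    viewshed: list of (lat, lon) positions of the photo's pixels
    features: list of (name, lat, lon) entries from a geo-database
    """
    visible = []
    for name, f_lat, f_lon in features:
        if any(
            approx_distance_m(f_lat, f_lon, p_lat, p_lon) <= tolerance_m
            for p_lat, p_lon in viewshed
        ):
            visible.append(name)
    return visible


# Hypothetical data: two viewshed points and two database entries.
viewshed = [(46.0, 11.0), (46.001, 11.001)]
features = [("Peak A", 46.0005, 11.0), ("Village B", 46.5, 11.5)]
print(visible_features(viewshed, features))  # ['Peak A']
```

This brute-force loop is quadratic in the number of points; for a full-resolution photo a spatial index would be the more practical choice.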

By comparing pixel values between one aligned photo and another (or many others) on a spatial basis (i.e. by latitude and longitude), it is possible, after colour normalisation, to identify which pixels or regions may contain snow, or even to observe the changing colours of the leaves on the trees over time.
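One way to picture this spatial comparison is to pool pixels from several aligned photos into small latitude/longitude cells and, for each cell, count how often the observed colour looks like snow. The sketch below is only an illustration: the text does not describe the actual classifier, so the "bright and nearly colourless" test, its thresholds, the cell size and all function names are assumptions, and the input colours are taken to be already normalised to the range 0 to 1.

```python
from collections import defaultdict


def cell_key(lat, lon, cell_deg=0.001):
    """Quantise a position to a grid cell so pixels from different photos can be pooled."""
    return (round(lat / cell_deg), round(lon / cell_deg))


def looks_like_snow(rgb, min_brightness=0.8, max_spread=0.1):
    """Crude test (assumption): snow is bright and close to colourless."""
    return min(rgb) >= min_brightness and (max(rgb) - min(rgb)) <= max_spread


def snow_probability(observations, cell_deg=0.001):
    """Fraction of snow-like observations per grid cell.

    observations: iterable of (lat, lon, (r, g, b)) with channels in [0, 1],
    pooled from several aligned, colour-normalised photos.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [snow-like count, total count]
    for lat, lon, rgb in observations:
        tally = counts[cell_key(lat, lon, cell_deg)]
        tally[0] += int(looks_like_snow(rgb))
        tally[1] += 1
    return {cell: snowy / total for cell, (snowy, total) in counts.items()}
```

The resulting per-cell fractions could then be mapped to a colour scale when drawn back into a photo. Tracking leaf colour over time would follow the same pooling pattern, with timestamps kept alongside each observation.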

Each photo pixel is examined to see whether it is close to geo-referenced feature points. Here, mountain peaks, their province and a GPS track have been overlaid.

Here a GIS layer representing tree cover has been drawn into an aligned photo.

By comparing aligned images taken over a period of time from different places we can automatically estimate snow coverage (white: high probability of snow; dark blue: very low probability of snow; grey: steep cliffs).