X reality 

Augmented Reality (AR) and Mixed Reality (MR) seek to enrich the user's view of the environment with relevant information. They have grown in importance with the pervasive use of mobile devices such as smartphones. In this context, our pioneering activity focused on low-complexity computer vision methods for analyzing images captured with mobile devices to aid AR/MR services such as geo-referenced information overlay. Because it contributes to larger systems, this research takes place mainly within large projects. With today's powerful devices, extended reality is having a growing impact on Industry 4.0 and is gaining the attention of medium and large industries. Several industrial sectors use AR to support immersive customer experiences, learning processes, professional skills training, and risk reduction.

In human-machine interaction, including in industrial environments, augmented and mixed reality hold great application potential. X reality should be understood as a combination of different technologies: computer graphics, interfaces, and artificial vision. It relies on computer vision, from object recognition to scene understanding modules, to properly embed information and/or virtual objects into artificially generated scenes.

The challenge is to inject AR data in a natural manner, consistent with the surrounding environment, occlusions, lighting and shadows, while taking the user's activity into account. Systems with XR features make use of computer vision for landmark recognition, object reconstruction and tracking, as well as computer graphics techniques and specific user interaction modes. Because XR is a transversal technology, an interdisciplinary team is needed to build such systems; research in this area is therefore often linked to large projects, such as European ones. Through the coordination of European projects, FBK’s TeV Unit has gained solid experience in the field of computer vision for augmented reality.


AR in Industry 4.0 is changing the way machines communicate with people: users can see virtual information via synthetic overlays on the real world and can interact with that information to better understand the real world. This poses several challenges.

In a concrete industrial application, artificial vision and augmented reality techniques were implemented to facilitate monitoring and troubleshooting operations in large plants. A set of modules was integrated to process images captured by a tablet’s camera in order to locate, recognize and calculate the position of components of the industrial machinery under maintenance. A combination of markers and feature tracking methods was used for the relative positioning between operator and component, with the information useful to the maintainer updated in real time and made accessible through a cloud connection. Augmented reality technology was integrated with big data prognostics and assisted maintenance, with the aim of making the latter more interactive and functional in the production environment.


Our research focuses on image analysis to support precise alignment of geo-referenced information with the capturing camera view, where GPS and inertial measurements are used to support the registration with the visual environment.

The quality of the AR experience on mobile devices often depends on the precision with which the content is aligned with relevant features of the environment. A better experience is delivered when such information is locked onto the user's view rather than roughly overlaid from GPS and inertial sensor measurements. Visual registration is also a core functionality for enabling geo-structured access to image databases, and it opens up important opportunities in environment monitoring from crowd-sourced media.
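As an illustration of the rough, sensor-only overlay mentioned above, the following minimal sketch places a geo-referenced point on an image column using only a GPS position and a compass heading. It is not the method used in our systems: the function names, the pinhole camera model, and the spherical-Earth bearing formula are assumptions made for this example. Visual registration would then refine this initial estimate against image features.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Returned in degrees, clockwise from north, in [0, 360).
    Assumes a spherical Earth (illustrative simplification).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def overlay_column(observer, target, heading_deg, hfov_deg, image_width):
    """Pixel column at which to draw a label for `target`.

    `observer` and `target` are (latitude, longitude) pairs in degrees,
    `heading_deg` is the compass heading of the camera's optical axis,
    and `hfov_deg` is the horizontal field of view (hypothetical inputs).
    Returns None when the target lies outside the field of view.
    """
    b = bearing_deg(observer[0], observer[1], target[0], target[1])
    # Signed angle between the optical axis and the target, in [-180, 180).
    offset = (b - heading_deg + 180.0) % 360.0 - 180.0
    half_fov = hfov_deg / 2.0
    if abs(offset) >= half_fov:
        return None
    # Pinhole model: focal length in pixels derived from the field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(half_fov))
    return image_width / 2.0 + focal_px * math.tan(math.radians(offset))
```

With a camera pointing straight at the target, the label lands on the central column; a compass error of a few degrees shifts it by tens of pixels, which is why sensor-only overlays tend to look loosely attached to the scene.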


L. Porzi, S. Rota Bulò and E. Ricci. A Deeply-Supervised Deconvolutional Network for Horizon Line Detection, ACM on Multimedia Conference - ACMMM, 2016

L. Porzi, S. Rota Bulò, O. Lanz, P. Valigi and E. Ricci. Learning Contours for Automatic Annotations of Mountains Pictures on a Smartphone. ACM/IEEE International Conference on Distributed Smart Cameras - ICDSC, 2014

P. Chippendale, M. Zanin and M. Dalla Mura. Geo-positional Image Forensics through Scene-Terrain Registration. International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - VISAPP, 2013

L. Porzi, E. Ricci, T. A. Ciarfuglia and M. Zanin. Visual-inertial Tracking on Android for Augmented Reality Applications. IEEE Workshop on Environmental, Energy, and Structural Monitoring Systems - EESMS, pp. 35-41, 2012


REPLICATE - cReative-asset harvEsting PipeLine to Inspire Collective-AuThoring and Experimentation (2016-2018) is an EC Horizon 2020 project coordinated by FBK. REPLICATE aimed to develop a multi-user 3D-acquisition platform that transforms the real world into new forms of creative assets. The platform is powered by intelligent semantic decomposition, which unlocks sub-elements of objects and enables users to add lifelike properties to complex objects.



VENTURI - immersiVe ENhancemenT of User-woRld Interactions (2011-2014) is an FP7-ICT European project coordinated by FBK. VENTURI aimed to create a pervasive AR paradigm, where available information is presented in a 'user'-centric rather than a 'device'-centric way.



TeV worked on the project from 2007 to 2010.

Imagine walking through an alpine meadow, looking up to see the mountain peaks all around you and thinking, "I wonder what that peak is called?" or "How far away is that ridge over there?". We are developing solutions to answer exactly these questions, and we are working towards a prototype device that offers visitors real-time, detailed knowledge of their surroundings. To achieve this, we need to understand what is being seen through the lens of a camera. This task is not a simple one: it requires the integration of diverse skills from many disciplines, ranging from cartographic projection and advanced 3D computer graphics to cutting-edge machine vision algorithms.

Each day, worldwide, millions of geo-referenced photographs are captured and shared via websites such as Flickr and Panoramio, often within minutes of capture. For example, the Flickr API tells us that over 800,000 geo-referenced photos were taken in the Alps alone in 2008. There is a clear trend: geo-tagging photos for the sole purpose of placing them 'on the map', either automatically (via integrated GPS) or manually (via GUIs), is becoming increasingly popular. Other resources such as webcams can also provide geo-referenced imagery, especially in Trentino, where hundreds are installed in refuges or hotels, often situated in rugged and remote locations.

MARMOTA MOBILE - A pioneering Android-based prototype. Each pixel of the image is associated with information such as altitude, latitude, longitude and distance from the observer. When activated by a user, the device locates itself with its built-in GPS, then sends that position via the Internet to the central Marmota server at FBK. Once the server has processed those coordinates, a data package of about 50 to 120 KB is sent back to the device and displayed as a high-resolution, 360-degree augmented on-screen overlay. The device itself reportedly uses only a small amount of memory, letting the server do all the heavy lifting. Marmota provides data on mountains, but also the names and locations of hiking trails, rivers and lakes, and draws these items onto the screen to highlight them. It limits itself to what’s visible from the user’s point of view, so as not to create confusion with an overabundance of information.
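Limiting the overlay to what is visible from the user's point of view can be illustrated with a simple line-of-sight test over a terrain profile. This is only a sketch under stated assumptions, not the Marmota server's actual procedure: the function names and the profile format are hypothetical, and Earth curvature and atmospheric refraction are ignored.

```python
def is_visible(observer_alt_m, profile):
    """Line-of-sight test along a terrain profile.

    `profile` is a list of (distance_m, elevation_m) samples taken along
    the ray from the observer, ordered by increasing distance, with every
    distance greater than zero; the last sample is the target itself
    (hypothetical input format). The target is visible when no
    intermediate sample rises to or above the observer-target sight line.
    """
    *terrain, (target_dist, target_alt) = profile
    target_slope = (target_alt - observer_alt_m) / target_dist
    return all(
        (alt - observer_alt_m) / dist < target_slope
        for dist, alt in terrain
    )


def visible_features(observer_alt_m, features):
    """Keep only the names of features whose profile passes the test.

    `features` maps a feature name to its terrain profile.
    """
    return [
        name
        for name, profile in features.items()
        if is_visible(observer_alt_m, profile)
    ]
```

For example, a peak at 1500 m seen from 1000 m across a 1100 m foothill passes the test, while the same peak behind a 1600 m ridge halfway along the ray is filtered out and would not be labelled.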
