
Segmentation and Recognition

Main activities

Recognizing and classifying all the entities appearing in an image is a fundamental goal of computer vision, and constitutes a key element in the semantic understanding of images, videos and other multimedia resources. This macro-activity is concerned with the development of deep learning theory and practice to understand image content. Multiple images depicting the same objects can also be used for 3D reconstruction of the scene.

This activity focuses on the development of algorithms for the detection of landmarks naturally embedded in the scene.

We develop algorithms for detecting text embedded in scenes, segmenting it from the background, and adjusting it so that an OCR engine can read it more easily.

We investigate how to deploy representation learning within the framework of decision forests, which are ensembles of binary decision trees that have become very popular in computer vision.
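
To make the term concrete, the following is a minimal sketch of a decision forest as an ensemble of binary decision trees, assuming axis-aligned threshold splits chosen by Gini impurity, bootstrap sampling, and majority voting. It illustrates only the plain ensemble, not the representation learning investigated in this activity; all function names, parameters and the toy data are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(X, y, max_depth=3):
    """Grow one binary tree. A leaf is a label; an inner node is a
    tuple (feature, threshold, left_subtree, right_subtree).
    Assumption: labels are not tuples themselves."""
    if max_depth == 0 or len(set(y)) == 1:
        return majority(y)
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            # Weighted impurity of the two children; lower is better.
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no valid split, e.g. all rows identical
        return majority(y)
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], max_depth - 1),
            build_tree([X[i] for i in right], [y[i] for i in right], max_depth - 1))

def predict_tree(node, x):
    """Route a sample from the root down to a leaf."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

def train_forest(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def predict(forest, x):
    """Majority vote over the predictions of all trees."""
    return majority([predict_tree(tree, x) for tree in forest])

# Toy data: the class depends only on the first feature.
X = [[0.1, 1.0], [0.2, 2.0], [0.3, 1.0], [0.4, 2.0],
     [0.6, 1.0], [0.7, 2.0], [0.8, 1.0], [0.9, 2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = train_forest(X, y)
print(predict(forest, [0.05, 1.0]), predict(forest, [0.95, 2.0]))
```

Each tree sees a slightly different bootstrap sample, so individual trees may place their thresholds differently; the vote across trees smooths out those differences.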

Object recognition systems provide a smart tool for the automatic indexing of an image by its visual content, allowing a high-level (semantic) description of the visual data.


Cystic Fibrosis: new image processing methods accelerate studies on drugs

As of June 30, 2021, the current TeV site (an FBK Drupal web site) is frozen and no longer editable.
On November 30, 2021, the Drupal server will be permanently shut down and this site will no longer be available.


Automatic food recognition

If a piece of valuable wood has defects, it cannot be used to make certain objects, but others can still be made by avoiding the defect zone.

Quality control of industrial pieces

cReative-asset harvEsting PipeLine to Inspire Collective-AuThoring and Experimentation - REPLICATE enhances creativity through the integration of novel Mixed-Reality user experiences, enabling 3D/4D storyboarding in unconstrained environments and the ad-hoc expression of ideas by disassembling and reassembling objects in a co-creative workspace.

TRAVEL - Traffic Road Analysis by Visual Event Labelling. The project addresses the automatic analysis of traffic sequences from static or moving cameras, aiming at the detection, classification and tracking of vehicles on the road.

MY-E-DIRECTOR 2012 - Real-Time Context-Aware and Personalized Media Streaming Environments for Large Scale Broadcasting Applications is an FP7-ICT-2007-1 project. The user becomes the director in personalized, tailored sports broadcasting.

VIKEF - Virtual Information and Knowledge Environment Framework aims to advance semantic-enabled support for Information, Content and Knowledge (ICK) production, acquisition, processing, annotation, sharing and use by empowering information and knowledge environments for scientific and business communities.


MEMORI is a memory-based system for the detection and recognition of objects in digital images.

Tools and facilities

TeV has a long history of research in document image segmentation and camera-based text detection. This survey page provides a selection of links to academic research projects around the world related to Document Image Understanding and Text Extraction from generic images, arranged by topic.