Technologies of Vision: semantic image labelling
3D from 2D
We register photos by correlating their content against a rendered 3D spherical panorama of the world, generated around the photo's 'geo-location'. In essence, we use a Digital Terrain Model and ray-tracing techniques to generate a complete 360° synthetic image of what an observer would see all around them at that location.
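As a rough illustration of the rendering step, the sketch below marches rays outwards across a small terrain grid and records, for each compass direction, the highest elevation angle reached. This gives only the synthetic skyline rather than a full rendered panorama. The function name `skyline_profile`, the grid layout (rows running north to south, columns west to east), the fixed step size and the 1.7 m eye height are all assumptions made for the example, not details of the actual system.

```python
import math

def skyline_profile(dtm, cell_size, obs_row, obs_col,
                    eye_height=1.7, n_azimuths=360):
    """Return the skyline elevation angle in degrees for each azimuth.

    dtm is a list of rows of terrain heights in metres. Row index is
    assumed to grow southwards and column index eastwards.
    """
    rows, cols = len(dtm), len(dtm[0])
    eye = dtm[obs_row][obs_col] + eye_height
    max_range = cell_size * max(rows, cols)
    profile = []
    for k in range(n_azimuths):
        az = 2.0 * math.pi * k / n_azimuths   # 0 = north, clockwise
        d_row, d_col = -math.cos(az), math.sin(az)
        best = -90.0                          # nothing seen yet
        dist = cell_size
        while dist <= max_range:
            # Nearest grid cell under the ray at this distance.
            r = int(round(obs_row + d_row * dist / cell_size))
            c = int(round(obs_col + d_col * dist / cell_size))
            if not (0 <= r < rows and 0 <= c < cols):
                break                         # ray has left the terrain model
            # Elevation angle of the terrain at this point, seen from the eye.
            angle = math.degrees(math.atan2(dtm[r][c] - eye, dist))
            best = max(best, angle)
            dist += cell_size
        profile.append(best)
    return profile
```

Keeping the running maximum along each ray yields the land-sky junction. Recording every point at which the angle sets a new maximum after a jump in distance would, in the same spirit, give candidate ridges in front of the horizon.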
The virtual view is 'unwrapped' into a 360° by 180° rectangular window, and the photo is then deformed into the same 'space' according to estimated camera parameters such as pan, tilt and lens distortion. Scaling information (i.e. zoom) is extracted from the focal-length metadata stored in the EXIF data of the JPEG photo.
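The mapping from photo to panorama can be sketched as follows, assuming an ideal pinhole camera with no lens distortion and no roll, and a focal length expressed as a 35 mm equivalent (hence the 36 mm sensor width). The function names and the 3600 by 1800 pixel panorama size are hypothetical choices for the example.

```python
import math

def horizontal_fov_deg(focal_length_mm, sensor_width_mm=36.0):
    """Horizontal field of view of a pinhole camera, in degrees."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))

def photo_pixel_to_panorama(x, y, width, height, focal_length_mm,
                            pan_deg, tilt_deg, sensor_width_mm=36.0,
                            pano_width=3600, pano_height=1800):
    """Map photo pixel (x, y) to (u, v) in a 360 by 180 degree panorama.

    Assumes no lens distortion and no camera roll. pan_deg is the azimuth
    of the optical axis, tilt_deg its elevation above the horizontal.
    """
    # Focal length in pixels sets the scale (zoom) of the photo.
    f_px = focal_length_mm * width / sensor_width_mm
    # Ray through the pixel in camera axes: right, up, forward.
    right = x - width / 2.0
    up = height / 2.0 - y
    forward = f_px
    # Tilt the camera upwards about its horizontal (right) axis.
    t = math.radians(tilt_deg)
    up_w = up * math.cos(t) + forward * math.sin(t)
    forward_w = forward * math.cos(t) - up * math.sin(t)
    # Pan about the vertical axis, then convert to spherical angles.
    azimuth = (pan_deg + math.degrees(math.atan2(right, forward_w))) % 360.0
    elevation = math.degrees(math.atan2(up_w, math.hypot(right, forward_w)))
    # Equirectangular layout: azimuth across, elevation from +90 at the top.
    u = azimuth / 360.0 * pano_width
    v = (90.0 - elevation) / 180.0 * pano_height
    return u, v
```

For example, with an 18 mm (35 mm equivalent) focal length the field of view is about 90°, so a pixel on the right-hand edge of an untilted photo lands 45° of azimuth to the right of the pan direction.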
A correlation algorithm then attempts to find the best match between features of the synthetic image and details extracted from the photo, such as land-sky junctions and perceived depth discontinuities (the latter being used to estimate the ridges of mountains that do not form the horizon).
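The text does not specify the correlation algorithm, so the following is only a stand-in: a brute-force search that slides a skyline extracted from the photo around the synthetic 360° skyline and keeps the pan offset with the smallest mean squared difference. Both inputs are assumed to be lists of elevation angles sampled at the same angular step, and `best_pan_offset` is a hypothetical name.

```python
def best_pan_offset(synthetic, observed):
    """Find the shift at which observed best matches synthetic.

    synthetic covers the full 360 degrees and wraps around; observed
    covers only the photo's field of view. Returns (shift, cost), where
    cost is the mean squared difference at the best shift.
    """
    n, m = len(synthetic), len(observed)
    best_shift, best_cost = 0, float("inf")
    for shift in range(n):
        cost = 0.0
        for i in range(m):
            # Modulo indexing lets the photo straddle the 0/360 seam.
            diff = synthetic[(shift + i) % n] - observed[i]
            cost += diff * diff
        cost /= m
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift, best_cost
```

A usage example with made-up skyline values:

```python
synthetic = [0, 1, 3, 7, 2, 0, 5, 4]
print(best_pan_offset(synthetic, [7, 2, 0]))  # (3, 0.0)
```

A practical matcher would also search over tilt and zoom and weight several feature types, but the one-dimensional case shows the principle.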



