Saturday, November 4, 2017

Eliminating the observer effect: Shadow removal in orthomosaics of the road network

Authors: Supannee Tanathong, William Smith and Stephen Remde
In Proc. ICCV 2017 Workshop on Computer Vision for Road Scene Understanding and Autonomous Driving.

[PDF][Poster]

Friday, May 13, 2016

Automation of road condition surveys using computer vision

I'm currently working on a computer vision project to solve the problem of maintaining the UK's most valuable publicly-owned asset: the road network. The main objective is to develop software and systems which automate and support road and pavement condition and treatment surveys and maintenance decisions using computer vision and artificial intelligence. The project involves the following computer vision techniques: Structure from Motion, Multiple Camera Calibration, Image Stitching, GPS-assisted Image Registration, Image Analysis and Machine Learning.

Monday, August 4, 2014

Visualization of Canal Cross-Section using Data Acquired from Teleoperated Boat

This work is to be presented at the Asian Conference on Remote Sensing (ACRS 2014), 27-31 Oct 2014.

This work is part of a research project whose objective is to provide its users, mainly flood management officers, with a better understanding of canal and waterway profiles, allowing them to develop effective water diversion plans for flood preparedness.

This study is a direct continuation of our recently published work, 3D Reconstruction of Canal Profiles using Data Acquired from Teleoperated Boat, which focused on presenting 3D images of canal profiles using laser, depth and geolocation data acquired from the teleoperated boat. In this work, we focus on presenting cross-section images of the surveyed canals.

Ideally, the GPS, IMU, 2D laser scanner and depth sounder would all be installed at the same position on the boat, preferably its centroid. Since these sensors cannot share the same position due to their size, lever-arm offsets must be included in the computation. Instead of requiring the offsets directly, we derive them by asking the users to input the positions of the IMU (O_BSC), the 2D laser scanner (O_LCS) and the depth sounder (O_DCS), each specified with respect to the position of the GPS (O_GCS), as illustrated in the figure below.


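As a rough illustration of how the offsets follow from the user-entered positions (the type and function names below are hypothetical, not the project's actual code), a minimal sketch might look like this:

```cpp
struct Vec3 { double x, y, z; };

// Each user-entered position is already relative to the GPS, so it is
// that sensor's lever arm from the GPS; the offset between two sensors
// is simply the difference of their GPS-relative positions.
Vec3 offsetBetween(const Vec3& a, const Vec3& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

int main() {
    const Vec3 imu   = { 0.10, 0.00, -0.20 };  // O_BSC, user input (values made up)
    const Vec3 laser = { 0.50, 0.00,  0.30 };  // O_LCS, user input (values made up)
    const Vec3 laserToImu = offsetBetween(laser, imu);  // lever arm used downstream
    (void)laserToImu;  // silence unused-variable warning in this sketch
    return 0;
}
```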
The application to reconstruct canals in 3D and visualize their cross-sections was developed in the C/C++ language. We used the OpenGL library to facilitate graphical image rendering, and the Qt library to create a graphical user interface that allows the users to interact with the application and adjust the visualization results. This section discusses the input data and the process of converting them into georeferenced 3D points. We explain the data structures and the details of creating cross-section images. The general system architecture and the point cloud coloring method were discussed in our recent paper (Tanathong et al., 2014).

The application requires the GPS/IMU data in conjunction with laser data, depth data or both. The format of each data file used in this work is presented in the figure below. Note that these three sources of data are synchronized by acquisition time.


The GPS/IMU records are stored in internal storage, defined as an array of GPSIMU_RECORDs, so that the application can immediately retrieve them to draw the trajectory line of the boat, which aids the interpretation of the visualized canal scenes. In contrast, the laser and depth data are not kept as raw inputs but are converted into 3D points and stored in an array of 3DPOINTs to be visualized as a 3D point cloud. The data structures are presented below:
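A plausible sketch of these two structures is given below; the field names are assumptions inferred from the description, and since a C/C++ identifier cannot begin with a digit, 3DPOINT is rendered here as POINT3D:

```cpp
#include <vector>

// GPS/IMU record kept as raw input; acquisition time is the sync key.
struct GPSIMU_RECORD {
    double time;                   // acquisition time
    double lat, lon, alt;          // GPS position
    double roll, pitch, heading;   // IMU attitude
};

// One georeferenced point ("3DPOINT" in the text).
struct POINT3D {
    double time;                   // acquisition time, keeps storage sorted
    double x, y, z;                // georeferenced coordinates
};

std::vector<GPSIMU_RECORD> trajectory;  // drawn as the boat's trajectory line
std::vector<POINT3D>       pointCloud;  // rendered as the 3D point cloud
```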


The data processing procedure is summarized in the figure below. Note that this procedure inherently sorts the 3D points in the point cloud storage by acquisition time.


Before discussing the visualization of canal cross-sections, we must understand the terms projection and unprojection. Plotting the 3D points retrieved from the 3D point storage onto the display screen presents the image of canal profiles in 3D. Although these points are said to be presented in 3D, the display screen is flat and has only two dimensions, so they are in fact positioned as 2D points on screen. The process that converts 3D coordinates into their corresponding 2D positions on a flat screen is known as projection (consult Shreiner et al. (2013) for more detail), while the reverse process is informally referred to as unprojection.
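As a concrete illustration, unprojection of a clicked pixel can be performed with the classic GLU utility call; the sketch below assumes a current OpenGL context and a readable depth buffer, with error handling omitted:

```cpp
#include <GL/glu.h>

// Unproject a mouse click into world coordinates (X, Y, Z).
bool unprojectClick(int mouseX, int mouseY,
                    double& X, double& Y, double& Z) {
    GLdouble model[16], proj[16];
    GLint view[4];
    glGetDoublev(GL_MODELVIEW_MATRIX, model);
    glGetDoublev(GL_PROJECTION_MATRIX, proj);
    glGetIntegerv(GL_VIEWPORT, view);

    // Window y runs bottom-up in OpenGL, so flip the mouse coordinate.
    const GLdouble winY = view[3] - (GLdouble)mouseY;

    // Read the depth at the clicked pixel to supply the third coordinate.
    GLfloat depth = 0.0f;
    glReadPixels(mouseX, (GLint)winY, 1, 1,
                 GL_DEPTH_COMPONENT, GL_FLOAT, &depth);

    return gluUnProject((GLdouble)mouseX, winY, (GLdouble)depth,
                        model, proj, view, &X, &Y, &Z) == GL_TRUE;
}
```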

To visualize a cross-section image of the canal reconstructed from the data acquired from the teleoperated boat, the user browses along the 3D canal by adjusting the viewing parameters, then marks a position of interest by clicking on the display screen. The click unprojects the 2D screen coordinate at the clicked position (xʹ,yʹ) into its corresponding 3D coordinate (X,Y,Z). This 3D coordinate is then used to retrieve its N neighboring 3D points, which are stored adjacently in the array of points. These points are then used to produce the cross-section image at the selected position. The procedure to visualize cross-sections is summarized in the figure below:
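A simple way to gather those N neighbors, sketched below under the assumed POINT3D structure from earlier, is a linear scan for the nearest stored point followed by taking a contiguous slice of the time-sorted array:

```cpp
#include <cstddef>
#include <vector>

struct POINT3D { double time, x, y, z; };  // as sketched earlier

// Find the stored point nearest to the unprojected click (X, Y, Z) and
// return the N points on either side; since the array is sorted by
// acquisition time, these come from the same stretch of canal.
std::vector<POINT3D> neighborSlice(const std::vector<POINT3D>& cloud,
                                   double X, double Y, double Z,
                                   std::size_t N) {
    std::size_t best = 0;
    double bestD = 1e300;
    for (std::size_t i = 0; i < cloud.size(); ++i) {
        const double dx = cloud[i].x - X;
        const double dy = cloud[i].y - Y;
        const double dz = cloud[i].z - Z;
        const double d = dx*dx + dy*dy + dz*dz;
        if (d < bestD) { bestD = d; best = i; }
    }
    const std::size_t lo = best > N ? best - N : 0;
    const std::size_t hi = best + N < cloud.size() ? best + N : cloud.size();
    return std::vector<POINT3D>(cloud.begin() + lo, cloud.begin() + hi);
}
```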


Some of the results are illustrated in the figure below:



Publications:
Tanathong, S., Rudahl, K.T., Goldin, S.E., 2014. Towards visualizing canal cross-section using data acquired from teleoperated boat. To be included in the Proceedings of the Asian Conference on Remote Sensing (ACRS 2014), Nay Pyi Taw, Myanmar, 27-31 Oct 2014.

Thursday, July 3, 2014

3D Reconstruction of Canal Profiles using Data Acquired from Teleoperated Boat

Over the past three years, Thailand has experienced widespread flooding in several regions across the country. In fact, flooding has always been a recurring hazard in Thailand, but recently, due to rapid urban development and deforestation, it has become more severe. In order to reduce the hazard severity and minimize the area affected by flooding, flood management personnel need to know the profiles of canals and waterways. The profile data should, at a minimum, include physical descriptions of the canal banks, the width between the two sides of the canal, the depth of the canal bed, the water level, and existing structures along the stretches of waterway. This allows them to direct the water flow in ways that reduce the amount of flood water.

In this work, we equip a teleoperated boat with a 2D laser scanner, a single-beam depth sounder, and GPS/IMU sensors to make it possible to describe canal banks and bottom profiles during navigation along the waterway.


The 2D laser scanner documents the environment by sweeping its laser pulses from left to right, covering 270 degrees. The figure below illustrates the installation of the laser scanner and the direction of movement.


The laser range and depth measurements recorded while the boat documents the canal banks and bottom profile are illustrated in the figure below.

Both the laser scanner data and the depth data are acquired in their own local coordinate systems. In order for these data to be integrated to produce a picture of the waterway, they must be georeferenced into the same coordinate system, principally the world coordinate system (GCS). The transformation from one coordinate system to another is essentially a series of transformations, each defined by an individual 3D rotation matrix, from the initial coordinate system in which the data are defined to the destination coordinate system. Here, we transform the laser coordinate system (LCS) into the boat coordinate system (BCS), then the North-East-Down coordinate system (NED), and finally the world coordinate system (GCS).
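The chain amounts to plain rotation algebra. The sketch below leaves the actual matrices to the caller, since they are determined by the boat attitude, heading and datum in use; the names are illustrative only:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

Vec3 mul(const Mat3& R, const Vec3& v) {          // 3x3 matrix times vector
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i] += R[i][j] * v[j];
    return r;
}

Vec3 add(const Vec3& a, const Vec3& b) {
    return { a[0] + b[0], a[1] + b[1], a[2] + b[2] };
}

// LCS -> BCS -> NED -> GCS: rotate into the boat frame and apply the
// lever arm, rotate by the boat attitude into NED and apply the boat
// position, then rotate into the world frame.
Vec3 georeference(const Vec3& p_lcs,
                  const Mat3& R_lcs2bcs, const Vec3& leverArm,
                  const Mat3& R_bcs2ned, const Vec3& boatPosNed,
                  const Mat3& R_ned2gcs) {
    const Vec3 p_bcs = add(mul(R_lcs2bcs, p_lcs), leverArm);
    const Vec3 p_ned = add(mul(R_bcs2ned, p_bcs), boatPosNed);
    return mul(R_ned2gcs, p_ned);
}
```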


In this work, the application to reconstruct and visualize canals in 3D is developed in the C/C++ language. To present 2D/3D graphical images, we employ the OpenGL library (www.opengl.org), which is free from licensing requirements. The graphical user interface that lets end users operate the application and adjust the visualization is implemented with the Qt library (qt-project.org) under its open source license. This is what the application looks like.


Some experimental results:


The program is now complete, but I haven't created its snapshot video yet. Here is the application at its 70%-completed status (27 May 2014).



Publications:
Tanathong, S., Rudahl, K.T., Goldin, S.E., 2014. 3D reconstruction of canal profiles using data acquired from teleoperated boat. In: Proceedings of Asia GIS, Chiang Mai, Thailand. [PDF]

Presentations:
GeoFest2014 Seminar, King Mongkut's University of Technology Thonburi [Powerpoint]

Object detection based on template matching through use of Best-So-Far ABC

Template matching is a technique in computer vision for finding the sub-image of a target image that matches a template image.

Template matching incurs an extensive computational cost, since the matching process involves moving the template image to every possible position in a larger target image and computing a numerical index that indicates how well the template matches the image at that position. The problem can therefore be treated as an optimization problem, and algorithms based on swarm intelligence have been considered as a way to alleviate the long processing time.
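To make the cost concrete, here is a bare-bones exhaustive matcher using the sum of squared differences as the similarity index (the published work uses an RGB histogram index instead); the quadruple loop is exactly what swarm-based search avoids by sampling candidate positions:

```cpp
#include <climits>
#include <vector>

struct Gray { int w, h; std::vector<unsigned char> px; };  // grayscale image

// Similarity index (sum of squared differences) for placing the template
// with its top-left corner at (x, y); lower is better.
long long ssd(const Gray& img, const Gray& tpl, int x, int y) {
    long long s = 0;
    for (int j = 0; j < tpl.h; ++j)
        for (int i = 0; i < tpl.w; ++i) {
            const int d = img.px[(y + j) * img.w + (x + i)]
                        - tpl.px[j * tpl.w + i];
            s += (long long)d * d;
        }
    return s;
}

// Exhaustive scan over every placement; this is the cost that a swarm
// search such as the best-so-far ABC sidesteps.
void bestMatch(const Gray& img, const Gray& tpl, int& bestX, int& bestY) {
    long long best = LLONG_MAX;
    for (int y = 0; y + tpl.h <= img.h; ++y)
        for (int x = 0; x + tpl.w <= img.w; ++x) {
            const long long s = ssd(img, tpl, x, y);
            if (s < best) { best = s; bestX = x; bestY = y; }
        }
}
```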

This study employs the best-so-far ABC (Banharnsakun, 2011) to improve the solution quality in detecting the target objects and to reduce the time needed to reach the solution.

The software solution was developed in C/C++. Banharnsakun, who proposed the best-so-far ABC, led the integration of the best-so-far ABC approach with the template matching.

The object detection results from the best-so-far ABC with an RGB histogram are shown below:



Publications:
1. Banharnsakun, A., Tanathong, S., 2014. Object detection based on template matching through use of Best-So-Far ABC. Computational Intelligence and Neuroscience, vol. 2014. [LINK]
2. Tanathong, S., Banharnsakun, A., 2014. Multiple Object Tracking Based on a Hierarchical Clustering of Features Approach. In Proceedings of ACIIDS, Bangkok, Thailand. [LINK]

Realtime Image Matching for Vision Based Car Navigation

In this project, I am responsible for the image matching part that will be integrated into the car navigation project. The image matching is implemented based on the Kanade-Lucas-Tomasi (KLT) tracker, which is well known for its computational efficiency and widely used in real-time applications.

Although KLT is a promising approach to the real-time acquisition of tie-points, extracting tie-points from urban traffic scenes captured by a moving camera is a challenging task. To be used as a source of inputs for the bundle adjustment process, tie-points must be acquired only from stationary objects, not from moving ones. When the camera (observer) is at a fixed position, moving objects can be distinguished from stationary objects by considering the direction and magnitude of the optical flow vectors. However, when the camera moves, it also induces optical flow for stationary objects, making it difficult to separate them from moving objects. The problem is more complicated in road scenes, which involve several moving objects. The task of image matching is thus not only to produce tie-points but also to discard those associated with moving objects.

This study presents an image matching system based on the KLT algorithm. To simplify the aforementioned problem, the vehicle's built-in sensory data are employed. The sensors provide the translation velocity and angular velocity of the camera (more precisely, of the vehicle that carries the camera). These data can be used to derive the position and attitude parameters of the camera, referred to here as the preliminary exterior orientation (EO).
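As a toy illustration of deriving a preliminary EO, the sketch below integrates speed and yaw rate in a planar model; the actual system of course estimates the full 3D position and attitude:

```cpp
#include <cmath>

struct EO2D { double x, y, heading; };  // simplified planar exterior orientation

// One integration step: v is the translation speed (m/s) reported by the
// vehicle, w the angular (yaw) rate (rad/s), dt the time step (s).
EO2D integrate(EO2D eo, double v, double w, double dt) {
    eo.heading += w * dt;
    eo.x += v * dt * std::cos(eo.heading);
    eo.y += v * dt * std::sin(eo.heading);
    return eo;
}
```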

We develop our image matching system based on the KLT algorithm; its procedure is presented below. Typically, we perform tracking and output a set of tie-points every second. Since KLT only works when the displacement between frames is small, we track across a number of frames within each second but return a single set of tie-points to the AT. In this work, basic outlier removal consists of (1) a cross-correlation coefficient test, (2) a KLT tracking cross-check, and (3) an optical flow evaluation, applied in that order. For moving object removal, we use the initial EOs to project the tie-points and identify moving objects from the discrepancy between the tracked points and the projected points.
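For illustration, the cross-check in step (2) can be realized as a forward-backward consistency test; the sketch below uses the modern OpenCV C++ API rather than the OpenCV 1.1 C API the project was built on:

```cpp
#include <cstddef>
#include <vector>
#include <opencv2/video/tracking.hpp>

// Forward-backward cross-check: track p0 from prev to next, track the
// result back again, and keep only points that return to within
// maxError pixels of where they started.
std::vector<unsigned char> crossCheck(const cv::Mat& prev, const cv::Mat& next,
                                      const std::vector<cv::Point2f>& p0,
                                      float maxError = 1.0f) {
    std::vector<cv::Point2f> p1, p0back;
    std::vector<unsigned char> s1, s2, keep(p0.size(), 0);
    std::vector<float> err;

    cv::calcOpticalFlowPyrLK(prev, next, p0, p1, s1, err);      // forward
    cv::calcOpticalFlowPyrLK(next, prev, p1, p0back, s2, err);  // backward

    for (std::size_t i = 0; i < p0.size(); ++i) {
        if (!s1[i] || !s2[i]) continue;
        const cv::Point2f d = p0[i] - p0back[i];
        keep[i] = (d.x * d.x + d.y * d.y) <= maxError * maxError;
    }
    return keep;
}
```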

The procedure of the proposed image matching for a car navigation system.



The image matching software is developed in C/C++ based on the OpenCV library (version 1.1).

The tie-point projection result is presented below:

Some of the image matching results are presented below:


Publication:
Choi, K., Tanathong, S., Kim, H., Lee, I., 2013. Realtime image matching for vision based car navigation with built-in sensory data. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Antalya, Turkey. [PDF]

Fast Image Matching for Real-time Georeferencing using Exterior Orientation Observed from GPS/INS

Advances in science and technology provide new capabilities to improve human security. One example is the area of disaster response, which has become far more effective due to the application of remote sensing and other computing technologies. However, traditional photogrammetric georeferencing techniques, which rely on manual control point selection, are too slow to meet the challenge as both the number and severity of disasters increase worldwide. To use imagery effectively for disaster response, we need so-called real-time georeferencing: it must be possible to obtain the accurate exterior orientation (EO) of photographs or images in real time.

In this study, we present a fast, automated image matching system based on the Kanade-Lucas-Tomasi (KLT) algorithm that, when operated in conjunction with real-time aerial triangulation (AT), allows the EOs to be determined immediately after image acquisition.


Although KLT shows a promising ability to deliver tie-points to end applications in real time, the algorithm is vulnerable when adjacent images undergo a large displacement or are captured during a sharp turn of the acquisition platform (in our research, an Unmanned Aerial Vehicle or UAV), as illustrated in the figure below:


This study proposes to overcome these limitations by determining a good initial approximation for the KLT problem from the EOs observed through GPS/INS, which allows the algorithm to converge to the final solution more quickly.
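A sketch of the idea follows; all type and function names here are assumptions, not the paper's notation. A feature's approximate ground point is projected with the GPS/INS-observed EO of the next frame, and the predicted image position is handed to KLT as its starting point instead of a zero-displacement guess:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

struct EO { Mat3 R; Vec3 C; };  // attitude matrix and camera centre

// Collinearity-style projection of a world point P into image
// coordinates (x, y) for a camera with focal length f.
void project(const EO& eo, const Vec3& P, double f, double& x, double& y) {
    const Vec3 d{ P[0] - eo.C[0], P[1] - eo.C[1], P[2] - eo.C[2] };
    const double u = eo.R[0][0]*d[0] + eo.R[0][1]*d[1] + eo.R[0][2]*d[2];
    const double v = eo.R[1][0]*d[0] + eo.R[1][1]*d[1] + eo.R[1][2]*d[2];
    const double w = eo.R[2][0]*d[0] + eo.R[2][1]*d[1] + eo.R[2][2]*d[2];
    x = -f * u / w;   // the predicted (x, y) in the next frame becomes
    y = -f * v / w;   // the initial guess for the KLT iteration
}
```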


Integrating the proposed approach into the pyramidal tracking model, the implementation procedure (derived from the pyramidal implementation of Bouguet, 2000) is presented in the figure below. The derivation in the figure is based on the translation model, Eq. (3.2).


The related equations are summarized below:


In this research, we also present a mathematical solution to determine the number of depth levels for the image pyramid, which has previously been defined manually by operational personnel. As a result, our system can function automatically without human intervention. For more details, please have a look at this document.
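One plausible formalization (not necessarily the paper's exact formula): each pyramid level halves the apparent displacement, so the smallest sufficient depth follows from a base-2 logarithm of the GPS/INS-predicted displacement:

```cpp
#include <cmath>

// Smallest pyramid depth L such that predictedDisp / 2^L <= maxDispPerLevel,
// where predictedDisp comes from the GPS/INS-observed EOs and
// maxDispPerLevel is the displacement a single KLT level can resolve.
int pyramidLevels(double predictedDisp, double maxDispPerLevel) {
    if (predictedDisp <= maxDispPerLevel) return 0;  // base image suffices
    return (int)std::ceil(std::log2(predictedDisp / maxDispPerLevel));
}
```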


In addition, the research work reported here enhances the KLT feature detection to obtain a larger number of features, and introduces geometric constraints to improve the quality of tie-points. This leads to greater success in AT.

The proposed image matching system used for the experimental testing is developed in C/C++ using the Microsoft Visual Studio 2008 framework. The KLT algorithm adopted for this research is implemented by modifying the OpenCV library version 1.1 (OpenCV, 2012). Most of the auxiliary functions used to handle images are also based on the OpenCV library. The implementation of the image matching for real-time georeferencing is illustrated in the figure below.


The goal of the experiment is to measure the improvement in accuracy of the directly observed EOs (EOS1, EOS2 and EOS3) after they are refined through a bundle adjustment process given tie-points produced by the proposed image matching. In addition, the experiment measures the accuracy of the adjusted EOs (EOA1, EOA2 and EOA3) when a larger number of tie-points is involved in the computation of the AT. The accuracy of the initial EOs is listed in the tables below:


The experimental results demonstrate that the image matching system in conjunction with AT can refine the accuracy of all initial EOS1, EOS2 and EOS3. The adjusted EOs have a lower RMS of discrepancy against the true EOs than the directly observed EOs. Moreover, the level of accuracy can be further improved by increasing the number of tie-points used for AT. For example, the RMS of the initial EOS3 is measured as 2.122 m, 1.670 m and 1.725 m for the three position parameters and 1.780 deg, 1.790 deg and 2.079 deg for the three attitude parameters. The AT process (std_ip = 15) improves the accuracy of the position parameters by 25% and the attitude parameters by 58% when the maximum number of tie-points is defined as 3×3×3 per stereo image. Increasing the number of tie-points to 3×3×4, the accuracy of the adjusted EOA3 improves by 37% and 69% for position and attitude, respectively, compared with the directly observed EOs. The improvements in accuracy for the position and attitude parameters reach 40% and 70% when the number of tie-points is defined as 3×3×5. The experimental results for this dataset are summarized in the figure below:



Publications:
1. Tanathong, S., Lee, I., 2014. Translation-based KLT tracker under severe camera rotation using GPS/INS data. IEEE Geoscience and Remote Sensing Letters, vol. 11, no. 1, pp. 64-68. [LINK][Sourcecode]
2. Tanathong, S., Lee, I., In Press. Using GPS/INS data to enhance image matching for real-time aerial triangulation. Computers & Geosciences. [LINK]
3. Tanathong, S., Lee, I., Submitted for publication. Accuracy assessment of the rotational invariant KLT tracker and its application to real-time georeferencing. Journal of Applied Remote Sensing. (Revision in progress)
4. Tanathong, S., Lee, I., 2011. A development of a fast and automated image matching based on KLT tracker for real-time image georeferencing. Proceedings of ISRS, Yeosu, Korea. (Student Paper Award)
5. Tanathong, S., Lee, I., 2011. An automated real-time image georeferencing system. Proceedings of IPCV, Las Vegas, USA.
6. Tanathong, S., Lee, I., 2010. Speeding up the KLT tracker for real-time image georeferencing using GPS/INS data. Korean Journal of Remote Sensing, vol. 26, no. 6, pp. 629-644. [LINK][PDF]
7. Tanathong, S., Lee, I., 2010. Towards improving the KLT tracker for real-time image georeferencing using GPS/INS data. Proceedings of 16th Korea-Japan Joint Workshop on Frontiers of Computer Vision, Hiroshima, Japan.
8. Tanathong, S., Lee, I., 2009. The improvement of KLT for real-time feature tracking from UAV image sequence. Proceedings of ACRS, Beijing, China.