Abstract in another language
Stereo vision deals with images acquired by a stereo camera setup, where the disparity between the stereo images allows depth estimation within a scene. The 3-D information retrieved in this way is essential in many machine vision applications, such as high-speed tracking, mobile robots, object recognition and vision-guided robotics. Disparity map extraction of an image is a computationally demanding task. Due to the computational complexity involved in stereo vision algorithms, powerful CPUs are required to achieve acceptable performance. Recent research has shown that such low-level image processing algorithms can be efficiently implemented in hardware. FPGAs have entered the field of computer vision as hardware accelerators for economical CPUs. Previous work on disparity map computation is mainly limited to software-based techniques on general-purpose architectures. In this PhD thesis, we develop a new hardware-efficient real-time disparity map computation module. This enables a hardware-based cellular automata (CA) parallel-pipelined design for the overall module, realized on a single FPGA device with a typical operating frequency of 256 MHz. Accurate disparity maps are computed at a rate of nearly 275 per second for a stereo image pair with a disparity range of 80 pixels and a spatial resolution of 640x480 pixels. The presented hardware-based algorithm provides very good processing speed at the expense of accuracy, with very good scalability in terms of disparity levels. The proposed method allows a fast disparity map computation module to be built that is suitable for real-time stereo vision applications. Furthermore, disparity images are usually heavily corrupted. This type of error is introduced during the disparity value assignment stage: the disparity value assigned to some pixels does not correspond to the correct value.
Hence, in a given window, some pixels may have been assigned the correct disparity value and others not. Standard filtering techniques, such as mean, median and Gaussian filters, cannot provide efficient refinement. Typical low-pass filters lose detail and do not adequately remove false matches. Adaptive filtering is also unsuccessful: it gives similar results and only a limited improvement in the quality of the produced disparity maps, usually 1-5%. Filtering with a cellular automata (CA) approach removes errors better, with up to 15% improvement, while preserving detail and allowing a simple, parallel hardware implementation. Practical stereo vision systems involve wide ranges of disparity levels, which require greater computational complexity and thus relatively higher execution times. In this PhD thesis, we propose a hardware-based fuzzy inference system (FIS) parallel-pipelined design that maintains an almost constant output frame rate while the disparity range increases from small to large values. This scaling behavior, which is novel compared to other algorithms, relies on the parallel-pipelined architecture. Proper scalability is achieved by using parallel design blocks to handle various disparity ranges. Moving from a small to a large operating disparity range, the computation time remains almost constant, since the parallel nature of the proposed module allows a greater number of iterations to be performed in the same execution time. This provides accurate disparity map computation at a rate of nearly 440 frames per second, given a stereo image pair with a disparity range of 80 pixels and a spatial resolution of 640x480 pixels. The fuzzy inference system (FIS) module is implemented as a post-processing step to produce semi-dense disparity maps with minimal error. The use of fuzzy techniques in image processing is an effective tool with many successful applications.
Fuzzy operators can deal efficiently with image enhancement operations such as noise filtering. The FIS filter is well suited to hardware implementation, since FIS structures offer sequential processing, gain speed from parallel, pipelined architectures, and are easily implemented using straightforward digital design techniques. Depth estimation in a scene using image pairs acquired by a stereo camera setup is one of the important tasks of stereo vision systems. Considering that disparity map extraction of an image is a computationally demanding task, practical real-time hardware-based algorithms require high device resource utilization, depending on the operational range of disparity levels, which leads to significant power consumption. In this PhD thesis, a new hardware-efficient real-time disparity map computation module is developed. The module continuously estimates the exact range of disparity levels required for a given stereo image set, keeping this range as low as possible by verging the camera axes of the stereo setup. This enables a parallel-pipelined design for the overall module, realized on a single FPGA device operating at a maximum frequency of 441 MHz. Accurate disparity maps are computed at a rate of more than 320 frames per second for a stereo image pair with a spatial resolution of 640x480 pixels and a disparity range of 80 pixels. The presented technique provides very good processing speed at the expense of accuracy, with very good scalability in terms of disparity levels. The proposed method yields a module that delivers high performance in real-time stereo vision applications where space and power are significant concerns. In summary, this PhD thesis presents new techniques for the fast and accurate extraction of depth information in stereo vision systems.
Based on these techniques, efficient real-time methods for addressing the fundamental stereo vision correspondence problem are proposed, improving both the quantitative and qualitative performance of the whole process. Additionally, the hardware implementation of the proposed techniques makes them usable in numerous applications where space and performance are critical parameters. A compact and accurate stereo system will always be a desirable option in machine vision.
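The frame rates and clock frequencies quoted above can be related by a simple back-of-the-envelope calculation. The sketch below uses only figures given in the abstract (640x480 resolution, a disparity range of 80, nearly 275 maps per second at 256 MHz, and more than 320 frames per second at 441 MHz). The function names and design labels are illustrative and not taken from the thesis, and the count of pixel-disparity evaluations assumes that every disparity level is tested for every pixel, which the abstract does not state.

```python
# Rough throughput check using only figures quoted in the abstract.
# All names here are illustrative; they do not come from the thesis.

def pixel_rate(width, height, fps):
    """Disparity pixels produced per second."""
    return width * height * fps


def cycles_per_pixel(clock_hz, width, height, fps):
    """Clock cycles available per output pixel at the given clock frequency."""
    return clock_hz / pixel_rate(width, height, fps)


def disparity_evaluations_per_second(width, height, fps, disparity_range):
    """Pixel-disparity candidates per second.

    Assumption: every disparity level is evaluated for every pixel.
    """
    return pixel_rate(width, height, fps) * disparity_range


WIDTH, HEIGHT, DISPARITY_RANGE = 640, 480, 80

# (label, clock frequency in Hz, frames per second) as quoted in the abstract
designs = [
    ("CA-based design", 256e6, 275),
    ("vergence-based design", 441e6, 320),
]

for label, clock_hz, fps in designs:
    rate = pixel_rate(WIDTH, HEIGHT, fps)
    cpp = cycles_per_pixel(clock_hz, WIDTH, HEIGHT, fps)
    evals = disparity_evaluations_per_second(WIDTH, HEIGHT, fps, DISPARITY_RANGE)
    print(f"{label}: {rate / 1e6:.2f} Mpixel/s, "
          f"{cpp:.2f} cycles/pixel, {evals / 1e9:.2f} G evaluations/s")
```

Taken at face value, the quoted figures leave roughly 3 clock cycles per output pixel at 256 MHz and about 4.5 at 441 MHz. Under the stated assumption, an 80-level search at these rates corresponds to several billion pixel-disparity evaluations per second, which is in line with the abstract's emphasis on a parallel-pipelined architecture.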