Bayesian Classification for the Statistical Hough Transform

International Conference on Pattern Recognition, Florida USA, December 2008


Rozenn Dahyot
School of Computer Science and Statistics
Trinity College Dublin, Ireland
Rozenn.Dahyot@cs.tcd.ie

Abstract
We have introduced the Statistical Hough transform [2], which extends the standard Hough transform by using a kernel mixture as a robust alternative to the two-dimensional accumulator histogram. This work further develops this framework by proposing a Bayesian classification scheme that associates the spatial coordinates (x, y) with one particular class defined in the Hough space (θ, ρ). In a first step, we segment the Hough space into meaningful classes. Then, using the inverse Radon transform, we backproject the different classes into the image space. We illustrate our approach on a synthetic image and on real images.

1 Introduction

To overcome the limitations of histograms in the computation of the Standard Hough transform, we have proposed a new kernel-based model of the density $p_{\theta\rho}(\theta, \rho)$ [2]. All the available observations are used in this model, and no pre-segmentation of the edges is necessary. In this new formalism, the maxima in the Hough space cannot be associated directly with their contributing pixels (as they can with the standard Hough transform), since all the pixels contribute to them to different degrees. We propose here to perform an unsupervised segmentation of our kernel estimate of the Hough transform and, using the inverse Radon transform, to find the associated pixels in the spatial domain. We start in section 2 with a short review, and we introduce our unsupervised dual segmentation scheme (in both the Hough and spatial domains) in section 3.

978-1-4244-2175-6/08/$25.00 ©2008 IEEE

2 Statistical Hough Transform

Many works have been published on the Hough Transform since its first publication [4]. It is a very robust estimator for estimating lines in a set of points [3]. As a consequence, it has been used extensively to recover aligned segments amongst the contours of images. Recent works [5, 1] have proposed using the angle of the gradient as additional information for computing the Hough transform. In addition, the variance of the angle is also computed, and we have proposed in [2] to use it as a variable bandwidth when estimating the Hough transform with a mixture of kernels. This new Statistical Hough Transform replaces the discrete 2D histogram of the Standard Hough Transform with a smoother, continuous kernel estimate. We have shown [2] that such a model better encodes the alignment content of images.

First we define the sets of observations $S_{\theta xy} = \{(\theta_i, x_i, y_i)\}_{i \in [1 \cdots N]}$ and $S_{xy} = \{(x_i, y_i)\}_{i \in [1 \cdots N]}$, where $\theta_i$ is the direction of the gradient and $(x_i, y_i)$ is the spatial position of pixel $i$ in an image. Using the Bayes formula, the Statistical Hough transform models the joint probability density function of the Hough variables $(\theta, \rho)$ and the spatial variables $(x, y)$ by:

$$p_{\theta\rho xy}(\theta, \rho, x, y) = p_{\rho|\theta xy}(\rho \,|\, \theta, x, y) \cdot p_{\theta xy}(\theta, x, y) \qquad (1)$$

When $x$, $y$ and $\theta$ are known, the variable $\rho$ is deterministic, since we have the relation:

$$\rho = x \cos\theta + y \sin\theta \qquad (2)$$

Therefore we propose to model the conditional probability as:

$$p_{\rho|\theta xy}(\rho \,|\, \theta, x, y, S_{\theta xy}) = \delta(\rho - x \cos\theta - y \sin\theta) \qquad (3)$$

where $\delta(\cdot)$ is the Dirac distribution. As a consequence, only $p_{\theta xy}(\theta, x, y)$ remains to be estimated, using kernels on the set of observations $S_{\theta xy} = \{(x_i, y_i, \theta_i)\}_{i=1,\cdots,N}$:

$$\hat{p}_{\theta xy}(\theta, x, y \,|\, S_{\theta xy}) = \sum_{i=1}^{N} \frac{1}{h_{x_i}} k_x\!\left(\frac{x - x_i}{h_{x_i}}\right) \frac{1}{h_{y_i}} k_y\!\left(\frac{y - y_i}{h_{y_i}}\right) \frac{1}{h_{\theta_i}} k_\theta\!\left(\frac{\theta - \theta_i}{h_{\theta_i}}\right) p_i \qquad (4)$$

where $p_i$ is the prior on the observation $(x_i, y_i, \theta_i)$. In this paper, all the pixels are selected with the equiprobable prior $p_i = \frac{1}{N}, \forall i = 1, \cdots, N$. By integration with respect to the spatial variables $(x, y)$, an estimate of the Hough transform $p_{\theta\rho}(\theta, \rho)$ can be computed:

$$\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy}) = \sum_{i=1}^{N} \frac{1}{h_{\theta_i}} k_\theta\!\left(\frac{\theta - \theta_i}{h_{\theta_i}}\right) R_i(\theta, \rho)\, p_i \qquad (5)$$

where $R_i(\theta, \rho)$ is the Radon transform of the spatial kernels:

$$R_i(\theta, \rho) = \iint \delta(\rho - x \cos\theta - y \sin\theta)\, \frac{1}{h_{x_i}} k_x\!\left(\frac{x - x_i}{h_{x_i}}\right) \frac{1}{h_{y_i}} k_y\!\left(\frac{y - y_i}{h_{y_i}}\right) dx\, dy \qquad (6)$$

[Figure 1: two panels, $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})$ and $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})$]

Figure 1. Density estimates in the Hough space.
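As an illustration, the estimate of equation (5) can be evaluated numerically when the spatial kernels are Gaussian. In that case the Radon transform of an axis-aligned Gaussian kernel centred on $(x_i, y_i)$ is a one-dimensional Gaussian in $\rho$, centred on $x_i \cos\theta + y_i \sin\theta$, with standard deviation $\sqrt{h_x^2 \cos^2\theta + h_y^2 \sin^2\theta}$. The Python sketch below relies on that closed form. The function name `statistical_hough`, its arguments, the grid discretisation and the use of one common pair of spatial bandwidths are assumptions made for this example; the sketch ignores the periodicity of θ and is not the implementation used in [2].

```python
import numpy as np


def gaussian(u):
    """Standard normal density, used here for k_x, k_y and k_theta."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def statistical_hough(xs, ys, theta_obs, h_theta, theta_grid, rho_grid,
                      hx=1.0, hy=1.0):
    """Evaluate the kernel estimate of equation (5) on a (theta, rho) grid.

    If theta_obs is None, the angle kernel is replaced by the uniform
    density 1/pi, i.e. no gradient direction is used.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    theta_grid = np.asarray(theta_grid, dtype=float)
    rho_grid = np.asarray(rho_grid, dtype=float)
    prior = 1.0 / len(xs)                      # equiprobable priors p_i = 1/N
    cos_t, sin_t = np.cos(theta_grid), np.sin(theta_grid)
    # Std of a Gaussian spatial kernel once projected onto direction theta.
    h_rho = np.sqrt((hx * cos_t) ** 2 + (hy * sin_t) ** 2)
    density = np.zeros((theta_grid.size, rho_grid.size))
    for i in range(len(xs)):
        # Equation (2): where pixel i projects for every theta of the grid.
        rho_i = xs[i] * cos_t + ys[i] * sin_t
        u = (rho_grid[None, :] - rho_i[:, None]) / h_rho[:, None]
        radon_i = gaussian(u) / h_rho[:, None]  # closed form of R_i(theta, rho)
        if theta_obs is None:
            weight = np.full(theta_grid.size, 1.0 / np.pi)
        else:
            weight = gaussian((theta_grid - theta_obs[i]) / h_theta[i]) / h_theta[i]
        density += prior * weight[:, None] * radon_i
    return density
```

With pixels sampled along a single straight line and gradient angles equal to the normal direction of that line, the returned array has its largest value at the grid cell closest to the line's $(\theta, \rho)$ parameters.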

All kernels $k_x$, $k_y$ and $k_\theta$ are chosen to be Gaussian. The spatial bandwidths are naturally set to the resolution of the image grid, $h_{x_i} = h_{y_i} = 1, \forall i$, and the variable bandwidths of the angle are estimated from the image data [2]. If no observation is available for $\theta$ (i.e. the set of observations is $S_{xy}$), then we replace the kernel for the angle with a uniform distribution such that:

$$\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy}) = \frac{1}{\pi} \sum_{i=1}^{N} R_i(\theta, \rho)\, p_i \qquad (7)$$

3 Bayesian classification

We want to extract and recover all the alignment content from an image. This is performed in 4 steps. We illustrate them using the image diamond (see figure 4) with additional centered Gaussian noise ($\sigma_n = 20$).

1. Estimates of the pdfs. For an image $I$, compute both pdfs $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})$ and $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})$ (see figure 1). Since all the pixels are used, the pdf $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})$ gives an estimate of the distribution that would be obtained if there were no aligned content in the image $I$.

2. Threshold of the pdf. We extract the regions of $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})$ that are indicative of aligned content in the image. This is done by robustly fitting the surface $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})$ to $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})$. In particular, the following values are estimated:

$$\hat{\mu} = \mathrm{median}\left( \frac{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})}{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})} \right) \qquad (8)$$

and the robust Median Absolute Deviation:

$$\hat{\sigma} = 1.48\ \mathrm{median}\left| \frac{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})}{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})} - \hat{\mu} \right| \qquad (9)$$

Then we extract the alignment content by thresholding the pdf $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})$ such that:

$$\hat{p}_{\theta\rho A}(\theta, \rho, A) = \begin{cases} \hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy}) & \text{if } \dfrac{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})}{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})} > (\hat{\mu} + 3\hat{\sigma}) \\ 0 & \text{otherwise} \end{cases} \qquad (10)$$

$A$ indicates the class of alignment content, and $\bar{A}$ is its complement, defined as:

$$\hat{p}_{\theta\rho A}(\theta, \rho, \bar{A}) = \begin{cases} \hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy}) & \text{if } \dfrac{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})}{\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{xy})} \leq (\hat{\mu} + 3\hat{\sigma}) \\ 0 & \text{otherwise} \end{cases} \qquad (11)$$

Hence we have $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy}) = \hat{p}_{\theta\rho A}(\theta, \rho, A) + \hat{p}_{\theta\rho A}(\theta, \rho, \bar{A})$. Figure 2 shows the results of the thresholding: in (a) the thresholding map (view from above) in the Hough space, and in (b) the corresponding estimate $\hat{p}_{\theta\rho A}(\theta, \rho, A)$.

[Figure 2: panels (a) and (b)]

Figure 2. Segmentation in the θρ-space.

3. Segmentation of $\hat{p}_{\theta\rho A}(\theta, \rho, A)$. We first detect all maxima of $\hat{p}_{\theta\rho A}(\theta, \rho, A)$ and sort them in descending order. For each mode, we aim at segmenting the corresponding bump [6]. The class in the Hough space is computed by thresholding at half the height of the mode. In figure 3(a), the result of the segmentation in the Hough space is shown: the segmentation is unsupervised and gives here 20 classes $\{C_j^{\theta\rho}\}_{j=1,\cdots,20}$, each associated with a particular color.

4. Backprojection in the xy-space using Bayesian classification. The class $C_j^{\theta\rho}$ in the Hough space is associated with the set of pixels $C_j^{xy}$. Each pixel $(x, y)$ is labelled as follows:

$$(x, y) \rightarrow C_j^{xy} = \arg\max_{C_k^{xy}} \hat{p}_{xy}(x, y, C_k^{xy}) \qquad (12)$$

where $\hat{p}_{xy}(x, y, C_j^{xy})$ is computed by inverse Radon transform of $\hat{p}_{\theta\rho}(\theta, \rho, C_j^{\theta\rho} \,|\, S_{\theta xy})$. In figure 3(b), the result of the classification in the spatial domain is shown using the same color code as in figure 3(a). We note that all 19 linear contours are well detected.

[Figure 3: panels (a) and (b)]

Figure 3. Classification in the θρ-space (a) and xy-space (b).

4 Experimental Results

We illustrate our approach on several images in figure 4. The first is the image diamond with very strong Gaussian noise (not shown on the image). We can see that some relevant regions have been segmented in the Hough space, and their backprojection in the spatial domain makes it possible to recover the corresponding straight edges. The edges that are missed have low gradient magnitude. Because of the high noise level, the corresponding bandwidths for the angle in the Statistical Hough transform $\hat{p}_{\theta\rho}(\theta, \rho \,|\, S_{\theta xy})$ are very large, and no peaks can be detected in the Hough space [2]. The second and third images, road and aerial, present less well-defined straight edges. For the road image, most of the aligned content is recovered, apart from one side of the traffic sign. The aligned content in the aerial image is more difficult to extract, but we note that our segmentation scheme allows the extraction of the main roads. The number of classes found in the Hough space is also reported. Some classes $C_j^{\theta\rho}$, segmented in the Hough space, do not have any associated data in the spatial domain. The number of non-empty classes in the xy-space (also reported in figure 4) can therefore be lower. We have re-ordered the classes in the spatial domain in descending order of their number of pixels. In figure 5, the four main classes found in the spatial domain are shown for the real images from figure 4. The main borders of the road are easily detected. The aerial image is more complex (i.e. more cluttered); however, the main roads are also well detected.

[Figure 5: panels showing classes 1 & 2 and classes 3 & 4]

Figure 5. Main 4 classes $C_j^{xy}$ in xy-space.
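To make the procedure of section 3 more concrete, steps 2 to 4 can be sketched in Python on densities that have already been sampled on a regular (θ, ρ) grid. Several choices below are assumptions made for this illustration and are not taken from the paper: the function names, the 8-neighbour test for maxima, the 4-connected region growing, the nearest-cell lookup in ρ, and the use of label 0 for cells or pixels that belong to no class. In particular, step 4 uses a plain unfiltered back-projection as a simple stand-in for the inverse Radon transform, and the periodicity of θ is ignored.

```python
import numpy as np


def alignment_mask(p_obs, p_ref):
    """Step 2: keep cells whose ratio p_obs / p_ref exceeds mu + 3 sigma."""
    ratio = p_obs / p_ref
    mu = np.median(ratio)                           # equation (8)
    sigma = 1.48 * np.median(np.abs(ratio - mu))    # equation (9)
    return ratio > mu + 3.0 * sigma                 # test of equation (10)


def segment_bumps(p_align):
    """Step 3: one class per mode, grown down to half the mode height."""
    rows, cols = p_align.shape
    padded = np.pad(p_align, 1, mode="constant", constant_values=-np.inf)
    is_mode = p_align > 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            shifted = padded[1 + di:1 + di + rows, 1 + dj:1 + dj + cols]
            is_mode &= p_align >= shifted
    # Modes sorted by decreasing height, as in the text.
    modes = sorted(map(tuple, np.argwhere(is_mode)),
                   key=lambda ij: p_align[ij], reverse=True)
    labels = np.zeros(p_align.shape, dtype=int)
    n_classes = 0
    for mode in modes:
        if labels[mode] != 0:          # already inside a higher bump
            continue
        n_classes += 1
        half = 0.5 * p_align[mode]
        stack = [mode]
        while stack:                   # 4-connected flood fill from the mode
            i, j = stack.pop()
            if not (0 <= i < rows and 0 <= j < cols):
                continue
            if labels[i, j] != 0 or p_align[i, j] < half:
                continue
            labels[i, j] = n_classes
            stack += [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]
    return labels


def backproject_labels(p_obs, labels, theta_grid, rho_grid, shape):
    """Step 4: label each pixel with the class of largest back-projected mass."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    scores = np.zeros((labels.max() + 1,) + tuple(shape))
    d_rho = rho_grid[1] - rho_grid[0]
    for t, theta in enumerate(theta_grid):
        rho = xx * np.cos(theta) + yy * np.sin(theta)       # equation (2)
        k = np.rint((rho - rho_grid[0]) / d_rho).astype(int)
        inside = (k >= 0) & (k < len(rho_grid))
        k = np.clip(k, 0, len(rho_grid) - 1)
        cls = np.where(inside, labels[t, k], 0)
        scores[cls, yy, xx] += np.where(inside, p_obs[t, k], 0.0)
    scores[0] = 0.0                    # index 0 means "no class"
    return scores.argmax(axis=0)       # arg max of equation (12)
```

A typical call chain would be `mask = alignment_mask(p_obs, p_ref)`, then `labels = segment_bumps(np.where(mask, p_obs, 0.0))`, then `backproject_labels(p_obs, labels, theta_grid, rho_grid, image_shape)`, where `p_obs` and `p_ref` stand for the two sampled estimates of step 1.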

Image                                      Hough space    Spatial domain
diamond [1] (+ Gaussian noise σn = 100)    12 classes     11 classes
road                                       25 classes     15 classes
aerial                                     79 classes     53 classes

Figure 4. Classification results: in the Hough space (middle column) and spatial domain (right column).

Comparison with the Standard Hough transform. When using the Standard Hough transform, edges are first segmented. This preliminary step can be understood in our statistical framework as choosing binary priors $\{p_i\}$. Edge segmentation is a sensitive operation, especially in noisy images, and some edges may be completely lost. Here, we have chosen equiprobable priors, which removes the need to segment the edges. The Statistical Hough Transform is a more generic framework, and it allows for smoother estimates of $p_{\theta\rho}(\theta, \rho)$, for instance when choosing Gaussian kernels. Its computation is however more time consuming, as each pixel of the original image is used (not only the edges) and the infinite tails of the Gaussian kernels also have to be taken into account. As an alternative, other kernels that have a finite support (e.g. Epanechnikov) could be chosen to speed up the process. The classification scheme presented in this paper considers the whole bump associated with each peak of the estimate of $p_{\theta\rho}(\theta, \rho)$. This is a different approach from using only the maxima, as is done in the Standard Hough transform for recovering lines. The results presented here show that the shape of the peaks itself contains meaningful information for recovering almost-aligned sets of points, such as the curvy roads in figure 5.
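The remark on finite-support kernels can be illustrated in one dimension. The Python sketch below compares how many cells of a ρ grid receive a non-zero contribution from one observation under a Gaussian kernel and under an Epanechnikov kernel. Applying the kernel directly along ρ, the helper name `active_cells` and the grid are assumptions made for this example only; the Radon transform of a product of Epanechnikov kernels is not itself an Epanechnikov kernel.

```python
import numpy as np


def gaussian(u):
    """Standard normal density: strictly positive everywhere."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def active_cells(kernel, centre, bandwidth, grid):
    """Indices of the grid cells that receive a non-zero contribution."""
    return np.flatnonzero(kernel((grid - centre) / bandwidth) > 0.0)


rho_grid = np.arange(-50.0, 51.0)
n_gauss = len(active_cells(gaussian, 0.0, 3.0, rho_grid))
n_epan = len(active_cells(epanechnikov, 0.0, 3.0, rho_grid))
print(n_gauss, n_epan)
```

With a finite-support kernel, only the cells within one bandwidth of the observation need to be visited, which is where the potential saving comes from.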

5 Conclusion

We have presented an unsupervised classification applied to dual spaces: the spatial and Hough spaces. The framework makes it possible to recover the positions that are associated with a particular bump in the Hough space.

Acknowledgment

This work has been supported by the Enterprise Ireland Innovation Partnership IP-2006-412 and a Google Research Award.

References
[1] A. Bonci, T. Leo, and S. Longhi. A Bayesian approach to the Hough transform for line detection. IEEE Transactions on Systems, Man, and Cybernetics, 35(6), November 2005.
[2] R. Dahyot. Statistical Hough transform. Technical Report TCD-CS-2007-37, Computer Science, Trinity College Dublin, July 2007.
[3] A. Goldenshluger and A. Zeevi. The Hough transform estimator. The Annals of Statistics, 32(5), October 2004.
[4] P. Hough. Methods of means for recognising complex patterns, 1962.
[5] Q. Ji and R. M. Haralick. Error propagation for the Hough transform. Pattern Recognition Letters, 22:813-823, 2001.
[6] B. W. Silverman. Density Estimation for Statistics and Data Analysis. London: Chapman and Hall, 1986.


