Bhattacharyya distance
In statistics, the Bhattacharyya distance measures the similarity of two probability distributions. It is closely related to the Bhattacharyya coefficient, which is a measure of the amount of overlap between two statistical samples or populations. Both measures are named after Anil Kumar Bhattacharyya, a statistician who worked in the 1930s at the Indian Statistical Institute.[1] He developed the method to measure the distance between two non-normal distributions and illustrated it with classical multinomial populations[2] as well as probability distributions that are absolutely continuous with respect to the Lebesgue measure.[3][4] The latter work appeared partly in 1943 in the Bulletin of the Calcutta Mathematical Society [vol. 35, pp. 99–109],[4] while the former part, despite being submitted for publication in 1941, appeared almost five years later in Sankhya [vol. 7, 1946, pp. 401–406][2].[1]
The coefficient can be used to determine the relative closeness of the two samples being considered. It is used to measure the separability of classes in classification and is considered more reliable than the Mahalanobis distance, as the Mahalanobis distance is a particular case of the Bhattacharyya distance when the standard deviations of the two classes are the same. Consequently, when two classes have similar means but different standard deviations, the Mahalanobis distance tends to zero, whereas the Bhattacharyya distance grows with the difference between the standard deviations.
Definition
For probability distributions p and q over the same domain X, the Bhattacharyya distance is defined as

D_B(p, q) = -\ln\left( BC(p, q) \right),

where

BC(p, q) = \sum_{x \in X} \sqrt{p(x)\, q(x)}

is the Bhattacharyya coefficient for discrete probability distributions.

For continuous probability distributions, the Bhattacharyya coefficient is defined as

BC(p, q) = \int \sqrt{p(x)\, q(x)}\, dx.

In either case, 0 \le BC(p, q) \le 1 and 0 \le D_B(p, q) \le \infty. The Bhattacharyya distance D_B does not obey the triangle inequality, but the Hellinger distance, which is given by H(p, q) = \sqrt{1 - BC(p, q)}, does obey the triangle inequality.
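As an illustration of the discrete definitions above, the following is a minimal Python sketch; the function names and example vectors are illustrative and not from any particular library.

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Bhattacharyya coefficient BC(p, q) = sum_x sqrt(p(x) q(x)) for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(np.sqrt(p * q))

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance D_B(p, q) = -ln BC(p, q)."""
    return -np.log(bhattacharyya_coefficient(p, q))

# Two discrete distributions over the same four-point domain.
p = [0.4, 0.3, 0.2, 0.1]
q = [0.1, 0.2, 0.3, 0.4]
print(bhattacharyya_coefficient(p, q))  # about 0.89; equals 1 only when p == q
print(bhattacharyya_distance(p, q))     # about 0.12; equals 0 only when p == q
```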
In its simplest formulation, the Bhattacharyya distance between two classes under the normal distribution can be calculated[5] by extracting the mean and variance of each of the two distributions or classes:

D_B(p, q) = \frac{1}{4} \ln\left( \frac{1}{4} \left( \frac{\sigma_p^2}{\sigma_q^2} + \frac{\sigma_q^2}{\sigma_p^2} + 2 \right) \right) + \frac{1}{4} \frac{(\mu_p - \mu_q)^2}{\sigma_p^2 + \sigma_q^2},

where \sigma_p^2 is the variance of the p-th distribution, \mu_p is the mean of the p-th distribution, and p and q are the two distributions being compared.
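The closed form above can be evaluated directly. The following Python sketch assumes the two classes are summarized by their means and variances; the function name and sample values are illustrative.

```python
import numpy as np

def bhattacharyya_normal(mu_p, var_p, mu_q, var_q):
    """Closed-form Bhattacharyya distance between two univariate normal distributions."""
    term_var = 0.25 * np.log(0.25 * (var_p / var_q + var_q / var_p + 2.0))
    term_mean = 0.25 * (mu_p - mu_q) ** 2 / (var_p + var_q)
    return term_var + term_mean

# Equal variances: only the mean term contributes.
print(bhattacharyya_normal(0.0, 1.0, 1.0, 1.0))   # 0.125
# Equal means, different variances: the distance is still positive.
print(bhattacharyya_normal(0.0, 1.0, 0.0, 4.0))   # ~0.1116
```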
The Mahalanobis distance used in Fisher's linear discriminant analysis is a particular case of the Bhattacharyya distance.
For multivariate normal distributions p_i = \mathcal{N}(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i),

D_B = \frac{1}{8} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^{T} \boldsymbol{\Sigma}^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2) + \frac{1}{2} \ln\left( \frac{\det \boldsymbol{\Sigma}}{\sqrt{\det \boldsymbol{\Sigma}_1 \, \det \boldsymbol{\Sigma}_2}} \right),

where \boldsymbol{\mu}_i and \boldsymbol{\Sigma}_i are the means and covariances of the distributions, and

\boldsymbol{\Sigma} = \frac{\boldsymbol{\Sigma}_1 + \boldsymbol{\Sigma}_2}{2}.

Note that, in this case, the first term in the Bhattacharyya distance is related to the Mahalanobis distance.
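The multivariate formula can be sketched as follows, assuming NumPy and non-singular covariance matrices; the function name and example parameters are illustrative.

```python
import numpy as np

def bhattacharyya_mvn(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two multivariate normal distributions N(mu_i, cov_i)."""
    mu1, mu2 = np.asarray(mu1, dtype=float), np.asarray(mu2, dtype=float)
    cov1, cov2 = np.asarray(cov1, dtype=float), np.asarray(cov2, dtype=float)
    cov = 0.5 * (cov1 + cov2)                              # averaged covariance Sigma
    diff = mu1 - mu2
    term_mean = 0.125 * diff @ np.linalg.solve(cov, diff)  # Mahalanobis-like term
    term_cov = 0.5 * np.log(np.linalg.det(cov) /
                            np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term_mean + term_cov

mu1, cov1 = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
mu2, cov2 = [1.0, 1.0], [[2.0, 0.3], [0.3, 1.0]]
print(bhattacharyya_mvn(mu1, cov1, mu2, cov2))  # first term dominates when means differ
```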
Bhattacharyya coefficient
The Bhattacharyya coefficient is an approximate measurement of the amount of overlap between two statistical samples. The coefficient can be used to determine the relative closeness of the two samples being considered.
Calculating the Bhattacharyya coefficient involves a rudimentary form of integration of the overlap of the two samples. The interval of the values of the two samples is split into a chosen number of partitions, and the number of members of each sample in each partition is used in the following formula:

BC(p, q) = \sum_{i=1}^{n} \sqrt{p_i \, q_i},

where, considering the samples p and q, n is the number of partitions, and p_i and q_i are the numbers of members of samples p and q in the i-th partition.
This formula is hence larger with each partition that has members from both samples, and larger with each partition that has a large overlap of the two samples' members within it. The choice of the number of partitions depends on the number of members in each sample; too few partitions will lose accuracy by overestimating the overlap region, and too many partitions will lose accuracy by creating individual partitions with no members despite lying in a densely populated region of the sample space.
The Bhattacharyya coefficient will be 0 if there is no overlap at all due to the multiplication by zero in every partition. This means the distance between fully separated samples will not be exposed by this coefficient alone.
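The partition-based estimate can be sketched as follows. In this sketch the per-partition counts are normalized to proportions so the coefficient stays between 0 and 1 (a common convention, although the formula above is stated in terms of raw counts); the bin count and sample parameters are illustrative.

```python
import numpy as np

def bhattacharyya_coefficient_samples(sample_p, sample_q, n_partitions=20):
    """Estimate the Bhattacharyya coefficient from two 1-D samples by splitting
    their common value range into equal-width partitions."""
    sample_p = np.asarray(sample_p, dtype=float)
    sample_q = np.asarray(sample_q, dtype=float)
    lo = min(sample_p.min(), sample_q.min())
    hi = max(sample_p.max(), sample_q.max())
    edges = np.linspace(lo, hi, n_partitions + 1)
    # Counts of members of each sample falling in each partition, normalized to proportions.
    p_i, _ = np.histogram(sample_p, bins=edges)
    q_i, _ = np.histogram(sample_q, bins=edges)
    p_i = p_i / p_i.sum()
    q_i = q_i / q_i.sum()
    return np.sum(np.sqrt(p_i * q_i))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=1000)
b = rng.normal(0.5, 1.5, size=1000)
print(bhattacharyya_coefficient_samples(a, b))  # close to 1 for heavily overlapping samples
```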
The Bhattacharyya coefficient is used in the construction of polar codes.[7]
Applications
The Bhattacharyya distance is widely used in research of feature extraction and selection,[8] image processing,[9] speaker recognition,[10] and phone clustering.[11]
A "Bhattacharyya space" has been proposed as a feature selection technique that can be applied to texture segmentation.[12]
References
- Sen, Pranab Kumar (1996). "Anil Kumar Bhattacharyya (1915-1996): A Reverent Remembrance". Calcutta Statistical Association Bulletin.
- Bhattacharyya, A. "On a Measure of Divergence between Two Multinomial Populations". Sankhyā.
- Bhattacharyya, A. (1943). "On a measure of divergence between two statistical populations defined by their probability distributions". Bulletin of the Calcutta Mathematical Society. 35: 99–109. MR 0010358.
- "Bulletin of the Calcutta Mathematical Society, Vol-35". 1943.
Article: "On a measure of divergence between two statistical populations defined by their probability distributions" by Bhattacharyya, A. is in Page: 99-109
{{cite journal}}
: Cite journal requires|journal=
(help) - Guy B. Coleman, Harry C. Andrews, "Image Segmentation by Clustering", Proc IEEE, Vol. 67, No. 5, pp. 773–785, 1979
- D. Comaniciu, V. Ramesh, P. Meer, Real-Time Tracking of Non-Rigid Objects using Mean Shift Archived 2010-08-14 at the Wayback Machine, BEST PAPER AWARD, IEEE Conf. Computer Vision and Pattern Recognition (CVPR'00), Hilton Head Island, South Carolina, Vol. 2, 142–149, 2000
- Arıkan, Erdal (July 2009). "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels". IEEE Transactions on Information Theory. 55 (7): 3051–3073. arXiv:0807.3917. doi:10.1109/TIT.2009.2021379.
- Euisun Choi, Chulhee Lee, "Feature extraction based on the Bhattacharyya distance", Pattern Recognition, Volume 36, Issue 8, August 2003, Pages 1703–1709
- François Goudail, Philippe Réfrégier, Guillaume Delyon, "Bhattacharyya distance as a contrast parameter for statistical processing of noisy optical images", JOSA A, Vol. 21, Issue 7, pp. 1231−1240 (2004)
- Chang Huai You, "An SVM Kernel With GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition", Signal Processing Letters, IEEE, Vol 16, Is 1, pp. 49-52
- Mak, B., "Phone clustering using the Bhattacharyya distance", Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on, Vol 4, pp. 2005–2008 vol.4, 3−6 Oct 1996
- Reyes-Aldasoro, C.C., and A. Bhalerao, "The Bhattacharyya space for feature selection and its application to texture segmentation", Pattern Recognition, (2006) Vol. 39, Issue 5, May 2006, pp. 812–826
- Nielsen, F.; Boltz, S. (2010). "The Burbea–Rao and Bhattacharyya centroids". IEEE Transactions on Information Theory. 57 (8): 5455–5466. arXiv:1004.5049. doi:10.1109/TIT.2011.2159046.
- Kailath, T. (1967). "The Divergence and Bhattacharyya Distance Measures in Signal Selection". IEEE Transactions on Communication Technology. 15 (1): 52–60. doi:10.1109/TCOM.1967.1089532.
- Djouadi, A.; Snorrason, O.; Garber, F. (1990). "The quality of Training-Sample estimates of the Bhattacharyya coefficient". IEEE Transactions on Pattern Analysis and Machine Intelligence. 12 (1): 92–97. doi:10.1109/34.41388.
- For a short list of properties, see: http://www.mtm.ufsc.br/~taneja/book/node20.html