Approximations of Jensen divergence for twice differentiable functions
Journal of Inequalities and Applications volume 2013, Article number: 267 (2013)
Abstract
The Jensen divergence is used to measure the difference between two probability distributions. This divergence has been generalised to allow the comparison of more than two distributions. In this paper, we consider some bounds for the generalised Jensen divergence of twice differentiable functions with bounded second derivatives. These bounds provide approximations of the Jensen divergence of twice differentiable functions by the Jensen divergence of simpler functions, such as the power functions and the paired entropies associated with the Havrda-Charvát functions.
MSC: 26D15, 94A17.
1 Introduction
One of the most important applications of probability theory is finding an appropriate measure of distance (or difference) between two probability distributions [1]. Such divergence measures have been widely studied and applied by mathematicians including Burbea and Rao [2], Havrda and Charvát [3], Lin [4] and others.
In Burbea and Rao [2], a generalisation of the Jensen divergence is considered to allow the comparison of more than two distributions. If Φ is a function defined on an interval I of the real line ℝ, the (generalised) Jensen divergence between two elements $\mathbf{p}=(p_{1},\dots,p_{n})$ and $\mathbf{q}=(q_{1},\dots,q_{n})$ in $I^{n}$ (where $n\geq 1$) is given by the following equation (cf. Burbea and Rao [2]):
$$\mathcal{J}_{\Phi}(\mathbf{p},\mathbf{q}):=\sum_{i=1}^{n}\left[\frac{\Phi(p_{i})+\Phi(q_{i})}{2}-\Phi\!\left(\frac{p_{i}+q_{i}}{2}\right)\right] \qquad (1)$$
for all $\mathbf{p},\mathbf{q}\in I^{n}$. Several measures have been proposed to quantify the difference (also known as the divergence) between two (or more) probability distributions. We refer to Grosse et al. [5], Kullback and Leibler [6] and Csiszár [7] for further references.
We denote by $\mathbb{R}_{+}^{n}$ the set of $n$-tuples with nonnegative components. Utilising the family of functions
$$\varphi_{\alpha}(t):=\begin{cases}(\alpha-1)^{-1}(t^{\alpha}-t), & \alpha\neq 1,\\ t\log t, & \alpha=1,\end{cases}\qquad t>0,$$
for $\alpha>0$, introduced by Havrda and Charvát in [3] to define their entropies of degree α, Burbea and Rao [2] introduced the following family of Jensen divergences:
$$\mathcal{J}_{\alpha}(\mathbf{p},\mathbf{q}):=\mathcal{J}_{\varphi_{\alpha}}(\mathbf{p},\mathbf{q}),$$
which can be defined on $\mathbb{R}_{+}^{n}\times\mathbb{R}_{+}^{n}$ with the convention that $0\log 0=0$ for $\alpha=1$. We note that the divergence $\mathcal{J}_{1}$ is also known as the Jensen-Shannon divergence [8].
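As a concrete illustration of (1) and of the family $\mathcal{J}_{\alpha}$, here is a minimal numerical sketch; the function names and the sample vectors are ours, not the paper's.

```python
import numpy as np

def jensen_divergence(phi, p, q):
    """Generalised Jensen divergence of eq. (1):
    sum_i [(phi(p_i) + phi(q_i))/2 - phi((p_i + q_i)/2)]."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum((phi(p) + phi(q)) / 2 - phi((p + q) / 2)))

def havrda_charvat(alpha):
    """The function phi_alpha generating the family J_alpha."""
    if alpha == 1:
        # convention 0 log 0 = 0
        return lambda t: t * np.log(np.where(t > 0, t, 1.0))
    return lambda t: (t**alpha - t) / (alpha - 1)

p = [0.1, 0.4, 0.5]
q = [0.3, 0.3, 0.4]
print(jensen_divergence(havrda_charvat(1), p, q))  # J_1: Jensen-Shannon divergence
print(jensen_divergence(havrda_charvat(2), p, q))  # J_2
```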
These measures have been applied in a variety of fields, for example, in information theory [9]. The Jensen divergence introduced in Burbea and Rao [2] has applications in bioinformatics [10, 11], where it is typically used to compare samples from a healthy population (control) and a diseased population (case) when detecting gene expression associated with a certain disease. We refer the readers to Dragomir [1] for applications in other areas.
In a recent paper, Dragomir et al. [12] found sharp upper and lower bounds for the Jensen divergence for various classes of functions Φ, including functions of bounded variation, absolutely continuous functions, Lipschitzian functions, convex functions and differentiable functions. We recall some of these results in Section 2; they motivate the new results obtained in this paper.
In this paper, we provide bounds for the Jensen divergence of a twice differentiable function Φ whose second derivative satisfies certain boundedness conditions. These bounds provide approximations of the Jensen divergence (cf. (1)) by the divergence of simpler functions, such as the power functions (cf. Section 3) and the above-mentioned family of Jensen divergences (cf. Section 4). Finally, we apply these bounds to some elementary functions in Section 5.
2 Definitions, notation and previous results
In this section, we provide definitions and notation that will be used in the paper. We also provide some results regarding sharp bounds for the generalised Jensen divergence as stated in Dragomir et al. [12].
2.1 Definitions and notation
Throughout the paper, for any real number $r>1$, we define $r^{*}:=r/(r-1)$ to be its Hölder conjugate, that is, $1/r+1/r^{*}=1$.
Definition 1 (Bullen [13])
If s is an extended real number, the generalised logarithmic mean of order s of two positive numbers x and y is defined by
$$L_{s}(x,y):=\begin{cases}\left[\dfrac{y^{s+1}-x^{s+1}}{(s+1)(y-x)}\right]^{1/s}, & s\neq 0,-1,\pm\infty,\ x\neq y,\\[2mm] \dfrac{1}{e}\left(\dfrac{y^{y}}{x^{x}}\right)^{1/(y-x)}, & s=0,\ x\neq y,\\[2mm] \dfrac{y-x}{\log y-\log x}, & s=-1,\ x\neq y,\\[2mm] \min\{x,y\}, & s=-\infty,\\ \max\{x,y\}, & s=+\infty,\end{cases}$$
and $L_{s}(x,x):=x$.
This mean is homogeneous and symmetric [[13], p.385]. In particular, there is no loss of generality in assuming $x\leq y$. Note also that
$$x\leq L_{s}(x,y)\leq y$$
for $x\leq y$ and any extended real s. This mean generalises not only the logarithmic mean (when $s=-1$), which is particularly useful in describing the distribution of electrical charge on a conductor, but also the arithmetic mean (when $s=1$) and the geometric mean (when $s=-2$).
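As a small numerical companion to Definition 1, the sketch below evaluates $L_{s}(x,y)$ and its classical special cases; it assumes the case-by-case formula displayed above, and the helper name gen_log_mean is illustrative.

```python
import math

def gen_log_mean(s, x, y):
    """Generalised logarithmic mean L_s(x, y) of two positive numbers."""
    if x == y:
        return float(x)
    if s == 0:    # identric mean
        return math.exp((y * math.log(y) - x * math.log(x)) / (y - x) - 1)
    if s == -1:   # logarithmic mean
        return (y - x) / (math.log(y) - math.log(x))
    return ((y**(s + 1) - x**(s + 1)) / ((s + 1) * (y - x))) ** (1 / s)

x, y = 2.0, 8.0
print(gen_log_mean(1, x, y))    # 5.0, the arithmetic mean
print(gen_log_mean(-2, x, y))   # 4.0, the geometric mean
print(gen_log_mean(-1, x, y))   # ~4.33, the logarithmic mean
```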
We use the following notation for Lebesgue integrable functions: for any Lebesgue integrable function g on $[a,b]$, we define, for $r\geq 1$,
$$\|g\|_{[a,b],r}:=\left(\int_{a}^{b}\bigl|g(t)\bigr|^{r}\,dt\right)^{1/r},$$
and for $r=\infty$, we denote $\|g\|_{[a,b],\infty}:=\operatorname{ess\,sup}_{t\in[a,b]}|g(t)|$.
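These norms can be rendered numerically as follows; this is only a rough grid-based sketch of the standard definitions just recalled.

```python
import numpy as np

def lebesgue_norm(g, a, b, r=np.inf, n=200_000):
    """Approximate ||g||_{[a,b],r} on a uniform grid (a sketch, not adaptive)."""
    t = np.linspace(a, b, n)
    v = np.abs(g(t))
    if np.isinf(r):
        return float(v.max())                      # stands in for the ess sup
    return float((((b - a) / n) * np.sum(v**r)) ** (1 / r))

print(lebesgue_norm(np.sin, 0, np.pi, r=2))        # ~ sqrt(pi/2) = 1.2533...
print(lebesgue_norm(np.sin, 0, np.pi))             # 1.0
```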
We recall that a function $f:[a,b]\to\mathbb{R}$ is absolutely continuous on $[a,b]$ if and only if it is differentiable almost everywhere in $[a,b]$, the derivative $f'$ is Lebesgue integrable on this interval and
$$f(x)=f(a)+\int_{a}^{x}f'(t)\,dt$$
for any $x\in[a,b]$.
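A quick numerical illustration of this standard characterisation; the choice $f=\sin$ is purely illustrative.

```python
import math
from scipy.integrate import quad

f, df = math.sin, math.cos
a, x = 0.0, 1.2
integral, _ = quad(df, a, x)
print(f(x), f(a) + integral)   # both ~0.932: f(x) = f(a) + int_a^x f'(t) dt
```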
2.2 Previous results
In a recent paper, Dragomir et al. [12] provided sharp upper and lower bounds for the Jensen divergence for various classes of functions Φ. Some of these results are stated in the following.
Theorem 2 (Dragomir et al. [12])
Assume that Φ is absolutely continuous on $[a,b]$. Then we have the bounds
for any $\mathbf{p},\mathbf{q}\in[a,b]^{n}$.
Moreover, if the modulus $|\Phi'|$ of the derivative is convex, then we have the inequality
for any $\mathbf{p},\mathbf{q}\in[a,b]^{n}$.
The constants in both inequalities are best possible.
Additional assumptions on Φ lead to the following results.
Theorem 3 (Dragomir et al. [12])
Let Φ be a differentiable function on the interval I of the real line ℝ.
(i) If the derivative $\Phi'$ is of bounded variation on $[a,b]$, then
(5)
for any $\mathbf{p},\mathbf{q}\in[a,b]^{n}$.
The constants in both inequalities (5) are best possible.
(ii) If the derivative $\Phi'$ is Lipschitzian on $[a,b]$ with a constant $K>0$, then
(6)
for any $\mathbf{p},\mathbf{q}\in[a,b]^{n}$.
The constant in (6) is best possible.
Motivated by these results, in the next sections we state bounds for $\mathcal{J}_{\Phi}$ for twice differentiable functions Φ whose second derivative satisfies certain boundedness conditions.
3 Approximating $\mathcal{J}_{\Phi}$ with the Jensen divergence of power functions
In this section, we provide some bounds for the generalised Jensen divergence of a twice differentiable function Φ whose second derivative $\Phi''$ is bounded above and below in the following sense:
(7)
for some constants and all t in the interior of I; and
(8)
for some constants, some exponent s and all t in the interior of I. These conditions enable us to provide approximations of the Jensen divergence of Φ via power functions.
Lemma 4 (Dragomir et al. [12])
Let Φ be a differentiable function and let the derivative $\Phi'$ be absolutely continuous. Then
$$\mathcal{J}_{\Phi}(\mathbf{p},\mathbf{q})=\sum_{i=1}^{n}\int_{m_{i}}^{M_{i}}K_{i}(t)\,\Phi''(t)\,dt,$$
where $m_{i}:=\min\{p_{i},q_{i}\}$, $M_{i}:=\max\{p_{i},q_{i}\}$ and
$$K_{i}(t):=\begin{cases}\dfrac{t-m_{i}}{2}, & t\in\left[m_{i},\dfrac{m_{i}+M_{i}}{2}\right],\\[2mm] \dfrac{M_{i}-t}{2}, & t\in\left(\dfrac{m_{i}+M_{i}}{2},M_{i}\right].\end{cases}$$
We refer to [12] for the proof of the above lemma.
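The representation in Lemma 4 can be checked numerically one summand at a time. The sketch below assumes the kernel form displayed above (the standard kernel for the midpoint difference); the helper names are illustrative.

```python
import numpy as np
from scipy.integrate import quad

def pair_term(phi, d2phi, a, b):
    """LHS and kernel-integral RHS of Lemma 4 for a single pair (a, b)."""
    lhs = (phi(a) + phi(b)) / 2 - phi((a + b) / 2)
    m, M = min(a, b), max(a, b)
    mid = (m + M) / 2
    kernel = lambda t: (t - m) / 2 if t <= mid else (M - t) / 2
    rhs, _ = quad(lambda t: kernel(t) * d2phi(t), m, M, points=[mid])
    return lhs, rhs

phi   = lambda t: t * np.log(t)   # Phi(t) = t log t
d2phi = lambda t: 1.0 / t         # Phi''(t) = 1 / t
print(pair_term(phi, d2phi, 0.2, 0.7))  # the two numbers agree
```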
Lemma 5 Let Φ be a twice differentiable function and $s\in\mathbb{R}$. If $\Phi''$ satisfies (7), then
(10)
and
(11)
where $L_{s}$ is the sth generalised logarithmic mean.
Proof Note that condition (7) is equivalent to
since $t>0$. This is also equivalent to
(12)
We take the supremum of both sides to obtain (10). We also note that (12) is equivalent to
which proves (11). □
Theorem 6 Let Φ be a twice differentiable function and $s\in\mathbb{R}$. If $\Phi''$ satisfies (7), then
Proof Since $\Phi''$ is bounded by condition (7), the derivative $\Phi'$ is Lipschitzian and hence absolutely continuous, so we may employ Lemma 4. Combining this with Lemma 5, we have
as desired. □
We omit the proofs for the next results as they follow similarly to those of Lemma 5 and Theorem 6.
Lemma 7 Let Φ be a twice differentiable function and $s\in\mathbb{R}$. If $\Phi''$ satisfies (8), then
and
where $L_{s}$ is the sth generalised logarithmic mean.
Theorem 8 Let Φ be a twice differentiable function and $s\in\mathbb{R}$. If $\Phi''$ satisfies (8), then
4 Further approximations
In this section, we present approximations for $\mathcal{J}_{\Phi}$ by utilising the family of Jensen divergences
(15)
and
(16)
Although $\mathcal{J}_{\alpha}$ is defined for $\alpha>0$ in [2], we may let α be negative in (15), and for $\alpha=0$ we define
(17)
Theorem 9 Let Φ be a twice differentiable function on I. If $\Phi''$ satisfies (7), then
(18)
Furthermore, if $\Phi''$ satisfies (8), then
(19)
Proof We consider an auxiliary function χ on I. We observe that χ is twice differentiable on I and its second derivative is given by
Utilising condition (7), and since $t^{\alpha-2}>0$ for $t>0$, we deduce that $\chi''(t)\geq 0$ for any t, which means that χ is convex on I. Since for a convex function Ψ we have $\mathcal{J}_{\Psi}(\mathbf{p},\mathbf{q})\geq 0$, we can write that
and the first inequality in (18) is proved. To prove the second inequality in (18), we consider a second auxiliary function, for which we perform a similar argument; we omit the details.
Now, if we consider the auxiliary function ψ for negative α, then ψ is twice differentiable and
since $t^{\alpha-2}>0$. Therefore ψ is concave on I, which implies that $\mathcal{J}_{\psi}(\mathbf{p},\mathbf{q})\leq 0$ for any $\mathbf{p},\mathbf{q}$ and, as above, we obtain
The second inequality in (19) follows by considering one more auxiliary function, and we omit the details. This completes the proof. □
Theorem 10 Let Φ be a twice differentiable function on I. If there exist constants such that
(20)
then we have the bounds
(21)
If there exist constants such that
(22)
then we have the bounds
(23)
Proof Consider an auxiliary function χ. We observe that χ is twice differentiable and, by (20), $\chi''(t)\geq 0$ for any t, so χ is a convex function on I. Therefore $\mathcal{J}_{\chi}(\mathbf{p},\mathbf{q})\geq 0$ for any $\mathbf{p},\mathbf{q}$, which implies
and the first inequality in (21) is proved. Now consider a second auxiliary function; it is twice differentiable, and by (20) it is a convex function on I. By similar arguments, we deduce the second inequality in (21).
To prove the second part of the theorem, consider one more auxiliary function. We observe that it is twice differentiable and, by (22), its second derivative is nonnegative for all t, so it is a convex function on I. The proof now follows along the lines outlined above, and the first part of (23) is proved. The second part of (23) follows by employing yet another auxiliary function; this completes the proof. □
5 Applications to some elementary functions
We consider the approximations mentioned in Section 4 for some elementary functions.
We first consider an elementary function Φ, for which we have the following bounds for all $\mathbf{p},\mathbf{q}$:
In what follows, we apply these bounds to the above function on an interval $[a,b]$ with the choices of parameters indicated in Figure 1 (cf. Figure 1).
Discussion In this example, the best lower approximation (amongst the three) and the best upper approximation are the ones indicated in Figure 1. However, it remains an open question whether this is true in general.
We next consider the Havrda-Charvát function
$$\varphi_{\alpha}(t):=\begin{cases}(\alpha-1)^{-1}(t^{\alpha}-t), & \alpha\neq 1,\\ t\log t, & \alpha=1.\end{cases}$$
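Since the auxiliary-function arguments rest on the second derivative $\varphi_{\alpha}''(t)=\alpha t^{\alpha-2}$, the following sketch verifies this numerically, assuming the form of $\varphi_{\alpha}$ recalled above.

```python
import numpy as np

def phi(alpha, t):
    """Havrda-Charvat function phi_alpha, as recalled above."""
    return t * np.log(t) if alpha == 1 else (t**alpha - t) / (alpha - 1)

# Central finite differences approximately agree with alpha * t^(alpha - 2).
h, t = 1e-5, 0.7
for alpha in (0.5, 1, 2, 3):
    num = (phi(alpha, t + h) - 2 * phi(alpha, t) + phi(alpha, t - h)) / h**2
    print(f"alpha={alpha}: {num:.5f} vs {alpha * t**(alpha - 2):.5f}")
```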
For suitable values of α, we have the following bounds for all $\mathbf{p},\mathbf{q}$:
We also have the following bounds for all $\mathbf{p},\mathbf{q}$:
In Figure 2, we apply these bounds to the above function on an interval $[a,b]$ with the indicated choices of α and of the parameters.
We also have, for all $\mathbf{p},\mathbf{q}$,
for suitable α, where the corresponding constants are defined case by case.
In Figure 3, we apply these bounds to the above function on an interval $[a,b]$ with the indicated choices of the parameters.
Similarly, we have, for all $\mathbf{p},\mathbf{q}$,
for suitable α and s, where the corresponding constants are defined case by case. In Figure 4, we apply these bounds to the above function on an interval $[a,b]$ with the indicated choices of the parameters.
Discussion In this example, the best lower approximation (amongst the five) and the best upper approximation are the ones indicated in the figures. However, it remains an open question whether this is true in general.
References
Dragomir SS: Some reverses of the Jensen inequality with applications. RGMIA Research Report Collection (Online) 2011., 14: Article ID v14a72
Burbea J, Rao CR: On the convexity of some divergence measures based on entropy functions. IEEE Trans. Inf. Theory 1982, 28(3):489–495. 10.1109/TIT.1982.1056497
Havrda J, Charvát F: Quantification method of classification processes: concept of structural α-entropy. Kybernetika 1967, 3: 30–35.
Lin J: Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 1991, 37(1):145–151. 10.1109/18.61115
Grosse I, Bernaola-Galvan P, Carpena P, Roman-Roldan R, Oliver J, Stanley HE: Analysis of symbolic sequences using the Jensen-Shannon divergence. Phys. Rev. E, Stat. Nonlinear Soft Matter Phys. 2002., 65(4): Article ID 041905. doi:10.1103/PhysRevE.65.041905
Kullback S, Leibler RA: On information and sufficiency. Ann. Math. Stat. 1951, 22: 79–86. 10.1214/aoms/1177729694
Csiszár I: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hung. 1967, 2: 299–318.
Shannon CE: A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27: 379–423, 623–656.
Menendez ML, Pardo JA, Pardo L: Some statistical applications of generalized Jensen difference divergence measures for fuzzy information systems. Fuzzy Sets Syst. 1992, 52: 169–180. 10.1016/0165-0114(92)90047-8
Arvey AJ, Azad RK, Raval A, Lawrence JG: Detection of genomic islands via segmental genome heterogeneity. Nucleic Acids Res. 2009, 37(16):5255–5266. 10.1093/nar/gkp576
Gómez RM, Rosso OA, Berretta R, Moscato P: Uncovering molecular biomarkers that correlate cognitive decline with the changes of Hippocampus’ gene expression profiles in Alzheimer’s disease. PLoS ONE 2010., 5(4): Article ID e10153. doi:10.1371/journal.pone.0010153
Dragomir SS, Dragomir NM, Sherwell D: Sharp bounds for the Jensen divergence with applications. RGMIA Research Report Collection (Online) 2011., 14: Article ID v14a47
Bullen PS: Handbook of Means and Their Inequalities. Mathematics and Its Applications 560. Kluwer Academic, Dordrecht; 2003.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
EK, SSD, ITD and DS contributed equally in all stages of writing the paper. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cite this article
Kikianty, E., Dragomir, S.S., Dintoe, I.T. et al. Approximations of Jensen divergence for twice differentiable functions. J Inequal Appl 2013, 267 (2013). https://doi.org/10.1186/1029-242X-2013-267