In this paper, we discuss the strong convergence rates and strong representation of the Kaplan-Meier estimator and the hazard estimator based on censored data when the survival and the censoring times form negatively associated (NA) sequences. Under certain regularity conditions, strong convergence rates are established for both estimators, and each estimator can be expressed as a mean of random variables, with the remainder of order a.s.
MSC: 60F15, 60F05.
Keywords: NA sequence; random censorship model; Kaplan-Meier estimator; strong representation; strong convergence rate
1 Introduction and main results
Let $\{X_n; n \ge 1\}$ be a sequence of true survival times. The random variables (r.v.s) $X_n$ are not assumed to be mutually independent; it is assumed, however, that they have a common unknown continuous marginal distribution function (d.f.) $F(x) = P(X_n \le x)$ such that $F(0) = 0$. Let the r.v.s $X_n$ be censored on the right by the censoring r.v.s $Y_n$, so that one observes only $(Z_n, \delta_n)$, where
\[ Z_n = \min(X_n, Y_n), \qquad \delta_n = I(X_n \le Y_n). \]
Here and in the sequel, $I(A)$ is the indicator random variable of the event A. In this random censorship model, the censoring times $Y_n$, $n \ge 1$, are assumed to have a common distribution function $G(y) = P(Y_n \le y)$ such that $G(0) = 0$; they are also assumed to be independent of the r.v.s $X_n$. The problem at hand is that of drawing nonparametric inference about F based on the censored observations $(Z_i, \delta_i)$, $i = 1, \ldots, n$. For this purpose, define two stochastic processes on $[0, \infty)$ as follows:
\[ N_n(t) = \sum_{i=1}^{n} I(Z_i \le t, \ \delta_i = 1), \]
the number of uncensored observations less than or equal to t, and
\[ Y_n(t) = \sum_{i=1}^{n} I(Z_i \ge t), \]
the number of censored or uncensored observations greater than or equal to t. The following nonparametric estimator of F, due to Kaplan and Meier, is widely used to estimate F on the basis of the data $(Z_i, \delta_i)$:
\[ 1 - \hat F_n(t) = \prod_{s \le t} \left( 1 - \frac{dN_n(s)}{Y_n(s)} \right), \qquad dN_n(s) = N_n(s) - N_n(s-). \]
We have then
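The product-limit construction can be illustrated numerically. The following is a minimal sketch, assuming the standard form of the estimator in which the survival estimate is multiplied by one minus (number of uncensored observations at t) / (number of observations greater than or equal to t) at each distinct uncensored time t. The function name `kaplan_meier` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct uncensored time t.

    times  -- observed values (minimum of survival and censoring time)
    events -- indicators: 1 = uncensored, 0 = censored
    S(t) is the estimated survival probability; the estimated d.f. is 1 - S(t).
    """
    # Number of uncensored observations at each distinct time.
    deaths = Counter(t for t, d in zip(times, events) if d == 1)
    surv = 1.0
    curve = []
    for t in sorted(deaths):
        # Number of observations (censored or not) that are >= t.
        at_risk = sum(1 for z in times if z >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical toy sample: six observations, two of them censored.
z = [1, 2, 2, 3, 4, 5]
delta = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(z, delta):
    print(t, round(s, 4))
```

On this toy sample the survival estimate steps down to 5/6, 2/3, 4/9 and 0 at the uncensored times 1, 2, 3 and 5. The sketch makes no use of any dependence structure: the estimator is computed the same way for independent and for NA data, and only its asymptotic analysis differs.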
Another question of interest in survival analysis is the estimation of the hazard function h, defined as follows when it is further assumed that F has a density f:
\[ h(t) = \frac{f(t)}{1 - F(t)}, \qquad F(t) < 1. \]
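As a hedged aside, the hazard function is linked to the survival function through the cumulative hazard. The short derivation below assumes that F has a density f, that F(0) = 0, and that F(t) < 1. The symbol $\Lambda$ for the cumulative hazard is an assumption of this sketch and not necessarily the notation of the paper.

```latex
% The hazard is the density divided by the survival probability,
% which is minus the derivative of the log-survival function.
h(t) = \frac{f(t)}{1 - F(t)}
     = -\frac{d}{dt} \log\bigl(1 - F(t)\bigr), \qquad F(t) < 1.

% Integrating from 0 to t and using F(0) = 0:
\Lambda(t) := \int_0^t h(s)\, ds = -\log\bigl(1 - F(t)\bigr)
\quad \Longrightarrow \quad
1 - F(t) = \exp\bigl(-\Lambda(t)\bigr).
```

Under these assumptions, an estimate of the cumulative hazard determines an estimate of the survival function and vice versa, which is why results for the two estimators are usually developed together.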
There is an extensive literature on the Kaplan-Meier estimator and the hazard estimator for censored independent observations. We refer to the papers by Breslow and Crowley, Földes and Rejtö, and Gu and Lai. Martingale methods for analyzing properties of the Kaplan-Meier estimator are described in the monograph by Gill. However, censored dependent data appear in a number of applications. For example, repeated measurements in survival analysis follow this pattern; see Kang and Koehler or Wei et al. In the context of censored time series analysis, Shumway et al. considered (hourly or daily) measurements of the concentration of a given substance subject to some detection limits, thus being potentially censored from the right. Ying and Wei, Lecoutre and Ould-Saïd, Cai, and Liang and Uña-Álvarez studied the convergence of the Kaplan-Meier estimator for stationary α-mixing data.
The main purpose of this paper is to study the strong convergence rates and strong representation of the Kaplan-Meier estimator and the hazard estimator based on censored data when the survival and the censoring times form negatively associated (NA) sequences (see the definition below). Under certain regularity conditions, we obtain strong convergence rates for both estimators and express each of them as a mean of random variables, with the remainder of order a.s.
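Definition A finite family of random variables $\{X_i; 1 \le i \le n\}$ is said to be negatively associated (NA) if, for every pair of disjoint subsets A and B of $\{1, 2, \ldots, n\}$,
\[ \mathrm{Cov}\bigl( f_1(X_i; i \in A), \ f_2(X_j; j \in B) \bigr) \le 0 \]
whenever $f_1$ and $f_2$ are coordinatewise nondecreasing functions such that the covariance exists. An infinite family of random variables is NA if every finite subfamily is NA.
This definition was introduced by Joag-Dev and Proschan. Statistical tests depend heavily on the sampling scheme. Random sampling without replacement from a finite population is NA, but not independent. NA sampling has wide applications, for example in multivariate statistical analysis and reliability theory. Because of these applications, the limit behavior of NA random variables has received increasing attention in recent years. One can refer to Joag-Dev and Proschan for fundamental properties, Matula for the three series theorem, and Wu and Jiang [15,16] for strong convergence.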
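The remark about sampling without replacement can be checked on a small example. The sketch below computes the exact covariance of the first two draws without replacement from a finite population by enumerating all equally likely ordered pairs. It verifies only the simplest consequence of negative association (nonpositive covariance of the draws themselves), not the full property for all coordinatewise nondecreasing functions. The function name `pair_covariance` and the population are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def pair_covariance(population):
    """Exact Cov(X1, X2) for two draws without replacement."""
    # All ordered pairs of distinct positions are equally likely.
    pairs = list(permutations(population, 2))
    n = len(pairs)
    e1 = Fraction(sum(a for a, _ in pairs), n)       # E[X1]
    e2 = Fraction(sum(b for _, b in pairs), n)       # E[X2]
    e12 = Fraction(sum(a * b for a, b in pairs), n)  # E[X1 * X2]
    return e12 - e1 * e2


# Hypothetical population of five values.
cov = pair_covariance([1, 2, 3, 4, 5])
print(cov)
```

For the population {1, 2, 3, 4, 5} the covariance is -1/2, in agreement with the classical formula $-\sigma^2/(N-1)$ with population variance $\sigma^2 = 2$ and size $N = 5$. Exact rational arithmetic is used so that the sign of the covariance is not affected by rounding.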
We give two lemmas, which are helpful in proving our theorems.
Lemma 1.1 (Yang, Lemma 1)
Proof Lemma 1.2 can be proved in the same way as Lemma 4 in Yang. □
For positive reals z and t, and δ taking value 0 or 1, let
Theorem 1.4 Assume that the conditions of Theorem 1.3 hold. Then
Proof of Theorem 1.3 It is easy to see from Property P7 of Joag-Dev and Proschan  that and are also two sequences of NA r.v.s. Therefore
Now, by (1.1) and (1.2), let us write
Thus, (1.5) holds.
Now we prove (1.6). By (1.3) and (1.4),
Hence both (1.5) and (1.6) hold. This completes the proof of Theorem 1.3. □
Proof of Theorem 1.4 By (2.1),
Thus, by the combination of (2.3),
Using the bound and the Borel-Cantelli lemma, we deduce that a.s. The estimation of is similar noting that for all x and y. Therefore, by (2.6)-(2.9), (1.8) holds. (1.9) follows from (2.5) and (1.8). □
The authors declare that they have no competing interests.
QW conceived of the study, and drafted and completed the manuscript. PC participated in the discussion of the manuscript. QW and PC read and approved the final manuscript.
Qunying Wu, Professor, Doctor, working in the field of probability and statistics.
Supported by the National Natural Science Foundation of China (11061012), the Program to Sponsor Teams for Innovation in the Construction of Talent Highlands in Guangxi Institutions of Higher Learning ( 47), and the Support Program of the Guangxi China Science Foundation (2012GXNSFAA053010, 2013GXNSFDA019001).
Kaplan, EL, Meier, P: Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53, 457–481 (1958)
Breslow, N, Crowley, J: A large sample study of the life table and product limit estimates under random censorship. Ann. Stat. 2, 437–453 (1974)
Földes, A, Rejtö, L: A LIL type result for the product limit estimator. Z. Wahrscheinlichkeitstheor. Verw. Geb. 56, 75–84 (1981)
Gu, MG, Lai, TL: Functional laws of the iterated logarithm for the product-limit estimator of a distribution function under random censorship or truncation. Ann. Probab. 18, 160–189 (1990)
Ying, Z, Wei, LJ: The Kaplan-Meier estimate for dependent failure time observations. J. Multivar. Anal. 50, 17–29 (1994)
Lecoutre, JP, Ould-Saïd, E: Convergence of the conditional Kaplan-Meier estimate under strong mixing. J. Stat. Plan. Inference 44, 359–369 (1995)
Cai, ZW: Estimating a distribution function for censored time series data. J. Multivar. Anal. 78, 299–318 (2001)
Liang, HY, Uña-Álvarez, J: A Berry-Esseen type bound in kernel density estimation for strong mixing censored samples. J. Multivar. Anal. 100, 1219–1231 (2009)
Joag-Dev, K, Proschan, F: Negative association of random variables with applications. Ann. Stat. 11(1), 286–295 (1983)
Matula, PA: A note on the almost sure convergence of sums of negatively dependent random variables. Stat. Probab. Lett. 15, 209–213 (1992)
Wu, QY, Jiang, YY: A law of the iterated logarithm of partial sums for NA random variables. J. Korean Stat. Soc. 39(2), 199–206 (2010)
Wu, QY, Jiang, YY: Chover’s law of the iterated logarithm for NA sequences. J. Syst. Sci. Complex. 23(2), 293–302 (2010)