LIU Xin-le
(School of Mathematics and Information Science, Henan Polytechnic University, Jiaozuo, Henan 454000)
The CC Estimation Method for the Semiparametric Regression Model with Missing Longitudinal Data
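Missing longitudinal data; semiparametric regression model; CC method; consistency; asymptotic normality
In recent years, longitudinal data models have become one of the active topics in statistics. Diggle et al. [1] systematically studied linear models and generalized linear models for longitudinal data. Hsiao [2] and Baltagi [3] discussed applications of longitudinal data in economics; Dette & Munk [7] and Martinussen & Scheike [11], [12] discussed nonparametric regression models for longitudinal data. Consider the semiparametric regression model for longitudinal data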
$i = 1, 2, \ldots, n,\ j = 1, 2, \ldots, m_i$
(1)
Let $\alpha = E(g(t_{ij}))$ and $\varepsilon_{ij} = g(t_{ij}) - \alpha + e_{ij}$, $i \ge 1$, $j \ge 1$; then model (1) can be rewritten as:
$i = 1, 2, \ldots, n,\ j = 1, 2, \ldots, m_i$
$i = 1, 2, \ldots, n,\ j = 1, 2, \ldots, m_i$
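For orientation, the following is a sketch of the forms such displays usually take, assuming model (1) is the standard partially linear model. The response $y_{ij}$, the covariate vector $x_{ij}$, and the complete-record indicator $\delta_{ij}$ (1 if the record is fully observed, 0 otherwise) are assumed notation; only $\alpha$, $\beta$, $g$, $t_{ij}$, $e_{ij}$ and $\varepsilon_{ij}$ come from the text.

```latex
\begin{align*}
y_{ij} &= x_{ij}^{T}\beta + g(t_{ij}) + e_{ij}, \\
y_{ij} &= \alpha + x_{ij}^{T}\beta + \varepsilon_{ij},
  \qquad \alpha = E\,g(t_{ij}),\quad \varepsilon_{ij} = g(t_{ij}) - \alpha + e_{ij}, \\
\delta_{ij}\, y_{ij} &= \delta_{ij}\,\alpha + \delta_{ij}\, x_{ij}^{T}\beta + \delta_{ij}\,\varepsilon_{ij}.
\end{align*}
```

The second line follows from the first by adding and subtracting $\alpha$; the third keeps only the records with $\delta_{ij} = 1$, which is what a complete-case analysis uses.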
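Then, applying the CC method, the least-squares estimators of $\alpha$ and $\beta$ are:
Hence, under the CC method, the final estimator of $g(t)$ is defined as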
(2)
$i = 1, 2, \ldots, n,\ j = 1, 2, \ldots, m_i$
(3)
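The two estimation stages described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's exact estimator: the function name `cc_two_stage`, the argument names, and the use of uniform k-nearest-neighbour weights in the second stage are all hypothetical choices (Condition C5 below allows more general nonnegative weights). The first stage treats $g(t_{ij}) - \alpha + e_{ij}$ as the error term, which is reasonable only when the covariates and $t$ are unrelated.

```python
import numpy as np

def cc_two_stage(y, x, t, delta, k):
    """Complete-case two-stage fit of y = x @ beta + g(t) + e (illustrative)."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)  # force shape (N, p)
    keep = np.asarray(delta).astype(bool)

    # CC step: drop every record that is not fully observed.
    y_c, x_c, t_c = y[keep], x[keep], t[keep]

    # Stage 1: ordinary least squares of y on (1, x) over complete cases.
    design = np.column_stack([np.ones(len(y_c)), x_c])
    coef, *_ = np.linalg.lstsq(design, y_c, rcond=None)
    alpha_hat, beta_hat = coef[0], coef[1:]

    # Stage 2: smooth the partial residuals y - x @ beta_hat over t.
    resid = y_c - x_c @ beta_hat

    def g_hat(t0):
        """Average the residuals of the k complete cases nearest to t0."""
        t0 = np.atleast_1d(np.asarray(t0, dtype=float))
        dist = np.abs(t_c[None, :] - t0[:, None])
        nearest = np.argsort(dist, axis=1)[:, :k]
        return resid[nearest].mean(axis=1)

    return alpha_hat, beta_hat, g_hat
```

Calling `alpha_hat, beta_hat, g_hat = cc_two_stage(y, x, t, delta, k=20)` returns the two parametric estimates and a function that evaluates the nonparametric estimate at arbitrary points, for example `g_hat([0.1, 0.5])`. Missing responses may be stored as `nan`, since they are discarded before any arithmetic.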
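To prove the large-sample properties of the estimators, we make the following preparations:
3.1 Conditions
Condition C2: $g(\cdot)$ satisfies a first-order Lipschitz condition on the closed interval $[0,1]$;
Condition C4: $\Sigma = E[\delta\{X - E(\delta X)|E\delta\}\{X - E(\delta X)|E\delta\}^{T}]$ is a positive definite matrix, and $\Sigma \in (0, +\infty)$;
Condition C5: $\{k = k_N, N \ge 1\}$ is a sequence of positive integers and $\{v_{Ni}, 1 \le i \le N\}$ are nonnegative real numbers satisfying: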
(5)supN{kmax1
Condition C6: (1) $E g^{2}(t) < \infty$;
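As a worked check, Condition C2 and part (1) of Condition C6 can be verified for the test function $g(t) = \cos(2\pi t)$ used in the simulation. The Lipschitz bound uses the mean value theorem with some $\xi$ between $s$ and $t$; the moment computation additionally assumes that $t$ is uniformly distributed on $[0,1]$, which the text does not state.

```latex
\begin{align*}
|g(s) - g(t)| &= |\cos(2\pi s) - \cos(2\pi t)|
  = 2\pi\,|\sin(2\pi\xi)|\,|s - t| \le 2\pi\,|s - t|, \qquad s, t \in [0,1], \\
E\,g^{2}(t) &= \int_{0}^{1} \cos^{2}(2\pi t)\,dt = \tfrac{1}{2} < \infty .
\end{align*}
```

So this choice of $g$ is Lipschitz on $[0,1]$ with constant $2\pi$ and has a finite second moment.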
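3.2 Lemmas
Lemma 2: If Conditions C1 and C5 hold, then for $1 \le j \le p$,
Proof. The result follows by a calculation based on Lemma 4 of [4]; the details are omitted here.
Lemma 4: Suppose Conditions C4 and C5 hold; then for any $1 \le j \le p$,
3.3 Conclusions (the detailed proofs of the theorems are omitted)
Theorem 1: If Conditions C3, C4, C5 and C6 hold, then
Theorem 2: If Conditions C1 through C5 hold,
The relationship among the missing probability, the sample size, and the accuracy of the estimator is shown in the following table: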
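Missing probability   Number of individuals n   $\hat{\beta}$   Bias($\hat{\beta}$)   SD($\hat{\beta}$)   MSE($\hat{\beta}$)
p = 0.45              40                        1.5228          0.0228                0.0847              0.0077
p = 0.45              80                        1.5245          0.0245                0.0628              0.0045
p = 0.25              40                        1.5242          0.0242                0.0620              0.0044
p = 0.25              80                        1.5233          0.0233                0.0502              0.0031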
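The reported columns can be cross-checked against one another. The sketch below assumes the usual decomposition MSE = Bias² + SD² (the text does not say how the MSE column was computed) and recomputes it from the Bias and SD values in the table.

```python
# Rows: (missing probability, n, reported bias, reported SD, reported MSE).
rows = [
    (0.45, 40, 0.0228, 0.0847, 0.0077),
    (0.45, 80, 0.0245, 0.0628, 0.0045),
    (0.25, 40, 0.0242, 0.0620, 0.0044),
    (0.25, 80, 0.0233, 0.0502, 0.0031),
]

# Recompute MSE as bias^2 + SD^2 and round to the table's four decimals.
recomputed = [round(bias**2 + sd**2, 4) for _, _, bias, sd, _ in rows]

for (p, n, _, _, mse), value in zip(rows, recomputed):
    print(f"p={p:.2f}, n={n}: reported MSE={mse:.4f}, bias^2+SD^2={value:.4f}")
```

Under that assumption the recomputed values agree with the reported MSE column to four decimal places, and the variance term dominates the squared bias in every row.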
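For missing probability p = 0.25 and sample size 40, the fitted curve of the function $g(t) = \cos(2\pi t)$ is:
For missing probability p = 0.25 and sample size 80, the fitted curve of the function $g(t) = \cos(2\pi t)$ is: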
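From the table above, the standard deviation and mean squared error of $\hat{\beta}$ decrease as the sample size grows and as the missing probability falls, while the bias stays between 0.0228 and 0.0245 in all four settings. This indicates that the method is feasible for estimating both the parameter and the nonparametric function.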
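A simulation of this kind can be sketched as follows. The design here is assumed rather than taken from the paper: three observations per individual, standard normal covariates, $t$ uniform on $[0,1]$, normal errors with standard deviation 0.5, records missing completely at random with probability p, and true $\beta = 1.5$ (inferred from the table, where the estimate minus the bias equals 1.5). The function names are hypothetical, and the numbers produced will not match the table.

```python
import numpy as np

def simulate_once(rng, n, m, p_miss, beta):
    """Draw one data set and return the complete-case LS estimate of beta."""
    total = n * m  # n individuals, m observations each
    x = rng.normal(size=total)
    t = rng.uniform(size=total)
    y = beta * x + np.cos(2 * np.pi * t) + rng.normal(scale=0.5, size=total)
    observed = rng.uniform(size=total) >= p_miss  # each record lost w.p. p_miss
    design = np.column_stack([np.ones(observed.sum()), x[observed]])
    coef, *_ = np.linalg.lstsq(design, y[observed], rcond=None)
    return coef[1]

def summarize(n, p_miss, reps=500, m=3, beta=1.5, seed=0):
    """Monte Carlo bias, SD and MSE of the complete-case estimate of beta."""
    rng = np.random.default_rng(seed)
    est = np.array([simulate_once(rng, n, m, p_miss, beta) for _ in range(reps)])
    bias = est.mean() - beta
    sd = est.std()
    return bias, sd, bias**2 + sd**2

for p_miss in (0.45, 0.25):
    for n in (40, 80):
        bias, sd, mse = summarize(n, p_miss)
        print(f"p={p_miss:.2f}, n={n}: bias={bias:.4f}, SD={sd:.4f}, MSE={mse:.4f}")
```

Because more complete records remain when n is larger or p is smaller, the SD and MSE in such a run should shrink in the same direction as in the table.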
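[1] Diggle P J, Liang K Y, Zeger S L. Analysis of Longitudinal Data [M]. New York: Oxford University Press, 1994.
[2] Hsiao C. Analysis of Panel Data [M]. Cambridge University Press, 1986.
[3] Baltagi B H. Econometric Analysis of Panel Data [M]. England: John Wiley and Sons, 1995.
[4] Hong Shengyan. Estimation theory for a class of semiparametric regression models [J]. Science in China, Series A, 1991, 21(12): 1258-1272. (in Chinese)
[5] Gao Jiti, Chen Xiru, Zhao Lincheng. Asymptotic normality of estimators in partially linear models [J]. Acta Mathematica Sinica, 1994, 37(2): 256-268. (in Chinese)
[6] Gao Jiti, Hong Shengyan, Liang Hua. Convergence rates of estimators in partially linear models [J]. Acta Mathematica Sinica, 1995, 38(5): 658-669. (in Chinese)
[7] Dette H, Munk A. Testing heteroscedasticity in nonparametric regression [J]. J. Roy. Statist. Soc. Ser. B, 1998, 60: 693-708.
[8] Zhao Lincheng, Bai Zhidong. Strong consistency of nearest-neighbor estimators of nonparametric regression functions [J]. Science in China, Series A, 1984, 14(5): 387-393. (in Chinese)
[9] Liu Yan. Two-stage estimation for semiparametric regression models with missing data [D]. Guangxi Normal University, 2009. (in Chinese)
[10] Zeng Linrui, Zhu Zhongyi, Mao Shisong. Influence analysis and outlier testing in semiparametric generalized linear models [J]. Applied Mathematics: A Journal of Chinese Universities (Series A), 2004, 19(3): 323-332. (in Chinese)
[11] Martinussen T, Scheike T H. A nonparametric dynamic additive regression model for longitudinal data [J]. The Annals of Statistics, 2000, 28: 1000-1025.
[12] Martinussen T, Scheike T H. Sampling adjusted analysis of dynamic additive regression models for longitudinal data [J]. Scandinavian Journal of Statistics, 2001, 28: 303-323.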
[Editor in charge: ZHANG Huaitao]
The CC Method of the Semi-parametric Regression Model Under the Missing Longitudinal Data
LIU Xin-le
(School of Mathematics and Information Science, Henan Polytechnic University, Jiaozuo 454000, China)
Missing-data models and longitudinal-data models have each been active topics in statistics, but models for missing longitudinal data have received little study. This paper proposes a semiparametric regression model for missing longitudinal data and gives a solution: using the CC (complete-case) method, every record that contains missing data is deleted and only the "complete" sample is retained. The final estimators of the parametric vector and the nonparametric function are then obtained by a two-stage estimation method. The asymptotic normality of these estimators is proved, and a simulation study shows that the estimation method is feasible.
Missing longitudinal data; Semiparametric regression model; Complete Case; Consistency; Asymptotic Normality
2016-07-26
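Supported by the Henan Polytechnic University Youth Fund (72511/082) and the Henan Polytechnic University Model Teacher Teaching Reform Special Project (72307/001).
LIU Xin-le (born 1980), female, from Pingdingshan, Henan; lecturer, working in mathematical statistics.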
A
1671-5330(2016)05-0071-04