
      AN INFORMATIC APPROACH TO A LONG MEMORY STATIONARY PROCESS

      2023-04-25 01:41:36

      Yiming DING (丁義明)

      College of Science, Wuhan University of Science and Technology, Wuhan 430081, China;Hubei Province Key Laboratory of System Science in Metallurgical Process,Wuhan University of Science and Technology, Wuhan 430065, China

      E-mail: dingym@wust.edu.cn

      Liang WU (吳量)

      Center of Statistical Research, School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China

      E-mail: wuliang@swufe.edu.cn

      Xuyan XIANG (向緒言)

      Hunan Province Cooperative Innovation Center for TCDDLEEZ, School of Mathematics and Physics Science, Hunan University of Arts and Science, Changde 415000, China

      E-mail: xyxiang2001@126.com

      Abstract Long memory is an important phenomenon that arises sometimes in the analysis of time series or spatial data. Most of the definitions concerning the long memory of a stationary process are based on the second-order properties of the process. The mutual information between the past and future $I_{p-f}$ of a stationary process represents the information stored in the history of the process which can be used to predict the future. We suggest that a stationary process can be referred to as long memory if its $I_{p-f}$ is infinite. For a stationary process with finite block entropy, $I_{p-f}$ is equal to the excess entropy, which is the summation of redundancies that relate the convergence rate of the conditional (differential) entropy to the entropy rate. Since the definitions of $I_{p-f}$ and the excess entropy of a stationary process require only a very weak moment condition on the distribution of the process, they can be applied to processes whose distributions are without a bounded second moment. A significant property of $I_{p-f}$ is that it is invariant under one-to-one transformation; this enables us to know the $I_{p-f}$ of a stationary process from other processes. For a stationary Gaussian process, long memory in the sense of mutual information is more strict than that in the sense of covariance. We demonstrate that the $I_{p-f}$ of fractional Gaussian noise is infinite if and only if the Hurst parameter is $H\in(1/2,1)$.

      Key words mutual information between past and future;long memory;stationary process;excess entropy;fractional Gaussian noise

      1 Introduction

      Long memory processes and long range dependence processes are synonymous notions [5,12,22] which play an important role in various fields, such as hydrology, geophysics, physics, finance, biology, medicine, climatology, environmental sciences, economics, telecommunications, etc. As was mentioned in [5], although long memory and related topics date from the late 19th century, it can properly be said that the notion only really started to attract the interest of a significant number of mathematical researchers (and, in particular, probabilists and statisticians) since the work of Mandelbrot and his colleagues, which laid the foundations for the fractional Brownian motion (FBM) model and its increments (such as the fractional Gaussian noise (FGN) model), the classical models in the studies of long memory [5,28]. Similar path-breaking roles can be attributed to Hurst [15] for hydrology, Dobrushin (and before, Kolmogorov [17]) for physics, and Granger [13] for economics.

      A stationary process is a sequence of random variables whose probability law is time invariant. A stationary second-order moment process has long memory when the sum of the autocorrelation function (ACF) diverges, or there exists a pole at zero frequency of its power spectrum [4,5]. That is to say, the ACF and the power spectrum of the long memory process both follow a power law, while the underlying process has no characteristic timescale of decay. The correlations of a long memory process decay slowly, which is in striking contrast to many standard stationary processes. The long memory phenomenon relates to the rate of decay of statistical dependence of a stationary process, with the implication that this decays more slowly than an exponential decay, typically like a power function. Some self-similar processes may exhibit long memory, but not all processes with long memory are self-similar [21]. The definitions of long memory vary from author to author (the econometric survey [14] mentions 11 different definitions), and different definitions are used for different applications. Most of the definitions of long memory that appear in the literature are based on the second-order properties of a stochastic process. Such properties include the asymptotic behavior of covariances, spectral density, and variances of partial sums. Reasons for the popularity of the second-order properties in this context are both historical and practical: second-order properties are relatively simple concepts and are easy to estimate from the data.

      Long memory is popularly defined for a covariance stationary series $\{X_n\}$ with spectral density via the divergence of the summation of autocovariances

      $$\sum_{n}|r_n|=+\infty,$$

      where $r_n=\mathrm{Cov}(X_m,X_{m+n})$. Conversely, $\sum_{n}|r_n|<+\infty$ for short memory. Note that correlations provide only limited information about the process if the process is "not very close" to being Gaussian, and the rate of decay of correlations may change significantly after instantaneous one-to-one transformations of the process. Moreover, this definition is not valid for processes without a bounded second moment [22]. The question then arises as to whether it is possible to develop a new approach to improve the definition of the long memory of a stationary process so that it is invariant under one-to-one transformation, can capture more dependence information, and can be used for processes without a bounded second moment. For stationary processes without a bounded second moment, some scholars use extreme values to describe long memory [11,19,20,22]. Samorodnitsky suggested using notions from ergodic theory, including ergodic strong mixing, to describe the memory of a stationary process, because these are invariant under one-to-one transformation. The key step in this approach is to look for reasonable strong mixing conditions that clearly distinguish short and long memory stationary processes. Mixing coefficients [1,2] are also invariant under one-to-one transformation, but they are difficult to compute and still lack reasonable criteria for clearly distinguishing long memory from short memory processes. It is well known that Shannon entropy is invariant under one-to-one transformation. One can expect to find suitable concepts in information theory for distinguishing short and long memory stationary processes. Mutual information captures the dependence between two random variables: they are independent if and only if the mutual information between them is zero. For a stationary process $X=\{\cdots,X_{-1},X_0,X_1,X_2,\cdots\}$, we can regard $X$ as two random variables: the past $\overleftarrow{X}=\cdots X_{-2}X_{-1}X_0$ and the future $\overrightarrow{X}=X_1X_2\cdots$. The mutual information between $\overleftarrow{X}$ and $\overrightarrow{X}$ is $I_{p-f}(X)$, which represents the information shared between the past and the future of the stationary process $X$ (see Section 3 for the detailed definition of $I_{p-f}$). A stationary process $X$ usually admits a Shannon (differential) entropy rate $h_\mu(X)$ (for a continuous-valued stationary process, the differential entropy rate may be $-\infty$), and the conditional entropy $H(X_n|X_1,\cdots,X_{n-1})\to h_\mu(X)$ (or the conditional differential entropy $h(X_n|X_1,\cdots,X_{n-1})\to h_\mu(X)$) as $n\to\infty$ [6,16]. In what follows, we try to demonstrate that $I_{p-f}$ can be used to distinguish long memory and short memory stationary processes: a stationary process $X$ is long memory if $I_{p-f}(X)=+\infty$, and it is short memory if $I_{p-f}(X)<+\infty$. The mutual information description of long memory is also related to the ten key challenges of the post-Shannon era raised by Huawei [29]. Such an approach is helpful for the following reasons:

      1) The definition of mutual information $I_{p-f}(X)$ requires a weak moment condition rather than the second moment condition $E|X_1|^2<+\infty$, therefore it can be used to detect the long memory behavior of a stationary process with a heavy-tailed distribution;

      2) $I_{p-f}(X)$ can distinguish short and long memory stationary processes, as $\sum_n|r_n|$ does for a process with a bounded second moment (see Section 3 for details);

      3)Ip-f(X) is invariant under one-to-one transformation (Theorem 3.8);

      4) This is closely related to a second moment characterization if the stationary process is Gaussian (Theorem 3.11).For fractional Gaussian noise,Ip-f(X) is infinite if and only if the Hurst parameter isH ∈(1/2,1) (Theorem 3.13);

      5) For a stationary process with finite block entropy, $I_{p-f}(X)$ is equal to the excess entropy $E(X)$ of $X$, which is an intuitive measure of information stored in $X$ (Theorem 4.6) [7,27]. Whether $I_{p-f}(X)$ is finite or not depends on the rate at which the conditional (differential) entropy converges to the (differential) entropy rate $h_\mu(X)$.

      The rest of this paper is organized as follows: in Section 2,we recall some basic concepts about information theory.In Section 3,we give the definition of long memory using mutual information for stationary processes,and show that the mutual information is invariant under one-to-one transformation.Furthermore,we illustrate howIp-fis related to covariance when the stationary process is Gaussian,and prove that,for fractional Gaussian noise,Ip-f=+∞if and only if the Hurst parameter isH ∈(1/2,1).In Section 4,we demonstrate that,for a stationary process with finite block entropy,Ip-fis equal to its excess entropy.

      2 Basic Quantities of Information Theory

      We recall some basic concepts and theorems about information theory from the books [6]and [16].

      The Shannon entropy $H(X)$ of a discrete random variable $X$, taking values $x\in S$, is defined as

      $$H(X)=-\sum_{x\in S}p(x)\log p(x),$$

      where the probability that $X$ takes on the particular value $x$ is written as $p(x)\equiv P(X=x)$.

      Shannon entropy measures the uncertainty of a discrete random variable.

      The joint Shannon entropy $H(X,Y)$ of two discrete random variables $(X,Y)$ is defined as

      $$H(X,Y)=-\sum_{x}\sum_{y}p(x,y)\log p(x,y),$$

      where $p(x,y)$ is the joint distribution of $(X,Y)$.

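      The joint Shannon entropy of $n$ discrete random variables $X_1,X_2,\cdots,X_n$ is

      $$H(X_1,X_2,\cdots,X_n)=-\sum_{x_1,\cdots,x_n}p(x_1,x_2,\cdots,x_n)\log p(x_1,x_2,\cdots,x_n),$$

      where $p(x_1,x_2,\cdots,x_n)$ is the joint distribution of $(X_1,X_2,\cdots,X_n)$.

      The conditional Shannon entropy $H(X|Y)$, which is the entropy of a discrete random variable $X$ that is conditional on the knowledge of another discrete random variable $Y$, is

      $$H(X|Y)=-\sum_{x}\sum_{y}p(x,y)\log p(x|y).$$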

      When $X$ is a continuous random variable with density $f(x)$, the differential entropy $h(X)$ of $X$ is defined as

      $$h(X)=-\int_{S}f(x)\log f(x)\,dx,$$

      where $S$ is the support set of the random variable.

      The differential entropyh(X)may be negative and does not work as an uncertainty measure,but the difference ofh(X)-h(Y) indicates the difference between the uncertainties ofXandY.

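      The joint entropy of $n$ continuous random variables $X_1,X_2,\cdots,X_n$ with density $f$ is defined as

      $$h(X_1,X_2,\cdots,X_n)=-\int f(x_1,\cdots,x_n)\log f(x_1,\cdots,x_n)\,dx_1\cdots dx_n.$$

      Similarly, if $X$ and $Y$ are continuous random variables with a joint density function $f(x,y)$, the joint differential entropy $h(X,Y)$ of $(X,Y)$ is defined as

      $$h(X,Y)=-\iint f(x,y)\log f(x,y)\,dx\,dy,$$

      and the conditional differential entropy $h(X|Y)$ as

      $$h(X|Y)=-\iint f(x,y)\log f(x|y)\,dx\,dy.$$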

      For both Shannon entropy and differential entropy,we have the following properties[6,16]:

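      1) $H(X,Y)=H(X)+H(Y|X)=H(Y)+H(X|Y)$, $h(X,Y)=h(X)+h(Y|X)=h(Y)+h(X|Y)$;

      2) Conditioning reduces entropy: $H(X|Y)\le H(X)$, $h(X|Y)\le h(X)$;

      3) Chain rules: $H(X_1,\cdots,X_n)=\sum_{i=1}^{n}H(X_i|X_{i-1},\cdots,X_1)$, $h(X_1,\cdots,X_n)=\sum_{i=1}^{n}h(X_i|X_{i-1},\cdots,X_1)$.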
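      As a small numerical illustration of the quantities recalled above, the following sketch computes Shannon entropies from a joint probability table and checks the identity $H(X,Y)=H(X)+H(Y|X)$ and the fact that conditioning reduces entropy. The table `p_xy` and the helper `entropy` are hypothetical choices made for this example only; natural logarithms are used.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a probability array; zero cells contribute 0."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Hypothetical joint distribution p(x, y) on a 2 x 3 alphabet (rows: x, columns: y).
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.30, 0.05, 0.25]])
p_x = p_xy.sum(axis=1)  # marginal distribution of X
p_y = p_xy.sum(axis=0)  # marginal distribution of Y

H_xy = entropy(p_xy)
H_x = entropy(p_x)
H_y = entropy(p_y)

# Conditional entropy computed directly from its definition:
# H(Y|X) = -sum_{x,y} p(x,y) log p(y|x).
p_y_given_x = p_xy / p_x[:, None]
H_y_given_x = float(-np.sum(p_xy * np.log(p_y_given_x)))

print("H(X,Y)        =", H_xy)
print("H(X) + H(Y|X) =", H_x + H_y_given_x)
print("H(Y|X) <= H(Y):", H_y_given_x <= H_y)
```

      The same table also gives the mutual information as $H(X)+H(Y)-H(X,Y)$, which is the form used for block entropies later in the paper.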

      A stochastic processX={Xi,i ∈Z} is an indexed sequence of random variables.A stochastic process is (strictly) stationary if the joint probability distribution does not change when shifted in time,i.e.,the distribution of (Xi1+s,Xi2+s,···,Xin+s) is independent ofsfor any positive integernandi1,i2,···,in ∈Z+.

      LetX={Xi,i ∈Z} be a stationary stochastic process.Forn=1,2,···,then-block entropy ofXisHX(n) :=H(X1,X2,···,Xn) ifXis a discrete-valued process,orhX(n) :=h(X1,X2,···,Xn) ifXis a continuous-valued process.For convenience,HX(n) (orhX(n)) is denoted byH(n) (orh(n)),if no confusion occurs,andH(0)=0 (orh(0)=0).

      We say that a stationary process $X$ admits finite block entropy if all of the block entropies $H(n)$ (or $h(n)$) are finite, i.e., $0\le H(n)<\infty$ (or $-\infty<h(n)<\infty$) for all $n\ge1$.

      In this paper,we focus on a stationary process with finite block entropy.

      The following lemma collects some properties of the block (differential) entropy sequence{H(n)} or{h(n)} of a stationary processX:

      Lemma 2.1LetX={Xn,n ∈Z} be a stationary process that admits finite block entropy.We have the following properties:

      1) Nonincreasing entropy gain: $\Delta H(n):=H(n)-H(n-1)=H(X_n|X_{n-1},\cdots,X_1)$ and $\Delta h(n):=h(n)-h(n-1)=h(X_n|X_{n-1},\cdots,X_1)$ are nonincreasing in $n$;

      2) Subadditivity: for all nonnegative integers $m$ and $n$,

      $$H(m+n)\le H(m)+H(n),\qquad h(m+n)\le h(m)+h(n);$$

      3) Monotonicity of entropy per element: both $\{H(n)/n\}$ and $\{h(n)/n\}$ are nonincreasing.

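      Proof Without loss of generality, we only prove the results for Shannon entropy; the differential entropy cases are similar.

      1) By the chain rule, we have that $H(n)=\sum_{i=1}^{n}H(X_i|X_{i-1},\cdots,X_1)$. It follows that $\Delta H(n):=H(n)-H(n-1)=H(X_n|X_{n-1},\cdots,X_1)$, which is nonincreasing because conditioning reduces entropy and $X$ is stationary: $H(X_{n+1}|X_n,\cdots,X_1)\le H(X_{n+1}|X_n,\cdots,X_2)=H(X_n|X_{n-1},\cdots,X_1)$.

      2) Since $\Delta H(n)$ is nonincreasing, by the chain rule, we know that

      $$H(m+n)=H(m)+\sum_{i=m+1}^{m+n}\Delta H(i)\le H(m)+\sum_{i=1}^{n}\Delta H(i)=H(m)+H(n).$$

      3) By the stationarity of $X$, the chain rule and the nonincreasing entropy gain, we have that

      $$\frac{H(n+1)}{n+1}=\frac{H(n)+\Delta H(n+1)}{n+1}\le\frac{H(n)+H(n)/n}{n+1}=\frac{H(n)}{n},$$

      where we used $\Delta H(n+1)\le\frac{1}{n}\sum_{i=1}^{n}\Delta H(i)=H(n)/n$.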
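      The three properties of Lemma 2.1 can be checked numerically in the differential entropy case. The sketch below is only an illustration under stated assumptions: it takes a stationary Gaussian process with a hypothetical MA(1)-type autocovariance ($r(0)=1$, $r(1)=0.4$, $r(k)=0$ otherwise) and uses the standard Gaussian block entropy $h(n)=\frac12\log\big((2\pi e)^n\det K^{(n)}\big)$, where $K^{(n)}$ is the Toeplitz covariance matrix. The function name `gaussian_block_entropy` is introduced here for illustration.

```python
import numpy as np

def gaussian_block_entropy(r, n):
    """h(n) = 0.5 * log((2*pi*e)^n * det K_n) for a stationary Gaussian process
    with autocovariance sequence r (r[k] = covariance at lag k); h(0) = 0."""
    if n == 0:
        return 0.0
    idx = np.arange(n)
    K = r[np.abs(idx[:, None] - idx[None, :])]  # Toeplitz covariance matrix
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * (n * np.log(2 * np.pi * np.e) + logdet)

N = 12
r = np.zeros(2 * N)
r[0], r[1] = 1.0, 0.4  # hypothetical autocovariance, positive definite since |r(1)| < 1/2

h = np.array([gaussian_block_entropy(r, n) for n in range(2 * N + 1)])
gain = np.diff(h)  # gain[n-1] = h(n) - h(n-1), the entropy gain

print("entropy gains:", np.round(gain[:6], 4))
print("h(n)/n       :", np.round(h[1:7] / np.arange(1, 7), 4))
```

      In this example the entropy gain decreases toward the differential entropy rate, and $h(n)/n$ decreases as well, as the lemma predicts.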

      Remark 2.2 1) Since $H(X_n|X_{n-1},\cdots,X_1)\ge0$, the entropy gain $\Delta H(n)$ is nonnegative, but $\Delta h(n)=h(X_n|X_{n-1},\cdots,X_1)$ is not always nonnegative.

      2) If $X$ is a discrete-valued stationary process, $X$ admits finite block entropy if and only if $H(X_1)=H(1)<+\infty$, because the fact that $\{H(n)\}$ is subadditive implies that $H(n)\le nH(1)<+\infty$. In particular, if $E|X_1|^{\delta}<+\infty$ for some $\delta>0$, then $H(X_1)<+\infty$ [3,9]. Thus $H(X_1)<+\infty$ is weaker than the second moment condition $E|X_1|^2<+\infty$. For a continuous-valued process, we should avoid the case of $h(n)=-\infty$, so the condition of finite block entropy is required.

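      3) Suppose that $X$ is a continuous-valued stationary process, and that $X$ does not admit finite block entropy. Let $k$ be the minimal positive integer such that $h(k)=-\infty$. Then $h(n)=-\infty$ for $n\ge k$. In fact, by the subadditivity of $\{h(n)\}$, $h(mk)\le mh(k)=-\infty$ indicates that $h(mk)=-\infty$. If $n=mk+s$, $m\ge1$, $0<s<k$, then, again by subadditivity, $h(n)\le h(mk)+h(s)=-\infty$.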

      3 Long Memory and Mutual Information

      In this section,we give the definition of long memory by using mutual information,and discuss the rationale behind this definition.

      3.1 Definition of Long Memory

      The mutual information between two random variablesXandYis defined as follows [16]:

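      Definition 3.1 Let $X$ and $Y$ be random variables taking values in $\mathcal{X}$ and $\mathcal{Y}$. Let $\mu_X$ and $\mu_Y$ be the probability measures of $X$ and $Y$ on the measurable spaces $(\mathcal{X},\mathcal{B}(\mathcal{X}))$ and $(\mathcal{Y},\mathcal{B}(\mathcal{Y}))$. $\mu_{XY}$ is the joint probability measure of $X$ and $Y$ on the measurable space $(\mathcal{X}\times\mathcal{Y},\mathcal{B}(\mathcal{X})\times\mathcal{B}(\mathcal{Y}))$. $\mu_X\times\mu_Y$ is the product probability measure of $X$ and $Y$ on the same measurable space. The mutual information $I(X;Y)$ between $X$ and $Y$ is defined as

      $$I(X;Y)=\int_{\mathcal{X}\times\mathcal{Y}}\log\frac{d\mu_{XY}}{d(\mu_X\times\mu_Y)}\,d\mu_{XY}$$

      if $\mu_{XY}\ll\mu_X\times\mu_Y$ (i.e., $\mu_{XY}$ is absolutely continuous with respect to $\mu_X\times\mu_Y$); otherwise $I(X;Y)$ is infinite.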

      The mutual informationI(X;Y) is the relative entropy (or Kullback-Leibler divergence)between the joint probability measureμXYand the product probability measureμX×μY.

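      Suppose that $X$ and $Y$ are two discrete random variables with a joint distribution $p(x,y)$ and marginal distributions $p(x)$ and $p(y)$. Then the Radon-Nikodym derivative is given by

      $$\frac{d\mu_{XY}}{d(\mu_X\times\mu_Y)}(x,y)=\frac{p(x,y)}{p(x)p(y)},$$

      and $I(X;Y)$ can be written as

      $$I(X;Y)=\sum_{x}\sum_{y}p(x,y)\log\frac{p(x,y)}{p(x)p(y)}.$$

      It follows that

      $$I(X;Y)=H(X)+H(Y)-H(X,Y)=H(X)-H(X|Y).$$

      For continuous random variables $X$ and $Y$ with a joint density function $f(x,y)$ and marginal density functions $f(x)$ and $f(y)$,

      $$I(X;Y)=\iint f(x,y)\log\frac{f(x,y)}{f(x)f(y)}\,dx\,dy=h(X)+h(Y)-h(X,Y).$$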

      Let $P=\{P_1,\cdots,P_k\}$ be a finite partition of the image of the random variable $X$. The quantization of $X$ with partition $P$ is the discrete random variable $[X]_P$ defined by

      $$P([X]_P=i)=P(X\in P_i),\qquad i=1,\cdots,k.$$

      Thus the joint distribution of two quantizations $[X]_P$ and $[Y]_Q$ is defined as

      $$P([X]_P=i,[Y]_Q=j)=P(X\in P_i,Y\in Q_j),$$

      where $Q=\{Q_1,\cdots,Q_s\}$ is a finite partition of the image of the random variable $Y$.

      An equivalent definition of mutual information using quantization[6]gives the relationship between the mutual information of discrete random variables and that of continuous random variables.

      Definition 3.2 The mutual information between two random variables $X$ and $Y$ is defined as

      $$I(X;Y)=\sup_{P,Q}I([X]_P;[Y]_Q),\tag{3.1}$$

      where the supremum is taken over all finite partitions $P$ and $Q$.

      We collect some useful properties about mutual information which immediately follow from Theorem 1.6.3 of [16].

      Lemma 3.3Suppose thatX,Y,Zare three random variables.Then the following properties hold:

      1)I(X;Y)=I(Y;X);

      2)I(X;Y)≥0;

      3)I(X;(Y,Z))≥I(X;Z).

      The mutual information of a stationary process between the past and future is defined as follows (see [18]for the stationary Gaussian process):

      Definition 3.4 Let $X=\{X_n,n\in\mathbb{Z}\}$ be a stationary process with finite entropy $H(n)$ (or $h(n)$) for each $n\in\mathbb{Z}^+$. Let $I_{p-f}(X,n):=I(X_{-(n-1)}\cdots X_0;X_1\cdots X_n)$ be the mutual information between the past and future with length $n$. The mutual information between the past and future $I_{p-f}(X)$ is defined as

      $$I_{p-f}(X):=\lim_{n\to\infty}I_{p-f}(X,n).$$

      For convenience,Ip-f(X) andIp-f(X,n) are denoted byIp-fandIp-f(n) as long as no confusion occurs.

      Remark 3.51) For convenience,we set thatIp-f(0)=0.

      2) By Lemma 3.3 and the stationarity ofX,Ip-f(n)=I(X1···Xn;Xn+1···X2n).This is nonnegative and nondecreasing,soIp-fis infinite or a nonnegative constant.

      3) The mutual information between the past and future can also be defined as $I_{p-f}(X)=\lim_{n,m\to\infty}I_{p-f}(X,n,m)$, where $I_{p-f}(X,n,m):=I(X_1\cdots X_n;X_{n+1}\cdots X_{n+m})$. Following from the monotonicity of the mutual information (see the third claim of Lemma 3.3), this definition is equivalent to Definition 3.4.

      4) The definition of the mutual informationIp-frequires finite entropy,which only needs a weak moment condition rather than the second moment condition.For more details on this,for a random variableX,if E|X|δ<+∞for someδ>0,thenH(X)<+∞,since

      where Γ is the Gamma function [3,9,26].

      Now we distinguish the long memory and short memory of a stationary process by using $I_{p-f}$.

      Definition 3.6A stationary process is long memory ifIp-fis infinite,and it is short memory ifIp-fis finite.

      The definition of long memory from the perspective ofIp-fwas discussed in the case of a stationary Gaussian process by Li [18].

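      The information gain $S_n$ is defined as follows: for $n\ge1$, let $S_n:=\Delta I_{p-f}(n)=I_{p-f}(n)-I_{p-f}(n-1)$ be the $n$th information gain. Since $I_{p-f}(n)$ is nondecreasing, $S_n\ge0$. We have that

      $$I_{p-f}=\sum_{n=1}^{\infty}S_n.$$

      In practice, $I_{p-f}$ is usually approximated by $I_{p-f}(N)$ for some large positive integer $N$. In this case, the acquired information is $\sum_{n=1}^{N}S_n=I_{p-f}(N)$, and the missed information is $\sum_{n=N+1}^{\infty}S_n=I_{p-f}-I_{p-f}(N)$.

      If $I_{p-f}<+\infty$, as $N\to+\infty$, the ratio of the acquired information to the missed information is

      $$AMR(N):=\frac{\sum_{n=1}^{N}S_n}{\sum_{n=N+1}^{\infty}S_n}\to+\infty.$$

      If $I_{p-f}=+\infty$, no matter what $N$ is, the ratio of the acquired information to the missed information is

      $$AMR(N)=\frac{\sum_{n=1}^{N}S_n}{\sum_{n=N+1}^{\infty}S_n}=0.$$

      One can see that there is a significant difference in the asymptotic behavior of $\{AMR(N)\}$ between stationary processes with $I_{p-f}<+\infty$ and those with $I_{p-f}=+\infty$.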

      3.2 Invariance Under One-to-One Transformation

      In what follows,we show that the mutual informationIp-fis invariant under one-to-one transformation.

      Lemma 3.7Letg1andg2be two one-to-one transformations.XandYare two random variables.We have thatI(X;Y)=I(g1(X);g2(Y)).

      ProofLetP=(P1,···,Pk) andQ=(Q1,···,Qm) be partitions of the ranges of the random variablesXandY,respectively.Sinceg1andg2are one-to-one transformations,g1(P)=(g1(P1),···,g1(Pk)) andg2(Q)=(g2(Q1),···,g2(Qm)) are partitions of the ranges of random variablesg1(X) andg2(Y),respectively.

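      Observe that, for all $i\in\{1,2,\cdots,k\}$ and $j\in\{1,2,\cdots,m\}$,

      $$P(g_1(X)\in g_1(P_i),g_2(Y)\in g_2(Q_j))=P(X\in P_i,Y\in Q_j).\tag{3.2}$$

      By eq. (3.1) and eq. (3.2), we know that

      $$I([X]_P;[Y]_Q)=I([g_1(X)]_{g_1(P)};[g_2(Y)]_{g_2(Q)})\le I(g_1(X);g_2(Y)).$$

      Taking the supremum over $P$ and $Q$, we obtain $I(X;Y)\le I(g_1(X);g_2(Y))$.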

      By symmetry arguments,I(g1(X);g2(Y))≤I(X;Y).

      Lemma 3.7 is proven.

      Theorem 3.8LetX={Xn,n ∈Z} be a stationary stochastic process,and letgbe a one-to-one transformation.Then the mutual information between the past and futureIp-fof the processY:=g(X)={g(Xn),n ∈Z} is equal to that ofX.

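      Proof Since $g$ is a one-to-one transformation, by Lemma 3.7, we have that

      $$I(X_{-(n-1)}\cdots X_0;X_1\cdots X_n)=I(g(X_{-(n-1)})\cdots g(X_0);g(X_1)\cdots g(X_n)),$$

      which implies that $I_{p-f}(X,n)=I_{p-f}(Y,n)$ for all $n\ge0$.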

      As a result,the stationary processesXandYadmit the sameIp-f.

      The significance of the definition of long memory lies in consistency: if there is a one-to-one correspondence between two stationary processes,then either both of them are long memory or neither of them is long memory.
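      For discrete random variables, a one-to-one transformation is simply a relabeling of the alphabet, so Lemma 3.7 can be illustrated by permuting the rows and columns of a joint probability table. The sketch below uses a randomly generated, hypothetical table; the helper `mutual_information` is introduced for this example. It also shows that a map that is not one-to-one (merging two symbols of $X$) can only lose information.

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in nats from a joint probability table (rows: x, columns: y)."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

rng = np.random.default_rng(0)
p_xy = rng.random((4, 5))
p_xy /= p_xy.sum()  # hypothetical joint distribution on a 4 x 5 alphabet

# One-to-one maps g1, g2: relabel the symbols of X and of Y.
perm_x = rng.permutation(4)
perm_y = rng.permutation(5)
p_relabelled = p_xy[np.ix_(perm_x, perm_y)]

# A map that is NOT one-to-one: merge the first two symbols of X.
p_merged = np.vstack([p_xy[0] + p_xy[1], p_xy[2:]])

print("I(X;Y)            =", mutual_information(p_xy))
print("I(g1(X);g2(Y))    =", mutual_information(p_relabelled))
print("I(merged X; Y)    =", mutual_information(p_merged))
```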

      3.3 Stationary Gaussian Process

      In this subsection,we discuss the relationship between the definitions of long memory in the sense of mutual information and of covariance for the stationary Gaussian process.

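      Let $X=\{X_n,n\in\mathbb{Z}\}$ be a zero-mean stationary Gaussian stochastic process. $X$ can be completely characterized by its correlation function $r_{k,j}=r_{k-j}=E[X_kX_j]$, or equivalently by its power spectral density $f(\lambda)$, which is the Fourier transform of the covariance function

      $$f(\lambda)=\frac{1}{2\pi}\sum_{k=-\infty}^{\infty}r_ke^{-ik\lambda},\qquad\lambda\in[-\pi,\pi].$$

      Set that

      $$b_k=\frac{1}{2\pi}\int_{-\pi}^{\pi}\log f(\lambda)\,e^{-ik\lambda}\,d\lambda$$

      for any integer $k$ if it is well defined. Here $\{b_k\}$ are referred to as cepstrum coefficients [18].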

      The following Lemma 3.9 was proven by Li in [18]:

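      Lemma 3.9 Let $X=\{X_n,n\in\mathbb{Z}\}$ be a stationary Gaussian process [18].

      1) $I_{p-f}$ is finite if and only if the cepstrum coefficients satisfy the condition that $\sum_{k=1}^{\infty}kb_k^2<\infty$. In this case,

      $$I_{p-f}=\frac{1}{2}\sum_{k=1}^{\infty}kb_k^2.$$

      2) If the spectral density $f(\lambda)$ is continuous, and $f(\lambda)>0$, then $I_{p-f}$ is finite if and only if the autocovariance functions satisfy the condition that

      $$\sum_{k=1}^{\infty}kr_k^2<\infty.$$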

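      Lemma 3.10 Let $\{c(n)\}$ be a decreasing positive series. Then

      $$\sum_{n=1}^{\infty}c(n)<\infty\Longrightarrow\sum_{n=1}^{\infty}nc(n)^2<\infty.$$

      Proof Since $\{c(n)\}$ is a decreasing positive series and $\sum_{n}c(n)$ converges, $\{nc(n)\}\to0$. Thus $\{nc(n)\}$ is bounded by a constant $M$. Therefore,

      $$\sum_{n=1}^{\infty}nc(n)^2\le M\sum_{n=1}^{\infty}c(n)<\infty.$$

      Remember that for a stationary Gaussian process $X$ with autocovariance $\{r_n\}$, $X$ is long memory if $\sum_{n}|r_n|=\infty$. The following result shows the relationship between $\sum_{n}|r_n|$ and $I_{p-f}$ for a stationary Gaussian process:

      Theorem 3.11 Let $X=\{X_n,n\in\mathbb{Z}\}$ be a stationary Gaussian process with decreasing autocovariance $\{|r_n|\}$. Suppose that the spectral density $f(\lambda)$ is continuous and that $f(\lambda)>0$. Then

      $$\sum_{n}|r_n|<\infty\Longrightarrow I_{p-f}<\infty.$$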

      In other words,the fact thatXis not long memory in the sense of covariance implies that it is also not long memory in the sense of mutual information.

      The proof of Theorem 3.11 immediately follows from Lemmas 3.9 and 3.10.

      Remark 3.12 Setting that $a(n)=\frac{1}{n\log n}$ for $n\ge2$, we have that $\sum_{n}na(n)^2=\sum_{n}\frac{1}{n\log^2n}<\infty$, but also that $\sum_{n}a(n)=\infty$. Hence the converse implication in Lemma 3.10 is not true. Thus, for the processes considered in Theorem 3.11, long memory in the sense of mutual information is stricter than that in the sense of covariance.

      3.4 Fractional Gaussian Noise

      Fractional Brownian motion (FBM) has been widely applied to model a large number of natural shapes and phenomena. An FBM with the Hurst parameter $H\in(0,1)$ is a centered continuous-time Gaussian process $B_H(\cdot)$ with the covariance function

      $$E[B_H(s)B_H(t)]=\frac{1}{2}\left(|s|^{2H}+|t|^{2H}-|t-s|^{2H}\right)$$

      for $s,t\ge0$. $B_H$ reduces to an ordinary Brownian motion for $H=1/2$.

      The incremental process of an FBM is a stationary discrete-time process and is called fractional Gaussian noise (FGN). The autocovariance function of FGN $X=\{X_k:k=0,1,\cdots\}$ can be derived as follows:

      $$\rho_k=E[X_0X_k]=\frac{1}{2}\left(|k+1|^{2H}-2|k|^{2H}+|k-1|^{2H}\right).$$

      It is plain to see that

      $$\rho_k\sim H(2H-1)|k|^{2H-2}\tag{3.4}$$

      as $|k|\to\infty$. Of course, if $H=1/2$, then $\rho_k=0$ for all $k\ge1$ (a Brownian motion has independent increments). One can conclude that the summability of correlations ($\sum_{k}|\rho_k|<+\infty$) holds when $0<H\le1/2$ and fails when $1/2<H<1$. Thus FGN with $H>1/2$ has become commonly accepted as having long memory, and a lack of the summability of correlations has become popular as part of the definition of long memory.
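      The autocovariance formula and its power-law asymptotics can be checked numerically. The following sketch assumes unit-variance FGN (no scale parameter), and the function names `fgn_autocov` and `abs_partial_sum` are introduced here for illustration. It compares $\rho_k$ with $H(2H-1)k^{2H-2}$ at a large lag and prints partial sums of $|\rho_k|$, which stay bounded for $H<1/2$ and keep growing for $H>1/2$.

```python
import numpy as np

def fgn_autocov(k, H):
    """Autocovariance of unit-variance fractional Gaussian noise at lag k."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + np.abs(k - 1) ** (2 * H))

def abs_partial_sum(H, n):
    """Partial sum of |rho_k| over lags k = 1, ..., n."""
    k = np.arange(1, n + 1)
    return float(np.sum(np.abs(fgn_autocov(k, H))))

lag = 1000
for H in (0.3, 0.5, 0.8):
    exact = float(fgn_autocov(lag, H))
    asymptotic = H * (2 * H - 1) * lag ** (2 * H - 2)
    print(f"H={H}: rho_k={exact:.3e}, H(2H-1)k^(2H-2)={asymptotic:.3e}, "
          f"sum to 1e3={abs_partial_sum(H, 10**3):.4f}, "
          f"sum to 1e5={abs_partial_sum(H, 10**5):.4f}")
```

      A finite partial sum cannot prove divergence, of course; the printout only illustrates the different growth behavior implied by (3.4).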

      The following result shows that long memory in the sense of mutual information is equivalent to that in the sense of covariance for FGN:

      Theorem 3.13 Let $X=\{X_n,n\in\mathbb{Z}\}$ be the (discrete) increment process of a fractional Brownian motion with Hurst parameter $H\in(0,1)$ and $H\neq1/2$. Then

      $$I_{p-f}=+\infty\iff H\in(1/2,1).$$

      Remark 3.14This theorem shows that,for fractional Gaussian noise,the informatic characterization of long memory is identical to second moment characterization.
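      For a Gaussian process, the finite-block quantity $I_{p-f}(n)=I(X_1\cdots X_n;X_{n+1}\cdots X_{2n})$ can be evaluated directly from covariance matrices, since $h(n)=\frac12\log\big((2\pi e)^n\det K^{(n)}\big)$ gives $I_{p-f}(n)=2h(n)-h(2n)=\frac12\log\frac{(\det K^{(n)})^2}{\det K^{(2n)}}$. The sketch below is a numerical illustration only, assuming unit-variance FGN and restricting attention to $H=1/2$ and $H>1/2$; the function names are introduced for this example. It shows $I_{p-f}(n)=0$ for $H=1/2$ and a continued increase with $n$ for $H>1/2$.

```python
import numpy as np

def fgn_cov_matrix(n, H):
    """n x n Toeplitz covariance matrix of unit-variance fractional Gaussian noise."""
    k = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + np.abs(k - 1) ** (2 * H))

def past_future_mi(n, H):
    """I_{p-f}(n) = 0.5 * (2 log det K_n - log det K_{2n}) for Gaussian blocks (nats)."""
    _, logdet_n = np.linalg.slogdet(fgn_cov_matrix(n, H))
    _, logdet_2n = np.linalg.slogdet(fgn_cov_matrix(2 * n, H))
    return 0.5 * (2 * logdet_n - logdet_2n)

block_lengths = [4, 16, 64, 256]
mi = {H: [past_future_mi(n, H) for n in block_lengths] for H in (0.5, 0.7, 0.9)}

for H, values in mi.items():
    print(f"H={H}:", [round(v, 4) for v in values])
```

      Finite $n$ cannot settle whether the limit is infinite; the computation only visualizes how slowly $I_{p-f}(n)$ grows in the long memory regime.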

      Proof The spectral density of the increment of fractional Brownian motion $B_H(t)$ (fractional Gaussian noise) was obtained by Sinai [24] as

      where Γ(.) denotes the Gamma function and

      for-π ≤λ ≤π.

      This spectral density can be rewritten as

      whereCis a positive constant.

      It can be seen that the spectral density of FGN is positive.

      The spectral density $f(\lambda)$ is proportional to $|\lambda|^{1-2H}$ near $\lambda=0$. Thus, when $H\in(0,1/2)$, $f(\lambda)$ is continuous, but when $H\in(1/2,1)$, it is not continuous at $\lambda=0$.

      Notice that ifH=1/2,the FGN is the increment of classical Brownian motion,and it follows thatIp-f=0.

      The theorem will be proven in two steps.

      • Step 1: $H\in(0,1/2)\Longrightarrow I_{p-f}<\infty$.

      When $H\in(0,1/2)$, the spectral density of FGN is positive and continuous, and by eq. (3.4), $\sum_{k}|\rho_k|<\infty$. Following on from Theorem 3.11, we have that $I_{p-f}<\infty$. We have proven that $H\in(0,1/2)\Longrightarrow I_{p-f}<\infty$.

      • Step 2: $H\in(1/2,1)\Longrightarrow I_{p-f}=\infty$.

      Since the spectral density of FGN forH ∈(1/2,1) is not continuous whenλ=0,we prove it by the first claim of Lemma 3.9,which does not require continuous spectral density.

      Now we estimate the mutual information of FGN via the logarithm of the spectral density.We have that

      logf(λ) is an even function on [-π,π],i.e.,logf(λ)=logf(-λ).Forn≠ 0,we obtain the following decomposition:

      Forb1(n),we get that

      Thusb1(n)~asngoes to infinity.

      For the twice continuously differentiable functiong,the Fourier coefficient of ordernbehaves like[25].Observe that log[(1+g1(λ))/2]is twice differentiable,so there exists a positive constantM1<∞such that

      Now we estimateb3(n).Denote that

      We have that

      Since whenH>1/2,|x|2H-1,|x|2H,|x|2H+1,AH(x),(x),(x) are continuous functions on [-π,π],(x) are also continuous functions on [-π,π].We conclude that

      for some positive constantM2<∞.

      Combining the estimations ofb1(n),b2(n),b3(n),

      It follows that there exists a positive integerN0such that|b(n)|≥forn>N0.

      Hence, by Lemma 3.9, the mutual information is

      $$I_{p-f}=\frac{1}{2}\sum_{n=1}^{\infty}n\,b(n)^2=+\infty.$$

      Remark 3.15From the proof of Theorem 3.13,it can be seen that,forH>1/2,|b(n)|~asn →+∞,which implies that

      4 Excess Entropy

      In this section,we try to relateIp-fwith excess entropy,which is an intuitive measure of memory stored in a stationary stochastic process,so that we can obtain a calculation ofIp-f.

      First we recall the definition of entropy rate of a stochastic processX={Xn} [16].

      Definition 4.1LetX={Xn,n ∈Z} be a stochastic process.

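      1) Assume that each $X_n$ is a discrete random variable. The Shannon entropy rate of $X$ is defined by

      $$h_\mu(X)=\lim_{n\to\infty}\frac{H(X_1,\cdots,X_n)}{n}$$

      when the limit exists.

      2) Assume that $(X_1,\cdots,X_n)$ has a continuous joint distribution for each $n\in\mathbb{Z}^+$. The differential entropy rate of $X$ is defined by

      $$h_\mu(X)=\lim_{n\to\infty}\frac{h(X_1,\cdots,X_n)}{n}$$

      when the limit exists.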

      It is known that for a stationary process with finite block entropy, the Shannon (differential) entropy rate always exists [6,16]. In what follows, when we mention the entropy rate $h_\mu$ of a stationary process $X$ with finite block entropy, $h_\mu$ is a Shannon entropy rate if $X$ is a discrete-valued process, and it is a differential entropy rate if $X$ is a continuous-valued process.

      If $X$ is a discrete-valued stationary process, by Lemma 2.1, $\Delta H(n)=H(n)-H(n-1)$ is nonnegative and nonincreasing. Then the limit of $\Delta H(n)$ exists and is finite, and it is equal to the entropy rate $h_\mu$ because

      $$h_\mu=\lim_{n\to\infty}\frac{H(n)}{n}=\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\Delta H(i)=\lim_{n\to\infty}\Delta H(n).$$

      If $X$ is a continuous-valued stationary process, by Lemma 2.1, $\Delta h(n)=h(n)-h(n-1)$ is nonincreasing and may be negative. Because the entropy rate $h_\mu$ is equal to $\lim_{n\to\infty}\Delta h(n)$, $h_\mu$ exists and may be $-\infty$ [23]. Details of the differential entropy rate can be found in [10,16].

      We show an example for different values ofhμin the case of a continuous-valued stationary process.

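      Example 1 Supposing that $\{X_n\}$ is a stationary Gaussian process, we have the joint entropy [6]

      $$h(X_1,\cdots,X_n)=\frac{1}{2}\log\left((2\pi e)^n|K^{(n)}|\right),$$

      where $K^{(n)}$ is the covariance matrix with elements $K^{(n)}_{ij}=r(i-j)=E(X_i-EX_i)(X_j-EX_j)$. Thus it is Toeplitz with entries $r(0),r(1),\cdots,r(n-1)$ along the top row. The density of eigenvalues of $K^{(n)}$ tends to the spectrum of the process as $n\to\infty$. It has been shown by Kolmogorov that the differential entropy rate of a stationary Gaussian process can be given by

      $$h_\mu=\frac{1}{2}\log(2\pi e)+\frac{1}{4\pi}\int_{-\pi}^{\pi}\log S(\lambda)\,d\lambda,$$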

      whereS(λ) is the power spectral density of the stationary Gaussian processX.On the other hand,

      A significant property of the entropy rate is the AEP (asymptotic equipartition property), also known as the Shannon-McMillan-Breiman theorem, which states that if $h_\mu$ is the entropy rate of a finite-valued stationary ergodic process $\{X_n\}$, then

      $$-\frac{1}{n}\log p(X_1,\cdots,X_n)\to h_\mu$$

      with probability 1. The entropy rate $h_\mu$ quantifies the irreducible randomness in sequences produced by a stationary source: the randomness that remains after the correlations and structures in longer and longer sequence blocks are taken into account.

      For a discrete-valued stationary process $X$, by the definition of the entropy rate,

      $$h_\mu=\lim_{n\to\infty}\frac{H(n)}{n}.$$

      However, the value $h_\mu$ indicates nothing about how $H(n)/n$ approaches this limit. Moreover, there may be sublinear terms in $H(n)$. For example, one may have $H(n)\sim nh_\mu+c$ or $H(n)\sim nh_\mu+\log n$. The sublinear terms in $H(n)$ and the manner in which $H(n)$ converges to its asymptotic form may reveal important structural properties about a stationary process.

      Definition 4.2LetX={Xn,n ∈Z} be a stationary process with finite block entropy,andhμbe the entropy rate ofX.Then,

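      1) if $X$ is a discrete-valued process, the excess entropy of $X$ is

      $$E(X)=\sum_{n=1}^{\infty}\left(\Delta H(n)-h_\mu\right);$$

      2) if $X$ is a continuous-valued process, the excess entropy of $X$ is

      $$E(X)=\sum_{n=1}^{\infty}\left(\Delta h(n)-h_\mu\right).$$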

      Remark 4.3 This definition follows from [8]. Note that for a discrete-valued stationary process and $n\in\mathbb{Z}^+$,

      $$\Delta H(n)-h_\mu=H(X_n|X_{n-1},\cdots,X_1)-h_\mu\ge0.$$

      Then $\{\Delta H(n)-h_\mu\}_n$ is a monotonically nonincreasing and nonnegative sequence and converges to 0. Moreover, the excess entropy is $E=+\infty$ or a nonnegative constant.

      For a continuous-valued stationary process and $n\in\mathbb{Z}^+$,

      $$\Delta h(n)-h_\mu=h(X_n|X_{n-1},\cdots,X_1)-h_\mu.$$

      If the entropy rate $h_\mu>-\infty$, $\{\Delta h(n)-h_\mu\}_n$ is a monotonically nonincreasing and nonnegative sequence and converges to 0.

      $\Delta H(n)-h_\mu=H(X_n|X_1,\cdots,X_{n-1})-h_\mu$ is referred to as a per-symbol redundancy $r(n)$, because it tells us how much additional information must be gained about the process in order to reveal the actual per-symbol uncertainty $h_\mu$. In other words, the excess entropy $E$ is the summation of per-symbol redundancies [8]. Note that for a stationary process with finite block entropy and a finite entropy rate, the conditional entropy $H(X_n|X_{n-1},\cdots,X_1)$ (or $h(X_n|X_{n-1},\cdots,X_1)$) converges decreasingly to $h_\mu$; the excess entropy $E<+\infty$ if the rate of convergence is fast, and $E=+\infty$ if the rate of convergence is slow. Substituting $\Delta H(n)=H(n)-H(n-1)$ into the definition of the excess entropy, we know that

      $$E=\lim_{n\to\infty}\left(H(n)-nh_\mu\right).$$

      If the excess entropy is finite, we obtain that $H(n)\approx nh_\mu+E$ as $n\to\infty$.

      Using the notion of entropy rate,one can see that the past has little to say about the future.

      Proposition 4.4 If $X$ is a stationary process with finite block entropy, and the entropy rate $h_\mu>-\infty$, then

      $$\lim_{n\to\infty}\frac{I_{p-f}(n)}{n}=0.$$

      Thus,the dependence between adjacentn-blocks of a stationary process with bounded entropy rate does not grow linearly withn.

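      Proof Suppose that $X$ is a discrete-valued process. By definition,

      $$I_{p-f}(n)=I(X_1\cdots X_n;X_{n+1}\cdots X_{2n})=H(X_1,\cdots,X_n)+H(X_{n+1},\cdots,X_{2n})-H(X_1,\cdots,X_{2n}).$$

      Since $X$ is stationary, we obtain that $I_{p-f}(n)=2H(n)-H(2n)$. It follows that

      $$\lim_{n\to\infty}\frac{I_{p-f}(n)}{n}=2\lim_{n\to\infty}\frac{H(n)}{n}-2\lim_{n\to\infty}\frac{H(2n)}{2n}=2h_\mu-2h_\mu=0.$$

      If $X$ is a continuous-valued process with bounded entropy rate $h_\mu>-\infty$, the proof is similar.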

      We give two examples for values of excess entropyE.

      Example 2 For independent and identically distributed discrete-valued processes, the entropy rate is $h_\mu=H(1)$, since $H(n)=nH(1)$, and thus the excess entropy is $E=0$.

      Example 3 For an irreducible positive recurrent Markov chain $X$ defined on a countable number of states, given the transition matrix $P_{ij}$, the entropy rate of $X$ is given by

      $$h_\mu=-\sum_{i,j}u_iP_{ij}\log P_{ij},$$

      where $u_i$ is the stationary distribution of the chain. By the Markovian property, $\Delta H(n)=h_\mu$ for $n\ge2$, so the excess entropy of the Markov chain is

      $$E=H(X_1)-h_\mu=-\sum_{i}u_i\log u_i+\sum_{i,j}u_iP_{ij}\log P_{ij}.$$
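      The Markov chain example can be verified by brute force on a small state space. The sketch below uses a hypothetical 3-state transition matrix `P`, obtains the stationary distribution by power iteration, and compares the excess entropy $H(X_1)-h_\mu$ with the finite-block mutual information $2H(n)-H(2n)$, where the block entropies are computed by enumerating all words. All names and numerical values are assumptions made for this illustration.

```python
import itertools
import numpy as np

# Hypothetical transition matrix of an irreducible 3-state Markov chain.
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.6, 0.3],
              [0.4, 0.1, 0.5]])

# Stationary distribution u (u = u P) by power iteration.
u = np.full(3, 1 / 3)
for _ in range(1000):
    u = u @ P

h_mu = float(-np.sum(u[:, None] * P * np.log(P)))  # entropy rate
H1 = float(-np.sum(u * np.log(u)))                 # H(X_1)
E = H1 - h_mu                                      # excess entropy of the chain

def block_entropy(n):
    """H(n) by enumerating all words of length n of the stationary chain."""
    H = 0.0
    for word in itertools.product(range(3), repeat=n):
        p = u[word[0]]
        for a, b in zip(word, word[1:]):
            p *= P[a, b]
        H -= p * np.log(p)
    return H

for n in (1, 2, 3):
    print(n, 2 * block_entropy(n) - block_entropy(2 * n), E)
```

      For a first-order Markov chain the past-future mutual information already saturates at $n=1$, which is why every printed row agrees with $E$.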

      Finally we discuss the relationship between the excess entropy and the mutual information.One useful Lemma is given below.

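      Lemma 4.5 Let $\{a_n,n\in\mathbb{Z}^+\}$ be a nonincreasing positive series and let $A_n=\sum_{i=1}^{n}a_i$ and $B_n=\sum_{i=1}^{n}a_i-na_n$. Suppose that $\lim_{n\to\infty}a_n=0$. Then $A_n$ is convergent if and only if $B_n$ is convergent, and these have the same limit when they are convergent.

      Proof Suppose that $A_n$ is convergent. Since $\{a_n\}$ is nonincreasing and positive, $na_n\to0$, so $B_n=A_n-na_n$ converges to the same limit.

      Suppose that $B_n$ is convergent. Noticing that

      $$B_{n+s}-B_n=\sum_{i=n+1}^{n+s}(a_i-a_{n+s})+n(a_n-a_{n+s})\ge n(a_n-a_{n+s})\ge0,$$

      it follows that $\forall\varepsilon>0$, $\exists N>0$, $\forall n,s\in\mathbb{N}$, $n>N$,

      $$n(a_n-a_{n+s})\le B_{n+s}-B_n<\varepsilon.$$

      As a result, letting $s\to\infty$,

      $$na_n\le\varepsilon.$$

      We obtain, for $n>N$, that

      $$0\le A_n-B_n=na_n\le\varepsilon.$$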

      The lemma is proven.

      The following result shows that the excess entropy and the mutual information are identical for a stationary process with finite block entropy:

      Theorem 4.6 Let $X=\{X_n,n\in\mathbb{Z}\}$ be a stationary process with finite block entropy. Then the excess entropy and the mutual information are identical:

      $$E=I_{p-f}.$$

      Proof We prove the result for a discrete-valued stationary process and a continuous-valued process, respectively.

      First, we suppose that X is a discrete-valued process. Denote

      Denote a_n = ΔH(n) - h_μ. By the definition of the entropy rate, we know that a_n ≥ 0 and a_n ≥ a_{n+1} for every positive integer n, and that a_n → 0 as n → ∞. Furthermore, we have that

      We conclude that

      In fact, the first inequality follows from the fact that {a_n} is nonnegative, and the second inequality follows from the fact that {a_n} is nonincreasing.

      Since E_n is the partial sum of the nonnegative series {a_n}, E_n is nondecreasing. Notice that

      and that both I_{p-f}(n) and D_n are nondecreasing. It follows that the following three limits exist:

      By (4.2) and Lemma 4.5, E_n and D_n are convergent at the same time, and have the same limit. Furthermore, we have that

      When E_n and D_n are both divergent, since E_n and D_n are nondecreasing, by (4.2), we have that

      The theorem is proven for a discrete-valued process.

      Now we suppose that X is a continuous-valued stationary process.

      If the differential entropy rate h_μ is finite, i.e., h_μ > -∞, set a_n = Δh(n) - h_μ, n = 1, 2, ···. Then {a_n} is a nonnegative and nonincreasing sequence that converges to 0. By the same argument as for the discrete-valued process, one can show that I_{p-f} = E.

      If the differential entropy rate h_μ is infinite, i.e., h_μ = -∞, we have that a_n = Δh(n) - h_μ = ∞ for n = 1, 2, ···, because X admits finite block entropy. By the definition of excess entropy,

      $$E = \sum_{n=1}^{\infty} \left[\Delta h(n) - h_\mu\right] = +\infty.$$

      On the other hand, setting that are the past and future of X, by the third claim in Lemma 2,

      Since the differential entropy rate is h_μ(X) = h(X_1 | ) = -∞, we conclude that

      Hence, I_{p-f} = E = +∞.

      The proof for a continuous-valued process is complete.

      Remark 4.7 1) The equality E = I_{p-f} is claimed in [8] for a stationary process with a discrete state space, where a heuristic “proof” was also given. The proof is simple, so it is omitted here.

      2) The definition of the excess entropy E depends on the entropy rate h_μ. The equation E = I_{p-f} = D provides two sequences, {2H(n) - H(2n)}_n and {nH(n-1) - (n-1)H(n)}_n, that approximate I_{p-f} and the excess entropy. This enables us to obtain lower bounds on the excess entropy E without knowing the entropy rate h_μ.

      3) For a continuous-valued stationary process with finite block entropy, if h_μ = -∞, then I_{p-f} = E = +∞. Such a process is always long memory.
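      The two sequences in Remark 4.7 2) can be evaluated directly from block entropies, with no estimate of h_μ. The sketch below is an illustration under assumptions that are not part of the paper: the process is a hypothetical two-state Markov chain observed through a noisy channel (flip probability eps), the block probabilities are obtained with the forward algorithm, and H(n) is computed by enumerating all n-blocks, which is feasible only for small n.

```python
import math
from itertools import product


def block_entropies(P, u, eps, n_max):
    """Return [H(0), ..., H(n_max)] in nats for a noisily observed binary chain.

    P, u: transition matrix and stationary distribution of the hidden chain.
    eps: probability that an observed symbol differs from the hidden state.
    """
    H = [0.0]
    for n in range(1, n_max + 1):
        total = 0.0
        for block in product((0, 1), repeat=n):
            # Forward algorithm: alpha[s] = P(observed prefix, hidden state = s).
            alpha = [u[s] * (1 - eps if s == block[0] else eps) for s in (0, 1)]
            for y in block[1:]:
                alpha = [
                    sum(alpha[r] * P[r][s] for r in (0, 1))
                    * (1 - eps if s == y else eps)
                    for s in (0, 1)
                ]
            prob = sum(alpha)
            if prob > 0:
                total -= prob * math.log(prob)
        H.append(total)
    return H


def lower_bound_sequences(H):
    """Both sequences of the remark, computed from H[0..2N] for n = 1, ..., N."""
    N = (len(H) - 1) // 2
    info = [2 * H[n] - H[2 * n] for n in range(1, N + 1)]
    diff = [n * H[n - 1] - (n - 1) * H[n] for n in range(1, N + 1)]
    return info, diff


# Hypothetical model parameters (assumptions made for illustration only).
P = [[0.9, 0.1], [0.3, 0.7]]
u = [0.75, 0.25]  # satisfies u P = u
H = block_entropies(P, u, eps=0.1, n_max=10)
info, diff = lower_bound_sequences(H)
for n, (a, b) in enumerate(zip(info, diff), start=1):
    print(n, a, b)
```

      Both printed columns are nondecreasing in n, so the largest available n gives the tightest of these lower bounds; enumeration grows exponentially in n, so a practical estimator would need a different way to obtain H(n).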

      5 Conclusion

      The finiteness or infiniteness of the mutual information between past and future, I_{p-f}, can be regarded as a criterion that separates short memory from long memory stationary processes. For a stationary process with finite block entropy, I_{p-f} coincides with the excess entropy E, which provides a good approximation of I_{p-f}. The definitions of I_{p-f} and the excess entropy of a stationary process require only a very weak moment condition on the distribution of the process, and can be applied to processes whose distributions have no bounded second moment. A significant property of I_{p-f} is that it is invariant under one-to-one transformations. This invariance enables us to know the I_{p-f} of a stationary process from the I_{p-f} of other processes. Since conditional entropy captures the dependence between random variables well, I_{p-f} and the excess entropy are relevant for capturing the dependence of a stationary process whose distribution is far from Gaussian. For stationary Gaussian processes, long memory in the sense of I_{p-f} is slightly stricter than long memory in the sense of covariance. For fractional Gaussian noise, I_{p-f} = ∞ if and only if H ∈ (1/2, 1). An important open problem is to provide an effective algorithm for approximating I_{p-f} or the excess entropy E, which is essential for future applications. It would also be interesting to use an informatic approach to study the long memory behavior of harmonizable processes and measure-preserving transformations.
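      The statement about fractional Gaussian noise can be explored numerically for finite block lengths. The sketch below is an illustration, not the paper's method: it assumes the standard autocovariance γ(k) = (|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H})/2 of unit-variance fractional Gaussian noise and the Gaussian identity I_{p-f}(n) = (2 log det Σ_n - log det Σ_{2n})/2, where Σ_m is the m × m covariance matrix. The function names are hypothetical, and finite n can only suggest, never prove, divergence.

```python
import math


def fgn_autocovariance(k, hurst):
    """Autocovariance at lag k of unit-variance fractional Gaussian noise."""
    k = abs(k)
    e = 2 * hurst
    return 0.5 * ((k + 1) ** e - 2 * k ** e + abs(k - 1) ** e)


def log_det_toeplitz(gamma, m):
    """log det of the m x m matrix [gamma[|i - j|]] via a Cholesky factorization."""
    L = [[0.0] * m for _ in range(m)]
    log_det = 0.0
    for i in range(m):
        for j in range(i + 1):
            s = gamma[i - j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(s)
                log_det += 2.0 * math.log(L[i][i])
            else:
                L[i][j] = s / L[j][j]
    return log_det


def gaussian_past_future_information(hurst, n):
    """I_{p-f}(n) in nats between adjacent n-blocks of fractional Gaussian noise."""
    gamma = [fgn_autocovariance(k, hurst) for k in range(2 * n)]
    return 0.5 * (2 * log_det_toeplitz(gamma, n) - log_det_toeplitz(gamma, 2 * n))


for hurst in (0.5, 0.8):
    for n in (1, 2, 4, 8, 16):
        print(hurst, n, gaussian_past_future_information(hurst, n))
```

      For H = 1/2 the noise is white and the computed information is zero for every n, while for H = 0.8 the values keep increasing with n over the range shown.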

      Conflict of Interest Yiming Ding is an editorial board member for Acta Mathematica Scientia and was not involved in the editorial review or the decision to publish this article. All authors declare that there are no competing interests.
