Functional data analysis

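Functional data analysis (FDA) is a branch of statistics that analyses data providing information about curves, surfaces or anything else varying over a continuum. In its most general form, under an FDA framework, each sample element of functional data is considered to be a random function. The physical continuum over which these functions are defined is often time, but may also be spatial location, wavelength, probability, etc. Intrinsically, functional data are infinite dimensional. The high intrinsic dimensionality of these data brings challenges for theory as well as computation, and these challenges vary with how the functional data were sampled. However, the high or infinite dimensional structure of the data is a rich source of information, and there are many interesting challenges for research and data analysis.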

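History

Functional data analysis has roots going back to work by Grenander and Karhunen in the 1940s and 1950s.^[1]^[2]^[3]^[4] They considered the decomposition of square-integrable continuous-time stochastic processes into eigencomponents, now known as the Karhunen–Loève decomposition. A rigorous analysis of functional principal component analysis was carried out in the 1970s by Kleffe, Dauxois and Pousse, including results about the asymptotic distribution of the eigenvalues.^[5]^[6] More recently, in the 1990s and 2000s, the field has focused more on applications and on understanding the effects of dense and sparse observation schemes. The term "Functional Data Analysis" was coined by James O. Ramsay.^[7]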

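Mathematical formalism

Random functions can be viewed as random elements taking values in a Hilbert space, or as a stochastic process. The former is mathematically convenient, whereas the latter is somewhat more suitable from an applied perspective. These two approaches coincide if the random functions are continuous and a condition called mean-square continuity is satisfied.^[8]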

Hilbertian random variables

In the Hilbert space viewpoint, one considers an $H$-valued random element $X$, where $H$ is a separable Hilbert space such as the space of square-integrable functions $L^{2}[0,1]$. Under the integrability condition that $\mathbb {E} \|X\|_{L^{2}}^{2}=\mathbb {E} (\int _{0}^{1}|X(t)|^{2}dt)<\infty$, one can define the mean of $X$ as the unique element $\mu \in H$ satisfying

$$\mathbb {E} \langle X,h\rangle =\langle \mu ,h\rangle ,\qquad h\in H.$$

This formulation is the Pettis integral, but the mean can also be defined as the Bochner integral $\mu =\mathbb {E} X$. Under the integrability condition that $\mathbb {E} \|X\|_{L^{2}}^{2}$ is finite, the covariance operator of $X$ is a linear operator ${\mathcal {C}}:H\to H$ that is uniquely defined by the relation

$${\mathcal {C}}h=\mathbb {E} [\langle h,X-\mu \rangle (X-\mu )],\qquad h\in H,$$

or, in tensor form, ${\mathcal {C}}=\mathbb {E} [(X-\mu )\otimes (X-\mu )]$. The spectral theorem allows $X$ to be decomposed as the Karhunen–Loève decomposition

$$X=\mu +\sum _{i=1}^{\infty }\langle X-\mu ,\varphi _{i}\rangle \varphi _{i},$$

where $\varphi _{i}$ are the eigenvectors of ${\mathcal {C}}$, corresponding to the nonnegative eigenvalues of ${\mathcal {C}}$ taken in non-increasing order. Truncating this infinite series to a finite order underpins functional principal component analysis.

Stochastic processes

The Hilbertian point of view is mathematically convenient but abstract; the above considerations do not necessarily even treat $X$ as a function, since common choices of $H$ such as $L^{2}[0,1]$ consist of equivalence classes rather than functions. The stochastic process perspective views $X$ as a collection of random variables

$$\{X(t)\}_{t\in [0,1]}$$

indexed by the unit interval (or, more generally, an interval ${\mathcal {T}}$). The mean and covariance functions are defined pointwise as

$$\mu (t)=\mathbb {E} X(t),\qquad \Sigma (s,t)={\textrm {Cov}}(X(s),X(t)),\qquad s,t\in [0,1]$$

(if $\mathbb {E} [X(t)^{2}]<\infty$ for all $t\in [0,1]$).

Under mean-square continuity, $\mu$ and $\Sigma$ are continuous functions, and the covariance function $\Sigma$ then defines a covariance operator ${\mathcal {C}}:H\to H$ given by

$$({\mathcal {C}}f)(t)=\int _{0}^{1}\Sigma (s,t)f(s)\,\mathrm {d} s.$$

(1)

The spectral theorem applies to ${\mathcal {C}}$, yielding eigenpairs $(\lambda _{j},\varphi _{j})$, so that in tensor product notation ${\mathcal {C}}$ can be written as

$${\mathcal {C}}=\sum _{j=1}^{\infty }\lambda _{j}\varphi _{j}\otimes \varphi _{j}.$$

Moreover, since ${\mathcal {C}}f$ is continuous for all $f\in H$, all the $\varphi _{j}$ are continuous. Mercer's theorem then states that

$$\sup _{s,t\in [0,1]}\left|\Sigma (s,t)-\sum _{j=1}^{K}\lambda _{j}\varphi _{j}(s)\varphi _{j}(t)\right|\to 0,\qquad K\to \infty .$$

Finally, under the extra assumption that $X$ has continuous sample paths, namely that with probability one the random function $X:[0,1]\to \mathbb {R}$ is continuous, the Karhunen–Loève expansion above holds for $X$ and the Hilbert space machinery can subsequently be applied. Continuity of sample paths can be shown using the Kolmogorov continuity theorem.
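As a numerical illustration of Mercer's theorem, the sketch below uses standard Brownian motion on $[0,1]$, an example chosen here and not taken from the text above. Its covariance function is $\Sigma (s,t)=\min(s,t)$, with the well-known eigenpairs $\lambda _{j}=1/((j-1/2)^{2}\pi ^{2})$ and $\varphi _{j}(t)={\sqrt {2}}\sin((j-1/2)\pi t)$. The helper names `bm_eigenpair` and `mercer_sup_error` are hypothetical, and the supremum is only evaluated on a finite grid.

```python
import math

def bm_eigenpair(j):
    """Eigenvalue and eigenfunction j (1-based) of the covariance min(s, t)."""
    freq = (j - 0.5) * math.pi
    lam = 1.0 / freq ** 2
    phi = lambda t: math.sqrt(2.0) * math.sin(freq * t)
    return lam, phi

def mercer_sup_error(K, grid):
    """Largest gap on the grid between min(s, t) and its K-term expansion."""
    pairs = [bm_eigenpair(j) for j in range(1, K + 1)]
    lams = [lam for lam, _ in pairs]
    # Evaluate every eigenfunction once on the grid
    vals = [[phi(t) for t in grid] for _, phi in pairs]
    worst = 0.0
    for a, s in enumerate(grid):
        for b, t in enumerate(grid):
            approx = sum(lams[j] * vals[j][a] * vals[j][b] for j in range(K))
            worst = max(worst, abs(min(s, t) - approx))
    return worst

grid = [i / 20 for i in range(21)]
errors = {K: mercer_sup_error(K, grid) for K in (1, 5, 50)}
for K, err in errors.items():
    print(f"K = {K:2d}: sup error on grid = {err:.4f}")
```

The grid error shrinks as $K$ grows, in line with the uniform convergence stated by the theorem; a finite grid can of course only suggest, not prove, the limit.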

Functional data designs

Functional data are considered as realizations of a stochastic process $X(t),\ t\in [0,1]$ that is an $L^{2}$ process on a bounded and closed interval $[0,1]$ with mean function $\mu (t)=\mathbb {E} (X(t))$ and covariance function $\Sigma (s,t)={\textrm {Cov}}(X(s),X(t))$. The realization of the process for the i-th subject is $X_{i}(\cdot )$, and the sample is assumed to consist of $n$ independent subjects. The sampling schedule may vary across subjects and is denoted by $T_{i1},...,T_{iN_{i}}$ for the i-th subject. The corresponding i-th observation is denoted as ${\textbf {X}}_{i}=(X_{i1},...,X_{iN_{i}})$, where $X_{ij}=X_{i}(T_{ij})$. In addition, the measurement of $X_{ij}$ is assumed to have random noise $\epsilon _{ij}$ with $\mathbb {E} (\epsilon _{ij})=0$ and ${\textrm {Var}}(\epsilon _{ij})=\sigma _{ij}^{2}$, which are independent across $i$ and $j$.

1. Fully observed functions without noise on an arbitrarily dense grid

Measurements $Y_{it}=X_{i}(t)$ are available for all $t\in {\mathcal {I}},\,i=1,\ldots ,n$.

Often unrealistic but mathematically convenient.

Real-life example: Tecator spectral data.^[7]

2. Densely sampled functions with noisy measurements (dense design)

Measurements $Y_{ij}=X_{i}(T_{ij})+\varepsilon _{ij}$, where the $T_{ij}$ are recorded on a regular grid $T_{i1},\ldots ,T_{iN_{i}}$, and $N_{i}\rightarrow \infty$ applies to typical functional data.

Real-life example: Berkeley Growth Study data and stock data.

3. Sparsely sampled functions with noisy measurements (longitudinal data)

Measurements $Y_{ij}=X_{i}(T_{ij})+\varepsilon _{ij}$, where $T_{ij}$ are random times and their number $N_{i}$ per subject is random and finite.

Real-life example: CD4 count data for AIDS patients.^[9]
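The difference between the dense design (2) and the sparse design (3) can be made concrete with a small simulation. The following is a minimal sketch under assumptions made only for illustration: the mean function, the two cosine components, the noise level and the helper names `trajectory`, `dense_design` and `sparse_design` are all hypothetical choices, not part of any referenced data set.

```python
import math
import random

def trajectory(rng):
    """Draw one random curve X_i = mu + A1*phi1 + A2*phi2 (illustrative choices)."""
    a1, a2 = rng.gauss(0, 2.0), rng.gauss(0, 1.0)
    def x(t):
        return (t + math.sin(2 * math.pi * t)                 # mean function mu(t)
                + a1 * math.sqrt(2) * math.cos(math.pi * t)
                + a2 * math.sqrt(2) * math.cos(2 * math.pi * t))
    return x

def dense_design(x, rng, n_points=51, noise_sd=0.1):
    """Design 2: a regular grid of times T_ij with noisy measurements Y_ij."""
    times = [j / (n_points - 1) for j in range(n_points)]
    return times, [x(t) + rng.gauss(0, noise_sd) for t in times]

def sparse_design(x, rng, max_points=5, noise_sd=0.1):
    """Design 3: a small random number N_i of random times, noisy measurements."""
    n_i = rng.randint(2, max_points)
    times = sorted(rng.random() for _ in range(n_i))
    return times, [x(t) + rng.gauss(0, noise_sd) for t in times]

rng = random.Random(0)
curves = [trajectory(rng) for _ in range(4)]
dense = [dense_design(x, rng) for x in curves]
sparse = [sparse_design(x, rng) for x in curves]

for i, (times, values) in enumerate(sparse):
    print(f"subject {i}: N_i = {len(times)} observations")
```

In the dense case every subject contributes a full curve, so each trajectory can be smoothed individually; in the sparse case only a handful of points per subject exist, which is why methods for longitudinal data pool information across subjects.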

Functional principal component analysis

Functional principal component analysis (FPCA) is the most prevalent tool in FDA, partly because FPCA facilitates dimension reduction of the inherently infinite-dimensional functional data to a finite-dimensional random vector of scores. More specifically, dimension reduction is achieved by expanding the underlying observed random trajectories $X_{i}(t)$ in a functional basis consisting of the eigenfunctions of the covariance operator of $X$. Consider the covariance operator ${\mathcal {C}}:L^{2}[0,1]\rightarrow L^{2}[0,1]$ as in (1), which is a compact operator on a Hilbert space.

By Mercer's theorem, the kernel of ${\mathcal {C}}$, i.e., the covariance function $\Sigma (\cdot ,\cdot )$, has the spectral decomposition $\Sigma (s,t)=\sum _{k=1}^{\infty }\lambda _{k}\varphi _{k}(s)\varphi _{k}(t)$, where the series convergence is absolute and uniform, and $\lambda _{k}$ are real-valued nonnegative eigenvalues in descending order with the corresponding orthonormal eigenfunctions $\varphi _{k}(t)$. By the Karhunen–Loève theorem, the FPCA expansion of an underlying random trajectory is $X_{i}(t)=\mu (t)+\sum _{k=1}^{\infty }A_{ik}\varphi _{k}(t)$, where $A_{ik}=\int _{0}^{1}(X_{i}(t)-\mu (t))\varphi _{k}(t)dt$ are the functional principal components (FPCs), sometimes referred to as scores. The Karhunen–Loève expansion facilitates dimension reduction in the sense that the partial sum converges uniformly, i.e., $\sup _{t\in [0,1]}\mathbb {E} [X_{i}(t)-\mu (t)-\sum _{k=1}^{K}A_{ik}\varphi _{k}(t)]^{2}\rightarrow 0$ as $K\rightarrow \infty$, and thus the partial sum with a large enough $K$ yields a good approximation to the infinite sum. Thereby, the information in $X_{i}$ is reduced from infinite dimensional to a $K$-dimensional vector $A_{i}=(A_{i1},...,A_{iK})$ with the approximated process:

$$X_{i}^{(K)}(t)=\mu (t)+\sum _{k=1}^{K}A_{ik}\varphi _{k}(t).$$

(2)

Other popular bases include spline, Fourier series and wavelet bases. Important applications of FPCA include the modes of variation and functional principal component regression.
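For curves that are fully observed on a common grid, the truncated expansion (2) can be computed by discretizing the covariance operator. The sketch below is one simple way to do this and is not the estimator of any particular package: it assumes a noise-free dense design on a midpoint grid, approximates integrals by Riemann sums, and extracts eigenpairs by power iteration with deflation so that only the standard library is needed. The simulated two-component model and the names `power_iteration` and `fpca` are assumptions made for the example.

```python
import math
import random

def power_iteration(mat, start, n_iter=200):
    """Leading eigenpair of a symmetric positive semi-definite matrix."""
    v, lam = start[:], 0.0
    for _ in range(n_iter):
        w = [sum(row[b] * v[b] for b in range(len(v))) for row in mat]
        lam = math.sqrt(sum(x * x for x in w))      # converges to the top eigenvalue
        v = [x / lam for x in w]
    return lam, v

def fpca(curves, K, rng):
    """FPCA for curves sampled on a common equispaced midpoint grid of [0, 1]."""
    n, m = len(curves), len(curves[0])
    h = 1.0 / m                                      # quadrature weight
    mu = [sum(c[a] for c in curves) / n for a in range(m)]
    centered = [[c[a] - mu[a] for a in range(m)] for c in curves]
    cov = [[sum(x[a] * x[b] for x in centered) / n for b in range(m)]
           for a in range(m)]
    eigvals, eigfuns = [], []
    for _ in range(K):
        lam, v = power_iteration(cov, [rng.gauss(0, 1) for _ in range(m)])
        # Deflate: remove the component just found before extracting the next one
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(m)] for a in range(m)]
        eigvals.append(lam * h)                      # eigenvalue of the integral operator
        eigfuns.append([x / math.sqrt(h) for x in v])   # so that sum(phi^2) * h = 1
    scores = [[sum(x[a] * phi[a] for a in range(m)) * h for phi in eigfuns]
              for x in centered]
    return mu, eigvals, eigfuns, scores

# Simulated curves with two components of variance 4 and 1 (illustrative)
rng = random.Random(1)
m, n = 40, 100
grid = [(a + 0.5) / m for a in range(m)]
curves = []
for _ in range(n):
    a1, a2 = rng.gauss(0, 2.0), rng.gauss(0, 1.0)
    curves.append([t + a1 * math.sqrt(2) * math.cos(math.pi * t)
                   + a2 * math.sqrt(2) * math.cos(2 * math.pi * t) for t in grid])

mu, eigvals, eigfuns, scores = fpca(curves, K=2, rng=rng)
recon = [[mu[a] + sum(scores[i][k] * eigfuns[k][a] for k in range(2))
          for a in range(m)] for i in range(n)]
max_err = max(abs(recon[i][a] - curves[i][a]) for i in range(n) for a in range(m))
print("estimated eigenvalues:", [round(v, 2) for v in eigvals])
print("max reconstruction error with K = 2:", max_err)
```

Because the simulated curves have exactly two modes of variation, two scores per curve reproduce them up to rounding error. With noisy or sparse data the mean and covariance would first have to be smoothed, which this sketch does not attempt.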

Functional linear regression models

Functional linear models can be viewed as an extension of the traditional multivariate linear models that associate vector responses with vector covariates. The traditional linear model with scalar response $Y\in \mathbb {R}$ and vector covariate $X\in \mathbb {R} ^{p}$ can be expressed as

$$Y=\beta _{0}+\langle X,\beta \rangle +\varepsilon =\beta _{0}+X_{1}\beta _{1}+\cdots +X_{p}\beta _{p}+\varepsilon ,$$

(3)

where $\langle \cdot ,\cdot \rangle$ denotes the inner product in Euclidean space, $\beta _{0}\in \mathbb {R}$ and $\beta \in \mathbb {R} ^{p}$ denote the regression coefficients, and $\varepsilon$ is a zero-mean, finite-variance random error (noise). Functional linear models can be divided into two types based on the responses.

Functional regression models with scalar responses

Replacing the vector covariate $X$ and the coefficient vector $\beta$ in model (3) by a centered functional covariate $X^{c}(t)=X(t)-\mu (t)$ and a coefficient function $\beta =\beta (t)$ for $t\in [0,1]$, and replacing the inner product in Euclidean space by that in the Hilbert space $L^{2}$, one arrives at the functional linear model

$$Y=\beta _{0}+\langle X^{c},\beta \rangle +\varepsilon =\beta _{0}+\int _{0}^{1}X^{c}(t)\beta (t)\,dt+\varepsilon .$$

(4)

The simple functional linear model (4) can be extended to multiple functional covariates, $\{X_{j}\}_{j=1}^{p}$, also including additional vector covariates $Z=(Z_{1},\cdots ,Z_{q})$, where $Z_{1}=1$, by

$$Y=\langle Z,\theta \rangle +\sum _{j=1}^{p}\int _{0}^{1}X_{j}^{c}(t)\beta _{j}(t)\,dt+\varepsilon ,$$

(5)

where $\theta \in \mathbb {R} ^{q}$ is the regression coefficient for $Z$, the domain of $X_{j}$ is $[0,1]$, $X_{j}^{c}$ is the centered functional covariate given by $X_{j}^{c}(t)=X_{j}(t)-\mu _{j}(t)$, and $\beta _{j}$ is the regression coefficient function for $X_{j}^{c}$, for $j=1,\ldots ,p$. Models (4) and (5) have been studied extensively.^[10]^[11]^[12]
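One common way to fit model (4) is to expand both the covariate and the coefficient function in a truncated orthonormal basis, which turns the integral into a finite sum and the problem into ordinary least squares. The sketch below illustrates that idea under assumptions made only for the example: a cosine basis with $K=3$ terms, noise-free curves on a midpoint grid, a simulated coefficient function, and a small hand-written solver `solve` so that no external library is needed.

```python
import math
import random

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

rng = random.Random(2)
m, n, K = 50, 200, 3
grid = [(a + 0.5) / m for a in range(m)]
basis = [[math.sqrt(2) * math.cos(k * math.pi * t) for t in grid]
         for k in range(1, K + 1)]
true_b = [2.0, -1.0, 0.0]      # beta(t) = 2*phi_1(t) - phi_2(t), illustrative
beta0 = 1.0

# Simulate covariate curves and scalar responses from model (4)
X, Y = [], []
for _ in range(n):
    a = [rng.gauss(0, 1.0 / (k + 1)) for k in range(K)]
    X.append([sum(a[k] * basis[k][j] for k in range(K)) for j in range(m)])
    Y.append(beta0 + sum(a[k] * true_b[k] for k in range(K)) + rng.gauss(0, 0.05))

# Centre the covariate, then project: x_ik = integral of X_i^c(t) * phi_k(t) dt
mu = [sum(x[j] for x in X) / n for j in range(m)]
scores = [[sum((x[j] - mu[j]) * basis[k][j] for j in range(m)) / m
           for k in range(K)] for x in X]

# Least squares of the centred response on the scores (normal equations)
ybar = sum(Y) / n
G = [[sum(s[k] * s[l] for s in scores) for l in range(K)] for k in range(K)]
r = [sum(s[k] * (y - ybar) for s, y in zip(scores, Y)) for k in range(K)]
b_hat = solve(G, r)
beta_hat = [sum(b_hat[k] * basis[k][j] for k in range(K)) for j in range(m)]
print("estimated basis coefficients of beta:", [round(v, 3) for v in b_hat])
```

The estimated coefficient function `beta_hat` is recovered on the grid from the fitted basis coefficients. In practice the truncation level $K$ acts as a regularization parameter and would be chosen from the data; using the estimated eigenfunctions as the basis gives functional principal component regression.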

Functional regression models with functional responses

Consider a functional response $Y(s)$ on $[0,1]$ and multiple functional covariates $X_{j}(t)$, $t\in [0,1]$, $j=1,\ldots ,p$. Two major models have been considered in this setup.^[13]^[7] One of these two models, generally referred to as the functional linear model (FLM), can be written as:

$$Y(s)=\alpha _{0}(s)+\sum _{j=1}^{p}\int _{0}^{1}\alpha _{j}(s,t)X_{j}^{c}(t)\,dt+\varepsilon (s),\quad {\text{for}}\ s\in [0,1],$$

(6)

where $\alpha _{0}(s)$ is the functional intercept, for $j=1,\ldots ,p$, $X_{j}^{c}(t)=X_{j}(t)-\mu _{j}(t)$ is a centered functional covariate on $[0,1]$, $\alpha _{j}(s,t)$ is the corresponding functional slope with the same domain, and $\varepsilon (s)$ is a random process with mean zero and finite variance.^[13] In this case, at any given time $s\in [0,1]$, the value of $Y$, i.e., $Y(s)$, depends on the entire trajectories of $\{X_{j}(t)\}_{j=1}^{p}$. Model (6) has been studied extensively.^[14]^[15]^[16]^[17]^[18]

Function-on-scalar regression

In particular, taking $X_{j}(\cdot )$ to be a constant function yields a special case of model (6),

$$Y(s)=\alpha _{0}(s)+\sum _{j=1}^{p}X_{j}\alpha _{j}(s)+\varepsilon (s),\quad {\text{for}}\ s\in [0,1],$$

which is a functional linear model with functional responses and scalar covariates.

Concurrent regression models

This model is given by

$$Y(s)=\beta _{0}(s)+\sum _{j=1}^{p}\beta _{j}(s)X_{j}(s)+\varepsilon (s),\quad {\text{for}}\ s\in [0,1],$$

(7)

where $X_{1},\ldots ,X_{p}$ are functional covariates on $[0,1]$, $\beta _{0},\beta _{1},\ldots ,\beta _{p}$ are the coefficient functions defined on the same interval and $\varepsilon (s)$ is usually assumed to be a random process with mean zero and finite variance.^[13] This model assumes that the value of $Y(s)$ depends only on the current value of $\{X_{j}(s)\}_{j=1}^{p}$ and not on the history $\{X_{j}(t):t\leq s\}_{j=1}^{p}$ or on future values. Hence, it is a "concurrent regression model", which is also referred to as a "varying-coefficient" model. Various estimation methods have been proposed.^[19]^[20]^[21]^[22]^[23]^[24]
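Because model (7) only links $Y(s)$ to the covariate values at the same $s$, the simplest conceivable estimator fits a separate ordinary least squares regression at each grid point. The sketch below shows this for $p=1$; it is an illustration under simulated data, not one of the cited methods, which typically also smooth the coefficient estimates across $s$. The coefficient functions and the helper `pointwise_ols` are assumptions made for the example.

```python
import math
import random

def pointwise_ols(x, y):
    """Simple least squares of y on x with intercept; returns (intercept, slope)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((u - xb) ** 2 for u in x)
    sxy = sum((u - xb) * (v - yb) for u, v in zip(x, y))
    slope = sxy / sxx
    return yb - slope * xb, slope

rng = random.Random(3)
grid = [j / 20 for j in range(21)]
beta0 = lambda s: math.sin(2 * math.pi * s)    # illustrative coefficient functions
beta1 = lambda s: 1.0 + s

# Simulate n = 200 subjects observed on a common grid
X, Y = [], []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    xi = [a + b * s for s in grid]
    X.append(xi)
    Y.append([beta0(s) + beta1(s) * xs + rng.gauss(0, 0.1)
              for s, xs in zip(grid, xi)])

# One regression across subjects per grid point s_j
fits = [pointwise_ols([x[j] for x in X], [y[j] for y in Y])
        for j in range(len(grid))]
beta0_hat = [f[0] for f in fits]
beta1_hat = [f[1] for f in fits]
print("beta1 at s = 0, 0.5, 1:", [round(beta1_hat[j], 2) for j in (0, 10, 20)])
```

The pointwise estimates trace out the coefficient functions on the grid. This approach requires all subjects to be observed at the same time points, so it does not carry over directly to sparse longitudinal designs.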

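Functional nonlinear regression models

Direct nonlinear extensions of the classical functional linear regression models (FLMs) still involve a linear predictor, but combine it with a nonlinear link function, analogous to the way the generalized linear model extends the conventional linear model. Developments towards fully nonparametric regression models for functional data encounter problems such as the curse of dimensionality. In order to bypass the "curse" and the metric selection problem, one is motivated to consider nonlinear functional regression models that are subject to some structural constraints but do not overly restrict flexibility. One desires models that retain polynomial rates of convergence while being more flexible than, say, functional linear models. Such models are particularly useful when diagnostics for the functional linear model indicate lack of fit, which is often encountered in real-life situations. In particular, functional polynomial models, functional single and multiple index models and functional additive models are three special cases of functional nonlinear regression models.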

Functional polynomial regression models

Functional polynomial regression models may be viewed as a natural extension of the functional linear models (FLMs) with scalar responses, analogous to extending the linear regression model to the polynomial regression model. For a scalar response $Y$ and a functional covariate $X(\cdot )$ with domain $[0,1]$ and the corresponding centered predictor process $X^{c}$, the simplest and most prominent member of the family of functional polynomial regression models is the quadratic functional regression,^[25] given as follows:

$$\mathbb {E} (Y|X)=\alpha +\int _{0}^{1}\beta (t)X^{c}(t)\,dt+\int _{0}^{1}\int _{0}^{1}\gamma (s,t)X^{c}(s)X^{c}(t)\,ds\,dt$$

where $X^{c}(\cdot )=X(\cdot )-\mathbb {E} (X(\cdot ))$ is the centered functional covariate, $\alpha$ is a scalar coefficient, and $\beta (\cdot )$ and $\gamma (\cdot ,\cdot )$ are coefficient functions with domains $[0,1]$ and $[0,1]\times [0,1]$, respectively. In addition to the parameter function $\beta$ that the above functional quadratic regression model shares with the FLM, it also features a parameter surface $\gamma$. By analogy with FLMs with scalar responses, estimation of functional polynomial models can be obtained by expanding both the centered covariate $X^{c}$ and the coefficient functions $\beta$ and $\gamma$ in an orthonormal basis.^[25]^[26]
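The effect of the basis expansion can be written out explicitly. As a sketch, assume a fixed orthonormal basis $\{\phi _{k}\}_{k=1}^{\infty }$ of $L^{2}[0,1]$ and write $X^{c}(t)=\sum _{k}x_{k}\phi _{k}(t)$, $\beta (t)=\sum _{k}\beta _{k}\phi _{k}(t)$ and $\gamma (s,t)=\sum _{k}\sum _{l}\gamma _{kl}\phi _{k}(s)\phi _{l}(t)$; the coefficient symbols $x_{k}$, $\beta _{k}$ and $\gamma _{kl}$ are notation introduced here for illustration. Orthonormality of the basis then collapses the integrals in the quadratic model to sums:

$$\mathbb {E} (Y|X)=\alpha +\sum _{k=1}^{\infty }\beta _{k}x_{k}+\sum _{k=1}^{\infty }\sum _{l=1}^{\infty }\gamma _{kl}x_{k}x_{l}.$$

Truncating both sums at a finite number $K$ of basis functions therefore turns the functional quadratic model into an ordinary quadratic regression of $Y$ on the $K$ coefficients $x_{1},\ldots ,x_{K}$.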

Functional single and multiple index models

A functional multiple index model is given below, with symbols having their usual meanings as described above:

$$\mathbb {E} (Y|X)=g\left(\int _{0}^{1}X^{c}(t)\beta _{1}(t)\,dt,\ldots ,\int _{0}^{1}X^{c}(t)\beta _{p}(t)\,dt\right)$$

Here $g$ represents an (unknown) general smooth function defined on a $p$-dimensional domain. The case $p=1$ yields a functional single index model, while multiple index models correspond to the case $p>1$. However, for $p>1$, this model is problematic due to the curse of dimensionality. With $p>1$ and relatively small sample sizes, the estimator given by this model often has large variance.^[27]^[28]

Functional additive models (FAMs)

For a given orthonormal basis $\{\phi _{k}\}_{k=1}^{\infty }$ on $L^{2}[0,1]$, we can expand $X^{c}(t)=\sum _{k=1}^{\infty }x_{k}\phi _{k}(t)$ on the domain $[0,1]$.

A functional linear model with scalar responses (see (4)) can thus be written as follows:

$$\mathbb {E} (Y|X)=\mathbb {E} (Y)+\sum _{k=1}^{\infty }\beta _{k}x_{k}.$$

One form of FAM is obtained by replacing the linear function of $x_{k}$ in the above expression (i.e., $\beta _{k}x_{k}$) by a general smooth function $f_{k}$, analogous to the extension of multiple linear regression models to additive models. It is expressed as

$$\mathbb {E} (Y|X)=\mathbb {E} (Y)+\sum _{k=1}^{\infty }f_{k}(x_{k}),$$

where $f_{k}$ satisfies $\mathbb {E} (f_{k}(x_{k}))=0$ for $k\in \mathbb {N}$.^[13]^[7] This constraint on the general smooth functions $f_{k}$ ensures identifiability in the sense that the estimates of these additive component functions do not interfere with that of the intercept term $\mathbb {E} (Y)$. Another form of FAM is the continuously additive model,^[29] expressed as

$$\mathbb {E} (Y|X)=\mathbb {E} (Y)+\int _{0}^{1}g(t,X(t))\,dt$$

for a bivariate smooth additive surface $g:[0,1]\times \mathbb {R} \longrightarrow \mathbb {R}$ which is required to satisfy $\mathbb {E} [g(t,X(t))]=0$ for all $t\in [0,1]$, in order to ensure identifiability.

Generalized functional linear model

An obvious and direct extension of FLMs with scalar responses (see (4)) is to add a link function, leading to a generalized functional linear model (GFLM)^[30] in analogy to the generalized linear model (GLM). The three components of the GFLM are:

Linear predictor $\eta =\beta _{0}+\int _{0}^{1}X^{c}(t)\beta (t)\,dt$; [systematic component]
Variance function ${\text{Var}}(Y|X)=V(\mu )$, where $\mu =\mathbb {E} (Y|X)$ is the conditional mean; [random component]
Link function $g$ connecting the conditional mean $\mu$ and the linear predictor $\eta$ through $\mu =g(\eta )$. [systematic component]

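Clustering and classification of functional data

For vector-valued multivariate data, k-means partitioning methods and hierarchical clustering are two main approaches. These classical clustering concepts for vector-valued multivariate data have been extended to functional data. For clustering of functional data, k-means clustering methods are more popular than hierarchical clustering methods. For k-means clustering on functional data, mean functions are usually regarded as the cluster centers. Covariance structures have also been taken into consideration.^[31] Besides k-means type clustering, functional clustering^[32] based on mixture models is also widely used in clustering vector-valued multivariate data and has been extended to functional data clustering.^[33]^[34]^[35]^[36]^[37] Furthermore, Bayesian hierarchical clustering also plays an important role in the development of model-based functional clustering.^[38]^[39]^[40]^[41]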
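The k-means idea with mean functions as cluster centers can be sketched for curves observed on a common grid, using the $L^{2}$ distance approximated by a Riemann sum. The example below is a minimal illustration, not one of the cited methods: the two simulated groups, the noise level and the function names `l2_dist2` and `functional_kmeans` are assumptions, and the initial centers are deliberately taken from two curves known to lie in different groups so that the run is reproducible.

```python
import math
import random

def l2_dist2(f, g, h):
    """Squared L2 distance between two curves sampled on a grid with spacing h."""
    return sum((a - b) ** 2 for a, b in zip(f, g)) * h

def functional_kmeans(curves, init_centers, h, n_iter=20):
    """Lloyd's algorithm with pointwise mean functions as cluster centres."""
    centers = [c[:] for c in init_centers]
    labels = [0] * len(curves)
    for _ in range(n_iter):
        # Assignment step: nearest centre in L2 distance
        labels = [min(range(len(centers)),
                      key=lambda c: l2_dist2(f, centers[c], h)) for f in curves]
        # Update step: each centre becomes the mean function of its members
        for c in range(len(centers)):
            members = [f for f, lab in zip(curves, labels) if lab == c]
            if members:                      # keep the old centre if a cluster empties
                centers[c] = [sum(vals) / len(members) for vals in zip(*members)]
    return labels, centers

rng = random.Random(4)
grid = [j / 30 for j in range(31)]
h = 1 / 30

def noisy(shape):
    """A sine- or cosine-shaped curve with random amplitude and measurement noise."""
    amp = rng.uniform(0.8, 1.2)
    return [amp * shape(2 * math.pi * t) + rng.gauss(0, 0.1) for t in grid]

curves = [noisy(math.sin) for _ in range(10)] + [noisy(math.cos) for _ in range(10)]
labels, centers = functional_kmeans(curves, [curves[0], curves[-1]], h)
print(labels)
```

In practice the initial centers would be chosen at random or by a seeding heuristic, and the curves are often replaced by their leading FPCA scores or basis coefficients before clustering.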

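Functional classification assigns a group membership to a new data object based on either functional regression or functional discriminant analysis. Functional data classification methods based on functional regression models use class levels as responses and the observed functional data and other covariates as predictors. For regression-based functional classification models, functional generalized linear models or, more specifically, functional binary regression, such as functional logistic regression for binary responses, are commonly used classification approaches. More generally, the generalized functional linear regression model based on the FPCA approach is used.^[42] Functional linear discriminant analysis (FLDA) has also been considered as a classification method for functional data.^[43]^[44]^[45]^[46]^[47] Functional data classification involving density ratios has also been proposed.^[48] A study of the asymptotic behavior of the proposed classifiers in the large-sample limit shows that, under certain conditions, the misclassification rate converges to zero, a phenomenon that has been referred to as "perfect classification".^[49]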

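Time warping

Motivations

Illustration of the motivation of time warping in the sense of capturing cross-sectional mean.

Structures in the cross-sectional mean are destroyed if time variation is ignored. In contrast, structures in the cross-sectional mean are well captured after restoring time variation.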

In addition to amplitude variation, time variation may also be assumed to be present in functional data.^[50] Time variation occurs when the subject-specific timing of certain events of interest varies among subjects. One classical example is the Berkeley Growth Study data,^[51] where the amplitude variation is the growth rate and the time variation explains the difference in children's biological age at which the pubertal and pre-pubertal growth spurts occurred. In the presence of time variation, the cross-sectional mean function may not be an efficient estimate, as peaks and troughs are located randomly, and thus meaningful signals may be distorted or hidden.

Time warping, also known as curve registration,^[52] curve alignment or time synchronization, aims to identify and separate amplitude variation and time variation. If both time and amplitude variation are present, then the observed functional data $Y_{i}$ can be modeled as $Y_{i}(t)=X_{i}[h_{i}^{-1}(t)],t\in [0,1]$, where $X_{i}{\overset {iid}{\sim }}X$ is a latent amplitude function and $h_{i}{\overset {iid}{\sim }}h$ is a latent time warping function that corresponds to a cumulative distribution function. The time warping functions $h$ are assumed to be invertible and to satisfy $\mathbb {E} (h^{-1}(t))=t$.

The simplest family of warping functions for specifying phase variation is the linear transformation $h(t)=\delta +\gamma t$, which warps the time of an underlying template function by a subject-specific shift and scale. A more general class of warping functions includes diffeomorphisms of the domain to itself, that is, loosely speaking, a class of invertible functions that map the compact domain to itself such that both the function and its inverse are smooth. The set of linear transformations is contained in the set of diffeomorphisms.^[53] One challenge in time warping is the identifiability of amplitude and phase variation. Specific assumptions are required to break this non-identifiability.
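The effect of ignoring time variation can be reproduced with the linear warping family. The sketch below is a toy illustration under assumptions chosen for the example: a single Gaussian-shaped bump as the common template, pure shifts ($\gamma =1$) with mean zero so that $\mathbb {E} (h^{-1}(t))=t$ holds, and known warping functions, whereas in practice they must be estimated. It also ignores that a shift does not map $[0,1]$ exactly onto itself. The names `template`, `h` and `h_inv` are hypothetical.

```python
import math

def template(t):
    """Latent amplitude function X: a single bump centred at t = 0.5 (illustrative)."""
    return math.exp(-((t - 0.5) ** 2) / (2 * 0.05 ** 2))

def h(t, delta, gamma=1.0):
    """Linear warping function h(t) = delta + gamma * t."""
    return delta + gamma * t

def h_inv(t, delta, gamma=1.0):
    """Inverse warp h^{-1}(t) = (t - delta) / gamma."""
    return (t - delta) / gamma

grid = [j / 100 for j in range(101)]
shifts = [-0.15, -0.075, 0.0, 0.075, 0.15]      # subject-specific delta_i, mean zero

# Observed curves Y_i(t) = X(h_i^{-1}(t)): same shape, different timing
observed = [lambda t, d=d: template(h_inv(t, d)) for d in shifts]

# Cross-sectional mean of the raw curves versus the mean after undoing the warps
cross_mean = [sum(y(t) for y in observed) / len(observed) for t in grid]
aligned_mean = [sum(y(h(t, d)) for y, d in zip(observed, shifts)) / len(observed)
                for t in grid]

print("peak of cross-sectional mean:", round(max(cross_mean), 3))
print("peak of aligned mean:        ", round(max(aligned_mean), 3))
```

Every observed curve has a peak of height one, yet the cross-sectional mean is much flatter because the peaks occur at different times; averaging after registration recovers the template shape.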

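Methods

Earlier approaches include dynamic time warping (DTW), used for applications such as speech recognition.^[54] Another traditional method for time warping is landmark registration,^[55]^[56] which aligns special features such as peak locations to an average location. Other relevant warping methods include pairwise warping,^[57] registration using ${\mathcal {L}}^{2}$ distance^[53] and elastic warping.^[58]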

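Dynamic time warping

The template function is determined through an iterative process that starts from the cross-sectional mean, performs registration and recalculates the cross-sectional mean for the warped curves, with convergence expected after a few iterations. DTW minimizes a cost function through dynamic programming. Problems of non-smooth, non-differentiable warps or greedy computation in DTW can be resolved by adding a regularization term to the cost function.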
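The dynamic programming step at the core of DTW can be sketched for two discretized curves. The code below implements the classical unregularized recursion with a squared-difference local cost and recovers the alignment path by backtracking. It covers only the pairwise alignment, not the iterative template update or the regularization term mentioned above, and the example sequences are made up for illustration.

```python
def dtw(x, y):
    """Classical DTW between two sequences: returns (total cost, alignment path)."""
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] = minimal cumulative cost of aligning x[:i] with y[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack from the end to recover which samples were matched
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    path.reverse()
    return D[n][m], path

# Two sampled curves with the same shape, one delayed by a single time step
x = [0, 0, 1, 2, 1, 0]
y = [0, 1, 2, 1, 0, 0]
cost, path = dtw(x, y)
pointwise = sum((a - b) ** 2 for a, b in zip(x, y))
print("DTW cost:", cost, "| pointwise squared distance:", pointwise)
print("alignment path:", path)
```

The pointwise distance is positive because the peaks are misaligned, whereas DTW finds a zero-cost alignment by repeating samples. This flexibility is also the source of the non-smooth warps noted above, since nothing in the plain recursion penalizes irregular paths.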

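Landmark registration

Landmark registration (or feature alignment) assumes that well-expressed features are present in all sample curves and uses the locations of such features as a gold standard. Special features such as peak or trough locations in functions or derivatives are aligned to their average locations on the template function.^[53] The warping function is then introduced through a smooth transformation from the average location to the subject-specific locations. A problem of landmark registration is that the features may be missing or hard to identify due to noise in the data.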
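A single-landmark version of this procedure can be sketched as follows. The example makes several simplifying assumptions that are not part of the method as described: the curves are noise-free with one clear peak each, the landmark is found as the grid point of the maximum, and the warping function is piecewise linear rather than smooth. The helper names `interp` and `register` are hypothetical.

```python
import math

def interp(t, ts, vs):
    """Piecewise-linear interpolation of the points (ts, vs) at t (ts increasing)."""
    if t <= ts[0]:
        return vs[0]
    for k in range(1, len(ts)):
        if t <= ts[k]:
            w = (t - ts[k - 1]) / (ts[k] - ts[k - 1])
            return (1 - w) * vs[k - 1] + w * vs[k]
    return vs[-1]

grid = [j / 100 for j in range(101)]
peaks_true = [0.35, 0.5, 0.65]                  # subject-specific peak times
curves = [[math.exp(-((t - p) ** 2) / (2 * 0.08 ** 2)) for t in grid]
          for p in peaks_true]

# Landmark of each curve = location of its maximum; target = average location
landmarks = [grid[max(range(len(grid)), key=c.__getitem__)] for c in curves]
target = sum(landmarks) / len(landmarks)

def register(curve, landmark):
    """Evaluate the curve at h(t), where h maps 0 -> 0, target -> landmark, 1 -> 1."""
    warped_times = [interp(t, [0.0, target, 1.0], [0.0, landmark, 1.0]) for t in grid]
    return [interp(s, grid, curve) for s in warped_times]

registered = [register(c, lm) for c, lm in zip(curves, landmarks)]
new_landmarks = [grid[max(range(len(grid)), key=r.__getitem__)] for r in registered]
print("landmarks before:", landmarks)
print("landmarks after: ", new_landmarks, "| target:", round(target, 3))
```

After registration all peaks sit at the average location. With several landmarks per curve the same construction applies segment by segment, and a monotone smooth interpolant can replace the piecewise-linear warp.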

Extensions

So far we have considered a scalar-valued stochastic process, $\{X(t)\}_{t\in {\mathcal {T}}}$, defined on a one-dimensional time domain.

Multidimensional domain of $X(\cdot )$

The domain of $X(\cdot )$ can be in $R^{p}$; for example, the data could be a sample of random surfaces.^[59]^[60]

Multivariate stochastic process

The range set of the stochastic process may be extended from $R$ to $R^{p}$^[61]^[62]^[63] and further to nonlinear manifolds,^[64] Hilbert spaces^[65] and eventually to metric spaces.^[59]

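R packages

Some packages can handle functional data under both dense and longitudinal designs.

See also

Further reading

Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis, 2nd ed., New York: Springer, ISBN 0-387-40080-X
Horváth, L. and Kokoszka, P. (2012) Inference for Functional Data with Applications, New York: Springer, ISBN 978-1-4614-3654-6
Hsing, T. and Eubank, R. (2015) Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators, Wiley Series in Probability and Statistics, John Wiley & Sons, Ltd, ISBN 978-0-470-01691-6
Morris, J. (2015) Functional Regression, Annual Review of Statistics and Its Application, Vol. 2, 321–359, https://doi.org/10.1146/annurev-statistics-010814-020413
Wang et al. (2016) Functional Data Analysis, Annual Review of Statistics and Its Application, Vol. 3, 257–295, https://doi.org/10.1146/annurev-statistics-041715-033624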


References

^ Grenander, U. (1950). "Stochastic processes and statistical inference". Arkiv för Matematik. 1 (3): 195–277.
^ Rice, JA; Silverman, BW. (1991). "Estimating the mean and covariance structure nonparametrically when the data are curves". Journal of the Royal Statistical Society. 53 (1): 233–243.
^ Müller, HG. (2016). "Peter Hall, functional data analysis and random objects". Annals of Statistics. 44 (5): 1867–1887.
^ Karhunen, K (1946). Zur Spektraltheorie stochastischer Prozesse. Annales Academiae scientiarum Fennicae.
^ Kleffe, J. (1973). "Principal components of random variables with values in a seperable hilbert space". Mathematische Operationsforschung und Statistik. 4 (5): 391–406.
^ Dauxois, J; Pousse, A; Romain, Y. (1982). "Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference". Journal of Multivariate Analysis. 12 (1): 136–154.
^ ^a ^b ^c ^d ^e Ramsay, J; Silverman, BW. (2005). Functional Data Analysis, 2nd ed. Springer.
^ Hsing, T; Eubank, R (2015). Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators. Wiley Series in Probability and Statistics.
^ Shi, M; Weiss, RE; Taylor, JMG. (1996). "An analysis of paediatric CD4 counts for acquired immune deficiency syndrome using flexible random curves". Journal of the Royal Statistical Society. Series C (Applied Statistics). 45 (2): 151–163.
^ Hilgert, N; Mas, A; Verzelen, N. (2013). "Minimax adaptive tests for the functional linear model". Annals of Statistics. 41: 838–869.
^ Kong, D; Xue, K; Yao, F; Zhang, HH. (2016). "Partially functional linear regression in high dimensions". Biometrika. 103 (1): 147–159.
^ Horváth, L; Kokoszka, P. (2012). Inference for functional data with applications. Springer Series in Statistics. Springer-Verlag.
^ ^a ^b ^c ^d Wang, JL; Chiou, JM; Müller, HG. (2016). "Functional data analysis". Annual Review of Statistics and Its Application. 3 (1): 257–295.
^ Ramsay, JO; Dalzell, CJ. (1991). "Some tools for functional data analysis". Journal of the Royal Statistical Society, Series B (Methodological). 53 (3): 539–561.
^ Malfait, N; Ramsay, JO. (2003). "The historical functional linear model". The Canadian Journal of Statistics. 31 (2): 115–128.
^ He, G; Müller, HG; Wang, JL. (2003). "Functional canonical analysis for square integrable stochastic processes". Journal of Multivariate Analysis. 85 (1): 54–77.
^ ^a ^b Yao, F; Müller, HG; Wang, JL. (2005). "Functional data analysis for sparse longitudinal data". Journal of the American Statistical Association. 100 (470): 577–590.
^ He, G; Müller, HG; Wang, JL; Yang, WJ. (2010). "Functional linear regression via canonical analysis". Journal of Multivariate Analysis. 16 (3): 705–729.
^ Fan, J; Zhang, W. (1999). "Statistical estimation in varying coefficient models". The Annals of Statistics. 27 (5): 1491–1518.
^ Wu, CO; Yu, KF. (2002). "Nonparametric varying-coefficient models for the analysis of longitudinal data". International Statistical Review. 70 (3): 373–393.
^ Huang, JZ; Wu, CO; Zhou, L. (2002). "Varying-coefficient models and basis function approximations for the analysis of repeated measurements". Biometrika. 89 (1): 111–128.
^ Huang, JZ; Wu, CO; Zhou, L. (2004). "Polynomial spline estimation and inference for varying coefficient models with longitudinal data". Statistica Sinica. 14 (3): 763–788.
^ Şentürk, D; Müller, HG. (2010). "Functional varying coefficient models for longitudinal data". Journal of the American Statistical Association. 105 (491): 1256–1264.
^ Eggermont, PPB; Eubank, RL; LaRiccia, VN. (2010). "Convergence rates for smoothing spline estimators in varying coefficient models". Journal of Statistical Planning and Inference. 140 (2): 369–381.
^ ^a ^b Yao, F; Müller, HG. (2010). "Functional quadratic regression". Biometrika. 97 (1): 49–64.
^ Horváth, L; Reeder, R. (2013). "A test of significance in functional quadratic regression". Bernoulli. 19 (5A): 2120–2151.
^ Chen, D; Hall, P; Müller, HG. (2011). "Single and multiple index functional regression models with nonparametric link". The Annals of Statistics. 39 (3): 1720–1747.
^ Jiang, CR; Wang, JL. (2011). "Functional single index models for longitudinal data". The Annals of Statistics. 39 (1): 362–388.
^ Müller, HG; Wu, Y; Yao, F. (2013). "Continuously additive models for nonlinear functional regression". Biometrika. 100 (3): 607–622.
^ Müller, HG; Stadtmüller, U. (2005). "Generalized Functional Linear Models". The Annals of Statistics. 33 (2): 774–805.
^ Chiou, JM; Li, PL. (2007). "Functional clustering and identifying substructures of longitudinal data". Journal of the Royal Statistical Society, Series B (Statistical Methodology). 69 (4): 679–699.
^ Banfield, JD; Raftery, AE. (1993). "Model-based Gaussian and non-Gaussian clustering". Biometrics. 49 (3): 803–821.
^ James, GM; Sugar, CA. (2003). "Clustering for sparsely sampled functional data". Journal of the American Statistical Association. 98 (462): 397–408.
^ Jacques, J; Preda, C. (2013). "Funclust: A curves clustering method using functional random variables density approximation". Neurocomputing. 112: 164–171.
^ Jacques, J; Preda, C. (2014). "Model-based clustering for multivariate functional data". Computational Statistics & Data Analysis. 71 (C): 92–106.
^ Coffey, N; Hinde, J; Holian, E. (2014). "Clustering longitudinal profiles using P-splines and mixed effects models applied to time-course gene expression data". Computational Statistics & Data Analysis. 71 (C): 14–29.
^ Heinzl, F; Tutz, G. (2014). "Clustering in linear-mixed models with a group fused lasso penalty". Biometrical Journal. 56 (1): 44–68.
^ Angelini, C; Canditiis, DD; Pensky, M. (2012). "Clustering time-course microarray data using functional Bayesian infinite mixture model". Journal of Applied Statistics. 39 (1): 129–149.
^ Rodríguez, A; Dunson, DB; Gelfand, AE. (2009). "Bayesian nonparametric functional data analysis through density estimation". Biometrika. 96 (1): 149–162.
^ Petrone, S; Guindani, M; Gelfand, AE. (2009). "Hybrid Dirichlet mixture models for functional data". Journal of the Royal Statistical Society. 71 (4): 755–782.
^ Heinzl, F; Tutz, G. (2013). "Clustering in linear mixed models with approximate Dirichlet process mixtures using EM algorithm". Statistical Modelling. 13 (1): 41–67.
^ Leng, X; Müller, HG. (2006). "Classification using functional data analysis for temporal gene expression data". Bioinformatics. 22 (1): 68–76.
^ James, GM; Hastie, TJ. (2001). "Functional linear discriminant analysis for irregularly sampled curves". Journal of the Royal Statistical Society. 63 (3): 533–550.
^ Hall, P; Poskitt, DS; Presnell, B. (2001). "A Functional Data—Analytic Approach to Signal Discrimination". Technometrics. 43 (1): 1–9.
^ Ferraty, F; Vieu, P. (2003). "Curves discrimination: a nonparametric functional approach". Computational Statistics & Data Analysis. 44 (1–2): 161–173.
^ Chang, C; Chen, Y; Ogden, RT. (2014). "Functional data classification: a wavelet approach". Computational Statistics. 29 (6): 1497–1513.
^ Zhu, H; Brown, PJ; Morris, JS. (2012). "Robust Classification of Functional and Quantitative Image Data Using Functional Mixed Models". Biometrics. 68 (4): 1260–1268.
^ Dai, X; Müller, HG; Yao, F. (2017). "Optimal Bayes classifiers for functional data and density ratios". Biometrika. 104 (3): 545–560.
^ Delaigle, A; Hall, P (2012). "Achieving near perfect classification for functional data". Journal of the Royal Statistical Society. Series B (Statistical Methodology). 74 (2): 267–286. ISSN 1369-7412.
^ Wang, JL; Chiou, JM; Müller, HG. (2016). "Functional Data Analysis". Annual Review of Statistics and Its Application. 3 (1): 257–295.
^ Gasser, T; Müller, HG; Kohler, W; Molinari, L; Prader, A. (1984). "Nonparametric regression analysis of growth curves". The Annals of Statistics. 12 (1): 210–229.
^ Ramsay, JO; Li, X. (1998). "Curve registration". Journal of the Royal Statistical Society, Series B. 60 (2): 351–363.
^ ^a ^b ^c Marron, JS; Ramsay, JO; Sangalli, LM; Srivastava, A (2015). "Functional data analysis of amplitude and phase variation". Statistical Science. 30 (4): 468–484.
^ Sakoe, H; Chiba, S. (1978). "Dynamic programming algorithm optimization for spoken word recognition". IEEE Transactions on Acoustics, Speech, and Signal Processing. 26: 43–49.
^ Kneip, A; Gasser, T (1992). "Statistical tools to analyze data representing a sample of curves". Annals of Statistics. 20: 1266–1305.
^ Gasser, T; Kneip, A (1995). "Searching for structure in curve sample". Journal of the American Statistical Association. 90 (432): 1179–1188.
^ Tang, R; Müller, HG. (2008). "Pairwise curve synchronization for functional data". Biometrika. 95: 875–889.
^ ^a ^b Anirudh, R; Turaga, P; Su, J; Srivastava, A (2015). "Elastic functional coding of human actions: From vector-fields to latent variables". Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition: 3147–3155.
^ ^a ^b Dubey, P; Müller, HG (2021). "Modeling Time-Varying Random Objects and Dynamic Networks". Journal of the American Statistical Association. 0 (0): 1–16.
^ Pigoli, D; Hadjipantelis, PZ; Coleman, JS; Aston, JAD (2017). "The statistical analysis of acoustic phonetic data: exploring differences between spoken Romance languages". Journal of the Royal Statistical Society. Series C: Applied Statistics. 67 (5): 1130–1145.
^ Happ, C; Greven, S (2018). "Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains". Journal of the American Statistical Association. 113 (522): 649–659.
^ Chiou, JM; Yang, YF; Chen, YT (2014). "Multivariate functional principal component analysis: a normalization approach". Statistica Sinica. 24: 1571–1596.
^ Carroll, C; Müller, HG; Kneip, A (2021). "Cross-component registration for multivariate functional data, with application to growth curves". Biometrics. 77 (3): 839–851.
^ Dai, X; Müller, HG (2018). "Principal component analysis for functional data on Riemannian manifolds and spheres". The Annals of Statistics. 46 (6B): 3334–3361.
^ Chen, K; Delicado, P; Müller, HG (2017). "Modelling function-valued stochastic processes, with applications to fertility dynamics". Journal of the Royal Statistical Society. Series B (Statistical Methodology). 79 (1): 177–196.

[1] Grenander, U. (1950). "Stochastic processes and statistical inference". Arkiv för Matematik. 1 (3): 195–277.

[:4-2] Rice, JA; Silverman, BW. (1991). "Estimating the mean and covariance structure nonparametrically when the data are curves". Journal of the Royal Statistical Society. 53 (1): 233–243.

[3] Müller, HG. (2016). "Peter Hall, functional data analysis and random objects". Annals of Statistics. 44 (5): 1867–1887.

[4] Karhunen, K (1946). Zur Spektraltheorie stochastischer Prozesse. Annales Academiae scientiarum Fennicae.

[5] Kleffe, J. (1973). "Principal components of random variables with values in a seperable hilbert space". Mathematische Operationsforschung und Statistik. 4 (5): 391–406.

[6] Dauxois, J; Pousse, A; Romain, Y. (1982). "Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference". Journal of Multivariate Analysis. 12 (1): 136–154.

[:7-7] Ramsay, J; Silverman, BW. (2005). Functional Data Analysis, 2nd ed. Springer.

[8] Hsing, T; Eubank, R (2015). Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators. Wiley Series in Probability and Statistics.

[9] Shi, M; Weiss, RE; Taylor, JMG. (1996). "An analysis of paediatric CD4 counts for acquired immune deficiency syndrome using flexible random curves". Journal of the Royal Statistical Society. Series C (Applied Statistics). 45 (2): 151–163.

[10] Hilgert, N; Mas, A; Verzelen, N. (2013). "Minimax adaptive tests for the functional linear model". Annals of Statistics. 41: 838–869.

[11] Kong, D; Xue, K; Yao, F; Zhang, HH. (2016). "Partially functional linear regression in high dimensions". Biometrika. 103 (1): 147–159.

[12] Horváth, L; Kokoszka, P. (2012). Inference for functional data with applications. Springer Series in Statistics. Springer-Verlag.

[wang:162-13] Wang, JL; Chiou, JM; Müller, HG. (2016). "Functional data analysis". Annual Review of Statistics and Its Application. 3 (1): 257–295.

[14] Ramsay, JO; Dalzell, CJ. (1991). "Some tools for functional data analysis". Journal of the Royal Statistical Society, Series B (Methodological). 53 (3): 539–561.

[15] Malfait, N; Ramsay, JO. (2003). "The historical functional linear model". The Canadian Journal of Statistics. 31 (2): 115–128.

[16] He, G; Müller, HG; Wang, JL. (2003). "Functional canonical analysis for square integrable stochastic processes". Journal of Multivariate Analysis. 85 (1): 54–77.

[:5-17] Yao, F; Müller, HG; Wang, JL. (2005). "Functional data analysis for sparse longitudinal data". Journal of the American Statistical Association. 100 (470): 577–590.

[18] He, G; Müller, HG; Wang, JL; Yang, WJ. (2010). "Functional linear regression via canonical analysis". Journal of Multivariate Analysis. 16 (3): 705–729.

[19] Fan, J; Zhang, W. (1999). "Statistical estimation in varying coefficient models". The Annals of Statistics. 27 (5): 1491–1518.

[20] Wu, CO; Yu, KF. (2002). "Nonparametric varying-coefficient models for the analysis of longitudinal data". International Statistical Review. 70 (3): 373–393.

[21] Huang, JZ; Wu, CO; Zhou, L. (2002). "Varying-coefficient models and basis function approximations for the analysis of repeated measurements". Biometrika. 89 (1): 111–128.

[22] Huang, JZ; Wu, CO; Zhou, L. (2004). "Polynomial spline estimation and inference for varying coefficient models with longitudinal data". Statistica Sinica. 14 (3): 763–788.

[23] Şentürk, D; Müller, HG. (2010). "Functional varying coefficient models for longitudinal data". Journal of the American Statistical Association. 105 (491): 1256–1264.

[24] Eggermont, PPB; Eubank, RL; LaRiccia, VN. (2010). "Convergence rates for smoothing spline estimators in varying coefficient models". Journal of Statistical Planning and Inference. 140 (2): 369–381.

[yao:10-25] 야오, F; 뮐러, HG. (2010)"기능 2차 회귀 분석".바이오메트리카 97 (1:49–64).

[26] Horváth, L; Reeder, R. (2013). "A test of significance in functional quadratic regression". Bernoulli. 19 (5A): 2120–2151.

[chen:11-27] Chen, D; Hall, P; 뮐러 HG(2011)."비모수 링크가 있는 단일 및 다중 인덱스 기능 회귀 모형"통계 연보. 39(3):1720–1747.

[28] Jiang, CR; Wang JL(2011)."종방향 데이터에 대한 기능적인 단일 인덱스 모델"통계연보. 39 (1):362–388.

[29] Müller HG; Wu Y; Yao, F. (2013). "Continuously additive models for nonlinear functional regression". Biometrika. 100 (3): 607–622.{{cite journal}}: CS1 maint : 복수이름 : 작성자 목록(링크)

[30] Müller HG; Stadmüller, U. (2005). "Generalized Functional Linear Models". The Annals of Statistics. 33 (2): 774–805.{{cite journal}}: CS1 maint : 복수이름 : 작성자 목록(링크)

[31] Chiou, JM; Li, PL. (2007). "Functional clustering and identifying substructures of longitudinal data". Journal of the Royal Statistical Society, Series B (Statistical Methodology). 69 (4): 679–699.

[32] Banfield, JD; Raftery, AE. (1993). "Model-based Gaussian and non-Gaussian clustering". Biometrics. 49 (3): 803–821.

[33] James, GM; Sugar, CA. (2003). "Clustering for sparsely sampled functional data". Journal of the American Statistical Association. 98 (462): 397–408.

[34] Jacques, J; Preda, C. (2013). "Funclust: A curves clustering method using functional random variables density approximation". Neurocomputing. 112: 164–171.

[35] Jacques, J; Preda, C. (2014). "Model-based clustering for multivariate functional data". Computational Statistics & Data Analysis. 71 (C): 92–106.

[:03-36] Coffey, N; Hinde, J; Holian, E. (2014). "Clustering longitudinal profiles using P-splines and mixed effects models applied to time-course gene expression data". Computational Statistics & Data Analysis. 71 (C): 14–29.

[37] Heinzl, F; Tutz, G. (2014). "Clustering in linear-mixed models with a group fused lasso penalty". Biometrical Journal. 56 (1): 44–68.

[38] Angelini, C; Canditiis, DD; Pensky, M. (2012). "Clustering time-course microarray data using functional Bayesian infinite mixture model". Journal of Applied Statistics. 39 (1): 129–149.

[39] Rodríguez, A; Dunson, DB; Gelfand, AE. (2009). "Bayesian nonparametric functional data analysis through density estimation". Biometrika. 96 (1): 149–162.

[40] Petrone, S; Guindani, M; Gelfand, AE. (2009). "Hybrid Dirichlet mixture models for functional data". Journal of the Royal Statistical Society. 71 (4): 755–782.

[41] Heinzl, F; Tutz, G. (2013). "Clustering in linear mixed models with approximate Dirichlet process mixtures using EM algorithm". Statistical Modelling. 13 (1): 41–67.

[42] Leng, X; Müller, HG. (2006). "Classification using functional data analysis for temporal gene expression data". Bioinformatics. 22 (1): 68–76.

[43] James, GM; Hastie, TJ. (2001). "Functional linear discriminant analysis for irregularly sampled curves". Journal of the Royal Statistical Society. 63 (3): 533–550.

[44] Hall, P; Poskitt, DS; Presnell, B. (2001). "A Functional Data—Analytic Approach to Signal Discrimination". Technometrics. 43 (1): 1–9.

[45] Ferraty, F; Vieu, P. (2003). "Curves discrimination: a nonparametric functional approach". Computational Statistics & Data Analysis. 44 (1–2): 161–173.

[46] Chang, C; Chen, Y; Ogden, RT. (2014). "Functional data classification: a wavelet approach". Computational Statistics. 29 (6): 1497–1513.

[47] Zhu, H; Brown, PJ; Morris, JS. (2012). "Robust Classification of Functional and Quantitative Image Data Using Functional Mixed Models". Biometrics. 68 (4): 1260–1268.

[:0-48] Dai, X; Müller, HG; Yao, F. (2017). "Optimal Bayes classifiers for functional data and density ratios". Biometrika. 104 (3): 545–560.

[49] Delaigle, A; Hall, P (2012). "Achieving near perfect classification for functional data". Journal of the Royal Statistical Society. Series B (Statistical Methodology). 74 (2): 267–286. ISSN 1369-7412.

[wang:16-50] Wang, JL; Chiou, JM; Müller, HG. (2016). "Functional Data Analysis". Annual Review of Statistics and Its Application. 3 (1): 257–295.

[51] Gasser, T; Müller, HG; Kohler, W; Molinari, L; Prader, A. (1984). "Nonparametric regression analysis of growth curves". The Annals of Statistics. 12 (1): 210–229.

[52] Ramsay, JO; Li, X. (1998). "Curve registration". Journal of the Royal Statistical Society, Series B. 60 (2): 351–363.

[:6-53] Marron, JS; Ramsay, JO; Sangalli, LM; Srivastava, A (2015). "Functional data analysis of amplitude and phase variation". Statistical Science. 30 (4): 468–484.

[54] Sakoe, H; Chiba, S. (1978). "Dynamic programming algorithm optimization for spoken word recognition". IEEE Transactions on Acoustics, Speech, and Signal Processing. 26: 43–49.

[55] Kneip, A; Gasser, T (1992). "Statistical tools to analyze data representing a sample of curves". Annals of Statistics. 20: 1266–1305.

[56] Gasser, T; Kneip, A (1995). "Searching for structure in curve sample". Journal of the American Statistical Association. 90 (432): 1179–1188.

[:1-57] Tang, R; Müller, HG. (2008). "Pairwise curve synchronization for functional data". Biometrika. 95: 875–889.

[:2-58] Anirudh, R; Turaga, P; Su, J; Srivastava, A (2015). "Elastic functional coding of human actions: From vector-fields to latent variables". Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition: 3147–3155.

[:3-59] Dubey, P; Müller, HG (2021). "Modeling Time-Varying Random Objects and Dynamic Networks". Journal of the American Statistical Association. 0 (0): 1–16.

[60] Pigoli, D; Hadjipantelis, PZ; Coleman, JS; Aston, JAD (2017). "The statistical analysis of acoustic phonetic data: exploring differences between spoken Romance languages". Journal of the Royal Statistical Society. Series C: Applied Statistics. 67 (5): 1130–1145.

[61] Happ, C; Greven, S (2018). "Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains". Journal of the American Statistical Association. 113 (522): 649–659.

[62] Chiou, JM; Yang, YF; Chen, YT (2014). "Multivariate functional principal component analysis: a normalization approach". Statistica Sinica. 24: 1571–1596.

[63] Carroll, C; Müller, HG; Kneip, A (2021). "Cross-component registration for multivariate functional data, with application to growth curves". Biometrics. 77 (3): 839–851.

[64] Dai, X; Müller, HG (2018). "Principal component analysis for functional data on Riemannian manifolds and spheres". The Annals of Statistics. 46 (6B): 3334–3361.

[65] Chen, K; Delicado, P; Müller, HG (2017). "Modelling function-valued stochastic processes, with applications to fertility dynamics". Journal of the Royal Statistical Society. Series B (Statistical Methodology). 79 (1): 177–196.
