LipNet

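LipNet is a deep neural network for visual speech recognition. It was created by Yannis Assael, Brendan Shillingford, Shimon Whiteson, and Nando de Freitas, researchers at the University of Oxford. The technique, outlined in a paper in November 2016, can decode text from the movements of a speaker's mouth.^[1] Traditional approaches to visual speech recognition split the problem into two stages: designing or learning visual features, and prediction. LipNet was the first end-to-end sentence-level lipreading model to learn spatiotemporal visual features and a sequence model simultaneously.^[2] Audio-visual speech recognition has considerable practical potential, with applications in improved hearing aids, medical applications such as improving the recovery and wellbeing of critically ill patients,^[3] and speech recognition in noisy environments,^[4] such as Nvidia's autonomous vehicles.^[5]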
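
As a rough illustration of what "spatiotemporal visual features" means, the sketch below applies a single three-dimensional convolution (time × height × width) to a toy video in plain Python. The function name `conv3d_valid`, the toy frames, and the hand-written kernel are assumptions made for illustration only. This is not LipNet's implementation, in which such filters are learned jointly with the sequence model rather than fixed by hand.

```python
# Illustrative sketch only: one 3D (time x height x width) convolution in
# plain Python. All names and shapes here are hypothetical, not LipNet's layers.

def conv3d_valid(video, kernel):
    """Slide `kernel` over `video` (nested lists indexed [t][y][x]) with
    stride 1 and no padding, returning the resulting feature volume."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over a window that spans several frames AND pixels.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy 3-frame, 2x2 "mouth crop": a bright pixel appears only in the last frame.
video = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 0.0], [0.0, 0.0]],
]

# Kernel spanning 2 frames and 1x1 pixels: it responds to change between
# consecutive frames (motion) rather than to static appearance.
kernel = [[[-1.0]], [[1.0]]]

features = conv3d_valid(video, kernel)
print(features)
```

The output has one time step fewer than the input, and only the step covering the last two frames is non-zero, at the pixel that changed. A feature of this kind encodes movement across frames, which a per-frame (purely spatial) filter cannot capture.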

References

[1] Assael, Yannis M.; Shillingford, Brendan; Whiteson, Shimon; de Freitas, Nando (2016-12-16). "LipNet: End-to-End Sentence-level Lipreading". arXiv:1611.01599 [cs.LG].
[2] "AI that lip-reads 'better than humans'". November 8, 2016 – via www.bbc.com.
[3] "Home Elementor". Liopa.
[4] Vincent, James (November 7, 2016). "Can deep learning help solve lip reading?". The Verge.
[5] Quach, Katyanna. "Revealed: How Nvidia's 'backseat driver' AI learned to read lips". www.theregister.com.

