ISLRN

GLips

Full Official Name: GLips - German Lipreading Dataset

Submission date: April 29, 2022, 4:28 p.m.

The German Lipreading dataset (GLips) consists of 250,000 publicly available videos of the faces of speakers in the Hessian Parliament, which were processed for word-level lip reading using an automatic pipeline. The format is similar to that of the English-language Lip Reading in the Wild (LRW) dataset: each H264-compressed MPEG-4 video encodes one word of interest in a context of 1.16 seconds, which makes the two datasets compatible for studying transfer learning between them. Choosing video material based on naturally spoken language in a natural environment ensures more robust results for real-world applications than artificially generated datasets with as little noise as possible. Each of the 500 different spoken words, ranging from 4 to 18 characters in length, has 500 instances with separate MPEG-4 audio files and text metadata files, originating from 1018 parliamentary sessions. The complete TextGrid files containing the segmentation information for those sessions are also included. The size of the uncompressed dataset is 16 GB.
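As a quick consistency check on the figures quoted in the description, the following minimal Python sketch multiplies the number of words by the instances per word and converts the per-clip context length into a total duration. The function and constant names are illustrative only and are not part of the dataset or its tooling; the duration estimate assumes every clip lasts exactly 1.16 seconds.

```python
# Figures taken from the dataset description.
NUM_WORDS = 500            # distinct spoken words
INSTANCES_PER_WORD = 500   # video instances per word
CLIP_SECONDS = 1.16        # context length of each clip, in seconds


def dataset_totals(num_words=NUM_WORDS,
                   instances_per_word=INSTANCES_PER_WORD,
                   clip_seconds=CLIP_SECONDS):
    """Return (total number of clips, total duration in hours)."""
    total_clips = num_words * instances_per_word
    # Assumption: every clip has exactly the stated context length.
    total_hours = total_clips * clip_seconds / 3600
    return total_clips, total_hours


total_clips, total_hours = dataset_totals()
print(f"{total_clips} clips, about {total_hours:.1f} hours of video")
```

The clip count matches the 250,000 videos stated in the description, and under the stated assumption the material adds up to roughly 80 hours of video.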

Creator(s)

University of Hamburg - Gerald Schwiebert

Distributor(s)

University of Hamburg

Right Holder(s)

Hessian Parliament

University of Hamburg

Status: Accepted

ISLRN:

573-598-164-315-1

Version

1.0

Source

https://www.fdr.uni-hamburg.de/record/10048

Resource Type

Video/Speech

Media Type

Audio

Text

TextGrid

Video

Language(s)

German

Access Medium

Internet Download