ISLRN

English-Vietnamese Parallel Corpus

Full Official Name: English-Vietnamese Parallel Corpus

Submission date: Jan. 17, 2018, 12:39 p.m.

This corpus contains 500,000 English-Vietnamese sentence pairs and was built for developing statistical machine translation (SMT) systems. It consists of English documents translated into Vietnamese by professional translators. The source texts include books, dictionaries, newspapers, and online news, collected between 2000 and 2007. All Vietnamese sentences have been word-segmented and morphologically analyzed. The texts are provided in TEI format.

Creator(s)

Distributor(s)

ELRA

Right Holder(s)

Status: Accepted

ISLRN:

838-483-738-912-8

Version

1.0

Source

http://catalog.elra.info/product_info.php?products_id=1316

Resource Type

Primary Text

Media Type

Text

Language(s)

English

Vietnamese

Access Medium

Downloadable