ISLRN

English Punjabi Parallel Corpus

Full Official Name: English Punjabi Parallel Corpus

Submission date: April 15, 2021, 9:39 a.m.

The English-Punjabi Parallel Corpus consists of 14 lakh (1.4 million) parallel sentences in each of English and Punjabi. The sentences were extracted from comparable corpora of varying degrees of comparability, collected from Wikipedia and other authentic language resources. Parallel sentences were extracted from the comparable corpus built from the Wikipedia dump by applying similarity measures in an integrated approach.

Creator(s)

Lyallpur Khalsa College, Jalandhar - Manpreet Singh Lehal

Punjabi University, Patiala - Vishal Goyal

Multani Mal Modi College, Patiala - Ajit Kumar

Distributor(s)

Lyallpur Khalsa College, Jalandhar - Manpreet Singh Lehal

Punjabi University, Patiala - Vishal Goyal

Multani Mal Modi College, Patiala - Ajit Kumar

Right Holder(s)

Lyallpur Khalsa College, Jalandhar - Manpreet Singh Lehal

Punjabi University, Patiala - Vishal Goyal

Multani Mal Modi College, Patiala - Ajit Kumar

Status: Accepted

ISLRN:

880-749-246-493-2

Version

1.0

Source

https://shabad22.github.io/bilingual-dataset/

Resource Type

Primary Text

Media Type

Text

Language(s)

English

Panjabi

Access Medium