Resource: English Punjabi Parallel Corpus

Reference English Punjabi Parallel Corpus
Date of Submission April 15, 2021, 9:39 a.m.
Status accepted
ISLRN 880-749-246-493-2
Resource Type Primary Text
Media Type Text
Source
Language English, Punjabi
Description

The English-Punjabi Parallel Corpus consists of 14 lakh (1.4 million) parallel sentences in each of English and Punjabi. The sentences are extracted from comparable corpora, of varying degrees of comparability, collected from Wikipedia and other authentic language resources. Parallel sentences are extracted from the comparable corpus obtained from a Wikipedia dump using an integrated approach based on similarity measures.

Version 1.0
Creator Manpreet Singh Lehal - Lyallpur Khalsa College, Jalandhar; Vishal Goyal - Punjabi University, Patiala; Ajit Kumar - Multani Mal Modi College, Patiala
Distributor Manpreet Singh Lehal - Lyallpur Khalsa College, Jalandhar; Vishal Goyal - Punjabi University, Patiala; Ajit Kumar - Multani Mal Modi College, Patiala
Rights Holder Manpreet Singh Lehal - Lyallpur Khalsa College, Jalandhar; Vishal Goyal - Punjabi University, Patiala; Ajit Kumar - Multani Mal Modi College, Patiala