An-Nahar Newspaper Text Corpus

Full Official Name: An-Nahar Newspaper Text Corpus
Submission date: Jan. 24, 2014, 4:17 p.m.

The An-Nahar Lebanon Newspaper Text Corpus comprises articles in Standard Arabic from 1995 to 2000 (6 years), stored as HTML files on CD-ROM. Each year contains 45,000 articles and 24 million words. Each article includes information such as the title, newspaper name, date, country, type, and page. The size of each year's data is as follows:

1995: 128 MB
1996: 138 MB
1997: 152 MB
1998: 140 MB
1999: 130 MB
2000: 118 MB

Right Holder(s)