Full Official Name: The PALMA corpora of African Varieties of Portuguese
The PALMA corpora are three new corpora of urban varieties of Portuguese spoken in Angola, Mozambique, and São Tomé and Príncipe, where Portuguese is increasingly spoken as a first and second language in different multilingual settings. The corpora consist of transcribed spoken data, complemented by a rich set of metadata describing the settings of the audio recordings and sociolinguistic information about the speakers. They are annotated with part-of-speech (POS) and lemma information and are available on the CQPweb platform. The corpora total 1,090,280 tokens, and each subcorpus contains roughly 320,000 to 380,000 tokens.

