The ONOMASTICA project was a Europe-wide research initiative within the Linguistic Research and Engineering Programme. Its aim was to construct a multi-language pronunciation lexicon of proper names. The project covered eleven European languages: Danish, Dutch, English, French, German, Greek, Italian, Norwegian, Portuguese, Spanish and Swedish.

Although the ONOMASTICA project ended in June 1995, the work continued in a new project funded by the European Commission's Copernicus Programme. New partners joined to address names in Eastern and Central European languages: Czech, Estonian, Latvian, Polish, Romanian, Slovak, Slovenian and Ukrainian.

The corpus consists of a collection of 1,783,390 transcriptions of 1,705,653 names, broken down as follows:

· Czech: 257,700 entries consisting of 244,025 names, prepared by Dr. Pavel Kolar of the Language Institute, Silesian University, Opava, Czech Republic.
· Estonian: 209,515 entries consisting of 208,380 names, prepared by Dr. Peeter Päll of the Institute for the Estonian Language, Estonian Academy of Sciences, Tallinn, Estonia.
· Latvian: 258,214 entries consisting of 245,331 names, prepared by Dr. Andrejs Spektors of the Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia.
· Polish: 285,412 entries consisting of 244,632 names, prepared by Prof. Wiktor Jassem of the Institute of Fundamental Technological Research, Polish Academy of Sciences, Poznań, Poland.
· Slovak: 228,257 entries consisting of 228,257 names, prepared by Dr. Peter Durco of the Department of Foreign Languages, Police Academy of the Slovak Republic, Bratislava, Slovak Republic.
· Slovenian: 285,862 entries consisting of 283,449 names, prepared by Dr. Zdravko Kacic of the Faculty of Technical Sciences, University of Maribor, Maribor, Slovenia.
· Ukrainian: 258,430 entries consisting of 251,579 names, prepared by Dr. Yevgeniy Ludovik of the Institute of Cybernetics, Ukraine Academy of Sciences, Kiev, Ukraine.

The databases are provided in Microsoft Access format and in ASCII text format, together with database browser software prepared by Keith Edwards of the Centre for Communication Interface Research, The University of Edinburgh.