
The Alphabets of Europe

Michael Everson
Version 3.0


Contents

0 Introduction
1 Scope
1.1 The geographical area of Europe
1.2 The languages of Europe
1.3 The scripts of Europe
2.0 Definitions
3.0 Criteria for evaluation
4.0 Writing systems
5.0 Repertoires
5.1 Punctuation
5.2 Digits and numbers
Annex A Administrative units of Europe
Annex B European Sign Languages
Annex C Changes from previous versions

0 Introduction

The Alphabets of Europe provides a source of linguistic data for the indigenous languages of Europe. The use of the term indigenous (or autochthonous) indicates that this report covers the languages native to the European geographical area. “Nativeness” is to be understood in an academic linguistic sense. Other languages, more recently imported or transplanted to Europe (Vietnamese and Bengali, for instance) are not covered by this report (Greenlandic and Kazakh are exceptions here). The exclusion of such languages from this report is not intended to imply any bias whatsoever against such “immigrant” languages or their speakers. However, it is relatively easy to get information on the orthography of Vietnamese and Bengali, while many of Europe’s indigenous minority languages have been poorly served in the area of Information Technology – if, indeed, they are acknowledged at all. The Alphabets of Europe serves to remedy that oversight.

The main function of these pages is to present a catalogue of European alphabets. The characters which are, and in some cases were, used to write each of the languages of Europe (as far as it has been possible to find information on them), are included here. Some of Europe’s languages (particularly in the Caucasus) still have no tradition of writing, though other information on such languages is provided here when it is available. Likewise, some languages have used, or continue to use, one or more than one writing system, which may also be reflected here.

The Alphabets of Europe could not have been compiled without the input of many, many people, and the difficult nature of the material presented here begs for explicit acknowledgement of the abundant expertise which has been contributed. Among these are the following: Celso Alvarez Cáccamo, Wolf Arfvidson, Baldur Jónsson, Uldis Balodis, Nelson H. F. Beebe, Elżbieta Broma-Wrzesień, Gerhard Budin, Bernard Chauvois, John Clews, Philippe Deschamp, Paul Dettmer, Charles Gribble, Gintautas Grigas, Borka Jerman-Blažič, Mike Jones, Peter Kirk, Jörg Knappen, Erkki I. Kolehmainen, Mike Ksar, Marc Wilhelm Küster, Chris Lilley, Ferran Lupescu, Stavros Macrakis, Chris Makemson, Christopher Miller, Åke Persson, Hugh McGregor Ross, Klaas Ruppel, Keld Simonsen, Xulio C. Sousa Fernández, Monica Ståhl, Alexandrina Stătescu, Libor Sztemon, Trond Trosterud, V. S. Umamaheswaran, Luc van den Berghe, Uwe Waldmann, Max Wheeler, Þorgeir Sigurðsson, and Þorvarður Kári Ólafsson. Very special thanks are due to Judy Nye (University Research Library, University of California, Los Angeles), for granting the editor a week’s indulgence as he photocopied hundreds of pages of material in 1997.

1 Scope

The Alphabets of Europe gives information on characters used in Europe’s indigenous languages. In order to accomplish this, a definition of the geographical area covered has been given to assist the reader.

1.1 The geographical area of Europe

The Alphabets of Europe uses the following geographical and geophysical definition of Europe:
“Europe” extends from the Arctic and Atlantic (including Iceland and the Faroe Islands) southeastwards to the Mediterranean (including Malta and Cyprus), with its eastern and southern borders being the Ural Mountains, the Ural River, the Caspian Sea, and Anatolia, inclusive of Transcaucasia.
This report also includes languages found in the following areas:
Anatolian Turkey, Greenland
Information concerning the administrative units covered by this geographical definition can be found in Annex A. It is important to note that this is a geolinguistic survey. It is not a political survey. The area defined here may be seen on page xiv, “Geographical Comparisons”, in The Times Atlas of the World: comprehensive edition, 1990 (ISBN 0-7230-0346-7).

1.2 The languages of Europe

A convenient way of enumerating the languages of Europe is to do so by linguistic family. The classification used in The Alphabets of Europe is based on, but is not identical with, the classification found in Merritt Ruhlen’s A guide to the world’s languages. Volume 1: Classification, 1992 (ISBN 0-340-56186-6), which is a well-defined taxonomy with a bibliography helpful for further study.

Most Europeans speaking indigenous European languages speak Indo-European or Uralic languages, but five other language families are also represented in Europe. The intent of The Alphabets of Europe is to be neutral with respect to language; its task is to document alphabets, not to rank languages in any particular way. Accordingly, languages are listed by family and subfamily in clause 1.2.1, and alphabetically in clause 1.2.2. An asterisk (*) following a language name in the indices indicates that the language has no standard literary orthography. For some of these languages, the populations speaking them are rather large; conversely, some of the languages with standard orthographies have very small numbers of speakers. For each language, an estimated population has been given. The population estimates for European languages are best taken with a spoonful of salt. Accurate censuses have never been taken. For some communities, where the entire population is more or less identical to the number of speakers (Iceland for instance), the number given may be considered to be fairly reliable. Much of the Soviet data was collected on the basis of reasonably robust and comprehensive questionnaires involving self-identification. Sources for the population estimates are given; readers must use their own judgement as to the usefulness of this data.

In The Alphabets of Europe, all languages are considered equal insofar as their alphabets and the field of Information Technology are concerned. The fact that there are something like 90,000,000 German speakers and something like 12,000 Rutul speakers means that spell-checkers and grammar checkers might be expected to be made available for the former in the short term, but not for the latter. No particular recommendations are made with regard to such implementation. What is strongly recommended is that all the letters of all the standard literary alphabets of Europe be representable in Unicode and ISO/IEC 10646.

1.2.1 Genetic index of languages

In this HTML document, clicking on the name of a language will retrieve a PDF file with data for that language. (Parentheses around a repertoire name indicate that the PDF file available is a dummy document, because information on the language’s orthography is not available or has not been processed.)

Afro-Asiatic languages
Afro-Asiatic: Semitic: West: Central: Arabo-Canaanite: Arabic: Maltese
Afro-Asiatic: Semitic: West: Central: Aramaic: (Aisor)

Basque
Basque

Caucasian languages
Caucasian: South: Georgian
Caucasian: South: (Judeo-Georgian)
Caucasian: South: Svan
Caucasian: South: Zan: Laz
Caucasian: South: Zan: Mingrelian
Caucasian: North: Northwest: Ubykh *
Caucasian: North: Northwest: Abkhaz-Abaza: Abaza
Caucasian: North: Northwest: Abkhaz-Abaza: Abkhaz
Caucasian: North: Northwest: Circassian: Adyghe
Caucasian: North: Northwest: Circassian: Kabardian
Caucasian: North: Northeast: Nakh: Bats *
Caucasian: North: Northeast: Nakh: Chechen-Ingush: Chechen
Caucasian: North: Northeast: Nakh: Chechen-Ingush: Ingush
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Avar
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: (Andi) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: (Botlikh) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: Godoberi *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: (Chamalal) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: (Bagulal) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: (Tindi) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: (Karata) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Andi: Akhvakh *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Dido: (Khvarshi) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Dido: Dido-Hinukh: (Hinukh) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Dido: Dido-Hinukh: (Tsez) *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Dido: Bezhta-Hunzib: Bezhta *
Caucasian: North: Northeast: Dagestan: Avaro-Andi-Dido: Dido: Bezhta-Hunzib: (Hunzib) *
Caucasian: North: Northeast: Dagestan: Lak-Dargwa: Dargwa
Caucasian: North: Northeast: Dagestan: Lak-Dargwa: Lak
Caucasian: North: Northeast: Dagestan: Lezgian: Archi *
Caucasian: North: Northeast: Dagestan: Lezgian: Khinalug
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: Agul *
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: Budukh *
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: Lezgian
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: (Kryts)
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: Rutul
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: Tabasaran
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: Tsakhur *
Caucasian: North: Northeast: Dagestan: Lezgian: Lezgian Proper: Udi

Eskimo-Aleut languages
Eskimo-Aleut: Eskimo: Inuit: Greenlandic

Indo-European languages (IE)
IE: Armenian: Armenian
IE: Indo-Iranian: Indic: (Romani)
IE: Indo-Iranian: Iranian: East: Northeast: West Scythian: Ossetian
IE: Indo-Iranian: Iranian: West: Northwest: Kurdish: (Kirmanji)
IE: Indo-Iranian: Iranian: West: Northwest: Kurdish: (Judeo-Kurdish)
IE: Indo-Iranian: Iranian: West: Northwest: Kurdish: Kurdish
IE: Indo-Iranian: Iranian: West: Northwest: Talysh: (Talysh)
IE: Indo-Iranian: Iranian: West: Southwest: (Judeo-Tati)
IE: Indo-Iranian: Iranian: West: Southwest: Tati
IE: Albanian: Albanian
IE: Albanian: (Arvanite)
IE: Greek: Greek
IE: Greek: (Tsakonian) *
IE: Italic: Latino-Faliscan: Latin
IE: Italic: Latino-Faliscan: Romance: (Sardinian) *
IE: Italic: Latino-Faliscan: Romance: Continental: Artificial: Esperanto
IE: Italic: Latino-Faliscan: Romance: Continental: East: North: Istro-Romanian
IE: Italic: Latino-Faliscan: Romance: Continental: East: North: Romanian and Moldavian
IE: Italic: Latino-Faliscan: Romance: Continental: East: South: Arumanian
IE: Italic: Latino-Faliscan: Romance: Continental: East: South: Megleno-Romanian
IE: Italic: Latino-Faliscan: Romance: Continental: West: Italic: Italian: (Corsican)
IE: Italic: Latino-Faliscan: Romance: Continental: West: Italic: Italian: Italian
IE: Italic: Latino-Faliscan: Romance: Continental: West: Rhaeto-Romance: Friulian
IE: Italic: Latino-Faliscan: Romance: Continental: West: Rhaeto-Romance: (Ladin)
IE: Italic: Latino-Faliscan: Romance: Continental: West: Rhaeto-Romance: Romansch
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Gallic: North: (Franco-Provençal)
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Gallic: North: French
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Gallic: North: (Walloon)
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Gallic: South: Occitan
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Gallic: South: Catalan
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Iberic: North: Central: (Aragonese)
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Iberic: North: Central: (Ladino)
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Iberic: North: Central: Spanish
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Iberic: North: West: Asturian
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Iberic: North: West: Galician
IE: Italic: Latino-Faliscan: Romance: Continental: West: Gallo-Iberic: Iberic: North: West: Portuguese
IE: Celtic: Insular: Goidelic: Irish Gaelic
IE: Celtic: Insular: Goidelic: Manx Gaelic
IE: Celtic: Insular: Goidelic: Scottish Gaelic
IE: Celtic: Insular: Brythonic: Breton
IE: Celtic: Insular: Brythonic: Cornish
IE: Celtic: Insular: Brythonic: Welsh
IE: Germanic: North: East: (Älvdalska)
IE: Germanic: North: East: Danish
IE: Germanic: North: East: Swedish
IE: Germanic: North: East: (Våmhusmål)
IE: Germanic: North: West: Faroese
IE: Germanic: North: West: Icelandic
IE: Germanic: North: West: Bokmål Norwegian
IE: Germanic: North: West: Nynorsk Norwegian
IE: Germanic: West: Continental: East: German
IE: Germanic: West: Continental: East: Luxemburgish
IE: Germanic: West: Continental: East: (Yiddish)
IE: Germanic: West: Continental: West: Dutch
IE: Germanic: West: Continental: West: (Low German)
IE: Germanic: West: North Sea: Frisian: (East Frisian)
IE: Germanic: West: North Sea: Frisian: (North Frisian)
IE: Germanic: West: North Sea: Frisian: West Frisian
IE: Germanic: West: North Sea: English: English
IE: Germanic: West: North Sea: English: Scots
IE: Balto-Slavic: Baltic: East: Latvian
IE: Balto-Slavic: Baltic: East: Lithuanian
IE: Balto-Slavic: Slavic: East: North: Belarusian
IE: Balto-Slavic: Slavic: East: North: Russian
IE: Balto-Slavic: Slavic: East: South: (Rusyn)
IE: Balto-Slavic: Slavic: East: South: Ukrainian
IE: Balto-Slavic: Slavic: West: North: Kashubian
IE: Balto-Slavic: Slavic: West: North: Polish
IE: Balto-Slavic: Slavic: West: Central: Lower Sorbian
IE: Balto-Slavic: Slavic: West: Central: Upper Sorbian
IE: Balto-Slavic: Slavic: West: South: Czech
IE: Balto-Slavic: Slavic: West: South: Slovak
IE: Balto-Slavic: Slavic: South: Old Church Slavonic
IE: Balto-Slavic: Slavic: South: West: Serbian, Croatian, Bosnian, and Montenegrin
IE: Balto-Slavic: Slavic: South: West: Slovenian
IE: Balto-Slavic: Slavic: South: East: Bulgarian
IE: Balto-Slavic: Slavic: South: East: Macedonian

Mongolian languages
Mongolian: East: Oirat-Khalkha: Oirat-Kalmyk: Kalmyk

Turkic languages
Turkic: Bolgar: Chuvash
Turkic: Common Turkic: South: (Crimean Turkish)
Turkic: Common Turkic: South: Gagauz
Turkic: Common Turkic: South: Turkish
Turkic: Common Turkic: South: Azerbaijani: Azerbaijani
Turkic: Common Turkic: West: Bashkir
Turkic: Common Turkic: West: Kumyk-Karachay: Balkar
Turkic: Common Turkic: West: Kumyk-Karachay: Karachay
Turkic: Common Turkic: West: Kumyk-Karachay: (Karaim)
Turkic: Common Turkic: West: Kumyk-Karachay: Kumyk
Turkic: Common Turkic: West: Tatar: Baraba Tatar
Turkic: Common Turkic: West: Tatar: Crimean Tatar
Turkic: Common Turkic: West: Tatar: Kazan Tatar
Turkic: Common Turkic: Central: Kazakh
Turkic: Common Turkic: Central: Nogai

Uralic languages
Uralic: Samoyed: North: Tundra Nenets
Uralic: Finno-Ugric: Ugric: Hungarian
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: North: Finnish
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: North: Ingrian
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: North: Karelian
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: North: (Ludian)
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: North: (Olonets)
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: North: Vepsian
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: South: Votic
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: South: Estonian
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Finnic: South: Livonian
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: Central: Lule Sami
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: Central: Northern Sami
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: East: Inari Sami
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: East: Kildin Sami
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: East: Skolt Sami
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: East: (Ter Sami) *
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: South: Southern Sami
Uralic: Finno-Ugric: Finno-Permic: Finno-Samic: Samic: South: (Ume Sami) *
Uralic: Finno-Ugric: Finno-Permic: Permic: Komi
Uralic: Finno-Ugric: Finno-Permic: Permic: Komi-Permyak
Uralic: Finno-Ugric: Finno-Permic: Permic: Udmurt
Uralic: Finno-Ugric: Finno-Permic: Volgaic: Mari: Hill Mari
Uralic: Finno-Ugric: Finno-Permic: Volgaic: Mari: Meadow Mari
Uralic: Finno-Ugric: Finno-Permic: Volgaic: Mordvin: Erzya
Uralic: Finno-Ugric: Finno-Permic: Volgaic: Mordvin: Moksha
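
The entries in the genetic index follow a regular pattern: classification labels separated by colons, the language name last, parentheses for a dummy document, and a trailing asterisk for a language with no standard literary orthography. As a minimal sketch, assuming the lines are available as plain strings, they could be read into records as follows. The names `IndexEntry` and `parse_index_line` are hypothetical and are not part of this report.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class IndexEntry:
    path: Tuple[str, ...]           # family and subfamily labels, outermost first
    language: str                   # language name with markers removed
    dummy_document: bool            # True when the name is in parentheses
    no_standard_orthography: bool   # True when the entry ends with an asterisk


def parse_index_line(line: str) -> IndexEntry:
    """Split one genetic-index line into its classification path and flags."""
    text = line.strip()
    starred = text.endswith("*")
    if starred:
        text = text[:-1].rstrip()
    # Everything before the last colon is classification; the rest is the name.
    *path, name = [part.strip() for part in text.split(":")]
    dummy = name.startswith("(") and name.endswith(")")
    if dummy:
        name = name[1:-1]
    return IndexEntry(tuple(path), name, dummy, starred)


entry = parse_index_line("Caucasian: North: Northeast: Nakh: Bats *")
print(entry.language, entry.path, entry.no_standard_orthography)
```

An isolate such as Basque simply yields an empty path.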

1.2.2 Alphabetic index of languages

In this HTML document, clicking on the name of a language will retrieve a PDF file with data for that language. (Parentheses around a repertoire name indicate that the PDF file available is a dummy document, because information on the language’s orthography is not available or has not been processed.)

Abaza
Abkhaz
Adyghe
Agul *
(Aisor)
Akhvakh *
Albanian
(Älvdalska)
(Andi) *
(Aragonese)
Archi *
Armenian
Arumanian
(Arvanite)
Asturian
Avar
Azerbaijani
(Bagulal) *
Balkar
Bashkir
Basque
Bats *
Belarusian
Bezhta *
Bosnian
(Botlikh) *
Breton
Budukh *
Bulgarian
Catalan
(Chamalal) *
Chechen
Chuvash
Cornish
(Corsican)
Croatian
Czech
Danish
Dargwa
Dutch
English
Erzya
Esperanto
Estonian
Faroese
Finnish
(Franco-Provençal)
French
(Frisian, East)
(Frisian, North)
Frisian, West
Friulian
Gagauz
Gaelic, Irish
Gaelic, Manx
Gaelic, Scottish
Galician
Georgian
German
(German, Low)
(German, Swiss)
Godoberi *
Greek
Greenlandic
(Hinukh) *
Hungarian
(Hunzib) *
Icelandic
Ingrian
Ingush
Istro-Romanian
Italian
(Judeo-Georgian)
(Judeo-Kurdish)
(Judeo-Tati)
Kabardian
Kalmyk
Karachay
(Karaim)
(Karata) *
Karelian
Kashubian
Kazakh
Khinalug
(Khvarshi) *
(Kirmanji)
Komi
Komi-Permyak
(Kryts)
Kumyk
(Kurdish)
(Ladin)
(Ladino)
Lak
Latin
Latvian
Laz
Lezgian
Lithuanian
Livonian
(Ludian)
Luxemburgish
Macedonian
Maltese
Mari, Hill
Mari, Meadow
Megleno-Romanian
(Mingrelian)
Moksha
Moldavian
Nenets, Tundra
Nogai
Norwegian, Bokmål
Norwegian, Nynorsk
Occitan
Old Church Slavonic
(Olonets)
Ossetian
Polish
Portuguese
(Romani)
Romanian
Romansch
Russian
(Rusyn)
Rutul
Sami, Inari
Sami, Kildin
Sami, Lule
Sami, Northern
Sami, Skolt
Sami, Southern
(Sami, Ter) *
(Sami, Ume) *
(Sardinian) *
Scots
Serbian
Slovak
Slovenian
Sorbian, Lower
Sorbian, Upper
Spanish
Svan
Swedish
Tabasaran
(Talysh)
Tatar, Crimean
Tatar, Kazan
Tati
(Tindi) *
(Tsakonian) *
Tsakhur *
(Tsez) *
Turkish
(Turkish, Crimean)
Ubykh *
Udi
Udmurt
Ukrainian
(Våmhusmål)
Vepsian
Votic
(Walloon)
Welsh
(Yiddish)

1.3 The scripts of Europe

The languages listed in 1.2 employ the Latin, Greek, Cyrillic, Armenian, Hebrew, Arabic, and Georgian scripts. Formerly, the Linear A, Linear B, Cypriot, Old Italic, Iberian, Ogham, Runic, Old Hungarian, Old Permic, and Glagolitic scripts were used to write European languages.
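
For readers who want to check which of the seven modern scripts a piece of text uses, the following is a rough sketch in Python. It relies on the observation that the Unicode character names of letters in these scripts begin with the script name; this is a heuristic assumed for illustration, not the formal Unicode Script property, and the function names are hypothetical.

```python
import unicodedata

# The seven scripts named above as being in current use.
SCRIPTS = ("LATIN", "GREEK", "CYRILLIC", "ARMENIAN", "HEBREW", "ARABIC", "GEORGIAN")


def script_of(ch):
    """Return the script named at the start of the character's Unicode name, or None."""
    name = unicodedata.name(ch, "")
    first_word = name.split(" ", 1)[0]
    return first_word if first_word in SCRIPTS else None


def scripts_in(text):
    """Collect the scripts of all characters in the text, ignoring spaces, digits, etc."""
    return {script for script in map(script_of, text) if script is not None}


print(scripts_in("Malti"))
print(scripts_in("ქართული"))
```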

2.0 Definitions

The following definitions apply to The Alphabets of Europe:

2.1 abjad

A structured collection of graphic symbols used to represent one or more languages, having specific characters for consonants. The Arabic and Hebrew scripts are abjads. The term is derived from the sounds of the first four basic letterforms of the Arabic alphabet, ALEF, BEH, JEEM, and DAL. (Peter T. Daniels & William Bright, The world’s writing systems, 1996 (ISBN 0-19-507993-0))

2.2 alphabet

A structured collection of graphic symbols used to represent one or more languages, having specific characters for vowels and consonants. The Latin, Greek, Cyrillic, Armenian, and Georgian scripts are alphabets. The term is derived from the first two letters of the Greek alphabet, ALPHA and BETA.
NOTE: To write Yiddish and Ladino, the Hebrew abjad is used as an alphabet.

2.3 autochthonous

Belonging to the original or earliest known inhabitants of a country; aboriginals. (Concise Oxford)

2.4 indigenous

(Esp. of flora or fauna) originating naturally in a region; (of a people) born in a region; (foll. by to) belonging naturally to a place. (Concise Oxford)

2.5 letter

An element of an abjad or alphabet.

2.6 writing system

An abjad or alphabet combined with a collection of numerals, punctuation marks, and other symbols to represent one or more languages.

3.0 Criteria for evaluation of the repertoires

The following criteria form the basis of the review of characters included in the alphabet of each repertoire.

Criterion 1: Letters for inclusion should be determined by definitive and authoritative reference works, if available. Some languages have official institutions governing orthography and usage (examples: L’Académie française for French, Norsk språkråd for Nynorsk Norwegian and Bokmål Norwegian). Other languages have unofficial but respected institutions governing orthography and usage (example: Oxford University Press for English). Most languages have no official institutions, but are described in dictionaries, educational materials, scholarly linguistic texts, and other kinds of documents.

NOTE: Users of this document may choose to weigh the authority for an entry hierarchically:
  1. national standard
  2. official institution
  3. unofficial institution
  4. dictionary
  5. educational material
  6. linguistic description
  7. other description
The source references used for each repertoire are stated in the section dealing with each language. This will allow the user to make an informed judgement of the authenticity of the selection of letters included in an alphabet.
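
The hierarchical weighting suggested in the note can be sketched as a simple ranking. The source titles below are invented placeholders, and `authority_rank` and `most_authoritative` are hypothetical helper names; only the seven category labels and their order come from the list above.

```python
# Categories in the order given in the note to Criterion 1 (1 = highest authority).
AUTHORITY_ORDER = [
    "national standard",
    "official institution",
    "unofficial institution",
    "dictionary",
    "educational material",
    "linguistic description",
    "other description",
]


def authority_rank(kind):
    """Return 1 for the most authoritative category, 7 for the least."""
    return AUTHORITY_ORDER.index(kind) + 1


def most_authoritative(sources):
    """Pick the (title, kind) pair whose kind ranks highest in the hierarchy."""
    return min(sources, key=lambda source: authority_rank(source[1]))


# Placeholder sources for illustration only.
sources = [
    ("Hypothetical school primer", "educational material"),
    ("Hypothetical academy dictionary", "official institution"),
    ("Hypothetical field grammar", "linguistic description"),
]
print(most_authoritative(sources))
```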

Criterion 2: The selection should be supplemented by usage in literature of well-informed writers in works published by organizations which are recognized to be particular about correct normal usage.

NOTE: A certain snobbishness with regard to notions of “good typography” or “proper spelling” is inherent in this criterion. An example would be the preference of the spelling façade over facade in English.

Criterion 3: Letters from other languages commonly used in texts of a given language may be included if, in good usage by well-informed writers, they are used naturally in the recipient language. Commonly used personal names, such as French names in Breton texts or Russian names in Tatar texts, would fall into this category. Rarely used letters, however, would normally not.

Criterion 4: Letters from loanwords in common usage within the regions in which the language is spoken may be included. (It is unlikely that this criterion would add letters not already included as the result of the previous criteria.)

4.0 Writing systems

Modern European writing systems are alphabetic, not syllabic (like Ethiopic or Cherokee) or logographic (like Chinese). Together with its own alphabet, each of Europe’s languages uses a set of punctuation marks which, in general, is standardized between the scripts. (Exceptions or additions to this set are discussed in the relevant sections below.) Decimal digits are also in standard use, though the Arabic script has unique glyphs for these. All European scripts also have a system of non-decimal numbers derived by giving numeric values to the letters. Decimal numerals are used for calculations; alphabetic numerals are often used for the pagination of front matter, indexing and so forth. (The Alphabets of Europe does not give further information on alphabetic numeral systems.) European scripts typically have a fixed number of basic letters, to which additional letters are appended for use with particular languages. Some of these additional letters are also basic letters which are not to be identified with any other letter; others are derived letters created either by some deformation of the basic letter itself or by adding some diacritic mark or sign to it. The Latin, Greek, Cyrillic, and Armenian alphabets have case, that is, almost all letters have both a capital and a small form. Hebrew, Arabic, and Mkhedruli Georgian do not share this feature. Latin, Greek, Cyrillic, Armenian, and Georgian are written from left to right; Hebrew and Arabic are written from right to left. Hebrew and Arabic are non-European scripts used to write some European languages.
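
The statements about case and writing direction can be checked against the Unicode character data shipped with Python. This is a sketch under stated assumptions: one sample letter stands in for each script, the helper names are hypothetical, and Georgian is left out of the case check because the result can depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def is_right_to_left(ch):
    """True when the character's bidirectional class is R (e.g. Hebrew) or AL (Arabic)."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


def has_case_pair(ch):
    """True when upper-casing and lower-casing the character give different results."""
    return ch.upper() != ch.lower()


# One sample letter per script, chosen only for illustration.
samples = {
    "Latin": "a",
    "Greek": "α",
    "Cyrillic": "я",
    "Armenian": "ա",
    "Hebrew": "א",
    "Arabic": "ب",
}

for script, letter in samples.items():
    print(script, "case:", has_case_pair(letter), "right-to-left:", is_right_to_left(letter))
```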

4.1 Punctuation and numbers

Punctuation marks to indicate major breaks in text are relatively ancient. In the earliest texts, spacing, a single dot or stroke, or multiple dots or strokes, were often the only marks used. Over the past 400 years or so, however, a standard set of punctuation marks has evolved which is generally used in common regardless of script. The Alphabets of Europe provides a repertoire of punctuation characters.

4.2 How to read the repertoires

  • For each language, first the name of the language is given in English, followed by the original name of the language in its natural spelling, with a transliteration into Latin letters in parentheses where the original language does not use the Latin script. Some unwritten language names are given in the International Phonetic Alphabet, always set off with [square brackets].
  • The version number of the repertoire is given. The base version was Version 1.0. When comments are received for a particular language and any change to the repertoire made (including notation of the approval of an authoritative source), the version number is increased to 1.1, 1.2, etc.
  • The repertoire itself is given, in the alphabetical order found in the sources. It includes digraphs, trigraphs, or tetragraphs used as “letters” for alphabetizing, when a language is subject to this practice.
    Example: In Welsh, the order “A, B, C, CH, D...” indicates that, in the sources, all words beginning with CH follow all words beginning with CY and precede words beginning with DA (afon, blwyddyn, cŵn, cyngres, chwech, dydd).

    The punctuation conventions in the repertoires are important. Commas separate letters of the alphabet considered unique at the first level of alphabetic ordering.

    Example: In Northern Sami, the order “A, Á, B, C...” indicates that, in the sources, A and Á are considered separate letters and would have their own chapters in a dictionary (arahad, arába, áhkku, bures, cavihit, cábadit).

    Letters in (parentheses) are fundamental letters normal to the alphabet of a language, used in writing native or naturalized (non-foreign) words, but which are, in the sources, interfiled with the base letter.

    Example: In Irish Gaelic, the order “A (Á), B, C, D...” indicates that, in the sources, A and Á are considered to be variants of the same letter of the alphabet, not different letters, as they are in Northern Sami (ábhar, áir, amháin, áth, bó, ceol, doras).

    [Square brackets] around a letter indicate that, in the sources, a letter is a) usually listed in the alphabet but is only ever used to represent foreign names and words; b) never or rarely listed in the alphabet (in schoolbooks, for instance) but used to represent foreign names and words; or c) never listed as part of the alphabet but often used to represent foreign names and words.

    Example: In Irish Gaelic, the order “A (Á) [À], B, C, D...” describes the fact that words from Scottish Gaelic (such as Gàidhlig) may appear in Irish Gaelic texts.
    Letters in [square brackets] often have their own place in the alphabetic order in the sources.

    {Curly brackets} around a letter indicate letters which may be found in grammars or dictionaries but not in ordinary texts. Letters in curly brackets never have their own place in the alphabetic order in the sources, but are always considered variants of another letter. Examples can be found in the repertoire for Danish.

    NOTE 1: The ordering presented for the repertoire of a language is descriptive, for convenience only, and should not be taken as an authoritative specification. The presentation, in general, reflects the ordering found in source documents.

    NOTE 2: It is known that some languages have, formally or informally, more than one tradition of alphabetical ordering. This information is noted, with greater or lesser precision, on the relevant repertoires when it is available.

  • Notes regarding special characters (such as punctuation) used or other special uses of characters in the language follow the main repertoire.

  • The typical quotation marks used in the language are given where this information is available. In some cases, especially in the case of the “lesser-used” languages, this information may have been inferred from the preferred quotation marks used by a “dominant” language in the area in which the “lesser-used” language is found. This is particularly true in the case of many languages found in the Russian Federation. The information on quotation mark usage should be used with care.
    NOTE: The editor has tried to be careful, but caveat emptor should be the watchword of the implementor. In the case of Tundra Nenets, for instance, because ’ and ” are used as letters of the alphabet, it is fairly certain that « and » are used as quotation marks (as they are in Russian), even though the sources consulted do not give this information explicitly.

  • Unicode and ISO/IEC 10646 characters appearing to this point in each repertoire are listed by ranges of code points in hexadecimal notation. (Characters elsewhere on the page, such as in the bibliographical sources for the repertoire, are not listed here.)

  • Letters appearing in the repertoire but not found as individual characters in the Unicode and ISO/IEC 10646 are then listed, showing the glyph and giving a notional (but not standardized) name. In some cases, these characters may be representable as sequences of characters; in other cases, they cannot be represented by the standard. Specification of the combining sequences is outside the scope of The Alphabets of Europe.
    Example: In Svan, the letter GEORGIAN LETTER AN WITH DIAERESIS can be represented by the sequence U+10D0 and U+0308. The letter GEORGIAN LETTER ELIFI cannot (when this was originally written in 1998-11 – it has since been added to the standard) be represented by any character in the UCS.

  • Sources used in compiling the repertoire are given. These references are presented in the language of the source as they appear on the title page of that source, because transliteration for these titles may differ greatly from country to country in national library practices, and it has been considered impractical to try to be comprehensive or to prefer any particular transliteration practice for the presentation of this information. Transliteration tables for European scripts are not difficult to procure – most national encyclopedias contain this information, for instance.

    Where a repertoire has been specified or confirmed by an authoritative source, information about that specification has been given before the bibliographical references.

    NOTE: Many of the sources, especially those sources printed in the former Soviet Union, have two title pages, one in Russian and one in another language. In such a case, the primary title page information is given as the principal form and the secondary title page (or bibliographical information found elsewhere in the work) is given either in parentheses (for the author’s name) or following an equals sign = (for the title). An example of this can be found on the page for Abaza.
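
The punctuation conventions described in 4.2 (commas, parentheses, square brackets, and curly brackets) lend themselves to mechanical reading. What follows is a minimal sketch, not a specification: it parses a repertoire string into comma-separated groups and derives a first-level sort key in which digraphs count as single letters and bracketed letters share the rank of their group. The names `Group`, `parse_repertoire`, and `first_level_key` are hypothetical, and the Welsh repertoire is only the prefix quoted in the example above, so letters beyond D are treated as unknown and sorted last.

```python
import re
from dataclasses import dataclass, field
from typing import List

# One token is a (parenthesised), [square-bracketed], {curly-bracketed} or plain letter.
TOKEN = re.compile(r"\(([^)]+)\)|\[([^\]]+)\]|\{([^}]+)\}|([^\s()\[\]{}]+)")


@dataclass
class Group:
    plain: List[str] = field(default_factory=list)           # unique at the first level
    interfiled: List[str] = field(default_factory=list)      # (parentheses)
    foreign: List[str] = field(default_factory=list)         # [square brackets]
    reference_only: List[str] = field(default_factory=list)  # {curly brackets}


def parse_repertoire(text):
    """Split a repertoire string at commas and classify each letter by its brackets."""
    groups = []
    for chunk in text.split(","):
        group = Group()
        for match in TOKEN.finditer(chunk):
            paren, square, curly, plain = match.groups()
            if plain:
                group.plain.append(plain)
            elif paren:
                group.interfiled.append(paren)
            elif square:
                group.foreign.append(square)
            else:
                group.reference_only.append(curly)
        groups.append(group)
    return groups


def first_level_key(word, groups):
    """Map a word to group indices, matching the longest letter (e.g. CH) first."""
    rank = {}
    for index, group in enumerate(groups):
        for letter in group.plain + group.interfiled + group.foreign + group.reference_only:
            rank[letter.upper()] = index
    longest = max(map(len, rank), default=1)
    text, pos, key = word.upper(), 0, []
    while pos < len(text):
        for size in range(longest, 0, -1):
            piece = text[pos:pos + size]
            if piece in rank:
                key.append(rank[piece])
                pos += len(piece)
                break
        else:
            key.append(len(groups))  # letters outside the repertoire sort last
            pos += 1
    return key


welsh = parse_repertoire("A, B, C, CH, D")
words = ["dydd", "chwech", "cyngres", "blwyddyn", "afon"]
print(sorted(words, key=lambda w: first_level_key(w, welsh)))
```

With the same functions, “A (Á) [À], B, C, D” gives A and Á the same first-level rank, while “A, Á, B, C” ranks them separately, which mirrors the contrast between the Irish Gaelic and Northern Sami examples.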

4.3 Commenting on the repertoires

The repertoires are based on facts, namely, dictionaries, grammars, and other materials. If you are not happy with the content of an alphabetic repertoire, it is necessary that documentation of the requested changes be sent to the editor. To ensure completeness, such documentation should include proof of the additional letter or letters and their relative order for presentation with other letters, and the title page and copyright page of the source material. To ensure clarity, the documentation should be sent in hardcopy photocopies to the editor at Evertype, 73 Woodgrove, Portlaoise, R32 ENP6, Ireland, or made available to him by PDF file.

Annex A. Administrative units of Europe

The following list enumerates the administrative units corresponding to the geographical definition of Europe in clause 1.1. This list was valid at the time of its compilation (1995-03-01). Spelling of entity names follows that given in ISO 3166-3.

The following countries and self-governing dependencies: Albania, Andorra, Armenia, Austria, Azerbaijan (including the autonomous republic of Naxçivan), Belarus, Belgium, Bosnia and Hercegovina, Bulgaria, the Channel Islands, Croatia, Cyprus, the Czech Republic, Denmark, Estonia, the Faroe Islands, Finland (including Åland), France, Georgia (including the autonomous republics of Abkhazia and Ajaria and the Autonomous Region of South Ossetia), Germany, Greece, Hungary, Iceland, Ireland, the Isle of Man, Italy, Latvia, Liechtenstein, Lithuania, Luxembourg, Macedonia, Malta, Moldova, Monaco, the Netherlands, Norway, Poland, Portugal, Romania, San Marino, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey (excluding Anatolia), Ukraine, the United Kingdom, the Vatican City, Yugoslavia (Crna Gora, Srbija, Kosovo-Metohija, and Vojvodina).

The following Republics in the Russian Federation: Adygea, Baškortostan, Čečenija, Čuvašija, Dagestan, Ingušetija, Kabardino-Balkarija, Kalmykija, Karačaj-Čerkesija, Karelija, Komi, Mari-El, [Mordvinija,] Severnaja Osetija, Tatarstan, Udmurtija.

The following oblasts in the Russian Federation: Arkhangelʹsk (including the Nenets Autonomous Okrug), Astrahanʹ, Belgorod, Brjansk, Ivanovo, Jaroslavlʹ, Kaliningrad, Kaluga, Kirov, Kostroma, Kursk, Leningrad, Lipetsk, Moskva, Murmansk, Nižnij Novgorod, Novgorod, Orël, Orenburg, Penza, Permʹ (including the Komi-Permjak Autonomous Okrug), Pskov, Rostov, Ryazanʹ, Samara, Saratov, Smolensk, Tambov, Tula, Tverʹ, Ulʹjanovsk, Vladimir, Volgograd, Vologda, Voronež.

The following krais in the Russian Federation: Krasnodar, Stavropolʹ.

Annex B. European Sign Languages

There are several indigenous signed languages used by the deaf in Europe. Some of these have official use (being required or permitted in court, for instance); for others there is little information available. Various transcription systems exist. In addition to academic transcriptions like the Stokoe notation, a writing system called SignWriting was developed 25 years ago by Valerie Sutton and is rapidly spreading worldwide. It is known to be in use in six European countries as shown below. Most of the information here on the official status of these languages is derived from the SIL Ethnologue.

Armenian Sign Language. Survey needed.
Austrian Sign Language. Some official use.
Belgian Sign Language. Official use.
British Sign Language. Official use.
Bulgarian Sign Language. Official use.
Catalonian Sign Language. Survey needed.
Czech Sign Language. Official use.
Danish Sign Language. Official use.
Dutch Sign Language. Some official use.
Finnish Sign Language. Official use.
French Sign Language. Official use.
German Sign Language. Official use.
Greek Sign Language. Some official use?
Icelandic Sign Language. Official use.
Irish Sign Language. Some official use.
Italian Sign Language. Little official use?
Latvian Sign Language. Survey needed.
Lithuanian Sign Language. Survey needed.
Lyons Sign Language. Survey needed.
Maltese Sign Language. Survey needed.
Monastic Sign Language.
Norwegian Sign Language. Official use.
Polish Sign Language. Official use.
Portuguese Sign Language. Some official use?
Romanian Sign Language. Survey needed.
Russian Sign Language. Official use.
Scandinavian Pidgin Sign Language.
Serbian Sign Language. Official use.
Slovakian Sign Language. Survey needed.
Slovenian Sign Language. Survey needed.
Spanish Sign Language. Official use.
Swedish Sign Language. Official use.
Swiss-French Sign Language. Survey needed.
Swiss-German Sign Language. Survey needed.
Ukrainian Sign Language. Survey needed.

Annex C. Changes from previous versions

2015-11-09 Belarusian and Ukrainian updated. Montenegrin added.
2002-10-31 Karelian and Kildin Sami added. Adyghe, Slovenian, and Votic updated.
2002-10-10 Slovak updated.
2001-12-30 Swiss-German Sign Language link added.
2001-11-20 Slovak updated.
 
HTML Michael Everson, Evertype, 73 Woodgrove, Portlaoise, R32 ENP6, Ireland, 2004-02-07

Copyright © 1993-2015 Evertype. All Rights Reserved
 