Note: updated versions of these tables were generated in April 2014 and April 2015.


Choosing a second language (for English speakers)

I have dabbled in foreign languages for years and picked up a little Spanish and French vocabulary. This year I decided to get serious. Spanish is the obvious choice for me here in the USA, but I decided to review the possible languages objectively before investing too much time in Spanish. So what makes a good language? How does one compare one language to another? This method is amusing. Are there more objective methods?

When comparing the difficulty of languages, there is a somewhat objective measure for English speakers. The Defense Language Institute (DLI) provides language courses to US military and government personnel. The length of the basic course for a language is a guideline for the overall difficulty of that language for an English speaker. DLI courses fall into four bands: 26 weeks, 35 weeks, 48 weeks, and 64 weeks. The Romance languages form the easiest group due to shared vocabulary from the Roman Empire occupation and the Norman invasion. Arabic and the China Sea languages fill the most difficult band. Other languages fall somewhere between.

Difficulty by itself is a negative measure. What is the attractive force of each language? A common approach is to compare the number of speakers, but the absolute number means little to me. A billion people in poverty are less interesting than 10 million people with money for education, art, and travel. How does one quantify that? I chose to use the per capita GDP at purchasing power parity (GDP/PPP). The Gross Domestic Product (GDP) represents the total value of goods and services produced in a country in a year. The Purchasing Power Parity (PPP) adjusts the GDP based on the cost of living in each country. The per capita PPP reflects the standard of living in a country. Per capita values may serve as proxies for disposable income, education levels, literacy, quality of print publishing, and breadth of media content. One can set an arbitrary floor for per capita PPP and count the population of speakers above that value. I chose an arbitrary amount of US $20K as the critical value. In the most developed countries, the overall GDP is so high that I include the entire population. For other countries, I use income distribution data to count the population at comparable levels. Then I divide that population by the DLI course length to produce a ranking number. The Python code that generated these tables can be found here.
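
As a minimal sketch of the last step only (this is not the original script), the ranking number is simply the counted population divided by the course length. The figures below are copied from the first six ranked rows of the US $20K table; the names SPEAKERS_20K, efficiency, and rank are my own placeholders, and the income-distribution step that produces the population figures is not reproduced here.

```python
# Population (millions of speakers above the US $20K floor) and DLI basic
# course length (weeks), copied from the US $20K table in this article.
# The dictionary and function names are illustrative, not from the original script.
SPEAKERS_20K = {
    "Spanish": (290.0, 26),
    "Mandarin": (347.8, 64),
    "Portuguese": (112.0, 26),
    "French": (96.8, 26),
    "Russian": (138.6, 48),
    "German": (96.1, 35),
}


def efficiency(population_millions, course_weeks):
    """Millions of 'interesting' speakers gained per week of study."""
    return population_millions / course_weeks


def rank(table):
    """Return (language, population, weeks, efficiency) rows, best first."""
    rows = [
        (language, pop, weeks, efficiency(pop, weeks))
        for language, (pop, weeks) in table.items()
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for language, pop, weeks, eff in rank(SPEAKERS_20K):
        print(f"{language:<12}{pop:>8.1f}{weeks:>7}{eff:>12.2f}")
```

For these six rows the computed values agree with the Efficiency column to two decimals. Other rows may differ in the last digit because the published populations are themselves rounded.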

Floor = US $20K
Language     Pop.(m)  Weeks  Efficiency
English        525.5
Spanish        290.0     26       11.15
Mandarin       347.8     64        5.43
Portuguese     112.0     26        4.31
French          96.8     26        3.72
Russian        138.6     48        2.89
German          96.1     35        2.75
Italian         60.1     26        2.31
Japanese       127.3     64        1.99
Arabic          72.0     64        1.13
Malay           38.5     35        1.10
Turkish         47.6     48        0.99
Polish          38.5     48        0.80
Korean          50.0     64        0.78
Persian         36.0     48        0.75

Floor = US $10K
Language     Pop.(m)  Weeks  Efficiency
English        680.6
Mandarin      1241.5     64       19.40
Spanish        425.7     26       16.37
Portuguese     216.0     26        8.31
French         119.9     26        4.61
Russian        200.0     48        4.17
German          96.1     35        2.75
Malay           90.1     35        2.58
Italian         60.1     26        2.31
Arabic         143.0     64        2.23
Japanese       127.3     64        1.99
Persian         76.6     48        1.60
Turkish         75.6     48        1.57
Hindi           66.1     48        1.38
Thai            63.9     48        1.33

Each difficulty group has a dominant language. It is hard to argue in favor of Korean or Arabic when the interesting population of Mandarin Chinese is so much larger; only Japanese comes close. Spanish dominates the lower band. French and Portuguese may receive honorable mentions, but Italian cannot compete with the populations of colonial legacies. Changing the floor to US$10K does not threaten the leader of any difficulty group. From this perspective, there are only seven interesting candidates: Spanish, Portuguese, French, German, Russian, Mandarin, and Japanese.
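
The claim that each difficulty band keeps the same leader under both floors can be checked with a short sketch. The rows are copied from the two tables in this article; the names FLOOR_20K, FLOOR_10K, and band_leaders are illustrative assumptions of mine, not part of the original script. Within a band every language has the same course length, so the largest population is also the highest efficiency.

```python
# (language, millions of speakers above the floor, DLI course weeks),
# copied from the two tables in this article. English is omitted because
# it has no course length. Names here are illustrative only.
FLOOR_20K = [
    ("Spanish", 290.0, 26), ("Mandarin", 347.8, 64), ("Portuguese", 112.0, 26),
    ("French", 96.8, 26), ("Russian", 138.6, 48), ("German", 96.1, 35),
    ("Italian", 60.1, 26), ("Japanese", 127.3, 64), ("Arabic", 72.0, 64),
    ("Malay", 38.5, 35), ("Turkish", 47.6, 48), ("Polish", 38.5, 48),
    ("Korean", 50.0, 64), ("Persian", 36.0, 48),
]

FLOOR_10K = [
    ("Mandarin", 1241.5, 64), ("Spanish", 425.7, 26), ("Portuguese", 216.0, 26),
    ("French", 119.9, 26), ("Russian", 200.0, 48), ("German", 96.1, 35),
    ("Malay", 90.1, 35), ("Italian", 60.1, 26), ("Arabic", 143.0, 64),
    ("Japanese", 127.3, 64), ("Persian", 76.6, 48), ("Turkish", 75.6, 48),
    ("Hindi", 66.1, 48), ("Thai", 63.9, 48),
]


def band_leaders(rows):
    """Map each course length (weeks) to the language with the largest population."""
    best = {}
    for language, population, weeks in rows:
        if weeks not in best or population > best[weeks][1]:
            best[weeks] = (language, population)
    return {weeks: best[weeks][0] for weeks in sorted(best)}


if __name__ == "__main__":
    print(band_leaders(FLOOR_20K))
    print(band_leaders(FLOOR_10K))
```

Both calls return the same mapping: Spanish for 26 weeks, German for 35, Russian for 48, and Mandarin for 64.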

The middle tier of the US$20K table is filled with languages whose home countries share a problem. The populations of Germany, Russia, Japan, and Italy face a demographic decline. Japan and South Korea especially are rapidly aging nations with looming problems. This may mean future job openings for current language students. However, if the core population of a language is shrinking, it makes that language less interesting for literature and cinema and it makes the market area less attractive. On the other hand, Spanish, Portuguese, and European French should experience mild, healthy growth. China is actively trying to control its population growth and appears to be succeeding.

Financial calculations are just a starting point. Richer populations have the opportunity to be interesting, but do they take advantage of it? There are no convenient charts for the nebulous category of cultural production. It is hard to judge the quality of literature in languages you do not know. The Norwegian Book Club prepared a list of The 100 Best Books of All Time, assembled by 100 writers from 54 countries in 2002. Half of the entries were in English, French, or German. Another data source is the Nobel Prize in Literature. The Nobel Prize has become more inclusive over time but Portuguese and Arabic each have only one winner. The lack of Asian entries probably reflects a Western bias.

It is hard to find a useful language-neutral list of films because Hollywood dominates the medium, but Britain's BAFTA Awards are interesting. The Foreign Language nominations show persistent strength in French films. Early strength in Italian and Japanese faded after 1989 while Spanish cinema surged; Chinese and Hindi films began appearing at the same time. The British Academy may have some bias towards its neighbors, but the volume of French nominations (41% of all nominations) makes the consistent strength and breadth of French cinema obvious.

Language               Eng   Fra   Ger   Esp   Ita   Rus   Pol   Jap   Chi  (Nordic)  Other  Total
Nobel 1901-1939          7     6     6     2     3     1     2     0     0         9      3     38
Nobel 1944-1979        8.5   5.5     3     5     2     3     0     1     0         5      6     38
Nobel 1980-2012       10.5     2     4     4     1   0.5     2     1     2         1      5     33
Nobel Winners           26  13.5    13    11     6   4.5     4     2     2        15     14    109
Norwegian Book Club   29.5  11.5    10     6     5     9     0     2     1         6     20    100
BAFTA 1949-1989        n/a  40.5     5     3  22.5   4.5     3   6.5     0         8      6     99
BAFTA 1990-2013        n/a  43.3     8    15     6   0.5     0     1   6.5         5   19.7    105

The value of learning a language is not measured only by quantity of material. Paul Ward's comment to Français non plus? resonates very strongly with me: "the great joy and mystery for me in my pursuit of French language skills has been in discovering a people who look so much like others in the West, but who have such particular perspectives that I'm constantly being brought up short in my understanding. Learning French isn't about the number of people speaking it, it's about the sheer volume of insights into history, culture, and semiotics that come along with learning it." One could define an idea of cultural distance as the variations between one's own culture and a target culture and use that idea to evaluate and compare languages. Learning a language from a culture with the same philosophical or religious background does not provide any new insights or challenges to the student. Learning a language from a completely alien culture could provide both immense rewards and immense challenges. For the purposes of this paper, I would argue that the interesting languages fall into four cultural groups: European, Islamic, Indian, and Sinitic.

Although Europe has a superficial diversity, its components interact so strongly that ideas and philosophies flow easily across borders. Each strong idea is quickly translated into dozens of languages. There is a shared background based on Greek philosophy, Christianity, and Renaissance Humanism. When one learns a European language, one learns to say the same things with a different vocabulary. French is the exception. France's perpetual struggle against Anglo-American and German ideas makes it the most independent culture in Europe [1]. The French language also has a great variety of speakers; while other colonial languages displaced the native languages, French is still spoken alongside native languages in Africa and Asia. In addition, the country and culture of France itself attracts more tourists than any other country in the world. Other candidate languages such as German or Russian present more difficulty but offer fewer rewards. [2]

Europe's frequent contact with the Muslim world, including the occupations of Spain and the Balkans, reduces the cultural distance. Europe has tried and rejected the Islamic core value of violent monotheism. The Muslim culture of violence and repression is growing stronger in Malaysia, Indonesia, Egypt, and Turkey and the already anemic world of Arabic literature could begin to shrink. [3] Arabic, the premier Muslim language, also presents a steep learning curve to a student with its unwritten vowels, diglossia between written and spoken language, and significant regional variations in the spoken language. [4]

The Indian subcontinent is a fascinating area and Hindu culture has been a resilient foil to Muslim aggression. Although there is sufficient cultural distance to make Indian languages interesting, or perhaps mystifying, the continuing presence of English as an official language in India, Pakistan, and Bangladesh reduces the incentive to learn any of them. [5] In the US$10K table, Hindi does appear in the rankings, but it appears that population growth in northern India will exceed economic expansion for the foreseeable future, keeping per capita GDP figures dismally low.

The most distant cultural area is China and its neighbors. The China Sea cultures evolved completely independently of the West. Confucianism and Buddhism are the most interesting belief systems outside of the Western tradition. China has economic and demographic advantages but Japan has a much higher per capita GDP and more interesting literature. Japan's urban population achieved literacy in the 1700s and the pool of works is large, varied, and inventive. Classical Chinese is notoriously impenetrable. Modern Chinese authors had to cope with the Civil War, the Japanese invasion, and Communist censorship. The interesting work in Chinese tends to come from the periphery, but the combined population of Taiwan, Hong Kong, and Singapore is less than half of Japan's population. The smaller populations of South Korea, Thailand, and Vietnam cannot compare to the giants.

When comparing interesting population size to difficulty, two languages stand above the rest: Spanish dominates the West and Mandarin dominates the East. One might choose French or Portuguese over Spanish, but it is harder to make an argument for any other European language. The American Romance languages have positive economic and demographic trends and French has exceptionally strong literature and cinema. All three languages cover large geographic areas and can offer a variety of locations for work, tourism, or retirement. Spanish has also become too useful to ignore here in the USA. Outside of Europe, Mandarin and Japanese are the overwhelming favorites. The larger time investment is justified by the way they open doors to the most interesting non-Western cultures and to the largest national economies outside of the Anglophone world.

[1] "the French are so … well, French, and therefore designed by God to seem as provokingly dissimilar from the British as possible." - Julian Barnes, Preface to Something to Declare: Essays on France and French Culture, Vintage Books, 2002.
[2] While Russian seems competitive on the Norwegian Book Club list, seven of those nine entries were written by Dostoevsky and Tolstoy. Entries of other languages were more diverse.
[3] David Tresilian described a trend of "the growing intolerance of literary expression generally, which has made what was always perhaps a minority activity into one that is now that of a sometimes embattled minority. Religious conservatism tends not to value literature on the liberal model – literature, in other words, that carves out a space for intellectual exploration and freedom of expression ..." - Conclusion to A Brief Introduction to Modern Arabic Literature, Saqi, 2008.
[4] See Arabic: A language with too many armies and navies? for an example of the wide divergence between varieties of spoken Arabic.
[5] Likewise, most African states have adopted English, French, or Portuguese as a bridge language for their disparate populations.