Saturday 31 March 2007

Language and Translation Industry of India: A Historical Perspective

By Ravi Kumar (Founder President, Indian Translators Association)
This post draws heavily, for both content and data, on research by B. Mallikarjun, Ph.D., Central Institute of Indian Languages, Mysore, India. I have also used data from Common Sense Advisory, Inc., Nasscom, and Microsoft India. The information and data should not be taken as final; they are indicative only, and are offered in support of some personal views and analysis that I present here to depict the overall picture of the Indian translation industry.
INDIAN MULTILINGUALISM
Modern India, as per the 1961 count, has more than 1652 mother tongues, genetically belonging to five different language families. Apart from these, 527 mother tongues were considered unclassifiable at that time. The 1991 Census had 10,400 raw returns, which were rationalized into 1576 mother tongues. These were further rationalized into 216 mother tongues and grouped under 114 languages: Austro-Asiatic (14 languages, spoken by 1.13% of the population), Dravidian (17 languages, 22.53%), Indo-European (Indo-Aryan, 19 languages, 75.28%, and Germanic, 1 language, 0.02%), Semito-Hamitic (1 language, 0.01%), and Tibeto-Burman (62 languages, 0.97%). It may be noted that mother tongues with fewer than 10,000 speakers on an all-India basis, or which could not be identified on the basis of the available linguistic information, were placed under 'others'. As a result, a good number of "languages" recorded in the Indian Census could not be classified as to their genetic relation, and are treated as Unclassified Languages. The Indo-Aryan languages have the largest number of speakers, followed in descending order by the Dravidian, Austro-Asiatic, and Sino-Tibetan (Tibeto-Burman) languages. Eighteen Indian languages, namely Assamese, Bengali, Gujarati, Hindi, Kashmiri, Kannada, Konkani, Malayalam, Manipuri, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Sindhi, Tamil, Telugu, and Urdu, are spoken by 96.29% of the population of the country, and the remaining 3.71% speak the rest of the languages. Not only is India as a whole multilingual; each State and Union Territory within India is equally multilingual. Linguistically, India is made up of many mini-Indias. The size of the multilingual population is also remarkable: it constitutes 19.44% of the total population of India.
The traditionally strong multilingual constituency has been further strengthened in modern times, from one decade to the next, by mobility within the country and by the introduction of formal education in all parts of the country, which insists on learning at least two languages until the end of high or higher secondary education. Although Kerala appears to be the most cohesive linguistic state, with a single language, Malayalam, claimed as the mother tongue by nearly 96 percent of its population, bilingualism among this mother-tongue group is equally high.
A CENTURY OF RECORDED BILINGUALISM
For more than one hundred years, the Census of India reports have taken notice of the bilingual situation in India. Bilingualism is often taken as a given fact. It is also used as an indicator of the movement of various populations from one region or province to another. Bilingualism figures are often used to make political claims and to seek privileges in administration, education, mass communication, and other departments of public life in general. Educational policies of the states are guided by these figures. However, the quality or level of bilingualism often remains unspecified in linguistic terms in these claims. The way the details of bilingualism and trilingualism are arrived at in surveys such as the Census enumerations is also noteworthy. In the Census, the names of up to two other languages known to the respondent are recorded in order of proficiency. The names of languages other than the one recorded as the mother tongue are elicited by asking the respondent about the other languages known to him or her. These may be Indian or foreign languages. If the respondent knows only one other language, only the name of that language is recorded. If the respondent knows more than one, the names of two languages are recorded one after the other, in the order of proficiency as self-assessed by the respondent. The language in which the respondent can, according to his or her own claim, better comprehend, speak and communicate is recorded first, and the other language second. The individual need not be able to read and write these languages; it is enough to speak and communicate in them. However, the number of languages thus recorded will not exceed two.
Naturally evolved bilingualism, coupled with bilingualism acquired through schooling, has become a major language resource, and it is exploited mainly by the mass media to extend their reach across the population. What is needed is a more in-depth linguistic study of bilingualism as a linguistic idea. While the figures are very important, the qualitative features of bilingualism are yet to be studied. The number of bilinguals is increasing from Census to Census. The national average is: 1961 - 9.70%; 1971 - 13.04%; 1981 - 13.34%; 1991 - 19.44%. Among the important results of the multilingual picture of India emerging from the 1991 Census: among speakers of major languages, 18.72% are bilingual and 7.22% are trilingual, while among speakers of minor languages 38.14% are bilingual and 8.28% are trilingual. Significantly, among major-language speakers, bilingualism in English is more widespread than in Hindi: 8% have English as a second language and 3.15% as a third language, whereas the corresponding figures for Hindi are 6.15% and 2.16%.
LANGUAGE POLICY
The Language Policy of India relating to the use of languages in administration, education, judiciary, legislature, mass communication, etc., is pluralistic in its scope. It is both language-development oriented and language-survival oriented. The policy is intended to encourage the citizens to use their mother tongue in certain delineated levels and domains through some gradual processes, but the stated goal of the policy is to help all languages to develop into fit vehicles of communication at their designated areas of use, irrespective of their nature or status like major, minor, or tribal languages. The policy is accommodative and ever-evolving, through mutual adjustment, consensus, and judicial processes. The accommodative spirit may be dim at times, and the decisions vacillating and fidgety, but this spirit was continuously prevalent from the early days of the struggle for independence from the British rule. This was seen as a necessity in nation-building. Political awareness or consciousness relating to the maintenance of native languages has been very high, both among the political leadership and among the ordinary people who speak these languages. The language policy of the country is elucidated in its Constitution, implemented through various executive orders that have been issued from time to time and the judicial pronouncements since 1950. These have directed the way the languages are used in various domains.
LANGUAGE CLUSTERING
The Constitution of India listed fourteen languages, namely Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Oriya, Punjabi, Sanskrit, Tamil, Telugu, and Urdu, in its Eighth Schedule in 1950. Since then, the list has been expanded thrice: once to include Sindhi, a second time to include Konkani, Manipuri and Nepali, and a third time, just this month, to include Bodo, Santhali, Maithili and Dogri. The 100th Constitution Amendment, which added these four languages to the Eighth Schedule, was supported by all 338 members present in the Parliament. It has been stated that the claims of 33 more languages for inclusion are under consideration. The list is open-ended and has become a tool to bargain and gain benefits for the languages. Once a language gets into this club, its nomenclature itself changes, its status changes, and it is called a Modern Indian Language (MIL), a Scheduled Language (SL), etc. This Schedule has emerged as the most important language policy statement. It clusters thousands of written and unwritten languages and dialects into two broad categories, Scheduled and Non-Scheduled languages. Though historically it is not possible to find any rationale for clustering the Indian languages into these categories, the languages of the Eighth Schedule are not normally treated on par with Non-Scheduled languages (Mallikarjun 1986). The languages of the Schedule receive preferential treatment: they are considered first for almost every language development activity, and are given all facilities, including facilities to absorb the language technology initiatives of the government. It is needless to mention that the Technology Development in Indian Languages (TDIL) programme did not, and under present circumstances would not, percolate beyond these languages. The second kind of clustering is at the level of mother tongues into "languages."
Though 114 languages are arrived at by the Census Office, many of these languages are not independent and individual entities as such. Within these, there are many mother tongues/languages/dialects. The group of "languages" is formed by clustering of the populations of many mother tongues under an umbrella called "language." For example, Hindi is a cluster of more than 45 mother tongues, which include Awadhi, Banjari, Bhojpuri, Braj Bhasha, Bundelkhandi, Chambeali, Chattisgarhi, Garhwali, Haryanvi, Kangri, Kulvi, Labani, Magahi, Maithili, Marwari, Mewari, Pahari, Rajasthani, Sadri, Sugali, etc. The varieties of Hindis combined to form the Hindi of post-independence era helped in the unification of the Hindi-speaking population for demographic purposes (statistical majority), and not for the development of communicative pan-Indian Hindi as envisaged by the framers of the Constitution. Due to the expansion of media network in the past decade, pan-Indian Hindi is developing mainly through the audio-visual mass media. The Hindi thus developed has a greater impact on non-Hindi speaking states. This could lead to a position where pan-Indian Hindi assumes some of the functions of non-Hindi major Indian languages. A secondary globalization process, thus, may help Hindi, and may not help the other major Indian languages (Mallikarjun 2003).
MASS COMMUNICATION
Print Media: Here, people's choice of the languages in which they wish to read the news and related information directs the language policy to be adopted. There is no bar on starting newspapers in any language or dialect in the country, since the private sector mainly rules this domain. Also, there is no bar on any language being written in any script in India. The print media in India began in 1780. Since then they have grown enormously, and their growth has been steady over the years. According to the 2002 Survey, newspapers and periodicals are published in 101 languages and dialects. Foreign languages are also part of this list. Hindi tops the ranking of languages by the number of newspapers published: Hindi (2507), Urdu (534), English (407), Marathi (395), Tamil (395), Kannada (364), Malayalam (225), Telugu (180), Gujarati (159), Punjabi (107), and Bengali (103). On readership by language, the 2003 Survey provides some very interesting figures. According to the same study, in terms of all-India readership (urban + rural), the top ten magazines include (figures in millions): Saras Salil (Hindi): 9.38; India Today (Hindi): 5.9; Vanitha (Malayalam): 5.51; Grihashobha (Hindi): 5.41; Malayala Manorama (Malayalam): 5.40; Meri Saheli (Hindi): 4.26; India Today (English): 3.95; Balarama (Malayalam): 3.57; Mangalam (Malayalam): 3.51. The National Readership Survey 2002, conducted by the National Readership Studies Council (NRSC), says that there is sharp growth in the sales of English newspapers in towns with populations ranging from one lakh to five lakhs, whereas growth in Hindi and regional language newspapers is from the towns with populations below five lakhs. English is becoming more popular in the rural areas due to the growth and development of reading skill in English through school. English, thus, is establishing a solid mass base for itself in the rural areas.
The language policy of India in status, corpus and acquisition planning, as we said earlier, protects and preserves plurality in all domains of language use in spite of the presence of languages with 'power'. Though UNESCO reports that "... about half of the approximately 6000 languages spoken in the world are under threat, seriously endangered or dying," it does appreciate that "India has maintained its extensive and well-catalogued linguistic diversity, thanks to its government policies."
DIGITAL DIVIDE
Apart from the linguistic divide, India faces many other divides revolving around ethnicity, religion, region, social identity, rural/urban, literate/illiterate, etc. The majority of her population lives in the rural areas. In 2001 urbanites constituted 27.24% whereas the rural population was 72.24%. The rate of literacy for the entire country in 2001 was 65.2%, with the highest literacy in Kerala (above 90%), the lowest in Bihar (less than 50%), rural literacy at 59%, urban at 80%, males at 76%, and females at 54%. We may learn a few lessons when we study how India tried to bridge these divides. India approached the rural/urban divide through rapid urbanization and the creation of near-equal infrastructure in the rural areas. Similarly, the literate/illiterate divide was approached through movements for mass adult literacy, combined with education for all through schooling. The religious divide was sought to be bridged by declaring the nation secular and by providing Constitutional protection to religious minorities; religious harmony was thereby maintained to a large extent, except for some rare aberrations. In all the Indian languages, however, a different kind of digital divide is also developing.
On the one hand, the level of literacy in the mother tongue/regional language is on the increase because of the accelerated effort from the non-formal and informal sectors, and, on the other hand, in the formal sector of education, literacy in mother tongue is losing value in the context of demand for English and computer literacy (Mallikarjun 2003).
INFORMATION TECHNOLOGY
With the IT revolution, we see the emergence of an information society, scattered and loosely connected, created by the rapid surge in information and communication technologies. But the slow pace at which Indian society is absorbing these technologies through its organs, such as language, has added one more divide to the many already existing: the "digital divide", resulting in disparity in access to information and to the means of communication in modern India in the 21st century. Computer penetration in India is estimated to be 7.5 per 1000 people, but at the same time the internet is able to reach only about one percent of the total population of the country. Internet subscription in India actually covers only 0.4 percent of the population, according to the 2003 report of the TRAI. The Indian languages in which the internet search engine Google can search are Bengali, Hindi, Marathi, Tamil and Telugu. Persistent and intense maintenance of the digital divide may result in more retrograde and disastrous steps than all other divides put together, because a new generation of people with the same color and blood (to play with the phrase introduced by Lord Macaulay in his Minute of 1835) but with no commitment to the locals will homogenize everything, resulting in the loss of age-old pluralism that endangered freedom. So, since what people want in the digital world is not available in their languages, both the government and the people are fast moving towards introducing English at the earliest level in education. Language vitality, the capacity of a language to live, grow, and develop, depends upon various factors. Some of these are social status, demography, and institutional support. Access to information and communication technology in their own language is one of the ways to empower the people and enhance the vitality of a language.
LOCAL LANGUAGE COMPUTING APPLICATIONS
There is a huge untapped potential that needs to be explored by the government and the vendors to ensure successful use of local language computing applications. The local language computing sector requires a boost to encourage the use of local language computing applications among the masses. Some of the projects initiated by the government have failed primarily due to the lack of commercialization of technology and lax timelines for projects. Moreover, the majority of the players in the sector are mid-sized companies or educational institutions with limited financial muscle; hence they often tend to be restrained in terms of their research and development (R&D) spending on new technologies. The key to success lies in reducing redundancies and enabling positive amalgamation of ideas and sharing of knowledge among government institutions, academia and vendors. A collective and combined approach is required to generate adequate content. Machine translation and creation of lexware, dictionaries, and WORDNET also need a collaborative approach that can lead to a faster development and intelligent computer learning of the language. The Government of India has begun using local language applications in their departments. It is now very important for them to ensure that most of the software for workflow process and documentation systems is enabled in local languages.
INITIATIVES IN LOCAL LANGUAGE MARKET
Only 3 percent of the Indian population can speak English, while close to 40 percent speaks Hindi or one of its variants. Still, the medium of communication in higher education, the judiciary, the bureaucracy, and the corporate sector is English. Since English is the medium of interaction in IT systems too, this situation structurally aggravates the divide between the segments of the population that have access to computing and those that do not. To arrest this situation, an important step has come from the Ministry of Communication and Information Technology in the form of the Technology Development for Indian Languages (TDIL) programme. TDIL has been mandated to bridge the digital divide by developing IT tools in local languages in India. Since 1991, TDIL has sponsored research on Indian language computing resources, processing systems, tools and translation support systems, and on the localization of software for Indian languages. Other key initiatives have come from the development of Human-Machine Interface Systems and of web-centric applications. TDIL operates on a distributed innovation model through collaborations with 13 resource centers across India. Some of the notable milestones have come through CDAC, a collaborative partner of TDIL, in the form of GIST (Graphics and Intelligence-based Script Technology), which has brought diverse users to employ local language IT tools. Applications have ranged from desktop publishing to subtitles in TV broadcasts in various Indian languages. A local language word processor, 'LEAP', has brought desktop publishing to a large segment of the population in a language they can communicate in naturally. Microsoft has also taken a lead here by launching Windows XP enabled in local languages in 2005. In addition, in June 2005 the National Knowledge Commission (NKC) was created by the Government of India to take steps that will give India the 'knowledge edge' in the coming decades, i.e. to ensure that the country becomes a leader in the creation, application and dissemination of knowledge. Under this objective, the NKC has been entrusted with the task of promoting literature and translation activities in Indian languages. This may be considered one of the most important steps by the Indian Government to promote translation and language activities in India; at the initial stage it involves a package to the tune of 2500 million Indian Rupees. (Average: 45 Rupees = 1 USD)
INDUSTRY CHALLENGES
While the eventual benefits of increasing access to the local language market are clear, there are multiple challenges that the fledgling Indian market will have to overcome before the avowed vision becomes reality. Some of the key challenges confronting the market at this point of time are:
• Lack of universal standards for scripts and fonts, input devices and transliteration
• Limited availability of software and fonts
• Low availability of local language content
LOCAL LANGUAGE SOFTWARE MARKET-VENDOR ANALYSIS
The key drivers of exponential growth for this market will be:
• Newer areas of application for Local Language IT
• Broad-based e-Governance initiatives that will employ local language as a front end to disseminate Government services to citizens
• Bundling of multi-lingual software with PCs and other access devices
The market for Local Language IT is also likely to face a number of restraints that could inhibit the pace of adoption. They are:
• Lack of formal language-based IT training
• Limited usage of available local language applications
• Lack of spending
• Low connectivity
The Local Language IT market consists predominantly of word processing. Word processing application revenues in 2002 constituted 48 percent of the total market, with Packages and DTP constituting 20 percent and 18 percent respectively. While word processing software will continue to occupy the lion's share of total revenues by 2005, package applications and local language multimedia and video applications are likely to grow at a significant pace. Reflecting the diverse application areas in which Local Language IT will be used in the future, consulting services revenues are expected to see a big jump. Consulting services revenues were 47 percent in 2002; by 2006 they are expected to grow to 67 percent of the total market. Investments by Governments in e-Governance will find their way to the Local Language IT market. The share of e-Governance has increased from 38 percent in 2002 to 58 percent in 2005. The Local Language IT market consists of about 12 to 14 vendors. Most of the domestic players are regional and have limited access to the market. They offer both off-the-shelf products and custom-made applications in all the major Indian languages. The other set of key players in the Local Language IT market are international players.
International vendors are yet to take off in a big way in terms of application offerings across different languages. IBM offers a Hindi version of Lotus Notes in India. However, the participation of international vendors is expected to increase in the next three years. As mentioned before, Microsoft has already taken a lead by launching Windows XP enabled in local languages in 2005. There is an overall consensus on the benefits of e-Governance in India. While a wide variance exists between states in terms of their e-Governance initiatives, it is expected that over the medium term a greater number of states will provide services to citizens over the electronic medium. Deploying Local Language IT as a part of State and Central e-Governance implementations will serve the cause of improving the reach and quality of services offered across a wide section of the citizens.
E-GOVERNANCE INITIATIVES AND POTENTIAL FOR LOCAL LANGUAGE MARKET
State Governments have deployed citizen services in local languages and the early benefits are clearly visible. Early Government-to-Citizen Portals such as eSeva have proved the feasibility of the model. Frost & Sullivan expects this trend to extend on both scale and scope: a wider bouquet of services will be available to a larger section of citizens. Andhra Pradesh is the state with the biggest spend on Local Language IT contributing 23.6 percent to the total market revenues for the Industry. Gujarat is the second highest spender followed closely by West Bengal.
TRANSLATION INDUSTRY AND NEED OF CAT TOOLS
India is one of the top-ranked destinations for BPO and IT outsourcing, and as per reports from Nasscom it would see almost 10-fold growth, from $17.3 billion now to $166.5 billion, by 2010. The report by Nasscom and McKinsey projected that the BPO business would grow from $11.6 billion at present to $150 billion by 2010, while IT outsourcing would go up from $18.4 billion to $150 billion in the next five years. It said India's offshore industries had grown three-fold, from $4 billion in 2000 to $12.8 billion in 2004. Services exports, on the other hand, grew 60 per cent, from $16 billion in 2000 to $25 billion in 2004. The table below shows the projected worldwide revenues of several thousand companies active in translation and localization related business. The calculation includes many freelancers, and an approximation of the revenue generated by international and ethnic marketing agencies, boutiques, system integrators, consultants, printers, and other service providers who facilitate translation and localization.
Region   % of Market    2005     2006     2007     2008     2009     2010
U.S.     42%           3,696    3,973    4,271    4,592    4,936    5,306
Europe   41%           3,608    3,879    4,169    4,482    4,818    5,180
Asia     12%           1,056    1,135    1,220    1,312    1,410    1,516
ROW      5%              440      473      508      547      588      632
Totals   N/A           8,800    9,460   10,168   10,933   11,752   12,634
Language Services Revenues, in Millions of U.S. Dollars. Source: Common Sense Advisory, Inc.
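The regional rows in the table above appear to follow from the yearly totals and the fixed market shares. As a rough check, the short Python sketch below recomputes each region's revenue as share times total. The variable and function names are illustrative, and rounding to the nearest million is an assumption about how the published figures were derived, not something the source states.

```python
# Yearly totals as printed in the table (millions of U.S. dollars).
totals = {2005: 8800, 2006: 9460, 2007: 10168,
          2008: 10933, 2009: 11752, 2010: 12634}

# Market shares as printed in the table, in whole percent.
shares_pct = {"U.S.": 42, "Europe": 41, "Asia": 12, "ROW": 5}


def regional_revenue(total, share_pct):
    """Revenue for one region: share of the total, rounded to the
    nearest million (the rounding rule is an assumption)."""
    return round(total * share_pct / 100)


# Recompute every cell of the table from totals and shares.
recomputed = {
    region: {year: regional_revenue(total, pct)
             for year, total in totals.items()}
    for region, pct in shares_pct.items()
}

for region, by_year in recomputed.items():
    print(region, by_year)
```

Under these assumptions the recomputed cells agree with the printed ones to within one million dollars, which suggests the regional projections are simply the fixed 2005 shares applied to the projected totals.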
SWOT Analysis
Strengths
• Abundance of translators in Indian as well as foreign languages.
• The presence of IT giants and IT service providers, together with the BPO boom, creates high demand for language professionals.
• Agencies, institutions, universities, diplomatic missions, corporate houses, government bodies, BPOs, publishing houses, e-book publishers, software companies, etc. are using the services of language professionals in large numbers.
Weakness
Although manpower is abundant, Indian translators are less well equipped with CAT tools and stringent quality control processes. In recent years, however, many translators have begun using CAT tools. CAT tools such as TRADOS and SDLX are very costly. Recently a CAT tool maker from Singapore, HEARTSOME (http://www.heartsome.net/), has entered the Indian market with cost-effective tools, offering Indian translators a variety of options at Indian prices.
Opportunities
Opportunities abound in India, which is one of the largest markets in the world. A Microsoft study shows that the Local Language IT market is in a development stage and is expected to grow at a healthy rate of 80 percent (CAGR), from $11 million in 2002 to $64 million in 2005 and $115 million in 2006. Accordingly, as per our estimates, the translation market will grow from $115 million in 2006 to $1,150 million by 2010. (This estimate is based on a comparative analysis of data issued by a leading consultancy, Common Sense Advisory: Asia accounts for 12% of the translation market, and we assume that India accounts for 50% of the Asian market, since it is the global destination for BPO and IT outsourcing.)
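The growth figures quoted above can be checked with the standard compound annual growth rate (CAGR) formula. The Python sketch below is only an illustration: the function name is hypothetical, and the year counts (three years for 2002 to 2005, four years for 2006 to 2010) are my reading of the text rather than figures taken from the cited studies.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.80 means 80%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Local Language IT market: $11 million (2002) to $64 million (2005).
local_it_rate = cagr(11, 64, 3)

# One more year of growth at 80 percent, starting from $64 million.
projected_2006 = 64 * 1.80

# Growth rate implied by the translation market estimate:
# $115 million (2006) to $1,150 million (2010).
translation_rate = cagr(115, 1150, 4)

print(f"Local Language IT CAGR 2002-2005: {local_it_rate:.1%}")
print(f"Projected 2006 market: ${projected_2006:.1f} million")
print(f"Implied translation market CAGR 2006-2010: {translation_rate:.1%}")
```

On these assumptions the 2002 to 2005 figures do correspond to roughly 80 percent a year, one further year at that rate gives about $115 million, and the ten-fold estimate for 2010 implies a somewhat lower rate of about 78 percent a year.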
Threat
Indian translators working in foreign languages face challenges from low-cost countries such as China and countries in Latin America. Translators working in Indian languages do not face any challenge from abroad; rather, they will face competition from within their own country. Further, there is a chance that leading language service providers from abroad may arrive in India and create stiffer competition for domestic agencies by offering better services and better remuneration to translators. However, as per our estimate, based on a random survey and personal interaction with leading translation agencies and translators' forums, Indian translators cannot yet be said to be under threat, because awareness of the need to keep upgrading translation and technological skills has been created via the annual Indian Translators Meet (http://www.bhashaindia.com/Patrons/events/TranslatorsMeetReport.aspx), through agencies, and through forums. The newly formed Indian Translators Association (http://www.itaindia.org/) is another successful step towards uniting Indian translators and the language industry of India. We foresee the Indian Translators Association playing a major role in the promotion of the translation and language industry of India.
Conclusion
From the above, one sees a great future for the translation industry of India. But the true benefit will come only when Indian translators are united for a common cause, when Indian governments come forward with special packages to promote this nascent industry, and when translators as well as agencies take individual initiatives to upgrade their skills and stay attuned to ever-changing technological needs, such as the use of CAT tools and quality control processes, provided that India continues to shine! Visit http://www.modlingua.com/

Ravi Kumar
Founder Director - Allied Modlingua- Your preferred language partner in India