1. INTRODUCTION
In 2005, Woolhiser declared that there were but a small number of papers that specifically addressed the topic of border effect as a cause of language change and that, as a result, “it is difficult at present to speak of ‘border studies’ as a distinct area of research in dialectology and sociolinguistics” (). Naturally, the set of papers that subsequently explored linguistic convergence and divergence in cross-border areas from the standpoint of social dialectology has since increased (see, for example, ; ; and ). Nevertheless, studies that are not exclusively focused on (mainly European) political borders between states that, based on a policy of ethnolinguistic nationalism, promote one standard language as part of their strategy for political and cultural homogenisation (see, as a paradigmatic example of a monolingual polity, the case of France as explained by ), remain a challenge.
However, there is no reason to assume that the sociocultural discontinuities resulting from the establishment of a political border are limited to dialect continua traversed by external or inter-state boundaries. In fact, the first objective of this study is to broaden the scope of border studies and show that, in relatively decentralised plurilingual states, internal or intra-state borders may impact dialect continua as much as, or even more than, international borders. In this regard, we pay particular attention to the north-western dialect continuum of the Catalan language, which is crossed by both an external political border (separating Spain from the country of Andorra) and an internal political border (separating the Spanish autonomous communities of Catalonia and Aragon). Figure 1 presents an overview of the political divisions of the Catalan-speaking area across four states (Andorra, France, Italy, and Spain) and, within Spain, across five administrative regions or autonomous communities (Aragon, the Balearic Islands, Catalonia, Murcia, and the Valencian Country). The north-western dialect continuum is presented in §2.
To assess the impact of the internal borders, this paper primarily analyses whether the process of advergence () towards standard Catalan, which took place in Catalonia and Andorra between the first quarter of the 20th century and the early 21st century (as demonstrated, among others, by ), occurred in parallel with a process of advergence towards standard Spanish in the Catalan-speaking counties of Aragon. This parallel development has been largely overlooked until the recent investigation by , who employed a novel multi-method dialectometric approach. Our research aims to corroborate his findings employing a more sociolinguistically oriented methodology because, as states, “this article evinces the need to further develop a form of social dialectometry that not only answers sociolinguistic questions, but also makes it possible to objectively evaluate the social motivations fuelling the ongoing changes”.
Therefore, in addition to the interest attached to the specific findings on the impact of the internal border effect, the second objective of this paper is to determine which social variables contribute to the ongoing process of language change. To do so, we will apply generalized additive mixed-effects regression modelling to a corpus of 53,350 linguistic items collected from 192 speakers divided into four age cohorts. Using a non-linear regression approach enables us to model the non-linear influence of geography (i.e. social contact) on the Levenshtein distances between north-western Catalan and standard Catalan, on the one hand, and between north-western Catalan and standard Spanish, on the other. It also permits considering the influence of other social variables, as well as individual variation. By using this methodological approach, we intend to help eliminate the often all too impervious boundary between two highly related disciplines, because we believe, as do , that “given the close relation between dialectology and sociolinguistics, [...] dialectometric techniques might be deployed more often in sociolinguistic studies than they are now”. It is promising that generalized additive modelling has already been used in relatively recent sociolinguistic studies (; ) after its introduction to dialectometry by .
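As a rough illustration of the distance measure named above, the following sketch computes a plain unit-cost Levenshtein distance between two sequences of phonetic segments and normalises it by the length of the longer sequence. It is a minimal sketch under stated assumptions: this section does not specify which Levenshtein variant or normalisation the study uses, the function names are hypothetical, and the two toy transcriptions are invented placeholders rather than items from the corpus.

```python
def levenshtein(a, b):
    """Unit-cost edit distance between two sequences of segments (assumed variant)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, seg_a in enumerate(a, start=1):
        curr = [i]  # deleting the first i segments of a
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance divided by the longer length (hypothetical normalisation)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Toy placeholder segments, not real transcriptions from the corpus.
dialect_form = ["a", "b", "c", "d"]
standard_form = ["a", "x", "c"]
print(levenshtein(dialect_form, standard_form))          # 2
print(normalized_distance(dialect_form, standard_form))  # 0.5
```

Working on lists of segments rather than raw strings keeps multi-character IPA symbols (for instance, a base symbol plus diacritic) as single units, provided the transcriptions are segmented beforehand.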
Lastly, the third objective of this study is to explore whether the increased advergence to Spanish in Catalan-speaking Aragon correlates with the decline in the ethnolinguistic vitality of Catalan. Addressing the challenge of numerically correlating these two tendencies, which, in past investigations, were only subject to impressionistic comparisons, was one of the tasks identified as requiring attention in future research projects. Furthermore, establishing this correlation might help expand the debate on whether hybridization is a cause, mechanism, or (in line with findings) result of the processes of language shift.
1.1. Forgotten internal borders
One of the indirect consequences of the 1996 publication of volume 10 of the journal Sociolinguistica, on the processes of dialect convergence and divergence in Europe, was the resurgence of scientific interest in political borders in the field of social dialectology. In fact, in their introduction to the above-mentioned volume, not only presented political borders as a source of linguistic divergence (to the point where they also constituted dialect borders), but also gave them a more significant role than certain factors traditionally regarded as crucial in dialectology, such as natural borders.
Despite the revamped interest in borders as facilitators of linguistic divergence over the past 25 years, most studies on the subject have focused their analysis solely on inter-state borders, effectively ignoring intra-state boundaries. The Germanic continuum has, in effect, been the language area in which the impact of political boundaries on the divergent evolution of dialects has been studied in the greatest detail. This impact, in opinion, cannot merely be attributed to the role of these inter-state borders as barriers to communication and mobility, but to the fact that speakers perceive them as the outer limits of the national community (and, therefore, identity). This continuum has been the subject of investigation by numerous authors, including , , , , and , all of whom equate the concept of border to the notion of international border. Even Woolhiser (, ), in his retrospectives on the processes of linguistic convergence and divergence in Europe, centres his attention on the fragmentation of the Germanic continuum, yet emphasises the need to extend the scope of border studies to the Romance and Slavic continua. These two studies, however, do not mention intra-state borders as a possible catalyst for linguistic divergence.
Various other studies have also focused exclusively on the effect of international borders, such as the study of on the impact of the Russian-Finnish border in Karelia, as well as studies conducted outside of Europe, such as, inter alia, , who focuses on the influence of certain features of Canadian English along the international border between Windsor (Ontario) and Detroit (Michigan), or , who explores the process of linguistic divergence of Wolof at the border between Senegal and The Gambia. More recently, international borders have also been the subject of studies which analyse the emergence of new European nation-states and the subsequent use of language as a political means for fuelling ethnolinguistic nationalism (see ).
Despite the preferential consideration afforded to inter-state borders, several recent studies analyse (in line with the tenets of traditional dialectology) the effect of internal borders on dialect continua. While some explore the extent to which certain current political and linguistic borders are remnants of historical migrations (see on the case of Switzerland), others show that such borders are the result of former physical barriers which have since been overcome. In his analysis of the continued existence of the Fenland dialect boundary, for instance, Britain argues that, although the former marshes have been drained, “what was a physical boundary became a linguistic (and social, attitudinal, economic, political and infrastructural) one” () because “the boundary effect of the original marshland, and the consequent boundary effects that this engendered —attitudinal, infrastructural, socio-economic— has shaped people’s routine socio-spatial behaviours” ().
As is clear, these studies analyse the modern-day consequences (political, linguistic, and otherwise) of former ethnic and natural boundaries that did not have a political correlate in the past. The aim of this research, however, is precisely the opposite: to understand why a political border that has remained virtually unchanged since the 14th century, and which had never previously affected (at least noticeably) the north-western dialect continuum of Catalan, gradually became a linguistic border in the 20th century (see §1.3). In our opinion, addressing internal borders and understanding why they have engendered processes of linguistic divergence that are as significant as, or even more significant than, those caused by external borders fills a crucial gap in the field of social dialectology.
1.2. Complex political borders
Due to the vicissitudes of history, national communities that once shared the same language often become divided into different states. As a result, state borders have also become boundaries for policies that often pursue the linguistic homogenisation of the population by promoting knowledge of one single linguistic variety (the standard version of the national language) at the expense of all other varieties (languages or dialects) spoken by the inhabitants. This has given rise to several different scenarios, which some authors have set about systematising. In the literature on language change, the most widely-used form of classifying border areas based on the number and typology of the languages spoken is likely the one put forth by . According to these authors (see also ; ; and ), there are “at least” three potential scenarios or constellations. In the first, the same standard language is spoken on both sides of the state border, as occurs at the German-Austrian border. In the second, different but related standard languages are spoken on either side of the border, such as along the border between The Netherlands and Germany. In the third, although the dialect continuum is also crossed by a state border, the corresponding standard language is used on only one side of the border. As a “genetically unrelated standard” is spoken on the other side of the border, the authors consider the dialect varieties spoken in this territory (such as Albanian in Greece or Swedish in Finland) “roofless dialects”. Other authors (see, for instance, ) have proposed similar classifications.
These classifications can be expanded in several ways. First, they should explicitly acknowledge that internal borders within states may also be the dividing line for language policies that are as antagonistic as, or even more antagonistic than, those between states. In plurilingual states where regions have jurisdiction over language policy, it is not uncommon for a regional language to have official status (or a certain degree of protection) only in part of its historical area, thus becoming the variety of reference for only some of its speakers. In the Catalan-speaking world, for instance, this is the case in the region of Carche, in the autonomous community of Murcia (Spain). Although the region has been Catalan-speaking since its resettlement by Valencian farmers from nearby communities in the 1880s, Catalan has never been officially recognised there. As a result, the regional border between Murcia and Valencia splits speakers into two groups. To the east of the border, Catalan (there referred to as Valencian) is the official language along with Spanish, making it likely for dialects to converge towards the standard variety of Valencian. To the west of the border, Spanish is the sole official language, meaning that the most plausible process of convergence would be towards standard Spanish.
Furthermore, a region’s sociolinguistic reality may be more complex than can be accounted for by these three constellations. For example, in regions where minority languages do benefit from some means of protection (or even, in some cases, full official status), the policies promoting the regional language tend to vary considerably from region to region. For instance, in the Catalan-speaking counties of La Franja, in Aragon (the subject of this study), Catalan is not an official language, but most of the population has access to the Catalan public media and Catalan has been taught in schools as a voluntary subject since the 1984–1985 school year. As a result, Catalan in these counties is simultaneously roofed by two different but related standard languages: the more prestigious and widespread standard Spanish (), and standard Catalan, especially among the young generations (who have mostly received literacy training in Catalan) and activists in favour of the local Catalan language.
Finally, even supposedly similar scenarios can lead to very different outcomes. For instance, while both Catalan and Spanish are co-official languages in the autonomous communities of Catalonia, the Balearic Islands, and the Valencian Country, the language policies enacted to revitalise the language specific to these three regions differ greatly in terms of their intensity (being stronger in Catalonia and less intense in the Balearic Islands, and, especially, in Valencia). As a result of these differences, in the autonomous community of the Valencian Country, the prestige variety for most people continues to be Spanish, with highly negative consequences for the vitality of the Catalan language (see ). We believe that these examples highlight the necessity for further theoretical reflection on the types of border scenarios that contribute to language change. This reflection should encompass intra-state political borders as potential dialect or linguistic boundaries on par with inter-state political borders.
1.3. The Catalan–Aragonese border
The border that currently separates the autonomous communities of Catalonia and Aragon dates back to the 14th century, except in the far north, where the border was not established until 1592. It follows the boundaries laid down between 1300 and 1359 by the Kingdom of Aragon and the Principality of Catalonia, two of the federated states in the so-called Crown of Aragon ().
This means that the region under study here has been crossed by a stable political border (formerly external, currently internal) for over six hundred years, during which Catalan remained the dominant language of its population. In Aragon, Catalan was present as early as the 11th century and would continue to be used in formal settings until 1707, the year in which the Kingdom of Castile conquered Aragon and abolished the legal charters governing the region through the so-called Nueva Planta Decree. Catalan has never regained official status in Aragon, even though, according to the successive Autonomy Statutes enacted in the autonomous community following the restoration of democracy in Spain in 1978, the “protection, recuperation, teaching, promotion, and diffusion” of its “languages and linguistic varieties” has been regulated by law ().
In Catalonia, on the other hand, although Catalan was banished from the public domain following the conquest by the Kingdom of Castile in 1714 and the promulgation of the Nueva Planta Decree in 1716 (a prohibition that persisted for two centuries), the language verged on formal recognition during the Commonwealth of Catalonia (1914-1923) and eventually attained restricted official status () during the Second Spanish Republic (1931-1939). In both instances, the ensuing dictatorships (headed firstly by Miguel Primo de Rivera, from 1923 to 1930, and, secondly, by Francisco Franco, from 1939 to 1975) once again prohibited the public use of Catalan, despite the fact that, during the ‘second Francoism’ (1959-1975), some public manifestations of the language, particularly in literature, were tolerated. With the approval of the Statute of Autonomy of Catalonia in 1979, Catalan was reinstated as an official language, alongside Spanish and, in 2006, Aranese Occitan. Thanks to this status, it has served as the vehicular language of government, education, and public media over the past forty years.
The historically Catalan-speaking area of Aragon (known since the late 1970s as La Franja, literally “the strip”) runs north to south through four counties along the border with Catalonia (La Ribagorça, La Llitera, El Baix Cinca, and El Matarranya). This area had a population of 47,631 people in 2014, constituting 3.59% of Aragon’s total population (see ). Its population is most densely concentrated in the centre of La Franja, while the edges experienced a severe decline in population during the 20th century. In terms of the flow of economic resources and mobility, the central and northern counties tend to engage more with the neighbouring areas in Catalonia, particularly with the city of Lleida (138,956 inhabitants in 2019), while the southern area tends to do so more with communities in Aragon, located further to the west.
In this context, it is plausible to think that the psychological border generated by the political border is a long-standing phenomenon. According to , one must go back to the Reapers’ War (1640–1652) to locate the emergence of an Aragonese collective consciousness in La Franja. This conflict represented the first clash between these counties and the Principality of Catalonia, leading the Aragonese political authorities to initiate the first “great campaign” against Catalan separatism (). These campaigns recurred after the War of Succession (1702–1715), the Peninsular War (1808–1814), and the Spanish Civil War (1936–1939), fostering the promotion of “symbols of collective identity that delineate a clearly distinct image from Catalonia” by the Aragonese authorities (), to the extent that nowadays, in Aragon, “catalanisation equals stigmatisation” (Espluga & Capdevila 1996: 88). Therefore, the inhabitants of La Franja “often find themselves compelled to conceal some basic traits of their everyday life, such as their language”, because making the Catalan language a visible element of their social identity “entails a penalty from the powers and public opinion of the region to which they belong” ().
Despite the distant origin of this deep-rooted identity conflict, it seems that it did not have any tangible linguistic translation until the second half of the 20th century. Indeed, according to a reference manual of Catalan dialectology (), no isoglosses of north-western Catalan coincided with the political border between Catalonia and Aragon, and it was not until the 1990s that some studies identified (exclusively in the northernmost part of the area) a trend that could eventually align the linguistic border with the political border (see B, and ).
Nevertheless, perhaps the clearest evidence that the Francoist dictatorship and the centripetal tendencies promoted by the autonomous communities from 1978 onwards have succeeded in creating a mental border among the inhabitants of La Franja regarding what Catalonia represents is the widespread rejection of the glottonym català in favour of chapurriau. This glottonym, which literally means “poorly spoken language”, likely began gaining popularity sometime between the late 18th and early 19th centuries (). However, it did not surpass the term català —or other local denominations— until after the Civil War (). Even with the restoration of democracy, the normalization of the language name in La Franja has not been facilitated: the glottonym català is not mentioned a single time in the Aragonese Autonomy Statute (), and it was even replaced by a name lacking any philological tradition (LAPAO, ‘Aragonese language specific to the eastern area’) in the second Law of Languages of Aragon ().
As we have seen in §1.1, any process of patoisisation involves a weakening of the awareness of language unity that leads to the abandonment of formal usages and the hybridization of the recessive language. The conflict between mutually exclusive identities () experienced in La Franja precisely explains why certain individuals consistently oppose any attempt to promote the knowledge and use of standard Catalan: the codified language (often biasedly identified as the language of Catalonia) is perceived as a threat to both the distinctive features of the local dialects and the Aragonese identity of the speakers. The reaction to this negative assessment of the standard language has often been to promote the linguistic features that distinguish the local varieties, including numerous loanwords from Spanish (), in a clear example of what calls naboopposisjon (“neighbouring opposition”, cf. ).
One of the goals of this study is precisely to examine whether the border effect between Catalonia and Aragon should be attributed, more than any other social factor, to the long-term consequences of belonging to distinct political entities, or whether, instead, there are other social factors at work in this process of language change.
2. PREVIOUS RESEARCH ON THE BORDER EFFECT BETWEEN CATALONIA AND ARAGON
The term north-western Catalan designates the diverse dialectal varieties of Catalan spoken in Andorra, the western half of Catalonia, and the eastern counties of Aragon. Figure 2 shows the location of these dialects within the Catalan-speaking area.
Following the pioneering work by , the relationship between north-western Catalan dialects and standard Catalan has been investigated from an overall perspective (i.e. including all dialects) by . Their research showed clear structural dialect loss (in contrast to functional dialect loss; see ) due to linguistic advergence to standard Catalan in many north-western Catalan dialects spoken in the Autonomous Community of Catalonia (Spain) and Andorra. From a less general point of view, this process of advergence towards prestige varieties (traditionally central Catalan and, more recently, standard Catalan) has been observed across almost all north-western dialects, including those spoken in the areas of Lleida (), Tortosa (), La Ribagorça (; ), and El Pallars (; ). In fact, dialect levelling due to convergence with standard Catalan has been observed in the majority of Catalan dialects, encompassing the ones spoken in Ibiza (), Majorca (), northern Valencia (), and southern Valencia (), among others.
According to other authors, this process of levelling has occurred parallel to a second process of advergence towards standard Spanish, particularly in areas where Spanish carries more weight as a language of reference. Examples include for southern Valencian; for Majorcan; for all Balearic varieties; and for the Catalan spoken in Benavarri (La Franja).
In this context, provided evidence that the dialect levelling taking place in Catalonia and Andorra strongly contrasts with the (apparent) relative stability of the Catalan dialects in Aragon. They showed that this situation resulted in a significant border effect between the varieties located on either side of the Catalan-Aragonese border and in the subsequent weakening of the traditional north-western dialect continuum. This border effect was argued to be rooted in the opposing sociolinguistic situations of the three studied regions, as Catalonia and Andorra have strong language policies in place to support Catalan, while Aragon does not. The analysis conducted by was based on a contemporary corpus comprising 363 items per informant, collected from 320 participants (54.6% male and 45.4% female) originating from 40 localities (2 from Andorra, 8 from Aragon, and 30 from Catalonia) and evenly distributed across four age cohorts (in each locality, 8 informants were surveyed, 2 per age group). As not all informants were able to provide all survey items, the final corpus consisted of 113,749 items. A range of dialectometric methods was applied to calculate and analyse the linguistic distance between varieties from an aggregate perspective.
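The corpus figures reported for this earlier study can be cross-checked with simple arithmetic. The sketch below only restates numbers given in the paragraph above; the variable names are hypothetical, and the completeness ratio is a derived quantity that is not reported in the text.

```python
# Figures as reported for the earlier aggregate study.
localities = 40
informants_per_locality = 8
items_per_informant = 363
collected_items = 113_749

informants = localities * informants_per_locality  # 40 * 8
max_items = informants * items_per_informant       # ceiling if every item were elicited
completeness = collected_items / max_items         # derived share of elicited items

print(informants, max_items, f"{completeness:.1%}")  # 320 116160 97.9%
```

On these figures, roughly 2% of the possible items were not provided by the informants.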
However, since languages are dynamic by nature, the aggregate approach might be concealing some less remarkable changes in quantitative terms (as compared to the substantial standardisation process undergone in Catalonia and Andorra) that might be affecting the Aragonese varieties as a result of their isolation from standard Catalan. In line with the statement made by that “in situations of sustained isolation, internal tendencies possibly have free play”, the Aragonese varieties were expected to be evolving simultaneously as a result of both internal factors (such as analogy or paradigm simplification) and external factors (namely the prestige of standard Spanish, which is the only official language in Aragon, as explained in §1.3).
To explore these hypotheses, Valls (, ) switched to qualitative methods to reanalyse the same corpus and discovered that internal factors had indeed triggered several intrasystemic changes, such as the inter- and intra-paradigmatic expansion of the velar and palatal verbal inflections and the simplification of the binary combinations between a third person dative clitic and an accusative clitic. These regional intrasystemic changes are a clear sign of divergence from the standard language (an “unlikely scenario in the present-day European context”, according to , as cited by ) and offer insight into how the north-western varieties may have evolved in Catalonia and Andorra had they not been subject to the influence of the standard language.
However, Valls’ results showed that these innovations had mainly spread across the central and southern counties of Aragon (where the ethnolinguistic vitality of Catalan is higher, see §3), and that convergence towards standard Spanish was the main trigger for linguistic change in the northern Aragonese counties (where the use of Catalan has declined considerably among younger speakers). In the northernmost county of La Ribagorça, for example, Valls (, ) found evidence that some speakers were phonologically, morphologically, and lexically converging with Spanish by reducing their phonemic inventory, adopting several Spanish clitic combinations, and incorporating many lexical transfers from Spanish. These results confirmed that the border effect was not only taking place due to the above-mentioned process of vertical advergence towards standard Catalan in Catalonia and Andorra, but also as the outcome of internal developments within Aragonese varieties and the (more recent, widespread, and powerful) process of vertical advergence towards standard Spanish in the Catalan-speaking counties of Aragon. However, since the approach was qualitative, Valls (, ) was unable to measure the scale of this trend.
At the same time, and in efforts to reply to certain objections, such as view that dialectometry systematically ignores social variables, attempted to include candidate social variables as well as geography in one single aggregate (dialectometric) analysis of the same data. To investigate the existence of a border effect between Aragon, on the one hand, and Catalonia and Andorra, on the other, they applied a generalized additive mixed-effects regression model to the same corpus. This made it possible to examine the effect of several speaker-related variables (year of birth, gender, and level of education) and location-specific social variables (community size, average community age, average community income, and the importance of tourism) on the evolution of the north-western Catalan dialects. Although their analysis clearly showed that speakers in Aragon were at a greater linguistic distance from standard Catalan than those in Catalonia and Andorra, they did not find support for the importance of the above-mentioned social variables, with the exception of speaker age. In fact, their analysis found that speaker age had a significant effect (with pronunciations closer to standard Catalan among younger speakers) in Catalonia and Andorra, but not in Aragon. is likely the first study to combine dialectometry and social dialectology to show that, in certain circumstances, internal borders may have a higher impact on a dialect continuum than external borders. In this paper, we give evidence that it is important to take this combination of methodological approaches into consideration in border studies, as it retains the aggregate perspective without overlooking the importance of social variables.
3. HYPOTHESES
This study takes a step further in the analysis of the internal border effect between Catalonia and Aragon. We focus exclusively on 4 Aragonese counties and 8 Catalan counties located along the political border. This enables us to examine whether the Catalan varieties spoken in Aragon are undergoing a process of vertical advergence towards Spanish, as previous qualitative (Valls , ) and quantitative () results have suggested. The aim is to demonstrate, employing a specifically sociolinguistically oriented methodology for the first time, our first hypothesis: that the border effect is not just the result of a single process of vertical advergence, but the combined action of two simultaneous processes of vertical advergence (towards standard Catalan in Catalonia and towards standard Spanish in Aragon) that have increased divergence at the border.
In addition, to determine which specific social variables might be contributing to this internal border effect, we will examine the same speaker-related and location-specific social variables as those utilized by . As a second hypothesis, we expect that at least region and speaker age will significantly predict linguistic change (see §4.2).
We also know that there are major differences in the ethnolinguistic vitality of Catalan across the 250 kilometres stretching from north to south encompassed by the 4 Catalan-speaking counties of Aragon. According to the most recent surveys available (; based on data collected in 2014), Catalan remains the habitual language of communication at home in the southernmost county of El Matarranya (where it is used by 65.5% of the population), whereas in the northernmost county of La Ribagorça only 38.3% of the population regularly speak Catalan at home (for more details, see Table 2 in §4.2).
In this regard, it is important to bear in mind that the decline of Catalan as a first (and most used) language is a recent phenomenon. Unfortunately, indicators disaggregated by counties are not available. However, according to , Catalan was the initial language for 71.1% of the population in La Franja in 2004, but this share decreased to 52.8% in just ten years. In terms of age groups, in 2014, as much as 68.4% of the population aged 65 and older identified Catalan as their first language, in contrast to only 34% among speakers aged 15–29. This generational shift appears to be the result of both demographic movements (such as the arrival of a large number of immigrants from Eastern Europe to work in the primary sector) and attitudinal changes (the diminishing prestige of Catalan leading some parents to interrupt the transmission of the language to their children in the northern counties).
Our third hypothesis in this study is, therefore, that the advergence process towards Spanish, despite occurring throughout La Franja, is stronger in the northern counties (as previous results also indicate) than in the central and southern counties, and that there is a correlation between this process of hybridization and the ethnolinguistic vitality of Catalan. We explore this relationship in greater depth using the sociolinguistic indicators described in §4.2.
4. MATERIALS
4.1. Pronunciation data
The Catalan dialect dataset contains basilectal phonetic transcriptions (using the International Phonetic Alphabet) of 275 words in 24 dialectal varieties, in addition to standard Catalan and Spanish. To compile this corpus, we transcribed supplementary data collected from the same informants used by . This dataset includes phonetic, phonological, morphological (both nominal and verbal), and, especially, lexical information, amounting to a total of 53,350 words. While 52,800 words were collected during the interviews, the remaining 550 words correspond to transcriptions of standard Catalan and Spanish pronunciations based on the respective official grammars (; ).
The 24 locations are spread out over 8 Catalan and 4 Aragonese border counties located near the border that separates the Spanish autonomous communities of Catalonia and Aragon (see Figure 3). For each of these 12 counties, both the capital city and a rural village were chosen as data collection sites (see Table 1). In every location, 8 speakers were interviewed, 2 per age group (F4: born between 1917 and 1930; F3: born between 1946 and 1960; F2: born between 1975 and 1982; F1: born between 1991 and 1995). The median years of birth per age cohort are 1922 (F4), 1953 (F3), 1978 (F2), and 1993 (F1). Although there is a similar number of male and female informants in the corpus, and an attempt was made to collect data from one man and one woman in each age group of each locality, it was not always possible. The proportion of women, therefore, is slightly higher in groups F3, F2, and F1, while the proportion of men is higher in group F4. All data was transcribed by one single transcriber (the first author of this paper), who also conducted the fieldwork for the youngest age group between 2008 and 2011. The fieldwork for the other age groups was performed by another fieldworker (Mar Massanell) between 1995 and 1996.
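The corpus size reported in §4.1 follows directly from this sampling design. As a quick consistency check, the arithmetic can be sketched as follows (the variable names are ours and purely illustrative):

```python
# Sampling design as reported in the text.
COUNTIES = 8 + 4              # Catalan + Aragonese border counties
LOCATIONS_PER_COUNTY = 2      # capital city + rural village
AGE_GROUPS = 4                # F4, F3, F2, F1
SPEAKERS_PER_AGE_GROUP = 2
WORDS = 275                   # items transcribed per speaker
STANDARDS = 2                 # standard Catalan and standard Spanish

locations = COUNTIES * LOCATIONS_PER_COUNTY
speakers = locations * AGE_GROUPS * SPEAKERS_PER_AGE_GROUP
interview_items = speakers * WORDS
standard_items = STANDARDS * WORDS
total_items = interview_items + standard_items

print(locations, speakers, interview_items, standard_items, total_items)
# prints: 24 192 52800 550 53350
```

The totals match the 52,800 interview transcriptions, the 550 standard-language transcriptions, and the overall figure of 53,350 given above.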
The fact that the gap between the median birth years of F4 and F3, and of F3 and F2 (31 and 25 years, respectively), is larger than that between F2 and F1 (only 15 years) has two explanations. First, we wanted to interview individuals from the two youngest age groups who were about to finish compulsory education and who, as a result, had been taught exclusively in Catalan in Catalonia or partially in Catalan in Aragon, as opposed to members of the two oldest age cohorts, all of whom had been taught exclusively in Spanish. This focus reflects our interest in the introduction of Catalan into the educational systems of Aragon (as an optional subject) and Catalonia (as the vehicular language of education) in the early 1980s. Second, we wanted to interview informants whose idiolects were as stable as possible, given that, as research shows, idiolects do not typically stabilise until late adolescence () or early adulthood (). Updating the F1 group with younger speakers (which would have resulted in a more balanced distribution of birth years) would have been problematic because we would be comparing informants at the end of compulsory education (F2) with others still distant from the end of this educational stage (F1). Among the latter, furthermore, we might have tapped into unstable idiolects and, therefore, drawn ill-founded conclusions.
4.2. Sociolinguistic data
We used the same speaker-related variables (year of birth, gender, and level of education) and location-specific social variables collected by . This second group of variables includes the number of inhabitants (i.e. community size), the average community age, the average community income, and the relative number of tourist beds per inhabitant (a metric used to estimate the importance of tourism) in the most recent year available at the time this study was conducted (ranging between 2007 and 2010).
We also took into account several additional sociolinguistic variables (all extracted from ) to explore whether there was a relationship between the advergence process towards Spanish potentially taking place in Aragon and the (declining) ethnolinguistic vitality of Catalan in La Franja. More specifically, we decided to focus our attention on 4 variables (see Table 2): 1) the percentage of people who use Catalan as their home language, i.e. the language that is most commonly spoken at home with other members of the family (“CatalanHome”), 2) the percentage of people with progeny who speak Catalan as a first language and transmit it to their children (“CatalanChildren”), 3) the percentage of people with progeny who speak Spanish as a first language and transmit it to their children (“SpanishChildren”), and 4) the Intergenerational Transmission Index (ITI). This index (which was proposed by and has been widely used in Catalan sociolinguistics since then) is obtained by calculating the difference between the percentages of language use with parents and with children with the following formula:
$$\mathrm{ITI} = \frac{\%\mathrm{LACh} - \%\mathrm{LAP}}{100}$$
where:
%LACh = percentage of those who speak language A exclusively or predominantly with their children;
%LAP = percentage of those who speak language A exclusively or predominantly with their parents.
The result is an index that ranges from -1 to +1, measuring the intensity of regression or advancement in language intergenerational transmission.
As can be observed in Table 2, the southern and central counties of El Matarranya and El Baix Cinca have a positive ITI, whereas in the northern counties of La Llitera and La Ribagorça, the ITI values are negative. This is because in El Matarranya and El Baix Cinca more people use Catalan with their children than with their own parents (differences of 3.9 and 2.6 percentage points, respectively, which, expressed as a fraction of one and rounded to two decimal places, give an ITI of 0.04 for El Matarranya and 0.03 for El Baix Cinca). In La Llitera and La Ribagorça, in contrast, people tend to use Catalan less with their children than with their parents (differences of -0.8 and -1.6 percentage points, respectively, which, expressed as a fraction of one and rounded to two decimal places, give an ITI of -0.01 for La Llitera and -0.02 for La Ribagorça). It should be noted that ITI and CatalanChildren are very strongly related (they correlate with r = 0.95).
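The ITI values just discussed can be reproduced with a few lines of code. The sketch below assumes, in line with the "fraction of one" wording above, that the index is the difference in percentage points divided by 100. The function names are ours, and only the four differences quoted in the text are used as input.

```python
def iti_from_percentages(pct_with_children, pct_with_parents):
    """ITI from the two percentages of use of language A (0-100 scale)."""
    return (pct_with_children - pct_with_parents) / 100


def iti_from_difference(difference_in_points, ndigits=2):
    """ITI from an already computed difference in percentage points."""
    return round(difference_in_points / 100, ndigits)


# Differences (children minus parents) quoted in the text, in percentage points.
DIFFERENCES = {
    "El Matarranya": 3.9,
    "El Baix Cinca": 2.6,
    "La Llitera": -0.8,
    "La Ribagorça": -1.6,
}

iti_by_county = {county: iti_from_difference(diff)
                 for county, diff in DIFFERENCES.items()}

for county, value in iti_by_county.items():
    trend = "advancing" if value > 0 else "receding"
    print(f"{county}: ITI = {value:+.2f} ({trend})")
```

A positive value indicates that the language is used more with children than with parents, and a negative value indicates the opposite.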
5. METHODS
5.1. Obtaining pronunciation distances
We calculated the pronunciation distance between the standard Catalan and standard Spanish pronunciations and their dialectal counterparts for all 192 speakers using a modified version of the Levenshtein distance (). The Levenshtein distance between two strings is the minimum total cost of the insertions, deletions, and substitutions needed to transform one string into the other. While the regular Levenshtein distance does not distinguish between different types of substitutions, the adapted Levenshtein distance (), which we use here, incorporates automatically determined, sensitive sound distances. For example, the cost of substituting an /i/ with an /a/ will be higher than the cost of substituting an /i/ with an /e/. In line with the approach taken by , we normalised the Levenshtein distances by dividing them by the alignment length, calculated the logarithm of these values (to adjust for skewness) and centred these values by subtracting the mean value. We calculated the adjusted Levenshtein distances using both standard Catalan and standard Spanish as a reference, to obtain two series of Levenshtein distances for each dialectal pronunciation.
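The steps just described (weighted alignment, normalisation by alignment length, log transformation, and centring) can be illustrated with a minimal sketch. It is not the implementation used in this study. The substitution costs in `SUB_COST` are invented placeholders rather than the induced sound distances, the indel cost of 1, the tie-breaking rule in favour of the longer alignment, and the small constant `epsilon` added before taking the logarithm (to avoid log 0 for identical pronunciations) are all assumptions of ours.

```python
import math

# Hypothetical substitution costs, for illustration only.
SUB_COST = {("i", "e"): 0.3, ("i", "a"): 0.8}


def sub_cost(a, b):
    """Cost of substituting segment a with b (default 1.0 if unlisted)."""
    if a == b:
        return 0.0
    return SUB_COST.get((a, b), SUB_COST.get((b, a), 1.0))


def weighted_levenshtein(source, target, indel=1.0):
    """Return (cost, alignment length) of a minimum-cost alignment."""
    n, m = len(source), len(target)
    # Each cell stores (cumulative cost, number of alignment slots so far).
    table = [[(0.0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = (i * indel, i)
    for j in range(1, m + 1):
        table[0][j] = (j * indel, j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            deletion = table[i - 1][j]
            insertion = table[i][j - 1]
            diagonal = table[i - 1][j - 1]
            candidates = [
                (deletion[0] + indel, deletion[1] + 1),
                (insertion[0] + indel, insertion[1] + 1),
                (diagonal[0] + sub_cost(source[i - 1], target[j - 1]),
                 diagonal[1] + 1),
            ]
            # Lowest cost wins; ties go to the longer alignment (assumption).
            table[i][j] = min(candidates, key=lambda c: (c[0], -c[1]))
    return table[n][m]


def centred_log_distances(pairs, epsilon=0.01):
    """Normalise by alignment length, log-transform, and centre on the mean."""
    values = []
    for dialect, standard in pairs:
        cost, length = weighted_levenshtein(dialect, standard)
        normalised = cost / length if length else 0.0
        values.append(math.log(normalised + epsilon))
    mean = sum(values) / len(values)
    return [value - mean for value in values]


# Dialectal vs. standard Catalan form of 'house' (stress mark omitted).
print(weighted_levenshtein("kazɛ", "kaza"))
```

Strings are treated here as sequences of single-character segments. Transcriptions containing diacritics or multi-character segments would need to be tokenised into segment lists first.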
5.2. Generalized additive mixed-effects regression modelling
As in , we used generalized additive mixed-effects regression modelling to analyse our data. While a complete description of this approach may be found in the above-mentioned study, it should be noted that this non-linear regression approach enables us to model the non-linear influence of geography (i.e. social contact) on the Levenshtein distances between north-western Catalan and standard Catalan, on the one hand, and between north-western Catalan and standard Spanish, on the other, while taking into account the influence of individual variation and other social variables.
Specifically, we created two separate generalized additive models, one for each standard language, assessing the inclusion of random intercepts for speaker, word, and location. We also assessed the non-linear effect of geography using longitude and latitude coordinates, and the inclusion of all aforementioned social variables (see §4.2). In addition to this, to assess the influence of the sociolinguistic indicators shown in Table 2, we fitted another generalized additive mixed-effects regression model that did not take into account the influence of geography (as there were only eight locations).
6. RESULTS
For both models (for both standard languages, Spanish and Catalan), the non-linear pattern for geography was significant (p < 0.001), as shown in Figures 4 and 5. Figure 4 shows that the varieties spoken in Catalonia are more similar to standard Catalan (indicated in green) than the varieties spoken in La Franja (indicated in yellow and pink). The varieties with the greatest distance from standard Catalan may be found in central and, in particular, northern Aragon, while in the southernmost reaches of the studied area, no remarkable differences were observed between the dialects on either side of the political border. Figure 5, however, shows that the linguistic distance from standard Spanish is smaller among all the varieties spoken in La Franja, particularly those in the central and northern counties again, but also (to a lesser extent) in the southernmost varieties of El Matarranya. These results clearly validate the first hypothesis of this study, namely the existence of a border effect between the north-western varieties of Catalonia and Aragon, which was not attested in the classical descriptions of this dialectal continuum ().
Obviously, these results do not allow us to draw conclusions as to how dynamic the process of language change is, because they provide a static view of the dialectal landscape. As a result, we cannot determine from them alone whether the border effect has increased over time. However, the interaction between the only other significant variables, the speaker’s year of birth and the region (Aragon vs. Catalonia; p < 0.001), which was anticipated in the second hypothesis, does provide insights. The interactions are visualized for each reference language in Figures 6 and 7. When using standard Catalan as a reference (see Figure 6), the pronunciation of younger speakers in Catalonia tends to be closer to the standard than that of older speakers (β = -0.00014, SE: 0.00007, p = 0.046). Among Aragonese speakers, this pattern is completely inverted (β = 0.00066, SE: 0.00009, p < 0.001): younger speakers diverge further from standard Catalan than their older counterparts. This indicates that the information depicted in Figures 4 and 5 should be understood as the result of a dual process of change: while in Catalonia the younger generations tend to converge with standard Catalan, in Aragon the trend is to diverge. However, based on the data examined so far, it is still unclear whether this divergence (from standard Catalan and, indirectly, from the varieties spoken to the east of the political border) is the result of a second process of vertical advergence between the north-western Aragonese varieties and standard Spanish.
This may indeed be observed in Figure 7: when using standard Spanish as a reference, the pronunciation among younger speakers in Aragon tends to be closer to the standard than that of older speakers (β = -0.00021, SE: 0.00008, p = 0.005). Indeed, this progressive convergence towards standard Spanish is significantly greater in Aragon than in Catalonia, although to the east of the border, a slight convergence towards standard Spanish is also detected (β = -0.00026, SE: 0.00006, p < 0.001).
Regarding the latter, one might initially assume that, given its status as one of the official languages of Catalonia, Spanish might be exerting an influence in this region similar to the one observed on other Catalan dialects (see §2). However, the qualitative analysis of the data prompts us to consider, in light of evidence suggesting minimal interference from Spanish that has remained stable over time (see ), that this convergence is, in fact, a byproduct of the standardization process in Catalonia. For example, in the central counties of the studied area, the final post-tonic /a/ (a context abundant in the corpus) was traditionally pronounced [ε] (e.g., [ˈkazε] for ‘house’). Today, however, this feature is losing ground to the standard pronunciation [a] ([ˈkaza]), which is at the same time closer to the Spanish pronunciation [ˈkasa]. Likewise, replacing the conventional north-western pronunciation, exemplified by an initial [a] in words such as [aɾiˈso] (‘eriçó’, meaning ‘hedgehog’), with the standard pronunciation featuring [e] (e.g., [eɾiˈso]), results in a pronunciation more akin to the Spanish [eˈɾiθo], a context that is again very abundant in the corpus. Identifying such side-effects is crucial to avoid drawing ill-founded conclusions based on the proposed methodology.
For both models, the random-effect factor for location was not significant. This is likely due to the inclusion of the non-linear effect of geography. We therefore included only the necessary random intercepts for word and speaker, as well as by-word random slopes for the speaker’s year and region of birth (Aragon vs. Catalonia).
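As a reading aid, the structure of the two models described in this section can be summarised schematically. The notation below is ours and is only a sketch: the exact coding of the predictors and the specification of the geographic smooth are assumptions, not a restatement of the fitted models.

$$
d_{sw} = \beta_0 + \beta_1 R_s + \beta_2 Y_s + \beta_3 (Y_s \times R_s) + f(\mathrm{lon}_{l(s)}, \mathrm{lat}_{l(s)}) + a_s + a_w + b_w Y_s + c_w R_s + \varepsilon_{sw}
$$

Here $d_{sw}$ is the centred log-transformed normalised Levenshtein distance of speaker $s$'s pronunciation of word $w$ from the reference standard, $Y_s$ is the speaker's year of birth, $R_s$ is the region (Aragon vs. Catalonia), $f$ is the non-linear smooth over the coordinates of the speaker's location $l(s)$, $a_s$ and $a_w$ are the random intercepts for speaker and word, and $b_w$ and $c_w$ are the by-word random slopes for year of birth and region.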
Finally, in the best model for investigating whether the increased advergence to Spanish in Catalan-speaking Aragon correlates with the decline in the ethnolinguistic vitality of Catalan (i.e. for testing the third hypothesis), only two fixed-effect variables proved necessary. These were the speakers’ year of birth (which showed a greater convergence towards standard Spanish among younger speakers than among older speakers: β = -0.00048, SE: 0.0001, p < 0.001) and the percentage of speakers in each location who transmitted Catalan to their children (which showed a greater distance from standard Spanish in locations where more parents transmit Catalan to their children: β = 0.0085, SE: 0.0024, p < 0.001). The random-effects structure was similar to that of the other models, except for the inclusion of a random intercept for location (as no non-linear effect for geography was included). These results confirm that the greater structural loss observed in the varieties spoken in the northern half of La Franja is linked to a decline in ethnolinguistic vitality and, in short, to the shift away from Catalan in these counties.
7. DISCUSSION
In relation to the first research hypothesis, this study has demonstrated, employing a specifically sociolinguistically oriented methodology for the first time, that the border effect between Catalonia and Aragon is not solely the result of a single process of vertical advergence towards standard Catalan in Catalonia, but involves a second process of vertical advergence towards standard Spanish in Aragon. This finding is in line with previous qualitative (Valls , ) and quantitative () results. Compared to the latter, the added value of this paper lies in demonstrating this double process of vertical advergence using a methodology (generalized additive mixed-effects regression modelling) that addresses the “need to further develop a form of social dialectometry that not only answers sociolinguistic questions, but also makes it possible to objectively evaluate the social motivations fuelling the ongoing changes”, as claimed by . This work also illustrates how a border dating back to the 14th century, which had no tangible linguistic correlate until the second half of the 20th century (), has gradually become a linguistic border, to the point that the linguistic border could eventually coincide with the political one, as some authors have suggested (see , and ).
The comparison of these results with those of , which demonstrate that, in Catalonia and Andorra, the north-western varieties evolved as if there were no state border between them, also indicates that intra-state borders can impact dialect continua more significantly than inter-state borders because they can be the dividing line for language policies that are as antagonistic as, or even more antagonistic than, those between states. Rather than accepting the assertion (as cited by ) that situations of divergence between a dialectal variety and its standard reference (such as the one observed between north-western and standard Catalan in Aragon) are “unlikely scenarios in the present-day European context”, we propose that a more comprehensive analysis of regions intersected by internal borders may unveil additional instances akin to the one described. This undertaking should help to spark a renewed theoretical reflection on the perhaps overly restrictive types of border scenarios that have been proposed thus far.
Additionally, this study has examined (following the steps of ) the effect of several speaker-related variables and location-specific social variables on the evolution of the north-western Catalan dialects. However, as expected in the second hypothesis, only two of these variables (the effect of the speaker’s year of birth and the region) have proven to be significant in predicting linguistic change. Consequently, it seems clear that the border effect between Catalonia and Aragon should be attributed, more than any other social factor, to the long-term consequences of belonging to different administrative realities with historically antagonistic language policies. Furthermore, psychological and identity-related barriers (often promoted by the Aragonese political powers, as evidenced, among others, by , , and ) have also played a significant role in shaping these linguistic divisions.
Nevertheless, for centuries after its establishment in the 14th century, the boundary left the dialectal continuum largely unaffected. This leads us to consider that, despite the longstanding efforts to foster the emergence of an Aragonese identity in La Franja, it was only in the second half of the 20th century that first the Francoist dictatorship and later, since the restoration of democracy, the increased political capacity of the Spanish regions to promote regional identities succeeded in transforming the political border into a psychological one. This psychological border has materialized, for example, in the explicit rejection of the glottonym català and the majority adoption of the glottonym chapurriau along with other local designations.
In fact, as we have seen in §1.1 under the term patoisisation, the denialism surrounding the unity of the language, the abandonment of formal language usage, and the hybridization of the recessive language are common indicators of a process of language substitution. Precisely, the third hypothesis of this study posits that the advergence process towards Spanish, although occurring throughout La Franja, is stronger in the northern counties (as previous results also indicate) than in the central and southern counties, and that it is possible to establish a correlation between this process of hybridization and the ethnolinguistic vitality of Catalan. Addressing the challenge of numerically correlating these two tendencies, which, in past investigations, were only subject to impressionistic comparisons, was one of the tasks that required attention in future research projects, as emphasized by .
Indeed, the results have confirmed that the greater structural loss observed in the varieties spoken in the northern half of La Franja is linked to a decline in ethnolinguistic vitality and, ultimately, to the shift away from Catalan in these counties. From a theoretical standpoint, these results seem to indicate (in line with perspective) that hybridization, as a non-essential factor for language substitution, is an outcome of a process of language shift rather than a cause or mechanism. This outcome highlights the failure of successive Aragonese governments to protect and recover Aragon’s “languages and linguistic varieties” in compliance with its Autonomy Statute (), because Catalan is spoken less and is subject to more interference from Spanish in Aragon today than when these laws were passed. Therefore, social dialectometry has proven to be an effective tool for assessing the impact of language policies in sociolinguistic studies.
8. CONCLUSIONS
The primary objective of this study was to demonstrate that in relatively decentralised states like Spain, internal borders may have as great or even greater an impact than external borders on dialect continua. Our analysis confirmed that the border effect between Catalonia and Aragon is not solely the result of a single process of vertical advergence but rather the combined action of two simultaneous processes of vertical advergence (towards standard Catalan in Catalonia and towards standard Spanish in Aragon) that have increased divergence at the border. Given that this internal border effect has demonstrated a more pronounced impact than the international border between Spain and Andorra (cf. ), these results underscore the importance of investigating internal borders in linguistic convergence and divergence studies.
As a secondary objective, we aimed to bridge the gap between dialectometry and sociolinguistics, striving to identify the social factors that play a role in the ongoing language change on either side of the border. Since our findings indicate that the main predictors of change are the speakers’ year and region of birth, we assert that the border effect between Catalonia and Aragon should be primarily attributed, more than any other social factor, to the enduring consequences of belonging to different administrative realities with historically antagonistic language policies, and to the psychological and identity-related barriers that these borders have generated. The methodology employed underscores, in our view, the potential of social dialectometry to examine language change from a multidisciplinary perspective.
Lastly, this paper provides empirical support for the hypothesis that, throughout a process of language substitution, the decline in ethnolinguistic vitality of the recessive language correlates with an increase in structural hybridization (a common outcome in language shift processes) caused by advergence towards the expanding language.
This approach may be applied in countless border areas. Concerning the Catalan-speaking regions, a more in-depth analysis should be conducted to examine the impact of the triple border between the autonomous communities of Catalonia, Valencia, and the Balearic Islands on the language. Furthermore, on the Iberian Peninsula, there is a need to further analyze the evolution and vitality of Galician at the political crossroads between Galicia (where Galician is an official language), Asturias and Castile-Leon (where it has practically no protection), and northern Portugal. Additionally, social dialectometry might be useful to investigate the impact of the political division of the Basque language across two states and several autonomous communities (in Spain) and regions (in France) on its vitality and evolution. We hope this paper stimulates future research within the framework of social dialectometry in these and other border areas.
Acknowledgments
This research was partly funded by the project Native and non-native phonology: analysis and creation of digital resources (PID2020-113971GB-C21), financed by MCIN / AEI / 10.13039/501100011033.
References
Alturo, Núria & Maria Teresa Turell. 1990. Linguistic change in El Pont de Suert: The study of variation of /ʒ/. Language Variation and Change 2, 19-30. https://doi.org/10.1017/S0954394500000247.
Auer, Peter & Frans Hinskens. 1996. The convergence and divergence of dialects in Europe. New and not so new developments in an old area. In Ulrich Ammon, Klaus Mattheier & Peter Nelde (eds.), Sociolinguistica. International Yearbook of European Sociolinguistics. Convergence and divergence of dialects in Europe. 1-30. Tübingen: Max Niemeyer Verlag. https://doi.org/10.1515/9783110245158.1.
Auer, Peter. 2005. The construction of linguistic borders and the linguistic construction of borders. In Markku Filppula, Juhani Klemola, Marjatta Palander & Esa Penttilä (eds.), Dialects across borders. 3-30. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/cilt.273.03aue.
Bailey, Guy. 2004. Real and Apparent Time. In Jack Chambers, Peter Trudgill & Natalie Schilling-Estes (eds.), The Handbook of Language Variation and Change. 312-332. Malden, Oxford & Carlton: Blackwell. https://doi.org/10.1002/9780470756591.ch12.
Boberg, Charles. 2000. Geolinguistic diffusion at the U.S.-Canada border. Language Variation and Change 12, 1-24. https://doi.org/10.1017/s0954394500121015.
Britain, David. 2014. Where North meets South? Contact, divergence, and the routinisation of the Fenland dialect boundary. In Dominic Watt & Carmen Llamas (eds.), Language, Borders and Identity. 27-43. Edinburgh: Edinburgh University Press. https://doi.org/10.1515/9780748669783-006.
Britain, David. 2015. Between North and South: The Fenland. In Raymond Hickey (ed.), Researching Northern English. 417-436. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/veaw.g55.18bri.
Castellà, Carles M. 2016. La varietat lingüística geogràfica del Baix Ebre i el procés d’estandardització. Manteniment i pèrdua de formes pròpies. Un tast. In Joaquim Mallafrè & Miquel Àngel Pradilla (eds.), Jornades de la Secció Filològica de l’Institut d’Estudis Catalans a Móra la Nova (17 i 18 d’octubre de 2014). 169-200. Barcelona: Institut d’Estudis Catalans & Institut Ramon Muntaner.
Derungs, Curdin, Christian Sieber, Elvira Glaser & Robert Weibel. 2019. Dialect borders—political regions are better predictors than economy or religion. Digital Scholarship in the Humanities 35(2), 276-295. https://doi.org/10.1093/llc/fqz037.
Fruehwald, Josef. 2017. Generations, lifespans, and the Zeitgeist. Language Variation and Change 29(1), 1-27. https://doi.org/10.1017/s0954394517000060.
Gerritsen, Marinel. 1999. Divergence of dialects in a linguistic laboratory near the Belgian‑Dutch‑German border: Similar dialects under the influence of different standard languages. Language Variation and Change 11, 43-65. https://doi.org/10.1017/s0954394599111037.
Heeringa, Wilbert, John Nerbonne, Hermann Niebaum & Rogier Nieuweboer. 2000. Dutch-German contact in and around Bentheim. In Dicky Gilbers, John Nerbonne & Jos Schaeken (eds.), Languages in Contact. 145-156. Amsterdam & Atlanta: Rodopi. https://doi.org/10.1163/9789004488472_014.
Hinskens, Frans, Jeffrey L. Kallen & Johan Taeldeman. 2000. Merging and drifting apart. Convergence and divergence of dialects across political borders. International Journal of the Sociology of Language 145, 1-28. https://doi.org/10.1515/ijsl.2000.145.1.
Hinskens, Frans, Peter Auer & Paul Kerswill. 2005. The study of dialect convergence and divergence: conceptual and methodological considerations. In Peter Auer, Frans Hinskens & Paul Kerswill (eds.), Dialect Change. Convergence and Divergence in European Languages. 1-48. Cambridge: Cambridge University Press. https://doi.org/10.1017/cbo9780511486623.003.
Kühl, Karoline & Kurt Braunmüller. 2014. Linguistic stability and divergence. In Kurt Braunmüller, Steffen Höder & Karoline Kühl (eds.), Stability and Divergence in Language Contact. 13-38. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/silv.16.02kuh.
Ley 3/2013, de 9 de mayo, de uso, protección y promoción de las lenguas y modalidades lingüísticas propias de Aragón. Boletín Oficial del Estado, 138, de 10 de junio de 2013, 43654-43662. https://www.boe.es/eli/es-ar/l/2013/05/09/3/con.
Ley Orgánica 5/2007, de 20 de abril, de reforma del Estatuto de Autonomía de Aragón. Boletín Oficial del Estado, 97, de 23 de abril de 2007, 17822-17841. https://www.boe.es/eli/es/lo/2007/04/20/5/con.
Mattheier, Klaus J. 1996. Varietätenkonvergenz: Überlegungen zu einem Baustein einer Theorie der Sprachvariation. In Ulrich Ammon, Klaus J. Mattheier & Peter H. Nelde (eds.), Sociolinguistica. International Yearbook of European Sociolinguistics. Convergence and divergence of dialects in Europe. 31-52. Tübingen: Max Niemeyer Verlag. https://doi.org/10.1515/9783110245158.31.
Nábělková, Mira. 2016. The Czech-Slovak Communicative and Dialect Continuum: With and Without a Border. In Tomasz D. Kamusella, Motoki Nomachi & Catherine Gibson (eds.), The Palgrave Handbook of Slavic Languages, Identities and Borders. 140-184. London: Palgrave Macmillan. https://doi.org/10.1007/978-1-137-34839-5_8.
Sarhimaa, Anneli. 2000. The divisive frontier: The impact of the Russian-Finnish border on Karelian. International Journal of the Sociology of Language 145, 153-180. https://doi.org/10.1515/ijsl.2000.145.153.
Tamminga, Meredith, Christopher Ahern & Aaron Ecay. 2016. Generalized Additive Mixed Models for intraspeaker variation. Linguistics Vanguard 2(s1), 33-41. https://doi.org/10.1515/lingvan-2016-0030.
Trudgill, Peter. 1988. On the role of dialect contact and interdialect in linguistic change. In Jacek Fisiak (ed.), Historical Dialectology. Regional and Social. 547-563. Berlin & Boston: Mouton De Gruyter. https://doi.org/10.1515/9783110848137.547.
Valls, Esteve, Martijn Wieling & John Nerbonne. 2013. Linguistic advergence and divergence in north-western Catalan: A dialectometric investigation of dialect levelling and border effects. Literary and Linguistic Computing 28(1). https://doi.org/10.1093/llc/fqs052.
Valls, Esteve. 2017. Cap a on va el català de la Franja? Alguns exemples de canvi lingüístic en curs. In Javier Giralt & Maria Teresa Moret (eds.), El repte d’investigar sobre la Franja d’Aragó: Jornada de l’Associació Internacional de Llengua i Literatura Catalanes a Saragossa (28 d’octubre de 2016). 51-86. Zaragoza: Prensas de la Universidad de Zaragoza.
Valls, Esteve. 2019b. Processos de canvi en les combinacions de clítics pronominals a la Franja. In Francesc Feliu & Olga Fullana (eds.), The Intricacy of Languages. 327-346. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/ivitra.20.19val.
Valls, Esteve. 2022. Internal borders as a source of linguistic divergence: A multi-method dialectometric approach. Digital Scholarship in the Humanities 37(4), 1289-1315. https://doi.org/10.1093/llc/fqac001.
Wieling, Martijn, John Nerbonne & Harald Baayen. 2011. Quantitative Social Dialectology: Explaining Linguistic Variation Socially and Geographically. PLoS ONE 6(9). https://doi.org/10.1371/journal.pone.0023613.
Wieling, Martijn, Eliza Margaretha & John Nerbonne. 2012. Inducing a measure of phonetic similarity from pronunciation variation. Journal of Phonetics 40(2), 307-314. https://doi.org/10.1016/j.wocn.2011.12.004.
Wieling, Martijn & John Nerbonne. 2015. Advances in Dialectometry. Annual Review of Linguistics 1(1), 243-264. https://doi.org/10.1146/annurev-linguist-030514-124930.
Wieling, Martijn, Esteve Valls, Harald Baayen & John Nerbonne. 2018. Border effects among Catalan dialects. In Dirk Speelman, Kris Heylen & Dirk Geeraerts (eds.), Mixed-Effects Regression Models in Linguistics. 71-97. Cham: Springer. https://doi.org/10.1007/978-3-319-69830-4_5.
Woolhiser, Curt. 2005. Political borders and dialect divergence/convergence in Europe. In Peter Auer, Frans Hinskens & Paul Kerswill (eds.), Dialect Change. Convergence and Divergence in European Languages. 236-262. New York: Cambridge University Press. https://doi.org/10.1017/cbo9780511486623.011.
Woolhiser, Curt. 2011. Border effects in European dialect continua: dialect divergence and convergence. In Bernd Kortmann & Johan van der Auwera (eds.), The Languages and Linguistics of Europe. 501-524. Berlin & Boston: De Gruyter Mouton. https://doi.org/10.1515/9783110220261.501.
Notes
[1] In this article, we use the terms border and boundary interchangeably to refer to the lines that mark the political limits of two adjacent territories. Specifically, we use the terms internal or intra-state borders to designate the political boundaries that define territorial and administrative units within a state (such as the boundaries between autonomous communities in Spain), in contrast to external, inter-state, or international borders, which delineate the recognized limits between states (such as the frontier between Andorra and Spain).
[2] According to , in any process of language substitution, it is crucial to distinguish among the causes —such as the modernization of society or the emergence of nation-states— which are factors impacting the psychology of the linguistic community but do not necessarily lead to immediate changes in linguistic practices; the mechanisms —like the loss of intergenerational transmission of the recessive language— making the process of substitution perceptible; and the outcomes, which are concurrent factors but —unlike mechanisms— are not necessary for language substitution, such as an increase in code-switching or the integration of loanwords into the recessive language. The author argues that hybridization, as a non-essential factor for language substitution, is an outcome of this process, and incorporates it into the broader concept of patoisisation. With this concept, he refers to the spread of the belief among speakers that the recessive language does not exist as such but is only a conglomerate of local varieties. This loss of awareness of the unity of the language not only involves the abandonment of formal language usage but often leads to an increase in interferences from the expansive language —that is, the hybridization of the recessive language.
[3] In recent years, the percentage of pupils taking at least one Catalan language class has levelled off at around 80% ().
[4] The concept of roofing variety has been borrowed from , who in turn extends the notion of roofing language by . Although this term has been traditionally used to refer to the standard variety of a language in contrast to its dialects, for , the roofing variety to which a group of dialects tends to assimilate can also be the standard variety of a different language (for example, in a linguistic area fragmented by a political border).
[5] According to , only about 50% of the younger segment of the population was born in La Franja, with approximately 30% coming from abroad. This indicates just how deep the transformation of this essentially rural area has been in the 21st century.
[6] For the sake of brevity, we refrain from delving into the corpus description. However, a comprehensive account, including details on the questionnaire and the methods used to obtain informant responses, can be found in .
[7] We are aware that the comparison among the three older age groups is in apparent time, whereas the comparison between these three groups and the youngest age group more closely resembles a real-time comparison (although this study is not longitudinal). This methodological contrast carries certain risks, including lifespan change, which we have sought to minimize by working with informants whose idiolects are as stable as possible, as explained later in this section. Despite these risks, we deemed it crucial to update the information collected in 1995 and 1996 with a new age group. Nevertheless, we recognize that these considerations warrant interpreting the results with caution.
[8] Although the parameters in this section have changed between the two moments of data collection, we traced the evolution of all the available indicators to ensure that no substantial changes had taken place, either between these two synchronous stages or among the relative values of the survey locations. Again, however, we suggest bearing this in mind when approaching the results.
[9] The automatic procedure we use to determine the segment distances is identical to the approach of . It first aligns the pronunciations using the standard Levenshtein algorithm (employing a binary same-different distinction between the sound segments, and aligning only vowels with vowels and consonants with consonants). Then the relative frequency with which two segments align is compared to their individual relative frequencies. If they align more often than would be expected on the basis of their individual frequencies, the segment distance will be lower than when they align less often. The exact formula is log(p(x,y)/(p(x)p(y))). A more in-depth explanation of this procedure, which is based on pointwise mutual information (PMI), can be found in .
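As an illustration only, the PMI step described in note 9 can be sketched in Python as follows. This is a minimal sketch, not the implementation used in the study: the function name pmi_scores, the input format (a flat list of already aligned segment pairs), the treatment of pairs as unordered, and the use of a base-2 logarithm are all assumptions made for the example. The Levenshtein alignment step is assumed to have been carried out beforehand.

```python
import math
from collections import Counter


def pmi_scores(aligned_pairs):
    """Compute log2(p(x,y) / (p(x) * p(y))) for each aligned segment pair.

    aligned_pairs: list of (x, y) tuples, one per aligned position, taken
    from alignments that were produced beforehand (hypothetical input format).
    Pairs are treated as unordered, so ("a", "e") and ("e", "a") are pooled.
    """
    # Count how often each unordered pair of segments is aligned.
    pair_counts = Counter(tuple(sorted(pair)) for pair in aligned_pairs)
    # Count how often each individual segment occurs in any aligned pair.
    segment_counts = Counter(seg for pair in aligned_pairs for seg in pair)

    n_pairs = sum(pair_counts.values())
    n_segments = sum(segment_counts.values())

    scores = {}
    for (x, y), count in pair_counts.items():
        p_xy = count / n_pairs                  # relative frequency of the alignment
        p_x = segment_counts[x] / n_segments    # relative frequency of segment x
        p_y = segment_counts[y] / n_segments    # relative frequency of segment y
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    # Toy data (invented for illustration, not taken from the corpus).
    toy_pairs = [("a", "a")] * 6 + [("a", "e")] * 2 + [("e", "e")] * 2
    for pair, score in sorted(pmi_scores(toy_pairs).items()):
        print(pair, round(score, 3))
```

A higher score means that two segments align more often than their individual frequencies would predict, which, following note 9, corresponds to a lower segment distance. The conversion from PMI scores to distances is not shown here because the note does not spell it out.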