Digital Humanities, Dead Languages, and Real-World Web Business, or, How Do I Get Ovid on Twitter?

Editor’s Note: This is a guest post written by Amanda Krauss, a learned Latinist, former professor of Classics, and current tech guru. You can also find it cross-posted on her blog Tech in Translation.

I spent this weekend in Vancouver, at IA Summit. “IA” is short for information architecture, and as a discipline its whole purpose is organizing web information for humans. It covers everything from making a website nav bar useful to creating a user-friendly database out of wacky government information. And the folks who do it aren’t necessarily people who code, by the way. This is its own tech discipline, and if you’re obsessively organized and enjoy thinking about categories, you’ll find your people here.

I was supposed to present a poster about organizing the classical corpus online. Unfortunately, before the conference, I had the kind of week that started with me losing my passport, and got steadily worse. I did get my passport, and made it to the conference, but that was about it; neither I nor my poster was in any shape to present. Luckily, my buddy Todd invited me to write this post, about the same topic, and with more room to develop my ideas. Of course I said, Yes please!

I have a Classics PhD and taught for a good ten years before I made the switch to software engineering and beyond. I care deeply about Classics, and just as deeply about how we put things on the web. I’m not, however, a Digital Humanities expert. I approach ancient material with the perspective I’ve gained from being in the modern tech biz, which I acquired after making my “conscious uncoupling” from academia. For that reason, and other reasons I’ll discuss, I think it’s worth thinking about how private-sector web development does (and does not) intersect with Digital Humanities (DH).

I do build classically-inspired web things, though. I put the Aristophanes translation I co-authored online, attempting to use appropriate W3C standards. This was difficult because they still haven’t decided what HTML poetry should look like. I used what was, at the time, a cutting-edge, Medium-inspired user interface (UI) for reading footnotes. A couple of years ago, I built a Vergil bot, inspired by the amazing artbots already out there. And I have tried building a few versions of an Ovid database. I like to combine my professional interests that way.

What we talk about when we talk about putting things on the web

Here’s what I’ve come to understand: When we’re talking about ancient materials, there’s a big difference between digitization and publication. Digitization, to me, means “get that stuff online, STAT!” So, we can digitize medieval manuscripts by putting pictures of each page online. Or digitize texts by uploading PDFs somewhere on the web.

That’s not publishing IMHO. While you’ve put the picture, or the PDF online, you haven’t published the actual material that lives inside the document. You could really argue it’s more akin to fancy photocopying (just with mass distribution), rather than to anything inherently digital; the meaning remains trapped, first in papyrus, then in paper, and finally in PDFs.

To make an artifact speak the native language of the web, we need to extract the text, marginalia, and maybe even illuminations into data that computers (and tech people) can read. I know it may sound heartless to talk about ancient texts as “data”, but realistically, structured data is really what the web is built on. PDFs are shareable, for sure, but they’re not re-mixable, for lack of a better word. If I’m making a website, for example, and I want to use the material inside the PDF, rather than the PDF itself, I’ll have to take steps to extract and transform the data, unless there’s a nice source of structured data already available.

The goal of true digital publication, conceived in this way, as the current equivalent of the printing press:

We want Tibullus and Ovid to corrupt the youth! PDFs probably won’t cut it, though. So, how do we do that? Well, webpages and apps and social media are an interesting place to start, but again, to make that happen, we need a certain kind of information readily available.

Back to the conference. I got the latest info on how Google’s bots read your website. I learned about mathematically improbable taxonomies. And I surprised a few people when I told them that Classics had done pretty well for itself in terms of data structuring, at least in XML. Perseus (a project at Tufts) digitized a great deal of the classical corpus in 1999. How do I know it’s 1999? Because the creation date is in the XML markup, as it should be.

But here’s the thing: in my opinion, Classics DH is still stuck in 1999. It’s great that there’s a web standard – TEI – that is the basis of structuring that data. But it’s not enough, at least coming from a non-DH perspective.

In the first place, web standards are changing rapidly. There’s JSON, which is a new kid in town, but one that a lot of folks like better than XML. There’s also the fact that TEI, as far as I can tell, doesn’t talk to any non-linguistics web schema organizations. In the context of scholarship, this makes sense, but it also biases the entire information structure towards a very, very specific audience.

Perhaps this is why, even as someone familiar with metadata and web development, I often find DH a bit disorienting, like I’ve entered an alternate web development universe. DH projects often use web standards that aren’t quite what I’d expect. It’s like Bizarro world, except less evil.

Different audiences want different things

When building something online, it’s really important to know your user. And there are multiple possible audiences for our humanities projects, with differing expectations. Art historians might want the highest-fidelity photo of a piece. Archaeologists might want the same – and want it in 3D, which is possible, but requires the use of proprietary, weird software that doesn’t run in the browser. Literary scholars want the text, with notes (and again, that might mean the marginalia from the documents, or later, other texts that comment upon the original text). Students might want a dictionary or commentary to help them make their way through the material. Everybody else wants a translation, probably, or something that gives them context enough to understand what’s going on, without wading through purely academic details.

In terms of putting stuff online, DH tends to assume experts. I know that many would argue with that, and I also know that the NEH prefers projects that serve the common good. I believe this is a good faith effort. But one of my points here is that we haven’t really built an architecture that is capable of serving the general public. And even the NEH grant announcements consider data projects to be those that are “searching, analyzing, and understanding large bodies of material”, rather than structuring the existing data for public use.

One thing I learned at the IA Summit was that often you can’t organize information just once: two different audiences might require two different information systems. Etsy, for instance, had to use two different tagging systems: one for its makers (who are experts in the craft they practice) and another for its buyers (who were really shopping for a certain feeling, without expert knowledge). An expert might make a beautiful cloche hat and list it for sale without ever using the word “hat”. This made it impossible for would-be hat buyers to find. It’s a familiar problem, when you let experts run things. Lacking beginner’s mind, they create material that’s impossible to navigate for actual beginners.
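
The two-vocabulary idea can be sketched in a few lines of Python. Everything here is invented for illustration: the listing, the tags, and the MAKER_TO_BUYER mapping are hypothetical, and this is not a description of how Etsy actually implements its search.

```python
# Hypothetical mapping from expert (maker) vocabulary to lay (buyer) vocabulary.
MAKER_TO_BUYER = {
    "cloche": {"hat", "1920s style"},
    "felt": {"warm", "winter"},
}

# A made-up listing, tagged only in the maker's own terms.
LISTINGS = [
    {"title": "Hand-blocked felt cloche", "maker_tags": {"cloche", "felt"}},
]


def buyer_tags(listing):
    """Derive buyer-facing tags from the maker's expert tags."""
    tags = set()
    for tag in listing["maker_tags"]:
        tags |= MAKER_TO_BUYER.get(tag, set())
    return tags


def search(listings, term):
    """Match a search term against either vocabulary."""
    term = term.lower()
    return [
        item for item in listings
        if term in item["maker_tags"] or term in buyer_tags(item)
    ]


# A buyer who types "hat" can now find a listing that never uses the word.
print([item["title"] for item in search(LISTINGS, "hat")])
```

The point of the sketch is that the expert tags stay untouched; the second vocabulary is layered on top rather than replacing the first.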

Easier parsed than published

It seems to me that the current state of DH is still very much focused on the concerns of experts, and especially on parsing what’s already there. There’s a really great Classical Language Toolkit, for instance, which is branched from a widely used Python library for parsing language. And that’s great! It may even interest NLP folks outside of Classics – but they’re also what I’d consider a niche audience.

I’m more concerned with publishing the materials for a more general audience. And now, after this conference, I’m surer than ever that we need to think of public humanities as a separate project, with a separate information architecture.

I’m not kidding when I say I want an Ovid API. The only existing Classics API I know about is the Aeneid API, which my Vergil bot depends on. Full disclosure: I’m not a fan of Vergil. But I built the bot anyway, because that was the only Latin poetry API I could find.

One of the first things I’d expect from a discipline that wants to share its stuff is a good, solid API. So perhaps that is my main question, as a private-sector tech person: why aren’t there more Classics APIs?

Actually I’m pretty sure I know why, and here’s where we need to talk about taxonomies (just a fancy way to say “how we choose to organize things”). Classics already has a taxonomy, and as a trained Classicist, I know that if I want to read a certain poem by Ovid, I can look it up under “Ovid, Amores, Book 1, poem 5”. But that taxonomy isn’t any help to a layperson who just wants to read some poetry, or to an interested amateur who might want to do something creative with the data, or to a literary type who wants to easily republish one of the poems on their site.

And that’s a relatively simple example. Add in some other potential queries, like “Vergil, Aeneid, Book 3, line 1” or “Plato, Theaetetus, section 209d”, and we can see the lack of consistency. Those searches also incorporate terms that aren’t unique. Both those things are bad, from a design perspective.

In the ancient world the “unique key” of a poem was its first line; since there were no titles, that’s how people knew what poem you meant. In the 20th century (quite modern by Classics standards) a taxonomy was developed for Linear B; a text’s location is built into the identifier, creating unique ids, so it’s actually not bad from a web perspective. All of this is to say, how we organize information is mutable, and recreatable, and the very foundation of what we’re doing online.
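
To make the “unique key” idea concrete, here is a minimal sketch of one way the citations above could be folded into a single identifier format. The dot-separated scheme and the function names make_key and parse_key are assumptions made up for this example, not an existing Classics standard.

```python
def make_key(author, work, *parts):
    """Build a lowercase, dot-separated identifier from a citation.

    The parts can be a book and poem, a book and line, or a section
    reference; each is kept as a string so that "209d" works as well as 5.
    """
    pieces = [author, work, *(str(p) for p in parts)]
    return ".".join(p.strip().lower().replace(" ", "-") for p in pieces)


def parse_key(key):
    """Split an identifier back into its components."""
    author, work, *parts = key.split(".")
    return {"author": author, "work": work, "parts": parts}


# The three citations mentioned in the text, in one consistent shape.
for citation in [
    ("Ovid", "Amores", 1, 5),
    ("Vergil", "Aeneid", 3, 1),
    ("Plato", "Theaetetus", "209d"),
]:
    print(make_key(*citation))
```

A scheme like this does not settle what the numbers mean (book and poem, book and line, or section), which is exactly the inconsistency described above; it only shows that a uniform key is cheap to generate once that decision is made.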

So these are the things I wonder about. How do we put the Classics corpus online? Is a Classics API possible? Or even an API for all the poems by one author? What would that look like? And: Is the smallest data unit a single line of poetry? What about meter? How do we deal with that? To what degree do we want search to be a thing? And what about translations?

It’s a terrible mess. And/or a really interesting information architecture challenge.

The goal: public-facing architecture

I love JSTOR’s Shakespeare app and its related API. I think it’s everything a public-facing humanities project should be. Granted, there are only 38 plays and they’re already in English. But that’s okay! In tech lingo, a great MVP starts small, as an example for other versions moving forward.

I’d like something similar for Classics and other dead languages. I’d like well-documented APIs and truly public web projects. But that won’t happen with the current state of information architecture. DH could certainly learn from modern web practices, just as modern web practices might learn from Classics – which was, after all, a very early adopter of technology. Even before the Perseus project there was Pandora. Not the music app, but the HyperCard-based dictionary tool. It was an ingenious hack at the time.

Given that the humanities’ survival is threatened more than ever – and yes, we say that every year, but it’s also true – it’s time to focus on tools for non-experts. In the normal web development process that would mean you’d get your subject matter experts (the Classicists) and your tech experts (the information architects/user researchers) and your audience (regular people interested in history, maybe?) together to hash out what, exactly, this thing would look like.

But I don’t see the normal web process happening in DH.

Which is precisely my point: it’s time to think about ancient material with an eye towards modern web development. That first means structuring ancient information in the modern format that’s expected in the private sector. Working with the Linguistics Research Center material, I’ve played with extracting the pages via Python, just so we can deliberately restructure them in a sane database – or maybe even an app! Who knows? I’ve done the same with Ovid: I’ve used Python to turn Perseus’ XML into a simplified data structure, and even a database, just to see what will happen. In both cases I’m trying to take what’s there and make it more generally usable data according to the modern web development process that I understand. I consider this my hobby, though I’d like to see it become a more regular part of DH or tech discussions.
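
For readers who want a feel for that kind of restructuring, here is a minimal sketch using only Python’s standard library. The XML sample is a hand-written, simplified stand-in (a div element per poem, an l element per line); real Perseus files are larger, may use XML namespaces, and are organized differently. The function name poems_from_xml and the output shape are assumptions for illustration, not the code I actually used.

```python
import json
import xml.etree.ElementTree as ET

# Simplified, hand-written sample: the opening couplet of Amores 1.5.
SAMPLE_XML = """
<text>
  <div type="poem" n="5">
    <l n="1">Aestus erat, mediamque dies exegerat horam;</l>
    <l n="2">adposui medio membra levanda toro.</l>
  </div>
</text>
"""


def poems_from_xml(xml_text, author, work, book):
    """Flatten poem <div> elements into plain dictionaries."""
    root = ET.fromstring(xml_text.strip())
    poems = []
    for div in root.iter("div"):
        if div.get("type") != "poem":
            continue
        lines = [
            {"n": int(line.get("n")), "text": (line.text or "").strip()}
            for line in div.iter("l")
        ]
        poems.append({
            "author": author,
            "work": work,
            "book": book,
            "poem": int(div.get("n")),
            "lines": lines,
        })
    return poems


# Serialize to JSON, the format most web developers would expect.
print(json.dumps(poems_from_xml(SAMPLE_XML, "Ovid", "Amores", 1),
                 ensure_ascii=False, indent=2))
```

Once the poems are plain dictionaries, loading them into a database or serving them from an API is ordinary web development work.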

I’m heartened by the fact that Sappho is now on Twitter, as is Linear B. I think these are really great goal posts in thinking about Classics as public humanities. But the question is: will Classics (and DH) take up the challenge of architecting its material for further projects like this? I certainly hope so.

Two Candidates – One Accent

By Axel Bohmann, Erica Brozovsky, Salvatore Callesano, Noli Chew, Kirsten Meemann, Lars Hinrichs, and Patrick Schultz

Texas English Linguistics Lab (TELL), The University of Texas at Austin

1 Introduction

When judged by their policies, Bernie Sanders and Donald Trump may be as different as any of the 2016 presidential candidates. But linguistically, they are surprisingly similar, united by their New York City accents. Nonetheless, our analyses of their speech patterns reveal some significant differences in how their NYC accents become stronger or weaker in different settings, namely interviews, speeches, and debates: Trump adapts his speech to certain factors, while Sanders does not.

The NYC accent is famous among American English dialects. It is often portrayed in films and television series, and a number of notable celebrities are identifiable with the accent, such as Joan Rivers, Danny DeVito, Woody Allen, and Leah Remini. Thus, most English speakers in the United States can imitate a speaker from NYC, or at least how they perceive this speaker to sound. In fact, the prized aggressiveness that comes with Trump’s and Sanders’ NYC accents has been noted by researchers in linguistics.

Speakers of this dialect are typically from northeastern New Jersey, western Long Island, the five NYC boroughs, and New York’s Hudson Valley. In addition, this dialect is simultaneously perceived as prestigious and stigmatized. Features of this dialect are often changing and a few classic NYC accent markers have ceased to exist, such as the pronunciation of thirty-third as “toity-toid”. However, many features are still notable in NYC English. For example, dropping of the r-sound in words like park and war, as well as the distinctive pronunciation of the vowel in words like thought, caught, and daughter, are still alive in the NYC accent. These are two of the phonetic features that we use here for our comparison of Trump’s and Sanders’ NYC accents.

Bernie Sanders during the 2016 Presidential Campaign

Donald Trump during the 2016 Presidential Campaign

2 Methods of Analysis

In order to determine the amount of stylistic versatility that each candidate shows when faced with different speech contexts, we began by collecting video clips of interviews, debate content, and speeches. Additionally, we selected videos that reflect both New York audiences and audiences in other parts of America to test whether the candidates shift their accents to be more like that of the assumed accents of their audiences. With the exception of the Trump gun-rights rally speech (April 2014), each clip was recorded during the 2016 presidential campaign between June 2015 and February 2016. All videos were found on YouTube and are linked in the table at the end of this post. The interviews show an unscripted back-and-forth between candidate and interviewer with high levels of interaction. By contrast, the speech and debate clips are scripted and delivered to large audiences, with little to no mutual interaction.

Given that both candidates are originally from NYC, we decided to focus on three specific accent features of that dialect region:

  • Dropping the r’s – NYC English traditionally lacks an r at the end of syllables or after vowels. In NYC, words such as park and here are “r-less” (the technical term is non-rhotic) and are pronounced without the r: “pahk” and “heeuh”.

  • THOUGHT vowel raising – For a few dialects of American English, words like cot and caught are not homophones, where cot sounds like “caht” and caught sounds like “cawht.” What we are specifically interested in here is how speakers of NYC English produce a raised version of the vowel in words like caught, which we are calling the THOUGHT vowel. The raising of the THOUGHT vowel is, in part, what distinguishes NYC English from other accents that distinguish cot and caught.

  • huge/yooge – For some speakers of NYC English, the y-sound, as in yes, cannot be preceded by the h-sound, as it is in words like huge, human, etc. This leads to words that are traditionally pronounced with an initial hy- having only the y-sound. Words such as human (“hyooman”) and huge (“hyooge”) will be pronounced as “yooman” and “yooge”.

3 Fouath flooah (fourth floor): r-dropping

A view of New York City

R-lessness, as in “Sandahs” for Sanders, “heah” for here, and “neighbahs” for neighbors, is not only a feature of NYC English, but it also occurs in other parts of the English-speaking world. However, rhoticity (the technical term for the degree to which r is used) in NYC is defined by two important factors. First, it has been shown to be socially stratified: speakers who are “r-full” (e.g. saying “storm” for storm, i.e. maintaining the r) typically belong to higher social classes as compared to lower, working-class individuals who are “r-less” (e.g. saying “stohm”). This type of class-based distinction often leads to social evaluation of the people who maintain such features. Producing the r in words such as park has for many years been perceived as prestigious and belonging to the upper-social classes. Secondly, rhoticity in NYC has been undergoing a change over time. It has been gaining ground, so that younger speakers are holding onto their r’s. So now NYC English may be described as “variably rhotic.”

For Trump and Sanders, our data show how rhotic each candidate is in their speeches, interviews, and debates. Both Sanders and Trump are from NYC: Sanders is from Brooklyn and grew up in a working-class home, while Trump is from a much more privileged, upper-middle class family in Queens. They are from the same generation, as Sanders is only about five years older than Trump. While their social backgrounds are different, their patterns of r-pronouncing and r-dropping are similar — at first glance.

Figure 1. Retention and dropping of r-sounds in front of national audiences (left) and NYC audiences (right)

Figure 1 shows that Trump and Sanders exhibit roughly the same degree of r-fullness when speaking to non-New York audiences. To be sure, this regular way of speaking is, for both candidates, quite notably New York: they drop between one third and one half of all r-sounds for which this is possible. But in front of NYC audiences, it is Trump who becomes dramatically more r-less: his range of F3 values is much larger when he speaks in his hometown. (F3 is the third formant; a pronounced r lowers it, so higher F3 values indicate r-dropping.) Specifically, he reaches F3 values that are much higher than those of Sanders when in front of a NY audience. This shows that Trump is stylistically much more adaptive than Sanders, at least for this one accent feature. Sanders, meanwhile, shows practically the exact same use of r-sounds in New York and on the road.

In Figure 2, we see that Trump’s variability also holds when we look at speaking context: in his political speeches and interviews, Trump drops r-sounds more frequently than in debates. Sanders, meanwhile, is more steady in his use of this feature across all contexts, with a slight rise in r-dropping during his interviews.

Figure 2. Retention and dropping of r-sounds according to speaking context

4 Cawfee Tawk: Raising the THOUGHT Vowel

NYC English is, in part, defined by the distinction of the vowels in cot and caught. Trump and Sanders both maintain this vocalic distinction. For the majority of American English speakers, cot and caught are homophones; however, speakers of NYC English can clearly distinguish two different vowels. Cot is produced with the same vowel as in the word father, while caught is produced with a raised version of what is known as the open-o. For this raised vowel, NYC English speakers may produce something like “caw-uht” for caught, “law-uh” for law, and “caw-uh-fee” for coffee. For an example of a NYC version of the THOUGHT vowel, listen to how the male speaker in this video pronounces the word coffee. The raising of what linguists call the THOUGHT vowel makes it almost a sequence of two vowels one after the other.

As shown in Figure 3, Sanders sounds a fair bit more New-Yawkish than Trump in the aggregate picture (since lower values for the first formant, F1, indicate a more raised pronunciation).

It would not be outlandish to explain this global difference in the two candidates’ pronunciations of the THOUGHT vowel by the difference in their upbringing: higher-class New Yorkers (such as Trump’s family) raise their THOUGHT vowels less than the working class folks from Brooklyn among whom Sanders grew up.

Figure 3. Production of the THOUGHT vowel by both candidates

In order to tease apart the treatment of this vowel by the two candidates, we used multivariate statistics. Results are shown in Figure 4. While this chart is somewhat more complicated than the preceding ones, it shows some crucial tendencies.

Figure 4. Output of multivariate statistics model for THOUGHT by candidate, context, and audience.

First, this chart shows that there is a small-ish, but highly significant tendency for both candidates to produce the THOUGHT vowel in a more raised way (i.e. with a lower F1 value) when they speak in front of New York audiences. This is unsurprising.

However, our model also shows that Sanders has the opposite trend in his speech as compared to Trump. This means that the strategy of “pandering to the audience,” i.e. speaking more New-York-like when in New York, is represented mostly by Trump, not by Sanders. Again, Sanders appears more solid in his usage across contexts.

Next, we see that the two most extreme terms in the model are “Sanders : speech” (far left) and “Speech” (far right). What this shows is the trend that overall in the data, the candidates seem to turn on the New York accent the most when they give political speeches, in comparison to interviews and debates. But again, Trump does so significantly more strongly than Sanders. Outside of speeches, Trump sounds much more mainstream American, while Sanders still sounds like the same strongly-accented New Yorker.

5 Huge/Yooge

The skyscrapers in Manhattan are yooge (huge)! – Another classic NYC English feature that we are interested in for this analysis is the elimination of the h-sound, when it comes before the y-sound. This leads to the pronunciation of words like human as “yooman” and humiliate as “yoomiliate.”

We ran an analysis to see how Sanders and Trump differ in terms of their dropping or retention of the h-sound in these speech contexts. From our data, it seems that Sanders is a much more frequent user of words that can drop the initial h-sound: we counted five words for Trump and ten for Sanders which started with a possible “yoo” or “hyoo” sound. Trump drops the h-sound in all five instances (e.g. “yoomane” for humane). Meanwhile, Sanders’s reduction of the h-sound is almost as categorical as Trump’s. Of his ten instances of words that are relevant for this feature, he maintained one h-sound (namely, in the word human).
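
As a quick worked check of these counts, the short snippet below restates the figures reported in this section (five relevant words for Trump, all h-dropped; ten for Sanders, nine of them h-dropped) and turns them into rates. The variable and function names are illustrative only.

```python
# Counts as reported in the text above.
COUNTS = {
    "Trump": {"tokens": 5, "h_dropped": 5},
    "Sanders": {"tokens": 10, "h_dropped": 9},
}


def drop_rate(entry):
    """Share of relevant words in which the initial h-sound was dropped."""
    return entry["h_dropped"] / entry["tokens"]


for name, entry in COUNTS.items():
    print(f"{name}: {entry['h_dropped']}/{entry['tokens']} = {drop_rate(entry):.0%}")
```

With so few tokens per speaker, the rates are best read as a rough indication rather than a precise estimate.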

Overall, we find that the huge/yooge feature is a point of linguistic commonality between the two candidates.

6 Summary of Findings

  • Both Donald Trump and Bernie Sanders are easily categorizable and recognizable as speakers of NYC English, showing high usage levels for all of the features we studied.

  • Trump is a pliable adapter and linguistic performer who alters his speech depending on whom he speaks to and in what contexts. He shows the highest usage levels for New York accent features when giving political speeches and when addressing audiences in New York.

  • The same factors have only the slightest effect on Sanders’ speech, if any at all. Sanders frequently drops his r’s and raises his THOUGHT vowels, consistently across contexts. He does not change his speech to accommodate the rest of the world.

  • The tendency of style-shifting, which Trump shows much more strongly than Sanders, is common in the speech of politicians. What is remarkable is the fact that we do not see it in Sanders.

  • Although Sanders has spent many years in Vermont, outside of the cultural space of NYC, he remains a completely steady, unperturbed, almost textbook speaker of the NYC accent of English.

7 Appendix: Video Data

Context   | Sanders                            | Trump
Interview | The Late Show with Stephen Colbert | The Late Show with Stephen Colbert
Interview | Bill Maher interview               | Bashing Cruz and Bush
Interview | US Deficit Interview               | New Hampshire Win
Interview | Religion Interview                 | New York Interview
Interview | Morning Joe                        | Wolf Blitzer interview
Interview | State of the Union                 | Immigration Plan
Interview | Nevada Loss                        | Becoming a Conservative
Interview | Growing Up                         | Regent University Interview
Debate    | Break Big Banks Up                 | War on Women
Debate    | Gun Provisions                     | Jeb is Wrong
Speech    | Speech in Rochester                | Rubio’s Ears
Speech    | Mocking Trump                      | Gun Rights Rally
Speech    | Wall Street Speech                 | Campaign Announcement
Speech    | Wall Street Greed                  | Shoot Somebody

The Evolution of a Politician’s Accent

Lars Hinrichs, a professor at the University of Texas at Austin and associate of the Linguistics Research Center, is a specialist in English accents, with a particular interest in accents peculiar to Texas and other southern states.

Hillary Clinton in Texas in the 2008 Presidential Campaign

In particular Prof. Hinrichs, in collaboration with a number of student researchers, has been working on a novel application of the study of English accents: studying how politicians adjust their accents based on the perceived audience to which they address their remarks. Prof. Hinrichs and his group applied their expertise to the 2014 race for Texas Governor. In a two-part series, Vowel Power, for the LRC’s blog, they provided an explanation of their methods and a summary of their results (see Part 1 for their basic findings and Part 2 for the methods behind their analysis).

Never content to rest on their laurels, Prof. Hinrichs’s group is closely following the current national campaign for the presidency of the United States. They have already found some novelties of language use surrounding presidential candidate Sen. Hillary Clinton. The Texas Standard has taken note of their work and sat down to interview Prof. Hinrichs. Take a moment to listen to the interview and read some of the highlights here at the Texas Standard’s website.

Vowel Power: Methods

Techniques used in our Study of the Texas Accent in Greg Abbott’s and Wendy Davis’s Speech

by Lars Hinrichs, Axel Bohmann, and Erica Brozovsky


When studying the speech of politicians, it is assumed that much of their linguistic performance is just that: a performance, where the choice of words, phrases, and pronunciations is to a considerable degree monitored. While it is hard to obtain any degree of certainty about whether a certain way of pronouncing a word was intentionally chosen or whether a speaker just happened to say a word in a given way, without giving it much thought, we assume that ultimately, the way that an individual speaks in a given social context is symbolic of how he or she wishes to be perceived.

In order to classify Abbott’s and Davis’s speech as closer to either a Texan or a mainstream U.S. English model, we first established two empirical baselines: the first, model Texans, represents speakers of (a version of) the traditional, local accent that many outsiders would expect to hear when meeting a Texan. The second, neutral Texans, exemplifies mainstream U.S. English speech without a strong regional accent – as spoken by Texans. Each group included ten speakers. Together, the two groups provide orientation as to the range of variation one may encounter in Texas between locally accented and neutral, more mainstream-like speech.

We did not follow a specific paradigm for choosing our example Texans; the speakers were selected based on availability and quality of recorded material and our opinion of how well they represent either the model Texan accent or the mainstream U.S. English accent. We aimed for an even divide by gender, and each video interview or speech sample we selected was recorded after 2001. All videos were found on YouTube and are linked in the tables below.

Model Texans        | Birthplace    | Background   | DOB     | Video
Matthew McConaughey | Uvalde        | Longview     | 11/4/69 | Bobby Bones interview
Mary Gordon Spence  | Brownwood     | Brownwood    | ?       | A Magic Moment story, KUT Story speech
George Strait       | Poteet        | Pearsall     | 5/18/52 | Headline Country interview
Joel Burns          | Fort Worth    | Crowley      | 2/4/69  | The Last Word interview
Don Meredith        | Mount Vernon  | Mount Vernon | 4/10/38 | SMU interview
Ann Richards        | Lacy-Lakeview | Waco         | 9/1/33  | Texas Politics conversation
Laura Bush          | Midland       | Midland      | 11/4/46 | Texas Exes Award, Texas Book Festival
Kay Granger         | Greenville    | Fort Worth   | 1/18/43 | Schreier-Fleming interview, USGLC Tribute Dinner
Jim Hightower       | Denison       | Denison      | 1/11/43 | Public Citizen interview
Linda Harper Brown  | Dallas        | Dallas       | 3/20/48 | Election Video, Transportation Committee statement

After choosing, we transcribed each video and time-aligned the written text to the audio. From there, we measured every instance of our chosen vowels, using the mean as our baseline. A normalization algorithm was applied to make measurements from male and female voices comparable.
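
The normalization step is not named above. One common choice for making male and female formant measurements comparable is Lobanov normalization, which converts each speaker’s measurements to z-scores relative to that speaker’s own mean and standard deviation. The sketch below assumes that method purely for illustration; the formant values are invented, and the function name lobanov is ours.

```python
from statistics import mean, stdev


def lobanov(values):
    """Z-score one speaker's measurements for one formant.

    Each value is expressed as a distance from the speaker's own mean,
    in units of the speaker's own (sample) standard deviation.
    """
    centre = mean(values)
    spread = stdev(values)
    return [(v - centre) / spread for v in values]


# Invented F1 measurements in Hz for two hypothetical speakers whose vowel
# spaces have the same shape but sit in different frequency ranges.
F1_HZ = {
    "speaker_a": [300.0, 500.0, 700.0],
    "speaker_b": [360.0, 600.0, 840.0],
}

normalized = {name: lobanov(vals) for name, vals in F1_HZ.items()}
for name, vals in normalized.items():
    print(name, [round(v, 2) for v in vals])
```

After normalization the two hypothetical speakers line up on the same scale, which is the property that makes cross-speaker baselines such as group means meaningful.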

To gauge Davis’s and Abbott’s stylistic versatility, we analyzed a number of recordings from both candidates, made between July 2013 and February 2014 in the run-up to the gubernatorial election. These materials were selected in order to provide a wide range of speech contexts, differing with regard to audience, participation structure, and spontaneity. In addition, we opted for recordings that would enable us to draw meaningful comparisons between the two candidates. Hence, within each speech context we chose material with maximally similar contextual parameters.

Neutral Texans  | Birthplace  | Background  | DOB      | Video
Michelle Beadle | Roanoke     | Boerne      | 10/23/75 | Game of Thrones quiz, Dan Patrick interview
Marie Brenner   | San Antonio | San Antonio | 1/1/49   | Google Author talk, Flashpoint Apples and Oranges
Linda Ellerbee  | Bryan       | Houston     | 8/15/44  | British Royal Family chat, InnerVIEWS interview
Bill Macatee    | Rome, NY    | El Paso     | 11/17/55 | Broncos-Raiders game, US Open recap
John Quiñones   | San Antonio | San Antonio | 5/23/52  | Brookdale TV interview
Melinda Gates   | Dallas      | Dallas      | 6/15/64  | Indian Women and Children
Michael Dell    | Houston     | Houston     | 2/23/65  | Corporate Valley interview
Paul Lockhart   | Amarillo    | Amarillo    | 4/28/56  | Elon University talk
Gloria Feldt    | Temple      | Temple      | 4/13/42  | To the Contrary interview, Interview Forward interview
Wendy Kopp      | Austin      | Dallas      | 6/29/67  | Charlie Rose interview, Big Think interview

The data presented here comprise five situational categories. First, we included both politicians’ announcement speeches. These are scripted addresses given in front of a politically supportive, co-present audience, but the intended audience is clearly the entire electorate of the State of Texas. With the exception of cheers and chants from the audiences, there is little interaction with other speakers. Next, we analyzed a campaign ad from each camp. Its intended audience overlaps with that of the announcement speeches, but the rhetoricity of the situation is enhanced both by the opportunity to rehearse and do several takes and by the affordances of post-production. Fortunately, we also had access to two separate interviews that Evan Smith of the Texas Tribune conducted with Davis and Abbott, respectively. These constitute situations of unscripted, spontaneous (though still highly performative) speech with a higher level of interaction than the other situations. The interviewer was the same in both cases. The final two categories are a speech by each candidate at a local rally and an interview with each candidate on political topics of national interest, broadcast to a potentially nationwide audience.

Context | Abbott | Davis
Campaign Announcement | San Antonio | Haltom City
Campaign Ad | Preserve Texas | A Texas Story
Interview | Evan Smith interview | Evan Smith interview
National | TheBlaze interview | Fort Worth Star-Telegram interview
Local speech | Rally with Nugent | Stand with Texas Women

We conducted spectrographic measurements of all the vowels in the entire material (both from candidates and control groups) and thus obtained a dataset of formant measurements for 27,427 instances of vowels.

Three American Vowels: ‘PRICE’, ‘FACE’, and ‘PEN’

We decided to focus on three vowels in the speech of the candidates. Using words which illustrate typical pronunciations, we call the vowels PRICE, FACE, and PEN, respectively. All three vowels are variable in Texas, in the sense that there is a more Texan, local-sounding way of pronouncing them and another, more mainstream-U.S.-like variant that speakers can choose from every time they use this vowel.

From our spectrographic measurements we extracted all words containing instances of the PRICE, FACE, or PEN vowels in the speech of Abbott and Davis. In total, we extracted 1,062 vowel tokens for Abbott and 1,216 for Davis. The two control groups are represented by 1,239 vowel instances in total. The extracted data allow us to determine quite exactly, for the three selected vowels, the degree to which the candidates resemble either the model Texans or the neutral Texans.


The first vowel we studied is probably the best-known vowel characteristic of Texas English: it is the vowel in the words fly, rice, rise, ice, etc. According to linguistic convention, we label this vowel PRICE.

In strongly local speech in Texas, instances of the PRICE vowel are pronounced as ah: fly sounds like flah, rice sounds like rahs, etc. In mainstream U.S. English, meanwhile, instances of the same vowel are pronounced as a diphthong: the pronunciation starts on an ah sound and moves toward an ee sound, e.g. flah-ee for fly.

The Texas pronunciation of the PRICE vowel is extremely well known – it is a linguistic stereotype of both Texan and Southern speech more generally. Most English speakers can give a reasonable impression of a PRICE-vowel word in a Texas pronunciation. At the same time, speakers can always choose not to sound Texan on this vowel by choosing a more mainstream realization instead.

To calculate the degree of ‘Texanness’ in this vowel, we measured the vowel height for each instance of this vowel at the beginning (nucleus) and at the end (glide) of the vowel. We subtracted height at glide from height at nucleus and thus obtained a value corresponding to the vowel’s degree of modulation: the higher this value, the more mainstream, or the “less Texan,” the vowel produced. The lower this value, the more Texan-sounding each instance of the vowel.


The second vowel in our study is the FACE vowel: the vowel in words such as chase, faze, hey, eight. It, too, is a diphthong (a vowel that starts in one position and ends in another), and it, too, can be produced in either a more Texan-sounding or a more mainstream-like way. The more traditionally Texan pronunciation has what is called a “lowered nucleus” (lowered beginning). For example, to a non-Texan, a very locally Texan-sounding pronunciation of the word chase sounds almost like chise. Similarly, faze sounds almost like fize, hey like hi, eight like ite, and so on.

As a feature of Texas English, the FACE vowel with a lowered nucleus is not very much talked about. Unlike the PRICE vowel, this vowel is much more likely to fly beneath the radar of conscious control. This means that a person who is intentionally ‘putting on’ a Texas accent, but who does not speak it naturally, might miss this feature: they are much more likely to consciously control the way they pronounce PRICE vowels than they are FACE vowels.

Our measure to determine the place of each FACE vowel instance on the scale from a more mainstream to a more Texan realization was the same as the one used for PRICE: we subtracted height at glide from height at nucleus. This time, higher values indicated a more local-sounding realization, and lower values indicated pronunciations that sounded more like the mainstream-U.S. norm.


The third vowel in our study is PEN. Actually, the conventional label for this vowel is DRESS (in words like bed, bet, Ed, etc.), but we narrow down our context to words in which this vowel is followed by a nasal consonant: m, n, or ng. This means that we only studied words like men, end, ten, them, remember, etc., but not bed, rest, bet, etc. (we actually found no instances of this vowel being followed by the ng consonant). Therefore, we changed the label to PEN. As a feature of Texas English, PEN vowels are changed to sound like short i: pen can sound like pin, men like min, ten like tin, and so on. When the PEN vowel gets pronounced in this way, phoneticians say it is raised relative to its more neutral, mainstream pronunciation.

The raised PEN vowel is the ‘youngest’ feature of Texas English: young people throughout Texas, even in big cities like Houston and Dallas, use this feature – while the PRICE and FACE features are more conservative features, i.e. they tend to be used more frequently by older, more rural speakers of Texas English. Like the FACE feature, PEN raising is not constantly talked about among Texans: while some speakers of Texas English are aware that their speech shows this feature, there are also many other speakers who don’t realize that theirs does. That is to say, like the FACE vowel, this feature of the Texas English accent is somewhat less stereotyped than the PRICE feature (in which PRICE vowels are flattened).

Our measure to determine the degree of Texanness for this vowel was, simply, height at nucleus: higher vowels sound more locally Texan, and lower vowels sound more mainstream.
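
The three per-vowel measures described above can be summarized in a short sketch. It assumes that "height" is operationalized as the first formant (F1), which is consistent with the "F1 delta" wording used for the results, and that F1 rises as the vowel gets lower in the mouth. The class, function, and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class VowelToken:
    vowel: str         # "PRICE", "FACE", or "PEN"
    f1_nucleus: float  # F1 at the beginning of the vowel (normalized units)
    f1_glide: float    # F1 at the end of the vowel (ignored for PEN)

def raw_measure(token):
    """Return the per-token measure described in the text (assumed to be F1-based)."""
    if token.vowel in ("PRICE", "FACE"):
        # Nucleus minus glide: the "F1 delta", i.e. the degree of modulation.
        return token.f1_nucleus - token.f1_glide
    if token.vowel == "PEN":
        # A single measurement at the nucleus.
        return token.f1_nucleus
    raise ValueError(f"unsupported vowel class: {token.vowel}")

# Does a larger raw measure mean a MORE Texan-sounding token?
# PRICE: larger delta = more diphthongal = more mainstream.
# FACE:  larger delta = lowered nucleus = more Texan.
# PEN:   larger F1 = lower (unraised) vowel = more mainstream.
LARGER_IS_MORE_TEXAN = {"PRICE": False, "FACE": True, "PEN": False}

# Hypothetical token, not a measurement from the study.
example = VowelToken("PRICE", f1_nucleus=700.0, f1_glide=400.0)
example_delta = raw_measure(example)
```

Keeping the direction of each scale in one lookup table makes it harder to misread the charts, since the mapping between "high" and "Texan" differs from vowel to vowel.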


The method laid out here allows us to measure the Texanness of a given speaker’s speech by degrees. In our blog post Vowel Power: Local Accents and Stylistic Versatility in the 2014 Race for Texas Governor we show it can be used to compare the speech in context of the two current candidates for Texas governor.

Vowel Power: Local Accents and Stylistic Versatility in the 2014 Race for Texas Governor

by Lars Hinrichs, Axel Bohmann, and Erica Brozovsky

As Texas prepares for this year’s gubernatorial elections, we at the Texas English Project wondered: What role does language play in this campaign? Beginning with Barack Obama’s first presidential campaign, the linguistic performance of major candidates has attracted a new kind of interest from linguists, journalists, and political commentators. While candidates’ ways of speaking have always attracted commentary, Obama foregrounded a new dimension of political speech: stylistic versatility.

More specifically, it is Obama’s ability to code-switch between an unmarked, mainstream-American speaking style and speaking styles that are identifiable as coming from the African-American community. Samy Alim and Geneva Smitherman have written about Obama’s language use both in the New York Times and in their 2012 book, Articulate While Black: Barack Obama, Language and Race in the U.S. They show how Obama’s stylistic versatility sets him apart from other, more traditional candidates for president. For example, they argue that Obama’s 2012 opponent, Mitt Romney, was perceived by the public as flat and invariable, relative to Obama, and that this was due in part to his relatively homogeneous, monostylistic way of using language.

The two Texans who are currently running for governor are not part of different ethnic groups, as was the case with Obama and his opponents in 2008 and 2012. Still, we asked ourselves:

  • Does language use distinguish political candidates at the state level the same way it does at the federal level?
  • Are there meaningful differences between candidates in terms of linguistic-stylistic versatility even when they both belong to the same (majority) ethnic group?
  • Is linguistic versatility typically more prevalent in the speech of Democrats, as the example of Obama might suggest?

In order to be able to think about these questions in more detail, we decided to conduct a study of publicly available recordings of both Greg Abbott and Wendy Davis. We set out to find evidence of stylistic versatility in each. It should be clear that we would not expect stylistic variability to be expressed the same way in the two Texan candidates as it is in the President. In other words, we would not expect to find, in either Greg Abbott or Wendy Davis, code-switching behavior between mainstream U.S. English and an ethnically specific variety such as African-American English. However, both candidates spent much of their formative years in Texas: Greg Abbott, who is 56 years old, was born and raised in Dallas County; Wendy Davis, 51 years old, moved to Fort Worth with her family at age 11, coming from Rhode Island. So while Abbott’s exposure to Texan-sounding English in childhood started at an earlier age, Davis was immersed in a Texas-English-speaking environment at a sufficiently young age to still be influenced by it, enabling her to acquire familiarity with, and possibly partial proficiency in, the Texan way(s) of speaking English.

While both candidates have some access to Texan speech forms, they also both grew up fully aware of the supra-regional standard of American English, to which they were exposed through the media, their education, their proximity to a large city, and so on. Thus, both Abbott and Davis have the linguistic resources to vary between more Texan-sounding and more mainstream-U.S.-sounding speech. We can with good reason assume that, like President Obama, they will find it expedient to speak in a more Texan-sounding style in some situations and in more mainstream-sounding style at other times.


The Texas Accent Index

In our first pass at the data, we formed a “Texas Accent Index” (TAI) for each of three vowels which may be pronounced differently according to whether the speaker has adopted a Texas-style speech variety or a more mainstream-U.S. variety. The vowels chosen were i as pronounced in the word price, a as pronounced in face, and e as pronounced in pen. For example, the vowel i also occurs in the word right: a more Texan pronunciation of this word might have a pronunciation which we could spell raht. The vowels a and e also show differences between Texan speech and a mainstream-U.S. variety.

The index was calculated as follows: for each vowel, the total extent of possible variation was taken to expand from the lowest to the highest value for that vowel in our entire dataset. For each of Abbott’s and Davis’s vowels, then, we calculated its position on that scale as a proportion of the overall range of variation. Thus, we obtained a value between 0 and 1 for each vowel from the candidates. The scales were oriented so that, for each vowel, values closer to 1 indicated that realizations were “more strongly Texan-sounding”, while values closer to 0 stood for realizations that were “more like mainstream U.S. English”.
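
The calculation just described amounts to min-max scaling over the whole dataset, with the scale flipped for vowels where a larger raw measurement means a more mainstream pronunciation. A minimal sketch follows; the function name and all numbers are hypothetical, and the raw measurements are assumed to be already normalized across speakers.

```python
from statistics import mean

def texas_accent_index(value, dataset_values, larger_is_more_texan):
    """Place one vowel measurement on a 0-1 scale (1 = most Texan-sounding).

    dataset_values: every measurement of this vowel in the entire dataset,
    which defines the lowest and highest possible values.
    """
    lo, hi = min(dataset_values), max(dataset_values)
    if hi == lo:
        raise ValueError("dataset shows no variation for this vowel")
    position = (value - lo) / (hi - lo)  # proportion of the overall range
    # Orient the scale so that values near 1 are always "more Texan".
    return position if larger_is_more_texan else 1.0 - position

# Hypothetical PRICE measurements: a larger delta is more mainstream,
# so the scale is flipped.
price_dataset = [0.0, 100.0, 200.0, 400.0]
candidate_tokens = [100.0, 200.0]
scores = [
    texas_accent_index(v, price_dataset, larger_is_more_texan=False)
    for v in candidate_tokens
]
mean_tai = mean(scores)
```

Averaging the per-token scores for one speaker and one vowel gives a mean score of the kind compared between the two candidates.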

Figure 1. Texas Accent Index for the PRICE, FACE, and PEN vowels. *Higher* values indicate more local-sounding realizations of these vowels.

As Figure 1 above shows, Abbott’s mean TAI scores for all three vowels are higher than Davis’s. This observation is in line with every hypothesis we had about the data: more local-sounding speech is typically associated with males more than with females, and with older speakers more than with younger speakers (although the age difference between Abbott and Davis is only five years). In addition, Abbott has the biographical “advantage” of having spent his entire childhood near Dallas, which gave him greater exposure to Texas forms of speech, compared to Davis’s late start at age 11.

Returning now to our original interest in versatility, we will look at each vowel individually, and subsequently try to see if we can explain any variation in the two candidates’ use of these vowels in the different speaking contexts that we captured by our selection of data.


Results for the PRICE vowel are shown in Figure 2 below. For this vowel, the larger the F1 delta, the closer to the mainstream American English pronunciation a given pronunciation is. Thus, in our graph, lower bars indicate more “Texan” pronunciations. The two horizontal lines show the means for two control groups: model Texans, with speech characteristic of a Texas variety, and neutral Texans, with speech closer to the mainstream-U.S. variety.

Figure 2. Abbott’s and Davis’s realizations of the PRICE vowel in different situations. *Higher* values stand for more mainstream-sounding realizations of the vowel.

Abbott pronounces his PRICE vowels more closely to the local, Texas English norm than Davis in every context we studied: the chart shows his PRICE vowels clustering around the horizontal line for the model Texans in every context. Davis, on the other hand, orients much more to the neutral Texans: in three out of five contexts, she even out-performs the neutral Texans, achieving mean F1 delta values that are even higher than the mainstream control group’s. In other words, Davis avoids this stereotype of Texan speech even more strongly than the neutral baseline.

It is interesting to consider how context explains both candidates’ variation for this vowel. Davis varies in a way that can easily be explained by her audience: she has the least amount of modulation in the vowel at a local campaign event in Texas, where she addresses (almost) only locals; the highest (closest to mainstream) value is achieved at the campaign announcement, a highly formal event whose audience can easily be thought of as “the whole world”, i.e. transcending Texas.

By contrast, Abbott’s variability is rather small, and the difference among contexts cannot be explained very well by audience type. If we consider only this vowel, it appears that Abbott just talks the way he talks: with a fairly low F1 delta for PRICE, close to the model Texan norm, and differences among speaking contexts seem more or less incidental.
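
The comparison behind this discussion, per-context means for a candidate set against the two control-group means, can be sketched as follows. The function names and all numbers are hypothetical illustrations, not values from the study.

```python
from collections import defaultdict
from statistics import mean

def context_means(tokens):
    """Average a candidate's measurements within each speaking context.

    tokens: iterable of (context, value) pairs for one vowel.
    """
    by_context = defaultdict(list)
    for context, value in tokens:
        by_context[context].append(value)
    return {context: mean(values) for context, values in by_context.items()}

def nearer_baseline(value, model_mean, neutral_mean):
    """Report which control-group mean a value lies closer to."""
    if abs(value - model_mean) < abs(value - neutral_mean):
        return "model"
    return "neutral"

# Hypothetical F1 delta values for the PRICE vowel.
tokens = [
    ("Local speech", 100.0),
    ("Local speech", 200.0),
    ("Campaign Ad", 400.0),
]
means = context_means(tokens)
labels = {
    context: nearer_baseline(m, model_mean=120.0, neutral_mean=350.0)
    for context, m in means.items()
}
```

A speaker whose contexts all receive the same label shows little versatility on that vowel, whereas labels that change with the audience point to style shifting.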


Figure 3 below shows results for the FACE vowel. Note that the low/high mapping to the Texan/mainstream binary here is reversed: larger measurements for the F1 delta indicate a more traditionally Texan-sounding realization of the vowel; smaller measurements are more like mainstream realizations. Correspondingly, the horizontal line for neutral Texans is lower than the line for model Texans.

The results reflect the fact that the lowered beginning of the FACE vowel that arises in Texas English (the pronunciation of face could be written fice) is a less stereotyped feature of Texas English than the variation in PRICE: speakers are not as fully aware of the feature, so they apply less conscious control to it. The picture is less orderly. Even though the TAI values for this vowel tell us that overall Abbott’s performance on this vowel is consistently closer to the model Texan norm than Davis’s, his average F1 delta values are only larger than Davis’s in two out of five contexts, while Davis’s are higher in the others.

Figure 3. Abbott’s and Davis’s realizations of the FACE vowel in different situations. *Lower* values stand for more mainstream-sounding realizations of the vowel.

It is interesting that Abbott out-performs Davis most strongly on this index of Texanness at the local campaign event, a rally with the rock musician Ted Nugent. Recall from Figure 2 that for the PRICE vowel, Abbott and Davis were basically on a par for the “local event” context. This finding might be explained by the fact that quite possibly, even Abbott applies some conscious control to his vowels: he may have been overcorrecting against assumed expectations at the Ted Nugent event, intentionally avoiding overly clichéd pronunciations of the strongly stereotyped PRICE vowel. And so – even though the event headliner was a rock star appealing to a conservative fan base, and even though the audience was made up mostly of locals – he did not over-perform on flattened PRICE vowels. By contrast, the FACE vowel is not as much controlled by speakers’ awareness. This may explain why on this vowel, Abbott performs exactly according to the hypothesis that local events with local audiences would trigger the most local-sounding speech.

Another remarkable fact is that in the two campaign contexts (Campaign Announcement and Campaign Ad) Davis outperforms Abbott in terms of the localness of this vowel. Davis is a speaker who, overall, avoids sounding like our model Texan control group, but for a vowel feature that has as little stereotype attached to it as FACE vowel lowering, she may – more or less consciously – converge to the Texan norm, despite her overall design.

Finally, we note that the Campaign Ad context shows basically the converse picture to the Local Event context. A TV ad for the campaign is, arguably, the speaking context that holds the greatest potential for candidates to plan their way of speaking. In other words, candidates are most likely to “perform” in an ad. In this context, we find Abbott using very clearly de-accented, or mainstream-like, pronunciations of the FACE vowel, while Davis uses pronunciations that are very strongly Texan-accented. In other words: Abbott projects a supra-regional face in the ad, while Davis projects a local, Texan identity by her language use, at least for this vowel. Meanwhile, at the local event, we would expect much more spontaneous (and less performative) kinds of language use. Here, we find Davis using forms of the vowel that are closer to the mainstream norm (i.e. closer to the “Neutral Texans'” values), and Abbott choosing Texas-accented vowels.

It appears that the two candidates’ approaches toward Texas-accented speech are each other’s opposites: Davis, despite her exposure to the Texas accent in childhood, was socialized primarily as a speaker of mainstream U.S. English. Given the chance to plan and perform her speech, as in the recording of a campaign ad, she projects a recognizably local character. Abbott, meanwhile, speaks with the Texas accent more naturally (as the local event context shows) — but when planning his performance on the recorded campaign ad, he makes an effort to show that he is also capable of speaking in a mainstream-accented way. We will return to these observations in the conclusion below.


Texas English typically raises the PEN vowel — our way of referring to the short e vowel before nasals (m, n, and ng). For example the word men in Texas speech often comes out as min; the words gentle and reMEMber, among others, show a similar variation. Even though younger speakers use this Texas accent feature very widely, it is not very strongly stereotyped: Texans do use this accent feature, but they don’t talk about it much.

Figure 4. Abbott’s and Davis’s realizations of the PEN vowel in different situations. *Higher* values stand for more mainstream-sounding realizations of the vowel.

As Figure 4 above shows, Abbott and Davis diverge very consistently on this feature. (Note for this chart that higher values indicate more mainstream-like vowels, and lower values indicate more Texan-sounding vowels.) Abbott’s PEN vowels, in keeping with his Texan roots, cluster around the model Texan baseline. Meanwhile, Davis’s measurements are not just as high as the neutral Texan control group’s; they are consistently higher than that baseline. This finding confirms that, as one would expect, Davis did not adopt some of the accent features of Texas English, especially a feature such as this one, which is adopted and advanced by children. It also supports our impressionistic observation that Davis has traces of a broadly West Coast accent: lowered PEN vowels are an innovative feature of California speech. Given Davis’s biographical influences (Northeast/Texas), we can only conjecture why she might exhibit this accent feature. It is possible that she aims toward an accent that stands for a modern, Californian, female American. At any rate, Davis clearly disregards the local, Texas-accented norm for this vowel class entirely.


In summary, we found that:

  • Both Abbott and Davis show linguistic versatility across speaking contexts.
  • While Davis is slightly more linguistically versatile than Abbott, Abbott is clearly and more consistently oriented to the norm of Texas-accented English.
  • Considering our findings for the three vowels together, Davis emerges as predominantly using a mainstream U.S. English accent, while Abbott aligns, in most contexts, with the Texas accent of English.
  • Remarkably, both candidates depart from this pattern in their respective campaign ads: here, Abbott tempers his Texas accent, while Davis puts one on, if only partly: modification of the FACE vowel is the only Texas accent feature she seems to occasionally embrace strongly; for the other vowel classes that we studied, Davis showed an overall robust orientation toward mainstream U.S. English.

Editor’s Note: The authors explain in a second post some of the details behind their analysis of Texas speech.

The Second Amendment: Our Latinate Constitution

The Second Amendment, as written, really doesn't say anything about personal protection

The New Yorker recently posted a poignant and direct commentary by Jeffrey Toobin concerning popular understanding, or the lack thereof, of the Second Amendment to the Constitution: “A well regulated militia being necessary to the security of a free state, the right of the people to keep and bear arms shall not be infringed.” Mr. Toobin introduces this phrase by stating, “The text of the amendment is divided into two clauses and is, as a whole, ungrammatical…”. Overall Mr. Toobin’s commentary provides numerous insights, but I find this introduction to the Amendment slightly in error. The sentence is extremely grammatical, and failure to recognize the grammatical relations involved opens the door to serious misinterpretation.

Begin with the claim that the sentence contains two clauses. This depends on how you interpret the term “clause”. If by “clause” we signify “a verb and a subject”, then yes, we have two clauses: “A… militia being…” and “the right… shall not be infringed”. But this interpretation of “clause” is very technical and may lead to some odd conclusions for the non-linguist. By this definition, “Bob winning was unfair” has two clauses, shown here in brackets: “[[Bob winning] was unfair]”. That is, [Bob winning] has a verb “winning”, and “Bob” is the subject, the one “winning”. And that whole notion, the winning that Bob seems to be doing, is itself the subject of the verb “was”, so we have a second clause. But that’s not the only notion of “clause”. More importantly it’s likely not the notion applied by educated 18th-century gentry.

The second notion of “clause” is “a conjugated (or finite) verb together with a grammatical subject”. Simply put, we conjugate a verb when we change its form to accommodate differing subjects: “I go” is a conjugated verb plus a subject, since if we change the subject to “he”, we must also reflect that change in the verb: “he goes”. By this definition, “A… militia being necessary…” is not a clause. If we change the subject, this change is not reflected in the verb: “… me being necessary…”, “… you being necessary…”, “… a horse being necessary…”, “… twelve reindeer and a candy cane being necessary…”. That is, “being” is not a conjugated verb. It’s merely a participle, a second-class citizen in the verbal metropolis.

We should apply this second notion of a clause in our interpretational voyage through the Second Amendment precisely because it is the definition one encounters in the study of classical languages such as Greek and Latin. At the time the Founding Fathers penned the Constitution, Latin formed a mainstay of upper-class education. Moreover, even in the unlikely circumstance that those Founding Fathers who shaped the phraseology of the Constitution had little familiarity with Latin, nevertheless the categories and structures of Latin permeated, or even provided the model for, the teaching and understanding of English grammar during that period.

One particularly important Latin construction, oft maligned by students of Caesar’s Commentaries, was the dreaded ablative absolute. The “ablative” is a Latin “case”: a way of modifying a noun to show its function in a sentence. English too has cases: “Bob” can be the subject of a verb (“Bob bites the dog”) or the object (“The dog bites Bob”). But the modified form “Bob’s” denotes possession: “I bit Bob’s dog.” By changing the ending, we alter the relation of the noun “Bob” to the other elements of the sentence. English has very few of these cases, Latin many. One of these is the ablative.

The Latin ablative plays a special role in absolute constructions: structures composed of a noun and a participle (like “being”, a non-finite verb) which are grammatically independent of the rest of the sentence, though they convey important information relevant to the sentence. Because they are absolute, i.e. grammatically independent, no particular case really suits the noun and accompanying participle: the case denotes the grammatical function of the noun in the sentence, but a noun in an absolute construction has no such function. Nevertheless Latin as a linguistic system requires each noun to have a case, just like English requires us to say “I go” rather than “me go”. And so Latin tosses absolutes into the ablative — the junk case, really, the case that Latin puts nouns in when it doesn’t know what else to do with them.

Ah, Latin. How quaintly antique with its numerous cases and its ablatives absolute. Fitting that Latin and its complicated grammar should be confined to the Halls of History — or of the Vatican! But… we have absolute constructions in English too. They’re often short and frequently informational throwaways. Imagine, for example, a debate hosted by a moderator. After a comment by one debater, the moderator might shift to the second debater by saying, “That being said, we should ask the esteemed Ms. S…”. That little introductory quip, “that being said”, is an absolute. It has no grammatical relation to our asking something of Ms. S. Sometimes we shorten it even further: “That said, we should…”.

So absolutes are all over the place, both in English and in Latin. What’s their point? Grammatically, not much, since they’re absolute. Really they provide additional information bearing on the rest of the statement (otherwise they’d be left out, right?). For example, we find in Cicero’s Oration on Pompey’s Command: mercātōribus … iniūriōsius tractātīs bella gessērunt “(our) traders having been treated rather badly, (our ancestors) waged war” (5). And in his Oration for Milo: semper exīstimābitis vīvō P. Clōdiō nihil eōrum vōs vīsūrōs fuisse “you will always think that, (with) Publius Clodius being alive, you would never have seen any of these things” (28). Evidently it’s up to the reader to figure out the connection, but these examples show they can be pretty important: e.g. whether Publius is alive or not decides the question of how likely you are to have seen these things. In fact a primary use of the absolute construction in Latin is to give the conditions under which the rest of the sentence is valid. And we see this in everyday English: “My philosophical leanings being what they are, I would draw a different conclusion.” The absolute tells me what has to be true for me to draw my conclusion, i.e. my leanings have to be a certain way.

And that brings us back to the Second Amendment. The Amendment, as written, is in fact eminently grammatical. The Founding Fathers were no strangers to absolute constructions. In that era the height of written English style was still to emulate the compositional techniques of the classical Greek and Latin authors. And in the Second Amendment, everything up to the comma (“A well regulated militia being necessary to the security of a free state,…”) is an absolute construction. It is not a “clause” in our schoolmarmish sense of the word, nor is it meant to be. Grammatically it has nothing to do with the rest of the sentence: “regulated”, “militia”, “necessary”, “security”, “free”, “state”: none of these words recur in the sentence. So why is that absolute there? To provide the conditions under which the rest of the sentence is valid. That is, “the right of the people to keep and bear arms shall not be infringed” only insofar as this accords with the necessity of maintaining a “well regulated militia” to ensure the “security of a free state”.

Where does this leave us? As a well constructed sentence, the Second Amendment says this: the people have a right to bear arms, inasmuch as that pertains to forming a regulated militia to secure a free state. Nothing more, nothing less. What of the right to personal self-protection? Who knows! — the Second Amendment does not talk about that. The main clause, “the right of the people to keep and bear arms shall not be infringed”, cannot be read without the preceding absolute — otherwise the Founding Fathers would have omitted that absolute. (I take it as given that they included in the Constitution only those words that they thought should be there and be interpreted; that they didn’t insert window-dressing or fluff.) Moreover, assuming the Founding Fathers were rather well educated, none of them would have misunderstood the limiting condition that the initial absolute put on the concluding main clause. Importantly it sets the topic: the militia, not the individual. We can certainly hem and haw as to the meaning of individual terms in the Second Amendment — “militia”, “well regulated”, “the people”, “security”, “infringed”, “arms” — but we should be crystal-clear as to the grammar. If one thing is manifest, it’s that the initial absolute puts a limit on the applicability of the main clause; the latter cannot and should not be interpreted without the former.

Nota Bene: The text of this post was actually written December 18, 2012. But we needed to create a blog here at the LRC in order to publish it! Of course in the meantime other scholars have had the same reaction to Mr. Toobin’s column, one of these being published in the excellent Language Log at UPenn. Regardless of any similarities, the two articles were written independently.
