From Studyplace

Miscornmumcation1: Users and machines embroiled in literacy misunderstandings on blogs
Gillian "Gus" Andrews
On some nights I sit in the parlor of my sad savaged blog and think it was only a dress rehearsal, a dry run. That I will build another blog elsewhere and make its seams tighter, armor it and therefore myself better for the world.... At certain times I persuade myself an admirable stasis is attained: my blog abides, adapts, is made worldly by its users. At other moments I feel we three stalk one another: prey and predator that have each come under my roof, my own role unknown as yet. It is then I think I hear the blog ticking like a bomb. -- Jonathan Lethem, "The Dreaming Jaw, The Salivating Ear," Harper's Magazine, October 2009



The locus of control with regard to media technology has been the subject of debate since scholars began the study of such technologies in the early 20th century.2

Some determinists claimed that audiovisual and digital media have had a more negative impact on individual users and on Western civilization than did print media; Neil Postman and older media effects theorists, including behaviorists like Bandura, were among these, while Larry Cuban and Clifford Stoll represented the critique of newer digital media. A similarly pessimistic view of media cleaved more closely to Marx: Chomsky and Herman, Bagdikian, and McChesney see an inevitably oppressive effect when governments and corporations become involved in media production.

Then there are media and technology scholars who, while they believe technologies' effects are neutral or positive rather than negative, still see these effects as monolithic. Many writers about the Internet, for instance, fall into utopian predictions of the change they expect it to enact; Benkler is one of these. McLuhan, while not as pessimistic about the broadcast media he wrote about, still believed their effects to be dictated by the shape of their technologies ("the medium is the message"). Innis also claimed the power of shaping society for specific communications modes, saying oral communication supported one kind of society, written another. This closely resembles the argument made by Ong about text media in particular.

There are media and technology scholars, social constructivists, who beg to differ. Jenkins, borrowing directly from de Certeau, argues that rather than being passive recipients of media texts, media watchers actively reinterpret and respond to these texts with their own texts, refiguring them with their own critiques of the original texts' assumptions. Within studies of educational technology, a group calling itself the New Literacies Scholars has demonstrated that use of a given medium rarely has uniform effects, and, in fact, those targeted for literacy education with the use of media and technology frequently repurpose media and technology to their own resistant, creative ends (Street, Gee, Finders, Lankshear and Knobel, Leu, Leander, etc.).

This dissertation arises from a sense that neither determinism nor constructivism alone adequately explains what is really going on in interactions with media and technology. Certainly there are elements of media and technology which are difficult or "costly" (Latour, 1987) to resist. Certainly, though, there are also unmonitored spaces where creative, resistant uses of a medium are possible. But in which cases does each of these observations hold? What are the constellations of possibilities which enable a medium or technology to dictate how it is used or interpreted, or which enable the user to develop their own, possibly resistant, creative uses and interpretations? At the level of individual actions, what does this look like?

Deibert and Carey are two scholars of media technology who staunchly refuse to take a purely structural-functionalist, technological-determinist, or social-constructivist view of the media they write about (print and the telegraph, respectively). Instead, they describe the mutually-reinforcing elements of each perspective, anchoring them firmly in historical detail. Deibert makes a case for refusing to fall into pure determinism or constructivism:

The most serious flaw in [a technological determinist] model is that it tends to view the introduction of a new technology of communication as an autonomous force with certain definite and predictable results irrespective of the social and historical context in which it is introduced. Specific social phenomena are seen as invariably tied to a specific technology, as if the technology itself had the power to generate behaviors and ideas de novo.[...] By attributing ‘generative’ causal powers to the mode of communication, the technological determinist model tends to slight the extent to which the technology itself emerges out of a particular context and is itself influenced by social, cultural, and historical forces.[...]
But despite its strengths as a corrective to the technological determinist model, the social constructivist position has a tendency to fall into the opposite trap and slight, if not ignore altogether, any independent effects attributable to the technology itself once introduced. It is important to remember that although social forces may give direction to technological innovation, they are not completely determinant; once introduced a technology becomes part of the material landscape in which human agents and social groups interact, having many unforeseen effects. (Deibert CITE)

Bruno Latour, also, seeks details to explain when control or revolt is possible. Like Deibert, he seeks the specific effects of social configurations and technologies on particular behaviors, the "obligatory passage points" through which actors (human or machine) must pass to behave as they intend to.

Scholars seeking to understand how communication is made orderly, whether it is face-to-face or mediated, have sought out the control of this order at the highly granular level of individual behaviors, interactions, and utterances. Conversation analysis and ethnomethodology, two fields in which this micro-level control is a concern, observe that disruptions of routine activity may yield more information about how interactions are managed than does routine activity itself (Garfinkel, Goodwin). Suchman, in particular, has applied these methods to human-computer interaction, attempting to temper the proposal made by mainstream human-computer interaction studies that human-computer interactions are exclusively the product of pre-imagined cognitive plans (Suchman, cite). In doing so, Suchman, like Goodwin and Garfinkel, provides a way to understand how disruptions are repaired to maintain order, rather than explaining disruptions away as data noise.

The study at hand follows the trail blazed by Suchman. It takes disruptions of orderly behavior as its focus in an attempt to discover how the "correct" way to use digital media -- specifically, blogs -- is negotiated, even contested, by owners of and visitors to those pages. It seeks to understand at a very simple level -- that of linguistic function and conversational management -- how control of orderly writing is enacted on the Web. Finally, it follows the leads of Deibert and Latour in exploring how technologies become actors establishing control or creating disorder in these Internet conversations, alongside Internet-regulating institutions such as ICANN, corporations like Google and Yahoo, and a technological elite of bloggers who are employed as sysadmins, webmasters, and programmers. It will look at how all sides draw on old literacy practices as well as computer code and software to construct and navigate texts and each other's social status. Ultimately, this study hopes to sketch out the implications of such power struggles for Internet literacy practices still being developed both on the Internet and within institutions such as libraries and schools.

1 The title of this dissertation comes from a kind of machine participation in literacy which is not actually addressed in this paper, but is related in the spirit of understanding how machines make hash of human meaning-making. "Miscornmumcation" is a word which was produced by OCR (optical character recognition) software, which takes images from scans of printed text and converts them into digital text files. It is, of course, not foolproof, especially when text is misaligned, blurred, stained, or otherwise unreadable by the software; human error can also play a part in this equation.

In this case, the word the OCR software was trying to read was "miscommunication." "Miscornmumcation" was what it rendered, thereby providing an excellent symbol of machines' imperfect participation in human conversations.

2 These discussions regarding mediating technology echo sociological debates over the locus of control of human affairs in general: To what extent do cultural pressures shape what we can and cannot do? What recourse do we have when some avenues are cut off? And what, exactly, are the means by which culture shapes our actions? Each answer given to these questions has been subject to a further round of thinking. Round one: We are unavoidably tightly bound by social structures, and the material culture on which they rest. (Marx, Foucault.) But wait -- Round two. Rebellion happens. Are we not free to find spaces in which control cannot be exercised -- margins and leftovers which can support actions unimaginable in the philosophy of the state, behaviors unseen by the panopticon? Isn't rebellion always available to us? (de Certeau) Ah. But round three: how do you know when you are in a Marxist situation or a de Certeauvian one? When is control resisted or inevitable? Rather than ask how these conditions are imposed on individual actors, how do these actors, through everyday behavior, enact either control or resistance? (Garfinkel) It is perhaps the sad side effect of academic specialization that media, technology, and information departments have spun away from the arenas where these sociological debates are held, attending separate conferences and sometimes working entirely away from the generalized methods the social sciences have developed.

Phenomenon to be investigated

Online communication, as a written medium, has always left ample evidence of misunderstandings, mistakes, and resulting arguments, erupting as participants attempt to control what is said and how. Debates about the "appropriate" procedures and rules for using the new medium abound. Indeed, since before the advent of Usenet these misunderstandings have introduced a number of new terms for the disruptions of proper speech and writing into the lexicon: "flaming," "trolling," "spam," "netiquette," et cetera. But, while the definitions of earlier phenomena such as flaming and trolling are still brought to bear on Internet miscommunications today, to focus on the control and correction of "Internet mistakes" as a general phenomenon would be too broad. The analysis in this study focuses on a very specific pattern of misunderstandings and negotiations to repair them, and only in blog comment threads.

The canonical example of the misunderstandings I will address appears in a blog which was linked to by the community-run aggregator blog MetaFilter. (I say "canonical" because for those who participate in MeFi discussions, the errors in this comment thread came to symbolize and even stand for other misunderstandings of its kind.)

The blogger, Ryan MacMichael, posted the following anecdote to his blog:

Image:Maury canonical.png

A few comments into the responding thread, a stranger begins her post,

Maury, I was so impressed with what you did for the little girl with the club feet and hands, how you got a wheelchair van, and computer for her.

This commenter, like many others in the thread, disagreed with the blogger about something fundamental: whether this website was a good place to contact Maury Povich. The blogger responded with a correction to this assumption a few comments later:

FOLKS, PLEASE... Maury has nothing to do with this page and he will never, ever read this page. Trust me.

While a number of the misunderstandings centered on the person who was being addressed -- often a celebrity -- there were also many in which commenters sought technical assistance with software or web services, products they wished to purchase, or general information (for example, about spiders or having a lisp). The commonality among all of the misunderstandings was the shape of the conversations they spawned, not their subject matter.

The basic pattern I was looking for in blog comment threads to include in the corpus was:

   Article ... <--response, in error(*x) ... <--correction of error(*y)

In other words, at least one mistaken response is made to the original post, and at least one correction is made to that mistake. Examples and variations on this pattern are presented in a later section.
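The selection rule above can be expressed as a short check. What follows is only a minimal sketch, not the procedure actually used to build the corpus: it assumes each comment has already been hand-coded with a hypothetical label ("error", "correction", or "other"), and the names Comment and fits_pattern are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    author: str
    label: str  # hypothetical hand-assigned code: "error", "correction", or "other"


def fits_pattern(thread):
    """Return True if some "error" comment is later followed by a "correction"."""
    seen_error = False
    for comment in thread:
        if comment.label == "error":
            seen_error = True
        elif comment.label == "correction" and seen_error:
            return True
    return False


# A toy thread shaped like the Maury example: a mistaken post, then a correction.
maury_thread = [
    Comment("stranger", "error"),
    Comment("another reader", "other"),
    Comment("blogger", "correction"),
]
print(fits_pattern(maury_thread))
```

A thread in which a correction appears before any error, or in which no correction appears at all, would not qualify under this reading of the pattern.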

Preliminary definitions

During analysis, it became clear that errors and corrections were being made by distinct groups. Those issuing corrections tended to agree with each other on what mistakes were being made and how they ought to be fixed; they often made comments about these errors to each other. I called this group "natives," and it included the bloggers themselves. In general, "natives" agreed with bloggers and their interpretations of how their blogs should be used. Bloggers, as the ones running the blogs, were a distinct group with privileged abilities; they could control the conversation through turning off comments, looking at traffic statistics from their server logs, and adding to or taking away from the text on the page.

The other group I called "strangers." "Strangers" were essentially defined by natives: they were identified as "doing something wrong" in how they used the comment thread. Despite the fact that "strangers" were essentially only identified as a group by natives, and did not otherwise cohere -- they addressed each other less often -- they often wrote in ways which were demonstrably similar, yet different in form and content from the ways natives wrote.

I should note that when using the term "natives," I am not referring to the term "digital native" coined by Marc Prensky (2001). I do not mean to imply, as he does, that "natives" are younger and of a different generation than my "strangers" or his "digital immigrants"; that cannot be inferred from my data.

More detailed descriptions of the participant categories appear in a later section.


There is an argument, embodied in certain free software development projects, in these projects' founding documents (CITE IPF talk), in some academic writing, and in much of the popular discourse about technology, that Internet technology is inherently democratizing and resistant to centralized control. (Stallman, Raymond, Benkler, Shirky, etc.) Indeed, the Internet was designed to resist centralization, as servers were distributed in order to survive catastrophic attacks on the system. Were one to subscribe to the determinist media theories of McLuhan, Innis, or Ong, then, one might expect that a profusion of ways of interacting with the Internet would prove to be equally acceptable, flourishing in microcultures all over the Internet: if technology determines behavior, and Internet technology supports an egalitarian diversity of behaviors, no one kind of behavior should end up being privileged over another. Literature on fan cultures online and on student uses of Internet applications has diligently documented the proliferation of online activities; scholars of technology within schools of education have been enthusiastic about the possibilities of harnessing students' online reading and writing for literacy. (Jenkins, Gee, Lankshear and Knobel, Steinkuehler, etc.)

However, as established in the introduction, neither the determinism which predicts technological effects nor the social constructionism which celebrates the diversity of online communication accurately portrays Internet discourse. Flame wars still erupt; the uninitiate are derided as "n00bs;" on Boing Boing and Gawker sites, those whose posts are judged to be disruptive are "disemvowelled," making them harder to read. Misunderstandings occur, and those involved set out to determine who was in the right.

In order to better understand some of these misunderstandings, it will be useful to visit a few bodies of literature. Brian Street, a preeminent thinker in the field of New Literacies, should be considered here, and not only because of his refutation of Ong's determinist claims about the power of print. Street models ways of directly connecting didactic literacy efforts to governments, churches, and other powerful groups, while keeping diverse ways of reading texts in focus at the same time.

Which powerful institutions exert de facto control over the supposedly unregulable Internet? Sassen's and Lessig's arguments against techno-utopianism provide useful pointers towards the institutions which may be involved in making determinations of which ways of using the Internet are right and wrong.

Finally, Latour's and Aarseth's unpacking of the ways in which humans and machines co-create texts and argue for authoritative ways of reading them will be critical not only to understanding the influence of machines themselves, but also to keeping an eye on how actors like programmers, Google's search algorithm, or ICANN may be at work in situations where they do not appear to be present -- where their influence is extended by the machines they help shape. Latour's understanding of the moves which must be made to devalue or promote a particular way of using machines will also be useful.

The New Literacy Studies

The field of New Literacies has made the case that literacy is not a neutral, monolithic entity; people have different ways of reading and writing in different and changing circumstances, which carry a different charge depending on the situation. (The New London Group, 1996, "A pedagogy of multiliteracies: Designing social futures," Harvard Educational Review) For example, it isn't necessary and might be viewed as inappropriate to write an office report in iambic pentameter, but iambic pentameter is sometimes called for by teachers when they give assignments about Shakespeare. Many newspaper articles and school teachers have expressed frustration that students sometimes use text-message shorthand in their school papers, while young people sometimes mock their elders for signing off their casual IMs, text messages, or Facebook wall posts with the too-formal "Love, Dad." What "literacy" -- text-oriented participation recognized by local participants as appropriate -- means in each of these situations is different. With new technologies, its malleability is increasingly visible.

Brian Street is one of the scholars of literacy to whom the New Literacies scholars point as foundational. The great strength of Street's work is that it makes a direct connection between everyday reading and writing and powerful institutions such as churches, government, and other policy actors like international development organizations. In doing so, Street makes a case for considering literacy as a genuine form of capital.

In Social Literacies, Street describes his research on the uneven acceptance of government-mandated literacy programs in Iran. Street points out that when those aiming to foster literacy in an "illiterate" population ignore the reading, writing, and speaking practices in which the "illiterates" already engage, the population tends to pick and choose from the elements of the literacy program they are faced with, developing literacy practices which are not fully identical to the imposed practices and abandoning those which do not fit their existing structure. This observation may serve as a warning to anyone developing literacy instruction programs.

Street's attention to power and resistance unearths a wealth of variety in resistant literacy practices. He presents a catalog of government and missionary programs which ignored existing literacies, with the result that their intended pupils developed practices mismatched to those intended by their teachers. Street gives examples from Pacific islands and Africa, as well as Iran, which indicate possible forms of off-target adoption. In Madagascar, the Roman alphabet was borrowed for traditional legends to give them the power of Christian stories. In Fiji and Melanesia, writing was incorporated into religious ceremonies as a means of contacting the gods. In Nukulaelae, writing was used to express feelings -- particularly romantic ones -- which were considered inappropriate if spoken aloud. Street's exploration of these ways of selectively resisting or adopting literacy practices suggests that a diversity of outcomes is possible beyond a simple dichotomy between literacy and illiteracy, and this diversity has its consequences for participants.

Street notes that repurposings of texts frequently involve manipulation of their "paralinguistic and pragmatic features," "their formal appearance, decoration, covers, etc.," as well as deviations from hegemonic syntax and semantics. (Street, 1995, p 92) This observation suggests attention to non-textual elements -- layout, for example -- as well as linguistic elements when analyzing the comments left by participants in blog comment threads.

From his study in Iran, Street gives an example of one kind of text used by the educated specifically to distinguish themselves from the uneducated:

in a grade three school text book there were two accounts of the discovery of fire, one taken from Shanameh and one presented in contemporary 'scientific' language. The former was treated as an imaginative, but basically simple-minded account... while the latter was represented as what we now know to be true, employing modern science in an essentialist and rather fundamentalist way.... Many city-based Iranians... would use the distinction between myth and history as a marker of their own superiority over 'primitive' villagers: they could tell 'fact' from 'legend' whereas the peasants could not. A major source of these 'facts' was the school text book...
(Street pp 64-65)

And yet, Street notes, the tradition of reading practiced by students in their modern classrooms discouraged questioning, discussion, and interpretation far more than the country's earlier Qur'anic schools, or even than parents did when sharing Persian epic tales in the home. Instead, students were asked to memorize and accept what was written as "fact." (Street p 64) From this Street concludes that "not only does modern literacy foster uncritical belief in specific, 'modern' renderings of the world, it also contributes to a weakening of the kinds of sensibility and scepticism that may have been fostered in oral tradition." (Street p 66) This is a direct refutation of Walter Ong. Street demonstrates, countering Ong, that literacy itself does not impart particular mental traits or skills to the reader. Rather, it is the social context of particular ways of reading and writing which entails certain habits of thought, and associated interpretations of social standing. This is worth attending to when considering one group's derision of another group's mental ability based on their perceived illiteracy.

Even while primarily studying readers of published, one-way, non-interactive texts like textbooks and the Qur'an, Street has argued that simply studying the reading and writing of literacy learners is not sufficient for understanding literacy phenomena: he advocates a close analysis of the non-interactive texts and the way they are produced as well:

Ethnographers can no longer simply arrive in an 'isolated' village and study only local practice: ethnographers in the contemporary world, as they study local literacies, will require some knowledge of the central literacy tradition in the country studied, including both folk traditions and the immediate cultural background of those who write the modern texts for village children. There is scope here for a fruitful combination of literary critical and anthropological approaches. (Street p 53)

This idea of traveling to multiple sites of literate engagement with the same text evokes Boon's call for an "extra-vagant" ethnographic practice, moving from site to site as locals make reference to people and practices elsewhere. (CITE) In terms of studying blog comment threads, it supports the idea of studying both those setting up the blogs and those visiting and commenting; it even suggests attention to the actions of programmers such as search engine and blog software developers.

Street's focus on power relations manifesting through "specific cultural meanings and practices," together with his encouragement to study those who write the texts read by literacy learners, dovetails neatly with Sassen's and Lessig's accounting for the bodies that regulate the Internet, and with Bruno Latour's descriptions of how certain scientific texts and associated practices come to be treated as "fact" while others are dismissed as mere hypothesis.

Power and the Internet

Many recent New Literacy Studies projects focus on the "specific cultural meanings and practices" alluded to by Street, enumerating the whats and hows of the literacy communities they are studying. Many do not go much further. Some, but not all, follow through on Street's more powerful observation that literacy is "an ideological practice, implicated in power relations." Among the latter are Lohnes's study of technology use in and out of college classrooms (Lohnes, 2008), and Finders's case study investigating the failure of one teacher's goals for her creative writing classroom. (Finders, 1997) Notably, both of these studies investigate power structures involved in educational institutions.

If we were to look for power structures impacting reading and writing outside of schools, where would we look? Say we sought to understand literacy practices online, as many New Literacies projects do. What are the power structures which might shape the ideologies of "good" and "bad" writing online? At what moments could power relations be expected to define which writing is acceptable and which is not?

In seeking to answer these questions, this section looks far afield from the usual sociolinguistic or anthropological stomping grounds of literacy studies. I will look to political, economic, and legal theories about the restructuring of epistemological gatekeeping institutions with the advent of the Internet, in hopes that these theories can clarify what is at stake in reading and writing in different ways online. Specifically, I will focus on Sassen, Shirky, Lessig, and Benkler. This reckoning with broader infrastructural change is an attempt to live up to Street's demand that a study of literacy must entail "closely detailed accounts of the whole cultural context in which those practices have meaning." (Street, p 29)

While the freedom and neutrality of the Internet have been widely touted, both Sassen and Lessig refute this view, describing the Internet as subject to what Sassen calls "a de facto management." (Sassen, p 331) She makes a case that government standards, the increasing influence of large corporations, and the longstanding presence of bodies managing numbers and addresses on the Internet provide this de facto management, even though more formal "regulation" of the Internet is not really in effect. Like Lessig, she notes that this management is generally accomplished formatively, through infrastructure and code, rather than after the fact. It is in reckoning with the written infrastructure of the Internet that the potential for differentiated Internet literacies arises.

Sassen, drawing on Pare (2003?), identifies number-and-address-regulating bodies, specifically ICANN, as central to the de facto management of the Internet. Regulation of numbers and addresses is perhaps the only centralized management of the Internet that takes place; it is certainly the most important, as navigating to the server you wish to find is wholly dependent on address-based routing. Like other forms of authentication, IP addresses and URLs are critical to verifying online identity. (Sassen, p 332)

Of course, reading and writing are at the root of this. What is an IP address but a string of numbers? A URL but a string of characters? And the process of resolving a URL into an IP address takes a particular kind of reading, accomplished by code which is written, read, revised, and interpreted by computer programmers. IP and URL literacy could be considered the most powerful literacies on the Internet; they are central to making the Internet run. Certain computer scientists have been central players in establishing and running domain management, and thus have some de facto power. In general, those who work in technology are likely to be more fluent in the language of numbers and addresses than those who do not.
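To make the "reading" described above concrete, the following is a minimal sketch using Python's standard urllib.parse and ipaddress modules. It is not how any real resolver is implemented: the lookup table TOY_NAME_TABLE is a hypothetical stand-in for the domain name system, the function name read_url is illustrative, and the host name and address are values reserved for documentation rather than real servers.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical lookup table standing in for the real domain name system.
# Both the host name and the address are reserved documentation values.
TOY_NAME_TABLE = {"www.example.com": "192.0.2.10"}


def read_url(url):
    """Split a URL into its parts and look its host up in the toy table."""
    parts = urlsplit(url)  # read the string of characters as scheme, host, path
    host = parts.hostname
    address = ipaddress.ip_address(TOY_NAME_TABLE[host])
    return {
        "scheme": parts.scheme,
        "host": host,
        "path": parts.path,
        "address": str(address),
        # The dotted form is one way of writing a single 32-bit number.
        "address_as_integer": int(address),
    }


result = read_url("http://www.example.com/blog/maury")
for key, value in result.items():
    print(key, "=", value)
```

Even in this toy form, the string a person types and the number a machine routes to are two different texts, and moving between them depends on a table that someone else maintains.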

Particular players have fought to gain a greater say in domain management than others. "Companies want to establish a rule that they are entitled to any domain names using their trademarks;" cybersquatters buy up these domains and attempt to profit by selling them back to interested parties. (Sassen, pp 332-333) Controlling a domain name tied to your brand is also related to authentication. Brands are a means by which people are accustomed to authenticating the quality of a product in physical space; carrying that habit over into digital space, they might well expect to find the same products when typing in a URL or entering a search term. Following Castells's (2000) logic that critical hubs in the global "space of flows" are built with capital from earlier spaces of flows, we see that bodies which had strong financial influence in earlier economies also have a greater say in the new literacies of power.

It could be argued that numbers and addresses are akin to Lessig's "architectures of credentials": structures which help Internet users verify each other's identities. (Lessig, p 31) While Lessig is mostly speaking about the credentials computers use to validate users for high-stakes purposes such as access to classified information or financial transactions, locating information such as street addresses, ZIP codes, and area codes has traditionally been used in verifying whether a party should be privy to particular information. To whom does mail get delivered? Who votes in which precinct? Who pays more for a phone call? Who is allowed to get a visa to another country? And anyone who has been homeless can tell you that without a street address, it is difficult to find employment.

Perhaps, then, we should consider a spectrum of credentials, from low-security to high-security, where some are sufficient for delivering junk mail while insufficient, on their own, for being issued a passport. Even the use of an email address could be considered a form of credentials. Once a server validates an email address, it is enabled to pass messages along to it rather than marking them as undeliverable. This gives anyone who holds that email address the ability to pass messages to the intended recipient uninhibited.

This transaction may seem so ordinary in our lives as to be trivial (and Lessig, also, does not pay email much mind as an identifying feature). But in fact, email addresses have important ramifications for computer security and functionality, and for users' ability to manage their messages and thus their time, because of spambots. Spambots are pieces of software developed to "crawl" from website to website looking for mailto: links ("contact me" links on websites) and email addresses. They then return these addresses to someone who sends unwanted email, or "spam," to them. Ultimately, knowledge of a valid email account gives spammers the "credentials" they need to access email account holders and flood their inboxes with information about off-market drugs, travel "deals," insurance, and pornography.
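The kind of machine reading a spambot performs can be illustrated with a short sketch. This is an assumption-laden toy rather than a description of any actual spambot: it reads a single hard-coded page (SAMPLE_PAGE, with an invented address at a reserved example domain) instead of crawling the Web, and the class name MailtoCollector is illustrative. It relies only on Python's standard html.parser module.

```python
from html.parser import HTMLParser


class MailtoCollector(HTMLParser):
    """Collect the addresses found in mailto: links on one page."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Drop the "mailto:" scheme and any "?subject=..." suffix.
                self.addresses.append(value[len("mailto:"):].split("?")[0])


# Invented sample page; the address uses a reserved example domain.
SAMPLE_PAGE = """
<p>Thanks for reading.
<a href="mailto:blogger@example.com?subject=Hello">Contact me</a></p>
<p><a href="/archive">Archive</a></p>
"""

collector = MailtoCollector()
collector.feed(SAMPLE_PAGE)
print(collector.addresses)
```

The software attends only to the markup around the address, not to the "Contact me" text a human visitor sees, which is one small instance of humans and machines reading the same page differently.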

Again drawing on the idea of geographical addresses, we can turn to Sassen's idea of Internet "zoning." Sassen has described the growth of zoning as promoting "cybersegmentations" -- a proliferation of mini-Digital-Divides which are independent of issues of access to the Internet. (Sassen, p 332) She has elaborated on the nature of these cybersegmentations as she discusses the "mediating cultures" which shape Internet use: "specific cultures and practices through and within which users articulate the experience and utility of electronic space.... inflected by the values, cultures, power systems, and institutions within which it is embedded." (Sassen, pp 347-348; also Sassen 2002)

Much has been made of the "un-zoned" nature of the Internet in academic and popular writing. From protecting children from pornography to keeping your health, sexual orientation, and other touchy subjects out of the hands of potential employers, perceived crises on the Internet which are touted in the mass media often arise from this lack of zoning. But both Sassen and Lessig note that this is less and less often the case; technologies to enforce zoning and identity certification are gaining a greater hold online.

A telling example of the way external power systems and institutions shape online traffic has been provided by danah boyd in her discussions of how teenagers' use of Facebook and MySpace was shaped by college attendance patterns. Initially, Facebook was only available to students who had a .edu email address from a college or university. As a result, boyd found that teens with siblings in college, or who aspired to go to college themselves, were more involved in Facebook than teens from less-educated families. Having a Facebook account became a status symbol equated with higher education. By contrast, MySpace was first adopted by musicians and artists in urban areas, lending their own forms of cachet to that site. While the design and affordances of each site -- sober and simple on Facebook, customizably eclectic on MySpace -- play off the preferences of each of these groups, it is not the technology itself that shapes traffic patterns. Rather, the social decisions of the designers -- permitting access to anyone or only to those in the .edu domain -- shaped these cybersegmentations. (boyd, 2008)

Questions of who is old enough to see or hear certain content, or who may participate in discussions in walled-off zones, also raise questions of register, or who is assumed to be the audience. Shirky and Benkler address the changes the Internet is making to public communication in terms of the relationships that can be formed by a speaker and an audience: while one-to-one or one-to-many were previously the only communication modes available, the Internet adds many-to-many communication to the range of possibilities.

Benkler suggests this change has an impact on how messages are received and interpreted by audiences. Previously,

Consumers... would treat the communications that filled the public sphere as finished goods. These were to be treated not as moves in a conversation, but as completed statements whose addressees were understood to be passive readers, listeners, and viewers. (Benkler p 180)

While Benkler sees public sphere messages prior to the Internet as being delivered authoritatively, there were still separate "registers" for speaking privately or secretly. (Shirky, p 89) Knowledge of these registers, and of sub-types of registers aimed at particular audiences, is viewed by literacy educators and scholars as important to reading comprehension.

The advent of many-to-many media technologies complicates our repertoire of registers. As Shirky notes, many private conversations are now publicly available on the Internet, making for some public consternation; much of the "Internet drivel" derided by the high-minded is actually simply speech in a private register. (Shirky pp 85-88)

Shirky, for one, does not think the public register developed under the mass media regime will disappear because of new technology. Like Castells, he argues that power and resources accumulated under one system do not simply redistribute evenly under a new system; rather, they tend to remain vested in those who held them in the previous regime. (Castells 2000) Attention is one such resource. "Fame," writes Shirky,

is simply an imbalance between inbound and outbound attention, more arrows pointing in than out.... The famous are different than you and me, because they cannot return or even acknowledge the attention that they get, and technology cannot change that. (pp 91-94)

The more people paying attention to one famous person's communications, the more that famous person must ignore incoming communications simply because of the sheer volume of time and cognition involved in reciprocating. "[I]gnore," writes Shirky, "becomes the default choice." (p 93) This argument seems to invalidate Benkler's insistence that

The Internet allows individuals to abandon the idea of the public sphere as primarily constructed of finished statements uttered by a small set of actors socially understood to be 'the media'... and separated from society, and to move toward a set of social practices that see individuals as participating in a debate. (Benkler, p 180)

In Shirky's view, "finished statements" will still be with us in spite of new technologies; there is simply no responding to attention-endowed, famous people in a way that will elicit their response.

Only given a complete cultural tabula rasa would the Internet have developed into the democratic utopia so many early writers hoped it would be. But of course, such conditions do not exist. What we get instead is power structures embedded in the culture of the universities and technological labs in which the Internet was developed, and the institutions and companies by which it is still managed. Acculturation in these settings gives certain users a head start, while others' written presence online marks them as coming from "the wrong side of the tracks." ICANN and other organizations regulate the names and addresses at the core of Internet functionality, and those readers and writers who know the literacy practices of these organizations are the ones who have the easiest time finding their way around, giving directions to resources, identifying themselves and others, and making the technology efficiently do their bidding. Users who understand the changes in public and private registers wrought by the Internet are able to effectively protect their private information, while those who assume that old private and public registers persist expose themselves to potential exploitation.

Machine writing

While Street's work provides support for describing what is going on, evoking how literacies are being constructed, and understanding how this relates to identity and social status, Latour's work gives a base for describing how different parties construct and defend their literacy practices by drawing on machines.

In Science in Action, Laboratory Life, and We Have Never Been Modern, Latour picks apart the ways in which scientific, journalistic, and political texts refer to each other in order to make the strongest claim possible; how they make use of unopened, unquestioned "black boxes" of facts which have already been determined in order to bolster their position; how they attempt to make new facts by mustering some texts or evidence in the face of others; how they lose credence when other texts and people cease to refer to them. Depending on how these processes go, he says, a given argument or idea "will be incorporated into tacit knowledge with no mark of its having been produced by anyone, or it will be opened up and many specific conditions of production will be added." (Science in Action p 43)

Latour has by now applied this model well beyond the scientific papers with which he started. His models of authorship and readership are easily translated to other spheres, including the Internet. The latter could be applied in one of a few ways. Applying this model directly, one could look at the ways in which people argued for the priority of particular web pages, and how they supported their arguments with appeals to specific technologies.

Further, and more innovatively, one could take Latour's networked model of meaning as a model of how Google's PageRank works: texts' references to one another increase their credence and use by others as pages are listed higher and higher in the search rankings for given terms. As I will demonstrate, this process appears to contribute directly not to establishing control and clarity, but to confusion, in these comment threads.
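The ranking logic alluded to above can be illustrated with a minimal sketch. It follows the simplified, published formulation of PageRank (a page's score is built from the scores of the pages linking to it), not Google's production system, which is not described in this chapter. The page names, the damping value of 0.85, and the iteration count are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to.
    Page names and parameter values are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Every page starts with an equal share of credence.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score along to the pages it refers to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: several pages refer to a blog post,
# and the blog post refers to a celebrity's site.
toy_web = {
    "blog_post": ["celebrity_site"],
    "fan_page": ["blog_post", "celebrity_site"],
    "forum_thread": ["blog_post"],
    "celebrity_site": [],
}
ranks = pagerank(toy_web)
```

In this toy graph the pages that are referred to by others end up with higher scores than the pages that only refer outward, which is the sense in which reference by other texts increases a text's standing.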

Latour sees machines and instruments as key to producing texts in a scientific context. Scientific instruments such as physiographs, seismographs, EKGs and so on "transform a material substance into a figure or diagram which is directly usable by" human readers. (Laboratory Life, p 51) While Latour tirelessly advocates for recognition of machines and other inanimate objects as actors in communicative exchanges, his is not a strict technological-determinist position, in the way that McLuhan often gets read. Rather, Latour routinely emphasizes how reliant on human engineering, and then on human explanation, are the texts produced by machines. These texts are not merely objective scientific fact about their raw materials; in order for them to be treated as facts by the community handling them, the community must come to consensus over the "correct" interpretation and meaning of the texts. (see eg Science in Action p 67-69; Laboratory Life p 50; We Have Never Been Modern p 23-24) And, as he describes it, this consensus is a long, slow process of recruiting allies and making counter-claims. Latour ultimately sees technologies as carrying both explicit and implicit messages with them. The interpretation of these messages is ultimately up to the audience, in context.

Are all interpretations of a medium's message fair game, then? Latour says some interpretations may be favored over others. Some writing machines, he says, are black boxes, and arguing against them "is costly" -- the person contesting the facts the machine produces must be able to muster alternative facts by purchasing and deploying other machines. (Science in Action p 69-70) It will be worth tracing the use of machines in these comment threads to justify literacy. Attention should be paid both to bloggers', readers', and strangers' treatment of the technologies involved and to their treatment of the things they write about as factual or questionable. If the author of a blog also works at Google, or a multimillion-dollar celebrity has his or her own web portal with a number of staff maintaining it, how does this imbalance of resources and access affect the claims and requests made by Internet searchers who find their way to random comment boxes, seeking to discuss issues they care about? How does familiarity with the workings of writing machines like search engines and blog software affect bloggers' defenses of their ways of reading on the Internet? How does unfamiliarity with these machines affect commenters' defenses of their own ways of reading?

On the Internet, search engines now write out their own interpretations for us on the nature of different texts and their importance. This is revolutionary in Latour's world of writing machines: while Boyle's vacuum pump, an example Latour gives, was only qualified to comment on signs from the natural world, the search engine comments on other textual knowledge, producing ranked lists which are sometimes millions of entries long. Machines are writers now more than ever before.

Both blogs and search engines take the raw material of databases and produce a variety of kinds of pages (blog indexes, front pages, RSS feeds; search results, map pages, image results, etc) which can be used by people in different forms. Without this transformation into other texts, the raw material of a database (not to mention the 1s and 0s making up the code underlying it) would be overwhelming to the reader. Of course, the output of a search engine still requires the reader to add input, and then to interpret what the list means, just like the other machines Latour writes about.

Regardless, to consider these comment threads without considering how search engines and blog software affect their creation -- or considering the literacy practices of those who write this software -- would miss a major component of the interaction on these threads. And this goes back to something Street writes: any study of literacy must go beyond the site where the "illiterate" or other reader is interacting with a text:

"Ethnographers can no longer simply arrive in an 'isolated' village and study only local practice: ethnographers in the contemporary world, as they study local literacies, will require some knowledge of the central literacy tradition in the country studied, including both folk traditions and the immediate cultural background of those who write the modern texts for village children. There is scope here for a fruitful combination of literary critical and anthropological approaches." (Street p 53)

It will be important in this study to follow what and how computers are writing, who has set them up to write as they do, and how these writings are interpreted. How are computers being referred to, and how is their output being used? How does this compare to the use of earlier forms of literacy (books, letters, television) as a support for generating reliable evidence?

Aarseth agrees with Latour that in many situations humans and machines work together to produce texts. He calls the product of this partnership "cyborg literature," saying "it is therefore in need of a criticism and terminology with less clear-cut boundaries between human and machine, creative and automatic, interested and disinterested." (Aarseth p 134) Aarseth helpfully lays out three possible ways in which humans and machines may work together:

"(1) preprocessing, in which the machine is programmed, configured, and loaded by the human; (2) coprocessing, in which the machine and the human produce text in tandem; and (3) postprocessing, in which the human selects some of the machine's effusions and excludes others. These positions often operate together..." (Aarseth, p 135)

The ways we are likely to encounter these interactions when looking at the social life of software include preprocessing, when looking at how search engines and blog software are developed; coprocessing, when bloggers and commenters write on blogs; and postprocessing, when those navigating the Internet click through links, or when bloggers select pages to link to, which alters these pages' PageRank.

Some Internet users, particularly those involved in the creation of software, are more aware of the cyborg nature of Internet texts than other users. As a result, they pay closer attention to machine writing elements when interpreting a text. Thus, discussion of how machines are involved in writing the texts here -- and in turn, how people are involved in writing those machines -- will be important to the project at hand.

Aarseth also offers some thoughts on reading which resonate with those of the New Literacies scholars. Aarseth's chapter on cyborg literature has to contend with the fact that a lot of computer-generated dialogue and prose is nonsensical, from most human points of view. It often does not make syntactic or semantic sense. Despite this, "the naive human participants... in these 'conversations' are capable of projecting sentience, even intelligence, onto their mechanical partners." (Aarseth p 130) Aarseth does not consider that readers of these cyborg texts may be seeking not just basic human intelligence, but also the social clues which they can use as means of orienting themselves to the purpose, tone, and genre of a text. And in reading on the Internet, that futile struggle to make sense of cyborg patois may be a regular occurrence as readers fail to comprehend search results or blogs.


"The starting premise is that interpreting the significance of action is an essentially collaborative achievement. Rather than depend on reliable recognition of intent, mutual intelligibility turns on the availability of communicative resources to detect, remedy, and at times even exploit the inevitable uncertainties of action's significance." (Suchman, p 86)

Like Lucy Suchman's analysis of the situated actions of Xerox copier users, this dissertation makes use of conversation analysis and ethnomethodology. These two analytical traditions work under the assumption that, rather than referring to abstractions of social order or cognitive plans developed in advance, participants in a given interaction cause meaning and action to cohere locally by referring to people, things, and words in their immediate environs. This understanding of social order has the advantage of making it possible both to identify the effect of a machine in the interaction (technological determinism) and to make empirical observations about the ways in which participants are working to create social order (social constructivism).

The primary analysis undertaken for this study was micro-level, as is most ethnomethodological and conversational analysis. Functional linguistic analysis aimed to understand how these commenters understood to whom they were speaking, through what channel, in what order; it includes a reckoning with the sociological field of conversational analysis, as well as work with categories laid out by Roman Jakobson.

To identify the most fruitful parts of the corpus for such close attention, however, grounded thematic analysis was undertaken first. This method of finding themes that arise naturally from the data helped to identify which functions of speech (for example, making sure the "channel" was working or determining who was participating in the conversation) were most problematic. In addition, grounded thematic analysis turned up references to powerful actors (AOL, Google, and ICANN, for example) who helped shape literacy practices, and also gave a picture of the ways blog readers and strangers positioned themselves and each other as readers and writers. Some descriptive statistical analysis was also run on information provided by commenters on their age, sex, and geographic location (however incomplete this might be), to sharpen the picture of who natives and strangers were and how accurate their claims about each other were.1

I used atlas.TI's qualitative software to code, search, and develop themes. Additionally, IBM's ManyEyes online applications proved very helpful in developing visualizations of common words and themes. Visualizations from ManyEyes, particularly the "word tree" feature, will appear throughout this paper. For visualizations of turn-taking, I extracted data from atlas.TI and developed graphs using GraphViz.
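As a rough illustration of how turn-taking data can be handed to GraphViz, the sketch below turns a list of turns into a graph description in GraphViz's DOT language. It is not the extraction actually used in this study: the tuple layout, the turn identifiers, and the author labels are assumptions, and the export format of the coding software is not reproduced here.

```python
def thread_to_dot(turns):
    """Build a DOT description of who responds to whom in a thread.

    turns is a list of (turn_id, author, replies_to) tuples, where
    replies_to is the id of the earlier turn being answered, or None
    for the original post. This layout is a hypothetical one.
    """
    lines = ["digraph thread {", "  rankdir=BT;"]
    # One node per turn, labelled with its id and author.
    for turn_id, author, _ in turns:
        lines.append(f'  "{turn_id}" [label="{turn_id}: {author}"];')
    # One edge per response, pointing back at the turn it selects.
    for turn_id, _, replies_to in turns:
        if replies_to is not None:
            lines.append(f'  "{turn_id}" -> "{replies_to}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-turn thread: a post, a comment, and a reply.
example_turns = [
    ("post", "blogger", None),
    ("c1", "stranger", "post"),
    ("c2", "blogger", "c1"),
]
dot_source = thread_to_dot(example_turns)
```

The resulting text can be saved to a file and rendered with the GraphViz `dot` command-line tool.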

Finally, some affordance analysis of the blog software and search engines implicated in these misunderstandings was necessary in order to understand the participation of machines in these conversations and the resources they provided for making sense of the local situation. This analysis is presented in Chapter Four, Machines In Written Conversation.

This study was completed with publicly available documents; no observation of these Internet users was undertaken, no surveys issued.2 A text analysis such as this one will obviously not yield a complete picture of why and how people arrive at a particular blog and comment. It will not, for example, make it clear what parts of a website the visitor looks at before they comment, or how long they spend on the page; that is a study for the likes of Jakob Nielsen, who has fruitfully been using eye-tracking software to answer that kind of question. (Nielsen 2006)

Similarly, a text analysis will not give a clear picture of the searches which brought strangers to these pages to begin with. The project at hand did begin as a project to study how "strangers" were using search engines, with the idea that some in-person observation would be necessary. However, when it became clear that a great deal of information was available from the blog comments alone, the study's focus was narrowed to a textual analysis of the threads.

This study will not generally consider strangers' motivation to search, whether they feel their question was answered once they have found a page, or what other resources they might use. These are questions posed in the field of library and information science, and that field is still evolving its own methods for answering those questions. Nor will it make use of in-classroom or controlled-experiment studies on literacy and literacy instruction which are sometimes performed in schools of education. While I may make use of these other studies' findings to understand what is going on here, I will not make use of their methods myself.

Despite the limitations of textual analysis, there is a great deal which can be learned from it. Some of this information gives clues about the user's understanding of what page they are on, where users came from, and how they find pages on the Internet. And, as suggested by the New Literacies scholars, it will paint a picture of how blog readers and other visitors position themselves and each other as readers and writers, with indications of power relationships.

1 I had initially planned to also obtain referral logs for the threads used by contacting the blogs' authors, and working with them in order to set up tracking as necessary. Unfortunately, no bloggers managed to return referral logs, and due to technical issues I was not even able to retrieve them for the one thread in the corpus which came from my own blog.

2 This was not for lack of trying. During the pilot phase of this study, I did make attempts to contact those who left email addresses on the blogs in the corpus, in an attempt to administer surveys or face to face observations. I received no replies. This is probably attributable in part to the fact that anyone who puts their email address up online is subject to the whims of spammers, who send out bits of code called crawlers or spiders to harvest email addresses and send them back to the spammers. Due to the age of many of the comments from which I collected email addresses, the spam which those addresses were probably subject to, and the fact that my own messages were sent out in bulk, in a way that may have triggered spam filters themselves, I would not be surprised to learn that none of these emails got through. I did not, however, make further attempts to contact commenters using the phone numbers or home addresses which a number of them left online. By the time the email tactic proved unworkable, I was in the process of writing IRB forms and changing my methodology, and I deemed it better to forgo methods beyond textual analysis of publicly available documents.

Functional linguistic analysis: conversational analysis, Jakobson and metalanguage

For a few reasons, conversational analysis was not the only form of linguistic analysis employed in this study. Suchman uses conversational analysis to understand the construction of mutual intelligibility between users and the machine, but she notes that she does not discuss "the wealth of prosodic and gestural cues" that have been turned up by analysts of conversation because "the case of human-machine interaction is so limited that the basic resources, let alone the expressive subtleties, of human interaction are in question." (p 86)

It is worth noting, too, that in their Simplest Systematics, Sacks, Schegloff, and Jefferson make it clear that there is a good deal about conversation that their model does not cover. Specifically, they state that in order to be generalizable, their model needed to be both insensitive to and flexible enough to fit a broad range of contexts. (Sacks et al, 1973?)

The analysis at hand will look both at human-machine and human-(machine mediated-)human interaction. Because its subject is a series of texts, unsupported by any other record of human behavior, looking at turn taking alone would limit an understanding of what is going on here. As a result, it is worth opening up some of the linguistic boxes which Suchman, Sacks et al left closed.

What is in these other boxes? Roman Jakobson delineated other elements of language in his essay on metalanguage. Jakobson's model provided a useful rubric for understanding the other arenas of disagreement in the dataset. Themes arising from grounded theoretical analysis indicated that attention to context would be important; specifically, indexical references about the addresser and addressee and references to the website at hand were highly contested between natives and strangers. Additionally, since one of the major claims made by natives was that strangers were "illiterate" and since the writing of others was the major resource used by all parties here, poetic and metalingual commentary was also abundant in the dataset, so I will discuss that as well (though mostly as it relates to questions of channel-appropriate use, or genre).

Both conversational analysis and ethnomethodology hold that the rules followed by participants in a given interaction are most visible when someone steps out of line, when order is disrupted, and when participants publicly correct these errors. Conversational analysis observes that order is maintained in conversation through local, in-time management of turn-taking. Goodwin expanded on how this consists of pauses, direction of gaze by the speaker, gestures, proximity, and other cues. Sacks et al describe how turn-taking follows patterns which, while not consciously available to participants, act as rules in most face to face conversations. The moments when these patterns are disrupted thus give the best indication of participants' understanding of the proceedings.

It could be said that Jakobson's model can encompass Goodwin's model of conversational management and Sacks, Schegloff, and Jefferson's simplest systematics of turn-taking; Jakobson's is a broader way of describing communication, acknowledging content and style as well as the mechanics of maintaining contact between people:

Image:Jakobson metalanguage model.png (CITE)

The contact or phatic elements of communication have much to do with the maintenance of attention which Goodwin writes about. Goodwin also describes how conversational participants determine addressers and addressees through the cues he describes. The referential or context element of Jakobson's model is also implicated in indexical references to "I" and "you," among other things, and this is also addressed by Goodwin.

The disruptions found in my grounded theoretical analysis of the data were frequently indexical (deictic). This echoes Varenne's emphasis on deictics as important to ethnographic work in general:

The relation of person to utterance is not, as such, a linguistic event, though it is one that all human languages mark.[...] since languages vary in the way they mark this relation, it makes very good sense to argue, as Silverstein does (1976) that attention to deictic structuring may be a privileged route to 'culture' [...] With some exceptions, a word like 'mother' -- particularly when used in address and in direct speech -- does not refer to a substantive quality of the addressee; it refers to the (social) relationship of the speaker to this addressee.
(Varenne 1984 p 222)

Because all signs pointed to deixis, I relied on Jakobson's definition of indexicality, and coded all comments for their use of first, second, and third-person indexical modes of address. Deictics or indexicals have been referred to by New Literacies scholar Don Leu and others; I will address this angle on this particular part of speech later.

The quintessential example of the errors presented here -- the Maury Povich example, which sparked conversation on MetaFilter about similar misunderstandings, and came to represent the phenomenon to many readers -- is at bottom an indexical misunderstanding. The blogger in effect writes "I, blogger; he, Maury Povich." Some subsequent commenters conflate these two positions, taking some cue from the Internet to believe their orientation to the conversation should be "you, Maury Povich." The distinction is one discussed by Varenne (1984): the third person (he, Maury) is not by necessity available to the conversation at hand, while the second person (you, Maury) assumes a party who is available to the conversation. So an indexical approach to this corpus seemed highly fruitful. Accordingly, I coded who was being addressed.

Another of Jakobson's functions -- the phatic function -- suggested itself as a useful focus. Examples of questions about and misunderstandings of channel were numerous in the corpus. I coded for the following phatic issues as they arose in the text:

  • expresses doubt about reliability of channel
  • is this the correct channel?
  • this is the correct channel
  • this is not the correct channel
  • this channel (the website)
  • what is this channel

Additionally, I coded for two more attitudes toward the text -- "if you are reading this" and "[x] is not reading this" -- which again are channel questions, but also enter into questions about addresser/addressee.
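The coding scheme listed above can be pictured as a small tally, sketched below. The code labels are taken from the list and from the two additional attitudes; the comment identifiers and the assignments of codes to comments are invented for illustration and are not data from the corpus. A comment may carry several codes at once.

```python
from collections import Counter

# Code labels as given in the text above.
PHATIC_CODES = {
    "expresses doubt about reliability of channel",
    "is this the correct channel?",
    "this is the correct channel",
    "this is not the correct channel",
    "this channel (the website)",
    "what is this channel",
    "if you are reading this",
    "[x] is not reading this",
}


def tally_codes(coded_comments):
    """Count how many comments carry each code.

    coded_comments maps a comment id to the codes applied to it.
    Codes are not exclusive, so one comment can count toward several.
    """
    counts = Counter()
    for comment_id, codes in coded_comments.items():
        unknown = set(codes) - PHATIC_CODES
        if unknown:
            raise ValueError(f"unknown code(s) on {comment_id}: {sorted(unknown)}")
        # A code is counted once per comment, even if listed twice.
        counts.update(set(codes))
    return counts


# Hypothetical coded comments, for illustration only.
sample = {
    "comment_01": ["is this the correct channel?", "if you are reading this"],
    "comment_02": ["this is not the correct channel"],
    "comment_03": ["if you are reading this"],
}
counts = tally_codes(sample)
```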

It is important of course to remember Jakobson's admonition at the beginning of his essay on metalanguage:

Although we distinguish six basic aspects of language, we could, however, hardly find verbal messages that would fulfill only one function. The diversity lies not in a monopoly of some one of these several functions but in their different hierarchical order.
(Jakobson Metalanguage CITE)

Thus, while I coded for messages focused on phatic and indexical issues, these codes were not exclusive and did not indicate the absence of other functions of language. I did also attend to metalingual code, context, poesis, and questions about the addresser and addressee; however, I did not usually code these as such. To look at context seemed indistinguishable from grounded theoretical analysis; it was very much a question of content themes arising from comments.

Metalingual or code analysis, meanwhile, was very similar to analyses I might make about literacy -- the ways commenters wrote, what they said to each other about that writing, and overt references they made to other forms of literacy. So comments in those domains were coded following grounded theory and literacy themes, alongside coding of phatic references and indexical shifts.

Phenomenon to be investigated (detailed)

As has been stated earlier, this study's focus on disruptions arises from observations by both conversational analysis and ethnomethodology that correction of errors in everyday behaviors can reveal unspoken rules being followed by participants. (Goodwin 1981, p 71n, Garfinkel CITE) Because order is maintained locally, and not by reference to abstract social structures, it is important to pay attention to utterances which bloggers and commenters identify as mistakes, not just ones identified by the researcher.

In this study, I included threads which contained one of two kinds of indications of an error.

The first, and most common, was a comment from the blogger or from a reader that someone was not participating in the site correctly. The fundamental shape of these exchanges was as follows:

Article ... <--response, in error(*x) ... <--correction of error(*y)

The ellipses are meant to indicate that any number of non-erroneous responses could elapse before the response-in-error or correction-of-error turns occurred. *x and *y indicate that the errors and corrections could happen multiple times each; they often do occur multiple times, interwoven with each other. And finally, the <-- arrow is a reminder that the pattern of these exchanges is listener-selects-previous, not speaker-selects-next.
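The shape of these exchanges can be sketched as a small data structure in which every turn points back at the earlier turn it selects. The sketch is only a reading of the pattern above: the field names, the four turn kinds, and the example thread are assumptions made for illustration, and deciding which turns count as errors or corrections remains an interpretive judgment rather than something the code determines.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Turn:
    turn_id: int            # position of the turn in the thread
    kind: str               # "article", "response", "error", or "correction"
    selects: Optional[int]  # id of the earlier turn answered; None for the article


def error_correction_pairs(thread: List[Turn]) -> List[Tuple[int, int]]:
    """Pair each correction with the earlier error it points back to."""
    by_id = {turn.turn_id: turn for turn in thread}
    pairs = []
    for turn in thread:
        if turn.kind != "correction" or turn.selects is None:
            continue
        target = by_id.get(turn.selects)
        # Listener-selects-previous: a correction can only select an earlier turn.
        if target is not None and target.kind == "error" and target.turn_id < turn.turn_id:
            pairs.append((target.turn_id, turn.turn_id))
    return pairs


# Hypothetical thread: errors and corrections interwoven with an ordinary response.
example_thread = [
    Turn(0, "article", None),
    Turn(1, "response", 0),
    Turn(2, "error", 0),
    Turn(3, "correction", 2),
    Turn(4, "error", 0),
    Turn(5, "correction", 4),
    Turn(6, "correction", 2),
]
pairs = error_correction_pairs(example_thread)
```

Here the error in turn 2 is corrected twice (turns 3 and 6), which mirrors the note that errors and corrections can each occur multiple times.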

Here are some illustrations. First, one of the most basic exchanges in the corpus, a three-turn-long example from the blog EducateDeviate. The blogger's post began thus:

Image:Simple error example.jpg

The comments began with one from a reader, followed shortly thereafter by a correction from the blogger:

Image:Simple error example 2.jpg

Subsequent responses on this thread (there were three more) did not address any errors, so this was the only indication that the blogger thought something was amiss. (Note that her tone towards the error was positive and constructive; this makes it unlike blogger and reader comments from most of the threads in the corpus.)

Most exchanges about errors on a blog were much longer, and included many errors, many corrections, parodies and jokes about the errors from bloggers and readers, and even claims from strangers that errors were not actually erroneous. The following thread excerpt appears on a post which began simply with a link to a report on the Bill and Melinda Gates Foundation and a quote about the foundation's success. Deane is the blogger who made the initial post.

Image:Long error example 2.jpg

The exchange above is typical: a commenter posts a comment requesting assistance, offering a partnership, trying to contact a celebrity, or making some other kind of statement which the blogger thinks is out of line. The blogger then identifies the reason s/he thinks the comment is erroneous, attempts to re-direct the commenter to the correct outlet for the comment, expresses frustration at the comments, hypothesizes why this is happening, or mentions another place where s/he has seen this happening. Readers of the blog may parody the "clueless" out-of-line commenters (as Boyink does above), continue each other's jokes (as does the second comment attributed to Michael Zaloguin, which, based on tone and style, I am guessing was not posted by the same person), sympathize with the blogger, make their own hypotheses and links to other similar phenomena, etc. These themes are generally repeated many times in the course of a thread.

So in the above sequence, the pattern is:

Article ... <--error <--correction <--error <--correction <--error <--parody correction <--parody response

As you can see, the patterns quickly become complicated. Beyond the basic Article ... <--response, in error(*x) ... <--correction of error(*y) pattern, the shape of a given online conversation of this type is nearly infinitely re-arrangeable, as one might expect to find in any close analysis of speech.

The second situation which qualified a comment thread for inclusion was the blogger's indication outside the comment thread that they felt some commenters were acting erroneously. Unlike the threaded responses above, these tended to have idiosyncratic shapes, as bloggers ended up correcting errors in a variety of locations on the page rather than correcting in a comment "turn" which would appear to be a chronological response.

Here are two examples taken from the corpus:

Image:Go away sign example 2.jpg

In the above post about how Santiago Calatrava works, the blogger includes a link titled "Want to contact Calatrava? Click here." The link takes the reader to this page:

Image:Go away sign example 3.jpg

These notes clearly indicate that 1) the blogger emphatically believes that commenters -- like "purvi" above -- are making errors in his blog's comment thread, 2) the threads in particular where he sees this happening are the Bill Murray, Santiago Calatrava, and Mohamed al-Fayed threads (all of which are thus included in my analysis), and 3) the blogger wishes to correct the errors commenters are making. This is in contrast to trackbacks, which comment outside of the original page and do not aim to correct an error.

Here is another example of a correction on the original thread but outside of the comments:

Image:Go away sign example.jpg

The red-and-yellow banner above, with the disclaimer beneath it, appears on a post in which the blogger relates his own story about cancelling eFax service; a number of commenters subsequently demanded that he cancel their service (as if he worked for the company). This is one of the clearest and most explicit examples of this kind of sign from a blogger about the "errors" he sees on his site: it specifically sets the topic of the thread and warns, "If your comment shows that you don't understand this, I will mock you in public."

So, even though the Calatrava thread and some others did not include any indication of an error posted within the thread itself, they were included in the corpus because of signs like these indicating that the blogger was actively attempting to correct a perceived error.

When I solicited more data for my corpus on MetaFilter and the AoIR email list, I received suggestions of a number of possible candidates for the corpus. Some of these are included in the corpus. But a handful are not included here, because within the comment threads themselves, no blogger or commenter had indicated that anything was amiss. Outside viewers and this researcher might have identified that something was wrong, but it was "harder to put your finger on." There was simply less data to indicate what the problem was; there was no third turn with a correction.

Frequently, in the threads which I did not include, the blogger and blog readers were identifiably employing a different sense of indexicality than "strangers" were. A good example of a thread I did not include in my analysis appeared on my own site. I had posted a rambling personal journal entry in which I mentioned the name of the actor Ashton Kutcher in this context:

[...]I still get mail from a writer for Glamour who periodically calls out for people to participate in her stories. This probably stomps on some sort of copyright issue, but here's this week's pitch; I thought you'd enjoy the further dive into stupidity. (I keep wanting to try to get Wade involved in one of her stories, because he's so intractable and would probably cause a ruckus, but I think we tried once and it didn't work. Maybe his hair isn't "floppy" enough. Oh, lord, all that suggests to me is more Ashton Kutchers. People, that haircut doesn't work for everyone, especially those with thick necks... but I digress.)[...]

In my sentence, the indexicality of the words "more Ashton Kutchers" was something along the lines of "when I think of [the abstract concept of] 'floppy' hair, I think about [, in abstraction, the actor] Ashton Kutcher." Ashton Kutcher was not necessarily assumed, in the context of the sentence, to be available as a participant in the conversation.

Among subsequent comments, this one appeared:

hey Ashton we think you r sooo hott!!! hehe!! [...] we want to tell you are number but we dont know if you are really ashton. but we still love you. We love your show punk'd and we think your a great actor. We want you to talk to us online except for shybaby might not be on that much because her Dad took it away.[...]

In this comment, the indexicality of Ashton Kutcher becomes second-person: Kutcher is addressed as "you." Clearly, the commenters are in doubt as to whether this is the correct form of address, but they press on with this indexical understanding regardless. A couple of other commenters treat "Ashton Kutcher's" indexicality the same way.

To people who have read the entire page, from my journal entry to this comment, this is clearly a mismatch between the blogger's and commenters' understanding of indexicality. I did not establish the page as a place where my own indexicality was "I, Ashton Kutcher." However, there is no guarantee that commenters have read that much (and indeed, there is plenty of evidence that they haven't). And the misunderstanding is never clearly acknowledged on the page by me or by any of my regular readers. I can identify an indexical "error," but there is no correction, yielding less analyzable data on what each side thinks is an appropriate way to behave in this thread. Thus, I decided to focus on the richer data sources. Had I included threads in which there were no error corrections, I would have had at least half again as many threads to analyze.

Another set of data I ended up not including was examples in which the blogger or another reader had identified the error in a trackback to the post or another link from elsewhere. Here is how trackbacks work: when someone links to the thread in question, the blog software automatically registers this. It usually displays, on the post being linked to, an excerpt of the post in which the link appears. Here is an example from a blog called KWC, which I considered for my corpus but did not use:

Image:Trackback example.jpg

Here, under the "TrackBack" heading, we see that the blogger from the original post (about the movie Holes) has posted a note describing the thread, elsewhere on his blog. In this note, he does indicate that he thought the commenters on the Movie: Holes thread were not using his blog correctly. However, at no other time in the comment thread on the Movie: Holes post does he contradict what the commenters are doing (namely, writing to or about an actor who was in the movie). Thus, I did not include this thread in my analysis.

This decision is a little arbitrary. TrackBacks often do appear on the page, so they could easily be considered a part of the discussion and part of the error correction. However, TrackBacks are not usually a deliberate participation in the life of a page, as they are posted elsewhere and do not automatically appear on every comment thread (depending on the blogging software); they do not always appear chronologically in the thread, making an understanding of turn-taking difficult; their indexicality is generally different, as the writer is not addressing the error-making party directly; and in general, they are pretty much ignored by participants in the thread where the supposed errors are taking place. Thus, I did not include threads in which the only indication of an error came from a trackback from elsewhere. Likewise, I did not include threads which did not have reckoning of an error within them, but which were referred to me by a reader, blogger, or a link discovered on another page. These instances also did not provide enough concrete, analyzable data to be useful to this study.

Some time into my data collection period, a friend and fellow blogger pointed out to me that many technology-related websites receive requests of a variety which regular readers have classified as "plz send me teh codez thx." In these threads, strangers are usually posting to a discussion about a particular software, programming language, or operating system. They ask to be sent a piece of code, ostensibly to complete a programming project for work or school. The regular readers object on a few grounds: one, this is not considered the right forum for such a request; two, it is considered cheating. Oftentimes the regular readers also make derogatory comments about the language used in making the request (hence "teh codez"), sometimes linking this "bastardized English" to India, whence the requests for code are assumed to come.

I did not include "plz send me teh codez"-themed threads for a few reasons. First of all, unlike most of the threads in the corpus, the codez-seeking strangers are coming from a background which is hard to distinguish from that of the regular readers: both are programmers or aspiring programmers, regardless of any aspersions cast about the technology sector in India. Second, it is pretty hard for an outsider to confirm why the regular readers are up in arms about the perceived mistake. The subject of the discussion is usually esoteric, and the thread is jargon-laden. Most of the requests made by strangers in threads I included in the corpus are accessible to anyone living in contemporary Western society with at least a sixth-grade education: popular media, celebrities, shopping, basic Internet skills, job applications, information about insects. With a more esoteric topic, it is to be expected that strangers might misunderstand the proper approach -- and among those misunderstanding would be this researcher, who did not trust her own aptitude for understanding the nature of the slights being committed on programming threads.

Leaving these threads aside, I turned my attention instead to threads of a more general, popularly accessible nature.

The corpus

The collection of documents studied was bounded by the following criteria:

  1. the comment thread must be on a blog, not a forum, social networking site, listserv archive, guestbook, etc.;
  2. the researcher AND bloggers or their readers have identified at least one commenter as "doing something wrong;"
    1. This identification of an error has been made within the page itself, not just by linkage from elsewhere, using a trackback, or by personal communication to the researcher;
    2. The identification of an error is made through a sign posted by the blogger on the page, the blogger or readers commenting in the thread, or both.
  3. the topic is within the realm of popular knowledge and not overly technical.
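Read together, these criteria amount to a conjunction of conditions on each candidate thread. The sketch below is one hypothetical way to express them; the `Thread` record and its field names are assumptions made for illustration, and in practice each judgment was made by hand rather than computed.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    # All field names are hypothetical, chosen to mirror the criteria above.
    venue: str                    # e.g. "blog", "forum", "guestbook"
    researcher_sees_error: bool   # the researcher identified an error
    sign_on_page: bool            # blogger posted a corrective sign on the page
    correction_in_comments: bool  # blogger or readers corrected within the thread
    trackback_only: bool          # error noted only via trackback or referral
    popular_topic: bool           # topic is not overly technical


def qualifies(t: Thread) -> bool:
    """Apply criteria 1-3: a blog thread, an error identified on the page
    itself by natives as well as the researcher, and a popular topic."""
    flagged_on_page = t.sign_on_page or t.correction_in_comments
    return (
        t.venue == "blog"
        and t.researcher_sees_error
        and flagged_on_page      # trackback_only alone never satisfies this
        and t.popular_topic
    )
```

Under this reading, a thread like the Calatrava one qualifies through the sign alone, while a thread flagged only in a trackback does not.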

Comment threads to study were gathered through a process of referral from other bloggers and Internet users. A number of the pages were culled from three original threads on the community portal MetaFilter, where MeFi readers discussed similar misunderstandings and added others they had found. These were the Tuesdays with Maury, Jeremy Jordan Loves Demon Dogs, and How Hawkish threads. Generally, I followed any link out from these pages which led to other discussions of this phenomenon, finding new threads in the process -- a sort of snowball sampling.

In addition, once I began the data-gathering process I put out requests for additional threads like these. My requests went out on the Association of Internet Researchers (AIR-L) mailing list (yielding only one response); on the site where I showcase threads like these, (where return readers provided new links with some regularity); and on MetaFilter itself, where I started an Ask MetaFilter thread. The visibility of the Ask MetaFilter thread within the MeFi community was heightened when MeFi editors Jessamyn West and Matt Haughey publicized it in a podcast. I also followed up with a few MetaFilter readers who commented on the original three threads, asking them for URLs they had mentioned but not linked to. Finally, I gathered a few examples of misunderstandings from casual conversations with others interested in the topic, many of whom were bloggers themselves (as I have developed the habit of asking any blogger I meet whether this has happened on their blog). 1

In sum, this corpus included 39 comment threads. These contained a total of 3,572 unique comments. The number of comments on each thread ranged from two on the shortest thread through 713 on the longest. The topics of the threads were of a general nature (what Jim Gee has identified as within "vernacular" or "everyday" knowledge (Gee 2004)). Comments from strangers can generally be categorized into about a half dozen topics, some with subtopics:

  1. Attempt to contact a celebrity
    1. Expression of appreciation
    2. Request to meet celebrity
    3. Request for assistance
    4. Business proposition
  2. Attempt to get on a television show or in a movie
  3. Request for assistance with technology or cancelling an account
  4. Job/scholarship search
  5. Shopping, sales, purchases (hoppity hop, TV show promoted items, new barbie dolls, wedding dress)
  6. Folklore (riddles and chain letters)
  7. Other general information (insects, speech impediment)

See Appendix B for a discussion of comments which were not considered.

The timeline below depicts the specific topics of the threads (by color), along with the temporal duration (though not the textual length) of these comment threads, from the date the blogger first posted the article through the last comment written. Some of these comment threads are still open, so the last comments here often reflect when I began gathering data for this study, not when the discussion ended or the blogger closed comments. The bulk of these threads were gathered in May and June of 2008. I later discovered that I had gathered incomplete data from some threads, so I had to go back and download them in their entirety again. Additionally, at a few points ATLAS.ti corrupted the data, and threads had to be re-imported and coding started over.

1 In my initial proposal I also wrote that I planned to try out a few more automated ways of finding these mistaken comment threads. I experimented with search strings entered into Google which yielded high numbers of contested comment threads on particular topics. These search strings consist of a phrase common to many misunderstanding posts (the best yield came from "your biggest fan", in quotes; "watch your show everyday" would probably also have worked well, as I will explain in the section on literacy words) as well as the string Movabletype or Wordpress (as these are names of blog software and usually appear on blog pages) and some modifiers designed to eliminate results in which "your biggest fan" shows up as a song lyric.

This was actually quite successful; I have posted a number of the threads I found this way to, the blog I have been curating with examples similar to those in my corpus. The Gumbaby blog also received suggestions of similar threads from regular readers.

However, I decided not to include the Google-found or recommended threads for a few reasons. First, they often did not fit the criterion of the third conversational turn: no native returned to say that strangers were making an "error." Second, because of the phrases I was using to search, it was likely that the threads would have skewed heavily towards seeking contact with celebrities or TV shows, which I felt might further unbalance a dataset which was already heavy in that direction. Finally, the corpus was plenty large without potentially dozens more threads. The corpus reached a critical mass of new threads towards the end of January 2009.

The Google search method might have helped balance the selection bias of starting my snowball sample on MetaFilter, however. This is an intriguing and promising method which might be worth a follow-up study, with a slightly different focus, in the future.
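For readers who want to try the search method described in this note, the strings could be assembled along the following lines. This is a rough sketch, not a record of the exact queries used: the `OR` grouping and the `-lyrics` exclusion are assumptions standing in for "the string Movabletype or Wordpress" and the unspecified "modifiers" mentioned above, and the function name is hypothetical.

```python
from typing import Sequence


def build_query(
    phrase: str,
    platforms: Sequence[str] = ("Movabletype", "Wordpress"),
    exclude: Sequence[str] = ("lyrics",),  # assumed modifier, not from the study
) -> str:
    """Combine a quoted phrase, a blog-software term, and exclusion terms
    into a single search string."""
    parts = [f'"{phrase}"']                            # exact-phrase match
    parts.append("(" + " OR ".join(platforms) + ")")   # blog-software names
    parts.extend(f"-{term}" for term in exclude)       # drop song-lyric pages
    return " ".join(parts)


query = build_query("your biggest fan")
```

Swapping in another phrase, such as "watch your show everyday", only requires changing the first argument.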

Participants: How they were identified and coded

By the criteria established by Goodwin (1981), comment threads can be defined as conversations. Turn-taking occurs; there are different positions of speaking and hearing which participants may take depending on who has the current turn and what they are writing; speakers and hearers change positions and, while doing so, support or "ratify" each other's roles by how they behave and speak.

But due to the shape of the Internet medium, slight changes must be made to Goodwin's terms in order to understand what is going on with more clarity. Caveats must be added to the traditional understanding of conversational analysis, particularly when it comes to turn-taking. The "speaker" and "hearer" positions defined by Goodwin persist in a comment thread, but they should be conceived of somewhat differently because of the medium's nature. Below, I define the terms to be used in this study.

Speakership is still sequential over time; viewing the thread, only one person holds the floor at a given time. However, it is important to keep in mind that when viewing the thread, the utterances of previous speakers are also available, not just the most recent turn. The reader ("hearer") of a blog is able to return to the post that initiated the conversation, or simultaneously view an unspecified number of other utterances previous to the last turn. Any of these may be an utterance to which s/he responds as hearer.

In the abstract sense, the ever-present nature of earlier comments transforms the position of the hearer dramatically. If there are multiple speakers who might be responded to at any time, this also implies the possibility of multiple hearer positions: each speaker's presence implies a complementary hearer, and each speaker may address a different hearer. In practice, subsequent commenters may also choose to act as hearers to any of a range of previous speakers. The latter turns out to be important, in situ, to the way commenters take turns; I will discuss this later.

Beyond the ways asynchronicity and persistence change the nature of being a hearer, the fact that blogs are located in the very public space of the Internet also has a tremendous impact on the shape of the hearer role. A blogger is not able to specify who is able to act as a hearer to her initial utterance in a blog post (outside of setting a password or robots.txt file to limit access). Theoretically, the hearer could be anyone with Internet access. In practice, of course, who ends up being a hearer is shaped by the speaker's own social networks (through linking and other means of cultivating readership), the usability of blog software, the implementation and use of RSS feeds, the algorithms and storage capacity of search engines, and myriad other factors. The indeterminate nature of an asynchronous readership makes for a kind of Schroedinger's Cat audience; an Internet speaker posting asynchronously is generally unsure if she's got a live hearer or a dead one.

So: Goodwin's participant positions of speakership and hearership are weakened by the medium. Additionally, the medium does not support gaze and position, which Goodwin and others describe as means by which participants indicate their role in a conversation. Because of these limitations, one specific means of organizing conversation specified by Goodwin -- verbal ratification -- is the most important to study here as a classifier of speakership and listenership.

As I coded the data, four mutually-ratified conversation participant positions arose, in two camps. These positions were blogger, blog reader, stranger, and hijacker. The negotiation of misunderstandings shaped these positions. These are, however, not generalizable universal positions like the speaker and hearer positions found in conversational analysis; their genesis is more ethnographic, grounded in this data, than it is linguistic.

The "blogger" position was ratified by human and machine participants. The blogger was always the one to initiate the conversation; the blog software created the page and placed the blogger's opening remark before all other comments. Some blog software also identified the blogger when s/he left a comment in the thread, either visually or simply with the name stamp; some did not. Because of the capabilities of the software, bloggers had the ability to delete or modify the utterances of others (a tactic rather outside the scope of traditional conversational analysis) and in some cases they noted that they did so. Other participants usually did not have this ability (the exception being blogs which were run by a small editorial group). Bloggers were also ratified by some commenters as the person who had set the topic of conversation.

The commenters who ratified the blogger as speaker/topic setter were identified as "blog readers," or "readers" for short. I coded comments which ratified the blogger's topic, when cleaving to the blogger's established orientation to/indexicality of that topic, as comments from "readers." These ratifications were taken to indicate that these participants meant to participate in conversation with the blogger.

Because bloggers and readers were in agreement about how to approach the topic, it seemed useful to have a category to include them both. Thus, I will speak in this paper about "natives" as well, referring to both bloggers and their readers at once. In using the term "natives," I am not referring to the term "digital native" coined by Marc Prensky (2001). I do not mean to imply, as he does, that "natives" are younger and of a different generation than my "strangers" or his "digital immigrants;" that cannot be inferred from my data.

The third group was identified by bloggers and readers. I came to code them as "strangers." They were seen by bloggers and/or readers as committing errors in conversation, and ultimately not participating in the blog's conversation, or "doing the blog wrong." What ratifications and errors looked like is described in greater detail in sections below.

Additionally, I designated some commenters not as strangers but as "hijackers." There were a few threads in which the blogger threatened, pleaded with, or admonished strangers to leave, but eventually gave up trying to enforce the "correct" way to read the site. These included a thread where women tried to sell their wedding dresses (the original post was a funny anecdote about a wedding dress sold on eBay); one where people with lisps sought help (the original post was about the programming language LISP); and a discussion on a fan blog of how to audition for Harry Potter movies (where the blogger insisted she did not actually have any contact with or influence over the casting process). Sometimes bloggers welcomed strangers on hijacked threads to continue their discussion, supported their ongoing dialog, or admonished other natives to leave the strangers alone. In these threads, provided strangers didn't go against the blogger's continued attempts at framing the discussion, strangers were coded as "hijackers." When a thread was hijacked, strangers tended to return to comment more often and address each other more than strangers did in other threads. Hence, when I ran statistics, I ran hijackers separately from other strangers. Hijackers can most simply be understood as error-makers who are tolerated by the blogger.

Goodwin says of his participant categories that they are ambiguous; at times it is not clear whether someone in the vicinity of a conversation is a participant or a nonparticipant. Likewise, it was sometimes not clear whether a commenter was a reader or a stranger. Some comments did not clearly ratify the indexical orientation of bloggers/their readers or that of strangers. These participants I coded as "reader status unclear;" they acted as hearers, but not as the addressors/addressees of either camp, nor were they ratified by either.

Beyond speaker categories analogous to Goodwin's, I also coded apparent commenter demographics when the commenter had provided such information. Where the commenter mentioned it explicitly, age was coded into intervals (earlier ages being coded by American school divisions, e.g. 0-10, 11-14, 15-18, 19-21; then 22-29, and in subsequent ten-year intervals). Where the commenter gave other cues (mentioning grandchildren, school enrollment status, or date ranges, for example), the commenter was coded as adult, child, or senior.
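The age intervals just described can be stated as a small lookup. The sketch below is illustrative only: the function name is hypothetical, and it assumes the "subsequent ten-year intervals" begin at 30-39, which the text implies but does not spell out.

```python
# Intervals named in the text; later ages fall into ten-year bands.
SCHOOL_AND_EARLY_ADULT = [(0, 10), (11, 14), (15, 18), (19, 21), (22, 29)]


def age_interval(age: int) -> str:
    """Map an explicitly stated age to its coding interval."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for low, high in SCHOOL_AND_EARLY_ADULT:
        if low <= age <= high:
            return f"{low}-{high}"
    # Assumed continuation: 30-39, 40-49, and so on.
    low = (age // 10) * 10
    return f"{low}-{low + 9}"
```

A commenter who wrote that she was 17, for instance, would fall in the 15-18 interval, while one who gave only indirect cues would be coded as adult, child, or senior instead and would not pass through this lookup at all.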

Gender was coded where the name, handle, or email address given by the commenter suggested it (for example, "" was coded as female, while "manstraw" was coded as male). Of course, this is only represented gender; this being the Internet, it is generally hard to verify the gender of a poster. The exception was the gender of bloggers; this was triangulated from the "about" pages of the blogs, and is assumed to be more accurate than the gender of strangers and native readers. Gender of commenters was otherwise taken at face value.

Email addresses were the only data I coded using ATLAS.ti's autocode function; I used the regular expression [A-Za-z0-9._%\-+]+@[A-Za-z0-9.\-]+\.[A-Za-z][A-Za-z][A-Za-z]?[A-Za-z]? to find them. Duplicates were removed, as were email addresses provided by commenters which were clearly not their own (i.e. they were offering them as a means to contact someone else, such as pro wrestling stars).
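To show what this regular expression does and does not catch, here is a minimal sketch that applies the same pattern with Python's `re` module and removes duplicates. It is not the autocode procedure itself: the function name, the sample addresses, and the choice to compare addresses case-insensitively are all assumptions made for illustration.

```python
import re
from typing import Iterable, List

# The pattern quoted above, unchanged.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%\-+]+@[A-Za-z0-9.\-]+\.[A-Za-z][A-Za-z][A-Za-z]?[A-Za-z]?"
)


def unique_emails(comments: Iterable[str]) -> List[str]:
    """Collect each address once, in order of first appearance.
    Case-insensitive comparison is an assumption, not from the study."""
    seen = set()
    found = []
    for text in comments:
        for match in EMAIL_RE.findall(text):
            key = match.lower()
            if key not in seen:
                seen.add(key)
                found.append(match)
    return found


# Hypothetical sample comments using reserved example domains.
sample = [
    "write me at fan01@example.com please",
    "FAN01@example.com again",
    "no address here",
    "reach jo.smith+tv@mail.example.org",
]
addresses = unique_emails(sample)
```

Note that the pattern accepts two to four letters after the final dot it can match and is not anchored at the end, so an address under a longer top-level domain would be captured only in truncated form. Removing addresses that commenters offered on someone else's behalf remains a manual judgment that no pattern of this kind can make.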

Commenters sporadically mentioned other information, including street address and nationality; these were also coded. Geographic data was taken only when commenters gave it, and as a result is incomplete; only 350, slightly less than ten percent, of comments included this information. This data was generally not taken from commenters' email addresses or URLs, except in the case of country-code-specific addresses (, for example). It was, however, inferred from phone area codes where possible.

Repeat commenters were noted, to make a rough count of unique participants. Because there is no guarantee commenters did not use others' nicknames (and there were a few clear times when they did), all counts must be taken with a grain of salt.

A common question from newcomers to the corpus is "How do you know these were real? How do you know people weren't putting you on?" It is of course not possible to know with absolute certainty whether these comments came from "trolls," people who participated in these threads solely for the sake of getting a rise out of others. However, extended exposure to the mass of comments made it possible to identify many parodies, jokes, and other misleading comments. I coded some comments which just seemed too outrageous to be true as "reader status unclear," and they are thus left out of many of the analyses in this paper.

A great many natives amused themselves by parodying strangers; sometimes parodies outnumbered serious comments on a given thread, from either natives or strangers, by a large margin. Parody was not always easy to detect, but there were a few identifying features.

The existence of parodies was confirmed by natives, who sometimes noted it on the threads and used server logs and trackbacks to back up this observation:

Welcome MeFi and other people(Oct 19) Oddly, today I had the thought that the page's humour could probably get dugg with a description of "Google not evil?" or some such and as soon as I thought of that, I noticed all the comments were looking more like parodies (though it's hard to spot the difference when the original comments can be rather ... obtuse). I soon realized that MeFi had posted a link to the page causing a slight implosion of our dear service providers.
(OK/Cancel Google thread)

At times, it was difficult to detect whether a commenter was a stranger or a joking native, especially considering that some natives signed their comments with the same name as a stranger who had commented earlier. An example of a parody made of one of the very first Maury comments:

dear maury
My name is tolmattie Ganesh I'm 17 years old I think I'm having a baby. I have lots of problems. Can you give me your adress so I cound write to you.
Posted by: Tolmattie Ganesh on February 27, 2003 10:19 AM
dear maury
sorrry for the flase alarm it turns out im not having a baby be/c i poked mtself witth a cotehanger and put a hole thru my ovarys. i sitll luv ur show, i watch evury day.
thank you, tolmattie
Posted by: Tolmattie Ganesh on May 24, 2004 12:34 PM

Obviously this problematizes the categories of native and stranger; how could I be certain that one was a joke and one was not? In the few cases of this sort, I drew on cues such as patterns in spelling, capitalization, and punctuation (this is no analysis of the Federalist Papers, obviously, but the errors seemed intentionally dense in the second comment); dates separating the comments; and elements of the poetics/context: while the first commenter was distressed and making unproductive choices, the second comment had a deadpan tone not seen in the first ("i poked mtself witth a cotehanger" blithely segues into "i sitll luv ur show").

Some indications that a comment was a parody appeared in the email or URL fields, such as or, the latter being a reference to a MetaFilter joke. Another parody comment (which ended with a request for AOL to turn off the commenter's computer and feed the fish as well as canceling) included the username "" and the password "password." The latter suggests the native making the joke is attentive to password strength, and believes that strangers who find themselves on the wrong website also make the Internet literacy mistake of choosing a password which would be easy for criminals to guess.

References to the online communities which had directed natives to a site were often a good indicator that a comment was a joke. For example, a number of commenters referred back to MetaFilter or Memepool:

Maury, I am so glad I got your details via Mefi, this is worth my $5 sub alone
(referring to the subscription fee MeFi eventually added to help pay for the site's server costs).
Shoot, theirs a lots of people who rote to this sight. I hope u can help me freind we rilly rilly needs you're help. I would also favor to bestow a special gratitude to Memepool for the exceptional link. Keep it up, gang!
(Overhaulin thread)

A few parodies involved replies making fun of public or literary figures, such as this one from the thread where strangers attempted to contact Microsoft founder and president Bill Gates:

Hello everyone!
I was pleased to see that you all want my money. The power trip I get from seeing you beg makes me laugh evilly. I have decided that instead of giving it out to anyone, I will use the money to buy all of you a free version of the latest brainwashing devices made by Microsoft.
I assure you that any security flaws in the mind-control head caps will be fixed soon. We have diverted all of our remaining resources from Longhorn to fix the security issues. As such, our attempt to force our users into once again paying us too much money to get a "necessary" and (by the time it's done) outdated OS will be delayed once again, probably until 2010 or so. Luckily for Microsoft (and by trickle-down computing, the user), the mind-control beams will make sure that none of you care, and will buy it anyway, no matter how many times you get hacked or slammed by worms and spyware.
-Bill "El Presidente" Gates
by Bill Gates April 14, 2004 11:04 PM

This comment makes light of Microsoft's software security and performance -- a popular pastime in open-source software communities on the Internet -- as well as of Gates himself.

Natives sometimes gave indications on one thread that they had viewed other threads in this corpus, frequently having found it through MetaFilter or another news aggregator blog. Maury Povich's name, for example, showed up in jokes across a range of these threads, as in this one from the Overhaulin' thread, where it was otherwise a non-sequitur:

can u fix me car cool. it broke
Posted by: Marvy Povitch <> at May 21, 2004 05:01 PM

Likewise, a seemingly authentic Overhaulin' plea showed up on the thread about the LISP programming language which was hijacked by lisping teenagers seeking a cure. This is not so surprising, because natives had begun to post links and trackbacks about the Overhaulin' thread on the Maury thread; on the latter, however, there is no other indication that readers of the Lisp thread read or participated in the Maury thread.

In the Maury Povich thread, natives had traded jibes with a commenter named Meagan who asked Maury to find her father and brother (and who subsequently spelled her own name wrong -- MEAGN -- in a later comment). The MEAGN meme became a running joke in its own right.

Beyond references to similar threads, a number of natives' jokes included common references to pop culture phenomena. Many of these were to Internet-specific memes. References to the popular Flash animation "All Your Base Are Belong To Us" appeared in the Ketchup thread and the OK/Cancel and Gadgetopia Google threads. References to 419 email scams appeared in the Cancel Efax, Maury, Bill Gates, and Answers to Riddles threads.

Along with clear parodies of other threads, heavily misspelled comments, and cruel self-characterizations by apparent strangers (which I will discuss in a section on natives' "othering" of strangers, below), these were all taken as indicators of a humorous set towards the content. Thus, I coded most comments like these as jokes written by natives.

To wit: Parodies were present, but there were reliable means of ferreting them out, and I did my best to do so. Joke comments were often separated out from others in the analysis.


While any demographic differences in this dataset are likely attributable to the snowball sampling method by which the data were gathered, there were some distinct differences between strangers and natives which may be of interest -- and which, considering the possibility of a gendered and community-specific digital divide that they present, are somewhat troubling. Beyond that, these demographics should be taken primarily as descriptive. At times in the analysis they will be referred to in calculating the comparative frequency with which natives and strangers exhibited particular behaviors.

Strangers were far more likely to identify themselves as female than as male. Even if all of the commenters who did not give an indication of their gender were actually male, there would still be more female than male strangers, by a large margin.

Total apparent numbers of unique visitors, by gender and role

Gender unclear
Stranger 312
Reader Status Unclear

Meanwhile, blog readers -- those taking the side of the blogger as to what was the "correct" approach to reading the thread -- appeared more likely to identify as male (though this would not be the case in the unlikely event that all commenters who did not reveal their gender turned out to be female).1

This might or might not indicate that the blogs included in this study are outside of the norm. The latest numbers from the Pew Internet and American Life study show men and women are about equally likely to say they currently read blogs. However, when asked whether they have read blogs in the past, men are more likely to respond in the affirmative. Pew reports:

We suspect that this is due to the male-heavy nature of the initial blog readership population--men are generally heavily represented among the early adopters for most technologies, but women catch up over time. Due to the way the second question is worded, it captures some of those (largely male) early adopters who are not captured in the first question.

This dataset might well reflect this historical trend, considering most of the comment threads included here were begun earlier in the history of blog software. (For perspective, Blogger, an early blogging service, was launched in 1999 and bought by Google in 2003.)

One might argue that the topics represented in the blog threads included here tended to skew who participated in which discussions. Breaking the threads down by topic, this appears as if it might be partly true. Strangers and hijackers were far more likely to represent as female on celebrity-themed threads than they were to represent as male. Hijackers were also much more likely to represent as female on shopping threads. And in almost all themes, readers were more likely to represent as male. The exception was folklore threads. Folklore threads had the smallest number of comments, and one of the threads belonged to a female blogger whose female readers -- including the author of this study -- participated heavily in correcting strangers.

Percentages of comments from unique visitors who identified as male or female, by reader status (gender unclear not included)


Total comments

There were also differences in age information provided by natives and strangers. As indicated in the table below, blog readers (distinct from bloggers here) never gave their exact age, and only provided general information about being legal adults indirectly (by mentioning their weddings, children, etc.) By contrast, more strangers and hijackers mentioned their exact ages.

Age of commenters by status (percent of total in parens; row labels for the age categories not preserved)

Strangers        Hijackers        Blog readers
22 (1.69%)       9 (1.71%)        -
7 (0.54%)        1 (0.19%)        -
49 (3.76%)       71 (13.50%)      -
49 (3.76%)       39 (7.41%)       -
32 (2.45%)       4 (0.76%)        -
21 (1.61%)       1 (0.19%)        -
240 (18.4%)      190 (36.12%)     10 (1.48%)
10 (0.77%)       2 (0.38%)        -
7 (0.54%)        1 (0.19%)        -
11 (0.84%)       -                -
6 (0.46%)        -                -
1 (0.08%)        -                -

Total unique commenters
1304             526              675

Unique commenters mentioning age
455 (34.89%)     318 (60.46%)     10 (1.48%)

These numbers are skewed by a few factors, particularly among hijackers. The longest hijacked threads were about getting rid of a lisp, getting a role as a student in a Harry Potter movie, and selling a wedding dress; thus people of an age to have those concerns -- teenagers and adult women -- are overrepresented among hijackers. Women buying or selling a dress in the wedding dress thread were assumed to be of legal marrying age unless otherwise indicated. Girls on the Harry Potter thread openly discussed their exact age as they believed it had a bearing on whether or not they could be cast in a role. Both of these factors account for the higher percentage of hijackers giving or referring to their age. Aside from these thread-specific reasons, there was little prompting for strangers to give their age, but they did anyway; this may be related to strangers' tendency to divulge more, specifically in terms of locating themselves in time and space, as I will discuss in a later section.

Using what information was provided by natives (including bloggers) and strangers, I mapped out the location of commenters to look for patterns:

Natives Hijackers Strangers

On those maps, darker letter icons stand for greater certainty about where the commenter lived (at the city or street address level); medium-colored letters represent certainty to the state or regional (Bay Area; southern US) level; and the palest letters indicate certainty only at the national or continental (Europe) level. The information about each point is available by clicking the letter icon.

Arraying these data against US census data about population density, it appears that the distribution of strangers roughly correlates with population density levels. However, bloggers are slightly more likely to live in northern or western-coastal US states. It is worth remembering that these trends also correspond to Manuel Castells's observations about new sources of capital being built on older ones.

One final note, of interest considering natives' calling strangers "illiterate," which I will discuss later: Among the strangers, there were not only a number of college students and professionals; there were also at least three published authors. One had written an academic text; another, a book on computer networking.

Of the 23 bloggers whose posts were included in this dataset, only four were female. This indicates that the dataset is not representative of bloggers as a whole; Pew reported in 2006 that bloggers were generally evenly split along gender lines (Fox, 2006). Pew notes that more than half of the bloggers they surveyed in 2006 were under 30; nine of the 23 bloggers here could be confirmed as between the ages of 30 and 40, four as under the age of 30, and one as 46 (according to information on their blogs or elsewhere on the Internet). This information was gathered in 2009, however; some of these threads were posted when more of these bloggers were in their 20s. All in all, the age of bloggers in this study seems to approximate Pew's average.

Professional information was available for 20 of the 23 bloggers. Of these, 12 had done some work in programming, web development, systems administration, human-computer interaction, or some other field related to computers. Most of the bloggers listed more than one vocation. Five had done some work as writers or journalists of some sort. Three were in academia; two in communications; one was a librarian, two were cartoonists, and two were songwriters.

Basic demographic information about the bloggers, taken from their "about" pages

Basic information about the bloggers, taken from their "about" pages

1 The female author of this paper, of course, goes by a male nickname; this did at one point reveal some assumptions of a male commenter on the Mary Kate and Ashley Olsen thread from her blog:

Eric wrote: Gus- If you think everyone is so lame for coming here and discussing MK and A, then why do you keep coming back? None of these people give a shit if you like the discussion or not, your posts seem more pathetic than any of the other one's i've read, so why dont YOU get a life and quit coming here? You know just as well as I do that if MK and A pose for playboy you'll be first in line to get your copy so quit being a moron. -Eric


gus <> wrote: Just to clarify: a) I'm a straight woman. b) This is actually my website.[...]
Eric wrote: Ok GUS nice name by the way, are you sure you arn't a dyke? lol

Thus was a gendered gaze with particular interests contested by the blogger and commenter. This stranger clearly perceived the thread as one where straight men were intentionally gathering to regard the Olsen twins sexually; when this was contested, he did what he could to repair the damage to his understanding of the context.


The analytical modes described up to this point in the paper -- conversational analysis, developed by Goodwin, and the functional metalinguistic frame developed by Jakobson -- were primarily developed to describe face-to-face conversation between two human beings, though some work has also been done on phone conversations and (in Jakobson's case) literature. Because of the focus on face to face conversation, discussions in both traditions have primarily referred to simultaneous human interactions, which are locally managed. In face to face conversations, participants speak contemporaneously, using physical cues like gaze, gesture, and local indexical references which can be parsed by other participants using sight and hearing. Phone conversations are still at least synchronous, allowing participants to take cues from interruptions and pauses and make local repairs in their wake. Meanwhile, because literature is assumed to be a sort of one-way conversation, Jakobson glosses over issues like turn-taking, repair of interruptions, negotiation of context, and indexicality; his model does not take on conversation.

What of the Internet's forms of asynchronous conversation, then, which are preserved over time and space like literature, but which disrupt traditional understandings of "local" not only in spatial (like the telephone) but also in temporal terms? Neither Jakobson's nor Goodwin's tradition has, to the best of my knowledge (which is pretty effing limited), begun to address conversational management under these conditions.

This dissertation's corpus attempts to demonstrate that, because they are taking part in asynchronous, aspatial conversations, conversation participants on the Internet face difficulty in managing turn-taking, mutual context (referent, including deictics/indexicals and addresser/addressee), and to a lesser extent the identification and maintenance of a channel (phatic) as well.

At times, this difficulty is revealing about Goodwin's assumptions regarding conversational management, suggesting new models for understanding even face to face conversation. At other times, it suggests directions for software development, including interaction design and semantic web applications. Finally, it suggests specific challenges to equitable global-scale communication, and outlines the task of the New Literacies in attempting to remedy social disparities.

Before embarking on an exploration of the disruptions in blog conversations, however, it is necessary to develop an understanding of mediated conversations and how computers participate in them. It is important to begin by recognizing that (like all media, from hieroglyphs onward) computers enable us to extend our communications in time and space beyond our own physical abilities. While extending our abilities this way, computer applications also make significant changes to how we can contribute, as well as making their own contributions to conversations, like the classically Latourian actors they are. Search engines alter what is considered in a conversation as spatially "local context," supplanting it with an averaged global ranking system. Meanwhile, various other aspects of computers, from blog software down to the central processing unit, manage the timing of a conversation and contribute their own organizing elements. These computer interventions into the conversation management skills we all learn on a face-to-face level from infancy inevitably disrupt our expectations about conversations when we venture online.

Innis: Space, time, and mediation

The extension of human communication in time and space was not made possible by Alan Turing, Ada Lovelace, Vannevar Bush, or even Al Gore; it predates the Internet and computers by millennia. Marshall McLuhan and Harold Adams Innis both wrote of communications technology as extensions of existing human capabilities: not just of speech and hearing, but of consciousness. (CAREY, MSE, CITE)

Innis's work was particularly useful in coming to the finer points of writing technology, as he wrote about paper, parchment, papyrus, and clay tablets: all media capable of transmitting writing and images before the advent of printing, but each with its own advantages and drawbacks. Of this analysis, Carey writes:

Innis argues that any given medium of communication is biased in terms of the control of time or space. Media that are durable and difficult to transport--parchment, clay, and stone--are time-binding, or time-biased. Media that are light and less durable are space-binding or spatially biased. For example, paper and papyrus are space-binding, for they are light, easily transportable, can be moved across space with reasonable speed and great accuracy, and they thus favor administration over vast distance.
("Paragraph #10," in Harold Adams Innis and Marshall McLuhan Multimedia Study Environment, James Carey)

Innis had some interesting ways of gathering different communications media into these two categories and thereby attempting to demonstrate their impact on the shape of a culture. For example, he called speech a time-biased medium, thinking of it as hard to transport and store; yet he held it responsible for the long-term memory of societies relying heavily on tradition. ("Temporal Bias," in Harold Adams Innis and Marshall McLuhan Multimedia Study Environment, James Carey Material: Written by Pavel Schlossberg)

Innis's reckoning of the effects of space-bias and time-bias seems a bit dated by now, if not confused from the outset. Among other things, telephones, voicemail, and other computers have now made speech easy to store and transport, extending the voice in space and time. The proliferation of multimedia texts on the Internet, meanwhile, has opened up anarchic possibilities, angering the music, movie, television, and even book publishing industries by making their texts harder to administrate over a distance. Space-bias and time-bias do not have effects that are cut and dried.

Leaving aside Innis's wrestling with effects, however, his attention to time and space seems a very productive approach. Most media have the ability to sustain communication, either in time or space, past the inborn ability of any human communicator. Considering that the first forms of communication which human beings learn to manage -- language, gesture, and gaze -- are limited to the synapses, lungpower, motor control, etc. of human participants, what are the implications for producing, interpreting, and managing orderly conversation when these abilities are surpassed by our messages?

I will now explore this question by examining the impact of computer mediation on space- and time-related functions of language in communication: turn-taking, indexicality, establishing context, and specifying hearers and speakers.

The machine takes a turn

This study, as I have mentioned before, is closely akin to Lucy Suchman's analysis of user interactions with Xerox copiers. Suchman's scrutiny of these interactions through the lens of conversational analysis and other methods for understanding locally-managed human interactions provides a guide to the phenomenon at hand. Her contribution is treating machine actions like turn-taking in a conversation.

Conversational analysis has traditionally taken the "local" -- the immediate circumstances of participants -- to be the place where conversation is managed. As stated earlier, the tradition of this research field has hinged on turn-taking as pivotal in organizing conversation and other speech acts. (Sacks et al, 1974; Goodwin, 1981; Mehan, 1979) The rules of turn-taking are understood to comprise, among other things, means to decide who speaks next, and means to deal with overlapping speech, such that only one person ends up speaking at a time. (Sacks et al, 1974; Goodwin, 1981)

In this section, I will treat turn-taking in online interactions in two ways. First, the machines here take their own turns in Suchman's sense. Second, they ineluctably manage human turn-taking as well.

Search engines, URLs, and context

An analysis of the data in this corpus along the lines of that performed by Suchman gives some insight into what otherwise might look like completely nonsensical online behavior. It will be important to look, as she does, at the machine as if it were engaged in a conversation with the user, taking its own turns to contribute to mutual understanding by providing and drawing on the local context. I will give a brief description of Suchman's work first, then explain what "context" often means when users interact with search engines, one of the types of online application which appear most prominently in this study.

Suchman's work follows social researcher Garfinkel and the conversational analysts. She wields Garfinkel's explication of the "irremediable incompleteness of instructions" (Suchman, p 112, ref Garfinkel, 1967, Chapter 1) to argue that an understanding of technology-user behavior as following pre-formed plans -- the fundamental argument of most human-computer interaction research -- is inadequate. Like Garfinkel, she also rejects social science's idea of abstract social structures, arguing that actors only ever execute purposeful action based on the context in which they are currently acting, taking into account difficulties, misunderstandings, and other cues which arise. Plans, instructions, and mental models are only some of the resources used by actors to make sense. To make a complete account of interactions' context would take infinite description; such an account is essentially impossible.

In Suchman's case, the context sensed by one participant in the interactions she studies -- a copier machine which has an embedded "expert system" -- is very limited. This particular machine only responds to the user -- only "takes a turn" in the conversation/interaction -- when the user performs an action that it can sense, such as opening a document cover, moving a tray, or pressing a button. It then attempts to "map" these actions to the appropriate response provided in its code. As a result, she notes, turns in her human-machine interactions are relatively predetermined.

In the case I am studying, however, machines are endowed with a tremendously detailed context: the databases underlying a search engine. Search engines were most likely the means by which the majority of strangers in this study found the blogs upon which they commented; ample evidence exists in the corpus to support this, with bloggers providing referrer log information about strangers, and strangers describing the steps they took to find the site.

Despite the broader resources available to the machine participants (search engines) in the conversations here, the context they make available to the user can be more disruptive to sense-making than it is helpful. The context does not necessarily match user expectations: instead of being local, it is made up of all kinds of other people's contexts, everywhere -- a mass averaging and ranking of global contexts. This is particularly problematic when one considers Garfinkel's dictum about context (which Suchman cites): "not only does no concept of context-in-general exist, but every use of 'context' without exception is itself essentially indexical." (Garfinkel 1967: 10; Suchman 2007: 81)

Because search engines' consideration of context is global, it is subject to global power imbalances like those observed by Castells; powerful nodes have a greater say in what constitutes the "correct" context than less-powerful nodes. And all of this context, of course, is built on texts and how they are written: headlines on websites; their links to other websites; the "crawler" code that traverses these links, and the search engine databases they send those links back to; and Google's PageRank and other algorithms.

It is perhaps best to start with a description of how search engines work, and how this makes them the Internet's premier arbiters of context. This will involve discussion of the human participation in shaping search engine results; returning these results is ultimately a cyborg effort. Search engines are shaped by human coders writing their algorithms, by the vagaries of human-created websites, and by the structures of their internal data routing and server architecture.

First, a basic primer on the mechanisms by which search engines generally operate. Each has its own peculiarities, but most now share a few things in common.

Search engines replace earlier models of organizing massive amounts of knowledge about what is available on the Internet, and specifically the World Wide Web. Earlier models tended to be indexes, human-organized, with human contributions determining what ended up classified how. (Battelle, 2005)

Human labor could not keep up with the explosion of online content, however. In the early 1990s, Matthew Gray developed The Wanderer, a new automated tool for cataloging online content which became the model for most future efforts. The Wanderer, like the crawlers ("robots," or "spiders") which are its heirs today, was a bit of code programmed to search the Internet, proceeding from one site to another by following links. (Battelle, 2005)

It is worth noting that every link, itself, must make reference to an index in order to function. A link to another page is of course contextually indexical; to a human, it means "that page there, the one at the address provided in the code," and the human user can interpret that the linked-to page has some relationship to the page s/he is reading. 1 To Google's search algorithm, a link means "the address in this link has a semantic relationship to the words in this link." The engine then counts the link as a vote for this relationship, to be weighed when a user enters a search term. I will now describe how these relationships are built by the machines.

URLs are literally index-indexical, in a database sense: they require machines to look up the referent of a string of letters, numbers, and other symbols, and send participants (both human and machine) to the associated page. To domain name servers and domain name clients -- the parts of the Internet involved in looking up where to send you when you enter a URL, and what content might be there -- a link is indexical for a specific set of data, in a particular part of a file structure, on a particular physical server or servers, somewhere on the Internet. This also has indexical ramifications in terms of ownership: knowing a URL is the most accurate way of knowing who owns a website. Legally, this information is most likely of more use than the information on an "About" page. This is why, as mentioned in the literature review, Lessig and Sassen talk about domain and address management as key to understanding control on the Internet. From the reader's perspective, a URL can thus help define a sense of who is reading and writing on a webpage; from an educator's perspective, using "Whois" lookups ought to be a higher priority learning goal than looking for "About" pages or top-level domains like .edu or .gov.
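To make the layered indexicality of a URL concrete, the following is a minimal sketch (not part of the original study) that splits an address into the parts which different machines resolve. The address itself is hypothetical, and the comments describe roles only in general terms; the sketch performs no actual name-server or Whois lookup.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the parts that different machines resolve."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol the client is expected to speak
        "host": parts.hostname,      # name that a name server maps to a machine
        "path": parts.path,          # location within that machine's file structure
        "fragment": parts.fragment,  # position within the page, handled by the browser
    }

# Hypothetical address, used purely for illustration.
info = describe_url("http://blog.example.com/archives/2004/05/thread.html#comments")
for name, value in info.items():
    print(f"{name}: {value}")
```

Each piece points at something different: the host at a server (and, through registration records, at an owner), the path at a file, the fragment at a place on the page. A human reader typically collapses all of these into "that page there."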

Following the links as they are resolved by name servers, crawlers send the information about sites back to the search engine's databases. These databases consist of a couple of parts. In the case of Google, which caches all of a page's text, that text resides in a few places. The first is a database -- an index! -- which holds the entirety of the page's text. If this were the only place Google stored that information, scanning the full text of every single page on the Web to find the search terms would take far too long.

So Google's second database is one in which words or other strings of text are indexed individually. This index includes a pointer to the places where the word appears, along with its ranking information, which is usually based in part on how common that word is on a page and where it appears (in the title vs. in a footnote, in a link, etc.). 2
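The two-part arrangement described above can be sketched in a few lines. This is a toy illustration under my own assumptions, not a description of Google's actual databases: the page names and texts are invented, and ranking by raw word count stands in for the far richer ranking information a real engine stores.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the pages and positions where it appears.

    `pages` ({page_id: full_text}) plays the role of the first, full-text
    store; the returned dict plays the role of the second, word-level index.
    """
    index = defaultdict(lambda: defaultdict(list))
    for page_id, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index[word][page_id].append(position)
    return index

def lookup(index, word):
    """Return ids of pages containing `word`, most occurrences first."""
    postings = index.get(word.lower(), {})
    return sorted(postings, key=lambda page_id: (-len(postings[page_id]), page_id))

# Invented pages, for illustration only.
pages = {
    "post-a": "please cancel my account please",
    "post-b": "how to cancel a subscription",
    "post-c": "my account of the concert",
}
index = build_index(pages)
print(lookup(index, "cancel"))
print(lookup(index, "please"))
```

Answering a query now means consulting one entry in the word-level index rather than rereading every stored page. Note also that "post-c" is returned for "account" even though it concerns a concert: the index records where strings occur, not what the writer meant by them.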

Different search engines use different algorithms for calculating how to rank the results they serve to searchers. The ways these rankings are calculated appear to have been important in driving strangers to the blogs in this study. I will now briefly describe how.

Google's initial unique advantage in how it generated rankings was PageRank, a system described by Page and Brin in their initial papers on the engine. (1999 1 and 2?) PageRank weights a given page in search results based on how many links it receives from other pages, and what is in the text surrounding those links. Like looking up citations of an academic paper (a process which was actually essential to Brin's development of PageRank), this ranking system gives a sense of how influential and useful a web page is. (Battelle, 2005)
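The link-counting half of this idea can be sketched as a simplified version of the iterative calculation in the published PageRank description. This is not Google's production system: the damping value of 0.85 is a conventional choice I assume here, the page names are hypothetical, and the text surrounding links is not modeled at all.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from a dict of {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline share regardless of its inlinks.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page divides its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: several sites all point at one musician's blog.
links = {
    "aggregator": ["musician-blog", "small-blog"],
    "fan-site": ["musician-blog"],
    "celebrity-blog": ["musician-blog"],
    "small-blog": ["musician-blog"],
    "musician-blog": ["aggregator"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the heavily linked blog ends up with the highest score, and the aggregator it links back to comes second: a link from a highly ranked page is worth more than a link from an obscure one.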

Despite PageRank's improvements over previous search ranking mechanisms, it has a flaw which is highlighted by one of the threads in this corpus. PageRank is a popularity contest, and is thus susceptible to a certain kind of ballot box stuffing. There is no other explanation for Jonathan Coulton's thread "Please Please Cancel My Account" weighing in as the #3 hit on Google for the search phrase "cancel my account" (when it is not entered in quotes -- put it in quotes and it appears as #2).

Coulton is a musician made popular on the Internet; he sings songs about Internet things, math things, and other nerdy things. As a result, there are close to 2,000 pages linking to his blog as of writing (using Google's advanced search for links to a page), and this includes large aggregator sites such as BoingBoing and LifeHacker, and Internet/television celebrity Wil Wheaton, among others. Were Coulton just some random parodist playing in his garage, it is unlikely his very short blog article with a link to a recording of someone canceling their AOL account would appear on the first page of the Google search for "cancel my account." It is not the importance of his commentary on canceling accounts which raises his post in the rankings for this search term; it is who he knows, or rather who knows him.

As Lessig, Sassen, and Castells point out, de-facto networks of power on and off the Internet inherently drive more traffic to established nodes of power, whether that traffic consists of capital, transportation, or media views. (CITES) Coulton is the beneficiary of such traffic, and I would estimate that the other bloggers in this corpus likely are too, to a lesser extent. Most of them work within the field of IT in some way or another, and it is likely their PageRank benefits as a result.

It is not likely just individual bloggers' aggregate PageRank which drove traffic to these pages, however. In many cases, natives also cast their "votes" for raising the PageRank of the specific posts in this corpus, as well, by linking to them from their own blogs and from aggregators like MetaFilter, Something Awful, Digg, and Memepool. The Maury Povich thread on and the Overhaulin' thread on were likely the biggest beneficiaries of inlinking; they saw a great deal of discussion on MetaFilter, many trackbacks, and many comments within the threads themselves which indicated that natives had found their way to the threads from links on aggregator sites. These in-links often further cemented the indexical linkage of these threads to the terms "Maury Povich" and "Overhaulin'", as these terms were included in the links themselves. To wit: by discussing the "errors" they saw on the threads they linked to, natives made the pages more semantically important to Google, and inadvertently assured that more people seeking Maury Povich were likely to end up on those threads.
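The "vote" cast by the words inside a link can be illustrated with a simple tally. The sketch below is hypothetical throughout: the source names, target path, and link texts are invented, and a real engine would weigh such votes by the rank of the linking page rather than merely counting them.

```python
from collections import Counter, defaultdict

def anchor_votes(inlinks):
    """Tally, for each target page, the words used in links pointing at it.

    `inlinks` is a list of (source, target, anchor_text) tuples.
    """
    votes = defaultdict(Counter)
    for _source, target, anchor_text in inlinks:
        votes[target].update(anchor_text.lower().split())
    return votes

# Invented inlinks, for illustration only.
inlinks = [
    ("aggregator-post", "blog/maury-thread", "people think this blog is Maury Povich"),
    ("native-blog-1", "blog/maury-thread", "the Maury Povich comment thread"),
    ("native-blog-2", "blog/maury-thread", "Maury thread"),
]
votes = anchor_votes(inlinks)
print(votes["blog/maury-thread"].most_common(3))
```

Every mocking link that names the celebrity adds to the tally tying that name to the thread, which is the inadvertent reinforcement described above.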

It has been demonstrated that many aspects of links and associated behavior follow a power-law distribution, suggesting that the attention received by websites is highly unequally distributed. Hindman et al (2003) refer to the handful of studies which have demonstrated this, including Barabasi and Albert's (1999) observation that a few major central hubs are more linked-to than smaller websites, and Huberman et al's (1998) findings that traffic mirrors this distribution of links, with more hits by Internet surfers going to a few sites. As Hindman et al observe, "the fact that anyone can place information online creates problems of scale that only a few of the most successful sites may be able to overcome." (2003)

Google is not immune to this pattern; in fact, it is built on it. Hindman et al have called Google's enabling of this power-law distribution "Googlearchy." They write,

The tendency of surfers to 'satisfice' - to stop after the first site that contains the sort of content sought, rather than looking for the 'best' results among hundreds of relevant sites returned - makes this 'winner take all' phenomenon even stronger.
(Hindman et al, 2003)4

PageRank is, of course, only Google's mechanism for ranking pages; it is proprietary (and this has other ramifications I will discuss at the end of this paper). Other search engines use other ways to weigh a page's relative importance.

One factor in the calculations of many search engines is considering headlines to be a good indicator of what a page is about. Traditionally, this meant scanning a page for text in very large type (with h1 tags, high number of points, etc.). However, this can be ineffective on blogs, which often have whimsical names (put in large type) and an ever-changing series of headlines, sometimes written with a literary or personal bent, which do not simply denote the associated posts' content. (Nielsen, ) Thus, in this corpus, we see an article about Google's doctoral internships, posted to the blog OK/Cancel, apparently ranking very high when strangers search for "Cancel Google": both words appeared in large-point font. For the same reason, my blog was once the #1 result on Google for "dancing the Crip Walk;" my blog's name has "dancing" in it, and I once wrote a piece mentioning how some of my students at the time were doing a dance called the Crip Walk. As a result, I received many comments from people asking how the dance is done even as I wrote in the article that I did not know how to do the dance.3

Any number of bloggers who wrote articles about celebrities with the celebs' names in the titles (sometimes as the only words in the title, as in my post on Mary Kate and Ashley Olsen) have come to regret it, as it often draws people seeking to contact those celebrities. Even when the blogger invites discussion about a celebrity -- such as David Dalpiaz's post "Can We Talk About Avril Lavigne For A Minute?" -- the discussion often goes in a direction the blogger did not expect or want.

The number of times terms appear on a page has also historically been used to rank pages. Thus, blogger Josh Larios, who is a former systems administrator, began seeing a good amount of traffic from people looking for the actor Josh Server: his name shows up many times on his blog, along with many mentions of the servers he works on.
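The two heuristics just described, weighting headline text and counting term occurrences, can be sketched as a toy scoring function. This is a minimal illustration under stated assumptions, not any real search engine's method: the function names, the weight of 5 for title words, and the sample page texts are all invented for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def toy_score(query, title, body, title_weight=5):
    """Count query terms on a page, weighting title hits more heavily.

    title_weight is an arbitrary value chosen for illustration.
    """
    title_counts = Counter(tokenize(title))
    body_counts = Counter(tokenize(body))
    return sum(title_weight * title_counts[term] + body_counts[term]
               for term in tokenize(query))

# Invented sample pages: (title, body).
pages = {
    "sysadmin blog": ("Josh Larios",
                      "Josh rebooted the mail server. Josh patched the web server."),
    "actor fan page": ("Fan page", "Josh Server is an actor."),
}
ranked = sorted(pages,
                key=lambda name: toy_score("josh server", *pages[name]),
                reverse=True)
```

Under these assumptions the systems administrator's page outscores the page that is actually about the actor, because the scorer only counts strings and has no notion that "server" means something different on each page.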

This discussion may appear to be more high-level, addressing industry practices and demographic patterns rather than the specific behavior exhibited in this corpus. However, interface usability researchers have noticed that vague titles, links, and content present sense-making difficulties for users. HCI researcher Jakob Nielsen cited all of these problems in a discussion of blog software and usability. (Nielsen, 2005)

Why, though, do titles, links, and search engine results stand out as causing particular problems for users trying to make sense of online content? As I have tried to demonstrate, it is what the machine makes of them which is problematic. Like Suchman's "expert systems," search engines have a very limited access to the context of their users' queries.5 Instead, they try to match these queries to their databases, sometimes inappropriately, using algorithms which abstract context from the local situation of a user in a way which maintains power imbalances, ignores specificities, and often relies on the poor headline- and link-writing skills of those who are most active contributors to the Internet's content.

Yet despite the contextual mismatches they have delivered in the corpus, in the understanding of strangers, it appears search engines are answering the (implied questions of their) search strings "(where is) Maury Povich('s website?)" and "(where can I) cancel my account(?)" This is largely due to human expectation that one utterance in a conversation is generally a response to the previous utterance. I will now speak a little more about rules for turn-taking in conversation, which lead both to misunderstandings of search engine responses, as explained in this section, and of human responses, which computers also ultimately manage.

1 Whatever that relationship may be. Early theory about hyperlinks (CITE?) attempted to pin down absolutes about the "meaning" of a hyperlink within a text, but I have never found such approaches to be flexible enough to reckon with the range of human practices of linking.

2 Web search engines do not just have one copy of the cache database and the index of individual words/text strings. Instead, they have multiple redundant copies of each of these, which are periodically synchronized. This makes for an intriguing and often-overlooked reason why search engines do not provide a uniform, logical, pure version of the truth: because their databases and indexes are not always synchronized, two people in two different parts of the world can send the exact same search string to a search engine at the exact same time and not receive the same results.

3 The Crip Walk thread was really the first of its type to pique my interest (and sometimes irritation). Sadly, it is not included in the corpus because I could not find a copy of it intact, though it would have been a fabulous addition to the corpus, evening out some gender and thematic imbalances; many male-identified commenters wrote in with questions. I eventually got so many comments on this thread that I tried to make it into its own page separate from the blog. I tried to reshape the discussion with links to other sites about dancing and activism. The move changed how commenters read and used the page, so it might not have been truly comparable here. In addition, it became divorced from the blog infrastructure, and as a result did not survive a move to a new server. :(

4 There was some speculation that Google adjusted its algorithm to deal with the explosion of Internet content which blogs represented in the earliest years of the twenty-first century. Certainly, there were some corners of all media, including online forums and many journalistic outlets, in which blogs were decried as full of noisy, low-quality content which was decreasing the quality of search results; this led to calls for Google to treat blog content differently. Speculation about Google algorithm alterations centered around Google's purchase of Blogger, with a writer for the Register guessing that Google might develop a blogs-only search engine, and a Yahoo employee drawing conclusions from the ranking of his own blog, which had once been a top hit for the search term "jeremy" but which sank below the ranking of his home page following a known change to Google's algorithm. However, outside of the jealously-guarded confines of Google's nondisclosure agreements, this cannot be confirmed or denied, and remains speculation.

5 This does look likely to change somewhat in the near future. Wolfram Alpha, Bing, and other attempts to develop "semantic web"-oriented search engines all make more attempts to engage the context of texts more thoroughly and provide more refined results. These attempts apparently still draw on "generalized context," however; Garfinkel is likely to laugh. Google, interestingly, recently declared its intent to "personalize" searches. This will ostensibly make use of personal data gathered from users themselves. This would be one step closer to really understanding the personal context of search queries. Nonetheless, the machine's access to the local physical space of the user remains limited; desktop computers are still not great at seeing like eyes, much less smelling, tasting, checking galvanic skin response, looking at gestures, judging proximity, knowing which way is up (ok, so your iPhone or Wii will do better at this), or working out what each of these senses means within the sociocultural surround of users AND their current search term. Computers are still not as good at judging interactional intent as other people are.

Communication software manages human turn-taking

This section will further flesh out an understanding of technology's active role in online communications: not only does it interrupt spatial context, as discussed in the last section, but it also disrupts conversation management by influencing temporal aspects. It will also suggest that these temporal disruptions present a fundamental change to the original speaker-selects-next conversational mechanism described by Sacks, Schegloff, and Jefferson; this analysis suggests that instead, the more accurate mechanism describing conversation is listener-selects-previous.

Conversations on newsgroups, forums, blog comment threads, email, and instant messages fit some of the traditional requirements of Sacks et al for what can be considered a conversation (as opposed to, say, ritual speech). In these online conversations, as in face to face conversations, communicators take turns to speak, mostly speaking one at a time, unlike in other text mediums such as books or newspapers, which are much more unidirectional in their participation. In online communications, as in face to face conversations, participants have relatively few constraints on what they say, how long an utterance is, whose turn comes next, the number of people who may participate, etc.

Conversational analysts have made some attempts at applying the method to online conversations. (Marcoccia, 2004; Ornberg, unpublished, 2003; Harrison, 2008) Their studies have underlined the effect of spatial and temporal disruption on traditional face-to-face conversational management techniques. Among the elements disrupted are overlap between speakers, and the potential for interruption; the cues and possibilities for choosing to respond; and, as a result, opportunities for correction. Conversational analyses have begun to note the specific ways technology is implicated, but the picture they have provided of technology's participation in online conversation is incomplete.

Marcoccia (2004), Ornberg (unpublished, 2003), and Harrison (Handbook of Research on Computer Mediated Communication 2008, ch. LIV) have each addressed the ways in which computer-mediated communication changes participants' abilities to hold conversations in the way that they would face to face. What these studies have not addressed is that, in fact, when a computer mediates communication it often takes over the responsibility for managing turns. The methods and implications of this feat are discussed in this section, with examples from blog comment threads in the corpus of this study. Because blog comment threads are not unique in disrupting the conventions of face-to-face communication, I will also use illustrations from other forms of mediated communication (including some which are pre-digital), and the observations of Marcoccia, Ornberg, and Harrison, who were working on instant messages, email, and forums -- near-kin to blog comment threads.

Turn-taking mechanisms in computer-mediated communication differ significantly from spoken turn-taking mechanisms. This is the case for three reasons. As stated before, media can expand the reach and persistence of the conversation in space and time. Further, some media lack particular affordances of face-to-face conversation, keeping communicators from making use of gesture, gaze, proxemics, and other spatially- and temporally-bound cues they are accustomed to use in maintaining conversational order. As I will now explain, in the case of automated media (computers, for example), the mediating machine may also participate in the conversation itself, manipulating turn order, the content of the message, and other elements. Such is the case with the blogging software which helped construct the pages included in this corpus.

It might be tempting to assume that because they allow for conversation between two human beings (distinguished from humans and machines as discussed in the last section), mediated communications inherently support the same local control of turn-taking, by participants, which is possible in face-to-face conversation. It might also be tempting to assume that, because they all allow for reciprocal communication, all forms of computer mediation -- IM, email, comment threads, etc -- provide equal support for managing turn-taking. A brief review of some technologies from the past few decades problematizes these assumptions; it will also shed some light on what blog software does and does not do.

Exhibit A: my first experience communicating with a modem. I do not remember the particular software being used, so I won't be able to ascribe the interface to a particular package; I can say, however, that my friends Robert, Misasha, and I were making use of dial-up modems with DOS machines (belonging to their fathers), in the late 1980s.

Sitting at one end of the line, Robert and I were faced with a nearly blank screen, working from the command line (text presented one line at a time); there was no graphical interface, no buttons to click. Commands entered to connect Robert's modem to Misasha's remained onscreen as Robert and Misasha began to type to each other. If Robert and Misasha typed at the same time, the computer would show letters on the screen as it processed them -- meaning it was impossible to extract which text had been entered by Robert, and which by Misasha. The result was a small jumble of computer commands and bits and pieces of words, periodically separated by a line break or two as my friends attempted to distinguish what they were writing, to make it more readable.
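The jumble described here can be imitated with a small sketch. It assumes, purely for illustration, that each keystroke reaches the screen in the order it arrives and that nothing marks who typed it; the function names and the timings (in milliseconds) are hypothetical and do not reconstruct the actual software we used.

```python
def typed(text, start_ms, interval_ms):
    """Return (arrival_time, character) pairs for one person's typing."""
    return [(start_ms + i * interval_ms, ch) for i, ch in enumerate(text)]

def shared_screen(*streams):
    """Merge keystrokes from all typists in arrival order onto one line."""
    merged = sorted((event for stream in streams for event in stream),
                    key=lambda event: event[0])
    return "".join(ch for _, ch in merged)

robert = typed("hello", start_ms=0, interval_ms=200)
misasha = typed("hi", start_ms=100, interval_ms=200)
screen = shared_screen(robert, misasha)  # "hheillo"
```

Once the characters are merged by arrival time alone, nothing on the screen records which typist produced which letter, which is why my friends resorted to line breaks.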

I should note that it is quite possible Misasha, Robert, and I did not know how to use the modems in an optimal way, and thus this was somehow an unorthodox use of the modem. Regardless, this use was one of the possibilities among distant communicators at the time.

Exhibit B, another command-line technology, which I began to use upon entering college: the "write" and "talk" commands on UNIX server shell accounts. These were also very simple means of communication; there was little on the screen except for words.

If I used the command "talk kceF95" from the prompt when my friend Kellan was online, UNIX would clear the screen and divide it in half horizontally. Anything I typed would appear in the bottom half of the screen; anything Kellan typed, in the top. An improvement over the modem: we didn't have to do anything to disentangle our own words from each other's. (And, actually, another form of communication supported by telnet accounts -- the "write" command -- provided less support for distinguishing different participants' contributions from each other and other text on the screen, while posting lines one by one in a way which made individual contributions clearer than they were by modem alone.) However, like the modem, Kellan's letters and mine would still appear as they were received by the machine. This meant we could still be talking at the same time, in violation of Sacks et al's observation that in ordinary conversation, "Overwhelmingly, one party talks at a time." Sometimes, being unable to hear each other as we would be able to in face-to-face conversation, we would in fact talk at once.

From these historical examples, we see a few things. First, interruption has at times been possible in online conversation. Second, these interruptions did not happen as they did in Sacks et al's model. They did not cause one party to stop immediately, as parties could not hear each other, and utterances by two parties could become mixed up to a point where distinguishing who said what was impossible.

These early examples -- by contrast with current modes of computer-mediated communication -- give concrete evidence that computers are by necessity programmed to actively manage turn-taking in mediated conversation. In fact, this mediation is inherent to the way computers' processors work. Based on the order in which input arrives at the CPU, computers constantly make decisions about which speaker spoke first, even as they present two speakers' input so quickly as to appear (to the human eye) to be simultaneous (as in the modem and telnet talk examples, above). As a result, machine management of input inherently locates conversation participants indexically in time; this is its most salient feature. It is worth noting that this is a quality of computers but not all media. Land-line telephones permit two conversation participants to speak and listen to each other simultaneously. By contrast, many two-way radios and speakerphones stop one party from being heard while the other speaks; however, this has more to do with use of the channel than with decisions made by a central processor.

As I have argued, a computer always acts as a participant when it is the medium of conversation. At times, this means its decisions about who spoke first have the final say in determining overlap: it time stamps the input, keeping it separate from other input which may have arrived at the same time, and displays it to the user in chronological order of which input arrived first. At other times, as in the telnet talk example when users can see both participants' input at the same time, users have more leeway to react to and control for overlap, pausing in their input and waiting for another speaker to finish.

When computers separate out input and determine its order by time stamp, there is always the risk that the turn order understood by participants will be disrupted. In a public blog conversation, two participants could, unknown to each other, respond to the same comment at the exact same moment. Depending on the software involved, instead of making it clear that each of these is a response to the same comment, the computer might mark one response as arriving before the other, and make it appear that one of the responses responds to the other response.
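A minimal sketch of this risk follows, assuming a flat (unthreaded) comment page that sorts purely by arrival time. The `Comment` fields, the author names, and the timestamps are hypothetical; real blog software differs in its details.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Comment:
    author: str
    text: str
    received: float                 # arrival time at the server, in seconds
    reply_to: Optional[str] = None  # whom the commenter meant to answer

def flat_thread(comments):
    """Return display lines in arrival order, dropping the intended addressee."""
    ordered = sorted(comments, key=lambda c: c.received)
    return [f"{c.author}: {c.text}" for c in ordered]

comments = [
    Comment("blogger", "Here is a recording of someone canceling an account.", 0.0),
    Comment("reader_b", "I disagree.", 60.002, reply_to="blogger"),
    Comment("reader_a", "This is great!", 60.001, reply_to="blogger"),
]
lines = flat_thread(comments)
```

Both readers meant to answer the blogger, but on the rendered page "I disagree." sits directly beneath "This is great!" and reads as a reply to it. A difference of one millisecond in arrival, invisible to either participant, decides the apparent turn order.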

Because of the interference of the computer in maintaining temporal indexicality, participants cannot rely on the traditional conversational assumption that responses follow each other in time. Instead, attention to the computer's contributions -- timestamps, etc. -- is an important part of making sense of the conversation, a contextual/indexical/deictic literacy which takes the place of understanding gaze, proxemics, and other cues.

By the current time, of course, a wider range of software applications support turn-taking in ways that eliminate the hard-to-decipher overlap my friends and I faced with modems and UNIX shell accounts; they structure turn-taking in a way that is more analogous to face-to-face communication. However, these newer online turn-management systems are still not wholly equal to face-to-face conversation. I will discuss why in the following paragraphs, incorporating Ornberg, Marcoccia, and Harrison's analyses of these conversations.

Sacks et al state that the end of a word, phrase, clause, or sentence is an invitation for a change of turns, depending on what has been said and the rules in play. When my friends and I used modems and the UNIX shell chat clients, this was still almost a possibility. But Ornberg and Harrison both note that because of the asynchronicity of the online communication they studied (text chat and email, respectively), changing turns in the middle of someone else's turn is not really possible in all online conversation software. As Harrison points out, a word, phrase, clause, or sentence does not offer a transition until the writer hits "send," the message is sent to the various layers of managing software, and it appears for others to view.

This has ramifications for other elements of Sacks et al's conversation model. First, it enables multiple listeners, rather than just one, to react as if they were the addressee of the Turn 1 utterance. In asynchronous programs like email and comment threads, nobody knows if someone else is talking. Their all "talking at once" does not pose a problem for other listeners, either. Even if the computer gives two utterances the same time stamp, it will present them as if one came after the other. Again, this presentation can only be perceived by participants after the fact, once the messages have been posted by the machine. (Harrison, Marcoccia, CITE)

As a result of this asynchronicity, listeners are unable to interrupt, and thus are unable to limit the length of another's turn, correct each other in real time, change the topic, request more information, or do a range of other things which might help repair disruptions in the conversation. These affordances of Sacks et al's original system are missing.

Harrison, Marcoccia, and Ornberg all note that an even more fundamental way in which online written conversations differ from face-to-face conversations is their lack of support for gesture and gaze. Gesture and gaze were described by Goodwin as a means of identifying who the next turn-taker should be, and also who the audience of an utterance is.

To make up for this handicap on physical cues, conversation participants find other ways to specify an audience. Harrison finds that current speakers select next speakers in email by using someone's name either in the body of a message or in a greeting, or by quoting a previous speaker to call them back into play.

Marcoccia, studying a pre-graphical online form of communication (newsgroups), claims that "when a participant sends an initiating message, he/she cannot select a recipient and is constrained by the system to 'speak' to no one in particular." (Marcoccia p 140) This is not altogether correct. What initiating speakers lack, for the most part, is the ability to keep other potential audiences from listening; but this is as true of speaking in a crowded room as it is on the Internet. Marcoccia notes that the address marker he found most often in his data was "quelqu'un," (anybody), and wants to tie this to the medium's perceived constraint on audience choice. (p 140-141) Still, as Harrison noted, speakers have the leeway to verbally address others individually in a newsgroup or blog. Let us say, then, that addressing a general, undefined audience is a default supported by public newsgroups, blogs, and forums; it falls to the user to be more specific. If the user is not more specific, it leaves that much more room for misunderstandings.

Ornberg suggests these difficulties explain why she found questions going unanswered in text chat. In the face-to-face conversations described by Sacks et al, a question is a means for one turn-taker to select the next. This is also possible online when a speaker addresses a particular user directly; however, this does not always happen, and that is what Ornberg saw. She writes that the vagueness of questions asked without gesture or gaze to direct them at a particular participant may have lessened the social pressure to respond. Thus, the usual system for selecting the next speaker breaks down.

Thus, in Internet conversation, there is not necessarily a bias for the speaker preceding the current speaker to be the next speaker, as Harrison has noted. (2008) Because the discussion is asynchronous and not carried out in the same place, there is no social pressure for the initial speaker to respond to the second turn. Harrison also suggests that the priority of Sacks et al's first three rules (1a-1c) is eliminated online; current-selects-next does not outweigh self-selection, the current speaker taking another turn, or no response being delivered at all.

The bias in Internet conversations seems not to be speaker-selects-next, but rather listener-selects-previous. The option to respond at any turn transition point is at the whim of the listener. For that matter, there is no guarantee that the original speaker will return to the discussion to read the response to their first turn -- even if the original speaker is the blogger starting off a comment thread.

According to Sacks et al, the previous-speaker-should-speak-next bias allows for misunderstandings and other errors to be dealt with within the next turn. The lack of such a bias on the Internet thus raises the possibility that misunderstandings will go untreated for long periods of time. It is up to listeners to seek them out; there is no compulsion for anyone to return to the discussion to fix them.

Another consequence of this disruption to the "turn-in-a-series" understanding of conversation could be syntactical. Sacks et al say "that some aspects of the syntax of a sentence will be best understood by reference to the jobs that need to be done in a turn-in-a-series." (p 723) One might thus expect that participants in an online conversation might well misunderstand the meaning of words used in others' turns based on syntactical misreading.

To sum up: To accommodate asynchronicity, aspatiality, the CPU's needs for sequential processing of information, and older software's models of representing conversations coherently onscreen, the code written to run blogs, forums, email, chat software, etc. takes over the responsibility for managing conversational turns, disrupting the human ability to manage conversational turns locally. Participants adapt to the difficulty of first-speaker-selecting-next online by adopting a listener-selects-previous mode of turn-taking.

I will now describe in greater detail how the blog software involved in creating the documents in this corpus compares to other contemporary online conversation systems in terms of its support for turn-taking management. Blog software, like older online conversation software, presents completed turns in the order it receives them (with any code-mandated alterations to that order, such as blocking certain users or "moderating down" and hiding comments which have been disapproved by other readers), and repairs overlaps through this ordering, making them appear to be sequential. The code adds beginning and ending information about authorship, timing, and the direction of the response (the latter only when it has been coded to manage threading). It may give special priority (position or highlighting) to the writer who initiated the conversation.

Marcoccia and Ornberg both note that digital conversation is often made up of multiple conversations. Marcoccia calls these "polylogues," noting they exist in other forms of communication as well. Polylogues generally have a "lack of collective focusing and the existence of varied focuses" (Marcoccia, 2004), and digital conversations often share this fragmentation.

The first task of anyone attempting to understand turn-taking in online communication is sorting out these conversation threads. (Ornberg, CITE) Some users participate in more than one of these threads at once, and earlier turns can be responded to dozens of turns later. "[I]t can be concluded," Ornberg writes, "that when defining turn-constructional units in this medium, one has to think in terms of content rather than in terms of form." (p 7)

This is not entirely true in all cases. There are other ways multiple conversation threads are sometimes indicated. By now, some software provides excellent support for indicating which speaker is responding to whom. This support often involves computer participation in turn-taking: the machine adds text and other visual elements without human input.

The following are some examples of machine participation in conversations which support an indication of who is responding to whom. First, some "threading" support from email, in the Gmail client:

Image:Patti email conv 1.png


Image:Patti email conv 2.png

Within the text of the email itself, we see that textual features have been added by the computer in order to indicate who is speaking -- whose turn it is. First, we see a header which tells us who took this email turn (my aunt Patti), to whom she is speaking (me), and at what time she took her turn. All of these elements will appear in Patti's email whether she wants them to or not, and my email client will most likely pass this information on to me.

Next we see an indication that this is a forward ("Fwd:") and thus the content is not likely entirely written by my aunt (though it is possible she made some alterations, and it is even hypothetically possible she changed the entire thing). This is another piece of information that Patti does not need to enter; the computer does it for her. A reaffirmation of the forwarded nature of the email is made partway down ("Begin forwarded message:"); an earlier (as evidenced by the timestamp) version of the text follows, starting again with other contact information (another of my aunts). This forwarded information is demarcated by a vertical line running along its left-hand side, which again is added by the computer to indicate forwarding. (Earlier iterations of email software indicated forwarding with a caret before every line, like so: > Multiple carets would be used to indicate quotation by multiple earlier writers: >> .) We see an indication that this mail was previously forwarded: in the reiterated "FW:" in the subject head of this enclosed message, in another "---Forwarded Message:---" warning, and in yet another indented, side-lined block of text. All of this machine-generated input serves to mark turns taken by other participants.
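The older line-prefix convention mentioned in the parenthesis can be sketched in a few lines. This is a simplified illustration, assuming a hypothetical `quote` function and sample text; actual mail clients vary in how they space and wrap quoted lines.

```python
def quote(message):
    """Prefix each line with '>' so earlier writers' turns stay marked."""
    quoted = []
    for line in message.splitlines():
        # Already-quoted lines gain one more marker; fresh lines get "> ".
        prefix = ">" if line.startswith(">") else "> "
        quoted.append(prefix + line)
    return "\n".join(quoted)

original = "What is your favorite color?"
forwarded_once = quote(original)          # "> What is your favorite color?"
forwarded_twice = quote(forwarded_once)   # ">> What is your favorite color?"
```

The depth of the prefix is the machine's record of how many turns ago the text was written. As with the side-line, a human writer who edits inside the quoted region can still blur that record.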

Interestingly, though, my second aunt (Quinlan) has made it hard to understand who is speaking if one ignores the forwarding cues and only pays attention to the text. She has changed the answers in the forwarded survey underneath the second "---Forwarded Message:---" bar, without getting rid of Aunt Patti's greeting. This makes it look as if the subsequent answers might be Patti's, not Quinlan's (though they are different from the answers above, which might be Patti's -- or might belong to the originator, Clay, if the All-Initial-Capsed rules provided in the text have not been followed). The remaining indication that the answers to the second round of questions are Quinlan's is the way they break the side-line, whereas the questions do not -- giving them the same amount of indentation as Quinlan's introduction.

Email clients, in addition to email servers, also provide some graphical cues for understanding turn-taking:

Image:Threaded email.png

Before we even open the text of an email, we see that certain messages are responses to others that have been received; they follow them, and are indented. The subject lines also indicate that they are about the same subject. This indication of threading, which is not available or a default in every email client, is particularly useful given that users are able to manipulate the text within the message and thereby do away with indications of which earlier messages they are responding to.

To sum up, email has long provided some measure of graphical support for understanding turn-taking, both on the level of multiple messages and within a given message. However, the effectiveness of this support can still be muddled by writers.

More public forms of computer-mediated communication have adopted similar means of indicating turn-taking. Here is a comment thread from YouTube:

Image:Threaded youtube comments.png
YouTube's threading system offers a few affordances for understanding turn-taking. Comments replying to earlier comments are indented and have a grey bar at the side, as seen in the email forwards above. Additionally, comments made by the poster of a given video are highlighted in yellow. The system allows the user to pick which comment they are replying to by clicking the "reply" link by any given comment. This is perhaps understood best here as "listener-selects-previous" rather than CA's traditional "speaker-selects-next": while the first commenter cannot select the next speaker except by adding a tag question to elicit a response from another commenter, a second or later commenter can choose whom they are responding to by clicking a specific "reply" link. One might wonder if public Internet communication also works this way because of the nature of Internet audiences; it is harder to select-next when in many cases you don't know who is listening, and thus who might reply next. Perhaps it is easier to make undirected statements than to solicit responses; selecting-previous is simply more fruitful.

Slashdot, a long-running computer news site, has evolved a threading system which looks like this:

Image:Threaded Slashdot comments.png

Not only do comments here have subject lines helping to indicate what they are responding to; not only are comments nested, with indents, enclosing boxes, and small L-shaped graphics to indicate which comment they are replying to; but there is a disclaimer after the header for the original article making it clear that the comments are not written by the original article's author. There are also "reply to this" and "parent" buttons which allow the next commenter to reply to a given comment or the original article without scrolling up to the top of the page. Slashdot has not always had this system; it has evolved over time to respond to confusion raised by flat threading.

The blog software used by a majority, if not all, of the blogs in this study provides far fewer affordances for understanding the flow of turn-taking than the email, YouTube, and Slashdot examples provided above. The blog software does not indent or otherwise graphically distinguish responses to other comments; it generally does not distinguish the author of the original post from other commenters; it sometimes does not distinguish quoted text from an earlier turn; and it presents multiple conversational threads linearly by time stamp, rather than teasing them out by topic.

For example, here are comments from Communications from Elsewhere, whose author suggested eight comment threads included in the corpus:
Image:Thread on Communications from Elsewhere.png

Thus, the reader who comes to a blog comment thread needs to employ much more interpretation, some of it blog-specific, in order to make sense of a conversation there. These blog comment threads leave the participant to determine who is responding to whom, and who the different speakers are, with even less help from the software itself than they might be used to in email, forums, or YouTube; participants may also need to abandon some of their expectations about threading.

On blogs, prospective next-turn-takers also have to puzzle out how to make it clear they are responding to a particular earlier comment, if they wish to do so. The machine will not support them in doing this; they must make a direct verbal form of address to other participants.

Marcoccia notes his participants faced similar problems; as he says, "the interface can only indicate the sequence in which a message is located, but not which previous message of the sequence it is an answer to." He presents a number of examples from his corpus in which readers make errors or express confusion in attempts to understand turn order. (It should be noted that Marcoccia is not altogether correct about the interface, at least in this day and age; newsgroup readers currently have the option to make use of feed readers which add on a number of the affordances offered by Slashdot and email, discussed above.)

Marcoccia raises an important question: "which parameter is more important for describing a message: its content or its place in the sequential order of the conversation?" (p 126) This would be difficult to answer without a controlled experiment, and raises other questions, as well. Is relying on content rather than place a skill developed by veteran Internet users, or is everyone able to do it equally well regardless of experience with the medium? And besides, do we know which is more important in verbal conversation, to begin with? Despite the open-endedness of Marcoccia's question, the point remains: if turn place information is inaccurate, content may become more important for understanding turns in a digitally-mediated conversation. As I established in the section on search engines, context has already been disrupted by the participation of search engines, making content interpretation all that much more difficult.


The goal of the analysis in this thesis is to identify the specific aspects of language in which online communication breaks down. As alluded to in previous sections, particular elements of language emerged from grounded and linguistic analysis as problematic. These included turn-taking; understandings of poesis; and perhaps most saliently, identification of context: indexicals generally, and in specific the identification of channels, addressers, and addressees. In this first section, I will address turn-taking; in the next, problems with context.

Following the contributions of machines to this breakdown will be particularly important to identifying how participants muster and contest authoritative interpretations of what's going on, in a Latourian sense. As I explained in "The machine takes a turn" above, there were two ways in which computers participated in turn-taking in the conversations considered here. First, in Suchman's sense, the results that search engines delivered were considered by Internet users to be a cue to the next actions they should take. Second, blog software managed the order of turn response between humans in comment threads.

In this section I will first discuss disruption caused by machine turns. This appears to be the impetus behind the majority of misunderstandings in the blog comment threads treated here. Next, I will discuss disruption caused by the non-binding listener-selects-previous turn-taking mechanism that applies online. I will provide some examples in which listeners clearly responded to previous turns which occurred a number of turns ago, resulting in misunderstandings.

What is the first turn?: Machine results and subsequent corrections

The fundamental disagreement at the center of all of these threads was the nature of the first turn to which participants should be responding. Natives identified the first turn as the blogger's post beginning the thread. By contrast, strangers, who appeared to be finding their way to the blogs through search engine results, believed the search results constituted the first turn and thus the correct context for interpreting what they found.

The counterargument, delivered frequently by natives, is that strangers are simply "illiterate" or "not reading." I will flesh out this critique more later. However, there is one particular case which makes it clear these cases should be considered more in the light of conversation repair, the behavior that takes place when a turn has been taken out of order, rather than illiteracy. It gives evidence that rather than not reading, strangers are doing their best to make sense of a conversation which is puzzling to them.

The case comes from jonsonblog's Ketchup of the People thread. In this thread, the blogger declared a love of customizable mass merchandise such as bottles of Heinz ketchup. One native reader of his blog noted that Jonson could also order custom-printed M&Ms if he felt like it. The native linked to a site where an order for M&Ms could be placed.

The first comment from a stranger in this thread read:

13. I found the order for custom printed m & m's in the coupon section of the providence journal sunday paper. It said nothing about ordering ketchup first or anything about the blog. All I wanted was to surprise my 80 year old aunt who loves m & m's with this special custom order. What is this a scam or something? If it is, it's pretty cruel? Please respond.
by norma <> August 30, 2006 at 7:48 pm

Norma does not find what she expected when entering a "turn" which she thought would take her to the customizable M&Ms page, so she attempts to understand the unexpected result by incorporating what she found -- a title about ketchup, with other references thereto -- into her understanding of how one orders the M&Ms. This explanation is not really satisfying to her, however; she doubts its reliability ("What is this a scam or something?"). It is worth noting that she doesn't appear to doubt her own search abilities; what she doubts is the person or channel she believes is responsible for the "scam" that directed her to that page, suggesting she mistrusts the Internet or its communicators. Interestingly, the impetus for her online quest comes from another literacy endeavor: reading the paper.

The blogger, jonson, notes her misunderstanding in a trackback ping from a new post he writes, which links to this thread:

'For those keeping score'
September 4, 2006
I'm now up to three strangers trying to order custom M&M's from my site. Two of them are here (including the original & the latest one), the third is here. Even though I normally try and fit every blog post into a category, I'm leaving this one in "Uncategorized", but only because I don't have a category for "Jesus Christ, are you people serious?"

Jonson and his native readers shift their topic away from customizable goods and begin to speculate on how these commenters got there. In the second thread, Jonson and his regular readers exchange jokes and observations about the strangers. Reader Corey astutely faults strangers' understanding of how search engines take turns -- "people just can't wrap their minds around how a search engine works" -- rather than faulting their "reading skills."

How do they perceive how a search engine works, then? More strangers arrive, and, as it happens, leave clues:

15. how do you order those m and ms
by AEMELOIANCHIK September 4, 2006 at 6:08 am
1. I am trying to get to your website so I can order some of the custom printed M&M's. What am I doing wrong. Went to google and put in the website but was not able to get on the site. Was not able to enter the promo code EVERYDAY4.
Would like some help if possible.
Thank you.
Sharon Schundelmier
by sharon schundelmier <> September 1, 2006 at 1:56 pm
2. I am having the same problem. What is the answer>?
by beth ball September 6, 2006 at 5:44 pm

The second comment here, posted by Sharon, makes clear her assumptions about the page she is on: "I am trying to get to your website," she writes, assuming the owner of the present channel is the same as the owner of the M&Ms channel. She entered the URL for the site, so she expects to have opened a channel to the correct addressee for an M&Ms request.

Sharon then lists very clearly the steps she took in an attempt to get to the M&Ms page. This further clarifies the turn-taking process she expected to find, and how she satisficed with what she found instead. While these steps elicited an unexpected result for her, they were not by necessity erroneous. Instead of entering the URL in the URL bar at the top of the screen, she used a Google search field (not specifying whether this was on the Google homepage, a toolbar in her browser, or someplace else).

In many cases, entering a URL into Google does yield that page as the first result in the search -- a successful turn-taking exchange. However, the URL Sharon used was for a temporary offer made by Mars, Inc. By the time she searched for it, the company was no longer using that URL; the link was dead. The only remaining reference to it online was on jonson's first thread, as noted by native Jen in a later comment:

4. Did you know that if you type " into Google and search, your blog is the ONLY result??? It leads straight to "Ketchup of the People". Interesting, very interesting...
by Jen September 7, 2006 at 4:51 pm

Thus, that sole hit was what Google presented to Sharon.

Note this exchange hinges on Google's participation in the conversation: because she enters the URL into a search field rather than the field which has traditionally resolved domain names automatically, she solicits the input of Google's algorithm. Had she entered the URL into the URL bar, the conversation would have gone differently, though perhaps not too differently; depending on what browser and ISP setup she was using, the URL could have presented a dead end, or might in fact have sent her back to search results, possibly from Google, possibly ad-driven results provided by the ISP itself.

But she asked Google, and Google obliges, stitching together a conversation which might as well have taken place between Frankenstein's monster and a LOLcat: Norma's question about M&Ms leads Google to suggest Jonson's page; Norma tries to repair that bad suggestion by making the assumption that ketchup and M&Ms are related, or casting doubt on the reliability of the channel; Jonson expresses delight at her awkward repair; the topic of Jonson's discussion changes; more strangers appear and ask for assistance with the repair; natives provide other interpretations of the machine's turn to strangers.

As Jakobson notes, no part of conversation is purely contextual, phatic, metalingual, or so forth; most combine some element of other parts of Jakobson's model as well. The ketchup/M&Ms exchange indicated a metalingual, indexical, and contextual problem as well as one of turn-taking. The page to which the URL referred was gone, and hence the way in which Sharon (most obviously, and likely the other commenters as well) used the language/referent of the URL was unproductive. Entering the URL into the URL bar would also not have yielded the desired result, however; at present, entering that URL appears to yield a blank page (though what it really yields is a redirect to an image one pixel by one pixel, essentially invisible to the human eye, which appears to be hosted at an ISP whose URL does not resolve to an M&Ms branded page).

While the introduction of ketchup into a quest for M&Ms presents the strongest evidence that strangers interpreted their search string as the first turn, other threads in the corpus also hinted at this. For example, a number of strangers commented on a post on the blog OK/Cancel (ironically, a blog about user interface design) which was titled "Google Answers: HCI PhD program." Strangers requested to have Google canceled from their computers. Readers and the bloggers presented evidence that strangers had been performing searches for "cancel google." Essentially, by prioritizing headlines as a good indicator of context, a search engine had supported strangers' ratification of "cancel" as a major topic of the conversation, and the answer to the Turn 1 search string, rather than as a symbolic reference to interfaces which happened to be in the blog's title.

Natives attempted to restore the integrity of the conversation based on the blogger's post, rather than search results, as first turn:

Just because a search with "cancel Google" returns this page doesn't mean you cancel a Google account here.
The name of the site is OK/Cancel
The article discusses something -about- Google

We can be much more sure of the strangers' metalingual/referential error in reading their search results in these cases than in many other examples in this corpus. We know how Beth and Sharon got to Jonsonblog. We know there were likely very few results on Google for the exact search term they entered. The error here was definitely their assumption that because Google's "turn" in the conversation offered up only this site, this site was the correct result for the purpose of ordering a product. In many of the other examples in the corpus, the progression is not so clear. Strangers enter comments they think will reach celebrities and technical assistants, or purveyors of something they would like to buy. These comments, again, might appear to be pure literacy errors, or errors of channel rather like "wrong numbers" dialed on a telephone. However, in light of the ketchup and M&Ms case, and in light of Lucy Suchman's analysis of responses to the less-verbal conversational turns of copiers, I think it is worth considering that each of these misunderstandings might be turn-taking errors. Strangers take the context of their intent and their search query as the appropriate resources for interpreting what the machine returns, as they might if they were beginning a conversation with another person.

Natives' and strangers' disagreement about the correct first turn -- was it the search engine's, or the blogger's? -- ultimately led natives to a third turn, in which they expressed consternation that strangers were responding to the search engine. In the course of these third turns (which were often multiple), they also tried to understand strangers' turns at talk, and also sometimes insulted them. Natives attempted to make sense of strangers in whatever way they knew how -- really, in a different turn-taking context than the one employed by strangers, where the blogger's post was understood as the first turn. Natives' corrections hinged on a few interpretations of why strangers had arrived there: strangers' (mis)understanding of search engines, strangers' "laziness" in not reading, and strangers' innate "illiteracy."

Often, natives assumed that strangers were not reading, or perhaps were not able to read:

I saw that there were many posts that said basically 'We don't answer riddles, so screw off!' ...Basically. I honestly feel sorry for you that there are such illiterate people who don't understand when people tell them that one thing isn't the other. (Riddles thread)
It would seem like dyslexia and lisp go hand in hand... :) (Lisp thread)

Blog readers' imperative to strangers -- "read!" -- is made vivid by analyzing the comment threads using IBM's ManyEyes "word tree" application; this gives a sense of the frequency with which the word was used, and the context in which it was written:

Image:Read readers.png

"Learn to read" was a particularly common phrase. In general, natives were of the opinion that the major problem with strangers was that they were not reading the pages at all before commenting -- or if they were, they were failing to comprehend. One native on the Cancel Efax thread on Communications from Elsewhere spelled out this assumption:

The 'ignorant commenter' problem (I've had it too for various reasons, most associated with people deciding that a blog entry is the most effective means to get in touch with a person or company that's hard to get in touch with, as is the case here) is also self-selecting. Only inexperienced users who are put off by reading a dense text block of about 125 words are likely to post.[...]

This blog reader attributes strangers' resistance to reading to the length of the text rather than the technology. Length was blamed by other natives as well, not just in computer texts but elsewhere:

[...]people are stupid and they never read all the words in a post, or a newspaper, or directions, or whatever. They just see the word "Maury" and thing "OMFG it's maruy!!11!!!"
(Maury thread)

Yet the native on Elsewhere is referring to "inexperienced users" -- not people unused to reading per se, but people unused to using the computer. His conflation of Internet discomfort with other literacy phobias epitomizes natives' conceptions of "literacy," which were generally monolithic.

Natives sometimes took more constructive approaches to understanding strangers' presence, however. These natives were quick to refer back to search engines in an attempt to make sense of strangers' understanding of the first turn. This tactic indicates two things: first, they thought strangers were responding to search results; and second, their preferred way of confirming this hunch was getting a machine to provide evidence:

Oh no...based on the comments that are trickling in here, I think this post has gotten some magical level of search engine saturation that will cause people to think that this post was by Clinton or Bush or whatever they thing when they start addressing the subject as if he wrote it...
Remember the "Bill Gates, Philanthropist" saga?
(Deane, the blogger on the Bill Clinton thread)

Natives checked where in the search results these threads were showing up; they tried search terms which seemed to match the interests of strangers, and often found that the threads were in fact highly ranked, on the first page of results if not in the top three. They reported their findings back to the comment threads, or in their original posts:

Here's why this is happening I think:
This page is number two on Google for "philanthropist." I think that people looking for money search for this term, find this page. Why they think posting here will help is still unknown.
(Deane, blogger on the Bill Gates thread)
You're the #1 pagerank of AOL's search engine, too (although it's really just Google), so that might explain something.
(Overhaulin thread)
Explanation of comments (Oct 4) This page now shows up as #1 for "cancel google".
(added to original OK/Cancel Google post by blogger)
I really couldn't figure out how everyone was finding their way to that post. But this morning I googled "cancel my account" and guess what's the number one result? Thanks Google.
(trackback to JoCo thread, written by the blogger himself)

Natives also made recommendations to each other about how to adjust search engine ranking in order to keep strangers from finding the site. One blogger tracking back to the Maury thread suggested a change in blogger literacy practices, one which this thesis itself is slowly approaching:

The lesson here is that, for now, bloggers should be careful about the titles of their posts less they get too high a Google page ranking. I say "for now" because, before too long, the porn sites will figure out this little trick...

One blog reader wrote about a piece of HTML code which could be inserted in the header of the blog page in order to keep search engines from indexing the web page in their database, and thus keep an unintended audience from finding the web page through a search engine at all:

This is what <meta name="ROBOTS" content="NOINDEX,NOFOLLOW" /> is for. ;)
Posted by: dear overhauling <> at May 23, 2004 05:10 PM

Note that manipulating the HTML in this way is a literacy practice available only to the bloggers in these conversations, and in fact in this day and age not every blogger may have the ability to do so; that requires access to some code deeper than a given blog post, a privilege not always afforded to users of hosted blog sites like LiveJournal (though LJ does, of course, afford other kinds of privacy features).
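To make the mechanism concrete, here is a minimal sketch of how a program reading a page might notice the tag the commenter quotes. It uses only Python's standard-library HTML parser; the class name, function name, and sample page are assumptions made for illustration and do not describe how any real search engine's crawler is implemented. The tag is a request that crawlers choose to honor, not an enforcement mechanism.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute names arrive lowercased; values keep their original case.
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def asks_not_to_be_indexed(html: str) -> bool:
    """True if the page carries a robots meta tag containing NOINDEX."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Invented page whose header contains the tag quoted in the comment above.
page = """<html><head>
<title>Example post</title>
<meta name="ROBOTS" content="NOINDEX,NOFOLLOW" />
</head><body>...</body></html>"""

print(asks_not_to_be_indexed(page))
```

The sketch also shows why the practice depends on access to the page header: the tag sits in the head of the document, outside the body text a hosted blogging service typically lets its users edit.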

Sometimes, natives who thought poorly of the first turns taken by the search engines instructed strangers to try different search engines:

If you aren't familiar with google <>, perhaps you should be.
(Josh, Bees thread)
I can't figure out how all these people got here in the first place. This page doesn't appear until the fourth page of a Google for "Maury." Granted, I don't thin kmost of them were using google, as it's mostly used by the at-least minimally internet-savy (or those who live with them), but I doubt it's any higher on other search engines.
(Maury thread)

Using Google was a common suggestion when natives recommended search engines, suggesting that readers preferred Google's search engine to Yahoo! or AOL. However, natives believed that search engines were ultimately only as strong as strangers' skills at interpreting the meaning of search engines' second turns:

Folks, just because Google sent you here doesn't make it accurate.
(added to original post on Dean Kamen thread)

The choice of the word "accurate" is interesting here: it suggests the blogger conceives of Google as a mediated source of information. Contrast this to the idea of Google as a conversation partner.

Human-turn misunderstandings in blog threads

As hypothesized in the section The Machine Takes A Turn, above, listener-selects-previous seems to supplant speaker-selects-next as a rule for organizing conversations online. In this section I will give data to support this hypothesis from an analysis I performed of response patterns in the data. A trend appears to emerge from this analysis. It looks as if strangers, rather than not reading anything else on the page, slightly change their approach to the subject matter in response to the comments which appear on their screens when they have scrolled down to find the comment box: if they see comments which correspond to their understanding of the first turn (their search engine query), they state their query confidently; if they see comments which tell them this is not the right channel to pursue their search engine query, they temper their comments, changing who they address or expressing doubt in the channel.

Following that analysis, I will look into a few cases of human-turn misunderstanding which suggest the ramifications of a conversational rule which maintains less strict control over response time and allows for more misunderstandings to be left unresolved.

Within the comment threads themselves, exchanges in which more than two human users took turns explicitly in response to each other were rare. This may seem to contradict the basic three-turn structure assumed by this thesis, but the reason for this is a technicality. When a blogger or stranger took the third turn (recognition of an error, e.g. "Why are you here?" or "Can't you read?") it was often not specified to whom they were responding. Frequently, the subject of the response was plural (e.g., "What is wrong with you people?") So although many of these comments clearly respond to something that came earlier, it is problematic to present them as responses within any one exchange.

Even rarer were conversations which went on for four or more turns -- particularly where a stranger, having read a native's third turn, responded; but even when bloggers and readers responded to each other, exchanges did not generally last long. The exception was hijacked threads, where multiple turn sequences happened more often. (This is almost tautological, of course, as I defined hijacked threads as ones in which strangers were much more active and did more work to cement their own meaning of the thread.) Hijackers were distinct from other strangers in forming very vigorous communities to price and sell goods, gather information, and commiserate about their problems. These threads became conversations among willing participants, rather than angry exchanges between two distinct groups who didn't agree with each others' conversational habits.

Taking the assumption, made earlier, that online conversations (and perhaps all conversations) proceed based on a listener-selects-previous mechanism, it seemed of interest to investigate who was responding to whom, saying what, in these threads. To investigate this, I coded three of the longest threads (Harry Potter, Maury Povich, and Overhaulin') and developed visualizations of responses using GraphViz software.

These three threads were selected because of their length; at the time they were captured for analysis (at which point further comments which accrued during research were ignored) they had the most comments out of any in the corpus, the Harry Potter thread having 370, the Maury thread 716, and the Overhaulin' thread 256. Two threads which approached the length of the Overhaulin' thread were considered but not coded in part to keep the amount of coding manageable: the Utterlyboring AOL thread with 252 comments, and the On Lisp Online thread with 231 comments. Because it was largely a hijacked thread, like the Harry Potter thread, the Lisp thread was assumed to contain similar patterns, with higher numbers of interactions between hijackers, so coding it would be redundant. The Utterlyboring AOL thread was unusual: there were many more instances in which it was not clear whether commenters sided with the blogger or were speaking as strangers; and putative employees of AOL appeared in the thread, defending the company and driving the discussion somewhat off-topic. This muddied the waters of coding responses, particularly when it came to stranger and native camps. Thus, this thread was not assumed to be as illustrative of most stranger-heavy threads as the Maury and Overhaulin' threads were.

Each comment in these threads was given a number (by the ATLAS.ti software) and coded by the researcher to indicate to whom the commenter was responding. When a commenter responded to the original post, it was coded OP. A commenter responding to an unspecified comment or number of previous comments was coded PG (previous-general). Commenters responding to the thread more generally were coded MT (meta). Writers responding by name to a previous commenter; writers referring to a topic which was only mentioned in one prior comment; or writers referring to a topic which was only mentioned by one other commenter since the writer's last post were all coded as responding to the specific number of that previous comment.

Coding for response in this way made it possible to generate graphs visualizing response patterns; see Appendix C. In these visualizations, colored nodes indicate the order of comments in the thread. Blue lines indicate response to a specific previous commenter. Turquoise lines indicate response to the first post. Grey lines indicate response to an unspecified previous comment or general theme raised in the thread.
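As an illustration of how coded responses can become a graph, the sketch below writes Graphviz DOT text from a small table of codes. It is not the script used to produce the graphs in Appendix C. The sample codes are invented, and the shared "thread in general" node used as the target of grey edges is an assumption, since a PG or MT response has no single target comment.

```python
# Invented sample: comment number -> response code.
# "OP" = original post, "PG" = previous-general, "MT" = meta,
# an integer = the number of the specific comment being answered.
coded = {1: "OP", 2: "PG", 3: 1, 4: "MT", 5: 3}


def to_dot(coded):
    """Build Graphviz DOT text for a thread of coded comments."""
    lines = [
        "digraph thread {",
        '  post [label="original post", shape=box];',
        '  general [label="thread in general", shape=box];',
    ]
    for number in sorted(coded):
        target = coded[number]
        lines.append(f'  c{number} [label="{number}"];')
        if target == "OP":
            # Turquoise: response to the first post.
            lines.append(f"  c{number} -> post [color=turquoise];")
        elif target in ("PG", "MT"):
            # Grey: response to an unspecified comment or general theme.
            lines.append(f"  c{number} -> general [color=grey];")
        else:
            # Blue: response to a specific previous comment.
            lines.append(f"  c{number} -> c{target} [color=blue];")
    lines.append("}")
    return "\n".join(lines)


print(to_dot(coded))
```

The printed text can be saved to a file and rendered with Graphviz's `dot` command, for example `dot -Tpng thread.dot -o thread.png`.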

As can be seen, responses to specific previous comments are generally rather rare. (It is worth considering that out of all 39 threads, the threads I graphed not only had the most comments, but also appeared to have the most responses to previous commenters.) Responses to comments within five or six comments previous were more common than responses to comments which came much earlier in the thread; the average number of turns-to-response is relatively low. This begins to suggest that website visitors only read comments which are within the window once they scroll to a point where the comment box is within their field of vision on the screen (roughly the bottom of the page).
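The turns-to-response figure can be made concrete with a small worked example. In this sketch the codes are invented rather than taken from the corpus, and the function name is hypothetical; the distance for a comment is its own number minus the number of the specific comment it answers, and comments coded OP, PG, or MT are left out because they have no single target.

```python
from statistics import mean

# Invented sample: comment number -> response code.
coded = {2: 1, 5: 3, 9: 4, 12: "OP", 13: "PG", 20: 18}


def turns_to_response(coded):
    """Distance, in comments, between each reply and its specific target."""
    return {
        number: number - target
        for number, target in coded.items()
        if isinstance(target, int)
    }


distances = turns_to_response(coded)
print(distances)
print(mean(list(distances.values())))
```

A low average of this kind is what one would expect if commenters mostly respond to the handful of comments visible just above the comment box.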

At some point a slightly different tactic in coding responses presented itself. Having coded who commenters were addressing, it began to seem as if strangers had moments of doubt when they were trying to reach celebrities or technical assistance. They would preface their comments with "I am not sure this is the right page," or "Maybe you can get this letter to Maury" -- they expressed doubt of the channel. At times this doubt seemed to be correlated with the number of natives who had recently argued this was not the correct channel. Strangers also sometimes seemed to switch en masse from addressing celebrities in the first person to addressing them in the third person, or addressing an ostensible assistant to the celebrity.

This tempering or avoidance of direct address suggested an awareness of their surroundings which natives did not believe strangers possessed. Were strangers in fact reading all of natives' admonitions? Rather than causing strangers to give up and stop commenting, did this simply cause them to temper the certainty in their comments? Or did strangers at times simply stop commenting when natives told them to leave?

In an attempt to answer these questions, I coded the entire corpus again to capture the arguments strangers and natives were making about the channel -- their phatic arguments. The codes I gave arose from general patterns in these arguments. They were (throughout the entire corpus):

  • expresses doubt in the channel (17 instances)
  • if you're reading (17)
  • is not reading (24)
  • is this correct channel (8)
  • this is right channel (9)
  • this is wrong channel (102)
  • what is this channel (4)

Next, I began to incorporate these codes into the visualizations made of turns-to-response, combining them with codes about directionality of address (first person versus third person) in hopes this would yield a complete picture of the confidence with which commenters argued for their reading of who they should be addressing on this thread. Assigning solid colors to the nodes on the GraphViz graph based on these codes offered a way to view large parts of the thread at once and begin to see patterns. I employed the Brewer piyg11 color scheme (11 colors, magenta to green) to differentiate these combined codes. (CITE) The combined phatic-doubt-and-address-direction codes were as follows:

  • stranger or hijacker addresses celebrity in 1st person, OR "this is right channel"
  • stranger or hijacker addresses show (seen as slightly less confident than addressing celeb)
  • stranger or hijacker addresses the show or celebrity in 3rd person
  • stranger, "if you're reading;" OR hijacker comment (other)
  • stranger or hijacker, "is this right channel" OR "expresses doubt in channel"
  • stranger or hijacker, "this is wrong channel"
  • reader status unclear, addresses show/celeb, regardless of 1st or 3rd person
  • reader status unclear, "if you're reading" OR other statement of doubt
  • native makes a joke
  • native comment (other)
  • native OR reader status unclear, "this is wrong channel"

The results of this color coding are also visible in the graphs in Appendix C.
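As a rough illustration of how such a graph could be produced, the sketch below writes GraphViz DOT text in which node fill colors come from the Brewer piyg11 scheme and edge colors follow the blue/turquoise/grey convention described earlier. It is not the script used for Appendix C. The function name, the sample thread, and the assumption that the eleven combined codes map to colors 1 through 11 in the order listed above are all hypothetical.

```python
# Edge colors follow the convention described for the visualizations:
# blue = reply to a specific commenter, turquoise = reply to the first post,
# grey = reply to an unspecified comment or a general theme.
EDGE_COLORS = {"commenter": "blue", "first_post": "turquoise", "general": "grey"}


def thread_to_dot(coded_comments, responses):
    """Build DOT text for one thread.

    coded_comments: list of (comment_number, code), where code is an integer
        from 1 to 11 (assumed here to follow the order of the combined codes).
    responses: list of (comment_number, target_number, kind); target 0 is the
        original post, and kind is a key of EDGE_COLORS.
    """
    lines = [
        "digraph thread {",
        # Setting colorscheme on nodes lets fillcolor be given as 1..11.
        "  node [style=filled, colorscheme=piyg11];",
        '  c0 [label="post", style=solid];',
    ]
    for number, code in coded_comments:
        lines.append(f'  c{number} [label="{number}", fillcolor={code}];')
    for source, target, kind in responses:
        lines.append(f"  c{source} -> c{target} [color={EDGE_COLORS[kind]}];")
    lines.append("}")
    return "\n".join(lines)


# Invented three-comment thread, for illustration only.
sample_dot = thread_to_dot(
    coded_comments=[(1, 1), (2, 11), (3, 5)],
    responses=[(1, 0, "first_post"), (2, 1, "commenter"), (3, 2, "general")],
)
```

The resulting text can be saved to a file and rendered with the `dot` command-line tool, for example `dot -Tpng thread.dot -o thread.png`.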

More intensive statistical analysis would be required to demonstrate there was a definite correlation between natives saying "this is not the right channel" and strangers changing to the third person or expressing greater doubt in the channel. My hypothesis from eyeballing these visualizations is that the tendency to only read the most recent comments in the thread, suggested by the turns-to-response analysis above, does have an effect on strangers trying to figure out whether this is the correct channel. It appears that doubtful or third-person comments -- paler pink nodes -- are more likely to occur when more natives have strongly expressed that this is not the right channel within the half-dozen-or-so most recent comments. It might also be the case that the strongest correlation between expression of doubt and natives' admonitions comes when a native's wrong-website argument appears specifically at the top of the screen once the stranger has positioned the comment box near the bottom of their browser window.

At the moment this is just speculation. More effective analysis of this question would require a few changes. First, analysis should be based on the number of lines of text, not the number of comments. Comments vary in length, and thus the number that appear onscreen in a given window will vary, making the analysis I performed less reliable.

Second, some sort of estimation of the average size of users' windows should be taken into account. Screen "real estate" also affects the amount of text on screen, which affects what is visible when using the comment box.
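These two changes could be combined along the following lines. This is only a sketch of a possible future analysis, not something performed in this study: it assumes fixed-width wrapping, ignores comment headers and spacing, and uses invented comment lengths and window sizes. The function names are hypothetical.

```python
import math


def rendered_lines(text, chars_per_line):
    """Estimate how many screen lines a comment occupies once wrapped."""
    return max(1, math.ceil(len(text) / chars_per_line))


def visible_comments(comments, chars_per_line, window_lines):
    """Return the indexes of the most recent comments that fit on screen.

    Walks backward from the end of the thread (where the comment box sits)
    and stops at the first comment that would not fit entirely within the
    assumed number of visible text lines.
    """
    visible = []
    used = 0
    for index in range(len(comments) - 1, -1, -1):
        needed = rendered_lines(comments[index], chars_per_line)
        if used + needed > window_lines:
            break
        used += needed
        visible.append(index)
    return list(reversed(visible))


# Invented comments of 200, 50, 400, and 80 characters.
thread = ["a" * 200, "b" * 50, "c" * 400, "d" * 80]
tall_window = visible_comments(thread, chars_per_line=80, window_lines=8)
short_window = visible_comments(thread, chars_per_line=80, window_lines=5)
```

With the same thread, the taller window shows three earlier comments and the shorter window only one, which is the variation a count based on comments alone would miss.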

Ideally, such an analysis would be performed in conjunction with, or brought into comparison with, eye-tracking studies of users reading the screen. This would give a far more definitive explanation of what strangers are reading -- and whether or not they are taking the wrong-channel messages written by natives into account.

I will now transition to looking at specific examples of exchanges. These examples, while still demonstrating that strangers did in fact read previous comments, illustrate the possibilities for misunderstanding introduced by the openness of the listener-selects-previous model.

The first example is from the Josh Server thread, where one stranger wrote:

19. Heather says:
Josh, if you're still out there. You were my first crush. I didn't do my homework until after 6:30 to watch the all that show. email me if you ever check back. <> I do realize you wrote this more than a year ago.
December 31st, 2003 at 12:34 pm

The time stamp on this comment is puzzling; it is not clear to which piece of text this stranger is responding. Prior to her comment, there are three pieces of text which could be considered to be from actor Josh Server: the blogger's original post (written on October 9th, 2001), in which he says he is not Josh Server (though his name is Josh), and two comments from strangers claiming to be Josh Server, to which the blogger has added warning disclaimers:

4. josh server <http://sorry%20guys%20no%20home%20page> says:
hey thanks guys its josh here, well hope all is going well just browsing through the pages and thought id leave a little message to get yuo guys all hyped up....its actualy qiute funny looking back on all these photos of i was crazy, i dont know what i was thinking, anyway take care.
the joshinator
[Probably not from the real Josh Server, folks. -Josh]
October 26th, 2002 at 10:10 pm


17. Josh says:
Boy, why you taking my name.....all these girls love me not u hahahaha
[Again, probably not from Josh Server.]
December 4th, 2003 at 12:22 pm

It is possible that Heather is responding to the original post, or the comment written less than a month before her own. However, considering that she refers to text which was posted "more than a year ago," and the original post was written more than two years ago as of the time of her comment -- not to mention the fact that the blogger clearly states that he is not Josh Server -- it seems most plausible that she is responding to comment 4. This may also be supported by her comments "if you're still out there [...] email me if you ever check back:" she seems to assume a commenter who might return to the thread sporadically, rather than one who is tied to the thread, as a blogger might be. Her selective attention to one of these three comments demonstrates the openness of online communication to picking a conversational turn to respond to.

It is also worth noting that if she is responding to comment 4, she is ignoring the note at the end of the comment which attempts to discredit the claim that it was written by Josh Server. And why shouldn't she? Aside from an understanding that there may be unreliable narrators in fiction, what precedent is there in a traditional literacy toolbox which helps us to interpret a text which attempts to discredit its own validity? More to the point, what precedent is there for this kind of self-undermining in conversation? Aside from Josh-the-blogger's emphasis and brackets, how are we to know that these are messages emanating from separate people? Isn't a comment usually written in such a way that it has only one author? What faith should we place in a message which seems to contradict itself, but signs its own name on both sides of the contradiction? There are few writing conventions for making such claims.

Josh Larios and the deceptive "Josh" commenters muddy the waters of context: there is a Josh who owns the blog, a Josh who is an actor, and a third party or parties trying to redirect attention intended for Josh Server. Josh Larios tries to claim authority by pointing out it is his blog, his original post, and questioning the authority of the interlopers; however, there's a lot of text on the page, and the headline (which he wrote himself!) plays into the literacy practices of the search engine, and of its users who (if the M&Ms/ketchup example holds) treat search results as "the right answer," the correct referent for their search. Josh-the-blogger is fighting a losing battle arguing for his own first post as the first turn to which other turns should respond.

Another example of human-turn misunderstanding appeared on the "Answers to Riddles" thread. One stranger, Colton, did not appear to recognize that answering riddles was not the blogger's purpose for the site (rather, she had posted, without context, a list of answers to riddles which she had found in a Cameroonian book). He was corrected by Unsinn, another regular reader of the blog, who had already theatrically corrected other strangers:

Oh Colton, if only you could answer the hardest riddle of all!
(hint: it's about information literacy)

Colton appears to have taken this as a challenge. He chose a previous riddle posted by a stranger in the comment thread, and delivered the following response to Unsinn in his next comment:

'97% of Harvard graduates can not figure this riddle out, but 84% of kindergarten students were able to figure this out in 6 minutes or less. CAN YOU GUESS THE CORRECT ANSWER?'
ive never known a kindergarten who knew what information literacy is, but if you were in harvard, that might be what you would have said

Whether Colton's answer is facetious is unclear; however, on the surface, it appears he may have attempted to pick out the hardest riddle previously posted on the page (chosen because it supposedly defeated the graduates of an elite university?) and relate it to the reader's request.

Finally, the "How To Sell A Wedding Dress" thread provided an example of the confusion which could be introduced when participants ignored the possibility of picking from among previous comments (or, as Marcoccia calls this kind of selection, polylogues).

The Wedding Dress thread was a hijacked thread; the blogger saw strangers' activity as interesting, and discouraged other natives from snapping at or redirecting strangers. Thus, there were a few multiple-comment exchanges on this thread, as natives and strangers talked amongst themselves about wedding dresses.

One exchange stood out, however, as the stranger genuinely appeared to misunderstand specific other comments (not the original post) on the thread. This stranger began the exchange with her own comment:

89. miss b <> Writes:
August 10th, 2005 at 2:28 pm
My Fiance decided we needed to postpone our June wedding back in April to deal with some "issues" he was having....well 5 months later I find out that those issues were another women. He has been cheating on me so I left him and now I need to sell my wedding dress. I didn't do it earlier because I thought he would get this shit together and we would get married. But he would rather sleep with a married whore. Let me know if your interested. Im asking for $400.

Miss b then provides two more links to pictures of the dress she is selling.

It is not clear to whom the commenter immediately after her was responding:

90. Aaron V. Writes:
August 10th, 2005 at 3:00 pm
Even though I'm getting married next week, that dress looks a *bit* small for me, especially on that model.

Indexically, "that dress" and "that model" are vague in their reference. "That dress" could refer to the dress miss b is selling, the dress worn by the jilted groom pictured in the original post, or just about any of the other dozens of dresses mentioned in the thread to date.

Regardless of whom he was addressing in his response, the implication that the commenter, who presents as male, might wear a wedding dress suggests this comment can be read as a joke in most heteronormative circumstances. However, miss b uses the turn-taking sequence of the comments to orient her indexical reading of "that dress," rather than using the name of the commenter. She reads Aaron's comment as a straightforward response to the pictures she provided:

91. miss b Writes:
August 11th, 2005 at 6:44 am
The dress is a size 16. Those pictures are from the website

Had miss b returned to comment a few turns later, after intervening commenters had mentioned or linked to pictures, her mention of "those pictures" would have been indexically weakened; we would have even less of an idea which pictures she was referring to. However, because these comments come immediately after each other, miss b's comment comes across pretty clearly as a misunderstanding of how a sequence of apparent turns makes sense in a comment thread; or else of Aaron's joke, Aaron's gender, or quite simply his unmoored use of "that."

In a face-to-face conversation, of course, this misunderstanding might have been repaired quickly. Participants might have re-identified who were the objects of each comment; they might have drawn on gesture, gaze, or pauses in the conversation to help correct the misunderstanding. However, as Marcoccia notes, there is very little pressure to repair errors in online conversations. Hence, this particular exchange ends there. (This makes it something of a marginal case by the rules of what is allowed in the corpus: it does not include a final turn in which a native tries to clue miss b in to the joke.) It does not incur any stigma for natives to fail to respond to miss b; and who knows whether she ever noticed her error, sold her dress through this or some other site, or even returned. The lack of mutual context fails to bind the stigma of a misunderstanding to the identity of any of the participants.1

1 With perhaps two exceptions in the corpus. In these, commenters indicated that they expected a response and expressed consternation when it did not arrive. In the most striking of these, the commenter, classified as "a stranger," actually returned over the course of a few days to follow up on his first comment:

7. Adrio says:
I am 11 and i will not tell you what state I am from.[...]Could you please just give me some info and percausions?I would like to know how long after they are hatched(as adults of course)they learned to fly.If you do not know i understand. Thank you.
September 4th, 2002 at 7:43 pm
8. Adrio says:
Are you going to anwser me or not?????????????????????????????????
September 8th, 2002 at 8:39 pm
9. Adrio says:
September 10th, 2002 at 3:35 pm

It may be worth noting Adrio's reported age; as he is young, he may not have developed a sense of the lack of stigma about nonresponse. The only other stranger who expressed concern over a lack of response indicated that he was deep in his cups while he was commenting.


While the last section covered elements related to conversational management, this section will be more concerned with analysis within the categories defined by Jakobson -- specifically the semantic referents intentionally ignored by conversation analysts.

Jakobson considers context, or referent, to be its own linguistic function; in his diagram of language's elements, he separates it from channel, addresser, and addressee. However, he contradicts his dividing-up of functions by claiming also that no linguistic function is fully independent of the others.

Addressers, addressees, and channels in communications all have referents; they can usually be indexically identified in the local context by participants. This section will discuss issues which arose regarding the "correct" identification of addresser, addressee, and channel. Each of these elements of context routinely proved problematic in these comment threads, and was contested by natives and strangers.

The phatic, or channel, element of conversation was a bone of contention between natives and strangers. Natives often argued that strangers misconstrued what channel they were using. They mustered evidence, using the machines themselves, to support their case. Strangers, meanwhile, left ample evidence of their understandings of the websites they were using, which were sometimes conflated with email, chat software, or even word processors.

Simple indexicality was a huge concern in these threads. Natives and strangers argued over the correct use of indexical words. Also called "deictics," or "shifters," indexicals are words like "you," "me," "s/he," "that," "this," and others which change frequently in a conversation, sometimes (in the case of "you" and "I") changing from conversational turn to conversational turn depending on who is speaking them. Not only were these everyday conversational terms contested in these threads, but other means of indexing oneself in time and space were also at issue. Natives and strangers used email addresses, website URLs, and even street addresses quite differently to establish their identities and the identities of others.

Beyond these elements, natives and strangers expended a good deal of energy creating the context of strangers' comments. Natives were prone to represent strangers as illiterate women who used AOL. Strangers, by contrast, identified themselves as part of a community of well-meaning, God-fearing television show fans.

The net effect of these arguments over channel, indexicals, addressers and addressees was to delineate two starkly different contexts from which each group was drawing. Each group defended against the other's interpretation of proper context. This is a Latourian move: mustering evidence for texts, channels, and other actors in order to claim authority in how they are being used. The disjunction between these two contexts suggests a re-reading of Innis's observations about media, time, and space: ultimately, the time and space disruption in conversation which the Internet supports makes possible the unhappy, simultaneous coexistence of multiple contexts, and never the twain shall meet.


Any medium is a communications channel; matters regarding Jakobson's phatic (channel) function thus arise in mentions of the medium itself. In straightening out miscommunications, strangers and natives repeatedly referred to the channel. Strangers in particular generated a great deal of talk testing out, describing, or otherwise signaling their understanding of the comment channel they were using, through forms of speech, phatic-related messages ("is this thing on?", "testing, testing, 1,2,3"), and descriptions of how they were using search engines and comment threads. They directly and indirectly made it clear that they did not understand what kind of computer medium they were using, or how it was going to deliver their message. Some of them negotiated the possibilities of having other people deliver the message for them, or pressed on posting their message even when they had doubts as to whether it would go through. Natives, in turn, spoke up to correct their understanding.

Allusions to non-Internet channels

There were moments when strangers used language more common in spoken communication than in written communication. Like the response to the machine-turn as Turn One in a conversation, this lends credence to the possibility that strangers might be conceiving of their interaction with computers more conversationally than natives do.

Hello, yes I do watch your show Over Haulin alot with my wife sitting beside me.
(Overhaulin thread)
Yes I was wondering what the name of a spider is.
(Spiders thread)
WEll... im trying to prove to my lame a55 ex that hes the father of my child...
(Maury thread)
well maury i love your show and i just wanted to tell you i love you show
(Maury thread)
well if this is maury well my name is Tara
(Maury thread, posted by blogger as he originally received it from a stranger by email)

What all of these phrases have in common is that they begin with a word or two which does not do any semantic work in the message. These introductions appear to be phatic: in spoken conversation, they might serve to get the addressee's attention.

How does one get the addressee's attention on the Internet? In most live (synchronous) chats, the software does this for you; it may flash or make a noise to get attention; at the least, the text moves up one line, and the shift may catch the reader's attention.

How does one get attention in asynchronous writing -- even in non-digital print? Beyond large-font titles, boldface text, headlines, cover design, and other external or visual elements designed to catch the eye from a physical distance, it seems print has classically assumed it already has your attention. Letters begin to draw the reader's attention with the "Dear so-and-so" salutation (discussed in the poesis section). What strangers are doing here is something else entirely, much more like spoken attention-getting, which seeks to assure the speaker can control the current turn.

Another comment on the Maury thread hints at a concern with turn-taking:

well guess i will let u go hope to hear from u soon,.
(Maury thread, emphasis mine)

This is a phrase one often hears on the phone, when the person on the other end thinks they have taken up too much of your time. The speaker could be using it as a pat phrase, but it could also indicate a sense of contemporaneity between the addresser and addressee, with an attendant concern for not spending too much time on one's turn.

How many other written texts express a concern for the reader's time? And which ones do? Mail solicitations from charities come to mind; sometimes advertisements. Do "objective" texts -- newspapers, scientific articles, legal documents -- ever voice the writer's concerns for the reader's time? 1

Besides beginning their utterances as if speaking, strangers also expressed a lot of confusion about the ways the channel was working when they referred to the channel itself. Sometimes before launching into a comment, strangers would test the channel to see if their message would actually appear:

hey can you read this?
(Lisp thread)

Sometimes strangers remained unsure of the channel even after posting a message, so they posted again to be sure their message got through:

Hello. I have had a lisp for a long long time and i want it to go away!!!!!!!!!!!! i get made fun of every day at school because of it. can you or someone tell me how to get rid of it!?!?
Posted by: someone on November 19, 2005 12:30 AM
I might have already left one... but anyways... I have a lisp that i cant stand. i get made fun of every day because of it. if anyone can help plz email me suggestions.
Posted by: someone on November 19, 2005 12:37 AM
(Lisp thread)

Further channel confusion manifested in comments posted multiple times without awareness of such posting. Some repeats were made verbatim, with the same text used repeatedly. In one case, a stranger gave a clear indication that he did not know he had posted the same text repeatedly:

Do not want Google. Take me off please.
by nelson November 4, 2008 8:41 AM
Do not want Google. Take me off please.
by nelson November 4, 2008 8:42 AM
Do not want Google. Take me off please.
by nelson November 4, 2008 8:42 AM
This is my second try. Please cancel Google from my home page.
by nelson November 4, 2008 8:44 AM
Do not want Google. Take me off please.
by nelson November 4, 2008 8:44 AM

By the time he notes it is his "second try," nelson has actually made four posts to the comment thread. The identical text in the previous comments suggests that nelson was either cutting and pasting the same phrase, or simply re-pressing the "submit" button on the comment form. Whether he did not notice the previous comments or the system was not posting them for him to view is unclear; regardless, the mismatch between his observation and the system's recording of his comments makes it clear he is having trouble making sense of the channel.

Bloggers and readers labelled re-posting as an error. In the Maury Povich thread, the blogger at one point notes that a previous stranger had posted a comment 50 times. It is not entirely clear which stranger the blogger is referring to, as he appears to have deleted duplicates. There is one copy of the comment immediately prior to his. Prior to that, though, there are four comments from the same stranger. This stranger, who posts in all-caps, interestingly does not simply copy and paste the same comment multiple times. She makes a variety of small and major changes to the text each time she posts:

Posted by: TIFFANIE SMITH on October 30, 2003 1:54 AM
Posted by: TIFFANIE SMITH on October 30, 2003 2:16 AM
Posted by: TIFFANIE SMITH on October 30, 2003 2:19 AM
Posted by: TIFFANIE SMITH on October 30, 2003 2:40 AM

Exactly why Tiffanie felt she needed to post at least four (and possibly fifty) times is unclear. The third post makes it appear she may have been interrupted by pressing the post button too soon, or else cut some text inadvertently before posting. The changes made to the third and fourth posts make it look as if she may have been trying to correct some of her spelling. The first post is much shorter than the subsequent three; it may be that she decided she had more to say.

It might seem that Tiffanie was not looking at the page to see how many times she had posted. However, her correction of her spelling, her changing of her writing tactics, and the repeated apparent cutting and pasting suggest that she was looking at the page. The difference between this stranger and the blog's regular readers seems to be a different tactic for correcting one's errors. Tiffanie leaves evidence of her multiple revisions on the site, instead of changing them all before pressing "submit." The permanence of this record appears not to bother her the way it bothers the blog's readers. It might be possible that she was treating the "submit" button as a "save" button, so that she would not lose her writing before she finally "sent it to Maury;" she might not have been aware of the permanence of the text she was submitting. However, we cannot be sure from the evidence she leaves.

In a number of cases, particularly the ones where strangers were asking to have accounts canceled, they left evidence of how they believed various aspects of the Internet and computer channels worked. The simple action of typing out a request for someone to cancel their account in a comment thread or through a text box is itself an indication of this assumption: the strangers who did so assumed the request should be made of another human being, in language, rather than, say, through a series of checkboxes and buttons on a computer's dialog box.

In the grand scheme of things, this is not an unreasonable assumption! Most of the technical assistance threads begin with the blogger's description of someone's difficulties in getting a service to cancel their account; these long-suffering natives talk online to human tech support agents. One way or another, human beings are ultimately (at minimum, legally) responsible for web service subscriptions being called into account when charges are made. The problem here is that strangers have selected an ill-suited channel for such a request. As natives repeatedly point out, it would be difficult and unsafe if they responded to strangers' cancellation requests; strangers would need to give them highly personal information such as credit card numbers, and natives, if they called AOL for the strangers, would not be doing any negotiating strangers couldn't do themselves.

Some strangers asked to be "taken off the list" or "deleted from your files" when asking to have Google "canceled" from their machines. The misconception here is that the appearance of Google on their machines was related to some internal information at Google, rather than being caused by software installed locally on their own computers. One wrote that he thought Google had "erased" his home page, and another that Google had canceled his Yahoo account; in fact, the page was not erased and the account not canceled, but a URL in the browser preferences had likely been changed.

Here is a range of statements implying strangers' understanding of how Internet elements work, each followed by an explication of its assumptions:

please get google off my web site (OK Cancel thread)
Google's presence can be controlled by a person other than the stranger; possible misunderstanding of what computer element constitutes the stranger's "web site"
I had no idea that i was signing up for google home page i already have a home page yahoo. (google gadgetopia thread. This comment was copied verbatim and pasted into the comment box by another commenter; neither left contact information)
"Signing up" is the means of changing a home page; a "home page" is the default page in a web browser; the home page is possessed by the computer user
Please cancel my account and home page because I have not been able to use my pc or get me email for two days (google gadgetopia thread)
The presence of Google's homepage means the speaker has a Google account; it prevents the stranger from accessing other pages, suggesting the user relies on links and not on entering URLs for navigation
please remove me from Google home page (google gadgetopia)
Information about the speaker is resident in Google's files, and this is why the "home page" appears; the page is akin to a (magazine?) subscription service
cancel google.... It keps tryimg to cancel my yahoo acct. I do no want google I cannot remember my password or anything else... please get off of my computer. a neighbor signed me up for google and i do no want it. go way(google gadgetopia thread)
The presence of the Google home page disables or inactivates the speaker's Yahoo account; the speaker appears to be relying on autofill to maintain a password; again, the presence of the "home page" on the speaker's computer implies an account with Google
I want to cancel Google and put me back to yahoo Google has messed up my whole way of running my computer I want to go back please thank you--- to yahoo(google gadgetopia thread)
The presence of the Google "home page" pervades the computer system; the user has an inflexible set of schemas for computer use
I've had Yahoo as my home page for years, and reinstalled it when I noticed I wasn't gettin e-mail, and thougt Google was cancelled, but every other day it comes back up as my home page. TAKE IT OFF NOW!!!!! (google gadgetopia thread)
Google can be cancelled; lack of email implies a need to re-install software; Yahoo's home page is software to be re-installed (rather than a URL preference); re-installing software is a solution for not having the correct home page; a human agent can help
I am unhappy about Mozilla Firefox changing my Home page. It was set up for Hughes Net and you changed it to Google. If I had wanted Google I would have set it up a long time ago. I now want Mozilla Firefox set up as Hughes Net on my computer again. I hate Google (google gadgetopia thread)
Speaker (probably) correctly identifies that Firefox is involved in changing the default ("home") page; speaker implies that a human agent ("you") is involved in the change anyway; speaker denies personal involvement in changing page; a remote human agent can reset the speaker's preferences; Firefox can be set up "as" a service provider, or more likely its home page
I want to cancel my Google account. All I wanted from them was their Autofill feature. Somehow I got into iGoogle by mistake. I am so confused with everything I just want out. PERIOD. I may come back later for their Autofill feature, but for now get me out of my misery. JUST DO IT NOW! (Jonathan Coulton AOL thread)
Autofill is a feature only available to those signed up with Google accounts; a human agent can help cancel the Google account
Human agent can help the speaker remove a single interface element of a larger service which can be canceled
I did not ask for a google account I dont know how it got on my computer i WISH TO CANCEL NOW. (Jonathan Coulton AOL thread)
Speaker did not take any actions which would change the preferences on her machine
i need to restart my account coz there's a problem with my account when i need to delete some message or when i need to delete any thing, it dosen't work (Jonathan Coulton AOL thread)
Accounts, like computers, can be "restarted;" difficulty deleting messages is a problem with the user's individual account, rather than the service
please remove aol from my computer. my son got you n it and now i can't get anything else. i am not pluged into the telephone line. i would like to try you out if it will wok comletely without it the telepone line. thanks (Utterlyboring AOL thread)
AOL prevents the user from accessing other services; the problem is not with the lack of telephone service; AOL might work without an Internet connection (though the speaker may be attempting to use broadband, wireless, etc)
I just want all google wiped off my computer as from now 5 /6 2008 i have been trying for over an hour now so PLEASE DO IT NOW (google gadgetopia thread)
"All Google" is something resident on a computer, and can be erased; human agent can help
Cancel the entire Google thing. It hogs everything. Can not get to the sites I want because Google overtakes them. (google gadgetopia thread)
Google's "overtaking" and "hogging everything" (taking up screen space? memory? bandwidth? again, replacing the home page?) prevents the speaker from getting to other sites
take me off your AOL account (Utterlyboring AOL thread)
An "account" belongs to the company, or to someone other than the writer
I just wanted to leave my 2 cents- thanks for the laugh! Sara plz leave my e.mail address out! (Spiders thread)
The blog author, not the blog software, is responsible for posting an email address once the commenter has written it into the email field and submitted their comment

Some strangers requested responses or cancellations without giving all of the information needed for someone to get back to them. For instance, one stranger trying to cancel AOL gave his password, but not his account name; one stranger on the Maury thread requested a response from Maury without providing any contact information. Especially in the cancellation cases, this seems to imply that the stranger believes the Internet service can determine who s/he is from incomplete information.

There were a number of stranger comments which seemed to overtly indicate that strangers believed they were writing private email, rather than posting publicly. The second comment of one stranger on the Dean Kamen thread suggests this most strongly:

Dear Mr. Kamen:
Effective immediately, I would like you to remove my letter that you posted online. That letter was supposed to private and confidential.I urge you to remove it immediately. I do not want you to use me Sir. You have never responded to my letter. Remove it immediately.
Patrick J. Udeh, Ph.D.

This commenter clearly expected his message to Dean Kamen to be sent privately, and attempts to remedy the public posting of the message. Another comment from the same thread does not indicate the stranger commenting thinks that s/he is currently sending email, but may generally confuse email with surfing the Internet:

I liked to watch National Geographic in Cable channel, then of there feature is Dean Kamen " inventor", as i watch it and hear all the story about him. I told myself this is guy is my "inspiration". His 21st century, a combination of engineering and computer knowledge. Everytime i check on my email, i see to it that I browse on his name and read all his achievement. [...]

Unless the last line indicates that the commenter has subscribed to news feeds specifically about Mr. Kamen, this comment appears to indicate confusion. The phrase "browse on his name" especially indicates confusion, as "browsing" is more often associated with web pages, not email. The stranger may be surfing the web, believing that what s/he is doing is called "reading email."

Not all strangers displayed this kind of confusion. Some, on the Maury Povich thread, were aware that posting to this thread is not the same as email, and withheld their messages while they asked for access to a more appropriate, private channel. One says she has "been trying for hours to find a site that I can send a private story on, but have failed." Another begins, "Maury, Hello,I would like to have your e-mail address." Unlike the comment from the Kamen thread above, these commenters are at least clear about the level of privacy offered by this channel.

Some strangers wrote as if to the celebrity or company, but requested an email address. While this group of strangers did not mistake comments for email, it is not clear where they thought their comments were going, or whether the comments would be public or private.

Finally, one stranger was quite frank about her confusion about the channel:

he is so cool
what kind of webpage is this
(Josh Server thread)

Channel-appropriate poesis; or, print literacies

Just as strangers sometimes used language which indicated a more verbal set towards the conversation, so too did they use written forms of address which are customary in traditional print forms, including letter-writing and newspaper want ads. While they wrote in pitch-perfect style for these print forms, natives were not appeased. Natives overlooked these stylistic forms and instead attacked strangers for their poor spelling, grammar, and punctuation.

In a number of ways, strangers hewed more closely to traditional print forms than natives. They employed traditional correspondence salutations and closings and newspaper writing styles; and they gave clues that they were not all as uneducated as natives thought.

When writing to a celebrity, strangers frequently employed structural elements of postal letters. Comments often began "Dear (name of celebrity)," with appropriate punctuation and line spacing afterwards for a standard letter. They were also disproportionately more likely to use more informal greetings like "hi," "hey," or "hello" than natives were. (That is, when natives were being serious; in their parodies of strangers, natives were about twice as likely as usual to begin with hi, hey, or hello; this element of natives' parodies was among the most accurate, while other elements of their parody were wildly off-base.)

Number of casual greetings used by each group, throughout the corpus

Group: # of commenters in corpus
Strangers: 1354
Natives (serious and in parody): 728 (all natives)
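
A tally of opening greetings like the one described above could be approximated with a short script. The following is a minimal sketch, not the procedure used in this study: the function names, the three-way classification, and the decision to look only at the first word of each comment are all assumptions made for illustration. The sample comments are opening lines quoted elsewhere in this chapter.

```python
import re
from collections import Counter

# Assumption: only these three words count as "casual" greetings.
CASUAL_GREETINGS = {"hi", "hey", "hello"}

def opening_style(comment):
    """Classify a comment by its first word: 'letter', 'casual', or 'other'."""
    match = re.match(r"\W*([A-Za-z]+)", comment)
    if match is None:
        return "other"
    first_word = match.group(1).lower()
    if first_word == "dear":
        return "letter"
    if first_word in CASUAL_GREETINGS:
        return "casual"
    return "other"

def tally_openings(comments):
    """Count how many comments open in each style."""
    return Counter(opening_style(c) for c in comments)

# Opening lines quoted in this chapter, used here only as sample input.
sample = [
    "Dear Maury,",
    "dear sir, I have tried repeadely to cancel my trial period with aol",
    "Hey MK and A fans! Here's a quiz for you!",
    "Maury, Hello,I would like to have your e-mail address.",
    "he is so cool",
]
counts = tally_openings(sample)
print(counts)
```

Because the sketch inspects only the first word, it misses greetings that come later in the line, such as "Maury, Hello," above. A hand-coded count would catch those, so an automated tally of this kind is best treated as a rough first pass.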

Strangers sometimes also closed with traditional postal sign-offs, such as "sincerely," or "Best Wishes." More than one young commenter signed her post "love, x." One signed "LOVE, ME!" though she had chosen to close an earlier comment in a different textual tradition: "I WENT BACK TO THE CLASSROOM AND THE END". Sometimes, strangers signed their names within the body of their message as well as adding it in the "name" field, thereby posting their name twice.

The most traditionally flawless of the postal literacy displays in the corpus was the following letter, which included both the proper line breaks for a formal letter and a postal address. Like traditional postal letter structure, the address helps to index the speaker and reader in time and space:

Dear Maury,
I know what you are thinking. It's that time of year where everyone is fundraising. My name is Alvena Willis and I'm not just another fundraiser, but I am a twenty-one year old, African American senior at Saint Martin's University in Lacey Washington. I will be graduating spring of 2006 with a major in psychology. Just to name a few things I have participated in include the creation and development of a diversity and equity office, and even led a mission trip to Mexico where we built houses through Amor Ministries.
Next semester I have plans to study abroad in Australia in order to further develop and expand my knowledge on international views on issues such as diversity, religion, and psychology. Another near future endeavor is graduate school where I will pursue my masters in counseling with a focus on school counseling, and earn teaching credentials for psychology. I have plans to use this education to serve as a high school counselor, Pastoral Counselor, and to teach psychology. With in that time I will also work on earning a doctorate in psychology in order achieve operation of my practice.
This is what I want to do to serve my community. I am passionate about reaching my goals, and all I need is a lot of support. I am asking for you to invest in the future of our community by investing in my education.
Any contribution is tax deductible and will be greatly appreciated, and reciprocated by my actions.
Thank you so much in advance for you generosity.
Alvena Willis
Estimated Financial Needs:
[were itemized below]


Checks or money orders can be made payable to: SMU Office of International Study Abroad, for Alvena Willis, or payable to Alvena Willis.
Send to:
Saint Martin's University
Attn: Study Abroad Office
5300 Pacific Avenue SE
Lacey, WA 98503-1297
Posted by: Alvena Willis on November 21, 2005 1:02 PM

Outside of strangers addressing celebrities (and bloggers and their readers parodying those strangers), print salutations were generally not used. One exception was a commenter on the "Casting News for OotP" thread, a hijacked thread. This repeat commenter, named Charlotte, took a unique role on this thread. While like many strangers she identified herself as seeking more information about casting for the Harry Potter movie so she could be in it, she also did not directly request this information from the blogger or anyone else. Instead, she attempted to set herself up as a source of authentic information about the movie, mentioning her (talent) "agency" and giving details about the studio which was producing the movie. She tended to sign the end of her posts "With all due respect,/Charlotte Clark/ Your Harry Potter Speaker" (or "Harry Potter Helper.") She used line breaks frequently, as more often befits a letter than a comment, and began her posts with a somewhat informal greeting separated from the body with a line break. In her case, the print salutation seemed calculated to cement her reputation as a reliable authority on the topic of the thread.

These writing practices are interesting for two reasons. First, they are piquant in light of readers and bloggers referring to strangers as "illiterate." In fact, from the perspective of traditional literacies, strangers sometimes wrote far more literate comments than bloggers and readers! This discrepancy demonstrates the multifaceted nature of literacy, and underscores the inadequacy of traditional literacy education to prepare readers for the literacies of power on the Internet.

Second, it is worth noting the more physically indexical nature of traditional print salutations, which locate the speaker and the intended hearer in time and space. More on the physical indexicality of strangers' comments in the subsequent section on indexicality.

In a few of the posts, strangers drew on writing styles which are often seen in newspapers. These included advice columns and classified ads.

Some strangers appeared to sign off as if writing a letter to a traditional newspaper advice column:

[...]Please e-mail me to let me know where I could find information about [the spider], or let me know what u think.
Wondering in Oregon,
signed looking for an angel, Rosie and Cinnamon

Mike's sign-off locates him in physical space, ostensibly to help the reader determine what kind of spider Mike might have seen. Interestingly, though, these strangers use both the locating sobriquet and their real names. In many newspaper advice columns, only the sobriquet is used in place of a name, and the writer is kept anonymous. I am not certain whether newspaper readers usually give themselves these (sometimes geographically indexical) titles, or whether columnists and their editors usually add them in, so it is not clear whether, in a newspaper context, this self-addressing would be out of line. Regardless, it does give some insight into the commenter's mental model of web correspondence.

In a post where strangers were trying to buy and sell their wedding dresses, a few made use of shorthand and syntax which used to be the hallmark of classified ads. Classified ads (before the collapse of that market for print publications) were traditionally terse, because fees to run the ad were charged based on how many characters were in the ad. Standard abbreviations -- "OBO" for "or best offer," for example -- were used to conserve space and cash. Despite the fact that there is generally no charge to post online, and that character limits are far more generous, some strangers followed these guidelines anyway:

lace corset back wedding dress with train for sale. size 12, $250 or obo. email me at
Need to sale OLEG CASSINI 5438 SIZE 6. BRAND NEW! $700 OBO

Despite allegations by natives that strangers had never seen the inside of an educational institution and couldn't write reasonable prose if their lives depended on it, there were a small handful of strangers who proved to be authors of published books, PhDs, and other well-educated types. These included the manager of Architecture Week; doctors of civil engineering and chemical dependency counseling; the author of multiple published books on computer networking, who also holds a patent in his field; and a number of people who self-published or were seeking publishers for books they had already written. Aside from those who self-published or were seeking publishers, all claims of publishing or degree were backed up by at least one other comment or page on the Internet, and in the case of published books these could be found on Beyond these strangers whose literate fluency had been ratified by publishers, many other strangers indicated they were in college, and a few implied they were on college or university staff.

Attacks on strangers' traditional literacy skills were a major component of natives' comments. Natives pointed out errors in strangers' spelling and grammar, implying these were part of a generalized "illiteracy" which was the cause of strangers' appearance on these threads. Common points that came in for criticism were strangers' failure to use punctuation and overuse of the caps lock key.

Many other comments from natives tied spelling, grammar, and punctuation -- literate production -- to the reading comprehension problems they assumed strangers were having:

[...] i have a riddle what happened to reading comprehension and also being able to write using grammar and punctuation?!?!? (Riddles thread)

Though, of course, some natives also expressed concern with strangers' reading comprehension on its own, as I did on the Mary Kate and Ashley Olsen thread on my blog:

gus wrote: Hey MK and A fans! Here's a quiz for you! Can you tell me what the following words mean?
1) Sirens
2) By-products
3) Scrawny
4) Genetic defect
5) Enticing
6) Horrible
Bonus points: Can you guess what the person who wrote the post above thinks about Mary-Kate and Ashley Olsen, now that you've defined those words?

Some natives suggested, in jokes, that it was the malformed writing itself that was keeping strangers' pleas from being read by the celebrities they were trying to contact:

Dear Shirley;
We would love to OVERHAUL your husbands truck. Unfortunately, Due to NAFTA regulations, we have to limit the number of grammatical errors on applications to five. You exceeded our "grammatical error tolerance" limit by 633, Therefor, we are unable to accept your application at this time. Please feel free to reapply once you get your GED.
(Overhaulin thread)
Maury hates kids that don't use punctuation. He sends them to punctuation bootcamp. COMMA!
(Maury thread)

One reader made it clear that traditional writing skills were among her criteria for evaluating online writing:

And please, the word is spelled "huh" not "hugh". If you're going to rant and rave, make sure people take you seriously by not using caps and spelling correctly.
(Maury thread)

Another exchange, following a comment from an apparent stranger who wanted Maury Povich to help publish her book, suggested that blog readers wanted to hold online writing to the same standards by which school writing and published writing were judged:

You sound like you have the literary capacity to be a writer. You totally got my attention with that zombified post of yours. And I can tell you READ. Cripes...
you can publish your own book. But if your writing sounds anything like your posting, no one will buy it. Publishing doens't [sic] make you money. Sales do.

Natives also used a few traditional print publishing phrases, suggesting participation in the community of those who traditionally control print. For example, Josh on Communications from Elsewhere used the abbreviation "ed." for "editor" when commenting within strangers' comments to disprove their validity.

Not only would bloggers and their readers assert their superiority over strangers by pointing out their poor grammar and spelling, but they turned the same tactic on each other to prove a point as well. Many times, readers would play the grammar or spelling card to rein in other readers who were mocking strangers:

I wouldn't laugh too hard - all of you that are calling people retards. Most of you can't even spell correctly in your messages.
(UtterlyBoring AOL thread)

Natives understood good spelling and grammar online as closely tied to the same skills in the classroom and in publishing. They saw the presence of certain of these traits as a sign of good Internet content, and their lack as denoting stupidity and laziness, while at the same time they overlooked other literate forms like letter-writing. They were generally of the opinion that spelling, grammar, punctuation, and reading comprehension were separate skills, and that writing skills did not necessarily impact on reading skills. Natives saw strangers' main problem as either one of comprehension or failing to read. However, awareness of the distinction did not stop them from commenting on strangers' writing failures (while generally ignoring their own).

Proxy channels and "trying your luck"

Beyond confusions about the technological capabilities of the channel, a few strangers left indications that they believed every comment form on the Internet could yield contact with a celebrity or a company from which they needed assistance. This usually took the form of assertions that they had tried to contact the celebrity/company before, when there was no evidence they had used that site in particular in an attempt to make contact:

dear sir, I have tried repeadely to cancel my trial period with aol and have been unable to.
I would appreciate your cancelling this for me Immediately!
Thank you...........sincerly Paul L Garcia
(UtterlyBoring AOL thread)

Another comment indicated a flexible sense of how many channels might reach the celebrity:

My peeve is the way the "NETWORK" holds their stars in the tower, unaproachable and aloof. "Alton, Alton let down your hair hat we may climb the spiked up hair!" I can't get a reponse from "contact me". from anyone. [...] I hope the STARS realize the sacred network can be the death of them too. Alton, if the tv gods permit you give me just a quick shout. I have a question that your "live chat" gurus referred me to Google.
(Alton Brown thread)

She has tried using this comment thread -- which she does link to "blogers" (sic) in a subsequent comment, so she does label this channel as a blog -- as a channel to reach the celeb. She indicates that she has also tried channels marked "contact me" and "live chat." A brief scan of Alton Brown's website on the Internet Archive's Wayback Machine, around the time of her comment, indicates that she was most likely not trying live chat or contact me links on that website; no such links existed on that page. Nor were these links on Communications from Elsewhere, where the Alton Brown thread is posted, so she isn't referring to this site, either. She has apparently tried a number of channels on a number of sites to get her message through.

Outside of the corpus (through a technicality; the bloggers did not identify the threads as containing errors, though others had identified the threads elsewhere as having such errors, and this was one of the threads which I found posted to MetaFilter because this phenomenon was seen as interesting), one stranger was found trying to reach skateboarder Tony Hawk on two different websites. On one, he wrote "Tony ithink this is the fourth website iv emailed u on but please com to my birthday." (emphasis mine; on )

Aside from the general conception that celebrities were reachable through the Internet (in an era when the blog was king, before Facebook and Twitter seemed to make this more of a reality) and that many different websites might lead to such a contact, some strangers indicated they believed their messages would have to pass through the hands of other people before reaching the celeb. One stranger looking for Bill Murray indicated beyond "to whom it may concern" that she thought the letter would be vetted before it reached the celebrity:

I don't know who screens these letters, but I really hope Bill Murray reads this eventually.
(Bill Murray thread)

Especially on the Maury Povich thread, a number of strangers asked the celebrity they were addressing to help them speak to another celebrity:

I would like to meet a past maury guest tremayne any suggestions?
(Maury thread)
Maury I love your show please help me meet Tina Turner in person everyday i listens to her cds and I love to watch the movie what love gots to do with it please let Tina know that i love her very much and she change my life . love shanell
(Maury thread)

In a few cases, strangers indicate that they have read the blogger's warning about the channel, but express a belief that the blogger has an open channel to a celebrity and can pass their message on:

ok rob, i know u say this is not the site that maury will actually see, but if u find info on how i can really contact him,i would appricaite it.
(Maury thread)

One stranger not only read the warning, but even returned multiple times to mock other strangers who had not read it. Still, she asks the blogger, who has established he has no contact with Maury, to help her contact the talk show host:

hmnnn...I guess some people just see the picture and automatically think that this is Maury's website....let it keep on going, this is hilarious....I started laughing halfway through reading all of these and I was like, god are these people just
But, I ask you just like the smart ones did, do you possibly have Maury's e-mail address, I would really like to tell him my story and hopefully have a good outcome...
E-mail me back if you think I am worthy enough...LOL.
(Maury thread)

This commenter knows who wrote the page, and has read that it is unreasonable to write to Maury on this page; the smart way to proceed, she assumes, is by asking instead for the contact information which will yield the correct channel to speak with Maury. This bears a resemblance to a pattern observed by the Vermorels, who did research on hundreds of letters to celebrities which were sent to a British newspaper. (CITE) One implication might be that strangers who press on anyway believe that, like newspapers, blogs have some sort of journalistic privilege which unlocks the gates between celebrities and everyone else.

Some strangers were worried by the sheer volume of comments they saw posted before theirs; they thought it was unlikely their target would read what they had written:

(Maury thread)

Finally, one stranger gave an indication s/he believed s/he would have to pay to actually get a letter through to the celebrity:

OMG it is really annoying trying to find a way to contact AB without having to pay[...]
(Alton thread)

This is cause for concern considering the tremendous number of websites purporting to offer celebrity addresses for a small fee -- usually not more than a dollar, but still of questionable validity and less utility than contacting the celebrity's agency.

A number of strangers, while aware it was unlikely their comments would be read, pressed on anyway. Some expressed a sense that it was better to comment than not:

I know everyone's gonna b like "hey, he's never gonna even see your message" but, i don't really give a hippo! This is to Johnny! Not to anyone else
Johnny, if you ever read this..please email me! i'm a huge fan
(Engel-Cox Pirates of the Caribbean thread)
I'm sure you get a lot of these requests, but I'm Caddie Master at Olympia Fields Country Club (south of Chicago) - & woud love to have you come out for our Caddie Banquet[...] I know its a shot in the dark, but why not try, huh?
(Bill Murray thread)
Anyway, the chances of you reading this are next to nil, but what have I got to lose but the few seconds it will take me to write this?
(Overhaulin thread)

The last two comments model the thought process in detail: the cost for posting this comment is minimal. It is far outweighed by the potential benefit of receiving a response to the message. Like the lottery, you can't win if you don't play.

Others, expressing their doubt the celebrity was reading, seemed content to have anyone read their story; writing offered a sort of catharsis, or the chance that someone else who could help would stumble across their plea:

(Maury thread)
My brother in law needs surgery. His insurance will not cover it nor can we afford it. Without it, he will die. [...] Please, someone help us.
(Maury thread)
praying to god every day to give me away to help my mom to better her lifes and my brother to have they own clouth shoes,bed and room. i hope god bless me and my family
i now is some one out there with a good heart
(Maury thread)

The latter two comments, in fact, did not directly address Maury Povich or anyone related to his show at all, and as such, had a prayerlike quality. More on pleas with religious language in the Context section.

A few strangers exhibited skepticism based on a belief that technology could be used to fool the other strangers they saw commenting in the threads:

is this the real maury web site or is this fake please let the public know because maury has so many fans that would like to really contact him. thank you
(Maury thread)

One stranger admonished natives not to believe that Celine Dion was really eating children (a running gag perpetuated by natives): "you guys are kids and you are just been foolish don't you know it is the effects of a well designed computer program so you guys should pls go and rest i think it has been overworked."

Another stranger chastised strangers on the Maury thread, offering her own interpretation of what was going on:

You guys are crazy for writting in the comment package///
you are all being fooled by some idiot that made this up. Its a "blooper comment" why dont you read what it says on the top......Take a look before you write any sad story....

While this commenter urges "reading" the way blog readers and bloggers often do -- including reading the initial post and headline, and using the comment form correctly -- she is not reading the content the way the blogger does. She thinks the blogger "made this up" and it is a "blooper," implying that the post itself is a fake. The blogger and readers, by contrast, continue to assert the truth of the blogger's visit to the Maury Povich show described in his initial post; it is not "made up."

A third stranger withheld necessary information from the assumed addressee pending his confirmation that he was using the right channel:

i want to cancel my AOL account cant you people UNDERSTAND?? Just cancel it please my username is 30timeslucky, and I wont give u my password unless u can gaurarntee youll be able to cancel it
(Utterlyboring AOL thread)

Clearly, misidentification of channel is not the only factor leading strangers to make what bloggers and readers would identify as errors. It is one of many such missteps. And not every stranger misidentifies the genre of the page they are reading. On the Lisp thread, a commenter begins her comment:

I'm so glad this blog is still in operation-
and that I've found so many teens like me who suffer from a lisp.

This begins to hint that strangers are making attempts to reclaim the channel or context; more on this in a later section.

Natives gave their own interpretations of strangers' technology understandings. For example, when Josh of Communications from Elsewhere provided some server log information about one of the strangers on his blog, one of his readers gave this interpretation:

This user read your sentence which incorporates the link, understood it to be an address of some sort, and equated "page" with email. Wrong, surely, but also imaginative and pattern-seeking, the acts of someone struggling to learn to work with a computer for the first time.

These discussions were among natives, but at times, natives also tried to argue with strangers' interpretations about the channel (the blog comment thread). The difference between natives' and strangers' understanding of the channel is clearest when using visualization tools from IBM's ManyEyes application. ManyEyes's "word tree" visualization gave a clear picture of how natives and strangers used words about the channel in different ways. Natives used words like "page," "website," and "site" more frequently than strangers did. They were more often referring to the site they were commenting on, while strangers using "website" or "site" were more likely referring to the page they perceived as belonging to a celebrity, or their own "home page," which they were seeking assistance with.
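
The kind of contrast a word tree makes visible can be roughed out by tallying the words that sit directly before and after a keyword. The sketch below is only a simplified stand-in for that idea, not ManyEyes's own algorithm or the analysis performed here; the function name and tokenizing rule are assumptions, and the second sample comment is invented to imitate a native's usage (the first is quoted from a stranger in this chapter).

```python
import re
from collections import Counter

def keyword_contexts(comments, keyword):
    """Count the words that appear directly before and after `keyword`."""
    before, after = Counter(), Counter()
    for text in comments:
        # Assumption: lowercase alphabetic tokens are a good enough word split.
        words = re.findall(r"[a-z']+", text.lower())
        for i, word in enumerate(words):
            if word != keyword:
                continue
            if i > 0:
                before[words[i - 1]] += 1
            if i + 1 < len(words):
                after[words[i + 1]] += 1
    return before, after

comments = [
    # Quoted from a stranger's comment in this chapter.
    "Please cancel all google services on my home page and cancel my account.",
    # Invented example in the style of a native's comment.
    "This page is not run by the celebrity; read the top of the page.",
]
before, after = keyword_contexts(comments, "page")
print(before.most_common())
print(after.most_common())
```

Run separately over native and stranger comments, a tally like this would separate phrases such as "this page" and "the page" from "home page", which is the distinction the word trees bring out.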

Here is a representation of the uses of the word "page" as used by natives. (This dataset includes only comments in which natives did not appear to be joking; in facetious comments, natives were more likely to mimic the style of strangers, and thus their use of words in those comments was more likely to resemble the word use of strangers):

Image:Page readers.png

The top branch (/comments-page) reflects text inserted automatically by blogging software; I have included it to give a sense of the other branches' weight. Looking more closely at the branches which include uses of the word by individual human users:

Image:Page readers zoom 1.png

Image:Page readers zoom 2.png

References to "this page" or "the page" did one of a few things: they asserted the page's authorship information; they explained what the page could or could not do; they explored the page's relationship to server logs or search engine results; they referred the reader to a part of the page with important information (such as a link to a page more relevant to strangers' needs, or to authorship information); or they discussed the comments left by strangers.

Compare this to uses of the word "page" by strangers, below. ("Page" alone in this dataset also yields the "comment-page" phrase from the blog software, along with one other use of the word which comes in a thematic non-sequitur about fine art on the Hoppity Hop thread):

Image:'page' strangers' literacy words for final draft.png

All of these comments come from one of the threads in which strangers made pleas to have Google "removed from their home page." Notably, none of these comments refer to the page on which strangers are currently commenting; they apparently refer instead to the first page these commenters see when they open their web browsers. Note also that the phrase "home page" did not appear anywhere in natives' comments, even on the same threads; the "home page" appears to be a literacy element which is important to strangers, but not to readers.

These stranger comments often reflect fundamental misconceptions of how web browsers work, but they also reflect some of the more nefarious realities of browsing the Internet. Control of the "home page" is ultimately up to the user of a web browser; it is a setting which can be changed in one's own preferences relatively easily. Change in one's home page is not, in this case, linked to an "account" or a web page's list or "database," as many strangers thought it was:

Please cancel all google services on my home page and cancel my account.
I do not want a Gmail account. please remove my name from your database.

Requests to someone else to change the home page setting can only yield instructions to "fix it yourself;" on demand, another user cannot change that setting without sitting down where the requester is and taking control of the computer. This is reflected in comments from blog readers on the Cancel Google-themed threads:

go to config,and than to software...than remove it from the list...and google toolbar is gone...

However, software, website, and operating system developers constantly jockey to convince users to change their home page settings. When a user installs new software, that software may ask whether the user wishes to change the home page, so as to direct more of the user's traffic to the software developer's site. This request may pop up in a window which users are confused by, or which they click through quickly without thinking, so that the home page preference changes without users realizing they made the decision themselves. One computer scientist lumps this kind of prompt in with what he calls "evil interfaces." (Conti, 2008) Thus, while strangers misunderstand how they could be controlling their home pages, they are not entirely wrong in believing these matters are not in their own hands.

"Page," of course, is not the only word which can be used to describe the piece of text to which natives and strangers are contributing. "Website" or "site" also apply. Looking at uses of these words, more strangers do actually refer to the page on which they are writing:

Image:Site strangers.png

Image:Website strangers.png

Many of these uses express doubt or confusion as to the nature of the page they are using. A number of these comments express doubt about authorship. A few strangers contest whether or not celebrities they are trying to reach actually read this page:

i think you should all count yourself lucky that Tom probably doesnt read this site, coz if he did he would be very disapointed!!!!

Compare these results to visualizations of the same words used by natives:

Image:'site' readers' literacy words for final draft.png

Image:Website readers.png

Again, in these comments, natives refer to the page on which they are currently writing, calling strangers' attention to particular elements to enforce their understanding of who wrote the blog, who it belongs to, and who has a right to declare the appropriate ways of approaching the topic. Also interesting is the matter of proportions: out of 693 comments, natives used the words "site," "website," and "page" 196 times, while strangers used those same words only 107 times in 1369 comments. This is considering strangers without considering hijackers, but including hijackers would not change much: hijackers used these words 64 times in 849 comments, roughly proportional to the number of times other strangers used these words. Natives proportionally expended much more energy indexically locating the comment thread and its topic than strangers did.
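The proportions in the paragraph above can be checked with a few lines of arithmetic. The following is a minimal sketch using only the counts reported there; the unit "uses per 100 comments" is my own choice for readability, and it counts word uses rather than comments containing the words.

```python
# Counts as reported above: (uses of "site"/"website"/"page", total comments)
counts = {
    "natives": (196, 693),
    "strangers": (107, 1369),
    "hijackers": (64, 849),
}

def rate_per_100(uses, comments):
    """Uses of the three words per 100 comments."""
    return 100 * uses / comments

rates = {group: rate_per_100(u, c) for group, (u, c) in counts.items()}

for group, rate in rates.items():
    print(f"{group}: {rate:.1f} uses per 100 comments")
```

On these figures natives come out at roughly 28 uses per 100 comments, against roughly 8 for both strangers and hijackers, which is the disproportion the paragraph describes.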

Bloggers and readers took other steps to reinforce what they thought was the correct identification of the site. Josh of Communications from Elsewhere, where many threads in the corpus appeared, posted the following bulletin on each page of his blog:

NOTE: This is a personal web site, not a public message board or chat room. If you're thinking of asking a question about something, make sure that I have presented myself as someone willing to answer questions on that subject by going to the top of this page and reading it all the way from top to bottom.

Bloggers and their readers were convinced that the correct identification of the channel was as the blogger's own channel. When confronted with topical, indexical, or channel identification drift, they re-asserted their personal ownership and rights to the channel:

Despite what she or you or anybody might think, this is not a public messageboard where you can post general-purpose questions asking for help and hope some random internet stranger will write back. This is my personal journal, which I choose to make public. If someone posts a comment asking a question, I take it that they are asking it of me.
(Josh, blogger on the Bees thread)
As far as the reason I'm "on here," "here" is my web site. I own it, I run it, I pay for it, and one time I saw the Maury Show. I still don't understand why you (and dozens of other people) are looking for help here.
(Ryan, blogger on the Maury thread)
Just to clarify: [...]
b) This is actually my website.
c) I genuinely wonder if anyone who posts here is clear on the tone of the post they're responding to, and if they are, what the internal mathematics are of their decision to ignore the tone set by my initial post.
d) I also wonder if any of these responders realize this is a blog, not a bulletin board/forum, and how this affects their decision to post.
(this author's own website, Olsen Twins thread)

Rod of emphasized to strangers that the original post itself had clarified the abilities and limitations of the channel, an explanation which they had missed as they attempted to send messages to actor Bill Murray:

People are reminded that "Murray's only contact with the film business is through a freephone number" apparently [this was mentioned in the original post --ed], so, as entertaining as these comments are, you're pretty unlikely to get hold of him on this site!

1 Dissertations, for example, have no such concerns; perhaps the assumption is that beyond the committee, they will never take up anyone's time.


Indexicality has been identified by New Literacies scholar Don Leu as particularly relevant to new forms of reading and writing on the Internet.1 However, Leu has been using the term only to talk about literacy as a whole. In this section, I will give evidence from the corpus that indexicality (as I will call it) and other elements of context have been catastrophically disrupted in mediated conversation: in the Internet communication described here, but probably throughout the centuries due to print, television, radio, and other media as well. As described in the machine conversation section, media expand the human ability to communicate beyond participants' immediate spatial and temporal context, requiring human participants to adopt a prodigious number of new maneuvers to make sense of a conversation. Those maneuvers will be described here.

1 Leu refers to it as "deixis;" "shifters" are of course another name for these words. I will consistently use the word "indexical," as it is easier to understand, and the semantic implications of "index" are particularly relevant to the underlying structure of software.

Disagreements in indexical interpretation

Strangers' errors were often identified by bloggers and readers as indexical. The Maury Povich thread provides an exemplar of this kind of misunderstanding. In the first post, the blogger commits the following sentence to print:

As soon as I saw Maury walk towards this guy (who, honestly, was definitely a guy through and through), I could see what was coming next.

The indexicality of "Maury Povich" in this sentence is third-person: "he, Maury Povich." A few comments into the responding thread, a stranger begins her post,

Maury, I was so impressed with what you did for the little girl with the club feet and hands, how you got a wheelchair van, and computer for her.

The indexicality of "Maury" in this comment is second-person: "you, Maury Povich." This identifies Maury as the addressee and, in traditional conversation, perhaps the next speaker. A number of subsequent commenters employ the same indexicality as the stranger. The blogger responds with a correction to this indexical assumption a few comments later:

FOLKS, PLEASE... Maury has nothing to do with this page and he will never, ever read this page. Trust me.

This is a re-assertion of the original indexicality: Maury is not available to this conversation in the first-person "I, Maury" indexicality implied by the strangers' second-person reference. The blogger asserts that Maury is not available as a participant, either as a speaker ("nothing to do with this page") or a hearer ("and he will never, ever read this page.")

There were a handful of strangers who were indexically flexible, going from addressing a celebrity to other assumed readers within the course of a few messages:

So maury please help me if you can I live in high point Nc.And some of my friends are to ashame to hang out with me because of how i look. So again please help me. And I need maurys address.
(emphasis mine)

These two successive comments are from one stranger on the Alton Brown thread:

43. Rita <> says:
I love chemistry and all Alton has to offer in the molectular way that is his own. My peeve is the way the "NETWORK" holds their stars in the tower, unaproachable and aloof. "Alton, Alton let down your hair hat we may climb the spiked up hair!" I can't get a reponse from "contact me". from anyone. [...] I hope the STARS realize the sacred network can be the death of them too. Alton, if the tv gods permit you give me just a quick shout. I have a question that your "live chat" gurus referred me to Google. Your not as dumb as your handlers would have us belive. I need your assistance, and by the way Micheal here in Cleveland says hello. Tremont is beconing.
44. Rita <> says:
Just a quick note about me. I'm 50's young and terminally pretty and ill. [...]

This stranger begins by making a general address with no specific recipient. She refers to a few parties -- TV networks, and TV chef Alton Brown -- in the third person. She then continues to address the celebrity directly, as if he is a party to the conversation. In her second comment, she goes on to address "blogers" (sic). Whether she means multiple bloggers whom she believes write on this site; multiple bloggers not assumed to be present in the conversation, but addressed in a manner approaching apostrophe; or bloggers meaning "people participating in this blog community," possibly including those who I am coding as readers, is not clear. This particular stranger also treats the channel as fluid; more on that in the phatic or channel section of the analysis.

How were participants supposed to indexically orient themselves to who might be available to the discussion at hand? As mentioned earlier, bloggers do not always write up "About" pages indicating who they are. (Nielsen, 2005)

Bloggers and readers dealt with the invasion of strangers by trying to clarify the correct indexical "you" of this particular blog, often directing them to someplace where they could more effectively contact "you, Maury Povich." They did this with written admonitions, links within comments, instructions added to the original post or elsewhere in the structure of the blog, or banners which were visually distinct from the rest of the blog's design. The latter two actions were of course only available to bloggers, who had the power to change the structure of their sites outside the comment thread.

Natives made it clear that they considered looking at the URL the appropriate way to figure out who owned the page, and thus who might be participating. At times, bloggers and readers attempted to teach strangers their way of using URLs, so that strangers might read websites the way natives did; this was a somewhat hamfisted effort:

Hey, people! You cannot cancel your efax service at this website. See that funny little thing up there that says ""? That's 'cause this is This is not Efax. Nobody at can cancel your efax service. If you can't figure out what this means, get off the Internet and don't come back.
(Elsewhere Cancel Efax thread)
Stupid fucking hicks.
If you want to apply for the show, go to this link:
Notice, it's the real site.
(Overhaulin thread)
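The reading habit natives were trying to teach amounts to pulling the host name out of the address and comparing it with the site one believes one is visiting. A minimal sketch of that check follows; the host names are hypothetical placeholders (the addresses in the quoted comments did not survive in this text), and the function names are mine.

```python
from urllib.parse import urlsplit

def host_of(url):
    """Return the lowercased host name of a URL, or None if it has none."""
    return urlsplit(url).hostname

def is_on_site(url, expected_host):
    """True if the URL's host is expected_host or one of its subdomains."""
    host = host_of(url)
    if host is None:
        return False
    return host == expected_host or host.endswith("." + expected_host)

# Hypothetical example: a personal blog post about a company's service.
page = "http://personal-blog.example/archives/cancel-service.html"
print(host_of(page))                        # the part natives pointed at
print(is_on_site(page, "company.example"))  # is this the company's own site?
```

The check is mechanical for software, but it presumes the reader already knows that the host name, rather than the page's topic or title, signals who owns the page.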

Other less-than-fruitful strategies for pointing out the URL included putting the correct URL for Maury Povich's website in the URL field for their own comment (so it would be written as attached to their own name) or including the correct URL in a joke comment:

Maury, I had a friend once. She was the sweetest friend in the world. She helped to wake me up in the morning. She saved my life numerous times. Heckfire, she probably even killed for me a few times. However, tragically, we moved away one day and I never saw her again.
So... Maury... please... could I be on your show? Could you help me find Lassie again? Please?
Posted by: Timmy <> on April 26, 2003 11:25 AM

Josh at Communications from Elsewhere posted a disclaimer on his site very near to the comment box strangers would use to submit comments. He found this did not work, however, and more than once reported following up on that disclaimer with an email to strangers who had already left comments:

Here's what I asked this time:
There's a notice on the page you used to submit that comment. It reads:
NOTE: This is a personal web site. Leaving a comment here will not get your complaint heard by representatives of eFax, Amazon, Qwest, or any other company.
It is located between the space where you wrote your comment, asking me to cancel your efax account, and the button you clicked to submit your comment.
What part of that notice did you think didn't apply to you? I'm really, really curious. I specifically say in the disclaimer that I'm not eFax. The page itself has multiple instances of me saying that I'm not eFax, and that asking me to cancel your eFax service will do you no good. So how is it, exactly, that you thought you were on an official eFax page where you could cancel your service?
My mind is officially boggled. Help me out, here.
(Elsewhere Efax thread)

While plenty of bloggers, readers, and even strangers posted redirects, the redirects did not deter all future strangers from commenting. It is not clear from the data on the pages how many strangers did successfully make it to the correct website, but strangers' comments did not seem to ease up much after natives posted redirects. Even bulk posting of redirects did not seem to help. By a count using ManyEyes, the correct phone number for AOL's cancellation line was one of the top ten text strings in one AOL cancellation-related thread, meaning the number appeared many times on that page. Again, it was to little effect.

Disagreements on how to write indexically

When it came to identifying themselves indexically, natives and strangers took very different approaches. Put simply, strangers indexed themselves in time and physical space so specifically that natives regarded them as careless with their privacy. Blog readers, meanwhile, indexed themselves mostly on the World Wide Web only, going out of their way to obscure some information, like email addresses. Bloggers gave some general information on their "about" pages, but still were not as specific as strangers.

Strangers, unlike bloggers and readers, tended to make it quite easy to find them in physical or data space and to determine who they were. Particularly in the celebrity-oriented comment threads, stranger comments began with a personal introduction locating the stranger individually, as well as geographically and demographically:

Hello, my name is Samantha Holscher. I am 22 years old and live in Fayetteville Ar. My reason for this e-mail is my mother. She is 51 years old and in the late stages of lung cancer. She is not doing very well and is in critical condition. My mother, Pamela absolutely adores Alton Brown. She watches his shows and is constantly talking about how much she loves him. She tells me stories about the schools he went to and how he got his start in life, and on food network. My point of to all of this is that my mother is not going to live for much longer, the doctor says it has moved into the stages of wet cancer. It would be short of a miracle if Alton brown could take time out of his very busy life to surprise my mother and just say hello. My mother would be speechless and sooooo happy to have met him. I think it would really brighten up her time left here on earth. So if you think you could help me out in any way possible I would sincerely appreciate it lifelong! My email is <>
My address and telephone number are as follows. 3938 N. Parkside Dr. #7 Fayetteville Ar. 72703 (479)502-2500
Thank you for your time and consideration.

Many posters writing to celebrities identified their age and their family status, as well as medical details. Some strangers posted their mailing address, as this young woman did; others merely posted the city, state, or country they lived in. In the threads about celebrities, the form "My name is x, I am y years old and I live in z" was common in the first line or so of a stranger's comment. In the "How to sell a wedding dress" and "Overhaulin'" threads, these details usually came after the mention of a particular car or wedding dress. Because letters to celebrities and letters about TV shows (like Overhaulin') frequently requested that celebrities and their shows do something for the stranger writing the comment, it seems that the addition of a geographic location was intended to speed the process of contact, letting the celebrity know how easy or interesting it might be to respond to this particular stranger. In the "Overhaulin'" thread, in particular, geography is offered by strangers as an attraction to the producers of the show:

If you ever decide to add some norther flavor to the show and are willing to Come to Canada. My husband's car here in Vancouver could use an 'overhaul.'
He would have absolutly no idea if you stole the truck because we live in Lake Havasu Az. and he would never guess that you would come this far to steal his truck.

Likewise, with the wedding dresses, a geolocation would serve to let other readers know how feasible it would be to purchase the dress. Geographic addresses offer the possibility of making an indexically tenuous conversation more concrete, facilitating the exchange of goods and services.

By contrast, a complete mailing address was unheard of among bloggers and readers. Readers only posted their city, state, or nationality when it was seen as relevant to the blogger's topic, which was very rare (for example, in the "Spiders! Ack!" thread, when the blogger and his wife commented that they thought they had seen black widows but were not sure of the geographic range of their habitat, readers reported where they lived as they noted that the spiders did or did not live in their area) -- or when they were indicating they were from another country/state and thought the strangers (assumed to be American, possibly Southern) were stupid.

Some commenters attempted to take advantage of strangers' willingness to hyper-identify themselves, and of their apparent lack of understanding of how to establish the credibility of others online:

fraud said on 04/17/05 @ 01:23 PM:
Hi I was wondering if anyone wants to cancel AOL could I try for you? I just need your phone #, Screen name and security question to cancel it. I'm gonna call and record the conversation then put it on the net so everyone can see how sucky they are.

This was not the only case in which the deceiver made it clear in the poster name field ("fraud," in this case) that deception was the goal of this comment. This tactic doubled as a means to call strangers out for being "stupid;" anyone who responded would not only have failed to respond to the first turn (blogger's post), but would also have failed to attend to the name of the commenter they were responding to. (I ended up coding these deceptive commenters as "reader status unknown," as ultimately they did not see eye-to-eye with bloggers; bloggers who caught these cases of phishing deleted comments or key information from them to protect those who might fall for these ploys.)

Whether or not these phishing attempts worked is not clear from the data in this corpus, but strangers had, at points, posted their credit card numbers and account passwords (on threads where they were asking to have accounts canceled), so it could be expected that some might fall for this tactic.

Outside of the name, email, and URL fields of blog comment tools, to which the blog software adds a time stamp (and may record an IP address which will not provide much indexical information to the average human reader), there is little in a blog's infrastructure to suggest that the commenters are expected to indexically establish themselves when writing a post. This anonymity has been an available feature of the Internet from its earliest days, and has both positive and negative effects, but the difficulty it poses to physically indexical understanding is an obstacle to making sense of a comment thread. It is worth noting that strangers have provided all these physical indexicals in spite of lack of blog support for them.

Natives, by contrast, carefully and lightly indexed themselves in virtual space, as I will now explain. They made use of the built-in indexical tools provided by blog software -- the email address field and the URL field -- to identify themselves. However, they did so in a way that referred more to their online than their offline lives, and also protected them from spam. This included linking to their own websites, and writing their email addresses in coded ways.

Many natives used the URL field to link to their own websites when commenting. This offered them the opportunity to provide much more context to their comments about their interests, perspectives, and who they were. For a blogger, this was a way of reiterating that they owned the website. A comment by Josh of Communications from Elsewhere, for example, would begin:

Josh <> says:

This link returns the reader to the main page of the blog, giving a view of the range of topics the blogger has covered recently. Josh is identified as the Josh who writes Communications from Elsewhere, not another commenter by the same name. He does not use his last name. However, a bit of searching around turns up an "About Me" link which provides more identifying, indexical information about this Josh:

Communications From Elsewhere is the personal site of Josh Larios, who isn't going to even try to maintain the third person for longer than this sentence. My birthday is March 15, and I was born in 1974. You do the math, because some days I sure can't.
Nicknames: on IRC I go by hades, RJL20 or Empath. My holy name is Pope Dubious Provenance XI. Most everybody just calls me Josh. Hades is from my pretentious high school days. RJL20 is my assigned contact handle from InterNIC, back when they were still assigning those. I'm not really sure about Empath. I've been using it for over ten years, and I can't remember why. I don't think it was a Star Trek reference, but it might have been.
I live in a small house in Seattle with my lovely wife Cam. We got married on August 23, 2003.
I work as a web developer for the University of Washington Bothell campus. Previously, I was a desktop support analyst for the UW Seattle campus, and before that I was a unix sysadmin for dotcom startups. I don't have a degree, although I did spend four years at the UW as a student.
I'm the oldest of three kids; my sister Mary lives with her husband in Oregon, and my brother Mike lives in Seattle. My father is a junior high Spanish teacher and my mother is a poet. They live about half a mile south of Cam and me.

This is quite a bit more identifying information than most bloggers in the corpus provided; indeed, Nielsen has noted that many bloggers do not identify themselves thoroughly on their blogs. (Nielsen, 2005) While Josh's "About" page does not give an exact street address or phone number where Josh can be contacted, it does give his last name, his employer, and the city and state where he can be found, making it possible for the user to find this information with a little additional research.

While Larios provided more information than most natives, a majority of natives' pages did include a link to an "about" page or some amount of information indexing them more specifically in time and space, including geographic location, profession, age, education history, marital status, and so forth.

In comments, natives (including readers and bloggers) were more likely to provide a link to their own page than were strangers. Only 37 strangers linked to webpages which were ostensibly their own, while 148 natives linked to theirs, and there were of course more strangers than natives. Another feature of natives' URL use which differed from strangers is the kinds of pages they linked to. In the rare case that a stranger did provide a link to his or her own page, it was more likely to be a Piczo or MySpace page. Natives, meanwhile, were more likely to have pages affiliated with a particular blogging platform, such as LiveJournal, WordPress, or BlogSpot. This brings to mind the kind of social digital divides described by boyd, who describes the ways in which MySpace is aligned with working-class and other marginalized youth while Facebook is the social networking site of choice for the college-educated. (2008)

The use of email addresses also set strangers (including hijackers) apart from natives. The following charts depict the breakdown of domains of strangers' email addresses:

Hijackers were slightly more likely than other strangers to list a Hotmail or Yahoo email address with their post. They were also much more likely to list a Gmail address; only one non-hijacker stranger listed an address in this domain. Hijackers were also less likely to list an AOL address or one affiliated with an Internet service provider (ISP). A very small number of non-hijacker strangers listed email addresses with the defunct Internet services Prodigy (two commenters) and Compuserve (one commenter).

From at least one comment, it appeared that seeing other commenters include their information influenced subsequent strangers to leave the same information. On the Avril Lavigne thread, one stranger wrote, "i see everybody give his e-mail i think i will do the same its" There also seemed to be temporal clustering in commenters' tendency to give certain personal information, or to present information in a particular pattern; strangers may have been looking to other recent posts to decide what information they needed to post.

There were exceptions to the hyperindexicality rule. Some strangers failed to provide adequate information to locate them and make them contactable, despite the fact that their messages asked celebrities or companies to contact them. One commenter on the Maury Povich thread, asking for assistance with publicizing what appeared to be a legal issue, did not even include her name. Nonetheless, hyperindexicality was the more salient feature of most stranger comments in distinguishing them from readers and bloggers.

Compare the charts of strangers' email address domains with a representation of the email addresses provided by blog readers, those who aligned themselves with the blogger's take on the topic and indexicality of the post:

For starters, the N is of interest compared to the N in the stranger charts. About one-fifth of strangers and hijackers listed email addresses, while only one-fourteenth of blog readers did.

Further, I simply did not bother recording the domain of many of blog readers' addresses, because their more salient feature was how they were being used. Thirty-eight percent of them were clearly implicated in a joke blog readers were making:

Dear Gogol,
I am afraid that my soul-purchasing plan was poorly conceived. I am writing to ask if you would perhaps like to repurchase some of your dead peasants. I am willing to sell at half of the price I gave you.
(OK/Cancel Google thread; commenters on this thread sought to have Google accounts canceled)
George W. Bush <> said on 07/01/06 @ 07:01 AM:
Hi cancell my acct. its password mission_accomplished. thakns i dont have time im too busy running the country.
(Utterlyboring AOL thread)

Another 32% of blog readers' addresses were either altered so as to be unreadable by crawlers or were just nonsense characters. (Crawlers are bits of code like those used by search engines to find new websites, except that this code is tasked with searching the Internet for email addresses to send spam to.) Some of the alterations were fake email addresses like "" or "" (Overhaulin thread), which refer directly to this spammer practice; others made the addresses legible to humans, but not to bots which expect the traditional email format (for example, "" (Overhaulin thread) or "themetaverseATgmailDOTcom", the email provided to contact the blogger on the Barbie Dolls thread). All of these constitute acknowledgments on the part of blog readers that while the blog software might require them to enter an email address in order to post, they knew they need not enter their real addresses and bring a deluge of spam into their inboxes.
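Why the "AT ... DOT" spelling defeats a crawler but not a person can be shown with a small sketch. The pattern below is a deliberately naive stand-in for whatever a real address harvester uses (an assumption on my part), and the address is a hypothetical one written in the same style as those readers left.

```python
import re

# Deliberately naive pattern of the kind a simple harvester might use (assumed).
NAIVE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def harvest(text):
    """Return every substring that looks like a conventional email address."""
    return NAIVE_EMAIL.findall(text)

def deobfuscate(text):
    """Undo the AT/DOT convention the way a human reader would."""
    return text.replace("AT", "@").replace("DOT", ".")

obscured = "readerATexampleDOTorg"    # hypothetical address, reader-style spelling
print(harvest(obscured))               # the naive crawler finds nothing
print(harvest(deobfuscate(obscured)))  # a human-style reading recovers it
```

A more determined harvester could of course apply the same substitution, so the convention is better read as a signal of readers' awareness of spam crawlers than as a guarantee against them.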

The remaining 30% of blog readers' addresses could not be demonstrated to be jokes or spam filters. Ultimately, it cannot be determined what was the motivation of any of these commenters -- natives or strangers -- to leave these particular email addresses. For all we know, all of these addresses are for "junk" accounts which commenters know will fill up with spam, and they do not actually use these accounts for mail they actually want to read. However, considering that strangers tended in other ways to hyper-identify themselves and where they could be located, while natives tended to hide who they were, it seems more likely that at least the email addresses provided by strangers are, in fact, valid. While it is not conclusive, this breakdown of email providers could be an indicator of where they spent time and began their searches on the web, suggesting another line of research like boyd's. (2008)

One blogger at times extended email address protections to strangers. He shared the email addresses of strangers who had used them in email sent to him from the website (though they had apparently not given permission for him to do so) while also altering the addresses so spam crawlers would not recognize them; he would add text like "[email her at heatheryaple2003 (at ) -Josh]" to the end of a stranger's comment. This had the effect of making these addresses available to people who had relevant noncommercial interest in contacting these strangers, but not to code collecting addresses at random. Interestingly, the blogger, Josh of Communications from Elsewhere, took care to obscure the strangers' addresses he shared from spam crawlers on only one of his threads. This was a thread in which people were seeking more information about spiders. In other threads -- where strangers sought contact with celebrities or sought merchandise -- he did not take care to obscure their email addresses from spam as he divulged them.

This was not the only way in which bloggers obscured the information strangers provided, for strangers' protection. In the few cases where strangers provided credit card numbers or passwords (mostly on the UtterlyBoring AOL thread where strangers were trying to cancel accounts), bloggers took pains to obscure those, citing security concerns:

Editor's Note: It's bad enough you morons think I can cancel your AOL account -- but for cripes sake, don't put your Visa number in the comment form here (yes, this person actually did, and I edited it from the above).
(UtterlyBoring AOL thread)

In a few cases, bloggers also obscured the home addresses and phone numbers provided by strangers.1

1 In one case, I was indirectly responsible for the blogger taking an address down. On, the non-academic blog I was keeping to document and solicit additional occurrences of comment thread misunderstandings, I posted a comment from a nine-year-old girl who had commented on the Avril Lavigne thread. She had included not only her home address, but also her phone number and the name of the school she attended; I redacted all of these from my post. Shortly thereafter, the blogger noticed the inbound link from my blog and deleted that information from the girl's original comment as well.

URLs as indexical orienters

In addition to establishing who "you" and "I" are in a conversation, indexicals also help confirm between participants what "here" refers to. As established in the theoretical section on URLs as indexicals above, URLs and IP addresses serve to establish ownership and location on the Internet. Thus they can be used not only to answer the question "where are we?" on the Internet, but also to interpret who "you" and "I" might be.

The latter was a tactic preferred by natives: they often pointed out URLs to strangers as a means of understanding who might be writing the first turn (the blogger's first post, in natives' view). Bloggers also made use of addresses they collected from the blogs' back ends to figure out who commenters were. By contrast, strangers presented ample evidence that they did not know how to interpret URLs -- in fact, that they did not even know what the acronym "URL" referred to.

Bloggers had literacy resources not available to readers and strangers in making sense of a thread. One such tool was the ability to look at referrer logs, or records of how people made their way to the blog. Hosting their own blogs, having command-line access to their servers, or using advanced blogging plug-ins made this information available to bloggers; all of these take higher levels of technical skill than simply navigating around the web and posting comments. Access to this data is generally restricted, password-protected along with the rest of the backend of a blog. However, readers sometimes knew that such information existed, and would ask bloggers to share it: "what kind of referers are you seeing?" (Maury thread)

Some bloggers used their referral logs to explain and make hypotheses about the presence of strangers on their sites, triangulating with search engine results. This use of referral logs turned up errors in strangers' spelling and search strategies. It also yielded more demographic data about their search practices, including which search engines they were using, domains they hailed from (perhaps indicating which ISP they were using), etc.

On checking my referrer logs, I see how Marie-Michelle ended up at my site. Using the My Way search engine to search for oprah E-mail adress <> (sic) brings up this page as the second result. It appears that the key is the misspelling of the word "address", which was also misspelled in Rachel's query from July. With the correct spelling of the word "address" my site disappears from view.
Regardless of how she found her way here, I emailed Marie-Michelle yesterday and offered her the same advice I did Rachel back in July, i.e. a pointer to the contact page on Oprah's site. I hope she gets the help she's looking for.
Posted by: John <;id=995> at November 2, 2003 08:59 PM
(Oprah thread)
Josh <> says:
According to the httpd logs, Don Foster came to this page from <> What that means is that he typed "www.efax.cancel" into the location bar of Internet Explorer, then when it couldn't find that site he typed it again into the search form on the page that IE gives you when it can't find the site you told it to go to.
(Elsewhere Cancel Efax thread)

The latter analysis indicates that the stranger in question lacks a grasp of how to construct a URL correctly, part of the powerful literacy practices around addresses in which bloggers and readers engage.
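The kind of referrer analysis these bloggers describe can be illustrated with a minimal sketch. It assumes the search engine passes the query in a parameter named "q", which is a common convention but not something the source confirms for My Way or any other engine; the referrer URL below is a made-up placeholder, since the real ones are not preserved here.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms(referrer, param="q"):
    """Return the search phrase carried in a referrer URL, or None.

    `param` is an assumption: many search engines use "q", but the
    parameter name varies from engine to engine.
    """
    query = parse_qs(urlsplit(referrer).query)
    values = query.get(param)
    return values[0] if values else None

# Hypothetical referrer built around the misspelled query quoted above.
referrer = "http://search.example.com/results?q=oprah+E-mail+adress"
print(search_terms(referrer))
```

Read this way, a referrer log entry exposes the stranger's exact wording, misspellings included, which is what allowed the bloggers to reconstruct how a visitor had arrived.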

In addition, some bloggers used comments they received through their blogs to glean additional information about strangers' search practices. Josh, the blogger at Communications from Elsewhere, spent more time doing this than other bloggers.

Josh <> says:
Well, here's an interesting one:
Author : (IP: ,
E-mail : http.//
take my efax off
It just came in and was waiting for me to approve it, but I'm not going to without commentary. Here is an AOL user who has come to this page from an AOL search <>, retyped the URL I provided for how to cancel your efax service in to the email field of my comments form (incorrectly retyped, at that), and hit "Say It". I am deeply confused. How is it that this person was able to read well enough to find the URL I pointed to, but not well enough to understand that the form he was filling out was to submit a comment on the current site, not to go to a new one? I'm really, really baffled. Are people honestly this stupid?
(Elsewhere Cancel Efax thread)

The blogger here combines an analysis of his referrer logs (from which he apparently takes the AOL search URL) with data gathered by the blogging software (the stranger's IP address). Using this information, he correctly identifies the stranger as an AOL user, an epithet which I discuss elsewhere in this paper.

According to the httpd logs, Don Foster came to this page from <> What that means is that he typed "www.efax.cancel" into the location bar of Internet Explorer, then when it couldn't find that site he typed it again into the search form on the page that IE gives you when it can't find the site you told it to go to.
Here is someone who really doesn't get it.
Listen up, you primitive screwheads: YOU CAN NOT CANCEL YOUR EFAX SERVICE HERE.
(Elsewhere Cancel Efax thread)

Josh here notes the stranger's repeated malformation of a URL ("www.efax.cancel"), indiscriminate use of text fields such as the location bar and search form fields, and use of an error page.

Josh also made use of IP addresses to identify strangers, in absence of other identifying information:

Anonymous says:
cancel my account
[No name or email address on this one, but it came from <>. -Josh]
(Elsewhere Cancel Efax thread)

In this instance, the blogger links to a website which a reader might use in order to connect the IP address to an actual geographic address, name, or company name. Only bloggers exhibited any awareness of sites like ARIN and InterNIC which provide such "whois" services.

Again, the IP information is not always available to commenters other than the blogger; it is often recorded by the blog software but not published along with the comment. This is information which is useful to reading comments, but access to it is password-privileged. Blog readers did not have access to referral information, but some of them were aware it was there; one reader on the Maury thread asked Ryan, the blogger, to share what information about strangers could be gleaned from the referral logs.

Beyond using URLs and IP addresses to identify pages and users, natives wrote URLs in order to trick strangers, exercising their Internet skills for nefarious purposes. Some used them to insult strangers:

Hello everyone.
It's me, Maury. I'm sorry that I have been so slow to respond, but I've been busy producing my show and everything. Please forgive me, and go to my web page at <> if you wish to contact me. I promise I will respond to all of your requests, as soon as possible. I am only here to help my fellow man (and woman).
Maury Povich.
PS. If for some reason my website has been defaced, please go to this website instead :- <>
Posted by: Maury Povich <> on May 21, 2004 1:53 AM

This comment, while hoping to motivate strangers through the authority "Maury Povich," the celebrity sought by strangers on this thread, directs strangers to a tinyurl site which redirects to -- a contentless parked domain with an insulting URL. The reader posting this joke follows it up with a link to an explanation of the acronym TWAJS -- "That Was A Joke, Son." A similar comment in the same thread invokes Maury's name and tries to instruct strangers that his real website is, a popular site which includes an image of Bart Simpson writing "I will use Google before asking dumb questions" and the headline "All Smart People Use Google; It Appears That You Are Not One Of Them."

Other natives made use of URLs to pull stranger traffic elsewhere, apparently to prolong the joke:

You all might want to try Maury's weblog. <> He posts comments about his life and has a bunch of places for comments it's like SO AWESOME OMG.
Posted by: Paul <> on February 20, 2004 9:42 AM

This decoy blog did garner a number of comments (over 100 each on at least two threads) in its early days from strangers who were seeking many of the same things sought on the original Laze Maury thread. Whether strangers found the link through the Maury thread or through search engines is unclear. The two most commented-on threads on the decoy blog appear between the two dates on which these readers posted the drunkmenworkhere URL to the Laze Maury thread. However, without more information from referral logs the effectiveness of this redirect cannot really be made clear.

Of course, URLs were not the only form of address natives used to fool strangers; they used street addresses as well. In this case, the redirect was aimed not only at strangers, but intended to aggravate others as well:

Google User Experience Staff <> wrote: 19 Oct 2007 <>
Dear friends,
We at Google are sorry that you have had a bad experience with our services, so we'd like to get your feedback on how we can prevent this from happening in the future. Unfortunately, the malfunction you are currently experiencing makes it unable for us to receive that correspondence electronically. Please send all inquiries to:
1600 Pennsylvania Avenue NW
Washington, DC 20500
Thank you and have a great day.
Google User Experience Staff
(OK/Cancel Google thread)

-- the assumption here being that a stranger too ignorant to use a URL to orient themselves to where they were would also be unable to orient themselves by street address, in this case the address of the White House.

The bloggers on UtterlyBoring and other sites where this occurred sometimes took steps to prevent the exploitation of strangers in these cases; they deleted the contact information of readers who posted the malicious redirects, and noted that they had done so, as in this note from the blogger who received the comment from Phil Euphrates above:

Editor's Note: I've removed the email addresse from this post. Anybody who thinks they can email somebody their account information without getting scammed isn't very bright. And consider the intellect of the people posting on this forum, I don't need to see them get screwed even more. This guy posted from a Verizon IP address, not an AOL address, so it's getting filtered. My guess is email address was either of a scammer, or somebody said scammer wanted to have a lot of random annoying messages to.
(UtterlyBoring Cancel AOL thread)

Note that this blogger again uses the blogger-exclusive ability to look at IP address information in order to make an authoritative statement about the post's authenticity, as well as his ability to edit comments.
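A rough sketch of the sort of check the blogger describes, deciding which provider an IP address belongs to, might look like the following. The ranges are placeholders drawn from the address blocks reserved for documentation, not the real allocations of Verizon, AOL, or anyone else, and the provider names are hypothetical; an actual lookup would consult a whois service.

```python
import ipaddress

# Placeholder ranges from the blocks reserved for documentation (RFC 5737).
# They do not describe any real provider's allocations.
PROVIDER_RANGES = {
    "provider-a": ipaddress.ip_network("192.0.2.0/24"),
    "provider-b": ipaddress.ip_network("198.51.100.0/24"),
}

def provider_for(ip):
    """Return the name of the first listed range containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for name, network in PROVIDER_RANGES.items():
        if addr in network:
            return name
    return None

print(provider_for("198.51.100.7"))
```

The point of the sketch is only that this inference depends on data (the commenter's IP address) which the blog software records but does not publish to ordinary readers.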

So natives made a range of uses of URLs and IP addresses: for understanding, for indexical orientation, to solve problems, to mislead, and even to tell jokes. Strangers, by contrast, routinely misused URLs in one specific place: the URL field which many blogs provide for commenters to fill in.

I say "misused" because this is not just a matter of what is seen as erroneous by bloggers and readers. Strangers' use of the URL field can be classified as erroneous because it elicits unexpected, unintended, "broken" behavior from software, servers, or other aspects of the Internet. These actions produce error messages from the Internet itself; they do not jibe with what software developers have planned for their software to do. These errors indicate strangers' confusion about the functions of interface elements, and they may make it hard for any other reader to provide the reciprocal response which so many strangers are looking for.

Most blogging software generates a form at the bottom of an individual blog post which will post comments to that blog post's comment thread. The following is the current comment form on; it is pretty much standard, and is likely the same one which most strangers used to post to the Maury Povich thread:

Image:Comment form example.jpg

The field marked "URL:" is the one which frequently seems to be misunderstood by strangers. The blogging software (in this case, Movable Type) expects the commenter to enter something roughly in the form "" (or other generic-top-level-domain suffix); most blogging software can also handle strings which begin "http://" or have other predictable prefixes. Blog software is designed to turn anything entered into this field into a viable URL which can be followed to a website, by adding the prefix "http://" and sometimes the suffix "/". The link generated by this machine writing is included in a commenter's comment. Strangers' entries into this field produced the following URL strings, among others, with their comments (threads they were posted to in parentheses):

  1. http://hey/ (Movie: Holes)
  2. http://I%20donno%20wat%20this%20is%21 (Movie: Holes)
  3. http://cancel%20e-fax%20service (Cancelling eFax Service)
  4. http://google/ (Google Answers HCI Program)
  5. http://lancome/ (Google Answers HCI Program)
  6. (Google Answers HCI Program)
  7. (Google Answers HCI Program)
  8. http://yahoo/ (Google Answers HCI Program)
  9. http://microsoftinternetexplorer/ (Google Answers HCI Program)
  10. http://maury%20povich (Maury's Blooper)
  11. http://maury/ (Maury's Blooper)
  12. http://I%20need%20your%20help%20%21%21%21%21%21%21%21%21%21%21%21 (Maury's Blooper)
  13. http://St.%20KITT%27S (Maury's Blooper)
  14. http://MelissaSweetMilaforsale (How To Sell A Wedding Dress)
  15. http://dont%20know (Spiders! Ack!)
  16. http://communitionsfromelsawhere/ (Spiders! Ack!)
  17. http:///????????????????????????????????? (Can we talk about Avril Lavigne for a minute?)
  18. (Google Answers HCI Program)
  19. http://metoo,fromethiopia/ (World Youth Congress 2008 - Need Help)
  20. http://sad/ (WWE Highlights)
  21. http://454/ (WWE Highlights)
  22. http://wwehighlights/ (WWE Highlights)
  23. http://kiolkjnhjhg/ (WWE Highlights)
  24. http://unitedkingdom/ (WWE Highlights)
  25. http://____________/ (Maury's Blooper)
  26. http://abid/ (Maury's Blooper)
  27. http://bigpond/ (Maury's Blooper)
  28. http://houston,tx/ (Maury's Blooper)
  29. http://none/ (Maury's Blooper)
  30. http:///???url??? (Harry Potter)
  31. http://sorry%20no%20email (Who Is Josh Server?)

Examples 2, 15, 17, 23, 25, and 30 give a direct indication that some strangers are confused about what the URL field is, or what it will do when filled out. This is echoed in the comment field of a stranger on the Cancel Google Gadgetopia thread, who appears not to have filled out the URL field on her comment:

i don't know what URL is Just delete me from your files, so I can get back to Yahoo..

Other errors suggest different understandings of what the URL field is for. Some examples (3, 12, 14) suggest strangers are interpreting the field as a form of subject heading. Consider the messages to which 3 and 14 are attached:

yvonne <http://cancel%20e-fax%20service> says:
I did not order e-fax service.[...] I do not want or need this service. 
Melissa Sweet "Mila" for sale. Size 4 (I normally wear a 5/7). Ivory. Please email me at <>, if interested. Thank You

Note that %20 is the percent-encoded representation of a blank space, %21 of an exclamation mark, and %27 of an apostrophe; this is a translation performed by the blogging software to make the text entered by commenters acceptable for a URL string. Literal spaces cannot be used in URLs, and the software evidently encodes some punctuation marks as well.
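The translation can be reproduced with a short sketch. The function field_to_url is a hypothetical stand-in, not Movable Type's actual code: it simply prepends "http://" and percent-encodes the entry with Python's standard library, which happens to yield the same strings as several of the examples listed above.

```python
from urllib.parse import quote, unquote

def field_to_url(text):
    """Hypothetical stand-in for the blog software's URL-field handling."""
    # safe="" asks quote() to encode every character outside the
    # unreserved set (letters, digits, "-", ".", "_", "~").
    return "http://" + quote(text, safe="")

print(field_to_url("I donno wat this is!"))  # http://I%20donno%20wat%20this%20is%21
print(unquote("St.%20KITT%27S"))             # St. KITT'S
```

Decoding with unquote recovers what the stranger originally typed, which is how the entries in the list can be read back as greetings, subject lines, or place names.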

Others (4, 5, 8, 9) also have to do with the subject of the message, but seem to be attempts on strangers' parts to identify aspects of specific problems they are having browsing the web:

Dorothy P Newsome <http://google/> wrote: 26 Jan 2007
Please remove Google from my computer
Jean Fitzgerald <http://lancome/> wrote: 26 Jan 2007 <>
Google is stopping me from shopping at my oLancome site with your pop up blocker.!! Would you quit it!!!
Joan Swinburn <http://yahoo/> wrote: 24 May 2007 <>
somebody mucked up my computer. Please take it off. It is stopping me logging on to my yahoo page. I hateit. I am 81 and cannot cope with it!!!
brenda parker <http://microsoftinternetexplorer/> wrote: 24 Jun 2007
i am very unhappy with Google being on my page of the Microsoft Internet Explorer. gOOGLE SUDDENLY SHOWED UP, i did not order it, nor did I want Google.
Please remove it. You are Phishing.
Brenda J. Parker

Example 1 suggests the URL field is being interpreted as a place to enter a greeting. Two (10, 11) suggest commenters who believe the URL field is the equivalent of the "To:" field on an email message, while example 26 displays the name the commenter signs his post with. Others (6, 7, 18, 31, others not listed here) suggest commenters think it is an email address field (puzzling, considering there is another email address field labelled as such right above the URL field); in the first three, they have entered data in the form of an email address, from which the blog software has removed the "@" symbol. In examples 8 and 27, commenters have entered the names of Internet service providers, possibly the ones they use. And in examples 13, 19, and 28, commenters have entered what appears to be their geographic location in the URL field:

Hi Maury, my name is Davin Francis and I live in St. Kitt's. [...]
Posted by: Davin Francis <http://St.%20KITT%27S> on October 2, 2003 9:59 PM

Finally, examples 16 and 22 include some element (a misspelling of the blog title, and the post title, respectively) of the blog where they are posting a comment.
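Why these strings fail as addresses can be shown with a simple form check. This is a deliberately simplified sketch: the rule used here (at least two dot-separated labels made of letters, digits, and hyphens) is an assumption standing in for the fuller hostname rules, and passing the check says nothing about whether a domain is actually registered.

```python
import re
from urllib.parse import urlsplit

# Simplified label rule; real hostname syntax has further restrictions.
LABEL = re.compile(r"^[a-z0-9-]+$")

def has_domain_form(url):
    """Return True if the URL's host at least looks like name.suffix."""
    host = urlsplit(url).hostname  # lowercased by urlsplit, or None
    if not host:
        return False
    labels = host.split(".")
    return len(labels) >= 2 and all(LABEL.match(label) for label in labels)

for url in ["http://hey/", "http://St.%20KITT%27S", "http:///???url???",
            "http://houston,tx/", "http://example.com/"]:
    print(url, has_domain_form(url))
```

Of the strings tried here, only the placeholder http://example.com/ passes; the strangers' entries either lack a domain suffix, contain characters a hostname cannot hold, or have no host at all.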

There are also a few marginal cases, in which commenters have linked to pages which load fine. One stranger to the Maury's Blooper thread posted the URL One might guess is a page belonging to Maury Povich, and the commenter was simply trying to direct other commenters there. However, the URL loads a "parked domain" page:

Image:Maury parked domain.jpg

There is no real content here; what is being displayed is essentially an advertisement to buy this domain name from As of yet, nobody associated with Maury Povich's production owns this page. So while this commenter got the form of a URL correct, one wonders if s/he had actually looked at the page before posting it and why s/he decided this was the correct text string to enter in the URL field.

A few strangers actually did enter viable URLs. Some of these even entered URLs which are more relevant to their questions than the pages they were commenting on. For example, a few strangers in the "New Barbie Dolls to Interact With Virtual World?" thread entered into their URL fields, and a few strangers seeking to cancel AOL accounts entered into the field. It is unclear whether they also used these URLs to seek answers to their questions.

Similar errors appeared to have been committed in the email field on comment forms. One stranger on the Overhaulin thread began his email address "www.", indicating a reverse of the confusion about URLs described above.

The major exception to strangers' tendency to misuse and malform URLs was in hijacked threads. In these threads, hijackers extensively used the URL field to point each other towards other useful resources, and included supportive URLs in the bodies of their comments. These were largely links to products (wedding dresses sold on eBay, vendors of ride-on bouncing balls) or additional information (where to learn about speech impediments, how to get cast in a Harry Potter movie). In the case of eBay and other sales, these links were sometimes to pages controlled by the strangers commenting, but more often they were to someone else's site. Thus correct URL use patterns among strangers differed from those of natives. Natives were more likely to link to their own pages, using the link to index themselves; strangers were more likely to link to a product page, cementing the indexicality of the subjects they were writing about.

Misuse of the URL field was a characteristic of stranger comments picked up on and parodied by blog readers:

Maury, I want you to find something for me. It is the website for a man named Ryan MacMichael. Do you know where it is? I can't find it, Maury, and I know this is your website, Maury, because there's a picture of you, Maury, at the top of this page. Maury, my 75 free hours are almost up. Thank you, Betty.
Posted by: Betty <http://internetting/> on August 26, 2003 8:21 AM

Between natives and strangers, then, the use of numbers and addresses seems to be a major shibboleth. Natives proactively used URLs and IP addresses to direct traffic, find website owners, and even locate Internet users in time and space. These uses coincide with the goals of Internet regulatory bodies, like the W3C and ICANN, for numbers and addresses. By contrast, strangers, when asked to provide an address, responded by providing other sorts of indexical information: topic sentences, email addresses, names, geographic addresses, etc. This information, unfortunately, constitutes a turn which the blog software -- the next turn-taker in the sequence, as they post their comment -- cannot make sense of. Like Norma adding ketchup to her M&Ms, the blog software tries; but the URLs it produces are ultimately not indexical in the eyes of ICANN: they will pass the reader on to error pages or parked domains.

Natives describe strangers

While they were less forthcoming about themselves, natives spent a great deal of energy describing who they thought strangers were. This involved distancing strangers from the comfortable, well-educated, tech-savvy people natives believed themselves to be. While many of these descriptions ventured into stereotype, some natives did correctly identify that a majority of strangers were female. At times, natives' observations about strangers bordered on gendered attacks.

Othering strangers

Natives expended most of their energy in these threads on defining strangers as "other" than themselves. For the most part, their descriptions did not really line up with the information strangers provided about themselves; but of course, from an anthropological perspective, these fictional profiles of the Other are interesting in and of themselves.

Were we to take natives at their word in their descriptions of strangers, we would be obliged to believe that strangers were AOL-using, porn-seeking, illiterate, mentally retarded Republican American women from a range of low-income areas (the ghetto, trailer parks, rural areas, etc.). Comparison with the demographic information gleaned from strangers' comments yields two very different depictions of who these people are; whether or not strangers represent their demographic information authentically on the Internet, what they say about themselves is generally not similar to what is said by natives.

Building momentum during the eight-year presidency of George W. Bush, these comment threads saw partisan commentary from natives. A handful of them implicated strangers' political beliefs in their decision-making skills:

Wow!! It's really true that the average I.Q. in the U.S. is 100>just alittle over retarded. No wonder you retardicans vote for bush.
(UtterlyBoring AOL thread)

These implications also included parodies:

Hi cancell my acct. its password mission_accomplished. thakns i dont have time im too busy running the country.
--George W.
(Utterlyboring AOL thread)

The words "liberal," "left-wing," and "democrat" were never used to describe strangers. By contrast, the word "liberal" appears once when an apparent stranger is telling off Google and asking to be cancelled, and once when one native is accusing another of being a bleeding-heart for his sympathy towards strangers:

I was touched by Waitak's comments. There is clearly deep sociological meaning in these messages, and we should all open our hearts to these people. I will be joining Waitak to seek a federal grant for funding to identify these people, so we can then give them funding to sit on their asses [...]
Posted by: Bob Liberal on May 27, 2004 12:09 PM
(Maury thread)

Interestingly, in addition to identifying strangers as Republican, some readers took time on these threads to link this kind of behavior to Americans as a whole. The bulk of these comments also appeared on the very long Maury and Overhaulin threads, but a few appeared elsewhere.

Many of these commenters appeared to be writing from outside the US:

LoL...american company shifting jobs to india.
Its easy to cancel AOL but u americans are really duffer.
(Utterlyboring AOL thread)
Being English, it warms my heart to see so many prime examples of American stupidity. Thank you for the laughter and the smiles.
(Overhaulin thread)

Some, however, appeared to be US citizens:

This whole thing makes you realize, it's too late. America is done. Give up, the race is over and we lost.
(Maury thread)
I don't know if I should be scared to be living in a country with so many morons or be laughing my ass off at the irony of it all. I'm leaning towards the the scared scenerio.
(Maury thread)

A few individuals took time out for "othering" strangers' stupidity by race. One commenter on the Maury thread wrote "the web creator of this site told u that this is just a fan site and u continue posting your white trash stories.. reading your stories makes me proud to be chicano."

There were hints of white racism and white supremacy in some readers' comments, as in this one sympathizing with the blogger who wrote the Maury Povich post:

Ryan, don't ever go to church, dude... they'll either crucify you or expect you to heal their mongoloid inbred mixed-race children or... both.

Another commenter posted a poem implicating "Ebonics" (the popularized name of Black Vernacular English) to the Maury Povich thread, without additional explanation:

Ebonetically, Phonetically a language is born
A perversity of English, it's rules ripped an torn
A dialect not just for those who's skin is black
But for those who are poor,unintelligent,and smoke da crack

(A search for phrases from this poem indicates that it may have at one point appeared on the open-contribution as part of a definition of Ebonics.)

Even more than making racial comments, however, plenty of readers on the Maury and Overhaulin threads spent their time "othering" strangers as poor:

And what we have here is an awesome example of why white trash should not be allowed to use the internet. Learn to read, you ignorant hicks!
Dear Murry,
I am tired of livin in the getto. Please buy me a six bedroom 4 bath house an a mercedes. I want to live in an gated community to. I don't think this is to much to ast because I have never knowed my father and I am entitled to someting.

The word "trailer" shows up a number of times in parody comments as the commenter's supposed location:

I live in a trailer and sometimes I can pick up Jesus on my bee-hive hair-doo. I asked someone about this at the bait and tackle store in my back yard and they told me to talk to you. You could talk to my bee-hive hair-doo and get some good answers for those poor waifs on your show. Get real close to my bee-hive hair-doo. Push your face right up in it and scream for God. The Lord will help you. And you are such an upstanding guy you will be the Lord's instrument.
(Maury thread)

Other natives described strangers as unemployed or homeless. The fact that these characterizations were limited to the Maury and Overhaulin threads suggests that the content of the shows themselves and the audiences they attracted were formative in natives' conceptions of strangers.

Natives described strangers as lazy. These comments ranged from hyperbolic parodies of their ability to shift for themselves on a basic level:

Murry, I hungry send me a sammich.
Plz hury eye'll starve in a week or so.
(Maury thread)

to blunt critiques of their quests for self-actualization:

Maury can't do this work for you. You have to bust your ass day in, day out, often with no luck.
No kidding! Listen to her! There is no shortcut to talent.
"Hey Maury! I never got to university because I can't do math or read basic instructions but I'd really really really really like to be a brain surgeon so could you get me, like, a degree in medecine plz? Preferably without any work or ability on my part?"

to guesses about their mindset in writing to the blogger:

Hi, I would like you to cancel my eFax service. I know that you have nothing to do with efax, but I am extremely lazy, and thought that since you seem to have had success, that you could go ahead and handle mine as well.
(Cancel Efax thread)

to evaluations of their Internet use style:

She should pick up the yellow pages and look up exterminators. If you want to chew me a new ass, feel free. I'd love to hear your justification for why I should be polite to people who ask me to do work on their behalf for nothing.
(blogger, Bees thread)

The latter is interesting in light of the many studies in the field of library science which indicate that on the whole, most people seek interpersonal advice before going to print sources. (CITE) This reality apparently does not jibe with the blogger's understanding of how the Internet and other resources should be used.

Beyond "stupid," "dumb," and "lazy," natives often used terms like "idiot," "retard," or "moron" which have special meanings of mental disability in clinical psychology. These pejoratives were joined with a number of comments which suggested strangers were inbred.

Further, some natives suggested that the defects they perceived in strangers should not be passed on to future generations; these suggestions included the "remove your ovaries" and "parents related by blood" asides in comments quoted elsewhere in this paper.

Dear Overhaulin',
I am a complete idiot. I post comments before reading that this is a personal blog. I have a car, and have proven that I am a danger to society because my attention span is obviously zero. Please fix my car, then fix me, because I do not want the human race tainted with my faulty genes.
Sincerely yours,
Joe & Jane Americana
/Darwin Award Honorees/
(Overhaulin thread)
I will be joining Waitak to seek a federal grant for funding to identify these people, so we can then give them funding to sit on their asses, watch Maury Povich, and eat macaroni and cheese sandwiches all day. And reproduce. The gene pool is a teriffic thing to waste.
(Maury thread)

Bloggers and their readers did not always attribute strangers' behavior to their inborn characteristics, however. They often chalked these threads up to strangers' illiteracy. While this was sometimes attributed to a misunderstanding of URLs, just as often natives commented on a perceived lack of traditional print literacy, or problems with the simple decoding of written words:

I'm really concerned that people are so willing to post such private details on an obviously public website. But hey, I'm also concerned that everyone seems to believe that Maury reads this stuff. Whatever happened to basic literacy?
(Maury thread)

All-capsed comments in particular came in for derision, as they often do among advanced Internet users. (On Encyclopedia Dramatica, caps lock is sarcastically described as "cruise control for cool.")

Finally, some characterizations of strangers by natives tended toward the obscene. In parodies, strangers were sometimes depicted as seekers of Internet p0rnography:

oh god, please help me get my google back.
I wrote earlier and you must have helped me cancel it, but now I can't find any of teh pretty naked ladies
(OK-Cancel Google thread)
(Overhaulin thread)

Other parodies graphically depicted strangers as naively seeking solutions to sexual problems, or exhibiting fet!shes which led them to comment:

Hey Maury, I'm just wondering if maybe you could have more guests come out barefoot. No reason, or anything.......just have them come out barefoot. Especially the ladies. I don't care what anybody else says, Maury, seeing bare feet doesn't excite me sexually, I just suffer from priapism sometimes. Yeah, that bare feets idea is a damn good idea. Please show more bare feet, Maury, please! Bare Feet!!
Posted by: Pervo McSickerton on November 20, 2007 2:06 AM

I will now address in greater detail two more themes in natives' othering of strangers: first, their characterization of strangers as "AOL users," which appears to have been inaccurate; and second, their accurate observation that strangers were overwhelmingly female.

1 In what is perhaps a bit of online magical thinking of my own, I've spelled particular keywords in this section in 1337, so as not to attract the attention of unwanted spambots while this paper is still online.

Strangers as "AOL users"

One aspect of natives' description of strangers drew on a very old archetype of Internet culture: that of the "AOL user," a gormless newbie assumed -- by dint of his association with a specific Internet geography -- to be unskilled at using the Internet. As was alluded to earlier in discussions of strangers' indexing of themselves by email address, this was not wholly accurate; little more than 15% of strangers, tops, provided AOL email addresses when signing their blog posts. However, the construction of "AOL users" is an interesting study of the invention of a digital divide and of historical memory on the Internet, so I will chronicle this stereotype below.

Bloggers and their readers routinely comment that strangers must be users of America Online's internet service, the implication being that their technology skills (or simply intelligence, or even genetic makeup) are inferior:

Josh <> says:
I'm beginning to think that AOL coats their free CDs with some sort of toxin. When you put the CD into the drive, the centrifugal force causes the toxin to be released, making you stupid.
(Cancel EFax thread)
I want to point out the last person posted 50 times.
Posted by: Ryan <> on October 30, 2003 12:59 PM
Must be an AOL problem.
Posted by: Paul on October 31, 2003 2:19 PM
(Maury thread)
Here is an AOL user who has come to this page from an AOL search <>, retyped the URL I provided for how to cancel your efax service in to the email field of my comments form (incorrectly retyped, at that), and hit "Say It". I am deeply confused.
(Cancel EFax thread)
PPS OVERHAULIN!!!!!!!!!!!!!!!!!!!!!!!
Posted by: borken_angel37
(Overhaulin' thread, emphasis mine)

Sometimes the reference to AOL is indirect, with a blog reader's joke making use of an email address or IM handle in the AOL domain:

dear maury,
that brain transplant didn't work. help. help.
your fan,
Posted by: mauryfan on December 17, 2003 12:03 PM
(Maury thread)
I want to be on Maury to give my co-webmaster a makeover. His name is Ryan and he dresses like it's 1982. MAURY PLEASE HELP!!! My IM is aol_lover203. thx by
(Maury thread)

And in one case, the reference to AOL is oblique, with a parody comment mentioning the company's "free hours" offers to lure in new customers and thereby suggesting AOL users' lack of commitment to Internet participation:

Maury, I want you to find something for me. It is the website for a man named Ryan MacMichael. Do you know where it is? I can't find it, Maury, and I know this is your website, Maury, because there's a picture of you, Maury, at the top of this page. Maury, my 75 free hours are almost up. Thank you, Betty.
Posted by: Betty <http://internetting/> on August 26, 2003 8:21 AM
(Maury thread)

One blog reader insists that there is a pattern which justifies this ridicule -- though numerical analysis does not bear this out:

[...]if you look at the e-mail addreses of these people, half of them end with "".
Posted by: merlinicorpus
(Overhaulin thread)

There were, of course, two threads which the blogger began with an anecdote about canceling an AOL account. While the Jonathan Coulton AOL thread did not attract too much criticism of AOL users for their own sake (perhaps because users of MySpace and Hi5 also asked for cancellation of their accounts, broadening the potential targets for natives), the thread on UtterlyBoring included vitriolic equations of AOL use -- prior even to mistakenly commenting on the blog -- with stupidity:

Honestly, you can see the difference between people that use AOL and the ones that don't . By posting a 'please cancel my AOL' on this site simply shows their level of ignorance, which in turn goes to show that they didn't do their homework before deciding to lay their CC numbers down to AOL.

There were a few commenters on this thread who claimed to be AOL employees and disputed complaints about how hard it was to cancel an AOL account. Their comments drew an entirely new firestorm of criticism of the service and its employees, of a kind unique in the corpus.

A bit of history can give this construct more context: Internet adoption prior to the advent of the World Wide Web tended to be tied to university computer use -- so much so that the month of September (when college freshmen arrived on campus, received computer accounts, and began to participate in newsgroups) was known as a trying time for veteran Internet users obliged to acculturate these hapless newcomers (newbies, or n00bs) into their established etiquette. (CITE GROSSMAN)

In the Jargon File -- an Internet lexicon maintained since 1975, most recently by open-source software proponent Eric S. Raymond -- there is an entry for "September That Never Ended":

All time since September 1993.... [when] AOL users became able to post to Usenet, nearly overwhelming the old-timers' capacity to acculturate them; to those who nostalgically recall the period before, this triggered an inexorable decline in the quality of discussions on newsgroups. Syn. eternal September. See also AOL!.

Following that AOL! link:

AOL! [Usenet] Common synonym for "Me, too!" alluding to the legendary propensity of America Online users to utter contentless "Me, too!" postings. The number of exclamation points following varies from zero to five or so. The pseudo-HTML
<AOL>Me, too!</AOL>
is also frequently seen.

Since at least 1993, then, denizens of AOL have been recognized by elite computer users as mannerless boors set apart from reasonable Internet users by their writing habits. One might think that as AOL's influence as an Internet service provider has waned, this characterization of its users would go away; it has not. Blog reader merlinicorpus does suggest these memories persist among skilled users:

I swear, this is what AOL did to the internet. Even if the WAS the Overhaulin' website, do these retards actually beilve [sic] that getting on the show would have something to do with filling out a comments page?
/me rolls his eyes.
Posted by: merlinicorpus

Like boyd's research project on MySpace and Facebook, [2] this suggests a divide patterned by the use of particular web portals. This pattern is much older, however, and more closely tied to early popular adoption of the Internet itself (with all the attendant gender, ethnicity, and educational status biases that entails).

Of course, no clear majority of strangers on gumbaby threads left AOL email addresses with their posts, as described in the analysis of email addresses in the section on demographics. In fact, they were slightly more likely to leave an address from Yahoo, Hotmail/MSN, or their local ISP. Thus it appears the "AOL users" construction is mostly invented by bloggers and blog readers.
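The comparison of email providers above amounts to a simple tally of domains. A minimal sketch of such a tally follows; it is an illustration only, not the procedure actually used for this study. The function name `domain_shares` and the sample addresses are hypothetical, and the sample is not data from the corpus.

```python
from collections import Counter


def domain_shares(addresses):
    """Return each email domain's share of the addresses that include a domain."""
    domains = []
    for address in addresses:
        # Skip entries with no "@" (e.g. a URL retyped into the email field).
        if "@" not in address:
            continue
        domain = address.rsplit("@", 1)[1].strip().lower()
        if domain:
            domains.append(domain)
    counts = Counter(domains)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: n / total for domain, n in counts.items()}


# Hypothetical sample, invented for illustration; not drawn from the corpus.
sample = [
    "a@aol.com", "b@yahoo.com", "c@hotmail.com", "d@yahoo.com",
    "e@example.net", "f@AOL.com", "g@hotmail.com", "h@yahoo.com",
    "i@msn.com", "j@example.org", "not-an-address",
]
shares = domain_shares(sample)
```

Lowercasing the domain before counting keeps "AOL.com" and "aol.com" from being tallied separately, and entries without a domain are left out of the denominator, so each share is a proportion of the strangers who actually supplied an address.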

There were times, however, when strangers did express a loyalty to AOL. In addition to those strangers who openly defended the service (mentioned earlier), one hijacker on the Hoppity Hop thread solicited email addresses from other hijackers so he could send them pictures -- but he prefaced this request with the stipulation "If you are a member of AOL." There was a point in the 90s when AOL, CompuServe, and other early email providers were basically "walled gardens" which did not allow subscribers to email other systems; it is unclear whether this hijacker still believed this to be the case.

A few strangers defending AOL on the UtterlyBoring thread presented their understanding of the history of the Internet and various services' merits:

I love AOL,i dont know why its hard for people to understand that there is a big difference between a parent company and a child one.AOL started the internet world,any new thing that comes in the internet world it comes to AOL first.Dont u see the advertisement telling u about the safety features that AOL provides,see im concerned about my safety and security,now the ball is in your court.Think like an adult,dont be just concerned about the money.

The latter statements struck deep at natives' understanding of Internet hierarchies, and one responded:

AOL started the Internet world?? LMAO! How old are you, like 12? AOL made access to the Internet easier for the dumbass masses, I will give you that.

Strangers are female

Another set of observations by natives, however, happens to have been correct: they accurately identified that a majority of strangers represented as female, a trend confirmed by the demographic analysis reported earlier in this paper. In some cases, their observation was left to stand on its own; in others, it became the basis of gendered attacks against strangers.

Natives directly and obliquely noted the trend that many commenters represented as female. This happened most on the Overhaulin' thread, where strangers were disproportionately female and the standard request for assistance with a male relative's car was so common as to appear to be written from a template (as evidenced in the word tree about the phrase "my husband has a," in a section later in this paper):

Wow, this is totally nuts. So many wives, girlfriends, mothers and even grandmothers making their case, right here in the comments.
2. Most of the postings were from women. Women who actually knew a fair bit about the cars, and wanted their men to be happy. Good Men who Provide for their Family, Men who don't get to Have Fun. Damn, it's good to see women appreciating that, and trying to get their men some testosterone-related happiness. Course if they did get the cars redone, they'd probably drop a couple hundred on baskets and dried flowers at the local craft shops, but that's besides the point. Bravo, ladies- you clearly care about your men. Who says the American family is dead? Not I. As long as some women is yellin' "Ma husband needs hiyuuus '49 studebaker feyuuxed" from an Arkansas web cafe, there's hope for moral rectitude, kindness, and culture in this country.
are women really this dumb?

A reader and blogger on the Tom Welling Is Gay thread also noted that the commenters tended to be female:

One question, are these so called responders straight women? Maybe they are jealous that a hottie like Tom goes for MN's top queer Andy!!!
Maybe they need to go out and have some fun, obviously their boiz aren't giving them any, haha.
Posted by: Mike | November 20, 2004 1:58 PM
[...]Yeah, their all straight women, I'm assuming. Poor things.
Posted by: Andy <> | November 20, 2004 2:45 PM
That's funny! They sound like teeny-boppers who read Tiger Beat and have lives consumed by pop culture - VERY funny!
Posted by: SparklesMpls <> | November 21, 2004 8:14 AM

And natives on that thread and two of the Johnny Depp threads described strangers as "fangirls":

Grant, care to write an entry on what happened to the intelligent conversation regarding the movie industry once your blog got linked to some obvious fangirl squee site?
(Johnny Depp thread)

The second of these referred to a pejorative phrase William Shatner once directed at Star Trek fans:

I don't know why anyone would think that (a) Orlando Bloom would be reading this review and (b) after reading these comments, would be crazy enough to share his email with you wackos. As William Shatner said on Saturday Night Live, "Get a life!"

Other allusions to the gender imbalance tended to appear in parodies centered on women's reproductive capabilities. Particularly on the Maury Povich thread -- doubtless in part because a common segment on Maury's show is "Who's your babydaddy?", and many strangers in the thread sought Maury's help deciding a paternity issue -- many parodies from natives were written as if by low-income women seeking Maury's help:

I would like to know weither or not you could buy my babies some food. theyre tired of devouring the dead rats outside of our house. And maybe if we could have all 14 of the possible baby daddy's tested. i know one of them from this town is the reall daddy.

Pejorative descriptions like these suggest that another Maury thread native's exhortation to "remove your ovaries" be understood not just as a eugenic suggestion, but as a specifically gendered eugenic suggestion.

Strangers as a community

Grounded analysis found strangers making moves which defined them as communities, at least within the topic of each particular thread. Across celebrity-related threads, strangers used religious language; they also described themselves as communities of fans of shows and of celebrities in music, TV, and movies. To help maintain their own interpretation of the context of the threads as their own spaces, they corrected each other, delivered retorts to natives, and presented evidence in their own support.

Religious language

Strangers were prone to using language with a religious cast. By far the majority of religiously-toned comments appeared on the Maury Povich thread, with the second greatest number appearing on the Overhaulin' thread; but some appeared in other threads where strangers addressed celebrities and television shows.

Some religiously-toned comments were simply narratives about strangers' lives. Strangers referred to their loved ones or deceased people as "angels" and good works as "blessings;" they spoke of thanking God for the things in their lives which were going well. A hijacker on the Lisp thread described how she was called by God to preach despite her lisp, and exhorted others in the thread to trust Jesus to fix their problems. A few strangers indicated they were writing on behalf of ministries in which they were involved. God also featured in a few strangers' reasoning in the solutions to riddles.

However, in an overwhelming number of cases religious language was written in an attempt to elicit assistance. Just as many strangers invoked their loyal watching of a particular show in an attempt to improve their chances of a response, they also tended to thank a celebrity with a "God bless you;" this sign-off appeared elsewhere as well, including the Spiders thread, the Lisp thread, and one of the threads about international essay competitions. (Use of the name of the Lord by natives, meanwhile, was largely in vain.)

Commenters expressed a belief that the celebrities they were writing to could "make miracles come true;" that they were "angels;" they described the acts of celebrities and television shows as "blessings;" and they offered their own prayers and blessings in return for the help they hoped to receive:

(Maury thread)
(Overhaulin thread)
hey muary,i would just like to tell u that u are the best thing and an agel(sic) for sure that god made in this world .
(Maury thread)
Pls I am a student of 18 yrs old . I am looking for help. I was touched watching some of her shows and so I decided to write her cos God revealed to me that I would be blessed through her.Pls try and send me her email address on the email I stated. Thanks while I wait for your favourable actions.
(Oprah thread)

At times, strangers asked celebrities for things which were already within the strangers' own power to accomplish:

Dear Maury i am writting this to ask for your help. I need a way to let four (sic) people in my life know how important they are to me. [...] I met Joe about three and half months ago at the school we both attend too. Joe asked me a month ago to be his girlfriend, and i said yes. The day i came back from the hospital i found Joe in the hallway and explained to him what the doctor told me. He looked hurt and explained to me that we would work through these minor things. But because the school made me leave that same day, i missed Joes birthday. He has been there when ever i had a problem. He even helped me when i couln't make it to the doctors alone. All i am asking is to help me let these people know just how much they mean to me. If you could help me i would appriciate it so much.

The assumption seems to be that Maury's resources for thanking people -- perhaps with material goods or a television appearance -- are greater than the speaker's. However, this kind of plea could also be read as a request for personal strength, the kind of address often made to a higher power. A commenter on the Oprah thread made a more direct request for words of support rather than material assistance, with a sense that verbal support would be forthcoming as it had been in the past:

please if you have any wors that may help me thru this time of term oil please,i admire your strenghth that got you thru the tough times and you dont know how much you have help me .i talk to you every day and i feel that you are listening with open ears tahnks so much for just being the wonderful person you are and allowing me ti open my heart for the first time in my adult life i cant tell you how wonderful it feels to let some one in

This comment also describes a sense of two-way communication between the stranger and the celebrity, who she feels is "listening" to her daily. These requests themselves take on some of the form of prayers.

The religious feel of these comments is reflected in natives' reaction to the mass of them (rather than to specific religious-toned comments):

Maury is not Jesus, he can not heal you nor will he give you tens of thousands of dollars because you feel you deserve it.

It might be a stretch to say that all use of religious language on these pages indicates that strangers view their online writings as akin to prayer; that cannot be assumed. However, one stranger's comment refers to a relationship between online writing and prayer which suggests a connection to a genre of webpage not otherwise countenanced in this study:

(Bill Gates thread; emphases mine)

A search for "online prayer" in Google turns up thousands of sites where Internet users can request that others pray for them; the highest-ranking of these at the time of writing is affiliated with CBN, Pat Robertson, and the 700 Club. A Google search for the name of at least one other stranger in the corpus yields a user account on Clicking through a few of these links yields interfaces which are very similar to the comment boxes on a blog; whether they generally post comments in a visible location on the Internet is not clear. Submitting a comment to CBN yields a message saying that staff pray over each request received.

It does appear that this commenter in particular mistook a blog for a prayer request site. Considering the very revelatory nature of other comments in the corpus, one wonders whether many strangers assumed their messages were being treated as prayer requests and would thus remain private.

Television viewing as context

Strangers often referred to television as the context of their comments: the material of the shows themselves; their devotion as viewers and the community that made them part of; their hopes; and their mutual knowledge of how to contact celebrities.

It appeared from the similarity of contemporaneous demands on the Maury Povich thread that strangers were responding to particular episodes as they aired. At times, there were small clusters of requests about finding out if houses were haunted, finding lost fathers, and other themes. This made "stranger time" different from "native time:" while natives always carped on about the same topics (why can't you spell! why can't you read!), strangers conveyed a sense of being on the same page and responding to the same concerns as each other.

Strangers commenting on celebrity threads affirmed a feeling that they were all part of a like-minded fan community -- long a mode of discourse noted by pop culture scholars -- and identified with the celebrity in question. (CITE) For example, on the Avril Lavigne thread, one commenter identified a commonality of feelings about the singer:

girl, you're exactly like me...i'm like avril too , i have the same feelings, i love the same song, nobody's home it makes me cry i love it so much, when i am sad i just listen to it to put my feelings in an order... it's unbelievebel...

Interestingly, this comment could not be clearly identified as a "stranger" comment contrary to the blogger's purposes for his thread. The commenter does not address Avril Lavigne directly (the only tactic the blogger identifies as incorrect in the comment thread); the commenter may well be responding to the original post, where the blogger mentions the song "Nobody's Home." However, if this commenter is responding to the blogger, (s)he 1) has misidentified the blogger as female, and 2) does not cleave to the blogger's detached, critical tone about the singer (the blogger writes "I would never go so far as to actually like purchase an album of hers[...]. If the videos for Happy Ending or Nobody's Home came on TV, I probably wouldn't change the channel.")

Thus, this stranger's comment helps establish this comment thread as a place for fans, counter to the original intent of the blogger. Practices like these set strangers apart from readers and bloggers, and helped mark these threads as fan territory rather than bloggers' territory.

Visualizing words related to television using ManyEyes gave the impression that strangers were using their devotion to watching particular television shows as evidence of the worth of their contributions to the conversation. In particular, a tremendous number of strangers used some permutation of the phrase "I watch your show" in writing to celebrities. This was most common on the Maury Povich and Overhaulin' threads, as these threads were very long and related to celebs who currently had TV shows.

Image:'watch' strangers' literacy words for final draft.png

The phrase was used to indicate devotion to the show; it was often "I watch your show every day/week" or "I watch it and love it." This was sometimes accompanied by a declaration of a personal relationship with the celeb:

I watch your shows every day and I think that knowbody likes you like I do
(Maury thread)
Thank you very much maury you are the greatest person and my friend and I watch your show every day
(Maury thread)
I love all your movies, Bill. You always make me laugh, and that's important in these somewhat gloomy times. I just wanted to thank you for all those laughs. I also want to thank you for giving me my first glimpse at the late great Hunter S. Thompson.
(Bill Murray thread)
you dont know how much you have help me .i talk to you every day and i feel that you are listening with open ears tahnks so much for just being the wonderful person you are and allowing me ti open my heart for the first time in my adult life i cant tell you how wonderful it feels to let some one in
(Oprah thread)

Interestingly, the frequency of strangers' use of the word "watch" is about on par with the number of times natives used the word "read." While natives talked about reading and writing as a means of identifying where one was on the Internet, strangers spent their time addressing celebrities, expressing their hope that the celebrities would read what they had written:

Image:Read strangers.png

... or thanking them for reading, or sometimes doubting whether the celebrity is reading. The proportions are skewed: the word "read" is used about the same number of times by strangers and by natives, but recall that strangers left about twice as many comments.
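The skew in these proportions can be made concrete by normalizing each group's raw word count by the number of comments that group left. The sketch below is a worked illustration under stated assumptions: the function name `rate_per_100_comments` is hypothetical, and the counts are invented placeholders chosen only to match the "same raw count, about twice as many comments" pattern described above. They are not the actual figures from the corpus.

```python
def rate_per_100_comments(word_count, comment_count):
    """Occurrences of a word per 100 comments left by a group."""
    if comment_count <= 0:
        raise ValueError("comment_count must be positive")
    return 100.0 * word_count / comment_count


# Placeholder figures (assumptions, not corpus data): both groups use "read"
# the same number of times, but strangers left twice as many comments.
natives_rate = rate_per_100_comments(word_count=120, comment_count=1000)
strangers_rate = rate_per_100_comments(word_count=120, comment_count=2000)

# Under these assumptions natives use the word at twice the strangers' rate.
ratio = natives_rate / strangers_rate
```

Read this way, equal raw counts imply that natives used "read" roughly twice as often per comment as strangers did.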

If nothing else, the breakdown in how reading is discussed highlights the different expectations about audience which natives and strangers have. However, it could also be taken as indicative of the different media preferences of strangers and natives, and how these play into their preferred means of making indexical sense.

Consider also strangers' consistent use of the words "fan" and "show," also describing the TV properties they are generally writing about:

Image:Fan strangers.png

Image:Show strangers.png

The use of these words was highly patterned, with many strangers using the exact same phrases. (The phrase "your biggest fan" is so common in these threads that it can be used to reliably find more examples of such misunderstandings in blogs which were not suggested for this study.) Natives, meanwhile, used the words "fan" and "watch" fewer than ten times each; the word "show" appears 49 times, and is used in arguments that these comment threads will never reach the celebrities to whom strangers think they are speaking. I return later to the things strangers said about watching TV shows.

Fans expressed their faith in the inherent goodness of the celebrities they wrote to:

Chip you do great things for peaple keep up the good work.
I Love watching your show.
(Overhaulin thread)
Again Thank you for your time in this matter. You And Your Wife Are GREAT PEOPLE.
(Maury thread)

When fans felt they had not been attentive enough, they offered an apology:

dear maury.i am one of your fans but unfortunately we can your show is not broadcasted here even with the sattelite dishe.
(Maury thread)

Requests were presented deferentially; fans made it clear their continued viewing was not conditional on receiving a response to their plea:

I know that the best birthday gift he could get would to be to get his car overhauled. [...] We will continue watching and enjoying your show. Thank you!
(Overhaulin thread)

Another phrase which shows dramatic results when visualized in ManyEyes is "My husband" -- a common phrase in the Overhaulin' thread. Women wrote in in great numbers suggesting their husbands' cars for the show (as was accurately observed by a handful of natives). This phrase frequently began strangers' comments in this thread. The overwhelming regularity with which this indexicality shows up -- "I'm writing to you for my husband, and not myself" -- suggests the strangers' positioning of themselves: they are not the audience, but they are demonstrating their relationship to the audience as a means of vouching for the demands they are about to make.

Image:My husband has.png

Strangers who thought they were writing to TV shows or celebrities made their case for their worthiness of TV show attention by drawing on their stories' appropriateness for television. Strangers often offered story elements about their lives which they thought would make for better TV:

Dear Maury,
Finding the right show idea is always hard. Especially when you have pepole who have too big of breast or have a killer body and want to show the world. Well I have one that isn't like the others and hope that I can reach you this way. I tried when I was 19 and mailed to a letter but I never got a response from you and I kind of figured that you must be busy. My name is Dominique Evans, I'm now 23 and am trying to live life on my own. Times now a days are hard; I should know, I was born with a learing disability. [...] Anyway I was writing this letter because I need to ask you a favor. My mom has always been there for me through thick and thin, now I want to be there for her.
(Maury thread)
OMG Maury! I love the show about cheating spouses! Get this! My husband of 5 years has been cheating on me since the day after our marriage and even while I was in the hospital giving birth to our *now 4 year old* twin boys. I ask him why and he lies and says he has not and that he is just late at work, out with the boys or driving around town, car trouble and many many lame excuses. I saw the hidden camera idea *great* so my mother, sister and a few friends have been taking turns following him around town the last 2 weeks and guess what!? We caught him red handed with his hands on her cookie jar. [...] PLEASE EMAIL me when you can air this! Thank you, P.S I can pay for tickets and flight fair but will need help paying for a babysitter to stay with the kids! This is good and well worth the time of airing!
(Maury thread)
Please help me! My hubby bought a 1986 lincoln towncar in dec and has a dream of makeing it a hot rod but he doesn't have the time with 4 kids and two jobs. I would love to see what you folks can do to the car and make it his dream car. Oh by the way the reason he decided to buy it was that is identical to the car his parents had when he started driveing. thanks a bunch please help me, jennifer
(Overhaulin' thread, emphasis mine)
Yikes! Everyone wants you to overhaul their car. Can't imagine why you'd choose my son's except that it would certainly put ALL of your best skills to the test. My husband and I bought our son his dream car for $1500. An 85 red Mazda RX7 -hadn't been driven for about 8 or 9 years. We foolishly thought we could get it running . . . Well about another $1200 later, it's still sitting on our driveway.
(Overhaulin' thread)

Beyond focusing on the appropriateness of television stories, some strangers (mostly younger females on movie-related threads) used dedication to acting as a marker of one's worth to be in a film or meet a celebrity:

hello people why are you so obsessed with johny depp i know he is really really hot and is a great actor with fantastic skills but instead of trying to get his info why dont you take acting lessons and work your way up to an actress i always say you need to give to get just like you need to give your time to get johny depp info one day not trying to be mean :)
johny if you read this i want you to know i am inspired by you and when i see one of your movies it brings i smile to my face i am taking acting lessons i want to become an actress even if its hard to become one and if i do not suceed i want to be an under cover agent
(Engel-Cox Pirates of the Caribbean thread)
The last thing I want to say is that I find it VERY ANNOYING (and kinda stupid actually), that people want to be in the HP films because they love HP. That's all very well but what about the rest of us who love acting? Whjat about the people who are working hard to make their way into the acting world? You should want to be in it because of the acting, not because you think Dan/Rupert/Tom etc. are hot and you want to meet them...In that case go to a premiere or something and meet them and let the rest of us try and act!!
PS. This doesn't mean that I don't think that you can love HP, Dan/Rupert/Tom etc AND acting. Because that's entirely possible (probable). But it's just annoying that that's the ONLY reason you want to be in them!!
(Harry Potter thread)

Sometimes, strangers tried to participate in other ways: for example, offering advice or assistance to people they had seen on the show (or even on other shows!). This has some precedent on talk shows, in which members of the studio audience are invited to stand up and respond to the panelists.

Dear Maurey,
Friday Jan 26,2007
I was watching Good Morning America and Diane Sawyer is doing a sigment on children in Camden New Jersey, One little boy touched my heart his name is Ivan. [...] Ivan said on the show this morning he would like to be superman so he could get his family a house. Is there anything you can do for this family, He and his brother and mother sleep on a chair. [...]
(Maury thread)
Maury, Im 45 yrs old and have watched you all my life, Your shows ever since i can remember have made me laugh [...] Also a quick note to Cathy on the 4/7/06 show. Your beautiful and forget about Evan. Theres someone out there who will love and respect you the way you deserve to be in this life.
(Maury thread)

Strangers sometimes demonstrated a sense that they were part of an audience, and addressed that audience in lieu of celebrities or shows.

I'm not here to ask Maury anything....I just want to know if anyone ever noticed that 2 guests of Maury's were also on Jerry Springer(yes I watch it when I can't sleep) [...] well on Maury the sister's are "close" on Jerry they HATE each was it a mistake that the producers didn't catch? or are Tabitha and Taleena just a couple of sceemers?
I just wanted to know if anyone else caught on to it.
(Maury thread)

Sometimes strangers assumed the purpose of a site was to serve the audience they imagined -- usually a very different audience than the one imagined by natives. Imagined audiences on celebrity-and-show-themed threads were very similar to Internet fan audiences described by popular culture scholars. (CITE) Sometimes they imagined a very specific reading of a celebrity as sex object:

I think everyone who thinks that Tom Welling is gay should take a long hard look at themselves... seems to me that you have some gay tendencies yourself... if you fancy the guy then thats not a problem, everyone on this site does but just cut the all should be ashamed of yourselves really.
(Tom Welling thread)
If you think everyone is so lame for coming here and discussing MK and A, then why do you keep coming back? None of these people give a shit if you like the discussion or not, your posts seem more pathetic than any of the other one's i've read, so why dont YOU get a life and quit coming here? You know just as well as I do that if MK and A pose for playboy you'll be first in line to get your copy so quit being a moron.
(Mary Kate and Ashley Olsen thread)

Threads about shopping not only imagined certain audiences, but actively began creating them, with the bloggers' half-spoken approval. On the How To Sell A Wedding Dress thread, for example, many hijacker comments began "Hey ladies!" or "Hey girls!" Participants on this thread as well as the Lisp thread expressed solidarity, happy they weren't alone in managing the emotional baggage of an unworn wedding dress or a lisp:

I'm so glad this blog is still in operation-
and that I've found so many teens like me who suffer from a lisp.
(Lisp thread)

Business transactions apparently took place as a result of the wedding dress thread; one hijacker effervesced that she had sold her dress: "I can't believe it, posting on this blog actually worked!" While some trade happened on that thread, the Hoppity Hop thread was unique in the sheer level of activity of the audience that developed there. In the course of the thread, enthusiasts of these ride-on ball toys traded price and product quality information; developed a relationship with an overseas vendor of the product, who had found them there; and worked to set up payment systems. When blogger Josh threatened to shut down the thread, the participants asked him if they could copy the content of the thread to another website -- "it's yours," said the hijacker asking for the move, while another hijacker said "Thanks for letting us hang out this long." Josh consented to the move and even offered to link to the new site (no doubt eager to get traffic headed elsewhere). Eventually, he shut down comments on the thread.

Finding and presenting evidence

To different groups of strangers, different kinds of evidence counted as adequate to defend their own take on indexicality, topic, or channel. Not all of this evidence is drawn from the Internet; it appears that, especially in the case of celebrities, evidence can be brought in from other domains known to be related to the celebrity in question, such as television or magazines. At times, however, some strangers attempt to claim evidence as simple as an email address as proof enough to make their case. "Finding and presenting evidence," in this section, sometimes also has to do with how strangers determine the truth of other media they come into contact with.

The Olsen Twins thread has a number of evidence claims. Strangers offer evidence that they have information about or contact with the Olsen twins, or refute others' claims based on the quality of their evidence. Interestingly, the uses of evidence break the strangers on this thread down into two groups: (mostly) female fans, who try to defend the stars' reputation, and male pr0n-seekers, who are trying to confirm that the stars will pose or have posed nud'e. Female fans drew on evidence from television, print publications, and from their own loyalty to the stars as fans:

Anonymous wrote:
Don't get your hopes up you pervs! Those girls will never pose for play'boy. I have been a devoted fan of theirs since I was two. I know the girls and I have watched what they do enough to know that it will never happen. They are the perfect role model for little girls, and they have their heads on straight. I am so tired of everyone saying they can be bought. Well you are all wrong. It will never happen.
mary-kateandashley no.1 fan wrote:
I so do not beleive that the Olsen twins are gonna pose nud'e. U can tell they're just not like that. I mean u neva see them wearing the most revealing clothes when they're at the beach, they wear a normal two-peice bathers ...

Male pr0n-seekers, by contrast, drew on Internet-related evidence and legal arguments (as well as television) to evaluate the possibility that they could find the media about the twins they sought:

mick wrote:
it is impossible for mary-kate and ashley to have signed a contract with hugh he'ffner or play'boy to appear in play'boy when they turn 18, due to the fact they would have had to be less than 18 to sign such a contract and the fact the contract would be non legally binding unless the party so mentioned is over 18.
MKLJJ wrote:
Adam wrote:
hey People. All of you that think Mary-Kate and Ashley Olsen will not pose.....God have mercy on you because you are very miss informed people.
I have talked to Mary-Kate and Ashley over Msn, and Yahoo! and they stated that they are going to pose.[...]

The latter claim was contested by a fan:

courtney wrote:
adam, you are very very naive if you think you are actually talking to the real mary kate and ashley. They would never waste their time trying to clarify the truth behind the rumors of Play'boy. Especially not to some people on line they don't even know. They are obviously much too busy multimillionaires to worry about that. Get REAL.

Evidence was mustered in other ways on other threads. In the Harry Potter thread, one stranger who offers advice about casting for a Harry Potter movie uses the nickname "warnerbros." This reference to a major movie studio appears calculated to make this comment stand out as authoritative among a thread full of comments by everyday people. The stranger mentioned earlier who signed her posts "Your Harry Potter Speaker" also employed self-identification tactics as a means of asserting authority.

Other strangers made claims of personal contact, referred to franchise-owned sites, or used the personas of celebrities to claim their information was correct. Many examples of this were available on the WWE thread, some mustered to contradict each other:

This is the Real WWE Diva Maria Kanellis!! And yes I am here to say that ALL OF THE EMAILS THAT THESE PEOPLE GAVE YOU ARE FAKE!!!!! And the WWE will have them Trashed!!! Here are the Real emails and I can prove it go to
(Elsewhere WWE thread)
yo all wrestling fans all them e-mnail addresses are fake ...non of them work i go to wrestling and have met superstars like tommy dreamer , sandman, undertaker, mick foley and many more so feel free to add me on msn
(Elsewhere WWE thread)

Once strangers established the dominance of their own preferred indexical relationship to the topic, they took what steps they could to reinforce that relationship. They were not able to delete or close comments, like bloggers were; however, like readers, they could criticize those who went off topic. A series of political comments, not germane to the topic, which were posted on the Mary Kate and Ashley Olsen thread yielded the following response:

why the hell did sumone bring up george bush? this is supposed 2 be about mary-kate and ashley!!

On the Hoppity Hop thread, an artist posting an essay about an unrelated piece of art (apparently for the sake of provoking commenters into new modes of reception) provoked the comment "What in the world does that have to do with hoppity hops?" from a stranger who commented repeatedly on the thread. When a second off-off-topic stranger commented about the artworks mentioned previously, though, it provoked no response.


It has been suggested to me that this study is likely to face consternation due to the wide range of fields it draws on and addresses; my colleague James Grimmelmann (whom I dare to call a colleague, even though his scholarship on the Internet comes from the faraway clime of the legal tradition) and my advisor, Herve Varenne (who has done his best to help me work as an ethnomethodologist, despite my recalcitrance) have expressed concern that readers may have difficulty placing the outcomes of this effort.

Doubtless. I am coming to view this as an occupational hazard of growing up into an academic department more defined by its subject matter (digital media) than by its methods. When one has been trained as a generalist, working in a single field goes against one's training, sacrificing the strength of one's breadth and perspective for a depth yet to be developed. All one can hope to do is apply methods from other fields as rigorously as possible, and hope the results will be of use to colleagues working on the same subject matter, who may themselves be drawing on a diverse set of methods and traditions.

In an attempt to work with rather than against the diverse findings of this study, I will thus divide my observations by the fields to which I believe they are relevant, with caveats regarding my knowledge of those fields.

I will first draw some observations for what I consider my "natal field," New Literacies. Because this is where I have spent the largest part of my time, in terms of attending conferences and reading literature, this is where I feel my observations will be strongest. I hope to use this study to suggest to the field a return to the focus of the likes of Brian Street, and directions for honing the strengths of the field as a whole, both as researchers and literacy practitioners.

I will then go on to pose questions related to mediation generally, and its ramifications for linguistics, cultural studies, HCI, and possibly even the semantic web. In these fields I can mostly only pose questions without connecting these to the larger bodies of literature, as I have done little more than work around their margins.

I only came to an understanding of conversation analysis, large-scale online conversations, search engines, and the semantic web in the course of this dissertation, and a familiarity with HCI as a professional field right beforehand. As a result, I am not confident that my questions have not already been raised elsewhere. I am unsure of the extent to which linguistics generally, and semantic search questions specifically, have addressed the disruption of context elements like indexicals, channel, and addresser/addressee relations in mediated communication. My observations here should therefore be viewed as seeking traction in these fields: mostly statements of my own findings, raising a few questions whose answers may already exist in a corpus with which I am as yet unfamiliar.

Similarly, while I feel I have a fix on the general shape of the fields of cultural and media studies, there are traditions of thinking within those fields with which I am not familiar. I can only note the findings from this study which I feel are of interest for those fields, and hope that at some point I will either be able to reconcile these observations with existing ones, or pose new questions to the fields if these questions have by some chance not yet been posed.

Power, the Internet, and New Literacies

This paper has attempted to understand social control in the way it is understood by the likes of Garfinkel and Latour: not as entirely top-down, not as entirely grassroots, but rather as a negotiation between actors in given situations, some of whom will always come to the table with more power, resources, or influence than others. The powerful literacy practices being negotiated here have to do with URLs, search engines, portals like AOL, and email addresses.

Sassen and Lessig have stated that organizations like ICANN and large Internet companies like search engines may exercise de-facto control over the supposedly unregulable Internet. And as it happens, grounded theory analysis of the data shows that natives -- blog owners and their readers -- indeed make reference to these institutions as a means of exerting their preeminence and control over their turf on the Internet. They use domain names and IP addresses -- granted by ICANN -- and URLs as evidence of ownership, genre, authority of information, paths that users have taken on the Internet, and even as demarcating the social "wrong side of the tracks." They are able to muster traces of search engine use, and discuss the "right" and "wrong" ways to use search engines. These assertions of authority are made daily on the Internet, in everyday interactions between users.

Natives also know how to make use of email addresses in a way which saves them time by protecting their inboxes from spam and phishing. And because bloggers have the unique capability to do so, they can sometimes also use these addresses to block who is able to comment on their webpages, or even find them on the Internet (by use of a robots.txt file).
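For readers unfamiliar with the robots.txt mechanism mentioned above, the following is a minimal sketch of how a well-behaved crawler reads such a file. It uses Python's standard urllib.robotparser module. The file contents, the /comments/ path, and the example.com addresses are hypothetical illustrations, not taken from any blog in this sample.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a blogger might publish to ask all
# crawlers to stay out of comment threads (illustrative only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/comments/123",
                "http://example.com/about.html"):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Note that robots.txt is a request rather than an enforcement mechanism: it keeps cooperative search engines from indexing a page, but it does not prevent a person who already has the URL from visiting it.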

If these natives are more likely to be coders, designers, and sysadmins than strangers are (and it seems, from the demographic information gleaned from their About pages, that they may be), they may also be involved in the less visible writing of the code of Internet applications themselves, possibly even blogging software and search engine algorithms. At the very least, like the writers of the Gadgetopia, OK/Cancel, and Lemonodor blogs, they are engaged in discussions within and about the industry that writes this code.

As it stands, strangers often do not make effective use of these powerful Internet literacy tools. They do not write URLs in a way that would be useful to other users, or even to computers; they do not read them in a way which clarifies whose writing they are reading. Many do not appear to use email addresses in a prophylactic way, indicating they may receive more spam or phishing offers, making their email harder to navigate. They also make use of older literacy practices, such as writing out geographical addresses, which may actually put them in harm's way. When they post street addresses, personal details, passwords, and credit card numbers, they reveal dangerous misunderstandings of the public nature of the Internet software they are using, which may compromise their safety and financial security.

Literacy is, of course, a two-way street. Just because natives often read URLs and search results in effective ways does not mean they are always sophisticated when writing blog posts, comments, and links. Most bloggers in this sample (including the author of this paper) did not welcome the intrusion of strangers onto their blogs, yet because of the ways they wrote their blog posts, they were complicit in inviting strangers there. Search engines read their links and headlines and acted as the final determinant in the question of who was right or wrong in these literacy struggles: Natives wrote semantically-charged links to threads, thereby inducing Google to raise those threads' ranking; they wrote links to blogs they liked, upping their PageRank too. And they (we!) wrote post titles, blog names, and sometimes even posts themselves rather poorly, if we genuinely did not wish strangers to show up on our websites and ask for naked pictures of Mary Kate and Ashley Olsen.

These findings suggest important new directions for New Literacies research and practice, and possibly library science as well, as that field certainly shares the same concerns and has begun to make connections with the work of Lankshear and Knobel, among others. To genuinely empower those learning to use the Internet, there are specific lessons in reading and writing which are vital to teach.

URL literacy is of critical importance. As has been noted (NEED CITES), this must go beyond the simple original guide that ".edu and .org sites are ok for research." Educators have been aware of this for some time; the canonical example of why this rubric is a bad idea is the scientist at one university who ran a Holocaust denial page in his .edu domain (CITE!).

What will be more important in URL literacy is introducing students to the WhoIs function, to the meanings of different top-level domain suffixes, and to other elements of a URL which can be read and understood. For example, "about.html" might contain more information about the website; the URL of a popup promising a free iPod can be read to see where the offer comes from; and a long URL string can be examined to see if a website might be inappropriately passing private information about a user in plain view, as a way into understanding website security. A basic explanation of ICANN, how domain names are resolved into IP addresses, and other elements of online epistemology related to authorship and ownership would also be advisable in the higher grades.
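The kind of URL reading described above can be sketched in a few lines, as one possible classroom illustration. The example uses Python's standard urllib.parse module; the function name read_url, the list of "sensitive" parameter names, and the example address are my own assumptions for the sake of illustration. Taking the last dot-separated label as the top-level domain is a simplification that ignores multi-part suffixes such as .co.uk.

```python
from urllib.parse import urlsplit, parse_qs

# Illustrative, hypothetical list of query parameter names that suggest
# private data is being passed in plain view.
SENSITIVE_KEYS = {"password", "passwd", "email", "ssn", "card", "cc"}


def read_url(url: str) -> dict:
    """Break a URL into the parts a reader can learn to interpret."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    query = parse_qs(parts.query)
    return {
        "scheme": parts.scheme,   # e.g. http vs. https
        "host": host,             # who is serving this page
        # Naive top-level domain: the last label of the host name.
        "tld": host.split(".")[-1] if host else "",
        "path": parts.path,       # e.g. /about.html
        # Query parameters whose names look like private information.
        "exposed": sorted(k for k in query if k.lower() in SENSITIVE_KEYS),
    }


if __name__ == "__main__":
    example = "http://shop.example.com/checkout.php?item=42&email=a%40b.com&card=1234"
    for key, value in read_url(example).items():
        print(f"{key}: {value}")
```

A learner reading the example address this way would notice that an email address and a card number travel in the visible part of the URL, over a connection that is not encrypted.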

Lightweight browser-based scaffolding solutions may also be available, or if not, might be produced. One existing tool for building URL comprehension is the FlagFox plugin for Firefox, which provides information about the domain in which each page's URL is registered as you browse.

Educators should also be aware that software which attempts to fill in, correct, or redirect for half-formed or malformed URLs may hurt learners' awareness of URLs, and redirect them to places where they did not wish to go.

In schools, this kind of education about URLs may mean loosening or abandoning "net nanny" software -- or allowing for Internet access where it has been blocked. With such software in place, it may prove difficult to demonstrate the ownership of websites which teachers find questionable.

It is worth noting that URLs and IP addresses have been around at least since the advent of the World Wide Web; unlike chat software, Twitter, online games, blogs, and other online software which has captured the fancy of New Literacies research, addresses are the building blocks of the Internet, and not likely to go away anytime soon. They require forms of writing and reading which are specific to interaction with computers, unlike forms of Internet patois which are purely social in nature. Further attention to the writing which runs our informational infrastructure, rather than skating along its surface, should be key.

This is not to say that educators should not be introducing students to different varieties of software, on- and offline. Strangers in this corpus clearly struggled to make sense of which software they were using at different times, and this had ramifications for their security. But it appears that, before teaching students to use blog software, email, and browsers, it may be critical to first explain to them the distinctions between different kinds of software, with attention paid to the basics of how the software interacts with their computer and the Internet.

New Literacies research should keep in mind that what we are looking at when we look at natives' online writing are the public literacy practices of the powerful. To genuinely empower online literacy learners, the field should continue to investigate these ways of reading and writing, along with their more private and complex ones, such as those employed when developing software, setting up servers, presenting projects to managers, and otherwise making the Internet run. Should the field continue to ignore these literacy practices in favor of celebrating the creative practices of casual users working with end-user software, it will do a disservice to young literacy learners by failing to provide them the most powerful (and lucrative) literacy skills we might offer them.

To engage meaningfully with these forms of literacy, it may be important to be in closer dialog with industry and academic departments of computer science. These institutions may not think of these as issues of "literacy," but they should be able to give a clearer picture of what skills they find lacking in graduates entering their programs. Departments with a focus on computer security should have particularly interesting contributions to make to our understanding of online literacy; research on search engine optimization, phishing, and spam could tell us a lot to help empower those learning to navigate online.

Various New Literacies scholars, including Lankshear and Knobel, Callow, and Kress, have stressed the growing importance of visual design to online literacy. This can be important in both the reading and the writing sense. The findings that this study shares with the human-computer interaction work of Jakob Nielsen specifically (and with, say, the blog A List Apart as well) suggest that literacy instruction could take cues directly from HCI research and established experts on web design. This research has identified online writing practices which are not conducive to online navigation -- such as poorly-written titles, under-developed About pages, and badly-labeled links -- and there is probably a secondary-school multimedia design curriculum waiting to be gleaned there. Both "native" and "stranger" students, those with proficiency and those without, could benefit from such a curriculum.

Beyond issues of technical skills, this study suggests a few more directions for the field of New Literacies as a whole. This study underscores that Leu's claims about deixis in New Literacies were a missed opportunity: while he was speaking about the word literacy, no attention was paid to how deixis, or indexicality, proves to be a problem for new readers trying to find their way around online, or how this might relate to problems of deixis faced by readers in any setting. This might prove to be a useful site for research for those looking to identify important areas of competence for young readers.

Continued attention to Sassen, Lessig, and other legal and political scholars of the Internet should also guide an ongoing attention to the privileged and degraded literacy practices which should be the bread and butter of New Literacies scholarship, as should studies of media ownership.

There may be concerns which extend beyond the reach of academic work and into the world of policy and regulation. The use and operation of search engines seems to fall into this category.

Library science and related fields have long tried to make the workings of search engines comprehensible to users (CITES); this is a focus to which New Literacies scholars should attend as well. Knowing the workings of search engines, both social-organizational and mathematical, helps readers understand the context of the pages they land on. Deciding that the first search engine result is first simply because it is "the best" divorces the searcher from the powerful reading and writing practices of the people who make Google, and the rest of the Internet, run. Users who do not understand the effect of their own linking, titling, and commenting behavior on the Internet, for that matter, are disempowered, unaware of the potential strength they could have in shaping the Internet themselves.

PageRank has been published, and is public, but the algorithm that includes it and runs Google is far more complicated, and it is not available for public scrutiny. It is proprietary. Ultimately, those of us who are not subject to Google's nondisclosure agreements do not know exactly why certain results are being turned up as better than others. Responsibility falls to users to take results with a grain of salt, understanding them as the product of a process we cannot see and remembering that they are certainly not God's unvarnished truth. And yet our basic satisficing behavior, which tends to result in not only picking the first result but also rarely clicking past the first page of results, adds inertia and solidity to the algorithm, further contributing to the winner-take-all power-law distribution.
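Because the basic idea of PageRank is public, its winner-take-all tendency can be shown with a toy example. The following is a simplified sketch of the published idea (repeatedly passing each page's rank along its outgoing links), and emphatically not the proprietary algorithm that runs Google. The page names, the link graph, and the iteration count are hypothetical; the damping value of 0.85 is the conventional one.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration.

    `links` maps each page name to the list of pages it links to.
    Returns a dict of page name -> rank; ranks sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page passes its rank, in equal parts, to the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[p] / n
                for q in pages:
                    new[q] += share
        rank = new
    return rank


if __name__ == "__main__":
    # Hypothetical graph: three blogs all link to one comment thread.
    graph = {"blogA": ["thread"], "blogB": ["thread"],
             "blogC": ["thread"], "thread": []}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```

In this toy graph the thread that three blogs link to ends up with by far the highest rank, which is the mechanism by which natives' own links pushed hijacked threads up the search results.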

Search engine developers argue that they must keep their algorithms private. There is, after all, an immense industry dedicated to search engine optimization, and the "black hat" (think "cowboys;" also think ads for V!agra) side of that industry strives to move the search engine ranking of their pages higher no matter the cost. Search engine developers (as well as email, blog, forum, social networking software, and other public Internet tool developers) are in a constant arms race with the black hat SEO industry, trying to consider the results that search engine users really wish to find rather than turning over the highest rank to the highest advertising bidder. (Greg Conti, CITE; that other guy, CITE) Should search engine algorithms become public, it would be far too easy to "game" them.

But there have been calls for making them public. Because the stakes of having good search rank are high, the associated power distribution may have an impact on politics and economics. (CITE that master's thesis?) In the future, when algorithms are also involved in non-search behavior, indexically associating personal information (say, medical information) with social networking sites such as Facebook, issues of personal privacy and rights to know how one's information will be used also come into play; Mitchell has thus called for "algorithmic literacy." (Mitchell, IPF09 at New School, CITE)

In the long term, it seems important to reckon, in a legal and regulatory sense, with how much the public may know about search engine algorithms. In the short term, basic steps towards better teaching of algorithmic literacy -- including an understanding of how to write optimal headlines, links, and posts -- are absolutely in order.

One final note on the longstanding digital divides of practice, embodied in the phrases "Eternal September" and "AOL!" and reflected in danah boyd's research on "white flight" and other elements of zoning on MySpace and Facebook; I call these out specifically because they are more social literacies than machine-regulated ones, and appear to be increasingly important. boyd's research is not the only place that online literacies have gotten attention as social shibboleths. Just recently, the Chicago Tribune and LifeHacker published blog pieces inquiring about hiring discrimination based on email address. Certainly this is not yet data indicating such discrimination exists, but investigating this possibility seems an important next step in understanding zoning or redlining online. From a practitioner's perspective, it may be useful to attend to the constantly changing perceptions of different domains. Some might even see fit to encourage students to "code switch" to more prestigious domains when presenting themselves for professional life.

Further reflecting on the digital divides suggested by this corpus, it may be cause for concern that strangers appear to be overwhelmingly female while bloggers and other natives were more likely to be male. This could, of course, be a function of the snowball sample; there are also issues of self-representation online to consider (though in reckoning with bias because of gender play online, one might wonder why commenters across threads, discussing different topics, would decide to represent as female when disagreeing with a blogger). However, with female developers taking issue with a number of tech-industry presentations over the past year -- Stallman's EMACS virgins joke, the CouchDB talk (may be NSFW), and the FlashBelt talk (may be NSFW) -- this issue takes on a feeling of relevance to issues of employment equity, and of course also education.

The gender imbalance calls for further investigation of differences in online literacies. More directed, less phenomenological, attention to differing awareness of URLs, understanding of software, and interpretation of search engine results by gender may be in order.

Control: An afterword

This study started out seeking the locus of control on the Internet, but of course seeking it as Latour or Garfinkel would, rather than as Marx or de Certeau would. It is worth noting that in this medium, total control was never truly possible through mere argumentation. Possibilities escaped control tactics; natives and strangers changed "sides," defended each other, defied criticism and warnings.

There were places where it was clear that strangers knew what the original intent of the blog was; they just needed the forum, and ultimately, their comments began to outweigh those of bloggers and readers. Such was the sentiment of one thirteen-year-old girl on the LISP thread:

i know this is about the lisp computer programming thingy, but the lisps have taken over so im commenting anyway.

After particularly long threads of bloggers and their readers heckling strangers, some natives began to express their irritation with the hecklers. One Overhaulin' thread commenter wrote,

Which is dumber: that a third of the comments on this page are from internet sheep who thought they were writing to a tv show, or that the other two thirds are from internet sheep who thought it was funny to make the same joke 54 people already made before them?

Another reader ultimately summed up most of the Maury thread with a conclusion which uncannily echoes the direction of this paper:

"Stupid and uneducated aren't the same, though. True 'nuf - lots of people don't know how to spell, lots of people don't "get" the Web, let alone the Internet, and lots of people get themselves into all SORTS of awful situations.
But to my mind, this is more about one culture (prospereous, educated, Internet-savvy) suddenly staring through a window into the soul of another (poor, mostly uneducated). It's hard to miss the insights that the responses give into BOTH of them. Lots of us have treated this thread as a sort of litmus test for intelligence. Maybe it's a litmus test for other things as well, like compassion and kindness.

This sentiment was praised by the blogger, who also complained that he was starting to get bored and irritated by the parodies.

A few natives eventually came to defend the strangers visiting a blog, and allowed them to continue their threads, leading to what I called "hijacked" threads:

Leave it alone. Everyone here seems to have created a cute little supportive community. It might not have been the original intent, but what resulted is cute and worthwhile... (Lisp thread)

The blogger who supported strangers most vocally was "Ampersand," the blogger whose post developed into a forum for selling wedding dresses. When one of his readers asked, "Jesus Christ. Are all these people for real?", Ampersand responded, "I assume so. I hope they're selling their dresses. (I'm actually VERY fond of this thread, just because I never in the world could have planned it!)." Even Josh from Communications from Elsewhere, who posted some of the angriest get-off-my-blog comments to strangers, allowed the Alton Brown and Hoppity Hop threads to continue on his blog without commenting negatively towards them for quite some time. While Josh was generally disgruntled about strangers, bloggers like Ampersand who went with the flow when their blogs were taken over generally expressed enjoyment at the serendipity the tides of the Internet had brought in.

Ampersand and other bloggers who oversaw hijacked threads did not close, password-protect, limit to specific domains, or robots.txt their comment threads, any of which would have disabled the medium's affordances for conversational participation. Short of such measures, there was no way of definitively controlling the topic of conversation. It is, after all, the Internet: a many-to-many medium.

Suchman, HCI, and the semantic web

Suchman, in the conclusion to the first edition of her book, emphasizes the importance of context and of the indexicality of language to negotiating working standards of meaning in face-to-face conversation. She reiterates that machines, due to their digital nature and limited interfaces, have a narrow perspective on what is interactive "input" from human beings.

Search engines and the Internet complicate the problems of context and indexicality in a number of ways, as I have demonstrated here. First, like Suchman's expert-system copier machines, search engines are incapable of interpreting the situation of the user beyond the limitations of their algorithms and other aspects of their code. Unlike Suchman's copiers, they have the strength of an ever-changing sea of human input to shape their interpretations, giving them much more semantic "understanding." But consider the search engine as an expert system for a moment. Its algorithms for choosing from its databases could be thought of as stand-ins for the "user model" incorporated in an expert system:

a preconceived representation of the user and her situation.[...] constructed in advance as the template against which the user's actual actions are mapped, compris[ing] propositions about the domain, the task, the typical user, and the like. (Suchman, p 179)

Major-portal search engine algorithms could be seen in this way as making propositions about the domain (some pages are of higher value than others, and the best indicators of this are word frequency, titles, inlinking, etc), the task, and typical user (interested in the average of general information on pages publicly accessible on the Internet, rather than specific genres or domains; agrees with people who are Internet "literate" enough to write hyperlinks and thereby contribute to PageRank).

As this study has suggested, such assumptions about the typical user -- even that there is one kind of "typical" user of a search engine -- are dubious. PageRank, in particular, assumes a typical user who successfully writes and reads hyperlinks on a daily basis. Again, I cite Jonathan Coulton's "cancel my account" search ranking as a demonstration of the way this winner-takes-all mechanic benefits those in the Silicon Valley orbit, but doesn't help outsiders trying to solve real problems like canceling accounts.
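The winner-takes-all mechanic can be illustrated with a toy version of the link-counting idea behind PageRank. This is a minimal sketch of the simplified textbook algorithm (power iteration with a damping factor), not a description of how any production search engine actually ranks pages. The page names, the link graph, and the damping value of 0.85 are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank along to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical five-page web: three link-writing blogs point at one
# well-connected post; nobody links to the page that actually explains
# how to cancel an account.
toy_web = {
    "well_linked_post": [],
    "tech_blog_a": ["well_linked_post"],
    "tech_blog_b": ["well_linked_post"],
    "tech_blog_c": ["well_linked_post", "tech_blog_a"],
    "account_help_page": [],
}

if __name__ == "__main__":
    for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```

In this sketch the page that link-writers point to rises to the top regardless of whether it answers the searcher's question, while the unlinked help page stays near the bottom: the ranking reflects the habits of people who write hyperlinks, not the needs of people who do not.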

Suchman notes that while many are "optimistic" about the success of expert systems, she instead sees them as based on cognitive ideas of "plans" which are essentially "generalized representation[s] of the situation of action," reduced to a set of inputs which a machine can understand -- "less a project of simulating human communication" than an adaptation to the limited abilities of machines. (Suchman pp 182-183)

Human plans, by necessity, are intentionally unspecific, in order to allow for flexible use in a range of situations. Suchman, like Garfinkel, describes plans as tools for reference in a given situation rather than instructions on how to proceed. Both note the impossibility of fully specifying the details of a given situation in a plan; such work would be infinite, counterproductive to the goal of flexibility, and simply tedious.

To human participants, language is important to planning coordinated, organized action in a given situation. "Our shared understanding of situations," Suchman writes,

is due in great measure to the efficiency of language[...] The significance of a linguistic expression on some actual occasion [...] lies in its relationship to circumstances that are presupposed or indicated by, but not actually captured in, the expression itself. Language takes its significance from the embedding world, in other words[...] (Suchman, p 77)

This is where search engines require attention distinct from Suchman's analysis of copiers, because they are situated on the Internet. The Internet is a medium, delivering communication between people far from each other. As a result, it introduces additional situational, indexical linguistic confusion. This brings me to my second point: mediation itself allows for conversation between people who do not share immediate linguistic context, disrupting the grounds for mutual understanding. Obviously this is not new, or only a characteristic of the Internet, so in my subsequent section about pop culture and media studies I will revisit this point.1

These two problems -- search engines' limited understanding of input, and mediation's disruption of context -- are at the heart of the contentious conversations discussed in this study. The former is unlikely to strike those in the fields of informatics or library science as a new observation. Developers and designers, meanwhile, are likely to throw up their hands in frustration at the lack of specific design recommendations this provides.

To assuage this frustration, I will briefly point out areas which this study suggests would be fruitful. Again, this is with the caveat that I am not intimately familiar with research on search, with the development of search engine algorithms for the "semantic web" or otherwise (for the same reasons most people aren't, which I addressed in the section on power and literacy), or with HCI traditions.

As mentioned in the section on literacies, features could be added to browsers to make it easier for users to interrogate and learn more about the ownership and construction of a URL -- perhaps in the URL bar, but perhaps also when mousing over links. Such features might also help users understand the context of the page they are navigating.2

Blog software developers and bloggers, in particular, could take a number of cues from the findings in this data, though much of it has already been said by the likes of Jakob Nielsen. Blog software developers, including those who generate templates, should avoid using the confusing acronym URL to label that comment field. Buttons should be appropriately labeled, and it would likely be a good idea to provide a message indicating where and how publicly the comment will be posted. Bloggers should remember to keep article and blog titles relevant and provide identifying information on their About pages (if they are not blogging anonymously).

While nailing down the context-specific indexicality of all words (should be read in the sense of "nailing jello to a tree") is a prodigious and possibly impossible task, this study points to one particular set of indexicals as a fruitful project for semantic analysis of websites. Any search engine which could have elicited from users whether they wanted to talk to Maury Povich or about Maury Povich, and directed them to sites where that kind of conversation was already happening, could have caused several hundred confused or angry comments to disappear from this corpus. If natives used About pages and URLs as means of analyzing who the words "you" or "I" should point to on a given website, could a search engine make use of those resources as well? This seems a more reasonable task to tackle than hunting down the meaning of every "he," "she," or "they" that appears on a webpage.

In the time spanning from the first comments in this corpus to the final days of writing this study, a number of changes have already been made to search engines which might have helped avert these kinds of misunderstandings had they been available at the time. Microsoft has begun to make changes by introducing the Bing search engine, which divides results up into prominent sub-searches for a user's query. Microsoft has also acquired PowerSet, a semantic web startup, which makes use of Wikipedia and of Freebase, a database-driven approach to sorting specific, standardized categories of information. Then, of course, there was also the launch of Wolfram Alpha, which takes a more mathematical approach to using structured data (and is not particularly helpful when it comes to Maury Povich, offering only his full name, occupation, and date and place of birth; it would offer more information if Povich were, say, tradeable on the stock exchange). Both Bing and Google have begun to highlight specific details from high-ranked sites along with search results; Maury Povich's phone number and "Write Us!" page now both appear in the first set of both engines' results (though they are not immediately apparent in the first link). Finally, because local context is critical to understanding, Google's recent announcement that it will increasingly personalize search results based on the cookies and other information associated with individuals' Google accounts is very interesting indeed.

It remains to be seen how any of these changes will end up affecting users' searches; ongoing analysis of new data like that in this corpus, along with more traditional user testing, will yield some information. On the level of results themselves, the effects of the changes may not be clear, or may be clouded by speculation in the search engine optimization community (again, because of the proprietary nature of search engine development).

1 The contextual chaos introduced by media could be seen, perhaps, in moral panics throughout the ages about print, telephones, radio, television, comic books, and film, which were identified as disrupting the zoning which kept "vulnerable" populations such as women, children, and churchgoers out of (con)texts they were not supposed to read.

2 Tip of the hat to Matthew "Glyph" Lefkowitz for sparking ideas in this direction.

Mediation's ramifications for linguistics and cultural studies1

The disruption of context evident in this study poses challenges to the participants; it also poses new questions for a few academic fields. It also suggests potential focuses for the holy grail of the semantic-web search engine.

One of the fields for which this study suggests possible new questions is linguistics. As I have demonstrated, long-form online conversations appear to work under a listener-selects-previous rule, rather than the speaker-selects-next mechanism assumed by Sacks, Schegloff, and Jefferson. It is not clear why this difference exists. It seems likely that it has something to do with the vast number of people who could be participants in an Internet conversation, but it could also have to do with the disruption of spatial, temporal, or contextual congruency which mediation enacts.

This raises interesting questions which might warrant further study: Is all conversation actually listener-selects-previous, and is it face-to-face conversation which is somehow different in requiring a response? If that is the case, what is the nature of the stigma or discomfort which causes the speaker-selects-next rule to override the listener-selects-previous rule in face-to-face conversation?

Finally, in online conversations, how long is the average period for turn-taking, and does this change depending on the forum? How do interface elements such as screen size influence which comments users choose to respond to? I am told these questions may already have been addressed in early large-scale conversation research on "holding the floor" in HCI.

Another field to which this study might contribute is cultural studies. Particularly, the comments made to celebrities may be of interest. Rather than merely celebrating the profusion of fan texts the Internet makes available, this study may shed some new light on how fans may think about their relationships with and ability to contact celebrities.

While it might seem that the Internet has brought about this phenomenon of misdirected writing, cultural studies presents some examples of older forms of writing which reveal similar confusion on the part of the writer. The cultural studies collection The Adoring Audience includes requests digested from fans' letters to celebrities, sent to a British newspaper, prior to 1984:

A 16-year-old boy requests Paul McCartney's address. He knows McCartney has a house in Kent and 'once caught a train towards that direction but ended up lost.' A housewife wants Michael Jackson's address immediately so she can 'fly to the States, find him and I would be a friend to him.' A born-again Christian wants the same: 'I believe very strongly that I can help him.' A boy complains he can't get near Adam Ant: 'I have forked out so much money for concerts all over Great Britain to get his name in my [autograph] book...' A girl wants the Sun to arrange for her to spend the day with Duran Duran. Another girl, who has been a David Essex fan for eight years, begs for 'just two minutes' with him to say: 'Hello, how are you?' (Fred and Judy Vermorel, 1992)

The similarity in tone and content between these requests and the comments in this corpus is so strong as to suggest the phenomenon studied here -- at least insofar as the content is concerned -- is not new. People have been making requests like these for years; the Internet just makes them more visible. In fact, one native on the Maury Povich thread claimed to have worked for tabloid newspapers like the National Enquirer, and had been asked to field email similar to these strangers' comments.

While cultural studies has done much to clarify and improve our view of fans' relationship to celebrities, the data here may suggest re-approaching fan writing. Some cultural studies writing has addressed the "use" of celebrities in the lives of fans from a purely psychological perspective. (CITE) When fans try, as they did here, to turn that relationship social -- by reaching out to celebrities, rather than simply fantasizing about them -- does it become a different "use"? Does it betray a belief that celebrities really do, or could, participate in fans' lives? That celebrities and their fans share the same context? And what is the difference between fans who do think they share context/interact with celebrities, and those who do not?

Continuing down another branch of this discussion: For how many years has this kind of writing gone on? It could be argued there is evidence that people have sought contact with those they see in images or read about for at least a century and a half. Wilkie Collins, whose Victorian "sensation" novel The Woman in White was published in 1860, received letters from men who wanted to know the identity of the woman on whom the fictional heroine was based. Many of them sent marriage proposals. (Collins, 1999) The tomb of Juliet Capulet in Verona, Italy, has received letters to the star-crossed heroine since the late 1930s, when the first film version of Romeo and Juliet was released; these letters are often addressed simply to "Juliet, Verona, Italy."2 (Accounts of the Juliet Letters pin responsibility on the movie, but one wonders if letters arrived before that.)

Consider, also, children's letters to Santa Claus, which are generally addressed "North Pole" but read or answered by local postal staff. While Santa's existence may or may not be known to writers, the Juliet letters present a clearer case, considering anyone who has finished Shakespeare's play ought to be aware that the heroine is really in no shape to be answering her mail, even if the stabbing only happened yesterday.

Juliet and Santa letters involve sending messages out into the wilds of the postal system with addresses which are fictional (the house in Verona known as Juliet's is only a best guess at where the girl assumed to be Shakespeare's model lived) as well as vague. This begins to suggest that the literacy practice these writers engage in might be a knowing engagement in perpetuating fiction.

Then, of course, considering the evidence that at least one stranger was engaging in "Internet prayer," it is worth considering that this may be another form of apostrophic address entirely.

It remains to be investigated how strangers perceive the real-world impact of their comments. Do they address themselves to stars on blogs as writers address themselves to the ostensibly-no-longer-able-to-read Juliet? Do they actually believe their message will reach their target somehow? Through what mechanism? How many see this kind of writing as having the mystical, unspecified impact of a prayer?

This begins to shade off into questions which transcend genres and media, suggesting a broader line of inquiry for any study of mediation and language: Given the historically contextual nature of human language, how is that facet of speech transformed when it can be carried across millions of miles, or stored for millions of years? How are we to make sense of messages that come to us from a place and from people we do not always know? Can we contextualize them as people always have -- making sense with our local resources? What institutions and cultural practices are necessary in order for us to understand at a distance? This question suggests to me that one more direction for this study would be to catch up with research on a relatively new field calling itself "telepistemology."

1 I really only developed this section in this fashion to heal the unhealable rift between cultural studies scholars and media studies scholars and linguists that I observed in the now-defunct Cultural Studies and Cognitive Science department at Hampshire as an undergrad :D see what I did there?

2 From "Shakespeare's Juliet is an agony uncle in Verona," Deutsche Presse-Agentur (DPA), at Khaleej Times Online, UAE, February 13, 2004. Tip of the hat to Elvis Costello and the Brodsky Quartet for bringing these letters to broader attention.


Appendix B: Comments which were not considered

Some notes about content which I am not including in the corpus:


A number of people, when I've described this project to them, have said, "Well, aren't you just talking about spam?" Really, I'm not. As someone who recently went through the painful process of rooting out 1.5 GIGS (not megs, GIGS) of comment spam from a Movable Type blog, I can tell you that there are very distinct stylistic, indexical, and content elements of comment spam which make it clear that gumbabies are something else entirely.

At the base of it, comment spam is commercial. Its aim is to earn someone money through mass advertising. The source of this hoped-for revenue is either clickthrough or actual purchases. The comments in my corpus, by contrast, are not commercial; they are communication from individuals who, as we will see in a while, have very different aims.

Comment spam ends up on a blog through repetitive, possibly automated posting, rather than a more-or-less unique string posted by an individual. Frequently, comment spam repeats the same phrases, over and over. Specific strings show up as "tokens" or identifiers if you are looking over a blog or guestbook which has been hit by spam. Here are a few examples:

Hi friends! Sorry for it, but I very need money! :( massage therapy schools minnesota | []massage therapy schools minnesota[/url] || best medical schools in the world | []best medical schools in the world[/url] || columbia university medical school | []columbia university medical school[/url] || medical billing training school | []medical billing training school[/url] || consolidation loan medical school | []consolidation loan medical school[/url] || medical transcription schools | [ (from .
Hi friends! Sorry for it, but I very need money! :( hardcore lesbian sex | []hardcore lesbian sex[/url] || hardcore movie | []hardcore movie[/url] || hardcore partying | []hardcore partying[/url] || hardcore picture | []hardcore picture[/url] || hardcore porn | []hardcore porn[/url] || hardcore porn star | (etc.) (from )
Hi, nice site! [url][/url] Warm regards Barbara!

(from )

provillus love [at] pandol [dot] com 2007/3/13 11:51 Nice site and good design! Best regards! (from )

"Hi friends! Sorry for it" and "nice site" were two strings which were quite useful to me in weeding comment spam out of my own blog. And, as it happens, if you go looking for the string "nice site!", you may eventually find bloggers who are trying to clean spam with this token out of their accounts:

This is not to say that off-topic comments are never repetitive; sometimes they are. But as a contrast to spam, it appears that when these errant commenters post more than once (as in the "repeat offender" category on, ), their text is less likely to be cut-and-pasted between comments. Commenters often do make the same pleas over and over on different sites, to different celebrities, but they appear more likely to be writing comments by hand and changing some elements as they go along.

Another commercial element that sets comment spam apart from gumbabies is the number of links contained therein, as is visible in the spam above. Strangers in my corpus are not generally looking to promote other websites; they rarely link to anything. And comment spam is more likely than non-spam to have links to sites in certain country-code domains: specifically, those of Russia, Poland, Brazil, and Romania (.ru, .pl, .br, .ro).
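The three markers described above (repeated token strings, a high number of links, and links to particular country-code domains) can be combined into a rough screening heuristic. The following is a minimal sketch, not the procedure actually used to assemble this corpus. The threshold of three links, the function names, and the example hosts are assumptions chosen for illustration.

```python
import re
from urllib.parse import urlparse

# Token strings named in the text as useful for spotting comment spam.
SPAM_TOKENS = ("hi friends! sorry for it", "nice site")
# Country-code domains the text associates with comment spam.
SUSPECT_TLDS = (".ru", ".pl", ".br", ".ro")
# Assumed threshold: errant commenters rarely link to anything at all.
MANY_LINKS = 3

URL_PATTERN = re.compile(r"https?://[^\s\[\]|]+", re.IGNORECASE)


def spam_signals(comment):
    """Report which of the three spam markers appear in a comment."""
    urls = URL_PATTERN.findall(comment)
    hosts = [urlparse(url).hostname or "" for url in urls]
    return {
        "token": any(token in comment.lower() for token in SPAM_TOKENS),
        "many_links": len(urls) >= MANY_LINKS,
        "suspect_tld": any(host.endswith(SUSPECT_TLDS) for host in hosts),
    }


def looks_like_spam(comment):
    """Flag a comment if any single marker is present."""
    return any(spam_signals(comment).values())


if __name__ == "__main__":
    print(spam_signals("Hi, nice site! Warm regards Barbara!"))
    print(looks_like_spam("I have been trying to reach the show for months, please call me"))
```

A screen like this will misfire in both directions: a sincere commenter may well write "nice site," and spam without links or known tokens will pass. It is best read as a first pass to be followed by reading the comments themselves.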

419 scams

For similar reasons, I am not going to be looking at money-scam comments that run along the lines of those sometimes sent by mail. These are stylistically similar enough to "419"-style emails (see , which interestingly has pre-Internet predecessors, both in 419 frauds sent by fax or telex, and in Spanish Prisoner frauds dating to even earlier: ) that I think they constitute a part of that phenomenon and not this one. Here is an example of a comment-posted 419 message, which should have a familiar ring if you have a hard time filtering spam out of your email:

Dear friend, Please accept my appology for sending you this mail since we have not meet nor see each other before, please i am sorry for any embarracement that this letter may cause you, the truth is that i need your help, my late father deposited some fund in a local bank here with my name as the only daughter, worth of nine million five hundred thousand dollars $ 9.5 million, now that i have lost my dad and i am the only one that left in the family no one to help me, i need your help so that i can transfer the money into your account in your country. I can not handle it alone that is why i need your help please help me claim and transfer this money into your account so that you will help me to invest the money in a very good business since i don't have any one to stand for me and for the transfer and i am still yound i hereby seek for your sincere help in other to invest my money for my future please i want you to read this mail with free heart and try to put consideration into it for you to understand my situation for i need your help into this issue. I have every documentation that back up the fund all i need from you is your sincere help that's all please help me, upon your response to my mail i will direct you on the next step to follow. Thanks for your kind consideration and God bless you. mary Beko. (from )

There are similar commonalities to Craigslist scams, which otherwise might be mistaken for errant commenters because they don't appear to have read the original post: While commenters in my corpus sometimes requested financial assistance, not one of them requested or provided information for a direct bank transfer.

Appendix C: Graphs of turn-taking patterns

Links to PDFs. longfiles are loooooooooong

Visualization of the Harry Potter thread

Visualization of the Overhaulin thread

Visualization of the Maury thread


My own understanding of computers has been greatly enriched by the excellent computer education I have received throughout the years. I don't think I would have the fundamental insights about technology I write here without my teachers in the early years: Dave Kressen, father of the Polytechnic School computer lab, and his successors Jon Fay, Lyle Hatridge, Malorie Wiebe, Carol Thornton, and Deborah Lee, who let me spend more time than I should have been allowed in the lab. Two other people were central to my early exposure to computers: my grandfather, Gilman Andrews, who showed me how to both work and play on his Tandy; and my best childhood friend Robert Durff, whose curiosity, playfulness, and willingness (eagerness?!) to break things led us both to push computers' capabilities to their limits. Robert now works for Microsoft.

My understanding of online communication software also owes great debts to Evan Henshaw-Plath and Matthew "Glyph" Lefkowitz, subjects of my undergraduate thesis and my first teachers about the Web and open source software. Kellan Elliott-McCrea, Yoz Grahame, Elizabeth Goodman, Elizabeth Churchill, Jon Gilbert, Blaine Cook, Tom Carden and their friends in the Bay Area helped me debug many of my early ideas about how blogs work and how people use them. At some breakfast in the Mission, one of their friends from Blogger quipped "People will put just anything in a comment box," and that phrase has continued to goad me to understand that apparently unthinking action better.

Extra-special thanks go to Matt Haughey and Jessamyn West for giving this project a huge boost. My conversation about the errant-commenter phenomenon with Matt made me realize how broad this phenomenon was, and Matt and Jessamyn's amplification of my request for more examples on MetaFilter gave me a much larger corpus than I might have found on my own. Thanks to the MetaFilter community in general, and users divabat, jonson, fixedgear, hades, cortex, and squirrel in particular for help in finding threads, and to discoursemarker for suggesting connecting this dissertation up to the AoIR community.

During the course of developing this dissertation, some colleagues at Linden Lab were valuable tutors: Kent Quirk tutored me on the inner workings of computers, James Cook gave me a crash course in how Google's search apparatus works, Erica Olsen supported me in looking into the academic fields of information and library science, and Stephany Filimon offered doc-student sympathy. Thanks actually need to be extended to the entirety of Linden Lab for supporting me through a difficult, somewhat errant pilot for this project.

If I've erred at any point in this dissertation, it is my fault, not theirs!


Copyright Gillian Andrews 2009. All rights reserved.
