User:Gusandrews/SearchProject/RunningNotes
From Studyplace
12/15/07
See personal notes about the Stamen party.
11/13/07
Talked with Kellan's friend Tom from Stamen on Sunday night. Tom does work on visualizing data; he had previously briefly worked on a doctorate in engineering, among other things. These are some ideas he brought up when I raised the search question to him:
First of all, he lamented the fact that when people game Google PageRank -- usually search engine optimizers -- it not only makes more search results "bad," but also makes web pages "bad." When I pressed him on what this meant to him, he referred to the visual garbage and other advertising around the screen which detract from the information he really wants. He sort of gave a Strunk-and-White-of-the-Internet justification, claiming that generally, a well-designed page is also one with better content.
Second, we talked a little about the possibility that many people might go with their first search result no matter what. It's what came up when you did the search, right? says Tom. It must be the right answer.
He notes he never uses the Google "I'm Feeling Lucky" button, because he generally believes he won't find what he wants in the first result. He'll usually go back to his results with some modifiers after the first round, and pick something a ways down in the stack.
11/09/07
Returning to my four questions, in the wake of a conversation with Matt C:
- How do different groups do the Internet differently?
- What are X's mental models of search?
- How are different Internet practices related to power structures? How do they influence power structures?
- and finally, Why the hell do people keep showing up on my website and asking if I'm the actor Ashton Kutcher?
Matt says ditch question #1, and I do think he's right. It's vague. I guess what I meant to imply was that eventually I'd identify groups to compare. Anyway, it's explicated in the other questions, so I figure I'll be covering it regardless.
Matt liked the mental models question better, and I do too, at least enough to make it into a methodology for my Second Life pilot -- which, it appears, is going to happen. Good stuff.
Here's how I ended up threading the questions together talking to Matt:
If users and producers prove to have different mental models of how search engines work, whose mental model ends up shaping the way search engines actually work -- and, as a result, which websites get read and contributed to? Does the (L)user then end up not being able to do searches in a way that makes sense to them, to find the pages they need, to make themselves heard where they should be? Is this really the search engine developer's fault, or is it a result of the "essentially neutral" networked structure and tremendous content of the Internet?
This *still* runs afoul of the huge problem of what "search" is. The "what would you be studying if there was no Internet" clause comes in here: let me tell you, it wouldn't be search. It would be "how people decide to buy a new car," "what people do to entertain themselves," "how people seek medical solutions." And mind you, this is a problem with Google's mechanism, too: it aims to be a part of each of these domains. (And it becomes one. And generally it does ok.)
Not only that, but I still can't make this meaningful to anyone outside of academia. Not being relevant to non-academics makes me very uncomfortable at the moment. Maybe I should move back to New York and the department. Maybe that would make me more comfortable with this topic. But at what price?
11/07/07
A little nugget of quotes gleaned from some older notes when I was studying Soja and/or Nespor (of ANT fame), apparently:
- Soja: "Social life must be seen as both space-forming and space-contingent, a producer and a product of spatiality." (1989)
- Nespor: (academic disciplinary) "power... in these terms, is about the production of space-contingent social life at the expense of space-forming social practice"
- de Certeau: thus a discipline's space becomes routinized, standardized so it can work with other parties
11/06/07
A quick thought which seems to have sprung from returning to de Certeau earlier:
When users use NOT operators in a search string, aren't they adapting to a search algorithm which they are (to some extent) aware is not searching the domain they're interested in?
The thought builds on the idea that search engine algorithms make weird approximations of what people really want.
Couple of cases to support or undermine this idea:
Case 1: I go looking for "video games" in Google Scholar, wanting to find the literature on games and education which I know is out there. I unexpectedly find my first results are crawling with "games are bad for you" studies produced both in medicine and psychology -- two lines of thought I'm really not interested in. I add "-violence -cardiology" and proceed.
Case 2: User X sits down with Google and searches for '"video games" -Nintendo -Playstation'. The user is aware that within the field of video games, there are different types of games, and he is only interested in one game category. He thinks it will be easier to find what he's looking for if he begins to divide the field up into categories he knows.
Both these cases support the idea that NOT operators help limit the general search engine's field of view, specifically in terms of Gee-like Discourses or domains. The difference between the two is that in one case, the user discovers unexpected results and then decides to narrow it down, while in the other the user goes into the search knowing that the engine is not "smart" enough on its own to find what's needed, and needs some guidance. (There -- that's a better reckoning than yesterday.) The thing is, though, both of these cases assume a user who understands NOT operators; I'm guessing not every user will.
Case 3: User X goes looking for the phrase 'VIDEO GAMES ARE NOT GOOD FOR CHILDREN' using a search engine which recognizes Boolean operators, unaware that the engine is sensitive to these operators.
This would be an accidental running-afoul of the NOT operator, then. I'm not even totally sure it would happen this way, but I suspect it might in some cases.
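The three cases can be sketched as a toy query parser. This is only an illustration under loud assumptions: `parse_query` and `matches` are hypothetical names, the matching is plain substring search, and no real search engine is claimed to parse strings this way. It just shows how a leading "-" narrows the field of view, and how a literal NOT could be swallowed as an operator in Case 3.

```python
import shlex

def parse_query(query, word_not_is_operator=False):
    """Split a query into included and excluded terms (toy model only).

    Quoted phrases stay whole. A leading '-' excludes a term. If
    word_not_is_operator is True, a bare NOT excludes the term after it,
    which is the hypothetical behavior behind Case 3.
    """
    include, exclude = [], []
    negate_next = False
    for token in shlex.split(query):
        if word_not_is_operator and token == "NOT":
            negate_next = True
            continue
        if token.startswith("-") and len(token) > 1:
            exclude.append(token[1:].lower())
        elif negate_next:
            exclude.append(token.lower())
        else:
            include.append(token.lower())
        negate_next = False
    return include, exclude

def matches(text, include, exclude):
    """True if text contains every included term and no excluded term."""
    text = text.lower()
    has_all = all(term in text for term in include)
    has_none = not any(term in text for term in exclude)
    return has_all and has_none

# Case 2: the user deliberately carves up the field.
case2 = parse_query('"video games" -Nintendo -Playstation')

# Case 3: the user types a sentence; NOT is treated as an operator,
# so "good" ends up excluded rather than searched for.
case3 = parse_query("VIDEO GAMES ARE NOT GOOD FOR CHILDREN",
                    word_not_is_operator=True)
```

In this sketch Case 3 silently turns "good" into an excluded term, so the user would be filtering out exactly the pages that use the word they typed, which is the kind of accident worth asking interviewees to demonstrate.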
In sum, it might be good to ask a question like "What do you do if the search engine isn't turning up results you like?" or even ask the user to demonstrate that. Feels like there might be a whole string of questions to be asked along these lines -- maybe one for different elements of the search engine (operators, ranking, etc).
de Certeau also reminds me about something at the theoretical level. I routinely talk as if there's been no middle ground explored between the hegemonic models of Bagdikian, McChesney, and Chomsky, and the user-positive models of Jenkins and other fan-studies scholars. That's disingenuous. de Certeau keeps in mind power structures and agency at the same time. He even gets pretty close to something like Latour; strategies of the powerful and "tactics" of the powerless essentially network with each other, dependent on each other's moves.
What this leaves me, I guess, is an ability to say my project is de Certeauvian. Because with searchers hunting in the unknown forests at the edge of what the makers of the Internet actually understand about the Internet, for quarry the makers hadn't thought to eat, it really does feel close to de Certeau.
The problems remain:
- identifying a group of searchers which calls itself such
- justifying how this is helpful for human knowledge, and what the hell someone could do with what I discover.
10/28/07
I am seriously losing the thread out here.
I don't know if it's because I'm so far from the department, I'm just going down a rabbit hole on this subject, I'm just not particularly pleased with my life at the moment, or some combination of the above, but I'm not happy about this topic as the subject of a dissertation right now. Not that I'm not happy about the subject of search. I am on fire about the subject of search. I just think it's not a very reasonable dissertation subject.
Here are four questions I'm focusing on. On some pieces of paper taped together on my desk, I've charted these out against which types of research methods seem to be a good match for these questions:
- How do different groups do the Internet differently?
- What are X's mental models of search?
- How are different Internet practices related to power structures? How do they influence power structures?
- and finally, Why the hell do people keep showing up on my website and asking if I'm the actor Ashton Kutcher?
The first three questions, as their more circumspect wording suggests, are reasonable questions within academic discourse. The last question is the rabbit hole to which I'm referring; I still find the Ashton Kutcher question the most compelling part of the whole picture.
Here are a few thoughts on the questions:
How do different groups do the Internet differently?
The information retrieval/library science fields have done the question of how different groups do search up decently. Thanks to Case (2002, 2006), I now know they've looked at voters and siblings, doctors and stock analysts, students and seniors, and various other groups.
My summary assessment of this literature is, for starters, that the bias in answering this question has largely been towards professional searchers. This is because of historical origins of the field, according to Case.
Second, I think LS/IR scholars are still not looking closely enough at search in social context. For the most part IR/LS has not been looking at the models of search people carry with them from one context to another. How does what a person learns about search in school relate to what they do on the job? In the library? With friends? Do people carry search strategies from context to context? I do believe this has more recently been approached by some scholars in Finland; I am starting to see bibliographies which contain both the major scholars of IR/LS and scholars more familiar to CCTE students -- the likes of Lave and Wenger, Gee, Latour, even Garfinkel. Haven't read those articles yet; they're next in the hopper when I finally get time.
There's a huge opening here for someone to waltz in from New Literacies and talk more to LS/IR folks about Jim Gee's idea of Discourses, how they relate to professional fields, etc. I think someone doing this would be very well-received, and I would certainly love to be the person doing this.
The trouble is the usual one with waltzing in anywhere and introducing people to Discourses. First, the benefits of using a Discourse lens are generally pretty hard to explain. My feeling is the response to the Discourse model tends to be, "Duh. Yeah, we know people talk different ways in different fields. What exactly am I supposed to change in my school/library/classroom as a result of your research?"
Second, not to be too calculating, but I'd be waltzing in to explain discourses to information retrieval and library science scholars. I'm not exactly sure how big of an impact that would have on human knowledge or even daily life in schools.
What are X's mental models of search?
The main rationale for jettisoning the search question would be time. This question seems particularly time-sensitive.
The more I talk to Varenne, the more I look at the gaps in the literature, and the more I think about the questions I want to ask, the more I think this particular question would be best served with a longitudinal study. I think the interesting question here is about the different styles of search people learn over the course of a lifetime, and how those do or do not build on/inform their other social ways of being. You can ask them for a current snapshot of their mental model of search, but you will not know where different elements of this model came from.
Varenne rejects the idea of asking people about their histories, and I buy his reasons for that rejection; I think I'd get more realistic data in vivo. I think Kuhlthau's longitudinal study was useful (was it?), and I think another longitudinal study which is more oriented toward general Internet searching would be useful.
How are different Internet practices related to power structures? How do they influence power structures?
I ran this one by my boyfriend (patient soul) last night, and ran into the usual hurdles while I was trying to explain it:
For starters, I tend to explain this question as "Who has power on the Internet?"
Well, "power over" what? Usually what I'm looking at when I ask this question is comment threads, search queries, ownership of blogs -- forms of personal expression on the Internet, I guess. It seems like a slippery place to start talking about power. It might make more sense to talk about server ownership, bandwidth, or PageRank.
And what does "power" mean? Reflections of power in other aspects of life? The ability to shut other people up? The ability to marginalize people who are doing the Internet in a different way?
I'm not really equipped with a theoretical framework to explain a "power structure," anyway.
Asking who has power over the Net is great, and all -- it feels satisfyingly important and Chomskyan -- but it's kind of disingenuous. I don't believe in any given model of power in other media. Audience studies has a tendency to ignore material resources available to different actors; hegemonic depictions of the media don't allow for de Certeauvian "poaching." I guess my lack of faith in any one model is one of the reasons I'm seeking to play out the power question to begin with: I want to find a middle ground between Jenkins and Bagdikian, de Certeau and Chomsky.
But describing what that middle ground looks like still runs afoul of the ethnography trap. Great, people say: so you've described the intricacies of how this plays out in everyday life. How is this replicable? Isn't it anecdotal? And once again, how the hell am I supposed to use this to improve my life on the Internet?
Why the hell do people keep showing up on my website and asking if I'm the actor Ashton Kutcher?
I have begun to make the mistake of explaining my nascent questions about search to people using the Ashton Kutcher question as a way to make it tangible to them. I explained my ostensible dissertation to a program manager at Google the other day. She was very interested, got what I was talking about quickly, all that.
In the course of describing it to people as a question of why people ask these stupid questions and show up at my website, I've gotten all excited about it again. I've gone looking for more examples of the phenomenon, and found dozens. I've described it to friends, who urged me to go talk to Internet luminary Cory Doctorow, as he's seen this phenomenon on his website, BoingBoing, for years. I've even developed a name for it: since my friends and I decided it wasn't quite the same as a honeypot, I figured it was closer to a gum baby (or, if I was paying less attention to race and the evolution of the original Anansi legend, a tar baby).
Worst of all, I've dreamed up schemes for re-creating gum babies on purpose. I'm pretty sure I could do it using Google AdWords, though the IRB would probably stamp "MILGRAM" on my proposal in red ink and send me packing if I tried to do that in anything but a pilot project.
10/12/07
After some more gleaning of studies from Case, I took a nostalgic tour through the various websites which have had people show up asking questions of celebrities. I still can't get over that particular use of the Internet: people trying to contact stars who television has brought very close to them.
This may be a population I'd like to recruit to study, because they clearly do the Internet very differently than its developers and optimizers do. And they're definitely a community of practice: so many of the requests made of celebrities are similar in content, or phrased the same way. I count about ten sites in all which have had hundreds of comments like this. The comments on the MeFi thread (which resounded around the blogosphere -- ok, only a little; Google says only 10 sites linked to the original blog thread, and only 3 to MeFi) themselves demonstrate once again the differences between people who consider themselves to be doing the Internet "right," and those they see as doing it "wrong."
I'd been at a loss as to how to recruit the "doing it wrong" community until it occurred to me today: why not create a honeypot for search terms like "celebrity email address," and use Google AdWords to drive people to it? I may detail more about that in my methods section.
Meanwhile I need to get back to thinking about how to design what happens when I actually sit down with people -- have to decide what I'm testing in the pilot study, and whether doing that pilot at Linden Lab actually makes sense.
(and yes, I figure it's not sustainable to keep talking as if the celebrity-searchers are "doing it wrong.")
10/10/07
Am reading Case, 2002 and he is already way hipper than Marchionini -- references to Dewey and Bruner. I can maybe dialog with this dude.
Dervin (1992) seems to retain currency; she's mentioned under theories of sense-making.
10/9/07
Really, for serious reals, completely and totally done with Marchionini, not going back, no idea why I bothered to keep reading more chapters today. Lots of taxonomies, many of which are relative to their historical era. Some might come in useful after gathering data to compare and to develop a vocab for speaking to IR types, but not even sure about that.
10/8/07
Really about ready to give up on Marchionini, possibly on library science/information studies in general, unless I can find evidence that more recent works have moved further away from mind-as-machine lines of thought and more towards social practices analysis. There's a real absence of thought about knowledge as socially situated, embodied, and practiced in Marchionini so far, and a real lack of perspective on the relative, subjective nature of a field which set out from the interface between the librarian and reader, the web database and the searcher. I feel like I'll have more tools to work with back at the home base of Lave and Wenger. Will skim the rest of Marchionini though. If working with libsci/IR in the future, must must focus on any attempts they make to be social.
10/5/07
Returned to Garfinkel. I'd been hopeful, but now I'm not really sure he'd be useful to my project. His emphasis on observable behaviors would be really hard to reconcile with searching, an activity which takes place almost entirely within the user's head with few discrete moments of externality (going to the computer, the entering of the search string, clicks through to other pages, getting up from the computer). It would require either a lot of interrogation or a lot of inference on the part of the researcher to see anything much more to search than this, it seems to me.
I guess I could try to look for other occurrences in people's lives when they make use of what they learned, but how to sift those instances from the thousands of other moments in a person's life? (And under the constraints of IRB, no less?!) Maybe I could be sure to observe an event which is thematically related to the search (dancing, finding dance shoes online; doctor's appointment, looking for medical information online).
One thing that does intrigue me on second reading of Garfinkel, though, is the bracketing/breaching idea. How could I interrupt search in order to bring out people's assumptions about it? This was actually something we did a great deal, inadvertently, at Linden Lab: our search engine was such poor quality that users' ideas about what a "good" search engine should do were expressed routinely, in frustration. "Broken" search engines might well be an interesting way to elicit expectations. And I was thinking it might also be useful to have programmers and search engine optimizers view videos of users making use of a search engine, and comment on what "looked right" or didn't look right in the way they saw the engines used.
10/4/07
Went back to Situated Learning: Legitimate Peripheral Participation by Lave and Wenger. Interesting in contrast with the IR/library science literature: L&W have a hard time countenancing anything individual at all, while IR barely ever deals with the social in a reasonable way, at least in the somewhat antiquated lit I've seen so far.
(Remember Lave and Wenger insist that domains are created by people *doing.*)
L&W make a great case for learning in-situ, in-social. Their model of learning matches with one end of Marchionini's spectrum of finding information: knowledge accretion, as opposed to an event-initiated hunt for particular ideas or facts. Which makes me wonder: Is the entire field of IR, with its ideas about retrieving discrete bits of information, subjective and specific to Western ways of thinking, to computer databanks and libraries, to this strange concept of the learner as a unique individual brain divorced from context? And if so, in what ways is that literature going to be useful to me?
Re-introducing myself to Lave and Wenger's take on knowledge domains, I wondered about my assumption that different search tactics are tied to specific knowledge domains... maybe I'm underestimating the role of general skills?
And I wondered if I could get searchers to put on "hats" indicating different roles in lab research, the way we do with younger kids when we're encouraging different kinds of thought... or whether that would just be too silly. (Although the idea of being known for doing "silly" research kind of appeals.)
10/2/07
Still not finding anything about:
- the moment they sit down to craft the string
- strings and strategies for different domains
- anyone who's really done work on what domains ARE... no research yet in my experience which asks about users' own domain background. How would I go about doing this?
Potential problems with doing stuff on search in Second Life:
users are not looking for ideas or information there -- they're looking for actual things, and for facts! Are they looking to do knowledge accretion?
How to really get good pictures of mental models in research?
sketch of study, if working with Linden Lab is possible:
- Get a picture of developers' mental model of domain, of engine?
- Get picture of users' mental model of domain, of engine?
- Have users walk through how they would search using old SL search
(also how they'd do it in Google?)
- Then have them do the same search in new SL
This is a question about domains of knowledge.
What work has been done about domains of knowledge?
About how they interact with/are defined by the people engaged with them?
About who has power over them?
Aha, now this is sounding like a Lave and Wenger issue!
I am interested in how bodies of knowledge get defined.
6/10/08
Revised my initial plan of randomly picking 10 early gumbabies to use as a test of atlas.ti; will use early gumbabies with highest numbers of comments instead, and a balance between celeb and tech help threads.
I think Varenne will say one of two things:
- this is not a phenomenon because the people wouldn't agree this is the way everyone does it, or
- this is not a phenomenon because there are different topics you are looking at
I am worried that I will choose the tech topic and not find people, or choose the stars topic and get bored with it because it requires looking at fantasy behavior, or that I will just bound things wrong and people will criticize me and my research for it.
6/26/08
Things I am not currently coding:
Type of appeal
Non-Internet or print media references (i.e., TV, movies, etc.)
Request to see something/see celeb perform action on show
Request for gifts or money
Speech-type greetings and closings (hi, g'day, etc)
Probably re-code “grammar,” I expanded it into more categories
References to stupidity
Possibly need to rethink “phatic” – lots of reference to channel in Google Answers thread, but is it the same when they are not talking about the IMMEDIATE channel? Some of the stuff already coded for non-strangers may be wrong
"My name is" intros
l337
lack of punctuation. But should I be coding that? I mean, the Internet is supposed to have lax rules about these things.
quotes from another post
URLs (not field)
non-postal (generic) signoff -- ADDED 7/24/08
6/28/08
New hypothesis: Need to code each COMMENT by age, gender, etc in order to get it to show me connections between codes. Just coding a word or a string will not give me co-occurrences. So: characteristics of commenter coded to comment; others coded to sentence.
Also, I want to use co-occurrence relations, not “neighbors” – the neighbors function, counterintuitively, gives you chunks of text, not nearby codes.
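To make the co-occurrence idea concrete for myself, here is a minimal sketch of what counting co-occurrences at the comment level amounts to. It is not atlas.ti's method or API; the function name `cooccurrences` and the code labels (including the commenter characteristic "age:teen") are hypothetical stand-ins for my own codes.

```python
from collections import Counter
from itertools import combinations

def cooccurrences(coded_comments):
    """Count how often each pair of codes lands on the same comment.

    coded_comments is an iterable of code collections, one per comment.
    Commenter characteristics are treated as codes on the whole comment,
    so they can pair with codes applied to sentences inside it.
    """
    counts = Counter()
    for codes in coded_comments:
        # De-duplicate and sort so each pair has one canonical order.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts

# Hypothetical coded comments, one list of codes per comment.
comments = [
    ["age:teen", "l337", "phatic"],
    ["age:teen", "l337"],
    ["phatic"],
]
pair_counts = cooccurrences(comments)
```

The point of the sketch is the unit of analysis: pairs only show up when both codes are attached to the same comment, which is why commenter characteristics need to be coded to the comment rather than to a word or string.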
7/6/08
NOTE: Currently saving PDFs solely for the sake of text analysis and quotation. Should return to page when possible in order to refer to graphical and layout elements, which are not currently being preserved well by PDFs. (Gave this up but made it through Tiara's posts.)
Which gumbabies am I using? Method: I started with three threads on MeFi: Maury, Demon Dogs, and Tony Hawk. (Might also use Referrer Results, 2000 to demonstrate how bloggers "accidentally" up their hits for bizarre search strings.) I am using all the posts linked to in comment threads. Where necessary, I have also contacted people who mentioned threads on their blogs but did not link them. I looked through comment threads for additional links and mentions. Then I asked the MeFi community for more recommendations. Some of these include culled lists of comments, guestbooks, and email -- identified by bloggers as the same phenomenon. Finally, there are a few threads recommended by friends with whom I discussed my doctoral work, as well as threads from my own page. There are also threads recommended by readers of my gumbaby blog. Currently I am considering also using Google-found gumbabies where bloggers and their peers react within the comment threads, but the current list does not contain those threads.
Part of the difficulty with capturing and analyzing these threads is there is no guarantee the analyst is looking at what the commenters see. As a result, I used RTFs for extensive textual analysis, referring to screenshots from the pages in question when questions of graphics and layout came up.
Website display is subject to extensive variation. Any number of bits of software and code may be manipulating what the user sees -- from toolbars the user wanted to install which shrink the amount of the page which is visible at a given time, to unwanted malicious software which adds content to pages or brings up popups. Lack of certain software or use of older browsers may make it impossible to view some content on a page. Ads in a sidebar may change; the user may have a piece of software installed which blocks some ads from being displayed. Individual users may have their browsers set to display pages differently: without graphics loading; with text enlarged; with disability-support software altering the color display or how much of the entire computer screen is displayed. Pages display differently in each of the different browsers. Finally, monitor size and configuration -- from a multiple-monitor setup down to display on a cell phone -- shape how much of a screen is visible at a given time, with what configuration, with how many linebreaks.
Top-notch writers of webpages know this, and the variability is a subject of ongoing discussion, with solutions aimed at reducing it. Cascading Style Sheets (CSS) and Extensible Markup Language (XML) are two technical solutions which have been developed to manage this variability.
And of course, not every webpage writer is aware of CSS or XML. There are still pages out there hand-coded in HTML. Many wikis present style guidelines, but the technology does not oblige users to follow these.
