Article Summary for Lecture #14—Schuitema

In “The Future of Cooperative Cataloging: Curve, Fork, or Impasse?” Joan Schuitema explores the history of cooperative cataloging in search of patterns that might help provide direction for the future landscape of the profession. She contends that the future of cooperative cataloging is in question due to shifting values in librarianship, rapid technological change, and competition from other information retrieval tools and techniques—e.g., Google interfaces. By looking at some of the landmark developments within the field, Schuitema hopes to determine exactly why particular changes came about and, more importantly, how those changes and their impetuses might relate to the current climate of cooperative cataloging.

Schuitema begins her historical assessment by noting the efforts of Charles Coffin Jewett in pursuit of his goal to make the Smithsonian Institution a hub for library organization—a plan impeded by a lack of funding. She then highlights attempts by libraries to partner with publishers in distributing records along with publications. While such efforts have continued into the present, they have often met resistance due to their tendencies to slow down production, increase publishing costs, and lag behind changes during publishing. Finally, Schuitema examines the impact of printed cards on the development of cooperative cataloging and finds that, although they served to promote standardization, the success of card systems—such as the LC’s—was limited by increases in costs and the inability to keep up with production rates. “By 1940,” says Schuitema, “cataloging had reached crisis proportions and arrearages were growing at alarming rates.” (261) The most intriguing aspect of the history of cooperative cataloging for Schuitema is the fact that practitioners “continue to wrestle with the same issues.” (262)

To understand why these issues persist, Schuitema shifts her attention to the current landscape of cooperative cataloging and contemplates the profession’s future. She notes that catalogers are burdened by the continued growth of knowledge and expressions of that knowledge within new formats. Traditional practices are being outsourced and automated, and rules and standards are becoming more elaborate. Coupled with these issues is the fact that libraries are continuously finding ways to cut costs. Perhaps the greatest problem facing catalogers today, according to Schuitema, is that of dealing with shifts in the information-seeking behaviors of users—who, instead of relying on the “pre-defined paths” provided by catalogers, are increasingly utilizing their own techniques to acquire information. Finally, by assessing the nature of cooperative cataloging “through the eyes of a therapist,” the author argues that anxiety regarding the profession’s future largely stems from changes in societal values, which often result in the devaluation of traditional methods. The shifting retrieval needs of users, contends Schuitema, demand new skill sets, and many practitioners are simply unable, or unwilling, to adjust to this demand.

While she states that the profession’s future is less clear than ever and that the past fails as an appropriate guide, Schuitema is still optimistic. She sees cooperative cataloging moving toward a fork in the road, one path involving the continuation of traditional methods and the other “curving” to accommodate changes in societal values and information-seeking behavior. Citing Marjorie Kelly to stress the notion “that we can’t advance as long as we’re holding tight to what no longer works,” Schuitema warns that if catalogers fail to meet new challenges, the profession will be at an “impasse.” She suggests that librarians “rid themselves of the notion that there is one best way of organizing the world of information.” By doing so, the cataloging profession can avoid diverging on different paths and, instead, find ways of bringing seemingly disparate needs into “confluence.” (264, 268-269) Schuitema makes a strong case. Catalogers must adapt to technological changes and shifts in values in order to advance the profession and meet the information needs of patrons. If cooperative cataloging cannot effectively change course, then competing methods of addressing retrieval needs may indeed render the industry obsolete.

Article Summary for Lecture #11—Olsen, Nielsen, and Dippie

In “Encyclopaedist Rivalry, Classificatory Commonality, Illusory Universality” Hope Olsen, Juliet Nielsen, and Shona Dippie examine the cultural construction of classification systems by deconstructing encyclopedic texts produced by Jean d’Alembert, Denis Diderot, and Samuel Taylor Coleridge. Their findings suggest the presence of biases within certain classification structures with the “potential to erase cultural identity.” Specifically, the authors assert that the hierarchical structure of mutually exclusive categories presented by Coleridge and d’Alembert (and, to a lesser degree, Diderot) “result in a homogeneity that denies difference and identity.” The article offers excerpts of the texts, careful descriptions of the classification systems put forth within them, as well as analyses of the cultural and historical underpinnings of each. By revealing the potential for bias within the classification structures of the selected texts, the authors hope to encourage alternative approaches to these widely embraced forms of knowledge organization. (457, 463-464)

Following a brief introduction and an explanation of the methods used to encode the texts, the authors note some key differences between the French and English schools of classification. The first of these deals with purpose. Diderot and d’Alembert, writing in the decades before the French Revolution, sought to supplant the traditional authorities of crown and church with enlightened reason. Their inspiration, therefore, was grounded heavily in the European Renaissance. Coleridge, by contrast, saw the French Revolution as an attack on legitimate sources of logic and reason, primarily the divine, and esteemed the order and method found within medieval England’s scholastic curricula—the trivium and quadrivium. But while the French and English writers “seem leagues apart” in terms of purpose and history, the authors contend that their texts express several important commonalities. (458-460)

The authors claim that Coleridge and d’Alembert shared a belief in the universal structure of knowledge organized in mutually exclusive categories arranged according to teleological progression and hierarchical primacy. The basis for mutually exclusive categories is found both within d’Alembert’s concept of “impenetrability,” which divides space into categories separated by the unique properties of the bodies they contain, and Coleridge’s differentiation of the uniting and progressive attributes of, respectively, law and theory. In addition, both writers saw knowledge as working toward human progress, a purpose that, according to the article, resulted in the logical subdivisions of classification presented by each. Coupled with this teleological progression was also an adherence to hierarchical arrangement. The authors ably show that Diderot and Coleridge classified objects and ideas by nesting categories representing them within one another, reducing and expanding knowledge in order to promote a generalized portrait of reality. (460-462)

While the classification schemes presented by the English and French thinkers might be applied universally, the authors contend that they are limited to western thought. The categories of classification suggested by Diderot and Coleridge are based on a progression of culture that advances certain principles of knowledge over others, which means that “cultures not characterized by these principles are not able to have knowledge, are not authoritative and lack identity.” (463) Olsen, Nielsen, and Dippie make a strong argument. It is important for organizers of information to deconstruct the theories and philosophies beneath the classification structures they employ, to point out potential flaws, and to work toward improvement. A system widely embraced and accepted to be universal in application may turn out to be exclusionary. However, classification professionals often have limited resources available to manage an ever-expanding abundance of information and, therefore, have to make the most of the systems in place. Perhaps the best that can be hoped for, and what the authors of this article assert, is that alternative approaches can work alongside traditional ones to foster greater cultural inclusivity.

Article Summary for Lecture #10—Northedge

In “Google and beyond: information retrieval on the World Wide Web” Richard Northedge examines the methods and problems involved with indexing web pages. He provides a brief history of approaches to online information retrieval, describing how the various methods developed and evolved, and offering useful descriptions of both human-generated web directories and automated search engines. Northedge notes that in the early days of the World Wide Web, directories such as Yahoo! offered adequate lists of web pages. However, the growth of the web quickly outpaced the ability of humans to index all the information it contained, which, according to Northedge, led to increased use of search engines. In spite of this prominent trend, online information retrieval processes still have not become fully automated but, at the time Northedge writes, feature a number of human-driven alternatives. Rather than anticipating the end of human efforts within this process, Northedge envisions a future of online information retrieval dominated by computer-generated indexes operating with datasets and language standards provided by humans.

Northedge cites Clay Shirky to point out the “failure of traditional human classification techniques when applied to the web.” Human indexing, Northedge continues, works best within a small corpus, with a fixed and unchanging text, featuring clearly defined categories and a controlled vocabulary. (192) The rapid and broad expansion of the web, consequently, made it an environment unsuitable for application of human-indexing methods. Search engines, such as the one operated by Google, provide a more appropriate means of indexing the billions of available web pages. Rather than searching the live web each time a user submits a query, explains Northedge, search engines employ a software program known as a “spider” or “robot” that continuously scans web pages, analyzes their content, and stores certain aspects of that content in databases. The particular means of collecting and storing the information gleaned from the web pages (done through the use of algorithms) is often proprietary—as is the case with Google. It is, nevertheless, understood that information is collected from web pages and stored as metadata that is, in turn, scanned by the search engine when users perform searches to produce relevant retrievals. (193)
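
The collect-then-search process described above can be illustrated with a minimal sketch. It is not Google's method, which the summary notes is proprietary; it is only a toy inverted index, and the page texts, the names PAGES, build_index, and search, and the rule that every query word must match are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for pages that a spider has already fetched.
PAGES = {
    "page1": "Cats and dogs make good pets",
    "page2": "Indexing the web with spiders",
    "page3": "Dogs need daily walks",
}

def build_index(pages):
    """Map each lowercased word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the sorted ids of pages containing every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # Only the stored index is consulted here, never the pages themselves.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)

INDEX = build_index(PAGES)
print(search(INDEX, "dogs"))  # ['page1', 'page3']
```

The point of the sketch is the division of labor: the index is built once, ahead of time, and each search reads only the stored data.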

While this process has become standard, Northedge notes several alternatives to the use of automated algorithms in indexing web content. One approach is to allow the creators of web pages to assign their own subject keywords—not a practical solution due to the potential for abuse of the system. A more widely accepted trend, says Northedge, is the assigning of keywords by the users of the web pages in what has been called “tagging” or “folksonomy.” Variability in language use is perhaps the biggest drawback to this approach, however. Because of such limitations, many have given up on looking for a way to index the web in its entirety and, instead, advocate indexing only those resources deemed “high-quality.” Northedge points out that others are focusing their efforts on ways to overcome the language variance problems of search engines—examples include shifting from word-based to “lexeme”-based systems and David Crystal’s taxonomic database, Textonomy. (194)

The strengths of Northedge’s article reside in his concise but useful description of search engines and his examination of these tools against the problems presented by indexing web content. Given unlimited resources, humans would still be better indexers than automated systems. However, time and money are limited and, so, computers must be involved in this process. As such, Northedge’s assertion that the future will consist of computer-generated indexing controlled by human-driven datasets and language standards seems a reasonable supposition. (194) Meanwhile, the web is becoming more extensive, and technological systems are becoming more sophisticated along with it. The next decade could hold unexpected shifts within the world of web page indexing.

Article Summary for Lecture #9—Mai

In “Analysis in indexing: document and domain centered approaches” Jens-Erik Mai discusses different approaches used by indexers to determine the subject matter of, and assign index terms to, documents. He argues that, by focusing strictly on analysis of the document and failing to take into account context and users’ needs, the prevalent document-centered approach is “problematic.” Mai instead suggests a domain-centered approach that shifts the focal point from the document to the context of the domain—i.e., “an area of expertise, a body of literature,” or “a group of people who share common goals.” This new approach adds to the traditional two-step indexing model of analyzing a document to determine its subject matter and assigning index terms to the document by including analyses of the relevant domain, the indexer’s perspective, and user needs prior to examining the document itself. Document analysis, contends Mai, is complex. The overall benefit of the domain-centered approach is that it “offers a framework to manage the complexity” and achieve “effective results.” (600, 605, 609)

Mai begins by defining the different approaches used among indexers and discussing the strengths and weaknesses of each. A document-oriented approach, says Mai, attempts to determine subject matter solely from the document itself—context and users’ needs are not considered. Differing only slightly is the document-centered approach, which, while still focusing only on the document when defining subject matter, bears users in mind when assigning index terms. Finally, a user-oriented approach takes users’ information needs into account both in determining subject matter and in assigning terms. Mai draws upon the work of Hjørland and Albrechtsen by combining their domain analysis theories with the user-centered approach. Domain-centered indexing is a more effective approach, Mai argues, because it provides indexers a “clear frame of reference for making decisions” that is “consistent with users’ use of information.” (599-600, 609)

Mai differentiates between the two steps involved with document-centered indexing when discussing the weaknesses of that approach. He claims that standard practices can be applied to step two—assigning terms—but, when it comes to defining a document’s subject matter, universal techniques simply are not applicable. This is because within a document-centered approach subject matter is based solely on analyses of a document’s attributes—title, table of contents, chapter headings, etc.—an ambiguous process, which, according to Mai, is often “left open to interpretation.” Since document attributes do not fully disclose subject matter, the indexer is forced to render judgments based on limited information. To do this a degree of “contextual knowledge” is required, which, in Mai’s view, provides the warrant for his domain-centered approach. (600-602)

Mai’s position on indexing is that context is key. Context is an essential ingredient in developing an understanding of a document’s intended uses, which, in turn, must be understood before its subject matter can be ascertained. His idea of domain is simply a way to construct the various contexts within which documents are analyzed. Contexts will therefore differ among domains and, so, slight variations in an indexer’s approach will be required. User needs and indexer perspectives, according to Mai, are additional parts of the contextual knowledge surrounding a document. When viewed at the document level, his approach works from the outside in, the steps of which are 1) analysis of domain, 2) assessment of users’ needs, 3) determination of indexers’ roles, and 4) analysis of document’s subject matter. Mai presents a very compelling case. The traditional approach, due to the inherent limitations involved with establishing subject matter solely from a document, seems flawed. A broader contextual knowledge of users, indexers, and documents seems a better formula for generating more appropriate and practical indexing results. (603-609)

Article Summary for Lecture #8—Knowlton

In “Three Decades Since Prejudices and Antipathies: A Study of Changes in the Library of Congress Subject Headings” Steven Knowlton examines changes among LC subject headings related to people through the lens of suggestions made by Sanford Berman in 1971. In his book, Prejudices and Antipathies: A Tract on the LC Subject Heads Concerning People, Berman argues that much of the terminology employed by the LC in its creation of subject headings is overtly biased, largely inaccurate, and often offensive. Berman cites specific examples of biased terms and, in many cases, proposes corrections. Using Berman’s assertions as a model, Knowlton examines the current status of the LCSH, identifying revisions made to certain headings and pointing out remaining biases. (124-126)

Knowlton begins with a general discussion of LCSH and the problems associated with its biased terminology. He notes that, in spite of the criticism directed toward the LCSH, it has still managed to achieve widespread acceptance among library professionals. In addition to the scrutiny aimed at the LCSH’s structure and form, says Knowlton, critics in the 1960s began to point to biased language within the subject headings. Subject terms favoring specific groups, Knowlton writes, “can make materials hard to find for other users, stigmatize certain groups of people with inaccurate or demeaning labels, and create the impression that certain points of view are normal and others unusual.” With its objectives including the accurate reflection of topical language, elimination of bias, and better guidance of users to material, Berman’s P & A is part of a surge of scholarship successfully advocating revisions of LC subject headings. (124-125)

Knowlton presents a series of tables listing Berman’s proposed changes to the LCSH. The tables distinguish between 1) headings changed in accordance with Berman’s suggestions, 2) headings partially changed (and that might entail new problems), 3) headings not changed, and 4) headings from Berman’s DIY category. To build them, Knowlton compares current LC subject headings with the suggestions made by Berman. He supplements these findings with information found in the Cataloging Service Bulletin both to confirm the changes and to identify the dates associated with each. His conclusions are interesting. Between 1971 and 2003, according to Knowlton, the LC followed through with 145 of 225 revisions proposed by Berman. Eighty-eight of these, Knowlton states, reflect exact changes recommended by Berman while 54 others show updates partially adhering to his suggestions. Objectionable headings that have not been revised express some form of “literary merit,” represent a “restructuring” of bias, or deal with differing opinions related to topical cross-referencing. (126-128)

Knowlton does well to highlight the many changes that have taken place within LCSH. Clearly, the efforts of Berman and others played no small role in facilitating these changes. I find the tables that display his findings to be a very useful feature of Knowlton’s article. Readers can scroll through the terms challenged by Berman, noting the actions taken by the LC concerning each. For example, readers will see that the term “mammies” has been deleted—replaced by “Child care workers, Wet-nurses, Nannies”—and all African American subdivisions of the term removed, and also that, despite objection, the heading “Slavery in the U.S.—Insurrections” remains. Any attempt to interpret changes such as these will involve a level of speculation. Nevertheless, LIS professionals—as well as users—need to be aware of changes to the LCSH and should make an effort to understand both the reasons behind the changes and their implications. (Appendix, Tables I, II)

Article Summary for Lecture #7—Taylor

In “On the Subject of Subjects” Arlene Taylor demonstrates the importance of subject cataloging, despite its being a “disreputable,” “ignored,” and “disparaged” branch of the LIS profession. (484) The majority of librarians, according to Taylor, are quick to point to studies reflecting declining use of subject-filtering tools in order to discount the overall utility of subject searching. While Taylor acknowledges this decline, she attributes it not to a lack of trying among users but says, rather, that they are thwarted in their subject searching efforts by either zero hits or too many results. This failure of the catalog, continues Taylor, forces users to resort to use of keyword searching which, in turn, limits both the accuracy and relevancy of search results. She then describes various ways in which reliable subject searches are beneficial, suggesting that “innovation” is needed in these areas in order to correct the problems patrons face when searching by subject and to increase the overall accuracy of results. (490)

Taylor notes that, due to the rapid increase in Internet use, as well as the issues involved with subject searches, many people prefer to search by keyword. She then points out some problems associated with this method. First, keyword searches cannot identify synonyms and, therefore, deny users access to many relevant sources. A keyword search for dogs, for example, would not produce results containing the word canines. Another limitation of keyword searches mentioned by Taylor is that they cannot assist users in differentiating between multiple definitions of words. “Search engines,” writes Taylor, “cannot tell you if a suit is a legal term or a set of clothing.” This ambiguity is, indeed, a limiting factor. Finally, and perhaps most importantly, keyword searches do not capture the various relationships among items. Establishing those relationships requires a degree of effort and ingenuity from humans. So, while search engines are cheaper, simpler, and often faster than subject searches, Taylor contends that their flaws seriously limit their value for researchers “looking for the best on a subject or everything on a subject.” (486)
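
The synonym problem Taylor describes can be made concrete with a small sketch. The records, the cross-reference table, and the function names below are all hypothetical, invented only to contrast literal keyword matching with a lookup through an assigned subject heading; they do not represent any actual catalog system or the LCSH itself.

```python
# Hypothetical catalog records: a title plus an assigned subject heading.
RECORDS = [
    {"title": "Training your dog", "subject": "Dogs"},
    {"title": "Canine behavior explained", "subject": "Dogs"},
    {"title": "Choosing a business suit", "subject": "Clothing"},
]

# Hypothetical cross-references from entry terms to one authorized heading.
SEE_REFERENCES = {
    "dog": "Dogs",
    "dogs": "Dogs",
    "canine": "Dogs",
    "canines": "Dogs",
}

def keyword_search(records, term):
    """Return titles in which the term appears as a literal word."""
    term = term.lower()
    return [r["title"] for r in records if term in r["title"].lower().split()]

def subject_search(records, term):
    """Return titles whose subject heading is the term's authorized form."""
    heading = SEE_REFERENCES.get(term.lower())
    return [r["title"] for r in records if r["subject"] == heading]

print(keyword_search(RECORDS, "dog"))  # ['Training your dog']
print(subject_search(RECORDS, "dog"))  # ['Training your dog', 'Canine behavior explained']
```

In this toy case the keyword search misses the title that says "canine," while the subject search retrieves both records because a person assigned them the same heading.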

Taylor then discusses the current state of online catalogs, describing proposed ideas for transferring classification standards into the digital realm. She lauds the potential of OPACs but says that they hinder subject searching because they produce either too few or too many results. Taylor cites a study by Ray Larson showing that “only 12 percent of searches retrieved between 1 and 20 items,” and attributes this problem to the failure of current systems to account for misspellings, singular vs. plural forms, and specific terminology, as well as a general lack of knowledge of subject headings among users. Larson—and others such as Marcia Bates, Karen Drabenstott, and Tschera Connell—have introduced ways to overcome these flaws and better assist patrons in using OPACs. (488)

Taylor’s final topic is that of LC Subject Headings. She notes that, despite talk of completely discarding the system, the benefits and widespread use of the LCSH mean that it is “here to stay.” Improvements to the LCSH have been encouraged, however. A revision committee suggests altering the arrangement of form subdivisions, designing a separate subfield code for them, and creating authority records “for combinations of topical headings with topical subdivisions,” all aimed at promoting consistency of structure and simplicity of use. (489)

For Taylor, the importance of subject searching is obvious. “Without the subject vocabulary,” claims Lawler, “the catalog record identifies a known item but gives no clue to content unless the title has content descriptive words.” While she concedes the presence of problems associated with current systems, Taylor believes these are not insurmountable. More specifically, she suggests that, in order to improve the overall efficiency of subject searching, librarians engage in clearer subject analysis, improve education, promote collaboration, and implement proven techniques. I tend to agree with Taylor’s assertions. Subject cataloging may not be necessary for those content with searches resulting in merely “something.” For those who want more—who seek more comprehensive, relevant, and accurate results—subject searching is absolutely essential. So, efforts need to be made to limit the flaws of current subject search tools and to improve their overall functionality. (490, 486)

Article Summary for Lecture #6—Wajenberg

In “A Cataloguer’s View of Authorship” Arnold Wajenberg addresses problems associated with defining authorship within cataloguing systems and attempts to formulate a more appropriate working definition of the term as a classification heading. As the title implies, Wajenberg writes—somewhat hesitantly—from inside the cataloguing profession and, as such, he is “concerned, not with the existential universe, but only with the bibliographic universe.” His ideal definition of authorship, therefore, is one that emphasizes simplicity in function over widespread inclusivity. Wajenberg defines an author of a work simply as one “identified” as such within the work, or within a secondary source. This “definition by attribution,” asserts Wajenberg, helps resolve many ambiguities associated with traditional methods of assigning authorship. (24-25)

Wajenberg provides a discussion of earlier attempts to define the term “author,” pointing out limitations and inconsistencies associated with each. He begins with Cutter’s 1904 definition of an author as the person—or “bodies of men”—“who writes a book” (narrowly) or “is the cause of the book’s existence” (broadly) and notes that variations of this definition have made their way into every cataloguing code since. Writing in 1969, Lubetzky attempted to add clarity to the author classification by defining the term as “simply the person who produces a work.” But, according to Wajenberg, the term “produces” is itself ambiguous and, therefore, a cause of frustration among cataloguers. Wajenberg subscribes to the ideas of Michael Carpenter who, in his book Corporate Authorship, argues that issues related to “multiple and diffuse authorship” render futile attempts to connect an author(s) to the original production of a work. Translated, collaborative, and even computer-generated works confound efforts to link those works with the names that, according to Wajenberg, are “bibliographically significant.” (21-23)

It is due to these problems that Wajenberg suggests his revised definition of authorship. However, he is careful to note limitations of this definition, which, as one might guess, are centered on the matter of identification. While bibliographic objects generally reveal author information, there are often reasons to question the certainty of this information. If one defines authorship by attribution, then one has to attribute accurately. Wajenberg cites as examples the problem of correctly linking early works, such as the Homeric epics, with their true author(s), as well as the many incorrect attributions among English plays. These types of problems are not insurmountable, contends Wajenberg, but can be handled with a little effort and “a measure of scholarly ability.” Often cataloguers must explore secondary sources in order to gain insight into, and corroborate claims of, authorship surrounding a work. Himself a cataloguer, Wajenberg believes that this level of cataloguing activity should be expected among his peers. (24-26)

Personally, I find Wajenberg’s definition acceptable. While, at first glance, it appears overly simplistic, its real utility resides in the fact that it unburdens cataloguers of tasks associated with determining the various roles/activities of those involved with the creation of a work. Problems related to cases of multiple/diffuse authors, under application of this definition, are no longer a concern of the cataloguer. If a work attributes authorship to a person/entity, then the name of that person/entity should be assigned to the cataloguing record. And although, as Wajenberg admits, there are certainly cases where attribution will not be clear-cut, there are methods to be followed that can compensate for this. Overall, Wajenberg presents a simple but effective model for defining authorship within bibliographic records systems.