A Better Barthesian Tool: Xebra
Xebra In Context: Related Work
Xebra's predecessors include not only Roland Barthes' manual system, but a variety of computer-based systems, as well. Thus, it falls under the generic term: "computer criticism." Initial efforts in computer criticism focused on analyzing numerical attributes of signifiers by using computer-based tools. Later efforts have been more linguistically oriented, attempting to focus computing power not on numbers, but on language itself. Xebra has its place among both the old and the new in this world of computer criticism, containing, as it does, elements of both.
The primary goal of computer criticism is to reduce manual effort, to let the computer do that which it does best (tedious, repetitious operations), thus freeing the mind to do what it does best (think). As noted in CHAPTER I, John B. Smith has cited the Barthesian system as one that could certainly benefit from a computer augmentation.1
Stated differently, the goal is to push any need for genius up a level, from the mundane task of parsing through text in order to unearth literary artifacts, to the more interesting and more pertinent one of interpreting the found objects, singly and in relation to each other. Xebra shares fully in this goal of computer criticism, being an attempt to reduce the need to expend intellectual effort on those elements of a Barthesian reading that a computer can perform, thus allowing the mind its freedom to work on those elements that more closely bring one to one's end goal in reading texts.
1John B. Smith, "Computer Criticism." Style 12 (1978): 337.
Computer Criticism Tools: A Typology
In theory, there are three types of computer criticism tools: those that do text indexing; those that support the numerical or statistical analysis of language elements in texts indexed via the first type; and those that attempt a direct, deep, and full linguistic parsing of texts, incorporating both the first and second kinds. The first are known as low-level, the second as mid-level, and the third as high-level tools. In practice, there are many low-level tools, some mid-level tools, and no high-level tools. Xebra is a mid-level tool with some elements of a high-level one.
The low and mid-level systems exist for two simple, interconnected reasons: one, these approaches to computer criticism were the first to be attempted; and two, they are, in fact, the most amenable to known computing techniques. Natural languages present a computationally difficult, and perhaps impossible, target. Whole texts have been written on the subject of the complexity of the task.
However, in summary, it can be said that there are mathematical proofs showing that the problem could very well be intractable, that is, not solvable by any algorithm, known or unknown, in any practical amount of time on a computer. Regardless of the mathematics of the matter, no one today has solved the problem of parsing natural languages as humans parse them. Instead, all solutions, to date, involve some form of grammar constraint definitions.2 Certainly, the systems developed or proposed up to this time for attempting computer criticism rely on tightly constrained, formally structured grammars for describing the textual data they can or would process.
Xebra is a mid-level system in that it supports statistical and graphical analysis of the Barthesian system products: the lexias and their associated signifier labels. It contains some elements of a high-level tool in that it supports surface parsing of texts in order to generate the lexias. The system makes no pretense to true natural language recognition capability for identifying signifiers directly when generating the lexias. Instead, it constrains the problem by using analogs of signifiers, such as noun phrases and text segments delimited by major punctuation marks, that are computationally amenable.
2G. Edward Barton, Robert C. Berwick, and Eric Sven Ristad, Computational Complexity and Natural Language, (Cambridge: MIT Press, 1987), 2-6.
Computer Criticism Tools: The Low And The High
Low-level computer criticism tools sought to compute information about the words of a text and their positions in it; that is, they were indexing tools. High-level tools would seek, if they existed, to fully parse and analyze a text using a particular literary methodology as the systematic approach to the text. The low-level tools were relatively simple to build, based as they were on algorithmically straightforward numerical computation. High-level tools would, of necessity, have to use natural language processing algorithms that do not currently exist, and might never exist.
An example of a low-level tool is INDEX, which S. Y. Sedelow developed in the late 1960's as the basis for early work in computer criticism. INDEX generated, as output, the words of the input text, each with an associated series of numbers reporting the relative linear position of the word in the text from the beginning, as well as its hierarchical position, in terms of volume, chapter, paragraph, and sentence.3
Xebra uses indexing capability of the kind discussed here only indirectly, when it is attempting to generate lexias via noun phrase or punctuation mark counting. In fact, in general, the main value of low-level tools is in acting as input processors to mid-level tools that can use the index numbers for computing a variety of useful data regarding a given text. The next sub-section discusses a tool that used Sedelow's INDEX directly.
3John B. Smith, "RATS: A Middle-Level Text Utility System," Computers and the Humanities 6, no. 5 (May 1972): 277-278.
Computer Criticism Tools: The Mid-Level
Mid-level tools seek to increase the capability of a computer to take on the work of humans in terms of analyzing a text for meaning. The initial efforts in the 1960's and 1970's were generally not designed or implemented as complete, explicit computer-based realizations of any given systematic approach to reading a text, though they all could be used in support of a variety of such approaches, including, to some degree, Roland Barthes'. It is only in the 1980's that effort turned to designing and implementing specific literary theory systems.
The early mid-level systems were essentially ad hoc constructions meant to support any given research approach that the system architect wanted to explore. The first computer-based literary critic's tools were the equivalent of a carpenter's tools (hammers, saws, nails, and so on) sitting in a tool box, waiting to be used for whatever purpose the tool handler might have in mind.
For instance, John B. Smith, noted earlier for his views on the Barthesian system and computers, along with Sally and Walter Sedelow, largely focused on building tools that performed word indexing of texts that included word position linearly and hierarchically, with subsequent statistical and graphical analysis of the quantities so discovered.
That is, they would count elements, like words, or, going up a level of abstraction, images (such as mother images), and then correlate or graph the results of these counts in ways designed to reveal patterns of use and, thus, the purpose, or at least the characteristic use, of the elements being counted within the text.
Quantification of individual words and the subsequent mathematical analysis of the results have been particularly helpful in studying stylistics when the information garnered is analyzed for characteristic use, as well as for building concordances.4 Going up a level of abstraction from words alone, Smith has shown that quantification of the images used in a text, and subsequent mathematical analysis of the results, is useful in interpretive efforts as well.5
Smith's system, Random-Accessible Text Systems (RATS), is an excellent example of a set of tools at the mid-level. In his 1972 article, "RATS: A Middle-Level Text Utility System," he gives a full description of RATS. He also presents, in summary form, INDEX, the system noted earlier, developed by S. Y. Sedelow, upon which RATS depends for initial processing.6 In order to illustrate the differences and commonalities between this form of a mid-level tool set and Xebra, a short summary of RATS follows.
4Nick Cercone and Carole Murchison, "Integrating Artificial Intelligence into Literary Research: An Invitation to Discuss Design Specifications," Computers and the Humanities 19, no. 4 (October-December 1985): 236-237.
5John B. Smith, Imagery and the Mind of Stephen Dedalus: A Computer-Assisted Study of Joyce's A Portrait of the Artist as a Young Man (Lewisburg, Pa.: Bucknell University Press, 1980).
6Smith, "RATS," 277-283.
Smith refers to RATS as a "Middle-Level" system because it does more than simple word indexing, but less than complete critical analysis parses; that is, RATS gives the reader more than simple text indexing, but does not attempt any linguistically based critical analysis of the text as a critic might do. In that sense, Xebra is a mid-level text utility system as well, for while, unlike RATS, it does an actual surface parse of the text based on parts of speech, it still does not attempt to do the critical analysis itself.
Instead, both RATS and Xebra expect critics to use the results of the indexing, in the former case, and parsing in the latter, as input into an analysis phase. Xebra is further differentiated from RATS by its specification and inclusion of particular kinds of analysis tools. RATS assumed critics would build their own tools; Xebra does not.
Xebra is aimed at supporting those species of reading methodologies that are generally phrase oriented, that require some kind of coding or labeling of the text, and that require analysis beyond word counts and word positions. RATS was meant to support analysis that was largely word oriented, based on word position and word counts. The meanings of the words did matter when moving up to levels of abstraction such as image counting, but this type of analysis was a step performed above the level of RATS, whereas it is at the heart of Xebra. The real difference in the systems, however, lies, finally, in a matter of philosophy.
The philosophical basis of RATS is that of any professional who seeks to have a set of tools capable of performing a wide variety of tasks, while that of Xebra is that of a professional who seeks to have particular tools designed to solve specific types of problems in a specific way. Thus, RATS consists of a set of purposely unintegrated, highly flexible tools that can be applied in a variety of ways to solve critical analysis problems, while Xebra contains purposely integrated tools designed for a specific purpose and use under the umbrella of a specific critical methodology for analyzing texts.
Xebra is not the only integrated system that has been either proposed or currently exists. For purposes of placing Xebra in context, three are of interest: one that exists only on paper, and two which have actually been implemented, to one degree or another. The one that exists only on paper is an expert system proposed in the early 1980's as a test-bed system for reader-oriented critical analysis methods, while the two that have been implemented are mid-to-late 1980s vintage systems. Of the latter two, one is essentially a database system tailored to support critical models used for dramatic works meant for the stage, with appropriate modifications and refinements made possible with the computer, while the other is a text analysis system that attempts to parse a text for thematic segments based on noun phrases with support for analysis of lexical repetition within context.
The unimplemented system is interesting because it was a proposal that pushed the boundaries of natural language processing systems of its time. In fact, even a decade later, it still pushes those boundaries. As such, it is certainly a system beyond either John Smith's "Mid-Level" tool-box or Xebra's integrated tool set.
Patricia Galloway introduced her system in her article "Narrative Theories as Computational Models: Reader-Oriented Theory and Artificial Intelligence."7 After giving an overview of current artificial intelligence (AI) models for natural language processing (NLP), she gives a description of how those models relate to Wolfgang Iser's reader-oriented theory of literary analysis which he described in chapters 3, 4, and 5 of his text The Act of Reading.8
To begin her description of the proposed system, she stipulates that the top level of Iser's model was not addressed by then-current AI/NLP systems.9 Nor is it still, for it would necessitate a semantic analysis focusing on structural clues to implied authorial intent. Analysis at that depth is still beyond AI/NLP. If it were possible to do semantic analysis at that level, then what Barthes' lexia-derivation formula requires would be a mere subset of the application that could then be used in Xebra.
Having thus dismissed the author-side of Iser's theory as beyond computational models then current, Galloway launches into a description of the reader-side of the theory as it might be implemented via AI/NLP computational models. The system she describes is an expert system that has a knowledge database representing a reader's knowledge of the world combined with two text processing strategies that match structures found in the text with predefined structures in the database to fill in story frames.10 Frames are a data structure technology invented at the Massachusetts Institute of Technology in the 1970's by Marvin Minsky in an attempt to computationally "read" and "understand" simple newspaper articles.11 The two Iser strategies, schema-and-correction and theme-and-horizon, are shown by Galloway to be computationally definable as frame-filling strategies, and, thus, theoretically implementable.
7Patricia Galloway, "Narrative Theories as Computational Models: Reader-Oriented Theory and Artificial Intelligence," Computers and the Humanities 17, no. 4 (December 1983): 169-174.
8Ibid., 169.
9Ibid., 171.
10Ibid., 172-173.
11Marvin Minsky, "A Framework For Representing Knowledge," in The Psychology of Computer Vision, ed. Patrick Winston (New York: McGraw-Hill, 1975), 211-277.
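To make the frame idea concrete, the following is a minimal sketch (in Python, purely illustrative; neither Minsky's notation nor Galloway's proposed system is reproduced here) of a frame as a slot-and-filler structure, together with a trivial frame-filling step of the kind the schema-and-correction and theme-and-horizon strategies would require. The slot names and sample observations are hypothetical.

```python
# A frame is a named set of slots; "filling" a frame means binding
# text fragments to those slots.
RESTAURANT_FRAME = {
    "name": "restaurant-visit",
    "slots": {"diner": None, "place": None, "meal": None},
}

def fill_frame(frame, observations):
    """Bind each observed (slot, value) pair into a copy of the frame."""
    filled = {"name": frame["name"], "slots": dict(frame["slots"])}
    for slot, value in observations:
        if slot in filled["slots"]:
            filled["slots"][slot] = value
    return filled

# A reader's schema is corrected as successive sentences supply slot values.
observations = [("diner", "the narrator"), ("place", "a cafe")]
print(fill_frame(RESTAURANT_FRAME, observations))
```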
However, the problem with these strategies from a computational view is twofold. One, nobody has solved the semantic parsing problem sufficiently to find the kinds of patterns necessary for matching against the knowledge base; it had not been solved in 1983, and it has not been solved in 1994. Two, there exists no knowledge base that would be sufficient to help read anything other than the most mundane of newspaper stories, and building such a knowledge base is widely recognized as a very substantial undertaking, requiring much time and effort. The result, then, is that Patricia Galloway's system remains unimplemented, and perhaps unimplementable.
On the other hand, there are systems that have been implemented that, like Xebra, are not based on word indexing (such as Smith's RATS), but do not reach for or require the linguistic parsing capability of Patricia Galloway's system. One such system was described by Elaine Nardocchio in her article "Structural Analysis of Drama: Practical and Theoretical Implications."12
The system Nardocchio describes, known as the McMaster Project, is aimed at implementing a specific critical methodology or model. This model is based on two models that were developed for reading drama, one by Etienne Souriau in 1950, the other by Tadeusz Kowzan in 1968.13 Both models involve tagging various elements of a play. Souriau's model tags character roles by scene using six tags, while Kowzan's model tags non-verbal aspects of the action and set, as specified in the written text of the play, in the dialogue, or in the stage directions, using five tags. Both models are similar to Barthes' model, as each focuses on signifiers and their connotations.14
The McMaster Project's system, as originally implemented, was a specialized database system that contained the text of the play and the tags. A reader could retrieve by way of the text, for example by asking for all tags relevant to a scene, or by way of the tags, by asking for all portions of the play relevant to a given tag or set of tags. A variety of structural analysis methods could then be applied to the results. Initially, the system database supported only the tagging done by one person, though the goal was eventually to allow multiple tagging databases for each text, which would support comparing and contrasting, perhaps in an automated fashion, a variety of interpretations of a given text.15
From one point of view, the McMaster Project's system is a specialized version of Xebra, since it is oriented around a tagging-based critical model and uses a database as its centerpiece for computer implementation. The McMaster Project system differs, however, in that the target text must be manually parsed and tagged, while Xebra has provision for computer-based parsing. Also, the McMaster Project's system does not have a specific set of analysis tools integrated with it. Essentially, Xebra could be used to do what the McMaster Project's system can do with a play. An interesting experiment would be to apply Barthes' lexia model to a play text using Xebra's parsing tools. However, this is beyond the scope of the current project.
12Elaine Nardocchio, "Structural Analysis of Drama: Practical and Theoretical Implications," Computers and the Humanities 19, no. 4 (October-December 1985): 221-223.
13Ibid., 221.
14Ibid., 221-222.
15Ibid., 223.
The third integrated system discussed here, in relation to Xebra, was developed by J. Léon and J. Marandin in France. The goal of this system is to produce systematic readings based on the authors' stated hypotheses and assumptions. In this regard, it is quite similar to the goal that Roland Barthes had in mind as he designed and implemented his own system in S/Z. As will be discussed later, Léon and Marandin were fully aware of Roland Barthes and his reading methodology.
The Léon and Marandin system applies two models to a text. The first is based on the hypothesis that certain kinds of noun phrases act as headings: the sequence of sentences following such a noun phrase represents a thematic portion of the text.16 The second is derived from Roland Barthes' work in S/Z, where he states that there are clusters of connotative meaning that reference other clusters of such meaning, both in a particular text and between texts. Léon and Marandin have attempted to compute these clusters by measuring the density of use of particular words, that is, signifiers, within a text. Any two or more clusters of text that have a similar density of use for the same signifiers are thus said to be equivalent to Barthes' notion of clusters of connotation.17
16Jacqueline Léon and Jean-Marie Marandin, "Sarrasine Revisited: A Perspective in Text-Analysis," Computers and the Humanities 20, no. 3 (July-September 1986): 217-218.
17Ibid., 220.
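The density idea can be illustrated with a brief sketch (hypothetical Python, not Léon and Marandin's actual procedure): divide a text into fixed-size segments, compute the relative frequency of chosen signifier words in each segment, and treat segments with similar frequency profiles as candidate clusters of connotation.

```python
from collections import Counter

def density_profiles(text, signifiers, segment_size=200):
    """Relative frequency of each signifier word within fixed-size segments."""
    words = text.lower().split()
    profiles = []
    for start in range(0, len(words), segment_size):
        segment = words[start:start + segment_size]
        counts = Counter(segment)
        profiles.append({s: counts[s] / max(len(segment), 1) for s in signifiers})
    return profiles

def similar_segments(profiles, tolerance=0.005):
    """Pairs of segments whose densities for every signifier fall within a tolerance."""
    pairs = []
    for i in range(len(profiles)):
        for j in range(i + 1, len(profiles)):
            if all(abs(profiles[i][s] - profiles[j][s]) <= tolerance for s in profiles[i]):
                pairs.append((i, j))
    return pairs
```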
These are both interesting hypotheses, but in terms of Xebra, the first is particularly so, for it is similar to one upon which part of Xebra is based. The hypothesis in Xebra is that the noun phrase approximates Barthes' semantic unit, the signifier, so that cutting the text after some number of noun phrases (call it X) approximates his lexia cuts. This hypothesis is derived both from basic knowledge of language, as is Léon and Marandin's, and from examining Barthes' lexias for S/Z, which contain one to four such signifiers. In many cases, it is possible to trace, with ease, the signifier Barthes tagged back to its noun phrase, though verbal phrases count as well, since actions are of interest to him. But by counting X noun phrases and cutting the text at that point, verbal phrases are caught as a matter of course.
An interesting point concerning the Léon and Marandin system is that it has been tested against Sarrasine, in full knowledge of the work of Roland Barthes in S/Z. One objective in the test was to show that the Léon and Marandin system is more systematic than the Barthesian system. As they say in their article "Sarrasine Revisited: A Perspective in Text-Analysis":
The work which underlies Barthes' reading [of Sarrasine] is implicit, hidden; our procedures are explicit and use hypotheses and assumptions the scope and value of which can be gauged by applying them to other texts. Our reading is done systematically and we show how we read. In fact we put into practice what Barthes only postulates: "There is no other proof of a reading than the quality and endurance of its systematics."18
This assertion, that Barthes' methodology is only theoretically systematic, appears to be somewhat hastily drawn. The Barthesian approach includes specific and explicit procedures based on assumptions and hypotheses that are testable. Specifically, Barthes' central hypothesis (that the analysis of the pieces of a text produced by sequential, essentially arbitrary cuts, the only constraint being that each piece contain from one to four atomic semantic units, or signifiers, can serve as the basis of coherent, carefully annotated critical readings using an over-arching literary reading model) is testable, at least by inductive means.
Which is, in the final analysis, as much as one can say of Léon and Marandin's hypotheses, as well. The problem lies, in both cases, in the nature of language, in its tendency towards ambiguity through its infinite capability to mean, thus eluding all concrete attempts to finally pierce a meaning or signifier with a pin and mount it in a specimen box, forever preserved and known.
Finally, regarding the Léon and Marandin system, it is important to note that, like all the systems discussed in this section, including Xebra, it does not actually perform any literary analysis. Instead, like Xebra and RATS, it parses a text in a manner that allows the literary critic to then apply some analytical methodology to the results of the parse. Unlike Xebra, but like RATS, it does not have any analysis tools integrated into it.
18Ibid., 222.
Xebra and Computer Criticism Tools: In Summary
The state of computer-based criticism is not, conceptually, much beyond what it was twenty years ago. However, there has been advancement in the underlying technology to the point that a researcher no longer needs to build the individual tools in his tool box, like a RATS tool, nor even a tool like Xebra, which includes advanced parsing capability beyond that which RATS contained, as well as a set of integrated tools for analysis work. What has not, unfortunately, occurred is sufficient advancement to allow true semantic parses which would allow for sophisticated, automated analysis, without, or at least, with limited, human intervention. Instead, computer criticism is constrained to approximation techniques such as the use of noun phrases as a semantic unit, as is used in Xebra and in the Léon and Marandin system.
Xebra: A Functional Description
The computer-based Barthesian tool developed and used for this research is called Xebra, for Experimental Eclectic Barthesian Reader's Assistant. Xebra is 'Experimental' and 'Eclectic' in the sense that it supports reader supplied classification systems, thus allowing a reader the freedom to experiment with a variety of schemes eclectically. It is 'Barthesian' in two senses. First, it supports, as a subset of its full functionality, the pure Barthesian reading methodology. Second, the system owes its existence to the inspiration of Barthes and S/Z. It is an 'Assistant' in the sense that it provides aid to the reader, but the reader is always in control of the process.
Xebra contains several functions that are either directly related or analogous to functions that Barthes requires in performing his form of critical reading. To do a straight Barthesian reading, one must:
- parse a text into lexias;
- analyze and attach labels to the lexias;
- analyze lexias that are related in terms of being part of a unique sequence of related labels.
In regard to these three required Barthesian activities, Xebra has the capability to:
- parse a text into lexias (Xebra supports three distinct parsing methods: manual parsing, which imitates Barthes directly, and automatic parsing using either punctuation marks or noun phrases as guides for splitting the text, the two automatic methods being analogues of Barthes' method);
- support lexia analysis and the attaching of labels;
- support the analysis of patterns composed of lexias related in terms of being part of a unique sequence of lexias labeled by a specific label code and descriptive term.
In addition, Xebra also does the following:
- supports the generation and analysis of patterns composed of lexias related by multiple Barthesian label codes and their descriptive terms in the labels associated with the lexias;
- allows notes to be attached to individual lexia and to the labels;
- supports production of quantitative reports on the occurrences of any given code in the labeled lexias;
- does all of the above using alternative label schemes as defined by the reader.
The first three functions are either direct translations or analogues of Barthes' three prime functions. The next four are both generalizations and analogues of Barthes' approach.
Support for generating and analyzing patterns consisting of lexias related by multiple label codes and descriptive terms is a generalization of Barthes' third function. The hypothesis behind this generalization is that there are strands of meaning, not revealed by the single sequence relationship, that will be revealed by generating lexia sets defined by multiple label codes and descriptive terms.
For example, if a large set of lexias is retrieved, the members of which are related by the criterion of being labeled with SEM codes carrying either the descriptive term "Fools" or "Destructive Force," a possible correlation can be hypothesized. By examining the lexias involved in the set, it might be possible to conclude that fools use destructive force, or that only fools misuse destructive force, or that, in fact, it was only coincidence that these lexias had both signifiers.
The support for taking notes translates part of Barthes' method as he executes it in S/Z. As he assigns labels, he invariably makes side notations regarding both the labels and the lexias being labeled. These notes serve as holders of information, information which is usually implied by the labels, as opposed to being explicitly stated. They also make up part of his actual "critical reading." Part of the problem of translating Barthes' method to computer is the necessity of determining what functions he performs in S/Z that are related to his quest for finding the literary artifacts, the basic signifiers, and those that are related to his actual production of a critical reading from those pieces of data. Notetaking is a crossover function, supporting both the data acquisition and the interpretation processes.
The quantitative reporting is meant to support critical reading activities that depend on analyzing the occurrences of particular items throughout a text. For instance, it could be important that a particular action sequence has only two labels within it, while another one has thirty.
Xebra's ability to easily obtain and use quantitative data is a generalization of Barthes' method. Barthes' method, as he practiced it in S/Z, made no explicit use of such data. Implicitly, however, many of his conclusions regarding strands of meaning are clearly supported by the number of occurrences of references to some code. For example, he makes much of the "Replication of Bodies" theme in S/Z, largely, it seems, because there are so many citations of the theme. Thus, it appears clear that quantitative data can be useful.
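A tally of this sort is simple to produce once labels are in machine-readable form. The following is a minimal sketch (hypothetical Python over invented label records; the "Replication of Bodies" term comes from the discussion above, the rest is illustrative):

```python
from collections import Counter

# Hypothetical label records: (lexia number, code, descriptive term).
labels = [
    (210, "SEM", "Replication of Bodies"),
    (214, "SEM", "Replication of Bodies"),
    (220, "SEM", "Wealth"),
    (231, "REF", "Chronology"),
]

# Occurrences of each code, and of each code/term pair, across the labeled lexias.
by_code = Counter(code for _, code, _ in labels)
by_code_and_term = Counter((code, term) for _, code, term in labels)
print(by_code, by_code_and_term.most_common(3))
```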
Finally, another, broader generalization of Barthes' method incorporated into Xebra is that it supports alternative labeling schemes--it is not limited to Barthes' five codes or to the syntax that he has invented. Many reading methodologies, either implicitly or explicitly, involve breaking down a text into parts and labeling those parts in some manner. Barthes' methodology is probably the most explicit, all encompassing system in this regard, but it is not alone. For instance, people who use Speech/Act theory to analyze a text must perform text parsing and labeling. Xebra supports this kind of activity, since the codes and labels used are not limited by Xebra, but, rather, are left to the definition of the Barthesian reader.
Regarding the three primary functions, function one, the parsing function, is both a direct translation and an analogue of Barthes' own parsing function. Barthes parsed a text based on semantic structures, allowing, at most, three or four per lexia. One of the questions to be explored in this dissertation, as discussed in CHAPTER I, is whether a truly arbitrary, mechanical parsing is as supportive of generating useful data for the final act of interpretation as Barthes' manual method that entailed the use of intelligence to make final parsing judgements. Therefore, Xebra allows for the manual parsing of a text into lexias, as well as for two automated methods. These latter two are analogs only of Barthes' signifier recognition requirement, there being no reliable algorithms for recognizing semantic structures of the kind Barthes used.
Function two is implemented as a direct translation of Barthes' manual method, with the computer doing much of the work. The reader supplies the labels, but the computer keeps track of which lexias the labels belong with, as well as the related notes.
Function three, pattern analysis, is supported by automatically selecting and sorting all the lexias by a label code, or a label code and descriptive term, and then displaying the particular resulting lexia sequence pattern at the reader's command. A generalization of this function uses multiple label codes and descriptive terms for selecting and sorting lexias as specified by the reader.
An example of the first kind of selection and sorting is given in Appendix II of S/Z, where Barthes has selected all the labels with the ACT code, then sorted them by first occurrence of a sequence of a main descriptive term in the text and by lexia number within the sequence of lexias. Thus, he lists the ACT labels with the term "To Leave," which participates in two sequences. The first sequence crosses Lexias 135, 136, and 137, while the second crosses Lexias 291 and 294.19
Generalizing from the above, one could construct a lexia set that had lexias labeled with ACT "To Leave" and with REF "Chronology," the goal being, perhaps, the determination of when "leaving" was happening in the text with regard to time. In that case, one would then sort the lexia in terms of the time referenced in each lexia. Doing such a sort is especially useful when the time frame of the text switches repeatedly amongst past, present, and future time.
The kind of selection needed to set up the above example is, in computer/database terminology, an "AND" query type, since the selection is done by asking (querying) the database for all lexias that meet two conditions. Such a query can also be built using more than two conditions. One can also perform what are referred to as "OR" queries, where the analyst requests the return of all data that meet any one of two or more conditions. For example, one could ask for all lexias labeled with REF codes of "Chronology" or "History," the goal being the determination of any patterns having to do with time in the text, again via sorting the lexia by referenced time frames.
Making the system even more powerful, it is possible to construct queries that have both "AND" and "OR" conditions, nested to arbitrarily deep levels. Thus, one could ask for all lexias with ACT "To Leave" codes and either REF "History" or "Chronology." To look for any patterns across the retrieved lexia set, again one would probably sort the members by reference time frames.
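The selection logic can be sketched as follows (a minimal Python illustration over in-memory label records, not the database engine actually used; the ACT "To Leave" and REF examples follow the discussion above, though the particular record values are hypothetical):

```python
# Each label record carries a lexia number, a Barthesian code, and a
# descriptive term; the particular values below are hypothetical.
labels = [
    {"lexia": 135, "code": "ACT", "term": "To Leave"},
    {"lexia": 135, "code": "REF", "term": "Chronology"},
    {"lexia": 291, "code": "ACT", "term": "To Leave"},
    {"lexia": 294, "code": "REF", "term": "History"},
]

def lexias_with(records, code, term):
    """Set of lexia numbers carrying a given code and descriptive term."""
    return {r["lexia"] for r in records if r["code"] == code and r["term"] == term}

# "AND" query: lexias labeled both ACT "To Leave" and REF "Chronology".
and_result = lexias_with(labels, "ACT", "To Leave") & lexias_with(labels, "REF", "Chronology")

# Nested "AND"/"OR" query: ACT "To Leave" and either REF "History" or REF "Chronology".
nested_result = lexias_with(labels, "ACT", "To Leave") & (
    lexias_with(labels, "REF", "Chronology") | lexias_with(labels, "REF", "History")
)

print(sorted(and_result), sorted(nested_result))
```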
CHAPTER IV, which evaluates a specific example of a Barthesian/Xebran reading done for this dissertation, contains several examples of this and the preceding kinds of queries. While one could do this type of selection and ordering manually, having a computer perform them relieves the analyst/critic of an effort that could only distract the mind from the actual goal.
To summarize, the three basic functions of Xebra aid the reader in parsing the text into lexias, labeling the lexias, and generating patterns by sorting the lexias in terms of a specific type label and its descriptive term. Supporting functions are the maintenance of any notes the reader might want to attach to either the lexias or labels, and a quantitative reporting capability. A generalizing function, not found in Barthes' own description of his system, generates patterns by selecting and sorting the lexias by more than one label code and descriptive term.
19Roland Barthes, S/Z, trans. Richard Miller (New York: Hill and Wang, 1974), 256-257.
The Xebra Functions: Illustrated
To illustrate some of Xebra's capability, the opening lines from an English translation of S/Z are used here in a series of examples. First, there is a set of three parsing examples, and second, there is a sample page from a Xebra report showing a lexia and its accompanying set of labels.
The data file for these examples contained this text:
I. Evaluation There are said to be certain Buddhists whose ascetic practices enable them to see a whole landscape in a bean. Precisely what the first analysts of narrative were attempting: to see all the world's stories (and there have been ever so many) within a single structure: we shall, they thought, extract from each tale its model, then out of these models we shall make a great narrative structure, which we shall reapply (for verification) to any one narrative: a task as exhausting (ninety-nine percent perspiration, as the saying goes) as it is ultimately undesirable, for the text thereby loses its difference.20
The rule for manual parsing is simple: the computer will place whatever text a Barthesian reader chooses into a lexia. The lexias are numbered as a generic means for identifying them. For example, an output file created with Xebra using manual parsing on the S/Z sample text, above, contained this data:
20Ibid., 3.
MANUAL PARSING EXAMPLE
Lexia Number | Lexia Text |
0001 | I. Evaluation There are said to be certain Buddhists whose ascetic practices enable them to see a whole landscape in a bean.a |
0002 | Precisely what the first analysts of narrative were attempting: to see all the world's stories (and there have been ever so many) within a single structure:a |
0003 | we shall, they thought, extract from each tale its model, then out of these models we shall make a great narrative structure, which we shall reapply (for verification) to any one narrative:a |
0004 | a task as exhausting (ninety-nine percent perspiration, as the saying goes) as it is ultimately undesirable, for the text thereby loses its difference.a |
aIbid., 3.
The rule for punctuation parsing is: the computer separates the text into lexias based on punctuation marks, with the reader determining how many marks to include per lexia. An output file created with Xebra, broken on each punctuation mark in the sample text, contained this data:
PUNCTUATION PARSING EXAMPLE
Lexia Number | Lexia Text |
0001 | I.a |
0002 | Evaluation There are said to be certain Buddhists whose ascetic practices enable them to see a whole landscape in a bean. |
0003 | Precisely what the first analysts of narrative were attempting:a |
0004 | to see all the world's stories (a |
0005 | and there have been ever so many)a |
0006 | within a single structure:a |
0007 | we shall,a |
0008 | they thought,a |
0009 | extract from each tale its model,a |
0010 | then out of these models we shall make a great narrative structure,a |
0011 | which we shall reapply (a |
0012 | for verification)a |
0013 | to any one narrative:a |
0014 | a task as exhausting (a |
0015 | ninety-nine percent perspiration,a |
0016 | as the saying goes)a |
0017 | as it is ultimately undesirable,a |
0018 | for the text thereby loses its difference.a |
aIbid., 3.
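A minimal sketch of the punctuation rule follows (illustrative Python only; it is not the Xebra Parser itself, whose internals are not reproduced here). The text is cut after every Nth major punctuation mark, with N chosen by the reader:

```python
import re

def punctuation_parse(text, marks_per_lexia=1):
    """Cut text into lexias after every Nth major punctuation mark."""
    # Each piece is a run of text up to and including one punctuation mark
    # (or a trailing run containing no punctuation at all).
    pieces = re.findall(r"[^.,:;()!?]*[.,:;()!?]|[^.,:;()!?]+", text)
    pieces = [p.strip() for p in pieces if p.strip()]
    lexias, current = [], []
    for piece in pieces:
        current.append(piece)
        if len(current) == marks_per_lexia:
            lexias.append(" ".join(current))
            current = []
    if current:
        lexias.append(" ".join(current))
    return lexias
```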
The rule for noun phrase parsing is: the computer breaks up the text into lexias with each one containing no more than the number of noun phrases the reader requests. An output file created with Xebra, counting four noun phrases per lexia, using the S/Z sample text, above, contained this data:
NOUN PHRASE PARSING EXAMPLE
Lexia Number | Lexia Text |
0001 | I. Evaluation There are said to be certain Buddhists whose ascetic practices enable them to see aa |
0002 | whole landscape in a bean. Precisely what the first analysts of narrative werea |
0003 | attempting: to see all the world's stories (and there have been ever so many)a |
0004 | within a single structure: we shall, they thought, extract from each tale itsa |
0005 | model, then out of these models we shall make a great narrative structure, whicha |
0006 | we shall reapply (for verification) to any one narrative: a task as exhausting (ninety-nine percent perspiration, as the saying goes) as it isa |
0007 | ultimately undesirable, for the text thereby loses its difference.a |
aIbid., 3.
This third type of parsing (noun phrase parsing) is susceptible to variability in its results, depending on the particular algorithm used for determining a noun phrase. Xebra uses a simple algorithm that, while subject to error, is precise enough for the purposes of this tool. Complexity versus simplicity, in this instance, refers to the amount of effort put into removing ambiguity. Simple-minded parsing algorithms make quite a few mistakes, being easily confused whenever there is the slightest ambiguity regarding whether a word is part of a noun phrase or not, while more complex ones make fewer mistakes. Lexia 0003, in the above table, illustrates such a mistake. The word "attempting," because of the simple algorithm used in Xebra, is counted as a noun (gerund), when, in fact, in this case, it is part of a verb phrase. However, disambiguating this, while possible, would take more effort than it is worth, since the goal is only to approximate Barthes' atoms of meaning with noun phrases, not match them exactly.
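A simple-minded counter of the sort just described might be sketched as follows (hypothetical Python; this is not Xebra's actual algorithm, whose part-of-speech rules are not documented here). Note that the crude -ing test is exactly the kind of shortcut that miscounts the gerund "attempting" as a noun phrase head:

```python
DETERMINERS = {"a", "an", "the", "each", "these", "its", "their", "certain", "whose"}

def is_np_head(prev_word, word):
    """Crude noun phrase test: a word following a determiner, or an -ing form."""
    w = word.strip(".,:;()!?'\"").lower()
    p = prev_word.strip(".,:;()!?'\"").lower() if prev_word else ""
    return p in DETERMINERS or w.endswith("ing")

def noun_phrase_parse(text, phrases_per_lexia=4):
    """Cut text into lexias after every Nth word judged to head a noun phrase."""
    lexias, current, count, prev = [], [], 0, None
    for word in text.split():
        current.append(word)
        if is_np_head(prev, word):
            count += 1
            if count == phrases_per_lexia:
                lexias.append(" ".join(current))
                current, count = [], 0
        prev = word
    if current:
        lexias.append(" ".join(current))
    return lexias
```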
Table 8 shows the five-page report generated by Xebra, listing each lexia and its accompanying labels, as assigned in Xebra, using the lexias from the S/Z noun phrase parse shown above.
REPORT EXAMPLE
3/27/94 | Lexia Label Detail Report--Noun Phrase Parse | Page | 1 |
1 | I. Evaluation There are said to be certain Buddhists whose ascetic practices enable them to see aa |
ACT | To See | 1 | To see a whole within a part | |
HER | Enigma 1 | Question | Evaluate: what, how? | |
HER | Enigma 1 | Proposal | Evaluate: how: by looking at bean | Proposal Proposes using intuitive approach developed by 'certain Buddhists' |
HER | Enigma 2 | Question | What is the relation of whole to part? | |
HER | Enigma 2 | Proposal | The part contains the whole in some sense | |
REF | Code of World's Religions | Buddhism | Buddhism is an Eastern religion; its mention brings in all of the associated connotations |
SEM | Reliability | 'Intuitive approaches are often thought unreliable | ||
SEM | Other Worldly | 'Ascetic practices', of the Eastern variety, especially, carry strong connotations of other-worldliness | |
SEM | Impossibility | 'It is said' connotes a level of disbelief in the possibility, at least | |
SEM | Hubris | The author implies that there is a certain hubris implicit in the Buddhist's claims | ||
SYM | Antithesis | A Term (Eastern Methods) | Introduction | A symbolic structure balancing Western thought against Eastern, Intuition vs Analysis |
aIbid., 3.
3/27/94 | Lexia Label Detail Report--Noun Phrase Parse | Page | 2 |
2 | whole landscape in a bean. Precisely what the first analysts of narrative wereb |
ACT | To See | 1 | To see a whole within a part | |
HER | Enigma 1 | Proposal | Evaluate: how: by looking at bean | Proposes using intuitive approach developed by 'certain Buddhists' |
HER | Enigma 2 | Question | What is the relation of whole to part? | |
HER | Enigma 2 | Proposal | The part contains the whole in some sense | |
REF | Code of Literary Theory | First Theorists' Approach | First theorists used structural analysis to categorize stories | |
SEM | Hubris | The author implies that there is a certain hubris implicit in the Buddhist's claims | ||
SYM | Antithesis | A Term (Eastern Methods) | Introduction | A symbolic structure balancing Western thought against Eastern, Intuition vs Analysis |
3 | attempting: to see all the world's stories (and there have been ever so many)b |
ACT | To See | 2 | To see all stories within one structure | |
HER | Enigma 1 | Proposal | Evaluate: how: analytically, structurally | Proposes to use analysis to emulate the intuitive approach of Buddhists |
HER | Enigma 2 | Proposal | Seeming 'wholes', stories, are contained, abstractly, in larger wholes | |
REF | Code of Literary Theory | First Theorists' Approach | First theorists used structural analysis to categorize stories | |
SEM | Impossibility | The phrases 'attempting' and 'ever so many' point at impossibility
bIbid., 3.
3/27/94 | Lexia Label Detail Report--Noun Phrase Parse | Page | 3 |
3 | attempting: to see all the world's stories (and there have been ever so many)c |
SEM | Hubris | The author implies that there is a certain hubris implicit in the Buddhist's claims | ||
SYM | Antithesis | B Term (Western Methods) | Statement |
4 | within a single structure: we shall, they thought, extract from each tale itsc |
ACT | To See | 2 | To see all stories within one structure | |
ACT | To See | 3 | To see a structural model in a story | |
HER | Enigma 1 | Proposal | Evaluate: how: analytically, structurally | Proposes to use analysis to emulate the intuitive approach of Buddhists |
HER | Enigma 2 | Proposal | Seeming 'wholes', stories, are contained, abstractly, in larger wholes | |
REF | Code of Literary Theory | First Theorists' Approach | First theorists used structural analysis to categorize stories | |
SEM | Impossibility | |||
SEM | Hubris | The author implies that there is a certain hubris implicit in the ancient's belief in what was possible | ||
SYM | Antithesis | B Term (Western Methods) | Introduction | A symbolic structure balancing Western thought against Eastern, Intuition vs Analysis |
cIbid., 3.
3/27/94 | Lexia Label Detail Report--Noun Phrase Parse | Page | 4 |
5 | model, then out of these models we shall make a great narrative structure, whichd |
ACT | To See | 2 | To see all stories within one structure | |
ACT | To See | 3 | To see a structural model in a story | |
HER | Enigma 1 | Proposal | Evaluate: how: analytically, structurally | Proposes to use analysis to emulate the intuitive approach of Buddhists |
HER | Enigma 2 | Proposal | Seeming 'wholes', stories, are contained, abstractly, in larger wholes | |
REF | Code of Literary Theory | First Theorists' Approach | First theorists used structural analysis to categorize stories | |
SEM | Impossibility | |||
SEM | Hubris | The author implies that there is a certain hubris implicit in the ancient's belief in what was possible | ||
SYM | Antithesis | B Term (Western Methods) | Statement
6 | we shall reapply (for verification) to any one narrative: a task as exhausting (ninety-nine percent perspiration, as the saying goes) as it isd |
HER | Enigma 1 | Proposal | Evaluate: how: analytically, structurally | Proposes to use analysis to emulate the intuitive approach of Buddhists |
REF | Code of Literary Theory | First Theorists' Approach | First theorists used structural analysis to categorize stories | |
REF | Code of Cliches | '99% Perspiration' | ||
SYM | Antithesis | B Term (Western Methods) | Statement |
dIbid., 3.
3/27/94 | Lexia Label Detail Report--Noun Phrase Parse | Page | 5 |
7 | ultimately undesirable, for the text thereby loses its difference.e |
HER | Enigma 1 | Partial Answer | Evaluation: Not by abstracting models | One loses difference by focusing on commonality in abstraction process
HER | Enigma 2 | Partial Answer | Whole to Part: Important Differences | |
SEM | Impossibility | Extremely exhausting, at any rate | ||
SYM | Antithesis | B Term (Western Methods) | Statement | A symbolic structure balancing Western thought against Eastern, Intuition vs Analysis |
eIbid., 3.
An examination of the five report pages, Table 8, reveals that its major components are:
1. Lexia Data
a. Lexia Number
b. Lexia Text
2. Label Data
a. Code
b. Code Field 1
c. Code Field 2
d. Code Field 3
e. Note Field
The lexia data occur once for each lexia, while the label data occur as often as necessary for each lexia. By Barthes' rules, of course, there should be, at most, three or four such labels per lexia. Barthes' rule appears to be a practical one meant to cut off the infinite play of meanings at some manageable level. One hypothesis of the current research is that, given a computer, the manageable number of atomic meanings actually rises above that which a manual approach allows. More detail on this matter can be found in "CHAPTER IV: Xebra: A Test Ride Thru The Signifier Galaxy Of 'The Bear'," below.
Xebra was implemented in three modules that operate independently of each other:
1. Parser
2. Database
3. Data Analyzer
The Parser program presents a menu to the reader, allowing four choices: manual parsing, punctuation parsing, noun phrase parsing, and exiting the program. If the reader chooses any of the first three, the program requests the names of the input file and the output file, and then it begins processing the data and creating the output file. Only in the manual parse does the reader have to do anything more, since the punctuation and noun phrase parses are totally automatic in nature.
In the manual parse, the reader highlights each piece of text that is to form a lexia, then tells the Parser to save it. This is an iterative process that continues until all the text has been broken up into lexias. This step could be done using any editor that runs on a PC, though the Xebra Parser is tuned to the task, assigning the lexia numbers automatically and formatting the file, as it needs to be, for the next step, which is the database build.
The Xebra database module can be any database program the reader wishes that handles text data as well as record data. The initial implementation of Xebra used for this dissertation was Borland International, Inc.'s Paradox 4.0 database program for PC DOS. This is a sophisticated relational database with extensions for storing textual data as well as normal database record data.
Below are sample record layouts for two databases: the Lexia Database, Table 9, which holds the text of each lexia; and the Label Database, Table 10, which holds the label data. The field name is accompanied by its definition. An "N" field is a field that can only contain numbers. An "M" field is a "memo" field, which can contain any alphanumeric data and is arbitrarily large; the number following the "M" is how many characters are automatically displayed without asking for more. An "A" field is an alphanumeric field that can contain as many characters as the number right after the "A". An "*" after a field definition means that the field is used as a key to sort the records in the database.
Lexia Database Record Layout
Field Name | Field Definition |
Lexia Number | N* |
Lexia Text | M240 |
Label Database Record Layout
Field Name | Field Definition |
Lexia Number | N* |
Lexia Code | A3* |
Lexia Sequence Number | N* |
Field 1 | A50 |
Field 2 | A50 |
Field 3 | A100 |
Notes | A255 |
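As a modern analogue of the two record layouts above, the following sketch builds the same structures in SQLite via Python (illustrative only; the actual implementation used Paradox 4.0, and the column names are adaptations of the field names in Tables 9 and 10):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lexia (
    lexia_number INTEGER PRIMARY KEY,   -- "N*": numeric key field
    lexia_text   TEXT                   -- "M240": memo field holding the lexia text
);
CREATE TABLE label (
    lexia_number          INTEGER,      -- "N*": joins each label to its lexia
    lexia_code            TEXT,         -- "A3*": SEM, HER, ACT, REF, SYM, or a reader-defined code
    lexia_sequence_number INTEGER,      -- "N*"
    field_1               TEXT,         -- "A50"
    field_2               TEXT,         -- "A50"
    field_3               TEXT,         -- "A100"
    notes                 TEXT          -- "A255"
);
""")

# Report-style retrieval: every lexia with its labels, ordered as in Table 8.
rows = conn.execute("""
    SELECT l.lexia_number, l.lexia_text, b.lexia_code, b.field_1, b.field_2, b.field_3
    FROM lexia AS l LEFT JOIN label AS b USING (lexia_number)
    ORDER BY l.lexia_number, b.lexia_code, b.lexia_sequence_number
""").fetchall()
```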
These two databases, the Lexia and the Label, together can be used to do such activities as sort and subset all the lexias by codes alone, or to sort and subset lexias that have one or more labels in common. Reports, such as that shown in Table 8, can be generated both for all the lexias and their labels, or for the subsets. One can also derive a variety of numerical attributes concerning the data using the database engine, which can either be graphed or placed in a table form.
Graphs and tables can give a critic insight into the patterns that the various atomic signifiers form within the text. For instance, a graph could show the relative "weight" or "significance" of one label versus others in sheer numerical terms, as in Figure 1.
On the other hand, in a table, such as Table 11, one could show the movement of truth revealment through the text for a given hermeneutic entity, or enigma.
FIGURE 1: RELATIVE WEIGHT OF SEM CODES IN "The Bear"
TRACE OF ENIGMA 1 THROUGH "The Bear"
Lexia # | Field 2 | Field 3 |
1 | Question | Who or what is "The Bear"
1 | Partial Answer | "The Bear" is Old Ben
19 | Partial Answer | Old Ben is a bear that men have tried to capture, kill, but have failed
85 | Partial Answer | Old Ben is care taker of his brethren--thus spirit of the wilderness
86 | Partial Answer | Old Ben is conscious, willing, active adversary to hunters
89 | Partial Answer | Old Ben is conscious, willing, active adversary to hunters
89 | Snare | Denies Old Ben is care taker, thus spirit of the wilderness
91 | Partial Answer | Old Ben is conscious, willing, active adversary of hunters
92 | Partial Answer | Old Ben IS the "head bear", he's "the man"
93 | Partial Answer | Old Ben more than a regular bear--he "sends" dogs, hunters home when he wants
120 | Partial Answer | Old Ben is mortal, can be killed
120 | Snare | Implies that being mortal, that's all Old Ben is--not true.
121 | Partial Answer | Old Ben is mortal, can be killed |
121 | Snare | Implies that being mortal, that's all Old Ben is--not true. |
122 | Partial Answer | Old Ben is mortal, can be killed |
122 | Snare | Implies that being mortal, that's all Old Ben is--not true. |
123 | Partial Answer | Old Ben is mortal, can be killed |
123 | Snare | Implies that being mortal, that's all Old Ben is--not true. |
668 | Partial Answer | Old Ben--"Lover-Like"--Ben and Lion are close, possibly lovers |
681 | Partial Answer | Old Ben is the Bear, and he is dead |
763 | Partial Answer | Old Ben is a Spirit worthy of having the Season change at his Death |
764 | Partial Answer | Old Ben is a Spirit worthy of having the Season change at his Death |
1733 | Partial Answer | Old Ben is immortal |
1733 | Partial Answer | Old Ben is the Earth |
Xebra: The Proving Of Its Utility
To this point, the emphasis in this chapter has been on presenting, in abstract form, Xebra's capabilities as a Barthesian reader's assistant. The focus now turns to demonstrating Xebra's utility in that role. Proving that Xebra is useful, in that regard, is fundamental to this dissertation project. Such a proof is necessary, given that the object of the project is to show that the Barthesian system is one that literary critics and theorists could use profitably, through a tool such as Xebra, once a few problematic attributes of the system are addressed. An approach to evaluating the claim of Xebra's utility is set up here, first in theory and then in practice, while CHAPTER IV focuses on the evaluation itself.
The grounding assumption of the evaluation approach to be described here, then implemented in CHAPTER IV, is that performing a Barthesian reading using Xebra on a text of suitable characteristics will provide sufficient basis for determining the effectiveness of the tool and the system it supports. The approach depends on two elements: 1) choosing a text for use as the object of a Barthesian/Xebra reading; and 2) prescribing a reasonable set of criteria for measuring the results of such a reading.
The evaluation proofs will consist of:
- demonstrating Xebra's capability to aid in performing a Barthesian reading well;
- comparing results of an actual Xebra-based analysis with accepted readings of a specific text to demonstrate its ability to discover and name the facts of the text and the larger patterns of those facts;
- showing that a Xebra-based Barthesian analysis aids in deepening our understanding of the text.
The hypothesis underlying the specification of these three kinds of proofs is that the successful exercising of Xebra in relation to them will be sufficient evidence that the underlying, basic questions of theory and practice concerning the Barthesian methodology (as examined in CHAPTER II) have, in fact, been addressed in Xebra.
The remainder of the chapter is given over to describing the details of the evaluation approach, including the actual setting up of the data to be evaluated. First, the choice of the text to be read and the criteria for measuring the results of the reading will be delineated, followed by a description of how the reading was performed, in order to produce the data needed for the actual evaluation given in CHAPTER IV.
The Xebra Evaluation: Choosing The Text
In attempting an evaluation of the highly empirical Barthesian/Xebran reading system, it appears clear that an empirical approach is most likely to prove rewarding. Thus, it follows that performing an actual Barthesian/Xebran reading is the best candidate for the task at hand. In order to perform such a reading, a text must be chosen that can serve as the object of the reading.
This text must meet three primary criteria. One, it must be sufficiently large so as to stress the ability of the Barthesian/Xebran system to handle the amount of data involved. Two, it must be sufficiently complex semantically, in order to stress the system's abilities in that regard. Three, the facts of the text must have been previously documented, independently, in some reasonable form, in order to serve as a basis of comparison as to the effectiveness of the tool and the system at uncovering the facts of the text.
William Faulkner's text, "The Bear," meets all of these criteria. First, at approximately 50,000 words, it is more than three times the size of Balzac's Sarrasine, which is ample room for showing that the Barthesian/Xebran system can handle large amounts of text. Second, William Faulkner is well known for his semantic denseness, the inter-relatedness of his prose, both locally, in phrases and sentences, and globally, across a text, and, in many cases, across several texts. "The Bear" is no exception to that rule. Third, in Walter Davis' text, The Act of Interpretation,21 there exists a set of definitions and examples of what constitute the primary and basic facts of "The Bear" that a critic must have at hand before attempting an interpretation of it.
For these reasons, "The Bear," serves well as the object of the demonstration in CHAPTER IV of the Barthesian/Xebran system's capability as a basic fact-gathering tool. The next section details the set of definitions and examples of basic facts in "The Bear" that Walter Davis prescribes for critics to use in approaching an act of interpretation of the text.
21Walter A. Davis, The Act of Interpretation: A Critique of Literary Reason, (Chicago: University of Chicago Press, 1978), 1-181.
The Xebra Evaluation: The Evaluation Criteria
CHAPTER I introduced, at a high level, the criteria for evaluating the effectiveness, and thus, the success of Xebra, and, by extension, the Barthesian methodology as embodied in Xebra. The goal here is to make those abstract criteria concrete, and thus, accessible to evaluation.
The text referenced above, Walter Davis' The Act of Interpretation,22 succinctly lays out the set of criteria to be used for demonstrating Xebra's utility. It should be noted that the actual use of Xebra on "The Bear" was done without recourse to these criteria. Thus, they serve as an independent, objective check on the system performance.
"The Bear" Barthesian reading will also be evaluated in relation to the critical readings of the text that are given as part of Davis' The Act of Interpretation. This second type of evaluation will focus on how well the facts gathered via Xebra directly support these readings and how well the use of Xebra's fact analysis capability supports the same.
Davis sets forth his criteria after explaining that they relate to the preliminary attention a critic should pay to a text (before the act of interpretation), making this preliminary attention, essentially, a fact gathering, inventorying act.23 In terms of a Barthesian reading, it is not clear that all of them are unambiguously preliminary. In fact, some could be argued to be, to some degree, part of an act of interpretation, which makes their direct use as test criteria problematic, but not impossible. In a Barthesian reading, the decision between what constitutes a simple pointing out of the facts of the text and the carrying out of an hermeneutic act often turns on whether there is a distinct value judgement involved.
22Ibid.
23Ibid., 8-9.
That is, if someone were to say that it is a fact that a particular symbol structure exists in a text and thus deserves one of Barthes' five labels, then a Barthesian reading should include it. Conversely, if someone were to say that the same symbol structure is of primary significance, above all other symbols, then that would not be a part of a Barthesian reading, as such. A value judgement suddenly introduces a rating scale, and such scales belong to the hermeneutic models which determine them. It is one thing to point out that a symbol exists; it is quite another to make the judgement that it is primary or secondary or unimportant, because this can only be true in relation to some theoretical model of literary criticism.
While Barthes never states this distinction as such, careful study of his example reading of Sarrasine, as set forth in S/Z, reveals that he refrains from making value judgements about the importance, relative or otherwise, of particular facts, leaving that up to the interpreter. It is true, however, as discussed in CHAPTER I, that S/Z can leave a reader with the impression that Barthes made such judgements as part of his system. However, again as discussed in CHAPTER I, this is an artifact of his expository style, the methodology of his system, and the structural characteristics of the text, Sarrasine, itself.
Like Davis, then, Barthes is saying that there are facts, signifieds of the text, that must be known before the act of interpretation can go forward. Additionally, he is stating that each act of criticism, in some way, cuts off some portion of the meaning of the text by using some of those facts, but not all. Indeed, he states that when a critical model is used to articulate a criticism of a text, what is actually happening is a focusing upon, or hearing of, one voice of the text,24 a reference to the denotation of his labels, or codes, as the names of the voices of the text. Since there are (at least) five such voices, hearing only one cuts off the plurality of the text, which Barthes always seeks to avoid.
What this means for evaluating this test of Xebra is that Davis' criteria must be sorted through. Those criteria that are specific to fact gathering must be extracted and used to judge the test of Xebra in terms of its fact-gathering functions, while the others are split into two parts: a fact-gathering component to further test that function of Xebra, and an analysis component which will serve as a test of the analysis functions that one can perform using Xebra. To the extent that Davis' criteria require value judgements, the test of Xebra will lie in how well it aids the critic in making the correct, or at least accepted, judgements necessary for a successful act of interpretation.
24Barthes, 15.
25Ibid., 20-21.
Below are the items Davis lists in his text as being necessary to the act of interpretation and, in this dissertation, are being used as evaluation criteria for the combined Barthesian/Xebran system:
- Faulkner's Use of Conventions (e.g., tall tale, hunting sketch, bildungsroman, myth of initiation);
- Names--Significance of (e.g., Sam Fathers, Isaac);
- Language and Image Patterns;
- Events: relationships among, comparison and contrast (e.g., pattern of Quest, Relinquishing, Bequest in each section, including irony of);
- Roles of Structures of Myth, Ritual in Ike's Development;
- Theological, Moral Issues as embodied in thematic oppositions in nature and history;
- Narrative Styles: how they shape the subject (e.g., relationships of major section structure as defined by the chronology reversal within the narrative Faulkner employs);
- Determination of Primary Action: Tragedy of Ike and the wilderness.26
26Davis, 9.
The first criterion presents the problem of what is meant by "use." That is, when does fact inventorying end and critical analysis begin? Barthes' system is based on the "use of conventions" in the sense that connotative meaning (and denotative, for the semologists) is a meaning attached via convention. But the connotative meanings a Barthesian reader looks for are at a lower level of abstraction than "hunting sketch"; indeed, the signifiers of the Barthesian reader are only constituents of this larger structure.
For example, there are numerous references to hunting in "The Bear," which a Barthesian reader, building a label database, would label as such. It would be the work of a critic, amidst developing an interpretation, to note these labels and their pattern and to compare them to other known texts, thus coming to recognize the "hunting sketch" conventions. If the Barthesian reader, during the labeling process, had recognized this pattern, it certainly would be reasonable to place a mention of this in the label notes for the hunting references, but it is not required.
Therefore, in the evaluation reported here, Xebra was used to list items that could be used in the critic's analysis phase to build the case that "hunting sketch" conventions are being used, which could then feed into the critic's act of interpretation of how they are being used. This final act is not part of the test of Xebra.
For the evaluation, a "use of" criterion for a structure larger than Barthes' signifiers will be considered to have been met if the signifiers that form the larger structure are, indeed, labeled, and thus presentable to any subsequent user of the database. It is assumed that such a user, when presented with the elements of a "use of" some containing structure, will know and recognize the same.
Two apparent exceptions in Barthes' own practice should be noted with regard to this stricture. While, in general, his labels apply to signifiers in situ, without regard to other points of the text that they might relate to, there are two types of structures, as discussed in CHAPTER II, that Barthes knew spanned signifiers, which he labeled accordingly. These exceptions are the use of large, symbolic, antithetical structures and sequences of actions.
An example of the former is the symbolic structure Barthes notes in Sarrasine between "inside" and "outside," with the narrator occupying a "mediating" ground, though not necessarily neutrally,27 a structure that clearly spans many lexias. However, while this structure is labeled in terms of its pieces, the labels do not contain an explanation, in critical terms, of the "use" of these structures throughout the whole of the story, as would be true if a Barthesian reading were a criticism of a story.
An example of the second exception, action sequences, would be Barthes' noting of a sequence he labeled "To Narrate," which took place over thirteen lexias scattered across the text.28 Again, Barthes makes no attempt at interpreting this sequence with respect to its function in the text, per se. Only a critic preparing an interpretation would do this.
27Barthes, 22-28.
28Ibid., 255.
Davis' second criterion, names and their significance, is a typical labeling exercise. For instance, besides the examples he gives, such as Sam Fathers, there is Old Ben. A "ben" in Scotland is the interior, or private room, of a two-room cottage, where only family or very close friends are welcome. Many of the settlers of Mississippi were of Scottish and Scotch-Irish descent; the very name "McCaslin," which belongs to two of the main characters in "The Bear," is indicative of such descent, so it is not unreasonable to attach the connotation of inner spirit, or hidden inner spirit, to Old Ben on the basis of his name. A Barthesian reader would certainly be remiss not to point out Faulkner's connotative use of "ben." So one can, and indeed should, expect the Xebra label database for "The Bear" to be replete with name connotations.
The third criterion, language and image patterns, is also a standard use of Xebra, in both of its functions. First, before such patterns can be determined, one must find and code the pieces, a function of Xebra as a Barthesian reading aid. Second, patterns of use of language and images can certainly be extracted during the analysis of a Xebra label database, if the appropriate labels are there in the first place. It is important to remember that this phase is not necessarily one that is performed by the Barthesian reader, per se. Of course, it can be, if the reader is also a critic intent on performing an act of interpretation as well as performing the Barthesian reading, but it does not have to be.
In short, Davis is certainly correct in stating that the determination of these patterns is a pre-interpretative act. However, it is an act that lies outside the boundaries of Xebra as a computer-assisted aid in labeling a text (that is, in performing a Barthesian reading) and inside the use of Xebra as a computer-based analysis tool for critics exploring the data collected by a Barthesian reader. Thus, criterion three tests both major aspects of Xebra.
The fourth criterion, events and their relationships to each other, as compared and contrasted, is a normal use of Xebra, a use that again exercises both aspects of the tool (data gathering and data analysis). A data-gathering person, that is, a Barthesian reader, would be responsible for labeling all the events of the text, while the data analyzer, the would-be critic, who may or may not be the gatherer, would be responsible for determining how the events fall into patterns of relationship to each other and to the rhetorical structure of the text.
The fifth criterion, roles of structures of myth and ritual in Ike's development, is, again, a two-part use of Xebra: data gathering and data analysis. Gathering the data for ascertaining that such structures exist relative to Ike is a straightforward use of Xebra as a Barthesian reading aid, while, in the analysis phase, it should be simple enough to show that Ike is intimately connected to the myth and ritual references. However, determining the meaning or role of that connection between Ike and myth and ritual is more problematic, and, in fact, the determination could be different depending upon the critical model being used. In other words, finding and reporting the evidence of a connection is one act, while determining the meaning of the connection is another.
The sixth criterion, theological and moral issues as embodied in thematic oppositions in nature and history, again requires exercising both aspects of Xebra. However, the term "issues" is problematic, much like "use" and "roles" in earlier criteria. A Barthesian reading does not aim so high as "issues." Instead it reveals or discovers the signifiers that, through exercising of the analysis component of Xebra, together can be construed as forming a structure one might term an "issue." As for the thematic oppositions, these are best captured in Xebra as antithetical symbolic structures. Thus, any religious or moral issues that play a part in an antithetical construct will have been tagged in the same lexia as the antithesis (or, at least, physically nearby), assuming that the Barthesian reader correctly identified all three.
The seventh criterion relates to Faulkner's narrative styles in terms of how they shape the subject, and, in particular, the relationships of the five sections of "The Bear" as defined by the chronology reversal across the narrative line that Faulkner employs. This criterion is, once again, a test of both aspects of Xebra's facilities. First, a coder notes the various references to chronology in the text. Second, an analyst would use the system to study these references, most probably, but not necessarily, in a graphical form, where the time frame of each episode of the story would be superimposed against the section structure. Thus, as the narrative moves through time within and across sections, a timeline graph would show any movements back and forth in time. Having created a graph, a critic could review it and form an interpretation of the use of time movement in terms of the shaping of subject given a particular critical model.
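The sketch below illustrates the kind of timeline graph meant here, plotting narrated time against position in the narrative and superimposing the section boundaries. The data points and section boundaries are invented for the illustration; they are not drawn from the actual "The Bear" databases.

```python
import matplotlib.pyplot as plt

# Invented, illustrative data: (lexia number, narrated year) pairs of the
# sort an analyst might extract from "Code of Chronology" labels, and the
# lexia numbers at which the five sections begin.
chronology = [(10, 1883), (150, 1877), (300, 1883), (470, 1888),
              (600, 1837), (700, 1888), (850, 1885)]
section_starts = [1, 180, 320, 470, 760]

positions, years = zip(*chronology)
plt.step(positions, years, where="post")
for boundary in section_starts:      # superimpose the section structure
    plt.axvline(boundary, linestyle="--", linewidth=0.5)
plt.xlabel("Lexia number (narrative order)")
plt.ylabel("Narrated year")
plt.title("Chronology movement across sections (illustrative data)")
plt.show()
```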
The eighth, and final criterion, the determination of the primary action of "The Bear" as being the tragedy of Ike and the wilderness, is very problematic in terms of Xebra. Aside from the fact that the definition of "primary action" can vary by point of view or critical model, a Barthesian reading is at a much lower level than such concepts as "tragedy" and "comedy." In a Barthesian reading, and consequently, in a Xebra database, the sequences of actions of the story are captured using the ACT code. But these sequences of actions are not evaluated and tagged as to their relative status to each other. A graphical analysis of the sequences might lead someone to decide what that relationship might be, but such a relationship cannot be guaranteed.
To summarize, these eight criteria will be used in two ways in CHAPTER IV to help determine the effectiveness of Xebra in the pre-interpretative act stage of the work of a literary critic. First, these criteria will be examined in relation to the facts gathered by using Xebra on "The Bear." Second, they will be examined in relation to the analysis results that can be obtained by using Xebra's analytic tool set. After that, the effectiveness of Xebra's fact gathering and analysis will be further evaluated in terms of specific support for the readings of "The Bear" as they are given in The Act of Interpretation.
The Xebra Evaluation: The Reading Done
The Xebra evaluation given in CHAPTER IV is based on the data produced by a Barthesian/Xebran reading of William Faulkner's "The Bear." The reading consisted of two phases. In the first phase, "The Bear" was cut into lexias three different ways by using Xebra. In the second phase, the signifiers were appropriately labeled by using the Barthesian labeling language.
The next two sub-sections are detailed descriptions of each phase. These descriptions focus on what was done and how it was done. Where appropriate, discussion of why particular activities or processes were performed, in terms of the end goals of the evaluation and testing of the Xebran version of the Barthesian system, is also given. The third sub-section presents a statistical view of the results of the Barthesian/Xebran reading of "The Bear."
The Barthesian/Xebran Reading: Phase One
Phase One was the production of the sets of lexias. The first lexia set was generated using the punctuation criterion, with the number of punctuation marks set to five. The second lexia set was generated using the noun phrase criterion, with the number of noun phrases set to seven. The third lexia set was generated using Xebra's manual mode, where the user chooses where to make each cut.
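By way of illustration, the sketch below shows how a punctuation-count cutter of this kind might be realized; it is not Xebra's actual implementation, and the file name, function name, and the particular set of marks treated as "major" punctuation are assumptions made only for the example.

```python
import re

# Assumption for this sketch: periods, semicolons, colons, exclamation
# points, and question marks count as "major" punctuation.
MAJOR_PUNCTUATION = re.compile(r"[.;:!?]")

def cut_by_punctuation(text, marks_per_lexia=5):
    """Cut a text into lexias, closing a lexia after every Nth major mark."""
    lexias, start, count = [], 0, 0
    for match in MAJOR_PUNCTUATION.finditer(text):
        count += 1
        if count == marks_per_lexia:
            lexias.append(text[start:match.end()].strip())
            start, count = match.end(), 0
    if start < len(text):                      # keep any trailing remainder
        lexias.append(text[start:].strip())
    return lexias

# Hypothetical input file; each lexia is numbered sequentially (0001, 0002,
# ...) as it is placed into the lexia database, as described below.
with open("the_bear.txt") as source:
    lexia_database = {f"{i + 1:04d}": lexia
                      for i, lexia in enumerate(cut_by_punctuation(source.read()))}
```

A noun phrase cutter would work in the same way, except that the counter advances once per detected noun phrase rather than once per punctuation mark.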
Before constructing the final lexia sets, experiments were conducted to find the settings for the number of punctuation marks and noun phrases that would be used for the actual reading performance. This experimentation consisted of running Xebra with a variety of settings for each variable, then analyzing the results. After testing numbers ranging from one through fifteen for each parsing process, five for punctuation and seven for noun phrases were chosen as the optimal numbers for the test. These numbers were chosen for two reasons.
One, the experimental data showed that any number above seven for noun phrases and ten for punctuation yielded a large number of very long lexias. This meant that the Barthesian lexia definition constraint of, at most, three to four signifiers per lexia, would most likely be exceeded in a majority of the cases. Given the goal of the automatic generation of lexias which conform to the Barthesian definition of such, as outlined in S/Z, it was necessary to ensure that the lexias were not too large. Settings of five for punctuation and seven for noun phrase parsing proved to generate lexias that were sufficiently small, generally, although not always, as will be seen later.
Two, given that the number of lexias produced with five punctuation marks was nearly half the number produced with seven noun phrases, these settings provided an excellent opportunity for a controlled test of how random cutting affected the final results (in terms of which group of words received which labels). In particular, the lexias generated with these settings afforded a good view of how the distribution of labels changed when a complete thought was cut in the middle versus near either end. This notion of a complete thought turned out to be important, as well, in the manual generation of lexias.
Manual generation of lexias was also explored experimentally prior to final set generation. One initial finding was that manual parses tended to yield fewer, that is, longer, lexias than either of the automatic parses using the parameter settings of five and seven, respectively. The reason for this appears to lie in the tendency of a human parser to include at least one "complete thought" per lexia.
One can cut a text without regard to syntactic considerations, but there is a strong tendency to not cut until the lexia contains the essence of at least one completed thought. A thought in this instance can, and usually does, contain more than one signifier, and, in fact, usually more than Barthes' limit of three or four. This is discussed in more detail later in this section. In the final manual lexia parsing, then, the number of lexias produced was nearly two-thirds that of the punctuation parse and one-third that of the noun phrase parse.
The actual building of the lexia databases was a matter of placing the text of each lexia into the appropriate database based upon its sequential position in the original text. The first lexia for the punctuation database was:
0001
The Bear I There was a man and a dog too this time.
Two beasts, counting Old Ben, the bear, and two men,29
The number, 0001, indicates it is the first lexia made from the text, and thus, the first one in the database. This number was generated by Xebra as it produced each lexia. In all, the final lexia databases included 939 lexias in the punctuation database, 1765 lexias in the noun phrase database, and 630 lexias in the manual database.
29William Faulkner, "The Bear." in The Portable Faulkner, ed. Malcom Cowley, rev. and exp. ed. (New York: Viking Press, 1974), 197.
The Barthesian/Xebran Reading: Phase Two
Constructing the label databases required more effort than that needed for the lexia databases, because it involved the visual examination of each lexia for signifiers and the generation of appropriate labels for each lexia. This is, necessarily, a manual act, whether the reading itself is manual or automated, due to the current state of the art in natural language processing as discussed in CHAPTER II. Since constructing the label databases lies at the heart of a Barthesian/Xebran reading, it is important to note how this process differs from a strictly Barthesian reading, such as Barthes' in S/Z.
There are three important variations between Barthes' reading in S/Z and the one reported here. First, the reading of "The Bear" is based on three sets of lexias, where Barthes used but one for Sarrasine. Second, the algorithms used to generate the lexia sets differ from the method used by Barthes. Third, there is no deliberate forgetting, unlike with Barthes. These three variations are important for at least two reasons.
For one, these differences have a major impact on Barthes' insistence on his definition of what constituted a necessary, and presumably, sufficient analytical process; that is, "the idea, and so to speak the necessity, of a gradual [emphasis added] analysis of the text."30 For another, the differences accommodate nearly effortless, straightforward experimentation on the dimensions of the space occupied by a lexia.
Barthes did his gradual analysis on one set of possible lexias that he extracted from Sarrasine, if not arbitrarily, then certainly without respect to the text's "syntactical, rhetorical, anecdotic"31 structures. Using the Barthesian/Xebran reading system, it becomes feasible to have more than one set of lexias, and to construct them through essentially random, arbitrary means. This is largely due to the use of a computer to generate two of the lexia databases. This capability underscores at least one reason for using Xebra: to take advantage of computer technology in order to relieve the critic from as much manual effort as possible, thus allowing for more concentration on the analytic necessities, as opposed to the simply mechanical. It turns out, however, that there are other advantages to using more than one cutting of the text: the manual process of examining the same text multiple ways yields useful refinements to labels generated previously, as well as leading to new ones.
30Barthes, 12.
31Ibid., 14-15.
While it would be difficult for a single individual to generate three independent sets, it would not, of course, be impossible. However, most potential Barthesian critics would likely balk at doing a gradual, complete analysis of a text that involved iteratively parsing a text into different sets of lexias, then painstakingly re-combing the text for new signifiers and their meanings, or rediscovering the old signifiers with still more meanings. Thus, the computer-based version is a labor-saving device that not only saves time and energy in arriving at the same end, but also aids in producing a higher quality end, if one defines quality in terms of completeness. This fact will be examined later in more detail.
The second major difference between Barthes' approach to analysis and the one used on "The Bear" came in relation to the method of generation of lexia sets in Phase One. While adherence to Barthes' methodology (as he outlines it in S/Z) for determining signifiers and, later, their labels, was sought in the reading, there were some important variations. These were made possible by the parsing functionality in Xebra beyond the manual method of parsing the text into lexias used by Barthes. The variations came in terms of how lexias are actually determined. Barthes had two criteria for that: one being the number of signifiers, the other being the willingness to disregard structure.
While Barthes does not go so far as to say his Sarrasine lexias are arbitrarily or randomly generated, he does state that the generation was done "without any regard for its natural divisions (syntactical, rhetorical, anecdotic)."32 In the lexia set generation during the reading here, however, two sets were most certainly created arbitrarily, being generated, as they were, via computer parsing.
One could argue that there were non-random criteria used for the parsing; that is, the punctuation count and the noun phrase count paid attention to the "natural divisions."33 However, these counts themselves were arbitrary, thus resulting in essentially random, arbitrary lexias. The second difference between the reading done here and Barthes' reading in S/Z, then, is that "The Bear" reading involved working with arbitrary chunks of text in at least two lexia sets, whereas Barthes' did not.
As for the manually generated lexias, both in Barthes' parse of Sarrasine and in the manual parse done for the test of "The Bear," it is clear that neither arbitrary nor random selection happened, and arguably, could not happen. As one parses through a text, one's mind is making connections, comparing and contrasting what is immediately before one with what one has already seen, thus influencing the cuts. So, in that sense, Barthes' lexia generation and the manual parse done for the test were similarly accomplished. However, there is a difference.
Barthes sought chunks of text that contained not too many, and not too few, signifiers (three or four at most, one at least).34 This constraint proved uninteresting for the manual parse, given that the two other lexia sets could be built with the text split into sufficiently small units, ensuring that the constraint was met. Therefore, it seemed reasonable to use the manual parse to choose lexias which were each a reflection of a complete thought, a different criterion from Barthes'. The other parses generated lexias that were independent of the grammar of the text, and thus could easily be used to strictly emulate the Barthesian method. The manually generated lexia set was, therefore, open for controlled experimentation aimed at determining criteria for measuring the space of a lexia other than the number of signifiers.
32Ibid., 15.
33Ibid., 15.
34Ibid., 13-14.
As noted above in describing Phase One, the final lexia databases were spread out in terms of the number of lexias per database. This difference arose from the choice of parameters for cutting the text: five punctuation marks, seven noun phrases, and one complete thought. This distribution of lexia set sizes yielded a good sampling of random cuts of the text, thus enabling a sufficient test of a key hypothesis concerning where the semantic content of a signifier stops and starts. The initial belief was that the string of text constituting a signifier, no matter where it appeared, how often it appeared, or how mutilated it became via the random cuts of the text to form lexias, would carry with it the same signification(s). As will be shown later, this hypothesis is correct.
Finally, the third major difference in the reading here from Barthes' reading, is based in his notion of forgetting meanings. Barthes states that,
to read is to find meanings, and to find meanings is to name them; but these named meanings are swept toward other names; names call to each other, reassemble, and their grouping calls for further naming: I name, I unname, I rename: so the text passes: it is a nomination in the course of becoming, a tireless approximation, a metonymic labor.35
35Ibid., 11.
This implies that there are, if not an infinite number, then certainly a large number of meanings to be found in any given text: meanings that are attached to specific signifiers, as well as meanings that can be named and labeled. Thus, the third major difference between what Barthes did in his analysis and what was done with "The Bear" is that, in the latter, an attempt was made to forget neither the meanings and their names, nor the signifiers with which they were associated. Barthes chose to forget, deliberately; for "The Bear," forgetting was deliberately not an option.
With a computer to store the trace of the Barthesian reader's thoughts concerning the meanings of each signifier discovered in the text, not forgetting those meanings was easy. It appears that Barthes decided to make a virtue out of a fact of his environment, an environment that encouraged forgetting, since all thoughts had to be kept in one's head or on paper, neither of which is that easy or desirable. Computers change that paradigm. A human reader using Xebra will still forget meanings for specific signifiers during the process of building the various label databases, of course, but rediscovering them is a trivial matter with the computer. Obviously, this is not so with either the mind or paper as one's repository of past discoveries.
Having stipulated three differences between a pure Barthesian reading, as exemplified in Barthes' rendering of Sarrasine in S/Z, and the reading done with "The Bear," it is now possible to present a sample of what Phase Two of the Barthesian/Xebran reading produced. The reading, that is, the label generation, began with the lexia database generated via the punctuation parser. Below, in Table 12, are samples from the punctuation label database showing the records for three lexias from the associated punctuation lexia database: 468, 469, and 470.
For reference purposes, each lexia is shown directly above its corresponding labels. The label data are shown in what is known as comma and quote delimited format. Each record field is delimited by a comma, with a text field being additionally delimited by double quotes. Each record begins with a lexia number. Samples of the noun phrase and manual label database follow in Tables 13 and 14. To illustrate the principle that the same labels attach to the same string of words, that is, signifier, these examples contain the same text split up differently.
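Because the records are stored in comma and quote delimited form, a standard CSV reader can recover the fields. The sketch below shows one way this might be done; the field names are descriptive conveniences chosen for the example, inferred from the record layout shown in the tables, and the file name is hypothetical rather than part of Xebra.

```python
import csv

# Field names are conveniences for this sketch: lexia number, code type,
# label number within the lexia, label name, and three free-text fields.
FIELDS = ["lexia", "code", "label_no", "label", "detail_1", "detail_2", "note"]

def read_label_database(path):
    """Read a comma and quote delimited label database into dictionaries."""
    with open(path, newline="") as f:
        reader = csv.reader(f, skipinitialspace=True)
        return [dict(zip(FIELDS, row)) for row in reader if row]

# Hypothetical file name, assuming one record per line.
records = read_label_database("punctuation_labels.csv")
bible_references = [r for r in records if r["label"] == "Code of the Bible"]
print(len(bible_references), "records carry the label 'Code of the Bible'")
```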
TABLE 12: PUNCTUATION LABEL DATABASE
Lexia 0468
" McCaslin said. "I would have done it if he had asked me to." Then the boy moved. He was between them, facing McCaslin;a
468, "SYM",1,"Antithesis","AB (Ideal/Non-Ideal Nature)","Mediation","Isaac is the true mediation of the Ideal and the Non-Ideal, not Boon"
Lexia 0469
the water felt as if it had burst and sprung not from his eyes alone but from his whole face, like sweat. "Leave him alone!" he cried. "Goddamn it!a
469, "REF",1,"Code of the
Bible","God condemns those who lead astray his
people","",""
469, "SYM",1,"Antithesis","AB
(Ideal/Non-Ideal Nature)","Mediation","Isaac
is the true mediation of the Ideal and the Non-Ideal, not
Boon"
Lexia 0470
Leave him alone!" IV then he was twenty-one. He could say it, himself and his cousin juxtaposed not against the wilderness but against the tamed land which was to have been his heritage, the land which old Carothers McCaslin,a
470, "REF",1,"Code of the
Bible","God condemns those who lead astray his
people","",""
470, "REF",2,"Code of
Chronology","","",""
470, "REF",3,"Code of Ages of Manhood
Attainment","","","21--when a male
officially reaches his majority, his manhood, or in the classic
(racist) saying--Free, White, and Twenty-One. Ike was now a free, independent
person, as such were defined in his place and time--free to say,
do what he wanted, in theory."
470, "REF",4,"Code of
Inheritance","","",""
470, "SYM",1,"Antithesis","AB
(Ideal/Non-Ideal Nature)","Mediation","Isaac
is the true mediation of the Ideal and the Non-Ideal, not
Boon"
470,
"SYM",2,"Inheritance","","",""
aFaulkner, 252.
TABLE 13: NOUN PHRASE LABEL DATABASE
Lexia 0850
"I would have done it if he had asked me to." Then the boy moved.a
850, "HER",1,"Enigma
8","Snare","Proposed that Boon killed
Sam--that Sam was going to live, otherwise","This isn't
true--Sam was going to die--But Boon would have done so, if Sam
had Asked"
850, "REF",1,"Code of the Bible","Did
Judas (Boon) kill Christ/God
(Lion/Sam)?","",""
850,
"SEM",1,"Loyalty","","",""
850, "SYM",1,"Antithesis","AB
(Ideal/Non-Ideal Nature)","Mediation","Isaac
is the true mediation of the Ideal and the Non-Ideal, not
Boon"
Lexia 0851
He was between them, facing McCaslin; the water felt as if it had burst and sprung nota
851, "SYM",1,"Antithesis","AB (Ideal/Non-Ideal Nature)","Mediation","Isaac is the true mediation of the Ideal and the Non-Ideal, not Boon"
Lexia 0852
from his eyes alone but from his whole face, like sweat. "Leave him alone!" he cried.a
852, "REF",1,"Code of the
Bible","God condemns those who lead astray his
people","",""
852, "SYM",1,"Antithesis","AB
(Ideal/Non-Ideal Nature)","Mediation","Isaac
is the true mediation of the Ideal and the Non-Ideal, not
Boon"
Lexia 0853
"Goddamn it! Leave him alone!" IV then he was twenty-one. He could say it, himselfa
853, "REF",1,"Code of the
Bible","God condemns those who lead astray his
people","",""
853, "REF",2,"Code of
Chronology","","",""
853, "REF",3,"Code of Ages of Manhood
Attainment","","","21--when a male
reaches his majority, his manhood, or in the classic (racist)
saying--Free, White, and Twenty-One. Ike was now a free, independent
person, as such were defined in his place and time--free to say,
do what he wanted, in theory."
853, "SYM",1,"Antithesis","AB
(Ideal/Non-Ideal Nature)","Mediation","Isaac
is the true mediation of the Ideal and the Non-Ideal, not
Boon"
Lexia 0854
and his cousin juxtaposed not against the wilderness but against the tamed land which was to have been his heritage, the land whicha
854, "REF",1,"Code of
Inheritance","","",""
854,
"SYM",1,"Inheritance","","",""
854, "SYM",2,"Antithesis","AB
(Ideal/Non-Ideal Nature)","Mediation","Isaac
is the true mediation of the Ideal and the Non-Ideal, not
Boon"
aFaulkner, 252.
TABLE 14: MANUAL LABEL DATABASE
Lexia 0273
Then the boy moved. He was between them, facing McCaslin; the water felt as if it had burst and sprung not from his eyes alone but from his whole face, like sweat. "Leave him alone!" he cried. "Goddamn it!" Leave him alone!"a
273, "REF",1,"Code of the
Bible","God condemns those who lead astray his
people","",""
273, "SYM",1,"Antithesis","AB
(Ideal/Non-Ideal Nature)","Mediation","Isaac
is the true mediation of the Ideal and the Non-Ideal, not
Boon"
Lexia 0274
IV then he was twenty-one. He could say it,a
274, "REF",2,"Code of
Chronology","","",""
274, "REF",3,"Code of Ages of Manhood
Attainment","","","21--when a male
officially reaches his majority, his manhood, or in the classic
(racist) saying--Free, White, and Twenty-One. Ike was now a free, independent
person, as such were defined in his place and time--free to say,
do what he wanted, in theory."
Lexia 0275
himself and his cousin juxtaposed not against the wilderness but against the tamed land which was to have been his heritage,a
275, "REF",4,"Code of
Inheritance","","",""
275,
"SYM",2,"Inheritance","","",""
aFaulkner, 252.
The REF label, "Code of the Bible," among others in these examples, illustrates the general principle, stated above as a key hypothesis, that a given semantic unit, that is, signifier, always carries the same label set. In these label sets, one example is the signifier, "Goddamn it." This phrase occurs in the lexias in all three types of parsing in different locations, depending on the cut, but in all cases, the label applies. This guarantees that the atomic semantic units, Barthes' signifiers, are not arbitrary or capricious. They exist and they have particular meanings, which stay attached to them, no matter how the unit is mutilated.
Because this is true, Roland Barthes' system does, indeed, do what he claims, in the sense that the basic building blocks of meaning (which are used to form larger structures of meaning) are, themselves, identifiable and isolatable. However, these meanings, while they always adhere to the unit, are not always present to the reader's consciousness, given that they are often forgotten, though sometimes later remembered, yielding the possibility of a multiplicity of readings from any given text. Thus, "it is precisely because I forget that I read."36
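A check of this kind is easy to mechanize once the lexia and label databases are loaded. The sketch below reuses the read_label_database helper from the earlier sketch and assumes a hypothetical lexia-database format of one tab-separated "number, text" line per lexia, with hypothetical file names; it simply asks whether the same label names attach to every lexia, in every parse, that contains a given string.

```python
# The lexia-database format (one "number<TAB>text" line per lexia) and the
# file names below are hypothetical; read_label_database() is assumed from
# the earlier sketch.
def read_lexia_database(path):
    lexias = {}
    with open(path) as f:
        for line in f:
            number, _, text = line.partition("\t")
            if text:
                lexias[int(number)] = text.strip()
    return lexias

def labels_for_signifier(signifier, lexia_db, label_records):
    """Collect the label names attached to every lexia containing the string."""
    hits = {number for number, text in lexia_db.items() if signifier in text}
    return {r["label"] for r in label_records if int(r["lexia"]) in hits}

label_sets = [
    labels_for_signifier("Goddamn it",
                         read_lexia_database(f"{name}_lexias.txt"),
                         read_label_database(f"{name}_labels.csv"))
    for name in ("punctuation", "noun_phrase", "manual")
]
# If the hypothesis holds, the same labels recur no matter how the text was cut.
print(all(s == label_sets[0] for s in label_sets))
```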
Phase Two was complete once all the lexias in the three lexia databases had been associated with a set of labels, forming the label databases. The goal of this phase was to complete a Barthesian reading, as it is specified in S/Z, as well as to take advantage of the extended functions available through Xebra. In CHAPTER IV, an analysis of the results of Phases One and Two, both in terms of achieving the eight criteria listed earlier and of the interpretive readings of "The Bear" given in Davis' The Act of Interpretation, is presented. What follows here is a numerical overview of the data produced during the Barthesian/Xebran reading of Faulkner's "The Bear."
36Barthes, 10-11.
The Barthesian/Xebran Reading: The Statistical View
This section presents a statistical view of the data collected during the Barthesian/Xebran reading of "The Bear." Table 15 contains information on both the total population and, stepping down a level, on each lexia set. Table 16 is a comparison of Barthes' reading of Sarrasine and the reading of "The Bear" in terms of some key statistical points. The last five tables, numbers 17 through 21, contain the names of all of the unique labels within each code type for the reading of "The Bear," along with counts of how often each label occurred within each of the three databases: noun phrase, punctuation, and manual.
These numbers are not necessarily meant to be part of an analysis of "The Bear," as such, though some reading methodologies might find them useful. Instead, they are intended to give the reader of this dissertation a basis for judging the scope of the effort involved in doing a Xebra-based Barthesian reading of a text containing 44,556 words, such as "The Bear." From these numbers, one should be able to estimate the approximate effort involved in doing such a project with other texts.
Please note when reviewing the comparison data between the two readings, Barthes' and the one done here, that Sarrasine, in the English translation, is approximately 15,000 words, and thus one third the size of "The Bear." This fact helps place the occurrence data for the five codes for each reading in perspective.
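Figures of the kind reported in Table 15, such as the per-lexia word counts, can be recomputed mechanically from the lexia databases. The sketch below shows one way, reusing the hypothetical read_lexia_database helper from the earlier sketch; the file names remain illustrative.

```python
import statistics

def word_count_summary(lexia_db):
    """Average, median, standard deviation, max, and min words per lexia."""
    counts = [len(text.split()) for text in lexia_db.values()]
    return {"Avg": round(statistics.mean(counts)),
            "Med": round(statistics.median(counts)),
            "STD": round(statistics.stdev(counts)),
            "MAX": max(counts),
            "MIN": min(counts)}

for name in ("manual", "punctuation", "noun_phrase"):
    print(name, word_count_summary(read_lexia_database(f"{name}_lexias.txt")))
```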
"The Bear" Reading: By The Numbers
Number of Lexia Sets: 3
Number of Lexia per Set:
Manual: | 637 |
Punctuation: | 939 |
Noun Phrase: | 1765 |
Total: | 3341 |
Number of Words per Lexia per Set:
 | Avg | Med | STD | MAX | MIN |
Manual: | 69 | 63 | 41 | 274 | 2 |
Punctuation: | 47 | 42 | 26 | 177 | 5 |
Noun Phrase: | 25 | 24 | 7 | 56 | 2 |
Number of Labels per Set and across all Sets, per Label Type:
 | ALL | ACT | HER | REF | SEM | SYM |
Noun Phrase | 6815 | 1002 | 195 | 2492 | 1776 | 1350 |
Punctuation | 5062 | 839 | 175 | 1810 | 1302 | 936 |
Manual | 4179 | 724 | 156 | 1504 | 1064 | 731 |
Unique | 174 | 35 | 10 | 60 | 44 | 25 |
Total | 16056 | 2565 | 526 | 5806 | 4142 | 3017 |
Average | 5352 | 855 | 175 | 1935 | 1380 | 1006 |
Standard Deviation | 1106 | 123 | 16 | 415 | 296 | 296 |
TABLE 16: CODE OCCURRENCES FOR Sarrasine VERSUS "The Bear"
CODE TYPE | Text and Database | Number of Occurrences | Number of Unique Sequences | Average Length of Sequences | Shortest Sequence | Longest Sequence |
---|---|---|---|---|---|---|
ACT | Sarrasine | 272 | 48 | 5.7 | 2 | 17 |
ACT | Bear: Noun Phrase | 1002 | 35 | 28.6 | 1 | 216 |
ACT | Bear: Punctuation | 839 | 35 | 24.0 | 1 | 138 |
ACT | Bear: Manual | 724 | 35 | 20.7 | 1 | 93 |
HER | Sarrasine | 135 | 6 | 22.5 | 2 | 84 |
HER | Bear: Noun Phrase | 195 | 10 | 19.5 | 3 | 51 |
HER | Bear: Punctuation | 175 | 10 | 17.5 | 3 | 48 |
HER | Bear: Manual | 156 | 10 | 15.6 | 3 | 38 |
REF | Sarrasine | 149 | 83 | 1.8 | 1 | 18 |
REF | Bear: Noun Phrase | 2492 | 60 | 41.5 | 1 | 225 |
REF | Bear: Punctuation | 1810 | 60 | 30.2 | 1 | 169 |
REF | Bear: Manual | 1504 | 60 | 25.5 | 1 | 142 |
SEM | Sarrasine | 140 | 54 | 2.5 | 1 | 23 |
SEM | Bear: Noun Phrase | 1776 | 44 | 40.4 | 3 | 206 |
SEM | Bear: Punctuation | 1302 | 44 | 30.1 | 3 | 125 |
SEM | Bear: Manual | 1064 | 44 | 24.2 | 3 | 92 |
SYM | Sarrasine | 155 | 77 | 2.0 | 1 | 20 |
SYM | Bear: Noun Phrase | 1350 | 25 | 56.2 | 1 | 538 |
SYM | Bear: Punctuation | 936 | 25 | 39.0 | 1 | 365 |
SYM | Bear: Manual | 731 | 25 | 30.4 | 1 | 271 |
Rounding out the picture of the data, what follows are five tables, one for each of the five codes, ACT, HER, REF, SEM, and SYM, containing the label types that are used in the final label databases:
1. Table 17: the ACT codes;
2. Table 18: the HER codes;
3. Table 19: the REF codes;
4. Table 20: the SEM codes;
5. Table 21: the SYM codes.
Each table entry includes the name of the label, followed by the number of individual occurrences of that label in each of the databases.
TABLE 17: ACT CODE COUNTS LIST
Label | Noun | Punctuation | Manual |
---|---|---|---|
Man/Bear Confrontation | 216 | 138 | 93 |
Man/Wilderness Confrontation | 9 | 8 | 5 |
To Attack | 67 | 44 | 30 |
To Be Brave | 15 | 14 | 13 |
To Be Free | 26 | 22 | 24 |
To Be Lost | 16 | 16 | 16 |
To Believe | 34 | 27 | 21 |
To Bookkeep | 10 | 9 | 7 |
To Breath | 11 | 11 | 10 |
To Curse | 13 | 13 | 12 |
To Desire | 11 | 9 | 9 |
To Dispossess | 6 | 6 | 5 |
To Drink | 24 | 19 | 17 |
To Earn | 17 | 19 | 14 |
To Endure | 14 | 14 | 14 |
To Enter | 12 | 12 | 12 |
To Escape | 17 | 17 | 15 |
To Fear | 25 | 21 | 20 |
To Go Back | 32 | 26 | 25 |
To Hate | 4 | 4 | 4 |
To Heal | 8 | 7 | 6 |
To Hope | 16 | 16 | 16 |
To Hunt | 42 | 36 | 31 |
To Listen | 77 | 68 | 56 |
To Look | 62 | 59 | 57 |
To Marry | 7 | 6 | 6 |
To Narrate | 16 | 16 | 12 |
To Own | 46 | 42 | 34 |
To Relinquish | 11 | 11 | 13 |
To Repudiate | 9 | 9 | 9 |
To Sacrifice | 1 | 1 | 1 |
To See | 77 | 72 | 69 |
To Speak | 29 | 24 | 21 |
To Survive | 7 | 6 | 6 |
To Travel | 15 | 17 | 21 |
TABLE 18: HER CODE COUNTS LIST
Label | Noun | Punctuation | Manual |
---|---|---|---|
Enigma 1: Who or what is "the bear" | 24 | 19 | 16 |
Enigma 2: Who is "the man" | 12 | 7 | 7 |
Enigma 3: What is "a dog" | 23 | 19 | 19 |
Enigma 4: What is "this time" | 20 | 21 | 19 |
Enigma 5: Who is the second man | 3 | 3 | 3 |
Enigma 6: Who is Isaac McCaslin | 6 | 6 | 6 |
Enigma 7: What (or who) killed the foal | 23 | 25 | 24 |
Enigma 8: What happened to Sam | 51 | 48 | 38 |
Enigma 9: How did Eunice come to drown in the crick | 16 | 14 | 13 |
Enigma 10: What is causing the sound | 17 | 13 | 11 |
TABLE 19: REF CODE COUNTS LIST
Label | Noun | Punctuation | Manual |
---|---|---|---|
Code of Birth | 5 | 3 | 3 |
Code of Blood Heritage | 77 | 64 | 59 |
Code of Bookkeeping | 75 | 46 | 41 |
Code of Carpentry | 1 | 1 | 1 |
Code of Christian Holidays | 1 | 1 | 1 |
Code of Christian Ministry | 2 | 2 | 2 |
Code of Chronology | 203 | 157 | 138 |
Code of Civilization | 14 | 10 | 10 |
Code of Class Hierarchy | 1 | 1 | 1 |
Code of Deadly Sins | 14 | 12 | 12 |
Code of Death | 1 | 1 | 2 |
Code of Earned Status | 78 | 52 | 35 |
Code of Economics | 94 | 61 | 51 |
Code of Family Hierarchy | 17 | 14 | 13 |
Code of Farming | 15 | 13 | 13 |
Code of Fatherhood | 34 | 28 | 25 |
Code of Games of Chance | 7 | 6 | 6 |
Code of Gender Roles | 18 | 15 | 13 |
Code of Godliness | 127 | 87 | 81 |
Code of Healing | 10 | 8 | 7 |
Code of Heroic Antagonists | 8 | 7 | 5 |
Code of History | 130 | 83 | 72 |
Code of Hunters | 138 | 100 | 68 |
Code of Hunting | 61 | 40 | 25 |
Code of Identity | 33 | 30 | 22 |
Code of Inheritance | 75 | 55 | 57 |
Code of Legal Process | 4 | 4 | 4 |
Code of Love | 75 | 48 | 30 |
Code of Lust | 40 | 22 | 15 |
Code of Military Roles | 7 | 6 | 5 |
Code of Monks | 7 | 4 | 4 |
Code of Names | 38 | 36 | 32 |
Code of Nature | 21 | 18 | 17 |
Code of Original Sin | 14 | 12 | 11 |
Code of Poetry | 2 | 2 | 1 |
Code of Possession | 29 | 22 | 17 |
Code of Psychology | 2 | 1 | 2 |
Code of Pre-Destination | 10 | 8 | 6 |
Code of Racial Stereotypes | 118 | 86 | 63 |
Code of Rites and Rituals | 46 | 31 | 28 |
Code of Rites of Passage | 62 | 34 | 24 |
Code of Royalty | 30 | 25 | 22 |
Code of Scam Artists | 30 | 18 | 12 |
Code of Scientific Gadgets | 2 | 2 | 2 |
Code of Slavery | 107 | 69 | 63 |
Code of Superstition | 6 | 5 | 5 |
Code of Survival | 15 | 12 | 13 |
Code of Twins | 5 | 3 | 2 |
Code of Writing | 42 | 32 | 26 |
Code of the Always Already Known | 25 | 19 | 16 |
Code of the Bible | 183 | 131 | 95 |
Code of the Number Three | 27 | 27 | 26 |
Code of the Sixth Sense | 4 | 4 | 4 |
Code of the Story of Jesus | 8 | 8 | 8 |
Code of the Trinity | 1 | 1 | 1 |
Ages of Manhood Attainment | 57 | 44 | 36 |
Gnomic Code | 225 | 169 | 142 |
Literary Reference | 6 | 6 | 5 |
Mythology | 3 | 2 | 2 |
Symphony Conducting | 1 | 1 | 1 |
TABLE 20: SEM CODE COUNTS LIST
Label | Noun | Punctuation | Manual |
---|---|---|---|
Adulteration | 43 | 31 | 22 |
Agelessness | 9 | 10 | 10 |
Ancientness | 33 | 29 | 28 |
Blindness | 57 | 51 | 50 |
Class Values | 45 | 35 | 22 |
Cowardice | 8 | 8 | 6 |
Destructive Force | 132 | 103 | 83 |
Duty | 19 | 12 | 11 |
Faith Holding | 52 | 30 | 25 |
Fools | 206 | 125 | 92 |
Forever Gone | 47 | 31 | 30 |
Freedom | 50 | 40 | 36 |
Gender Roles | 18 | 15 | 14 |
Greed | 32 | 19 | 14 |
Hell Fire | 8 | 6 | 5 |
Holy Spirit Hidden Within | 9 | 6 | 5 |
Idolatry--worship of icons | 9 | 6 | 7 |
Immensity | 21 | 22 | 19 |
Immortality of Spirit | 27 | 26 | 24 |
Incompetency | 45 | 30 | 16 |
Incorruptibility | 37 | 26 | 22 |
Indomitable | 43 | 28 | 24 |
Infertility | 6 | 6 | 6 |
Insignificance | 31 | 31 | 22 |
Invincibleness | 25 | 22 | 18 |
Irresistibility | 27 | 27 | 18 |
Irreversibility | 38 | 15 | 16 |
Leadership | 25 | 5 | 5 |
Life's Essence | 66 | 63 | 54 |
Loyalty | 102 | 65 | 48 |
Mark of Imperfection | 23 | 23 | 23 |
Mercilessness | 8 | 8 | 8 |
Midnight Hour | 3 | 3 | 3 |
Penetration | 8 | 7 | 8 |
Poverty | 24 | 19 | 16 |
Pre-Destination | 21 | 17 | 15 |
Puritan Values | 69 | 50 | 38 |
Purity | 43 | 34 | 24 |
Racial Discrimination | 73 | 48 | 38 |
Self-Esteem | 34 | 26 | 16 |
Transience of Existence | 15 | 12 | 9 |
Transformation | 73 | 49 | 43 |
Weight of Responsibility | 9 | 6 | 4 |
Worth | 121 | 77 | 67 |
TABLE 21: SYM CODE COUNTS LIST
Label | Noun | Punctuation | Manual |
---|---|---|---|
Antithesis | 538 | 365 | 271 |
Autumn | 11 | 11 | 9 |
Beginnings | 153 | 102 | 79 |
Birth Experience | 5 | 4 | 4 |
Domesticated Nature | 82 | 56 | 46 |
Endings | 212 | 139 | 106 |
Fatherhood | 10 | 11 | 11 |
Heart | 21 | 18 | 16 |
Heart as Truth Knower | 6 | 4 | 4 |
Holy Spirit Hidden Within | 21 | 19 | 14 |
Inheritance | 12 | 10 | 9 |
Mother of God | 10 | 8 | 5 |
One Father of All (God) | 2 | 2 | 2 |
Paradox | 24 | 17 | 14 |
Penetration | 3 | 2 | 2 |
Promises | 93 | 54 | 41 |
Sexual Entry Point | 13 | 11 | 10 |
Son of God, Son of Man | 1 | 1 | 1 |
Spring | 5 | 5 | 5 |
Substitution | 34 | 25 | 22 |
Summer | 7 | 7 | 6 |
Untamed Nature | 58 | 41 | 31 |
Winter | 22 | 19 | 19 |
Womb--return to, acceptance into | 6 | 4 | 3 |
Womb--solitude of | 1 | 1 | 1 |