Digital Approaches to Intertextuality: The Case of Eliza Haywood

Over the last year or so, I have been thinking more and more about intertextuality, or the ways in which writers borrow language and ideas from other writers. As a researcher who is particularly interested in the relationship between literary and scientific texts of the Enlightenment, I have been writing plagiarism detection scripts in order to pursue moments in works like Laurence Sterne's Tristram Shandy that borrow language from medical texts such as Burton's Anatomy of Melancholy. Having spent some time thinking about these kinds of questions, I was perhaps unusually provoked by the passage in Eliza Haywood's Betsy Thoughtless (1751) wherein the male hero Mr. Trueworth quotes a quatrain from Shakespeare that no scholars have subsequently been able to locate within the canonical Shakespearean corpus. “How dear,” Trueworth says, “ought a woman to prize her innocence! — as Shakespeare says,

They all are white,—a sheet
Of spotless paper, when they first are born;
But they are to be scrawl'd upon, and blotted
By every goose-quill” (463).

Christine Blouch, the editor of the Broadview edition of Betsy Thoughtless I was reading, attached a footnote to this quatrain that simply stated “Not Shakespeare. Source unidentified” (463). This footnote led my imagination to run wild—I wondered, could this poetic stanza be a fragment from one of Shakespeare's Lost Plays? Might it be the key that will unlock some of the grand mysteries behind the Shakespeare apocrypha?

While I soon learned the answer to these questions is a resounding “No” (the quatrain is not by Shakespeare but by William Congreve), my interest in Haywood's use of intertextuality only continued to mature. In Betsy Thoughtless alone I found the better part of two dozen “quotations” such as the “Shakespeare” quotation above for which previous scholars were unable to identify sources. Curious to see how well plagiarism detection routines could identify the missing sources, I set out to uncover the materials that informed Haywood's work.

I began by selecting a handful of “quotations” on which to focus my attention. Of the twenty or so instances of intertextuality whose sources were unidentified in my edition of Betsy Thoughtless, I selected the twelve that I thought were most interesting. These were passages like:

"When puzzling doubts the anxious bosom seize,
To know the worst is some degree of ease" (51)

Haywood often introduces such passages with phrases like, “As the poet says,” or “I remember to have read somewhere,” which I took to be indications that the passages were based on extant texts of Haywood's day. I therefore typed up the dozen unsourced quotations I had selected and fed them to the Literature Online API. The API then broke the quotations into sequences of three words and sent each of those three-word chunks to the Literature Online database. This procedure generated a spreadsheet of text data, pictured in the following image, which I subsequently scoured for the sources of Haywood's passages:

A sample comma-separated-value file produced by the Literature Online API described below.
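The chunking step described above can be sketched in a few lines of Python. This is only an illustration of splitting a quotation into three-word sequences, not the Literature Online API's own code: the function names `tokenize` and `ngrams` are hypothetical, and the tokenization rule (lowercasing, dropping punctuation, keeping internal apostrophes so that forms like "scrawl'd" stay whole) is an assumption.

```python
import re

def tokenize(text):
    """Lowercase the text and return its words, keeping internal
    apostrophes (so "scrawl'd" stays one token). Assumed rule, not LION's."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

quotation = ("When puzzling doubts the anxious bosom seize, "
             "to know the worst is some degree of ease")
trigrams = ngrams(tokenize(quotation))

for trigram in trigrams:
    print(" ".join(trigram))
```

The couplet has sixteen words, so it yields fourteen overlapping trigrams, beginning with "when puzzling doubts" and ending with "degree of ease". Each of these would then be sent as a separate query.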

Using this data, I first tried to get a feel for the individuals whose prose most closely resembles the unsourced passages. Using some Python scripts, I counted up the number of times each writer in the LION database had a trigram (or series of three words) that matched one of the trigrams in the quotes I had selected from Haywood's novel. Here are the authors whose texts shared the greatest number of trigrams with the selected passages:

Authors whose work most resembled the selected trigrams from Haywood's Betsy Thoughtless. The birth and death dates of authors identified herein are taken from the Literature Online database.
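A tally of this kind might look something like the sketch below. The layout of the spreadsheet is not given here, so the column names (`author`, `title`, `trigram`) are assumptions, and the rows are placeholders invented purely to make the example runnable; they are not results from the study.

```python
import csv
import io
from collections import Counter

def count_matches_by_author(csv_file, author_field="author"):
    """Count how many matching-trigram rows each author has.
    Assumes one row per trigram match and a column naming the author."""
    reader = csv.DictReader(csv_file)
    return Counter(row[author_field] for row in reader)

# Placeholder rows standing in for the real spreadsheet (hypothetical data).
sample = io.StringIO(
    "author,title,trigram\n"
    "Author A,Work 1,know the worst\n"
    "Author B,Work 2,the anxious bosom\n"
    "Author A,Work 3,some degree of\n"
)

counts = count_matches_by_author(sample)
for author, n in counts.most_common():
    print(author, n)
```

With a real file, `open("matches.csv", newline="")` (a hypothetical filename) would take the place of the `io.StringIO` object.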

These numbers were interesting—who would have thought Edward Ward would occupy the pole position? In the end, though, I found that very few of the sources for Haywood's identified literary borrowings appear in this chart. To discover this fact, I began by writing a few scripts that could loop over the text data pictured above. Using those routines, I soon discovered that in many cases, Haywood's references to other literary works were fairly straightforward, which made the computing easy. We could take for instance the pseudo-Shakespearean passage cited above:

“They all are white,—a sheet
Of spotless paper, when they first are born;
But they are to be scrawl'd upon, and blotted
By every goose-quill” (463)

Much to my chagrin, these lines led me not to a now-forgotten Shakespearean play but to William Congreve's Love for Love (1695), where one reads: “You are all white, a sheet of lovely, spotless paper, when you first are born; but you are to be scrawled and blotted by every goose's quill.” (Of course Congreve could also be quoting a forgotten Shakespearean work, but this is another story.)
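To see why a trigram search surfaces Congreve here, one can compare the two passages directly. The following is a minimal sketch rather than the routine I actually ran: the helper names are hypothetical, the tokenization rule is an assumption, and punctuation is discarded before the comparison.

```python
import re

def tokenize(text):
    """Lowercase and split into words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def trigram_set(text):
    """Return the set of three-word sequences found in the text."""
    tokens = tokenize(text)
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}

haywood = ("They all are white, a sheet Of spotless paper, when they first "
           "are born; But they are to be scrawl'd upon, and blotted "
           "By every goose-quill")
congreve = ("You are all white, a sheet of lovely, spotless paper, when you "
            "first are born; but you are to be scrawled and blotted by "
            "every goose's quill.")

shared = trigram_set(haywood) & trigram_set(congreve)
for trigram in sorted(shared):
    print(" ".join(trigram))
print(len(shared), "shared trigrams")
```

Under these assumptions the two passages share eight trigrams, among them "white a sheet" and "blotted by every", even though Haywood changes the pronouns and drops "lovely".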

Other passages in Betsy Thoughtless had similarly straightforward sources. Using my procedure, I was able to find sources for the following passages in Haywood's text:

The Patriarch, to gain a wife
Chaste, beautiful, and young,
Serv'd fourteen years, a painful life,
And never thought it long.
Oh! were you to reward such cares,
And life so long would stay,
Not fourteen, but four hundred years,
Would seem but as one day (153)

These lines are from "The Perfection," a song published at least as early as 1726 in The Hive: A Collection of the Most Celebrated Songs, and one which Robert Burns quoted with delight some years thereafter. Another passage:

“All saw her spots but few her brightness took” (224)

was adapted from the 1677 play that made Nathaniel Lee's career, namely Alexander the Great, where the titular character boasts “All find my spots, but few observe my brightness.” Next up:

“That faultless form could act no crime,
But heav'n, on looking on it, must forgive” (280)

This passage draws from John Dryden's play The Spanish Friar (1681): “So wondrous fair, you justifie Rebellion: As if that faultless Face could make no Sin, But Heaven, with looking on it, must forgive.” The next unsourced passage,

“There is no wonder, or else all is wonder” (285)

is adapted from a remark of William Congreve's in The Mourning Bride (1697): “There are no wonders, or else all is wonder.” Let's turn to another:

“Young Philander woo'd me long,
I was peevish, and forbad him;
I would not hear his charming song,
But now I wish, I wish I had him” (289)

Here Haywood recites a popular song of the day, one that made its way into Charles Johnson's The Village Opera (1729):

An air from Charles Johnson's "The Village Opera" that Eliza Haywood cites in Betsy Thoughtless.

My script also uncovered the fact that George Lillo references the song in his 1731 play Silvia (and subsequent searches helped me find that Purcell set the song to music!). Next:

“Ingratitude's the sin, which, first or last,
Taints the whole sex; the catching court-disease” (322)

Mad man Nathaniel Lee wrote similar lines in his play Mithridates (1678): “Inconstancy, the Plague that first or last Taints the whole Sex, the catching Court-disease.” The last passage for which I found a straightforward source runs as follows:

“I, like the child, whose folly prov'd its loss,
Refus'd the gold, and did accept the dross” (602)

Here George Etherege's Comical Revenge, or Love in a Tub (1664) appears to be the source: “I, like the child, whose folly proves his loss, Refus'd the gold, and did accept the dross.” Using natural language processing techniques and the wonderful data provided by LION, identifying these sources took little time at all.

While the previous set of intertextual references was closely patterned on a variety of source texts, some passages that Haywood attributes to other writers are much less straightforward. Indeed, it seems she often combined lines from disparate literary works in order to forge her own ideas. Take, for example, the following passage:

Pleas'd with destruction, proud to be undone,
With open arms I to my ruin run,
And sought the mischiefs I was bid to shun;
Tempted that shame a virgin ought to dread,
And had not the excuse of being betray'd (111)

Like other instances of intertextuality in Haywood's writing, this passage seems to derive from multiple sources. The first line echoes the poet and doctor Richard Blackmore's “Advice to the Poets” (1718), where Blackmore writes “Let them this gen'rous Resolution own, / That they are pleas'd and proud to be undone.” The second and third lines of Haywood's passage appear to borrow from two further sources. One is Mary Wortley Montagu's “The Basset Table” (1716), where one finds the lines “I know the bite, yet to my ruin run, / And see the folly which I cannot shun.” The other is a set of posthumously published lines from “The Excursion of Fancy: A Pindaric Ode” (1753) by Aaron Hill (1685-1750): “Let us throw down this load of doubt, with which no race is won: / And, swift, to easier conquests, lighter, run, / The way, which reason is not bid to shun!” Another synthetic creation of Haywood's that I spent some time analyzing runs as follows:

When puzzling doubts the anxious bosom seize,
To know the worst is some degree of ease (51)

The first line of this couplet pulls from a line in Joseph Mitchell's “Poems on Several Grave and Important Subjects”: “When puzling Doubts invade my Breast, And I am cloath'd in Shades of Night . . .”, while the second inverts a line from David Mallet's Eurydice (1731): “When others too are miserable, not to know the worst is some degree of bliss.” In this passage, as in others, Haywood brings a variety of extant literary works to bear on her own project in fascinating and unpredictable ways.

* * *

Tracing the sources of these passages was helpful, not least because it allowed me to get a better sense of the ways writers like Haywood engaged with the texts of their age. For instance, using the data I gathered while tracing the sources of the passages above, I began considering new ways to optimize my plagiarism detection routines. Consider the following chart:

This graph indicates that, of the passages in Haywood's novel for which I was able to find sources, all of those passages shared at least three identical words in identical order with the texts they paraphrase. Roughly 80% of the instances of plagiarism I analyzed had at least four identical words in identical order with their source texts, ~55% had at least five equivalent words in equivalent sequential order, and so on. The lesson embedded in this chart is perhaps predictable: The greater the number of identical words one demands in order to identify one language act as a paraphrase of another, the greater the number of false negatives one can expect in one's analysis. As I noted above, my study of intertextuality in Haywood's writing was carried out using trigrams as the unit of analysis. That is to say, I expected a passage from Haywood's text to share at least three identical words in identical order with the text it paraphrases. While this seemed a relatively low condition for an instance of plagiarism to satisfy, it might have actually been too demanding a condition, because I was only able to find sources for ten of the twelve passages I set out to study. Perhaps a study using bigrams as the unit of analysis—or perhaps one of you—can identify the source of the remaining two quotations:

"Away with this idle, this scrupulous fear,
For a kiss in the dark,
Cry'd the amorous spark,
There is nothing, no nothing, too dear" (311)

* * *

"Unequal lengths, alas! our passions run,
My love was quite worn out, e'er yours begun" (462)
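The threshold trade-off discussed before these two quotations can be made concrete with a short sketch. It measures the longest run of identical words in identical order shared by a passage and its source, then reports what share of pairs would survive a given minimum run length. The function names are hypothetical, the tokenization is an assumption, and only three of the pairs quoted earlier are included, so the output illustrates the method and does not reproduce the chart.

```python
import re

def tokenize(text):
    """Lowercase and split into words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def longest_shared_run(a, b):
    """Length of the longest run of consecutive tokens found in both lists."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                curr[j] = prev[j - 1] + 1  # extend the run ending just before
                best = max(best, curr[j])
        prev = curr
    return best

# (Haywood's version, source text) for three pairs quoted in this post.
pairs = [
    ("They all are white, a sheet Of spotless paper, when they first are "
     "born; But they are to be scrawl'd upon, and blotted By every goose-quill",
     "You are all white, a sheet of lovely, spotless paper, when you first "
     "are born; but you are to be scrawled and blotted by every goose's quill."),
    ("I, like the child, whose folly prov'd its loss, Refus'd the gold, "
     "and did accept the dross",
     "I, like the child, whose folly proves his loss, Refus'd the gold, "
     "and did accept the dross."),
    ("There is no wonder, or else all is wonder",
     "There are no wonders, or else all is wonder."),
]

runs = [longest_shared_run(tokenize(h), tokenize(s)) for h, s in pairs]

def share(n):
    """Fraction of pairs whose longest shared run is at least n words."""
    return sum(r >= n for r in runs) / len(runs)

print(runs)
for n in range(2, 8):
    print(n, round(share(n), 2))
```

For these three pairs the longest shared runs are 4, 9, and 5 words. A threshold of three words would catch all of them, while a threshold of six would catch only the Etherege couplet. Lowering the threshold to two words, as a bigram study would, trades those false negatives for many more coincidental matches to sift through.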