Digital Humanities


King's College London, 3rd - 6th July 2010


The Diary of a Public Man: A Case Study in Traditional and Non-Traditional Authorship Attribution


Holmes, David I.
The College of New Jersey, USA

Crofts, Daniel W.
The College of New Jersey, USA

In 1879 the North American Review published, in four monthly installments, excerpts from "The Diary of a Public Man", withholding the name of the diarist. It was, or purported to be, a diary kept during the "secession winter" of 1860-61, and it appeared to offer verbatim accounts of behind-the-scenes discussions at the very highest levels during the greatest crisis the US had ever faced. Interest in this real or purported diary was considerable. The diarist had access to a wide spectrum of key officials, from the South as well as the North, gave a number of striking anecdotes about Abraham Lincoln, and provided an important account of events at Washington during the critical days just before the Civil War.

Frank Anderson made a detailed study of the Diary in his 1948 book The Mystery of "A Public Man". Anderson argues that the Diary is part genuine and part fictitious, with two of the three striking Lincoln incidents appearing to be inventions, along with other so-called "interviews" with prominent figures. He believes that at its core there is a genuine diary kept by Samuel Ward (1814-1884) at Washington during that winter, and that the editor of the North American Review, Allen Thorndike Rice, may have assisted in the process of embellishment. William Hurlbert (1827-1895), he argues, may also have been involved, since the style of the Diary has a good deal of Hurlbert's pungency. Others have suggested that the diarist might be Henry Adams (1838-1918), who enjoyed close access to William Henry Seward, who became Lincoln's Secretary of State and is a central figure in the Diary. Certainly the fact that the authorship has remained undetermined over a century after publication is evidence that the work of all those who may have shared in its preparation and publication was cleverly done.

Traditional Attribution

This paper argues that the diarist was not Samuel Ward; it was, instead, William Hurlbert. The preponderance of the evidence also suggests that the Diary may well be a legitimate historical document.

The diarist was not simply an observer but very much a participant-observer. One key circumstance would have impeded Ward. At the precise time the Diary was being penned, he was busily engaged in writing a memoir of his experiences in the California gold fields in 1851-52. His recollections were published in a New York weekly, starting on January 22nd 1861 and concluding abruptly on April 23rd 1861. A great deal of internal evidence suggests that the Diary and the Gold Rush memoir could not have been written by the same person, even allowing for their radically different subject matter. Ward's sentences sometimes meander in a baroque manner; he often alliterates, peppers his narrative with Spanish and French expressions, and has a habit of encasing unusual words or phrases within quotation marks. By contrast, the Diary is fast paced and immediate, with a style running towards active verbs accompanied by adverbs of a certain type.

In William Hurlbert, however, we find a newspaper writer whose style had sweep and dash. At the very moment when the biggest story he had ever witnessed burst to attention, he had no job but nonetheless had access to a remarkably wide range of prominent people. The Southern-born Hurlbert also had more basis than Ward to have developed close ties with leading Southerners. A comparison of the Diary with texts known to have been written by Hurlbert yields some demonstrable parallels, not least in the number of signature words used in the Diary. Some specific features of the Diary also point to Hurlbert rather than Ward. For example, the diarist twice mentions Josiah Quincy (1772-1864), the retired President of Harvard, who had been an important influence in Hurlbert's young life. There are circumstances, too, that suggest why Lincoln might initially have encountered Hurlbert and why he might have welcomed a repeat visit.

Concerning its legitimacy, in a number of crucial particulars the Diary conveys an on-the-spot immediacy that would have been almost impossible to recreate even months after the fact, let alone years. One example is the unfolding story of Lincoln's secret and circuitous trip to Washington in late February and the diarist's delayed realization that Seward had warned Lincoln to undertake it. The diarist expresses repeated concerns about the potential economic effects of secession, concerns which were quickly subordinated once the war started. The diarist also demonstrates an excellent ear in his accounts of his interviews with others, in particular for their personal mannerisms. In all its particulars, the Diary synchronizes perfectly with the way events unfolded at that time.

Non-Traditional Attribution

To test and validate the stylometric techniques used in this phase of the study, preliminary textual samples were taken from three prominent diarists of that era: George Templeton Strong, Gideon Welles, and Salmon Chase. Analysis of the 50 most frequently occurring function words, using what is now known as the "Burrows" approach (principal components analysis and cluster analysis), showed clear discrimination between writers and internal consistency within writers. Textual samples were then taken from three candidate authors of the Diary, namely Samuel Ward, Henry Adams and James Harvey, with the "Burrows" approach once again indicating remarkable internal consistency and clear between-writer discrimination.
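As an illustration only, the first step of such an analysis can be sketched in Python: build a table of relative frequencies for the most frequent words, then project the samples onto the leading principal component. The abstract does not say which software, word list or preprocessing was used, so the tokenizer, the function names (`top_words`, `profile`, `first_component`) and the toy texts below are all assumptions, and the cluster-analysis step is not shown.

```python
from collections import Counter
import math
import re


def tokenize(text):
    # Hypothetical tokenizer: lower-case alphabetic words and apostrophes only.
    return re.findall(r"[a-z']+", text.lower())


def top_words(samples, n):
    """Return the n most frequent words in the pooled samples."""
    pooled = Counter()
    for text in samples.values():
        pooled.update(tokenize(text))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(pooled.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def profile(text, vocab):
    """Relative frequency (per 100 tokens) of each vocabulary word in one sample."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return [100.0 * counts[w] / len(tokens) for w in vocab]


def first_component(rows, iterations=200):
    """Leading principal component by power iteration on the covariance matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Start from the centred row with the largest norm.
    v = max(centred, key=lambda c: sum(x * x for x in c))
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance at all: every sample has the same profile
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, scores


# Invented toy samples, not data from the study: two "writers", two samples each.
toy = {
    "A1": "the the the and of",
    "A2": "the the the of and the",
    "B1": "and and and the of",
    "B2": "and and of and and the",
}
vocab = top_words(toy, 3)
rows = [profile(text, vocab) for text in toy.values()]
loadings, scores = first_component(rows)
```

In this toy case the scores of the two "A" samples fall on one side of zero and those of the two "B" samples on the other, which is the kind of between-writer separation and within-writer consistency the paragraph describes.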

Four textual samples, each of approximately 3,000 words and together representing about two-fifths of the work, were taken at various places throughout the Diary, sufficiently spaced to enable a valid check on internal consistency. The Diary samples showed excellent internal consistency, suggesting single authorship, which would refute Anderson's "cut and paste" theory. They also appeared to be quite distinct from the samples of the writings of Adams, Harvey and Ward.
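The sampling scheme can be pictured with a short sketch. The abstract says only that the samples were "sufficiently spaced", so the even spacing used here is an assumption, as are the function name `spaced_samples` and the stand-in text of 30,000 tokens (the length implied if four samples of 3,000 words make up about two-fifths of the work).

```python
def spaced_samples(tokens, n_samples=4, size=3000):
    """Cut n_samples non-overlapping windows of `size` tokens, evenly spaced."""
    total = n_samples * size
    if total > len(tokens):
        raise ValueError("text is too short for the requested samples")
    # Spread the unsampled tokens evenly before, between and after the windows.
    gap = (len(tokens) - total) // (n_samples + 1)
    samples = []
    for k in range(n_samples):
        start = gap * (k + 1) + size * k
        samples.append(tokens[start:start + size])
    return samples


# Stand-in for a tokenized text of 30,000 words (integers instead of words).
stand_in = list(range(30000))
samples = spaced_samples(stand_in)
coverage = sum(len(s) for s in samples) / len(stand_in)
```

Each window can then be profiled separately; if the windows cluster together, that is consistent with a single hand throughout the text.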

Focus then moved to the two main contenders for authorship, Ward and Hurlbert. With genre carefully controlled in the selection of the textual samples from Hurlbert and Ward, subsequent multivariate analyses of high-frequency function words showed discrimination between these two writers, along with internal consistency within each. For the attributional stage of the research, discriminant analysis was employed. The samples from the Diary, Hurlbert and Ward were divided into smaller samples so that as many high-frequency function words as possible could be used without violating the assumptions underlying the technique. All 12 Diary samples were placed into the Hurlbert group.
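A minimal sketch of two-group linear discriminant analysis shows why sample counts matter. The abstract does not state which form of discriminant analysis was used, so Fisher's linear version is assumed here; the function names and the numbers are invented for illustration. The pooled covariance matrix can only be inverted when there are more samples than variables, which is one plausible reading of why the texts were split into smaller samples.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter(rows, mean):
    """Sum of outer products of deviations from the group mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - mean[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular pooled covariance: too many words "
                             "for the number of samples")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_two_group(group_a, group_b):
    """Return Fisher's discriminant weights and the midpoint cut-off."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    p = len(ma)
    dof = len(group_a) + len(group_b) - 2
    pooled = [[(sa[i][j] + sb[i][j]) / dof for j in range(p)] for i in range(p)]
    w = solve(pooled, [ma[j] - mb[j] for j in range(p)])
    cut = sum(w[j] * (ma[j] + mb[j]) / 2 for j in range(p))
    return w, cut


def classify(x, w, cut):
    score = sum(wj * xj for wj, xj in zip(w, x))
    return "A" if score > cut else "B"


# Invented rates (per 100 words) of two function words; not data from the study.
group_a = [[6.0, 2.0], [6.5, 2.4], [5.8, 1.9]]
group_b = [[4.0, 3.5], [4.2, 3.1], [3.9, 3.8]]
w, cut = fit_two_group(group_a, group_b)
disputed = [[6.1, 2.2], [5.9, 2.1]]
labels = [classify(x, w, cut) for x in disputed]
```

Here both invented "disputed" samples fall on group A's side of the cut-off, the toy analogue of all Diary samples being assigned to one author's group.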

Finally, the "Delta" technique, proposed by Burrows and refined by Hoover, was applied to four potential authors of the Diary (Ward, Adams, Harvey and Hurlbert), using the 100 most frequently occurring words in the pooled corpus. The closest "match" to the Diary using Delta and its variants was indeed Hurlbert.
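Burrows's Delta is usually described as the mean absolute difference between the z-scores of word frequencies in the disputed text and in each candidate's text. The sketch below follows that description under stated assumptions: the tokenizer, the function name `burrows_delta`, the use of the sample standard deviation across candidates, and the toy texts are illustrative choices, and Hoover's variants are not shown.

```python
from collections import Counter
import re
import statistics


def tokenize(text):
    # Hypothetical tokenizer: lower-case alphabetic words and apostrophes only.
    return re.findall(r"[a-z']+", text.lower())


def rel_freqs(text, vocab):
    tokens = tokenize(text)
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in vocab]


def burrows_delta(disputed, candidates, n_words=100):
    """Delta between the disputed text and each candidate (at least two needed)."""
    pooled = Counter()
    for text in list(candidates.values()) + [disputed]:
        pooled.update(tokenize(text))
    ranked = sorted(pooled.items(), key=lambda kv: (-kv[1], kv[0]))
    vocab = [word for word, _ in ranked[:n_words]]

    cand = {name: rel_freqs(text, vocab) for name, text in candidates.items()}
    target = rel_freqs(disputed, vocab)

    totals = {name: 0.0 for name in cand}
    used = 0
    for j in range(len(vocab)):
        column = [freqs[j] for freqs in cand.values()]
        mu = statistics.mean(column)
        sd = statistics.stdev(column)
        if sd == 0:
            continue  # a word used at the same rate by everyone tells us nothing
        used += 1
        z_target = (target[j] - mu) / sd
        for name, freqs in cand.items():
            totals[name] += abs(z_target - (freqs[j] - mu) / sd)
    return {name: total / (used or 1) for name, total in totals.items()}


# Invented toy texts; the disputed text has the same word rates as candidate "A".
candidates = {
    "A": "the the the and of",
    "B": "and and and the of",
    "C": "of of of the and",
}
deltas = burrows_delta("the the and of the", candidates, n_words=3)
closest = min(deltas, key=deltas.get)
```

The candidate with the smallest Delta is the closest stylistic match; a small Delta indicates relative similarity among the candidates tested, not proof of authorship on its own.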


Conclusion

The non-traditional stylometric analysis has supplied objective evidence that supports the traditional scholarship on the authorship of the Diary. It is very likely that the entire document was written by one person. William Hurlbert has been pinpointed, to the exclusion of all others, as the Diary's author. Much of the Diary could never have been concocted after the fact; the chances are that the entire document is authentic.


References

  • Anderson, F.M. (1948). The Mystery of "A Public Man": A Historical Detective Story. Minneapolis: University of Minnesota Press.
  • Burrows, J.F. (1992). 'Not Unless You Ask Nicely: The Interpretative Nexus Between Analysis and Information'. Literary and Linguistic Computing. 7: 91-109.
  • Burrows, J.F. (2002). 'Delta: a Measure of Stylistic Difference and a Guide to Likely Authorship'. Literary and Linguistic Computing. 17: 267-287.
  • Collins, C. (ed.) (1949). Sam Ward in the Gold Rush. Stanford: Stanford University Press.
  • Hoover, D.L. (2004b). 'Delta Prime?'. Literary and Linguistic Computing. 19: 477-495.

© 2010 Centre for Computing in the Humanities

Last Updated: 30-06-2010