Our data model was designed through an iterative process. We created the data structure by carefully examining many of the books included in our bibliography and considering how to model their physical properties and the relationships embedded in them. Our central data table is for titles. Titles in our database are defined as follows:

1. Titles are records of all editions of a printed work to which a woman contributed. We define a contributor as an author, publisher, bookseller, printer, editor, translator, engraver, introducer, illustrator, compiler, or composer (or any combination of these roles), based on information we have gathered from nearly 100 sources, for the period 1700–1836.

2. Titles include broadsides, books (of various formats), pamphlets, and collected works. We do not include daily, weekly, monthly, or quarterly periodicals in this database (there is too much data, and it is too difficult to verify and track), but we do incorporate annuals.
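The two criteria above can be read as a simple inclusion test. The following is a minimal sketch only, not the project's actual schema: the roles and date range come from the definition above, while the `Contribution` class, the `is_woman` flag, the format labels, and the `in_scope` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Roles taken from the definition of a contributor above.
CONTRIBUTOR_ROLES = {
    "author", "publisher", "bookseller", "printer", "editor", "translator",
    "engraver", "introducer", "illustrator", "compiler", "composer",
}

# Formats named above. The exact string labels are assumptions.
INCLUDED_FORMATS = {"broadside", "book", "pamphlet", "collected works", "annual"}

FIRST_YEAR, LAST_YEAR = 1700, 1836


@dataclass(frozen=True)
class Contribution:
    """Hypothetical link between a person and an edition."""
    person: str
    is_woman: bool
    roles: frozenset


def in_scope(year, fmt, contributions):
    """Return True if an edition meets both inclusion criteria."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        return False
    # Periodicals other than annuals are excluded because they are not listed.
    if fmt not in INCLUDED_FORMATS:
        return False
    # At least one woman must hold at least one recognised role.
    return any(c.is_woman and bool(c.roles & CONTRIBUTOR_ROLES) for c in contributions)
```

A contributor may hold several roles at once, which is why `roles` is modelled as a set rather than a single value.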

For a list and description of all the metadata fields we use for titles, see our documentation for Titles.

We began adding titles to the database by importing data from existing projects, provided by the individuals directing those projects, who graciously agreed to share their data with us. All of these projects and individuals are listed in our acknowledgements. We also collect titles from other digital and print sources as well as from libraries and archives, using their catalogues to locate and examine physical copies of titles. All sources are listed in our source records; each title record gives the specific sources consulted and the Source ID associated with that title.

Rarely does a single source provide sufficient information for our title data fields, and we have made every effort to verify the data by consulting at least two reliable sources. We therefore supplement and verify title data by checking these sources or by examining the physical work itself. We indicate in each title record whether the data it includes has been verified (meaning that we have found two independent sources that confirm the information and that the work exists or did exist). One of these two sources must be a digitization; the only exception is the combination of the ESTC and The English Novel, both of which are independently verified sources (see below).

If a work has been hand-verified (meaning that a member of our team has physically examined a copy), we consider that title to be verified even where there is only a single source, as we know that a copy physically exists. We indicate whether a title has been hand-verified, and in these cases we provide a Source and Source ID, which are the library or archive name and the call number; this information is also included in the shelfmark field. We also copy shelfmark information when it is available in other sources (though we only indicate that a copy has been hand-verified if it has been examined by a member of our team). In this way, our database, though it focuses on the edition, does offer some copy-specific information.

If we have not been able to hand-verify a title and have been unable to find two independent sources, we label that title record as “Attempted Verification,” to indicate that we have made an attempt to find more than one source. This category also includes titles where the second source is Worldcat.org, as we do not consider this resource sufficiently reliable to satisfy our criteria for verification.
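The verification rules described above can be summarised as a small decision procedure. The sketch below is illustrative only and is not the project's code: the `Source` class, its `is_digitization` flag, and the `verification_status` function are hypothetical. It also makes two assumptions that the text does not state: that sources with different names are independent of one another, and that two sources with no digitization between them (other than the ESTC and The English Novel pairing) leave a title at “Attempted Verification.”

```python
from dataclasses import dataclass

VERIFIED = "Verified"
ATTEMPTED = "Attempted Verification"

# WorldCat is not considered reliable enough to count toward verification.
NOT_COUNTED = {"WorldCat"}

# The one pairing accepted without a digitization.
EXEMPT_PAIR = {"ESTC", "The English Novel"}


@dataclass(frozen=True)
class Source:
    """Hypothetical source record; only the name and one flag are modelled."""
    name: str
    is_digitization: bool = False


def verification_status(sources, hand_verified=False):
    """Return the verification label for a title record."""
    # A physically examined copy is enough, even with a single source.
    if hand_verified:
        return VERIFIED
    # Assumption: sources with distinct names are independent.
    counted = {s.name: s for s in sources if s.name not in NOT_COUNTED}
    if len(counted) < 2:
        return ATTEMPTED
    has_digitization = any(s.is_digitization for s in counted.values())
    if has_digitization or EXEMPT_PAIR.issubset(counted):
        return VERIFIED
    return ATTEMPTED
```

Checking the hand-verified case first mirrors the text, where physical examination overrides the two-source requirement.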

Our title metadata is generated chiefly from information from and about the books themselves. We trust the edition and imprint data as stated on the book’s title page or as recorded by the resources we use. We also record the genre of each title according to how the work identifies itself. For example, if a work states in its title that it is a “tale,” we label it “Fiction - Tale.” If a fictional work does not explicitly state which subgenre of fiction it belongs to, we label it “Fiction.” We identify books by (a) imprint and edition information and (b) how the book identifies itself in terms of genre, both to avoid issues of potential fakery (which, based on the resources we have consulted, are statistical anomalies) and to avoid describing eighteenth- and nineteenth-century works according to modern genre classifications.
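As a rough illustration of the genre rule for fiction, the sketch below labels a work from the wording of its own title. It assumes the work is already known to be fiction. Only the “tale” keyword comes from the description above; the function name, the keyword table, and the whole-word matching are assumptions, and in practice this labelling is an editorial judgement rather than an automated step.

```python
import re

# Only "tale" is given in the text; further entries would be assumptions.
SUBGENRE_KEYWORDS = {"tale": "Tale"}


def fiction_genre_label(title):
    """Label a fictional work by the subgenre its own title declares, if any."""
    for keyword, subgenre in SUBGENRE_KEYWORDS.items():
        # Match the keyword as a whole word, ignoring case.
        if re.search(rf"\b{re.escape(keyword)}\b", title, flags=re.IGNORECASE):
            return f"Fiction - {subgenre}"
    # No self-declared subgenre: fall back to the general label.
    return "Fiction"
```

With invented titles, `fiction_genre_label("The Hermit: A Tale")` gives the subgenre label, while a title with no such self-description falls back to the general one.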

We also rely on the title pages of works to provide complete titles and to populate a field we call “signed author,” in which we capture precisely how the author is described on the title page. If there is no signed author, the field is marked “anonymous.” If a work is anonymous but the author is known from other sources, we add that person as an author contributor (linked to the person data, discussed below). If the author is not known, the author field is marked “unknown.” The “signed author” field allows us to understand the various ways in which authors presented themselves to the public. Because we also aim to capture the history of reprinting in our data, we include, when possible, edition information (as indicated on the title page) and the date of the first edition.
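The relationship between the “signed author” field and the author contributor can be sketched as follows. This is a simplified illustration under stated assumptions: the function and parameter names are hypothetical, and the author is represented here as a plain string, whereas in the database it is a link to a person record.

```python
def author_fields(title_page_attribution, known_author=None):
    """Derive the two author-related values for a title record.

    title_page_attribution: the wording on the title page, or None if absent.
    known_author: the author identified from any source, or None if not known.
    """
    signed = (title_page_attribution or "").strip()
    author = (known_author or "").strip()
    return {
        # No attribution on the title page means the work is anonymous.
        "signed_author": signed or "anonymous",
        # The author may be known from other sources even when unsigned.
        "author": author or "unknown",
    }
```

Keeping the two values separate preserves the title-page wording even when the author has been identified from other sources.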

Data for the period 1750–1800 is easier to find and verify than data after 1800 (1801–1830), largely because of the ESTC. Well-researched genres such as poetry and the novel are also very well documented (through The English Novel 1770–1829 and 1830–36 and the Jackson Bibliography of Romantic Poetry). As a result, we are staging the release of our data, beginning with the period 1750–1800; all titles published between these dates have been either verified or labelled “Attempted Verification.” Titles published between 1800 and 1830 will be verified or labelled “Attempted Verification” as the project continues.

To explore titles, click here: /title/.

To search titles, click here: /title/search.


Reese Irwin, Michelle Levy, Kate Moffat, Kandice Sharren