Project Methodology—Titles


Our data model was designed through an iterative process. We created the data structure by carefully examining many of the books included in our bibliography and thinking about how to model their physical properties and the relationships embedded in them. Our central data table is for titles. Titles in our database refer to:

1. Records of all editions of a printed work that a woman contributed to; we define contributors as an author, publisher, bookseller, printer, editor, translator, engraver, introducer, illustrator, compiler, or composer (or any combination of these roles), based on information we have gathered from nearly 100 sources, for the period 1700–1836.

2. Titles include broadsides, books (of various formats), pamphlets, and collected works; we do not include daily, weekly, monthly, or quarterly periodicals in this database (as there is too much data that is too difficult to verify and track), but we do incorporate annuals.

For a list and description of all the metadata fields we use for titles, see our documentation for Titles.

We began adding many of our titles to the database by importing data from existing projects, provided by individuals directing those projects who graciously agreed to share their data with us. All of these projects and individuals are listed in our acknowledgements. We also collect titles from other digital and print sources as well as from libraries and archives, using their catalogues to locate and examine physical copies of titles. All sources can be found in our source records, and the specific sources consulted, as well as the Source ID associated with the particular title, are found in each title record.

Rarely does a single source provide sufficient information for our title data fields, and we have made every effort to verify the data by consulting at least two reliable sources. Thus we supplement and verify titles by checking these sources, or by examining the physical work itself. We indicate in each title record whether the data it includes has been verified (meaning we have found two independent sources that confirm the information and that the work exists or did exist). Of these two sources, one must be a digitization, with the exception of the combination of the ESTC and The English Novel, both independently verified sources (see below). If a work has been hand-verified (meaning that a member of our team has physically examined a copy), we consider that title to be verified, even where there is only a single source, as we know a copy physically exists. We indicate whether a title has been hand-verified, and in these cases we provide a Source and Source ID, which are the library or archive name and call number; this information is also included in the shelfmark field. We also copy shelfmark information when it is available in other sources (though we only indicate that a copy has been hand-verified if it has been examined by a member of our team). In this way, our database, though it focuses on the edition, does offer some copy-specific information. If we have not been able to hand-verify a title and have been unable to find two independent sources, we label that title record as “Attempted Verification,” to indicate that we have made an attempt to find more than one source. We include in the category of “Attempted Verification” titles where the second source is WorldCat, as we do not consider this resource sufficiently reliable to satisfy our criteria for verification.
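The verification rules described above can be summarized as a small decision procedure. The following Python sketch is illustrative only and is not the project's actual implementation: the `Source` class, the `verification_status` function, and the source name "Digitized copy" are hypothetical, and the sketch assumes that two sources with different names count as independent.

```python
from dataclasses import dataclass

VERIFIED = "Verified"
ATTEMPTED = "Attempted Verification"

# Source names taken from the prose above; treating them as plain strings
# is an assumption made for this sketch.
EXEMPT_PAIR = frozenset({"ESTC", "The English Novel"})
INSUFFICIENT_SOURCES = frozenset({"WorldCat"})


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for a source record."""
    name: str
    is_digitization: bool = False


def verification_status(sources, hand_verified=False):
    """Return the verification label for a title record."""
    # A physically examined copy is verified even with a single source.
    if hand_verified:
        return VERIFIED

    # WorldCat does not count toward the two required sources.
    counted = [s for s in sources if s.name not in INSUFFICIENT_SOURCES]
    names = {s.name for s in counted}
    if len(names) < 2:
        return ATTEMPTED

    # One source must be a digitization, unless the pair is
    # ESTC plus The English Novel.
    has_digitization = any(s.is_digitization for s in counted)
    if has_digitization or EXEMPT_PAIR <= names:
        return VERIFIED
    return ATTEMPTED


if __name__ == "__main__":
    print(verification_status([Source("ESTC"), Source("WorldCat")]))
```

In this sketch a title whose only second source is WorldCat falls through to “Attempted Verification,” matching the rule stated above.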

Our title metadata is chiefly generated from information from and about the books themselves: the edition and imprint data is trusted to be correct based on what the book’s title page states, or on what has been recorded by the resources we use; additionally, we record the genre of each title based on how the work identifies itself. As an example, if a work states it is a “tale” in its title, we label it as “Fiction - Tale.” If a fictional work does not explicitly state which subgenre of fiction it is, we label it as “Fiction.” We identify books based on (a) imprint and edition information and (b) how the book identifies itself in terms of genre in order to avoid issues of potential fakery (which are statistical anomalies, based on the resources we have consulted), and to avoid describing eighteenth- and nineteenth-century works according to modern genre classifications.
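As a rough illustration of the self-identified genre rule, the sketch below labels a work already known to be fiction by looking for a subgenre word in its title. Only the “Fiction - Tale” and “Fiction” labels come from the description above; the function name, the keyword table, and the example titles are hypothetical, and in practice this judgment is made by a person reading the title page rather than by keyword matching.

```python
import re

# Only the "tale" entry is attested in the text above; a real mapping
# would contain whatever other subgenre labels the project uses.
SUBGENRE_KEYWORDS = {"tale": "Fiction - Tale"}


def fiction_genre_label(full_title):
    """Label a fictional work by the subgenre its own title declares."""
    words = set(re.findall(r"[a-z]+", full_title.lower()))
    for keyword, label in SUBGENRE_KEYWORDS.items():
        if keyword in words:
            return label
    # No self-declared subgenre: fall back to the general label.
    return "Fiction"


if __name__ == "__main__":
    print(fiction_genre_label("An Example Story: A Tale"))
    print(fiction_genre_label("An Example Story"))
```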

We also rely on the title pages of works to provide complete titles and to populate a field we call “signed author,” in which we capture precisely how the author is described on the title page. If there is no signed author, the field will be marked “anonymous.” If a work is anonymous but the author is known from other sources, we add the name for that person as an author contributor (linked to the person data, discussed below). If the author is not known, the author field is marked “unknown.” The “signed author” field allows us to understand the various ways authors presented themselves to the public. As the history of reprinting is also something we aim to capture in our data, when possible we include edition information (as indicated on the title page) and the date of the first edition.
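The relationship between the “signed author” field and the author contributor can be sketched as follows. This is a simplified illustration under stated assumptions: the `TitleAuthorship` class and `record_authorship` function are hypothetical, and the author is represented here as a plain string, whereas in the database it is a link to a person record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TitleAuthorship:
    signed_author: str  # exactly as given on the title page, or "anonymous"
    author: str         # attributed author, or "unknown"


def record_authorship(title_page_attribution: Optional[str],
                      known_author: Optional[str]) -> TitleAuthorship:
    """Keep the title-page attribution separate from the attributed author."""
    attribution = (title_page_attribution or "").strip()
    signed = attribution if attribution else "anonymous"
    author = known_author if known_author else "unknown"
    return TitleAuthorship(signed_author=signed, author=author)


if __name__ == "__main__":
    # "Jane Example" is a placeholder name, not a person in the database.
    print(record_authorship(None, "Jane Example"))
    print(record_authorship("By a Lady", None))
```

Keeping the two values separate preserves how an author presented herself on the title page even when her identity is established from other sources.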

Data for the period 1750–1800 is easier to find and verify than data after 1800 (1801–1830), largely because of the ESTC. Data in well-researched genres such as poetry and the novel is also well documented (through The English Novel 1770–1829 and 1830–36 and the Jackson Bibliography of Romantic Poetry). As a result, we are staging the release of our data, beginning with the period 1750–1800; all titles published between these dates have been either verified or attempted verified. Titles between 1800–1830 will be verified or attempted verified as the project continues.

To explore titles, click here: /title/.

To search titles, click here: /title/search.


Project Methodology—Persons

Titles in our data model are linked to both persons and firms. A person in our database includes:

1. Women who contributed to the titles in our title table; we define contributors as authors, publishers, booksellers, printers, editors, translators, engravers, introducers, illustrators, compilers, and composers;

2. Men who contributed to these women’s titles (we usually only include men who played a prominent role, such as authors and editors);

3. Women who published, printed, or sold books, as determined by the imprint or the colophon of a title, or other reliable data, who were active between 1750-1830, as we have been able to find in our firm sources.

In this way, most of our titles are linked to specific women included in our persons table. One exception is a title authored “by a lady” for whom no identity is known. Of course, it is possible that some of these titles were in fact authored by men, but such cases are likely to be so rare that it seemed preferable to include these titles rather than exclude them all. Another exception is that, in an effort to be comprehensive, we include all women who published, printed, or sold books, even if we have not yet included titles produced by them. The work of the next phase of the project (after our initial launch in mid-2019) is to find and add these titles.

For a complete list and description of all the metadata we include about a person, see our documentation for Persons. We find the data for these people (first name, last name, title, gender, birth date, death date, place of birth, place of death) by searching the following sources:

Oxford Dictionary of National Biography 

Orlando 

Dictionary of Irish Biography 

German Biography 

French Biography 

British Book Trade Index

Exeter Working Papers in Book History 

Google Search, using as search terms whatever known information is available.

If our searches are unsuccessful, we attempt to search using possible alternate name spellings, married names, or maiden names, if known.

We also used data provided to us by Kirstyn Leuner, Director and Co-editor of the Stainforth Library of Women Writers, to add Virtual International Authority File (VIAF) permalinks, Wikipedia links, and Wikimedia links to about 600 of our female persons. We then searched for VIAF permalinks, Wikipedia links, and Wikimedia links for all of our other persons. Once all of these searches have been undertaken, whether or not we were successful in fully populating the fields, we consider an individual person record to be verified. For many persons we possess very little data, often only the last name.

To explore persons, click here: /person/.

To search persons, click here: /person/search/.

 
Project Methodology—Firms


Titles are also linked to firms. A firm in our database is any book-trade business listed in the imprint or colophon of a title record, or any female-operated book-trade business active between 1750 and 1830, as identified in our sources. The roles given for these businesses in our database are publisher, printer, and bookseller. Given the shifting definitions of the book trade roles during the period, we attach specific roles in each title record rather than in the firm records, as a means of recognizing that a firm listed in one imprint as a “bookseller” might also work as a “publisher” in relation to a different title record. If a firm played multiple roles in the publication of a given title (for example, if a work was “printed by and for” a firm), we indicate that as well. Further, if multiple firms had a hand in publishing a title, we enter them separately. Separating out the imprint and colophon fields into separate entries within the firms field has been one of the most time-consuming but also, we believe, most valuable aspects of our data model. For a list and description of all the metadata we include about firms, see our documentation for Firms.
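The choice to attach roles to the link between a title and a firm, rather than to the firm itself, can be sketched as a simple data structure. The classes, method names, and example firm and titles below are hypothetical placeholders, and reading “printed by and for” as the two roles printer and publisher is an assumption made for illustration.

```python
from dataclasses import dataclass, field

# The three firm roles named in the text above.
ROLES = ("publisher", "printer", "bookseller")


@dataclass(frozen=True)
class Firm:
    """A firm record carries no role of its own."""
    name: str


@dataclass
class Title:
    name: str
    # Each entry links one firm to one role for this title only.
    firm_roles: list = field(default_factory=list)

    def add_firm(self, firm, *roles):
        for role in roles:
            if role not in ROLES:
                raise ValueError(f"unknown role: {role}")
            self.firm_roles.append((firm, role))

    def roles_of(self, firm):
        return sorted(role for f, role in self.firm_roles if f == firm)


if __name__ == "__main__":
    firm = Firm("Example & Co.")
    first = Title("First Example Title")
    second = Title("Second Example Title")
    first.add_firm(firm, "printer", "publisher")  # "printed by and for"
    second.add_firm(firm, "bookseller")
    print(first.roles_of(firm), second.roles_of(firm))
```

Because the role lives on the link, the same firm can appear as a printer and publisher for one title and as a bookseller for another without any change to the firm record.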

Firm records are initially created from information provided by imprints and colophons in our title records, which often include the location of operation alongside the firm name. However, the information available in imprints is not always consistent; full addresses, for example, are not always included. We supplement the imprint information by consulting the following sources:

Oxford Dictionary of National Biography 

The Exeter Working Papers

British Book Trade Index

The Scottish Book Trade Index

The London Book Trades, 1800-1850 [book]

A Dictionary of Members of the Dublin Book Trade, 1550-1800 [book]

Wikipedia

These sources allow us to include full names for the partners in firms, provide specific street addresses and locations, and indicate the start and end date of operation for each firm. We check all sources for a firm and often use multiple sources to provide as much information as possible.

The street addresses and locations of these firm records must be clarified, as firms often moved. To account for this, we create new records for each address a firm occupied. A single firm, therefore, may have multiple entries—one per address—to indicate the years they were active at each specific location and thus provide accurate geographical information about the firms used or run by women. We do not, however, list multiple addresses active at the same time for a single firm; we prioritize the address that was active for a longer period of time (some locations were active for the firm’s entire operational period, while other locations were only active for a year or two during that period) or the address that was listed more frequently in the imprints in our database, suggesting that it was the more active location.
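One way to picture the address rule is as a selection over candidate address spans for a single firm. The sketch below is an assumption-laden illustration, not the project's procedure: the `AddressSpan` class, the function names, and the example addresses are hypothetical, and the text above does not specify how the two criteria are combined, so this sketch ranks by length of activity first and uses imprint frequency as a tie-breaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressSpan:
    address: str
    start: int              # first year active at this address
    end: int                # last year active at this address
    imprint_count: int = 0  # times the address appears in imprints


def overlaps(a, b):
    # A move within a single year (one span ends as the next begins)
    # is not treated as two simultaneously active addresses.
    return a.start < b.end and b.start < a.end


def select_addresses(spans):
    """Keep one address per period, preferring longer-lived addresses."""
    ranked = sorted(spans,
                    key=lambda s: (s.end - s.start, s.imprint_count),
                    reverse=True)
    kept = []
    for span in ranked:
        if not any(overlaps(span, other) for other in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s.start)


if __name__ == "__main__":
    candidates = [
        AddressSpan("1 Example Street", 1770, 1790, imprint_count=5),
        AddressSpan("2 Sample Lane", 1775, 1777, imprint_count=1),
        AddressSpan("3 Placeholder Row", 1790, 1800, imprint_count=2),
    ]
    for span in select_addresses(candidates):
        print(span.address, span.start, span.end)
```

Each span that survives the selection would correspond to one firm record, so a firm that moved once yields two records with consecutive date ranges.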

Our inclusion of female-run firms extends our database’s firm data beyond the information provided by imprints and colophons in our title records, and therefore has a somewhat different methodology: using our firm resources, we have gone through each bibliographic source listed above by hand to find firms run by women, even those that aren’t attached to or found in any of our title record imprints. As outlined above, these firms include publishers, printers, and booksellers. As we began adding women from our resources rather than imprints, however, we decided to include women listed as stationers, who were often also publishers and booksellers during the period. We have included them for the sake of being as thorough as possible in our inclusion of female-run firms.

These women occasionally have their own entry in these resources, with their full names, addresses, and active years provided, but more often they are listed as the widow who took over the business after the death of her husband, and are hidden within the entries dedicated to the husband and listed only under the husband’s name. Once we have the widow’s name we can then search for it in our other resources, and attempt to find more information to populate the record. Having their names also allows us to search for texts specifically printed, published, or sold by the female-run firms we have found, allowing us to expand our title records. Part of the ongoing work of the WPHP is to find all titles that these female-run firms produced. This aspect of feminist recovery will take longer to achieve than that of authorship, in part because female authorship is better known and studied, and in part because female authors are usually easier to find in existing sources than women active in the book trades. For each woman who operated a firm in our database, we also create a person record. Creating a person record for these women allows us to attach them as “people” in our title records, and to include biographical information, including their locations and dates of birth and death and a link to their VIAF record, when one exists. In this way, for women who contributed to the production of a book as a publisher, printer, or bookseller, the title entry links both to them as persons and to them as firms.

To explore firms, click here: /firm/.

To search firms, click here: /firm/search/.

 

Reese Irwin, Michelle Levy, Kate Moffat, Kandice Sharren

 

 

 

 

 

 
