The Text Documentation committee will provide tags for:
The committee must be familiar with library cataloguing theory and practice for printed matter and machine-readable data files; with enumerative bibliography; and with data-archive management (both internal procedures and user assistance).
The committee must first formulate the requirements for the tasks listed above:
In formulating the requirements, the subcommittees should first survey the relevant practice of the field, notably the International Standard Bibliographic Description rules; the Anglo-American Cataloguing Rules (current version); the American National Standard for Bibliographic References (ANSI Z39.29); the MARC format for library cataloguing records; and existing data-archive catalogues and cataloguing schemes (ICPSR, Essex SSRC, Standard Study Description as implemented at various European archives, and the catalogues in Oxford, Mannheim, Nancy, Pisa, Louvain, Bergen, and Oslo). From this survey, a coherent set of requirements should be formulated.
The second task of the committee is to combine and reconcile the three sets of requirements and to formalize them in an SGML tag set. Substantive characterization of an encoding overlaps so much with the formal declarations of that encoding that a formalization in this area may have to wait until the other committees have finished specifying declaration form and content.
The two meetings of this committee should be used (1) for members to present to each other the results of the requirements surveys and to begin the process of harmonization, and (2) to review the specific tag-and-attribute formulations of the feature set proposed by the subcommittees or individual committee members, and either adopt them with modifications or suggest further development (in which case final approval must be by mail ballot).
Three subcommittees are required, though if the committee is small there may be little point in constituting them formally:
Bibliographic description overlaps with the charge of committee TR (Text Representation), which must handle encoding of printed bibliographies; the internal header prepared by committee TD (Committee on Text Documentation) should be a superset of the bibliographical tags specified by committee TR. It seems likely that the TD subcommittee on bibliographic description of the text should serve simultaneously as a subcommittee of committee TR.
Data description must inherently serve two constituencies: the borrowers who need it in order to select their data, and the archivists who must maintain the descriptions. Committee TD, comprising mostly archivists, must actively solicit suggestions from committees TR, AI (Analysis and Interpretation), and ML (Metalanguage and Syntax Issues) concerning useful data descriptors; the other committees must effectively represent the archive borrower community. (Notable example: sociolinguistic descriptors for spoken corpora must be handled by the data description tags, but it is sociolinguists, not data archivists, who are most likely to know what descriptors will be worth recommending. This particular set of descriptors poses the additional problem that spoken texts will not be explicitly addressed until after the first drafting cycle, while committee TD is not expected to live past the first cycle.)
The formal declarations fall within the responsibility of this committee in some sense, because they clearly belong in the encoding's internal documentation (header) section, and because they overlap so heavily in function with the data description tags. But the other committees must formulate the syntax and the required content for declarations. Committee TD needs to be aware of work on the declarations because it affects the data description tags, but the other committees should expect to bear primary responsibility for working out the details of declarations.
Committee TD should formally transmit preliminary accounts of its requirements survey for data description and declarations to the other committees for their information and for comment. It should transmit to committee TR both the requirements survey for bibliographic identification of texts and a fully worked out set of tags for marking in-text bibliographic citations.
The other committees should formally transmit to committee TD their decisions regarding form and content of declarations and desiderata for tags functioning as data descriptors.
Membership should include representatives of the library community and the data archive community, with special attention both to practitioners and to those active in standardization efforts.
[1]
By specifying
minimal bibliographic identification we do not mean to
limit the bibliographic section of an encoding to primitive
bibliographic information, but only to convey that not all
encodings will, or need to, contain more information than is necessary
to locate the copy text: standard practice for bibliographic
citation is as relevant as library cataloging practice.
[2]
The committee must provide
declarations for the types of alteration and normalization
commonly performed during or after transcription of texts, and
should provide guidance for users seeking to decide when such
alterations constitute a new version or edition of the encoding
or of the work.