Hans Jorgens Marker
Some problems occurring when coding formal documents
<date>31 May 91
<body>

A traditional text may be seen as a string of elements. This string has a beginning and an end and any element of the string is either before or after any other element. The string can be divided into sub-strings (volumes, chapters, paragraphs etc.). These sub-strings also have beginnings and ends and complete ranking. Furthermore one sub-string of a given type ends before the next one of the same type begins. Within a traditional text you can have a portion that has a certain feature e.g. being a foreign language quotation. Such a portion again begins somewhere in the text and ends somewhere else in the text. Texts that fall within this definition are the main concern of the TEI-guidelines.

Below I will treat three types of sources that have features which make them differ from traditional texts. The types are compound documents, inventories and account books.

For the sake of clarity I am going to draw most of my examples from the fief (1) ledger of Koldinghus (2) 1610/11. This particular ledger is too large to be typical for its kind. The size of the ledger is just over 1200 pages. On the other hand because of its size it contains almost all the features found in such ledgers. Smaller ledgers are often less complete. Furthermore the fief ledger of Koldinghus has the advantage of being available in a printed edition (3) and in machine readable form (4). The edition was made by "Selskabet for Udgivelse af Kilder til Dansk Historie" (5) in cooperation with the Danish Data Archives (6).

Compound Documents

Quite a number of sources are documents which are composed from several basic types but still have a coherent overall structure. Important examples of compound documents are the estate ledgers as found in Northern Germany and Scandinavia.
In their most complete form such ledgers will contain:

- a title page stating the territorial and temporal scope of the ledger, often with
  - conversion formulas used in the ledger to calculate currencies and measures,
- a cadastre over the estate covered by the ledger stating the lease due from every tenant under the estate,
- an inventory of the holdings of money and goods at the start of the time period covered,
- a list of fines and income from proceedings (7) falling to the estate if it had jurisdiction,
- taxation lists if taxes went to the estate,
- a list of copyhold fees (8) during the period,
- an account of income of money,
- an account of expenses of money,
- accounts of income, expenses and the consumption of important commodities such as cereals, honey, beer and perhaps some types of livestock and finally
- a collection of evidence with bearing on different aspects of the ledger such as
  - receipts from artisans and merchants or
  - testimonies from court cases of significance for the ledger or proving that tenants were unable to pay their dues (9).

The different parts of the ledger constitute a coherent whole. The purchase of a barrel of honey is found as an expense in the money account, as an income in the honey account and on a receipt from the merchant involved.

Many of these ledgers have been subjected to scrutiny by some controlling authority. Some private ledgers were sent from bailiffs administering part of a system of estates to the owner, who resided elsewhere but in some cases took a keen interest in his property (10). Likewise the official fief ledgers had to be submitted to the chancellory of the exchequer. The auditor checked that everything added up, that the values were the usual ones, and that the conversion formulas given were correct and were followed throughout the ledger. The auditor wrote his conclusions in the ledger in appropriate places, usually in the margins or the corners of the relevant pages.
A generally occurring problem in the marking up of compound documents arises from the interdependency of the different parts of the document. Though an item may be of a known type, e.g. a list or a letter, the item now represents part of a whole. This means of course that you need ways of expressing the relationship between the entirety and the element. It also means for instance that parts of a letter may be linkable to items occurring in other parts of the document. Thus the means for marking up such items and relationships need to be provided.

Inventories

Many sources contain lists of persons or objects. A typical such list of objects is a cadastre. The purpose of a cadastre is to document the income of an estate. Cadastres may vary somewhat in structure. The cadastre of Koldinghus 1610-11 is a very large document. It starts on page 7 of the ledger of income (11) and continues through page 438. The basic unit of the list is the dues paid by one tenant. Such a unit is for instance:

    Pouell Iffuerss%n 3 lpd. sm%r 0.5 %rt rug 10 skp. byg 0.5 %rt havre 0.5 okse at stalde 1 br{ndsvin 1 mk dansk g{steri (12)

This sub-list begins with the name (Christian name and patronymic) of the tenant and then gives the amount and type of each of his dues.

The tenants are grouped in parishes. At the top of page 16 you find a title saying

    Hardt sogn og by (13)

After 37 lists of dues of varying length page 23 starts with the title:

    Summarum p} al landgilde udaf Har sogn (14)

The rest of the page is then the sum of the small lists on the preceding 7 pages. When you look closer at the preceding 7 pages you will discover that occasionally one of the small sub-lists is preceded by something ending with a colon. For instance on page 18 you find:

    Br%dtzgaard: Lauge Hanss%n: 1 fj. sm%r 2 %rt. rug (15) ...

And further down on the page you find

    Stubdrup: Las S%ffrenss%n: 5 skp. kv{rst 2 mk. 10 sk. 2 alb. }rlig (16) ... Peder Lass%n 0.5 rt. rug 1 rt. malt (17) ...
By applying some topographical knowledge you may be able to deduce that Br%dtzgaard is the name of the farm on which Lauge Hanss%n is the tenant. Br%dtzgaard may be an outlying farm which is not built in the village. Stubdrup on the other hand is the name of the village within the parish of Hardt to which the following tenants belong. The parishes are grouped in districts (18). The fief of Koldinghus was a large one that comprised 8 districts. Thus on page 101 we find: Summa summarum p} al landgilde, som b%nderne giver udi Brusk herred (19) This grand total covers 5 pages and is a list in kind over the different commodities and forms of labour that comprised the dues of the tenants in the district of Brusk. After the similar lists of the remaining seven districts you find the list of the dues paid by the citizens of the town of Kolding. The citizens paid a due for the land they used. This due was paid in money and not in kind. Thus the items of the list consist of a name, perhaps a description of the land involved, and the amount: Hans Schotte af 1 k}lg}rdshave, som tilforn er brugt (til slottet): 8 sk. dansk (20) The cadastre ends with the master grand total on page 427 of the ledger: Summa summarum p} den }rlige visse indkomst over Koldinghus jordebog til Phillippi Jacobi dag anno 1611: (21) This grand total covers 12 pages. It has a number of marginal comments. In the printed edition some comments are marked by putting the text between quotation marks others are just printed with a smaller font. The edition does not explicitly say what this signifies. Probably the comments between quotation marks are marginal comments. The edition seems to assume that all comments are auditorial comments (22). Whether this assumption is correct can not be decided without access to the manuscript. A comment that is certainly auditorial is the last one: Kommer overens med n{ste forgangen }rs jordebogs summarum. 
N}r herhos agtes hvis jordebogen dette }r er formeret og forringet efter hosliggende forandringsregisters lydelse signeret med A. (23)

This comment is not in quotation marks in the printed edition. Probably it extends over the whole bottom of page 438.

Practically all the tenants in the cadastre have names that are clearly identifiable as Christian name and patronymic, the patronymic ending in 's%n'. Very few have a family name instead of a patronymic. A few of the artisans have names that consist of a Christian name, a patronymic and an occupation such as S%ffren Nielss%n M%ller. Fewer still have names that are only Christian name and occupation, such as Peder Remsnider. The citizens of Kolding often have triple names. There are several different types of such names:

- Jees S%frens%n F%nbo (Christian name, patronymic and former place of residence),
- Thomas Lauritzss%n Skr{dder (Christian name, patronymic and occupation),
- Mester Jens S%ffrenss%n (title, Christian name and patronymic),
- Chresten S%ffrens%n Kock (Christian name, patronymic and family name).

When the citizens have only double names the first is always the Christian name while the second can be either occupation, former place of residence, patronymic or family name.

In order to analyze a text like this you have to be able to identify the items of due such as 1 %rt. malt. Each item should be referred to the tenant, village, parish and district it belongs to. Within the item it is necessary to identify the amount and the kind of commodity. In order to be able to compare the single items with the relevant totals and grand totals it is desirable to be able to recalculate measures. To accomplish this you have to apply knowledge of ancient metrology. In a compound document like a fief ledger some of the relevant measures are found on the front page. In this case it would be an advantage to have the measures linked to the relevant calculation formulas given in the document.
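The identification steps described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposal for a tagset: the `Due` record, the `parse_dues` and `in_barrels` helpers and the regular expression are hypothetical names introduced for the example, the parser handles only the regular "amount unit commodity" items at the start of Pouell Iffuerss%n's list, and the conversion factors to barrels are merely read off the translation in note 12 rather than taken from any authoritative metrology. The place of the tenant in the hierarchy is left empty because it is not stated in the excerpt.

```python
import re
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass(frozen=True)
class Due:
    """One item of due, referred to the tenant and the places it belongs to."""
    tenant: str
    village: Optional[str]
    parish: Optional[str]
    district: Optional[str]
    amount: Fraction
    unit: str
    commodity: str


# amount, unit (with an optional trailing full stop), commodity
ITEM = re.compile(r"(\d+(?:\.\d+)?)\s+(\S+?)\.?\s+(\S+)")

# Barrels per unit, keyed by (unit, commodity).
# Assumption: factors derived from the translation in note 12 only.
TO_BARRELS = {
    ("%rt", "rug"): Fraction(5, 4),
    ("skp", "byg"): Fraction(1, 8),
    ("%rt", "havre"): Fraction(5, 2),
}


def parse_dues(tenant, text, village=None, parish=None, district=None):
    """Split a run of regular items into Due records."""
    return [
        Due(tenant, village, parish, district, Fraction(amount), unit, commodity)
        for amount, unit, commodity in ITEM.findall(text)
    ]


def in_barrels(due):
    """Recalculate a due in barrels, or return None if no factor is known."""
    factor = TO_BARRELS.get((due.unit, due.commodity))
    return None if factor is None else due.amount * factor


dues = parse_dues(
    "Pouell Iffuerss%n",
    "3 lpd. sm%r 0.5 %rt rug 10 skp. byg 0.5 %rt havre",
)
for due in dues:
    print(due.amount, due.unit, due.commodity, "->", in_barrels(due))
```

Keying the conversion table by unit and commodity together is a deliberate choice: going by note 12, the same measure name stands for different quantities of rye and of oats, so a table keyed by unit alone would recalculate some items wrongly.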
As fief ledgers were produced annually they may also form the basis for longitudinal studies. This could be particularly interesting if different lists in the fief ledger and maybe even from other sources were brought together. When using a time series of fief ledgers a particular tenant should first be found in the list of copyhold fees. After that he should appear in the cadastre each year with the same list of dues in the same place. When he disappears you will usually be able to find a record of his burial in the relevant parish register. Furthermore someone new will appear in the list of copyhold fees and take the former tenant's place in the cadastre. Actually the list of copyhold fees often tells which tenant is relieved by the newcomer:

    Oppeb}ret af Niels Jenss%n Raffn i Wilstrup til f{ste af 1 g}rd ibid. hans fader Jens Raffn afd%de: penge 17 dl. %ksne 1 (24)

The list of dues for the new tenant should be exactly the same as for the old. If examples of the opposite can be found in the fief ledgers it would be quite interesting because it would be proof of a violation of Danish law by the Crown. This does not mean that it is improbable to find such examples.

Apart from being found in the cadastre and the list of copyhold fees a tenant will also be found in taxation lists and maybe in the list of court proceedings. Further, the persons that are too poor to pay their dues and the 8 honorable dannem{nd who would have to testify to this fact are also found among the tenants in the cadastre. Thus if you have the means to link persons together you can obtain interesting contributions to tenants' biographies from the fief ledgers.

It is clearly important to have positional encoding for marginal comments. Further, it is necessary to have ways of expressing changes in hand or ink. Preferably an edition should identify the different hands involved, but this on the other hand is often dependent on an interpretation exercised by the encoder.
When marking up names it is necessary for analytical purposes to be able to distinguish between the different types of names. Again this distinction is naturally an interpretation made by the encoder. In some cases you may need a way to express an uncertainty about the type of name found. (I happen to know from the context that the Chresten S%ffrens%n Kock mentioned above belonged to a family called Kock. He was not a cook by profession.) When dealing with some document types it may be prudent to restrict yourself to the registration of surnames and not interpret their kind.

Much of the analysis that can be carried out on the basis of a cadastre is dependent on linking items inside and outside the cadastre. Again this emphasizes the need for having the means to mark up links between individual items.

Accounts or Ledgers

Account books have changed a lot over time. Presently accounts are usually a set of computer files, often in an undocumented internal format which is only readable by means of a dedicated piece of software. Formal bookkeeping as we know it from recent accounts and ledgers was common in the 19th century but practically not used in Northern Europe in the 17th, though the invention of bookkeeping by double entry goes back at least to 14th century Italy. The earliest known example is a Florentine account book from 1382 (25). By then bookkeeping by double entry may have been used for a century. The invention of bookkeeping by double entry is often ascribed to Luca Pacioli (26) because his book "Summa" (27) contains recommendations and instructions for bookkeeping by double entry (28).

The modern account book based on bookkeeping by double entry is not adequately described as a long string of characters divided into volumes, chapters and paragraphs. A much more adequate model is a coordinate system with time as one axis and accounts as the other. Every entry in the account book is fixed to a certain date, and comprises a denomination and an amount.
The amount is fixed to an account. Each entry is paired with another entry. The amounts of the two entries are of the same value; one is an income, the other one is an expense. (I know that I am being a little simplistic here, but the general idea is like that.)

Account books of the early modern period are more like texts than recent ledgers are. Early modern ledgers differ very much in size. An example of a minor ledger is Tangg}rd estate 1553 (29). This ledger comprises 29 written pages, which is about one fortieth of the size of the fief ledger of Koldinghus 1610/11.

Early modern ledgers consist of a number of lists each giving the account of a particular type of income. One such list can be a cadastre where the particular type is the spoils of landed wealth. The distinguishing feature of the cadastre is the source of the income. In other lists the distinguishing feature may be the character of the income such as the income of rye or income of money.

A simple example of one of the lists that constituted a ledger is quoted below. It is the income of money from honey from the fief ledger of Koldinghus 1610/11.

    Indt{gt for honning

    Indkommet og oppeb}ret for honning, som er solgt b}de b%nder og andre.

    Oppeb}ret af nogle b%nder af dette }rs landgildehonning, de ikke har ydet i rette tid efter mandtalsregisters lydelse medens af dem derfor er oppeb}ret penge, som er for 5 td 1 fj 0.5 stob honning, for hver t%nde 13 dl, er s} ungef{r:
    penge 68 dl. 1 mk. 8 sk dansk

    Oppeb}ret af {rlig og velbyrdig mand Caspar Marckdaner, lensmand her p} Koldinghus, for 3.5 td. 1 fj. landgildehonning, ham er solgt, t%nden for 13 dl., er s}:
    penge 48.5 dl 1.5 mk dansk

    Oppeb}ret udaf S%ffren Anderss%n i Kolding for 1 td. 1.5 fj. honning, ham er solgt, t%nden for 13 dl., er:
    penge 17.5 dl 1.5 mk dansk

    Summa oppeb}ret for 10 td. 1.5 fj. 0.5 stob honning, som er solgt:
    penge 134.5 dl 1.5 mk. 8 sk. dansk

    Findes liges} til udgift og blev liges} solgt forgangen }r (30).
The text considered here is fairly simple, because (we think that) we understand it completely. It was written in two hands, one for the actual accounting entries and one for the comment of the auditor. Other texts of a similarly formal nature may have structural features which are not that easily interpreted.

The typical analytical purposes for which this text may have been made machine readable in the first place would be to determine

- prices,
- the Crown's income,
- practices of scrutiny at Rentekammeret (the chancellory of the exchequer),
- actual exchange rates between coins, or
- trade between the fief and other parties such as
  - the tenants,
  - the feudatory (31), or
  - local merchants

and perhaps I have forgotten some. For all of these purposes most of the following items are of interest:

- commodity
- amount
- unit price
- total price
- grand total of amount of commodity
- grand total of amount of money
- buyer's name
- buyer's occupation

These items should be tagged.

A question that arises immediately is how big a rounding off is hidden behind the "ungef{r" in the second paragraph. Given that a mark is a quarter of a daler, a daler is 74 skilling, and a barrel is 60 stobe, a simple calculation will show that the farmers paid 1/60 skilling less than the full price of the honey they kept. (The coins provided no possibility for paying 1/60 skilling, as the smallest coin was a hvid with a value of 3 penning or 1/4 of a skilling.)

The roundings off throughout the ledger will present a problem when the totals are recalculated. In this example, though all honey was sold for 13 daler per barrel, 13 times the number of barrels sold does not give exactly the amount of money received.

There are no miscalculations on this page, though they are quite common in accounts of this period. The presence of miscalculations represents an interesting analytical problem because miscalculations will usually influence the next level of integration.
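The calculation behind the 1/60 skilling can be retraced as follows. This is a minimal sketch that uses only the rates stated above (1 daler = 4 mark = 74 skilling, 1 barrel = 60 stobe) together with the assumption that a fj. is a quarter of a barrel, as the translation "quartbarrel" in the notes suggests; the function names are hypothetical. Exact fractions are used so that the recalculation itself introduces no rounding.

```python
from fractions import Fraction

SK_PER_DL = Fraction(74)      # a daler is 74 skilling
SK_PER_MK = SK_PER_DL / 4     # a mark is a quarter of a daler
STOB_PER_TD = 60              # a barrel (td.) is 60 stobe
FJ_PER_TD = 4                 # assumption: a fj. is a quarter of a barrel


def money_in_sk(dl=0, mk=0, sk=0):
    """Convert a (dl. mk. sk.) triplet to skilling."""
    return Fraction(dl) * SK_PER_DL + Fraction(mk) * SK_PER_MK + Fraction(sk)


def honey_in_td(td=0, fj=0, stob=0):
    """Convert a (td. fj. stob) triplet to barrels."""
    return Fraction(td) + Fraction(fj) / FJ_PER_TD + Fraction(stob) / STOB_PER_TD


# The farmers' entry: 5 td 1 fj 0.5 stob at 13 dl per barrel,
# for which 68 dl. 1 mk. 8 sk was received.
unit_price = money_in_sk(dl=13)
quantity = honey_in_td(td=5, fj=1, stob=Fraction(1, 2))
full_price = quantity * unit_price
paid = money_in_sk(dl=68, mk=1, sk=8)
shortfall = full_price - paid

print("paid in skilling:", float(paid))
print("shortfall in skilling:", shortfall)
```

Run as written, the sketch reports a payment of 5058.5 skilling and a shortfall of 1/60 skilling, in agreement with the figure given in the text.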
Furthermore in the many cases where a miscalculation was not discovered by the auditor the miscalculation became part of the actual settlement between the Crown and the feudatory.

The tagging of numbers as given in the TEI-guidelines completely misses the actual significance of the numbers in a text like this one. Most of the numbers belong together in triplets: (dl. mk. sk.) or (td. fj. stob). A more useful way to tag them would be something like:

    <total-price AMOUNT = 5058.5 sk.> 68 dl. 1 mk. 8 sk. </total-price>

Furthermore the relationship between buyer, commodity, unit price and total price is important. In this text, each transaction is found in a two paragraph sequence, and all information given about the transaction is found in each sequence. This is by no means always the case. A simple example could be that the commodity was only mentioned in the heading of the chapter, and then implicit for the transactions following the heading.

The TEI-guidelines

In the TEI-guidelines texts are generally considered as objects in themselves. For the historian the texts are only of consequence when they reveal information on a reality outside the text itself. The historian is not primarily occupied with describing the text as such, but aims at making it useful for a specific research project or for research projects in general. The production of machine readable source editions (and source editions in general) is always influenced by this distinction between what the text looks like, and what potential it has for illuminating a reality outside the text itself.

In a source edition made for historical research it is usually not enough to state the fact that some part of the text is a number. It is often more interesting to know that the number is an election return or a price. Furthermore objects in the text have relationships to each other. The number is perhaps an election return and a word somewhere else in the text is the party concerned.
Any edition is an interpretation, and the interpretation is made with reference to particular research questions. The ideal edition should then foresee all possible research questions which could be put to a particular source. This is not even theoretically possible.

One way to try to circumvent this problem is to reduce the amount of interpretation by representing as many physical features as possible of the original source. By restricting the markup to physical features only you should in theory be able to give the basis for the interpretation to be done on the text. This is due to the fact that the interpretation of an item in a text naturally is dependent on the physical existence and appearance of that item. If you on the other hand in this mode of saintly purity restrict yourself to physical description only and do not interpret some parts of the text as names, dates etc. you may seriously diminish the source edition's adaptability to most known lines of research. This dilemma is naturally unsolvable and can only be handled by exercising sound judgement and understanding of the problems involved when making a source edition.

A source will never be completely replaced by an edition. The basic fact that the edition is materially something else than the source implies that they will always be different to some extent.

The TEI-guidelines provide for extensions for particular types of texts. The early modern ledgers are clearly of a very special nature. The extensibility of the TEI-guidelines makes them adaptable to almost everything. Unfortunately it is very likely that any particular research project will make its own extension scheme. Furthermore the number of ways of making incompatible extension schemes is very large, perhaps even larger than the number of ways of interpreting a text differently.

Above I have pointed out a number of problems that need to be considered in a tagset to be used in historical research.
Some of them are structural, caused by the sources having features that differ from the features of traditional texts.

Among the tags that are needed are tags for expressing positional relationships. This can be absolute positions on the page in order to decide for instance that something is a marginal note, or it can be relative positions in order to decide what a note relates to. Naturally the need for tagging relative positions becomes void if everything is tagged with absolute positions. A semi-weird positional relationship is the imaginary relationship that is used when you are giving the position of an amount in the coordinate system model of the modern ledger.

Another structural feature that makes the sources mentioned here differ from texts in general is the interdependence of items found in different places in the text. An example of this is the dependence of a total in the middle of the text on the conversion formulas on the front page.

Notes

1. Fief is used as translation of the Danish (and Germanic) word len. The fiefs were the most important administrative units in Denmark from the 15th to the 17th century. After 1536 most feudatories had to submit an annual ledger stating the Crown's income from taxation and landed wealth in the fief and the expenses being covered by the fief. The majority of these ledgers remain from the period 1600 to 1660, and are naturally sources of great interest for economic and political history.
2. Koldinghus was the royal castle at the town Kolding. The town is located at the border between the Danish kingdom and the duchy Slesvig. Though Slesvig was also a part of Denmark Koldinghus was in a way a border castle. After the Prussian occupation of Slesvig Kolding became in fact a border town. By that time Koldinghus was a ruin. It was burned down in the Napoleonic Wars.
3. Birgitte Dedenroth-Schou and Anemette S. Christensen (eds.): Koldinghus Lens Regnskab, 1610-11, K%benhavn 1984
4. DDA-0755: Lensregnskab fra Koldinghus 1610-11 (Koldinghus Fief Ledgers 1610-11)
5. The society for the publication of sources to Danish history
6. I would like to make explicit the fact that I personally had no connection with this project. The data base was designed and the principles for the transcription were decided upon before I became connected with the data archive
7. In old Danish: sagefald
8. Danish: stedsm}l
9. In this category you find what we in Danish call tingsvidner
10. This was the case of the duchy L%venholm in eastern Jutland. H. H. Fussing: Gjesingholm 1609-63, Historisk Tidsskrift 10. Rk. 3, 1934-36, p. 76
11. Danish: indt{gtsregnskab
12. On the top of folio 12 of the ledger, Dedenroth-Schou and Christensen (1984) p. 20. The item translates: Peter Iverson 48 pounds of butter .625 barrels of rye 1.25 barrels of barley 1.25 barrels of oats feeding of .5 steer 1 pork 1 day of visiting (feeding and lodging somebody at the lord's choice)
13. Hardt parish and village
14. Total of all dues of Hardt parish
15. 1 quartbarrel of butter .625 barrel of rye
16. I have no idea what kv{rst is and I have never seen it anywhere else. Neither does the total for Hardt parish mention any kv{rst. Thus it is my opinion that kv{rst is a mistake. 47.5 shillings per annum
17. .625 barrel of rye 1.5 barrel of malt
18. District is my chosen translation of the Danish word herred
19. Grand total of all dues paid by the tenants in the district of Brusk
20. Hans Scotsman for 1 kitchen garden which was previously used (by the castle) 8 shillings Danish
21. Master grand total of the annual certain income from the cadastre of Koldinghus at May 1st of the year 1611
22. Dedenroth-Schou and Christensen (1984) p. 10
23. Is the same as last year's cadastre total, when you take into account how much the cadastre has been increased and diminished according to the list of changes that follows the ledger and is signed A.
24. On page 509. Received from Niels Jensson Raven in Vilstrup as a copyhold fee for a farm there from which his father Jens Raven died
25. Will Durant: Verdens Kulturhistorie 14, Vojens 1971, p. 104
26. Erling Bj%l: Fra Urtid til Nutid, Politikens verdenshistorie 1, K%benhavn 1982
27. Luca Pacioli: Summa de arithmetica, geometria, proportioni et proportionalita, written 1487, published 1494 in Venice.
28. Juschkewitsch, A.P.: Geschichte der Mathematik im Mittelalter, Leipzig 1964, p. 430
29. Knud Hornbech and Helge Land Hansen: Tangg}rd gods 1553-1559, Odense 1980, p. 146-159
30. Quoted from Dedenroth-Schou and Christensen (1984) p. 108. The text translates roughly:

    Income of honey

    Received and earned for honey, which is sold to farmers and others.

    Earned from some farmers of this year's honey from manorial dues, which they have not paid at the right time as stated in the census, while some of them have paid money instead, which is for 5 barrels 1 quartbarrel 0.5 stob, for each barrel 13 dollars, is around:
    money 68 dollars 1 mark 8 shillings Danish

    Earned from honest and noble man Caspar Markdanner, feudatory here on Koldinghus, for 3.5 ba. 1 qb. honey from the manorial dues, sold to him, the barrel for 13 dl., is then:
    money 48.5 dl. 1.5 mk. Danish

    Earned from S%ffren Anderss%n in Kolding for 1 ba. 1.5 qb. honey, sold to him, the barrel for 13 dl., is:
    money 17.5 dl. 1.5 mk. Danish

    Summa earned from 10 ba. 1.5 qb. 0.5 stob honey, which is sold:
    money 134.5 dl. 1.5 mk 8 sk. Danish

    Is found in the expenses as well and was sold in the same way last year.
31. In Danish: lensmand