@c -*-texinfo-*- @node Journal File Format,Extending with Python, Format Strings, Top @chapter LEDGER Journal File Format This chapter offers a complete description of the journal data format, suitable for implementors in other languages to follow. For users, the chapter on keeping a journal is less extensive, but more typical of common usage (@pxref{Keeping a Journal}). Data is collected in the form of @dfn{transactions} which occur in one or more @dfn{journal files}. Each transaction, in turn, is made up of one or more @dfn{postings}, which describe how @dfn{amounts} flow from one @dfn{account} to another. Here is an example of the simplest of journal files: @example 2010/05/31 Just an example Expenses:Some:Account $100.00 Income:Another:Account @end example In this example, there is a transaction date, a payee, or description of the transaction, and two postings. The postings show movement of one hundred dollars from an account within the Income hierarchy, to the specified expense account. The name and meaning of these accounts in arbitrary, with no preferences implied, although you will find it useful to follow standard accounting practice (@pxref{Principles of Accounting}). Since an amount is missing from the second posting, it is assumed to be the inverse of the first. This guarantee the cardinal rule of double-entry accounting: the sum of every transaction must balance to zero, or it is in error. Whenever Ledger encounters a @dfn{null posting} in a transaction, it uses it to balance the remainder. It is also typical---though not enforced---to think of the first posting as the destination, and the final as the source. Thus, the amount of the first posting is typically positive. Consider: @example 2010/05/31 An income transaction Assets:Checking $1,000.00 Income:Salary 2010/05/31 An expense transaction Expenses:Dining $100.00 Assets:Checking @end example @emph{Note:} It is important to note that there must be at least two spaces between the end of the post and the beginning of the amount (including and commdity designator). @section Specifying amounts The heart of a journal is the amounts it records, and this fact is reflected in the diversity of amount expressions allowed. All of them are covered here, though it must be said that sometimes, there are multiple ways to achieve a desired result. @subsection Integer amounts In the simplest form, bare decimal numbers are accepted: @example 2010/05/31 An income transaction Assets:Checking 1000.00 Income:Salary @end example Such amounts may only use an optional period for a decimal point. These are referred to as @dfn{integer amounts} or @dfn{uncommoditized amounts}. In most ways they are similar to @dfn{commoditized amounts}, but for one signficant difference: They always display in reports with @dfn{full precision}. More on this in a moment. For now, a word must be said about how Ledger stores numbers. Every number parsed by Ledger is stored internally as an infinite-precision rational value. Floating-point math is never used, as it cannot be trusted to maintain precision of values. So, in the case of @samp{1000.00} above, the internal value is @samp{100000/100}. While rational numbers are great at not losing precision, the question arises: How should they be displayed? A number like @samp{100000/100} is no problem, since it represents a clean decimal fraction. But what about when the number @samp{1/1} is divided by three? How should one print @samp{1/3}, an infinitely repeating decimal? Ledger gets around this problem by rendering rationals into decimal at the last possible moment, and only for display. As such, some rounding must, at times, occur. If this rounding would affect the calculation of a running total, special accommodation postings are generated to make you aware it has happened. In practice, it happens rarely, but even then it does not reflect adjustment of the @emph{internal amount}, only the displayed amount. What has still not been answered is how Ledger rounds values. Should @samp{1/3} be printed as @samp{0.33} or @samp{0.33333}? For commoditized amounts, the number of decimal places is decided by observing how each commodity is used; but in the case of integer amounts, an arbitrary factor must be chosen. Initially, this factor is six. Thus, @samp{1/3} is printed back as @samp{0.333333}. Further, this rounding factor becomes associated with each particular value, and is carried through mathematical operations. For example, if that particular number were multiplied by itself, the decimal precision of the result would be twelve. Addition and subtraction do not affect precision. Since each integer amount retains its own display precision, this is called @dfn{full precision}, as opposed to commoditized amounts, which always look to their commodity to know what precision they should round to, and so use @dfn{commodity precision}. @subsection Commoditized amounts A @dfn{commoditized amount} is an integer amount which has an associated commodity. This commodity can appear before or after the amount, and may or may not be separated from it by a space. Most characters are allowed in a commodity name, except for the following: @itemize @item Any kind of whitespace @item Numerical digits @item Punctuation: @samp{.,;:?!} @item Mathematical and logical operators: @samp{-+*/^&|=} @item Bracketing characters: @samp{<>[]()}@{@} @item The at symbol: @samp{@@} @end itemize And yet, any of these may appear in a commodity name if it is surrounded by double quotes, for example: @example 100 "EUN+133" @end example If a @dfn{quoted commodity} is found, it is displayed in quotes as well, to avoid any confusion as to which part is the amount, and which part is the commodity. Another feature of commoditized amounts is that they are reported back in the same form as parsed. If you specify dollar amounts using @samp{$100}, they will print the same; likewise with @samp{100 $} or @samp{$100.000}. You may even use decimal commas, such as @samp{$100,00}, or thousand-marks, as in @samp{$10,000.00}. These display characteristics become associated with the commodity, with the result being that all amounts of the same commodity are reported consistently. Where this is most noticeable is the @dfn{display precision}, which is determined by the most precise value seen for a given commodity. In most cases. Ledger makes a distinction by @dfn{observed amounts} and unobserved amounts. An observed amount is critiqued by Ledger to determine how amounts using that commodity should be displayed; unobserved amounts are significant in their value only---no matter how they are specified, it does not change how other amounts in that commodity will be displayed. An example of this is found in cost expressions, covered next. @section Posting costs You have seen how to specify either a commoditized or an integer amount for a posting. But what if the amount you paid for something was in one commodity, and the amount received was another? There are two main ways to express this: @example 2010/05/31 Farmer's Market Assets:My Larder 100 apples Assets:Checking $20.00 @end example In this example, you have paid twenty dollars for one hundred apples. The cost to you is twenty cents per apple, and Ledger calculates this implied cost for you. You can also make the cost explicit using a @dfn{cost amount}: @example 2010/05/31 Farmer's Market Assets:My Larder 100 apples @@ $0.200000 Assets:Checking @end example Here the @dfn{per-unit cost} is given explicitly in the form of a cost amount; and since cost amount are @emph{unobserved}, the use of six decimal places has no effect on how dollar amounts are displayed in the final report. You can also specify the @dfn{total cost}: @example 2010/05/31 Farmer's Market Assets:My Larder 100 apples @@@@ $20 Assets:Checking @end example These three forms have identical meaning. In most cases the first is preferred, but the second two are necessary when more than two postings are involved: @example 2010/05/31 Farmer's Market Assets:My Larder 100 apples @@ $0.200000 Assets:My Larder 100 pineapples @@ $0.33 Assets:My Larder 100 "crab apples" @@ $0.04 Assets:Checking @end example Here the implied cost is @samp{$57.00}, which is entered into the null posting automatically so that the transaction balances. @subsection Primary commodities In every transaction involving more than one commodity, there is always one which is the @dfn{primary commodity}. This commodity should be thought of as the exchange commodity, or the commodity used to buy and sells units of the other commodity. In the fruit examples above, dollars are the primary commodity. This is decided by Ledger on the placement of the commodity in the transaction: @example 2010/05/31 Sample Transaction Expenses 100 secondary Assets 50 primary 2010/05/31 Sample Transaction Expenses 100 secondary @@ 0.5 primary Assets 2010/05/31 Sample Transaction Expenses 100 secondary @@@@ 50 primary Assets @end example The only case where knowledge of primary versus secondary comes into play is in reports that use the @option{-V} or @option{-B} options. With these, only primary commodities are shown. If a transaction uses only one commodity, this commodity is also considered a primary. In fact, when Ledger goes about ensures that all transactions balance to zero, it only ever asks this of primary commodities.