Long time ago, since I’m got my first job, I have maintained a personal budget. I need to understand how much I spent, for which budget items I spent my money and so on. Also, it is very pleasant to look on numbers with my income (and, not so pleasant — to look at my expenses).

Pen and notepad

At first, I used a fountain pen and a simple notepad to write down planned expenses and planned remainder. Budget items were determined by themselves; I didn’t think about them for a long time:

  • communications: Internet and mobile phone
  • food
  • leisure: here is everything I take break from — from cinema and restaurants to buying thermal paste
  • household: usually there are different household chemicals
  • rent: flat rent
  • health: pharmacies, doctor visit and another «entertainments» for those over 30
  • transport: buses, metro, taxi
  • and so on.
|----------------+------------------+-------------------|
| Budget item    | Planned expenses | Planned remainder |
|----------------+------------------+-------------------|
| Communications |              500 |             13000 |
| Food           |            10000 |              3000 |
| Household      |             2000 |              1000 |
| Transport      |             1000 |                 0 |
|----------------+------------------+-------------------|

As you can see, I’m using «envelopes» method for expenses planning. But I have virtual envelopes — there are budget items in the table.

Quite quickly, I realized that I also need to write down numbers with real expenses in this table. In real life, it rarely happens what you spend the same amount of money on health or food as you planned — and then you sit and wonder why your budget item went into the red.

I calculated these expenses manually at the end of every two week. Just sit down with calculator and opened list of expenses in the Internet bank.

Emacs and Org Mode

After a couple of years, I got tired of counting expenses with a calculator. In addition, I also wanted to see the real balance of budget items, updated in real time, not every two weeks. And the total balance of funds, calculated with data from table. With this data, I can control the correctness of the entered expenses.

To do this, I needed some kind of table processor, not just notepad with pen. Because I like to use console and Emacs — I turned my attention to Org Mode. With this I can make tables and use formulas — both spreadsheet formulas and Emacs Lisp formulas.

After reading the manuals, I write these tables for my two-week budget:

#+NAME: incoming
|---+----------------+--------|
|   | Incoming       | Amount |
|---+----------------+--------|
| # | Salary         |   7000 |
| # | Past food      |    900 |
| # | Past household |    289 |
| # | Past transport |    170 |
| # | Past health    |    322 |
|   | All:           |   8681 |
| ^ |                |    all |
|---+----------------+--------|
#+TBLFM: $all=vsum(@2$3..@>>>$3)

#+NAME: budget
|---+---------------+----------+-----------+---------------+----------------|
|   | Expenditure   | Spending | Remainder | Real spending | Real remainder |
|---+---------------+----------+-----------+---------------+----------------|
| # | HCS           |     1000 |      7681 |               |           1000 |
| # | Rent          |     2000 |      5681 |               |           2000 |
| # | Communication |      600 |      5081 |               |            600 |
| # | Food          |     2590 |      2491 |           374 |           2216 |
| # | Household     |      400 |      2091 |           100 |            300 |
| # | Leisure       |     1000 |      1091 |           449 |            551 |
| # | Transport     |      500 |       591 |            80 |            420 |
| # | Health        |      591 |         0 |           100 |            491 |
|---+---------------+----------+-----------+---------------+----------------|
|   |               |          |         0 |      Overall: |           7578 |
| ^ |               |          |         0 |               |        overall |
|---+---------------+----------+-----------+---------------+----------------|
#+TBLFM: $6=$3-$5::@2$4=remote(incoming, $all)-@2$3::@3$4..@>$4=@-1$4-@0$3::$overall=vsum(@II..III$6)

Also, there are two manually filled tables — first with savings data and second with expenses statistics per month.

It works as follows. Income data was entered to the table named incoming and their sum automatically calculated in the all cell. Based on this sum, the planned balance for budget items was calculated. Also, the real account balance in the overall cell from budget table was calculated too.

For some time, this automation suited me. I just look at the expenses in notifications from the bank, sum the numbers for the desired category and update the corresponding cell with expenses in the table.

But there were also some inconveniences that I would still like to automate. First, sometimes I forgot to enter (or sum) expenses, causing the numbers in the corresponding row of the table to diverge from reality. Due to the fact that data for each transaction was not saved here, it was difficult to correct such errors and I had to “guess” what and where I forgot.

Secondly, it was difficult to keep statistics on expenses by month — every two weeks, I should manually sum the amounts and update the data in the table of expenses by month. The same goes for savings. Moreover, there was no adequate access to tables for previous months — there was no Org Mode machine-friendly interface for working with archived tables. Except that write many and many really complex regular expressions which will parse dates and data.

Hledger and SQL-scripts

At this point, I thought about budgeting in some simpler way. I had already heard something about plain-text accounting and read post about it in Vas3k.Club.

The ledger didn’t work out for the first time — the manuals and HOWTOs were overcomplicated with examples of processing credit accounts, stock accounts, debts and so on. I’m almost never used all of these in my life (I’m living in one of CIS countires and usually we do not have enough money to use credits or stocks like financial instruments here). Also, all of the guides assumed that I was budgeting on a rolling basis — without two-week periods, like I do. This make difficult to translate my spreadsheets into a continuous list of expenses. Also, it was not at all obvious how to adapt «money envelopes» system for ledger.

Luckily, I came across hledger and understood the principles of how it works. It turned out to be quite simple. Hledger is not a “silver bullet” that automatically calculates all the necessary things for you. It is like a small database, that allows you to SELECT data from financial transactions, filter them by account name or by date. And with a nice ability to convert from one currency to another “on the fly”.

Everything else builds on top of this functionality.

“Money envelopes” in the hledger

To make the “money envelope” technique work in hledger, I had to work hard. Fortunately, there were people, who were already using it and who knew plain-text accounting. I took the main idea from here.

First I described all my expense accounts:

account expenses                ; type: X
account expenses:hcs            ; type: X
account expenses:rent           ; type: X
account expenses:communication  ; type: X
account expenses:food           ; type: X
account expenses:household      ; type: X
account expenses:leisure        ; type: X
account expenses:transport      ; type: X
account expenses:psychotherapy  ; type: X
account expenses:health         ; type: X

And the main account from which the money is debited: main account:rub. At this stage, everything working according to the manual — we just write how much money went into the expense account and how much money left the main account.

Trying to add separate accounts for budgeting — broke the whole system for me, hledger balance showed some numbers in the report which are disconnected from the real life.

Fortunately, virtual postings saved me. They, and the rules on binding expenses and virtual accounts — helped to avoid strange numbers in the report. And a necessity to specify budget accounts for each transaction.

account budget:hcs            ; type: X
account budget:rent           ; type: X
account budget:communication  ; type: X
account budget:food           ; type: X
account budget:household      ; type: X
account budget:leisure        ; type: X
account budget:transport      ; type: X
account budget:psychotherapy  ; type: X
account budget:health         ; type: X

= ^expenses:hcs
    (budget:hcs)                *-1
= ^expenses:rent
    (budget:rent)               *-1
= ^expenses:communication
    (budget:communication)      *-1
= ^expenses:food
    (budget:food)               *-1
= ^expenses:household
    (budget:household)          *-1
= ^expenses:leisure
    (budget:leisure)            *-1
= ^expenses:transport
    (budget:transport)          *-1
= ^expenses:psychotherapy
    (budget:psychotherapy)      *-1
= ^expenses:health
    (budget:health)             *-1

Another difficulties were that every two weeks I fill the new budget and create the new envelopers. Also, I have a lot of archive data from Org Mode, where there is no 100% convergence of the budget — because transfers to savings are not reflected anywhere. And calculating them was difficult and tedious. Even if I calculate them, then hledger’s report would again show unrelated to reality numbers.

I got rid of the weird numbers using the -b YYYY-MM-DD and -e YYYY-MM-DD keys, with which hledger only looks at data for a few weeks of the budget iteration. So that it doesn’t get confused about income, savings and expenses at the beginning of each budget period, I use a special script to add explicitly specified savings amounts and planned expenses for new budget period to the file:

2023-12-14 "" | Balancing
    savings:touching     340.08 RUB
    savings:emergency    793.29 RUB
    savings:investments  0 RUB
    savings:foreign     $1
    savings:foreign      1 EUR
    equity:fix

2023-12-14 "" | Salary
    income:paycheck:job       -6000 RUB
    income:from past          -1489 RUB
    main account:rub

2023-12-14 "" | Budgeting
    (budget:hcs)            500 RUB
    (budget:rent)           2000 RUB
    (budget:food)           2500 RUB
    (budget:communication)  600 RUB
    (budget:household)      1000 RUB
    (budget:leisure)        1000 RUB
    (budget:transport)      500 RUB
    (budget:psychotherapy)  1000 RUB
    (budget:health)         1000 RUB

The report should include all budget and expense accounts, even if I haven’t spent anything in a couple of weeks. So, I should to add transactions with zero amounts:

2023-12-14 HCS
    expenses:hcs        0 RUB
    main account:rub    0 RUB

2023-12-14 Rent
    expenses:rent       0 RUB
    main account:rub    0 RUB

2023-12-14 Communication
    expenses:communication    0 RUB
    main account:rub          0 RUB

2023-12-14 Food
    expenses:food       0 RUB
    main account:rub    0 RUB

2023-12-14 Household
    expenses:household    0 RUB
    main account:rub      0 RUB

2023-12-14 Leisure
    expenses:leisure      0 RUB
    main account:rub      0 RUB

2023-12-14 Transport
    expenses:transport    0 RUB
    main account:rub      0 RUB

2023-12-14 Psychotherapy
    expenses:psychotherapy    0 RUB
    main account:rub          0 RUB

2023-12-14 Health
    expenses:health     0 RUB
    main account:rub    0 RUB

Roubles exchange

I have some accounts in different currencies, which makes processing the data more difficult. Hledger can help here — it can convert amounts from dollars and euros to roubles. Just use parameter ==–value=then,RUB=, for example:

hledger -s --value=then,RUB reg -M -E -O csv 'expenses'

A separate exchange rate file must be attached to the data file. This is a simple text file that looks like this:

P 2023-12-12 $ 90.9846 RUB
P 2023-12-12 EUR 98.0769 RUB
P 2023-12-14 $ 89.8926 RUB
P 2023-12-14 EUR 96.9500 RUB
P 2023-12-15 $ 89.6741 RUB
P 2023-12-15 EUR 97.7377 RUB
P 2023-12-16 $ 89.6966 RUB
P 2023-12-16 EUR 98.4186 RUB

Of course, you don’t have to fill it in manually — there are bash, curl and xsltproc for that. Exchange rates can be obtained from the website of the Central Bank of the Russian Federation, as they provide usable XML1. For example, to get the dollar exchange rate for the 2022-12-12 date, you need to run the query: https://cbr.ru/scripts/XML_dynamic.asp?date_req1=12/12/2023&date_req2=12/12/2023&VAL_NM_RQ=R01235. The following XML will be returned in response:

<ValCurs ID="R01235" DateRange1="12.12.2023" DateRange2="12.12.2023" name="Foreign Currency Market Dynamic">
    <Record Date="12.12.2023" Id="R01235">
        <Nominal>1</Nominal>
        <Value>90,9846</Value>
        <VunitRate>90,9846</VunitRate>
    </Record>
</ValCurs>

You can extract data from the necessary node using xsltproc and this XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="ValCurs">
    <xsl:for-each select="Record">
      <xsl:value-of select="Value"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

A bash script puts it all together:

#!/usr/bin/env bash

declare -A CURRENCY_CODES=(["$"]="R01235" ["EUR"]="R01239")
CBR_DATE=$(date '+%d/%m/%Y')
CBR_XSLT=$(mktemp /tmp/cbr.XXXXXX.xslt)

JOURNAL="$HOME/path/to/exchange_rates.journal"
JOURNAL_DATE=$(date '+%Y-%m-%d')

cat << EOF > "$CBR_XSLT"
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="ValCurs">
    <xsl:for-each select="Record">
      <xsl:value-of select="Value"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

for curr in "${!CURRENCY_CODES[@]}"; do
    URL="https://cbr.ru/scripts/XML_dynamic.asp?date_req1=$CBR_DATE&date_req2=$CBR_DATE&VAL_NM_RQ=${CURRENCY_CODES[$curr]}"
    EXCHANGE_RATE=$(curl -s "$URL" | xsltproc "$CBR_XSLT" - | sed 's/,/./g')
    if [ ! -n "$EXCHANGE_RATE" ]; then
        continue
    fi
    echo "P $JOURNAL_DATE $curr $EXCHANGE_RATE RUB" >> "$JOURNAL"
done

rm -f "$CBR_XSLT"

This script is run as cron job every day. As a result, I always have fresh exchange rates every day!

Spending statistics for each month

Now comes the fun part — extending hledger’s capabilities.

Hledger can export its reports to CSV and SQLite can parse this CSV into tables in an in-memory database. This makes further processing of the whole data much easier, as SQL offers more possibilities than hledger.

I quickly found that I could get the expenses by month with the following command:

hledger -s --value=then,RUB reg -M -E -O csv 'expenses'

The file path does not need to be specified if it is already set in the LEDGER_FILE environment variable.

But the output of this command (even if you remove -O csv) is not very readable:

"txnidx","date","code","description","account","amount","total"
...
"0","2023-11-01","","","expenses:leisure","432.00 RUB","1437564.28 RUB"
"0","2023-11-01","","","expenses:psychotherapy","998.00 RUB","1454562.28 RUB"
"0","2023-11-01","","","expenses:transport","57.00 RUB","1480019.28 RUB"
"0","2023-12-01","","","expenses:leisure","0","1481009.28 RUB"
"0","2023-12-01","","","expenses:psychotherapy","237.96 RUB","1503697.24 RUB"
"0","2023-12-01","","","expenses:transport","0","1503697.24 RUB"
...

It is also not suitable for SQLite — no quotes are needed. And there are only three columns for further calculations — date, account and amount. The suffix RUB is not needed either — all amounts are already in rubles. Fortunately, the magic of AWK comes to the rescue:

awk 'BEGIN {FS=","; OFS=","}
{
    gsub(/"/, "", $0);
    gsub(" RUB", "", $0);
    print $2, $5, $6;
}'

As a result, hledger output becomes:

date,account,amount
...
2023-11-01,expenses:leisure,432.00
2023-11-01,expenses:psychotherapy,998.00
2023-11-01,expenses:transport,57.00
2023-12-01,expenses:leisure,0
2023-12-01,expenses:psychotherapy,237.96
2023-12-01,expenses:transport,0
...

This CSV can already be fed into sqlite3 for further processing the resulting table expenses with SQL:

sqlite3 -header -csv ':memory:' '.import --csv /dev/stdin expenses' "

Here, after the double quote, is the SQL that converts the raw data from the table expenses into the table I need:

  1. First, I sort the expenses table by date and by expense item, so that the entries in the table were exactly in right order:

    WITH ordered_expenses AS (
        SELECT * FROM expenses
        ORDER BY date, account),
    
    +------------+------------------------+--------+
    |    date    |        account         | amount |
    +------------+------------------------+--------+
    | 2023-11-01 | expenses:leisure       | 432.00 |
    | 2023-11-01 | expenses:psychotherapy | 998.00 |
    | 2023-11-01 | expenses:transport     | 57.00  |
    | 2023-12-01 | expenses:leisure       | 0      |
    | 2023-12-01 | expenses:psychotherapy | 237.96 |
    | 2023-12-01 | expenses:transport     | 0      |
    +------------+------------------------+--------+
    
  2. Then I use the window function so that the account names go along the X-axis and the dates stay on Y-axis. Something like that:

    |------------+---------+-----------+-----|
    |       date | leisure | transport | hcs |
    |------------+---------+-----------+-----|
    | 2023-11-01 |         |           |     |
    | 2023-12-01 |         |           |     |
    |------------+---------+-----------+-----|
    

    The following query creates a table that is sequentially populated with expenses data:

    expenses4gnuplot AS (
    SELECT date,                                           -- Reversed alphanumeric sorting here:
        lag(amount, 0) OVER expenses_win AS transport,     -- expenses:transport
        lag(amount, 1) OVER expenses_win AS rent,          -- expenses:rent
        lag(amount, 2) OVER expenses_win AS psychotherapy, -- expenses:psychotherapy
        lag(amount, 3) OVER expenses_win AS leisure,       -- expenses:leisure
        lag(amount, 4) OVER expenses_win AS household,     -- expenses:household
        lag(amount, 5) OVER expenses_win AS health,        -- expenses:health
        lag(amount, 6) OVER expenses_win AS hcs,           -- expenses:hcs
        lag(amount, 7) OVER expenses_win AS food,          -- expenses:food
        lag(amount, 8) OVER expenses_win AS communication, -- expenses:communication
        row_number() OVER expenses_win AS rownumber
    FROM ordered_expenses
    WINDOW expenses_win AS(PARTITION BY date)
    ORDER BY date, account, rownumber),
    

    Result looks like this:

    +------------+-----------+---------------+---------+-----------+
    |    date    | transport | psychotherapy | leisure | rownumber |
    +------------+-----------+---------------+---------+-----------+
    | 2023-11-01 | 432.00    |               |         | 1         |
    | 2023-11-01 | 998.00    | 432.00        |         | 2         |
    | 2023-11-01 | 57.00     | 998.00        | 432.00  | 3         |
    | 2023-12-01 | 0         |               |         | 1         |
    | 2023-12-01 | 237.96    | 0             |         | 2         |
    | 2023-12-01 | 0         | 237.96        | 0       | 3         |
    +------------+-----------+---------------+---------+-----------+
    
  3. Obviously, I only need rows where rownumber can be divided to 3. To select them — I store the dates and the corresponding maximum rownumber in a separate table last. It is clear that the list of expenditure items for each month must match, otherwise everything will break. For that I’m enter an expense of 0 rubles for each budget iteration every two weeks.

    Query for last table:

    last AS (
    SELECT date,
        max(rownumber) AS maxrownumber
    FROM expenses4gnuplot GROUP BY date)
    
    +------------+--------------+
    |    date    | maxrownumber |
    +------------+--------------+
    | 2023-11-01 | 3            |
    | 2023-12-01 | 3            |
    +------------+--------------+
    
  4. Finally, I can merge the expenses and the last tables. If use sqlite3 option ==–table= instead of --csv, then I get this nice table:

    +------------+---------+---------------+-----------+
    |    Date    | Leisure | Psychotherapy | Transport |
    +------------+---------+---------------+-----------+
    | 2023-11-01 | 432.00  | 998.00        | 57.00     |
    | 2023-12-01 | 0       | 237.96        | 0         |
    +------------+---------+---------------+-----------+
    

    With the next SQL-query:

    SELECT e4g.date AS Date,
        e4g.communication AS Communication,
        e4g.food AS Food,
        e4g.hcs AS HCS,
        e4g.health AS Health,
        e4g.household AS Household,
        e4g.leisure AS Leisure,
        e4g.psychotherapy AS Psychotherapy,
        e4g.rent AS Rent,
        e4g.transport AS Transport
    FROM expenses4gnuplot AS e4g
    JOIN last ON e4g.date = last.date AND e4g.rownumber = last.maxrownumber
    ORDER BY e4g.date;
    

Savings data

With saving by month is exactly the same as with expenses by month. But we should query data from hledger by the savings accounts.

New budget planning

New budget plans via bash-script with dialog:

Start date of new budget selection

Fields for savings amounts

Sum of incoming salary plus remainders from previous budget iteration

Budgeting

Checking that we haven’t gone into the red

I realized it simple enough:

  • Receive necessary data from user via dialog forms — dates, savings amounts, spending plan for budget.

  • To avoid having to enter a lot of data, I extract savings amounts from the previous budget iteration and paste it into the form:

    SAVINGS_DATA=$(hledger -s --value=then,RUB reg -E -O csv 'savings' |
        grep -v 'savings:foreign' |
        awk 'BEGIN {FS=","; OFS=","}
        {
            gsub(/"/, "", $0);
            gsub(" RUB", "", $0);
            print $2, $5, $6;
        }' |
        sqlite3 -csv ':memory:' '.import --csv /dev/stdin savings' "
        WITH ordered_savings AS (
            SELECT date, account, amount FROM savings
            WHERE date = (SELECT max(date) FROM savings)
            ORDER BY date, account),
        savings4gnuplot AS (
            SELECT date,                                        -- Reversed alphanumeric sorting here:
                lag(amount, 0) OVER savings_win AS touching,    -- savings:touching
                lag(amount, 1) OVER savings_win AS investments, -- savings:investments
                lag(amount, 2) OVER savings_win AS emergency,   -- savings:emergency
                row_number() OVER savings_win AS rownumber
            FROM ordered_savings
            WINDOW savings_win AS(PARTITION BY date)
            ORDER BY date, account, rownumber),
        last AS (
            SELECT date,
                max(rownumber) AS maxrownumber
            FROM savings4gnuplot GROUP BY date)
        SELECT s4g.touching||' '||s4g.emergency||' '||s4g.investments
        FROM savings4gnuplot AS s4g
        JOIN last ON s4g.date = last.date AND s4g.rownumber = last.maxrownumber
        ORDER BY s4g.date;" |
        sed 's/\"//g')
    
    OLDIFS="$IFS"
    read -ra SAVINGS_ARRAY <<< "$SAVINGS_DATA"
    IFS="$OLDIFS"
    
    SAVINGS=(${SAVINGS_ARRAY[0]} ${SAVINGS_ARRAY[1]} ${SAVINGS_ARRAY[2]} 0 0)
    BUDGET_SAVINGS=($($DIALOG --form "Savings information:" 11 43 5 \
        "Touching fund (RUB)" 1 1 "${SAVINGS[0]}" 1 22 15 10 \
        "Emergency fund (RUB)" 2 1 "${SAVINGS[1]}" 2 22 15 10 \
        "Investments (RUB)" 3 1 "${SAVINGS[2]}" 3 22 15 10 \
        "USD" 4 1 "${SAVINGS[3]}" 4 22 15 10 \
        "EUR" 5 1 "${SAVINGS[4]}" 5 22 15 10 \
        2>&1 1>&3))
    
  • Simply redirect results, in a form suitable for hledger, to a file $LEDGER_FILE.


  1. Technical description here: https://cbr.ru/development/SXML/