I’m trying to get my own personal finances in order and would love to get the job done in Julia.
Why Julia and not just use hledger or gnucash? I could say that it’s easier to produce reports and graphs in Julia than in those programs, or that Julia is more flexible and includes more of the functionalities needed for personal financing, but the truth is that this interest is a bit language-centric…
So, please chime in and let me know if you’ve already started working on a personal financing package, or if you’re interested in this kind of thing, or if you have any opinions at all on the matter.
Depends on what you want to do, but for simply keeping track of spending, I would just import data as a DataFrame (most banks allow you to export CSV), and make the plots I want by month, spending category, etc.
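A minimal sketch of that by-month, by-category summary. To stay dependency-free it parses a tiny inline CSV with Base and the `Dates` stdlib instead of building a DataFrame; the column layout, the sample rows, and the name `monthly_totals` are assumptions made up for illustration, and the naive `split` would not survive quoted commas in a real bank export.

```julia
using Dates

# Hypothetical export format: date,comment,category,amount (one expense per line).
const SAMPLE_CSV = """
date,comment,category,amount
2018-01-03,Corner Market,Grocery,54.20
2018-01-17,Cafe Bar,Restaurant,18.00
2018-02-02,Corner Market,Grocery,61.75
2018-02-09,Corner Market,Grocery,12.05
"""

# Sum spending per (first day of month, category).
function monthly_totals(csv::AbstractString)
    totals = Dict{Tuple{Date,String},Float64}()
    for line in Iterators.drop(eachline(IOBuffer(csv)), 1)  # skip the header row
        isempty(strip(line)) && continue
        date, _, category, amount = split(line, ',')
        key = (firstdayofmonth(Date(date)), String(category))
        totals[key] = get(totals, key, 0.0) + parse(Float64, amount)
    end
    return totals
end

totals = monthly_totals(SAMPLE_CSV)
for key in sort!(collect(keys(totals)))
    println(key[1], "  ", rpad(key[2], 12), totals[key])
end
```

The same grouping could be expressed with DataFrames.jl once the data is loaded; the Dict version just keeps the example self-contained.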
Recreating anything more complicated, approaching the tools you linked, is a significant undertaking. And more importantly, I am not sure Julia has a competitive advantage there.
This is what I’m trying to do right now. The crux of the problem is in the categorization. Presently I clean the comment of the expense and manually set it to a specific category. Pretty tedious for thousands of expenses. Using regular expressions (what hledger does) is easier, but categorization mistakes are common and edge cases are hard to predict (e.g. r"cafe" pointing at a TeaBreak category for a "Cafe Bar" comment).
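To make the edge case concrete, here is a small sketch of first-match-wins regex rules. The rule lists and the `categorize` function are hypothetical, not how hledger implements it; note also that the pattern needs the `i` flag to match "Cafe" at all, since Julia regexes are case-sensitive by default.

```julia
# Hypothetical rule list: the first pattern that matches wins.
const RULES = [
    r"cafe"i   => "TeaBreak",
    r"market"i => "Grocery",
]

function categorize(comment::AbstractString, rules = RULES)
    for (pattern, category) in rules
        occursin(pattern, comment) && return category
    end
    return "Uncategorized"
end

categorize("Corner Market")  # "Grocery"
categorize("Cafe Bar")       # "TeaBreak", although a bar tab is probably not a tea break

# One workaround: put a more specific rule in front of the general one.
const STRICTER_RULES = [r"cafe\s+bar"i => "Restaurant", RULES...]
categorize("Cafe Bar", STRICTER_RULES)  # "Restaurant"
```

Every such fix is another hand-written rule, which is exactly the maintenance burden described above.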
I agree that it would be a lot more complicated. If there is no need for it to be more complicated, it’ll stay in its simple state.
What this makes me realize is that none of those programs has really solved the problem of categorizing the expenses. It would be awesome to somehow link these 3 entities:
1. The comment (String) banks stick on each expense they register on our personal accounts.
2. The business the expense was paid to.
3. A list of categorization labels that business is associated with.
The labels in #3 come from some community-curated list of hierarchical categories, something like:
It will be impossible to get everyone to agree on what categories should exist, or which business should be associated with which categories, but something is better than nothing.
I mean, we’ve managed to do something very similar with music and genres, why not with expenses and categories?
Julia has regular expressions, so of course you can use them.
The categorization problem is indeed difficult, but this is not Julia-specific. I would just ask the user for each new recipient and save the result of the query in a lightweight database (e.g. a TOML file).
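A rough sketch of that ask-once-and-remember idea, using the `TOML` standard library. The function names (`load_categories`, `category_for!`, `prompt_user`) and the flat recipient => category layout of the file are assumptions for illustration only.

```julia
using TOML

# Load the recipient => category table, or start empty if the file does not exist yet.
load_categories(path) = isfile(path) ? TOML.parsefile(path) : Dict{String,Any}()

# Look up `recipient`; if it is new, call `ask` and persist the answer to `path`.
function category_for!(table, path, recipient; ask = prompt_user)
    haskey(table, recipient) && return table[recipient]
    table[recipient] = ask(recipient)
    open(io -> TOML.print(io, table), path, "w")
    return table[recipient]
end

# Interactive default: read the category from the terminal.
function prompt_user(recipient)
    print("Category for \"", recipient, "\": ")
    return String(strip(readline()))
end
```

Passing `ask` as a keyword argument keeps the lookup logic separate from the terminal prompt, so a GUI or a scripted answer can be swapped in later.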
I don’t personally use any of these services so I don’t have a clear grasp of how hard / easy it is to code one of them, but wanted to mention that I’m working on the TableWidgets package to allow graphical table editing, modification, filtering etc. and I think it could work well for this kind of application. For me to understand, in terms of UI you mainly need a way to add/delete/edit rows and a “spinbox” widget to put in a number, a textbox with autocomplete for the categorization and a date selector?
This is very cool and useful and could function as the front-end to many things, including whatever (if ever) a future personal financing package might look like.
I’m not so sure what would be needed. Right now, the main problem is categorizing the expenses. I think the ideal thing would be some way to relate the three entities I listed before. Once the structure is in place, yes, we’ll need some kind of interface for users to create, read, update, and delete associations between bank comments, businesses, and categories. No simple task…
I see. I’ll probably still add a small demo of something like this, as I think it’s a cool use case for TableWidgets. Feel free to open a feature request there when you have a clearer idea of what structure you want and what’s needed on the UI side.
You know what @piever, falling short on my hopes for a more global solution with linking bank comments to businesses to categories, an awesome solution for now would be:
Having a table with unique (cleaned up) bank comments and associated categories. The table will have two columns, one column for the comments, and one for the categories. The possible values a category can have are finite and come from a list of predefined categories. The initial value of each row in the category column can be missing or an Uncategorized category (whichever). For instance:
That would be awesome, cause then I can just scroll down the table and choose a fitting category for each bank comment. This table will then be used to create a key to categorize future expenses.
A whole separate discussion might be how to best use this data (say a thousand comment => category pairs) to best automatically categorize never-before-encountered expenses (key words, small variations of the same comment, some other higher order associations, etc).
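As one possible starting point for that discussion, here is a sketch that guesses a category from the most similar known comment, using word overlap (Jaccard similarity). This is just one simple technique picked for illustration; the function names, the sample comments, and the 0.5 threshold are all assumptions, and the tokenizer only looks at ASCII letters.

```julia
# Lower-cased word set of a comment, ignoring digits and punctuation.
tokens(comment) = Set(m.match for m in eachmatch(r"[a-z]+", lowercase(comment)))

# Jaccard similarity: shared words divided by all distinct words.
function jaccard(a, b)
    u = length(union(a, b))
    return u == 0 ? 0.0 : length(intersect(a, b)) / u
end

# Return the category of the most similar known comment, or "Uncategorized"
# when nothing reaches `threshold` (an arbitrary cut-off chosen for this sketch).
function guess_category(comment, known::Dict{String,String}; threshold = 0.5)
    best_score, best_category = 0.0, "Uncategorized"
    target = tokens(comment)
    for (seen, category) in known
        score = jaccard(target, tokens(seen))
        if score >= threshold && score > best_score
            best_score, best_category = score, category
        end
    end
    return best_category
end

known = Dict("CORNER MARKET 0412" => "Grocery", "CAFE BAR DOWNTOWN" => "Restaurant")
guess_category("Corner Market 0519", known)   # "Grocery"
guess_category("Hardware Store", known)       # "Uncategorized"
```

This handles small variations of the same comment (changing reference numbers, case), but not higher-order associations.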
I’ve done it in Julia a couple of months ago (is it 2 years ago?) as I learned Julia. I personally create a Dict entry for each expense I’ve encountered. Sure, the first few months were hard, as I had to manually enter new Dict entries. Now, after a couple of months, it’s simply a matter of 1-2 minutes to spot uncategorized transactions.
Workflow is quite simple (and improvements could easily be done, like creating a proper sql DB for instance). I created a module containing a (growing) global Dict.
1. Loop over all CSV files with preprocess(csv_path_unedited), which creates a new CSV file csv_file_category with the category found in a static Dict.
2. Look at which transactions are filed as uncategorized and edit the Dict accordingly.
3. Re-run preprocess(csv_path_unedited).
4. Analyze with analyzefinance(csv_file_path_edited), i.e. functions that do what I want in terms of analysis.
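The loop described above might look roughly like this. It is an in-memory sketch under stated assumptions: the real workflow reads and writes CSV files, the Dict entries are made up, and this `preprocess` takes (comment, amount) pairs rather than a file path.

```julia
# Growing lookup table, edited by hand whenever a new comment shows up (entries are made up).
const CATEGORIES = Dict(
    "CORNER MARKET" => "Grocery",
    "CAFE BAR"      => "Restaurant",
)

# Attach a category to every (comment, amount) pair; unknown comments get "Uncategorized".
preprocess(expenses) =
    [(comment = c, amount = a, category = get(CATEGORIES, c, "Uncategorized")) for (c, a) in expenses]

# The comments that still need a Dict entry.
uncategorized(rows) = unique(r.comment for r in rows if r.category == "Uncategorized")

expenses = [("CORNER MARKET", 54.20), ("BOOKSHOP", 23.00), ("CAFE BAR", 18.00)]
rows = preprocess(expenses)
todo = uncategorized(rows)             # the comments to add to the Dict

CATEGORIES["BOOKSHOP"] = "Books"       # edit the Dict ...
rows = preprocess(expenses)            # ... and re-run
```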
Awesome. I’ve been working on this on and off for a few days now and you’re (more or less) describing what I’ve got up to now. I have these parts:
manually download the csv and xlsx files from my (and my partner’s) bank.
read the content and clean it up a bit (e.g. remove redundant dates in comments).
check if the cleaned comment of each expense is already in the Dict (you were describing) and get its associated category.
if not, ask the user for the correct category (I’m using googler to get some extra info about the expense to help me remember what it was about). I have predefined like 26 categories that I have a lookup table for (so I just input the number and it gets the correct sub/category).
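For the clean-up step in that list, a sketch of stripping date-like fragments from a bank comment before it is used as a lookup key. The date formats matched and the upper-casing are assumptions; real bank comments will need their own patterns.

```julia
# Strip date-like fragments (e.g. 2018-03-14, 14/03/2018, 14.03.18) and squeeze whitespace.
function clean_comment(comment::AbstractString)
    no_dates = replace(comment, r"\b\d{1,4}[-/.]\d{1,2}[-/.]\d{1,4}\b" => " ")
    return uppercase(strip(replace(no_dates, r"\s+" => " ")))
end

clean_comment("Cafe Bar  14/03/2018 card purchase")  # "CAFE BAR CARD PURCHASE"
```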
And that’s where I’m at now. It would be useful to know what stats you’re using. My dream is to have past, present, and future reports. Past is time-series plots of how it’s been going per category etc. Present is how I’m holding up against my set budget, like how much money I have left to spend in, say, the Coffee category. And future is of course for prognosis, predictions, saving goals etc…
Well, the analysis part is rather simple right now. My aim was to get an idea of how much money was associated with each category. That is where I saw that we were spending way too much on groceries!
So, right now, it’s mostly timeseries of categories (each category goes into a 3-tiers, for plotting purpose). Ideally, spending would be divided on a quarterly sub-division (or seasonal).
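The quarterly subdivision could be sketched like this with the `Dates` stdlib. The tuple layout of `expenses` and the name `quarterly_totals` are assumptions; the three-tier grouping for plotting is not shown.

```julia
using Dates

# Sum amounts per (year, quarter, category); `expenses` holds (date, category, amount) tuples.
function quarterly_totals(expenses)
    totals = Dict{Tuple{Int,Int,String},Float64}()
    for (date, category, amount) in expenses
        key = (year(date), quarterofyear(date), category)
        totals[key] = get(totals, key, 0.0) + amount
    end
    return totals
end

expenses_by_date = [
    (Date(2018, 1, 3),  "Grocery", 54.0),
    (Date(2018, 3, 28), "Grocery", 46.0),
    (Date(2018, 4, 2),  "Grocery", 61.0),
]
quarterly = quarterly_totals(expenses_by_date)
```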
How about you create accounts if you want to follow your budget?
mutable struct account
name::String
amount::Float64
end
Then, you create all your accounts on January 1st
grocery = account("Grocery", 10_000.00)
When you process your transactions during the year, you simply deduct the appropriate amount and save the current state to a JLD2 file.
grocery = grocery - transaction # of course you need to define the appropriate arithmetic operator for the account type
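A runnable sketch of the missing arithmetic operator. Two deliberate departures from the snippet above, both assumptions of this sketch: the type is capitalised as `Account` to follow Julia naming conventions, and it is immutable, since `grocery = grocery - transaction` rebinds the variable to a new value anyway. Saving the state to a JLD2 file is left out.

```julia
# `Account` is capitalised here to follow Julia naming conventions.
struct Account
    name::String
    amount::Float64
end

# Subtracting a number returns a new account with the reduced balance.
Base.:-(a::Account, spent::Real) = Account(a.name, a.amount - spent)

grocery = Account("Grocery", 10_000.00)
transaction = 50.00
grocery = grocery - transaction
grocery.amount   # 9950.0
```

With the original `mutable struct`, `grocery.amount -= transaction` would work without defining any operator.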
Well, something along those lines. On my end, I do not follow the numbers. I only check if we were above or below the approximate monthly budget for a given category and try to reduce spending if we got over the previous months.
Not to derail the thread, but YNAB is the best personal finance system I’m aware of. The software is not very hackable though, especially since they moved to the web (hasn’t stopped many people from trying but…)
If you want to recreate a system for implementing the YNAB method in Julia, you’d have at least one additional contributor.
Yes! I remember reading about it. The main thing with YNAB, if I’m not mistaken, is how each category in the budget has a set pot of money and you follow how much you used per category as time goes (rolling with the punches etc). I loved that idea and want to implement that specific feature in my work-flow (btw, please inform me if YNAB has any other killer ideas/features).
I wanted to compile a budget from my past spending. How hard/easy (unrealistic/wasteful) a certain budget is will depend on the percentile I take per category (e.g. the 25th percentile of past Restaurant spending will be really hard to accomplish while the 75th might be too easy). I’m imagining a bar indicating percent spent per category and a line indicating where we are in the month. So if my Restaurant budget for one month (30 days) is $100 and I spent $40 after 15 days it might look like:
Restaurant
▩ ▩ ▩ ▩ □|□ □ □ □ □ $60 left
with color indicators if I’ve overspent for the time in the month (or for the whole month).
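A sketch of both pieces: picking a budget as a percentile of past monthly spending, and rendering the bar with the time-of-month marker. The names `percentile_budget` and `budget_bar`, the ten-cell width, and the sample history are assumptions; color indicators are not included.

```julia
using Statistics

# Budget as the p-th quantile (0 <= p <= 1) of past monthly totals for one category.
percentile_budget(history, p) = quantile(history, p)

# Text bar: filled cells show the share of the budget spent,
# and '|' marks how far through the month we are.
function budget_bar(budget, spent, day, days_in_month; cells = 10)
    filled = clamp(round(Int, cells * spent / budget), 0, cells)
    marker = clamp(round(Int, cells * day / days_in_month), 0, cells)
    io = IOBuffer()
    for i in 1:cells
        print(io, i <= filled ? '▩' : '□')
        i < cells && print(io, i == marker ? '|' : ' ')
    end
    print(io, " \$", round(Int, budget - spent), " left")
    return String(take!(io))
end

restaurant_history = [80, 100, 120, 90, 110]   # made-up monthly totals
percentile_budget(restaurant_history, 0.25)    # a tight budget
percentile_budget(restaurant_history, 0.75)    # a lenient budget

println("Restaurant")
println(budget_bar(100, 40, 15, 30))
# ▩ ▩ ▩ ▩ □|□ □ □ □ □ $60 left
```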
All this is fine and dandy, and I could even upload these reports to some website I can access on my phone for on-the-spot checks (like, will my budget allow me that extra expensive meal or should I pass?), but this all will suffer from the first step of this work-flow: manually downloading the csv and xlsx files from my banks. It drives me crazy I can’t use some API to automatically get that data. But even with that handicap, this whole thing is still worth it.
I feel your pain. How hard would it be to open some sort of REST API to get your own spending? I did this type of analysis with Pandas, but I never updated the data because it’s a manual step.
Don’t get me started… And how hard will it be for the bank to have a column for the business ID number? That would remove all the guess work of what business this is and make categorization a lot easier. Information and access to it is power…
1. Give every dollar a job (this is what you’re referring to).
2. Embrace your true expenses. This is sort of a corollary of rule 1 - if you have to pay $150 every year for car registration, you should be budgeting for that so it doesn’t come as a shock. If you absolutely need a new phone every 2 years that costs $600, you should budget for that. Etc.
3. Roll with the punches (you mentioned this too). This is a huge part of making rules 1 and 2 palatable - you have to expect that you’re not going to be perfect. After ~4 years of my wife and me using YNAB, we are almost never surprised. In the first year, it seemed like we had to radically alter our budget every other month. This is OK.
4. Age your money. This is a sort of revolutionary thing (or it was for me) - basically, you can work towards always paying next month’s expenses with this month’s income. I was always paying for things with a credit card, and then at the end of the month, my whole paycheck would go to paying off the balance. Now, I still use the card, but I don’t buy things unless I have the cash to cover it immediately.
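The arithmetic behind the “true expenses” examples in the list above is just the cost spread over the months between payments. A tiny sketch (the function name is made up):

```julia
# Amount to set aside each month for an expense of `cost` that recurs every `months` months.
monthly_set_aside(cost, months) = cost / months

monthly_set_aside(150, 12)   # car registration: 12.5 per month
monthly_set_aside(600, 24)   # new phone every 2 years: 25.0 per month
```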
The “killer feature” is the way these rules work together, and make your spending forward-looking instead of (exclusively) backward looking. For me at least, using Mint and other so-called budgeting software, I would make an ideal budget, see that I was never meeting it, and give up after a couple of months.
It seems to me this falls under the “embrace your true expenses.” There’s definitely an aspect of using your budget to alter your spending habits, but I’ve found that this happens naturally when you’re actually budgeting for your true expenses. When you can see explicitly how overspending in the “restaurants” category must pull money from something else you care about (whether that’s gaming, or saving), it takes much less mental energy to resist the splurging.
This is fair enough, though I’ve found the act of manual entry is really helpful in consolidating the mental model that YNAB is seeking to promote. Now, every time I eat out, I enter the amount in real time, and I see immediately how much I’ve got in the budget or if I’ve overspent. If the latter, I’ve got to go find another category to take it from.
Thanks for the awesome distillation! I love all of those ideas. I can definitely see how this would work.
All of that sounds great, but the thought of manually inputting every single expense in real-time is really off-putting. The thing I loved about Mint was the automatic and often accurate categorization.
Do you really think that an automatic logging (inputting and categorizing) of your expenses would somehow break YNAB’s magic? Like, if everything gets logged automatically, and before you buy something you check and see if you have any money left for a specific category, and make your decision based on that. You could also have a set of rules for how to deal with overspending: a priority list for the categories, draw evenly from all the categories that have funds left, some special pot just for that, etc.
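As an illustration of the first of those rules (a priority list), here is a sketch that covers overspent categories by drawing from donor categories in order. This is not a YNAB feature as far as this thread establishes; the function name, the Dict layout, and the sample balances are assumptions.

```julia
# Cover overspent categories (negative balances) by drawing from `priority`, in order.
# Returns a new Dict; donor balances never drop below zero.
function cover_overspending(balances::Dict{String,Float64}, priority::Vector{String})
    result = copy(balances)
    for (category, balance) in balances
        balance < 0 || continue
        shortfall = -balance
        for donor in priority
            (donor == category || !haskey(result, donor)) && continue
            take = min(shortfall, max(result[donor], 0.0))
            result[donor] -= take
            shortfall -= take
            shortfall <= 0 && break
        end
        result[category] = 0.0 - shortfall   # 0.0 if fully covered, negative otherwise
    end
    return result
end

balances = Dict("Restaurant" => -30.0, "Gaming" => 20.0, "Savings" => 500.0)
covered = cover_overspending(balances, ["Gaming", "Savings"])
# Gaming is emptied first, then the remaining 10.0 comes out of Savings.
```

Drawing evenly from all funded categories, or from a dedicated pot, would only change the inner loop.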
You’d still need to check in and look to see how you’re holding up the budget, you’d need to consult it before most of your purchases to see how you’re doing in relation to the time in the month, but you’d at least skip the manual input and categorization.
All of this is of course science fiction. Cause you know. It’s just 2018. I guess we’ll have that about 500 years after the hover-boards…
To be clear - the new web-based YNAB app does have automatic import, so clearly they don’t think it’s a hindrance (I’m still using the old version 4 that only has CSV upload). And certainly having that auto-import can be quite useful. If you’re diligent about reviewing your budget, I think it would be fine, but getting the habit of manual entry has its value. Knowing myself, I think it would have taken me much longer to “get it.”