
The future is already here.

It's just not evenly distributed. Well, maybe.

It's tax season. Yay. Please keep reading. With all the hype around agentic AI supposedly killing all our jobs, I wanted to run a small, practical experiment: are we at a point where AI can do at least part of my taxes? Maybe something contained and boring like a travel-cost report? Pretty please?

The setup was deliberately low-friction: a paid ChatGPT account, Google connectors activated, no custom tooling. The underlying question was also practical: could I hand a workflow like this to non-technical family members with businesses at some point?

During 2025, I had already prepared the travel-cost calculation myself in LibreOffice. The AI did not know about that file, did not see it, and had no access to my existing result. What it did have was access to the equivalent 2024 spreadsheet as an example. I then explained the task in detail and asked it to independently reconstruct the same kind of overview from scattered source material: shared calendars, planning sheets, email receipts, bank transactions, and the public website related to my band project.

The question was not: "Can AI do my very limited accounting?" The question was: "How far can AI get when it has to do the legwork of collecting, cross-checking, and structuring messy real-world information — while I already know the correct outcome?"

What worked well

The AI was genuinely useful as a research and reconciliation layer. It pulled together calendar entries, planning data, email receipts, bank transactions, and public event information. It found relevant dates, surfaced missing receipts, and identified inconsistencies between sources. Chapeau. It also made the review process more transparent. Instead of just producing a number, it created a trail of assumptions: this event was relevant, that one was not; this receipt matched, that one belonged elsewhere; this trip looked plausible, but needed human confirmation.
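To make the "this receipt matched, that one belonged elsewhere" step concrete, here is a minimal sketch of that kind of reconciliation. It is not what the AI ran internally. The record layout (dicts with `date` and `amount`), the three-day window, and the function name `reconcile` are all assumptions made for illustration.

```python
from datetime import date, timedelta
from decimal import Decimal


def reconcile(receipts, transactions, max_days=3):
    """Pair each receipt with the first unused bank transaction that has the
    same amount and a date within max_days of the receipt date.

    Returns (matched pairs, receipts without a transaction,
             transactions without a receipt).
    """
    unused = list(transactions)
    matched, receipts_without_tx = [], []
    for receipt in receipts:
        hit = next(
            (
                t for t in unused
                if t["amount"] == receipt["amount"]
                and abs(t["date"] - receipt["date"]) <= timedelta(days=max_days)
            ),
            None,
        )
        if hit is None:
            receipts_without_tx.append(receipt)
        else:
            unused.remove(hit)  # each transaction can match only one receipt
            matched.append((receipt, hit))
    return matched, receipts_without_tx, unused


# Hypothetical example data, not taken from the real experiment.
receipts = [
    {"date": date(2025, 3, 14), "amount": Decimal("42.90")},
    {"date": date(2025, 5, 2), "amount": Decimal("18.00")},
]
transactions = [
    {"date": date(2025, 3, 15), "amount": Decimal("42.90")},
    {"date": date(2025, 6, 20), "amount": Decimal("77.50")},
]
matched, receipts_without_tx, tx_without_receipt = reconcile(receipts, transactions)
```

The two leftover lists are the interesting output: a transaction without a receipt is a missing receipt to chase, and a receipt without a transaction may be a cash payment that only a human knows about.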

What did not work

Context was still the hard part. The AI cannot know what it cannot know. Some information simply was not encoded in any calendar entry, email, or spreadsheet: the cash taxi fee, the event that looked like part of the project but was billed differently, the fact that I was sick and did not attend something. That context had to come from me — either by telling it, or by keeping the system of record in order at the time.

There were also things it could theoretically have known, but did not figure out. For example: there was a separate shared calendar for the project. The Google Calendar API exposed to the tool did not even provide a usable "list calendars" feature, so I had to find and provide the calendar ID manually.

The workflow also ran into Google API rate limits several times, which stalled progress. And email attachments were not readable in the way I expected. That was pretty disappointing.
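For comparison, this is roughly how a hand-written script would deal with rate limits: retry with exponential backoff instead of stalling. It is a generic sketch, not a description of how the ChatGPT connectors behave. `RateLimitError` is a hypothetical stand-in for whatever a real client raises on a "too many requests" response, and the delays are arbitrary.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a client's 'too many requests' error."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on RateLimitError wait and retry, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            # base_delay * 1, 2, 4, ... plus up to one second of random jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

The `sleep` parameter exists only so the waiting can be swapped out, for example in a test.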

The most important failure mode: it made mistakes when calculating totals "in its head." The individual source findings were useful, but the moment numbers were summarized in prose, errors crept in. The workflow only became reliable once I forced the calculations into spreadsheet formulas — but well, sometimes it also forgot to update the sum ranges...
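Both failure modes can be checked mechanically. The sketch below is an illustration under assumptions, not part of the original workflow: it recomputes a total with exact decimal arithmetic and checks whether a simple `=SUM(...)` formula covers every data row. The helper names and the formula pattern (one column, one contiguous range) are hypothetical.

```python
import re
from decimal import Decimal


def total(amounts):
    """Add amounts given as strings using exact decimal arithmetic."""
    return sum((Decimal(a) for a in amounts), Decimal("0"))


def sum_range_covers(formula, first_row, last_row):
    """True if a formula like '=SUM(D2:D14)' spans rows first_row..last_row."""
    m = re.fullmatch(r"=SUM\(([A-Z]+)(\d+):([A-Z]+)(\d+)\)", formula)
    if m is None:
        return False  # anything more complex is out of scope for this sketch
    same_column = m.group(1) == m.group(3)
    return same_column and int(m.group(2)) <= first_row and int(m.group(4)) >= last_row


# Hypothetical amounts; a row was added in row 15 but the range still ends at 14.
amounts = ["12.40", "7.10", "30.50"]
independent_total = total(amounts)
stale = sum_range_covers("=SUM(D2:D14)", 2, 15)
```

Comparing an independently computed total against the spreadsheet's result is a cheap way to catch a stale sum range before it ends up in a tax form.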

These are exactly the small "big" details I expected to be solved already, the kind of things that make you wonder: has anyone actually used this workflow end to end before? I can't be that avant-garde?!

The main takeaways

  • AI is strong when it searches, structures, compares, and asks useful questions.
  • AI is useful for turning scattered information into a reviewable working document.
  • AI is weak when it silently makes final accounting decisions or does arithmetic in natural language.
  • Human context is still the control layer.
  • Spreadsheet formulas, not generated text, should calculate the final totals.

This was a useful glimpse of where AI copilots can already help with messy admin work: not by replacing judgment, but by doing a lot of the tedious legwork and making assumptions visible enough to challenge.

But I am not handing this workflow to anyone yet. The AI is still too confidently wrong, too often.
