11 min read

What AI Can and Cannot Do in Accounting

An honest assessment for skeptical CFOs: where AI genuinely adds value in accounting, where it does not, and what this means for how you staff and manage the function.

Ryan M, Founder

The AI pitches aimed at CFOs have a pattern. They start with a claim about how much time accounting teams waste on manual work, pivot to a demo of something that looks impressive, and land on a number ("reduce close time by 70%") that is technically possible under optimal conditions and rarely achieved in practice.

This article is not that.

The goal here is to give a skeptical finance leader an accurate map of what AI is actually good at, what it is genuinely not good at, and what the correct division of labor between AI and humans looks like in an accounting function today. If you are evaluating AI accounting software, this is a better framework than any vendor's pitch deck.


What AI Is Genuinely Good At

These are tasks where AI consistently outperforms manual processes or provides step-function improvements in speed and scale.

Pattern matching at scale

This is AI's clearest strength in accounting. Given a large set of transactions, AI can classify them against a chart of accounts with high accuracy once it has learned the organization's patterns. It can match bank statement lines to ledger entries in bulk. It can identify duplicates across thousands of vendor invoices. It can flag transactions that deviate from historical patterns for a given vendor, account, or time period.

The key qualifier is "at scale." For 50 transactions per month, a good bookkeeper is faster and cheaper. For 5,000 transactions per month, AI is faster, more consistent, and substantially cheaper.

Specific applications where this works well:

  • Bank reconciliation: matching statement lines to postings, flagging timing differences, identifying unmatched items.
  • Transaction coding: classifying expenses to the correct cost center and account based on vendor name, memo, and amount.
  • AP matching: three-way matching of purchase orders, receipts, and invoices at volume.
  • Duplicate detection: identifying potential duplicate payments or invoices across large datasets.
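The duplicate-detection case is simple enough to sketch. Here is a minimal Python illustration of the core idea: normalize a few identifying fields into a key and flag any key that appears more than once. The field names and sample data are hypothetical; production systems also handle near-matches (OCR typos, reissued invoice numbers) with fuzzier comparison.

```python
from collections import defaultdict

def flag_duplicates(invoices):
    """Group invoices by a normalized (vendor, invoice number, amount)
    key and return groups with more than one member as potential
    duplicate payments."""
    groups = defaultdict(list)
    for inv in invoices:
        key = (inv["vendor"].strip().lower(),
               inv["invoice_no"].strip().upper(),
               round(inv["amount"], 2))
        groups[key].append(inv)
    return [group for group in groups.values() if len(group) > 1]

invoices = [
    {"vendor": "Acme Co",  "invoice_no": "INV-1001", "amount": 500.00},
    {"vendor": "acme co ", "invoice_no": "inv-1001", "amount": 500.00},
    {"vendor": "Acme Co",  "invoice_no": "INV-1002", "amount": 750.00},
]
dupes = flag_duplicates(invoices)  # one group of two matching invoices
```

The point of the normalization step is that duplicates in real AP data rarely match byte-for-byte; the value of doing this in software is applying the same comparison to every pair, which manual review never does consistently.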

Document extraction from structured formats

AI-driven document extraction from invoices, purchase orders, bank statements, and contracts has reached a level of accuracy that makes it practical for production use. The key word is "structured" — documents that follow a consistent format (e.g., PDF invoices from major vendors, bank statement exports) extract with high accuracy. Non-standard or handwritten documents extract with lower accuracy and require human review.

What "high accuracy" means in practice: For a typical vendor invoice from a known vendor, extraction accuracy for header fields (vendor name, date, invoice number, total amount) is consistently above 95%. Line-item extraction from complex multi-page invoices is lower, typically 85–92%, with more variation.

Errors happen at the margin, which is why human review of exceptions is necessary. The practical question is not "is it perfect?" but "is it more accurate and faster than manual entry?" For most AP volumes above 200 invoices per month, the answer is yes.
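The review workflow this implies can be sketched in a few lines: auto-post only when every extracted field clears a confidence threshold, and route everything else to a human queue with the low-confidence fields attached. The field names, threshold, and confidence scores below are illustrative assumptions, not from any particular extraction product.

```python
def route_extraction(fields, threshold=0.95):
    """Auto-post a document only if every extracted field clears the
    confidence threshold; otherwise queue it for human review with the
    low-confidence fields attached for the reviewer."""
    low = {name: f for name, f in fields.items()
           if f["confidence"] < threshold}
    if not low:
        return ("auto_post", {})
    return ("human_review", low)

extracted = {
    "vendor":   {"value": "Acme Co",    "confidence": 0.99},
    "inv_date": {"value": "2024-03-01", "confidence": 0.98},
    "total":    {"value": 4750.00,      "confidence": 0.91},  # below threshold
}
decision, flagged = route_extraction(extracted)
# decision is "human_review"; only the "total" field is flagged
```

The design choice worth noting: the threshold is a policy knob, not a technical constant. Lowering it trades reviewer hours for error risk, which is a decision the finance team should own.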

Rule-based posting

If the accounting treatment for a transaction type is deterministic and can be expressed as rules, AI can apply those rules consistently across every transaction. Examples:

  • Depreciation: given an asset record with cost, useful life, salvage value, and method, compute and post monthly depreciation without human initiation.
  • Prepaid amortization: given a prepaid expense and its amortization schedule, post each month's amortization.
  • Recurring accruals: given defined rules ("accrue $X to account Y on the last day of each month"), post without prompting.
  • Revenue recognition on standard subscriptions: given contract start/end dates and contract value, recognize ratably and track deferred revenue.

These are not glamorous capabilities, but they collectively represent a significant portion of standard month-end journal entries. Automating them removes a category of work that accountants currently perform manually, check against last month, and occasionally get wrong.
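The depreciation case shows how mechanical these rules are once the inputs exist. A minimal straight-line sketch in Python, with rounding differences absorbed in the final month so the schedule ties out exactly to the depreciable base (the figures are illustrative):

```python
def straight_line_depreciation(cost, salvage, useful_life_months):
    """Monthly straight-line depreciation: equal charges over the useful
    life, with any rounding difference absorbed in the final month so
    the schedule sums exactly to cost minus salvage value."""
    base = round(cost - salvage, 2)
    monthly = round(base / useful_life_months, 2)
    schedule = [monthly] * (useful_life_months - 1)
    schedule.append(round(base - sum(schedule), 2))  # plug the remainder
    return schedule

sched = straight_line_depreciation(cost=12_000, salvage=0, useful_life_months=36)
# 36 monthly entries that sum to exactly 12,000.00
```

Prepaid amortization and recurring accruals follow the same shape: deterministic inputs in, a posting schedule out, no judgment anywhere in the loop.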

Anomaly detection against historical patterns

AI can establish a baseline of normal behavior for each vendor, account, and transaction type, and flag deviations. A vendor who normally invoices $8,000–$12,000 per month submitting a $47,000 invoice will be flagged. An expense account that normally runs at $15,000/month showing $85,000 in a single week will be flagged.

This is not sophisticated reasoning about why something is anomalous. It is pattern-matching against history, nothing more. The value is catching errors and fraud that manual review misses, because humans do not consistently compare every line item against every historical transaction. AI does this without additional effort.
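To make the vendor example concrete, here is the simplest version of such a baseline: flag any amount more than a few standard deviations from the vendor's historical mean. This is a deliberately naive sketch; real systems also condition on seasonality, account, and transaction type, and the cutoff is an assumed policy choice.

```python
import statistics

def is_anomalous(history, amount, z_cutoff=3.0):
    """Flag an amount that deviates from the historical mean by more
    than z_cutoff population standard deviations."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # guard against zero spread
    return abs(amount - mean) / stdev > z_cutoff

history = [8_000, 9_500, 11_000, 12_000, 10_500, 9_000]  # typical monthly invoices
is_anomalous(history, 47_000)  # flagged: far outside the historical range
is_anomalous(history, 10_800)  # not flagged: within normal variation
```

Even this trivial version catches the $47,000 invoice in the example above; the engineering work in real systems goes into reducing false positives, not into detecting the obvious outliers.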

Generating recognition schedules from contract terms

Given a contract with defined terms (subscription period, contract value, performance obligations, discounts), AI can build a revenue recognition schedule that applies the relevant accounting standard (ASC 606 for most U.S. companies) and tracks deferred revenue over the contract life.

This is high-value for SaaS companies with hundreds or thousands of active contracts. The AI does not make the judgment calls about what constitutes a distinct performance obligation or what the correct standalone selling price is. Those inputs come from the accounting team's policy decisions. Once the policy is set and the inputs are defined, the schedule generation and ongoing tracking are mechanical.
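For the simplest case, ratable recognition over a fixed term, the mechanics look like this. The sketch below recognizes a contract straight-line and tracks the deferred revenue balance, plugging rounding in the final month so the schedule ties out; contract values and term are illustrative, and anything beyond single-obligation ratable recognition requires the policy inputs described above.

```python
def ratable_recognition(contract_value, months):
    """Straight-line recognition over the contract term, with the
    running deferred revenue balance after each month's entry. Rounding
    differences are absorbed in the final month so totals tie out."""
    monthly = round(contract_value / months, 2)
    schedule, deferred = [], round(contract_value, 2)
    for m in range(1, months + 1):
        amount = monthly if m < months else deferred  # final-month plug
        deferred = round(deferred - amount, 2)
        schedule.append({"month": m, "recognized": amount, "deferred": deferred})
    return schedule

sched = ratable_recognition(contract_value=36_000, months=12)
# 12 entries of 3,000.00; deferred balance steps down to 0.00
```

The schedule is the easy part; the ASC 606 judgment calls (obligation identification, standalone selling price) are inputs to this function, not outputs of it.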


What AI Is Not Good At

These are equally important. Vendors who skip this section are not being honest with you.

Judgment calls requiring business context

The question of whether to accrue an uncertain liability requires weighing the probability of a future outflow, the company's history with similar situations, current management guidance, and a view of what level of conservatism is appropriate given the audit context and the company's financial position. This is not pattern matching. It is judgment that draws on information that exists in relationships, conversations, and context outside any accounting system.

Specific examples: Should we write down this inventory to net realizable value given what we know about the market? Is this contingent liability probable enough to accrue? Should we capitalize this internally-developed software cost or expense it? These require a finance professional who understands the business to make a call.

AI can surface the relevant facts, remind you of the accounting standard, and show you what you did in similar situations last year. It cannot make the call.

Complex estimates requiring professional judgment

Useful life of fixed assets. Allowance for doubtful accounts. Warranty reserves. Goodwill impairment assessments. Fair value measurements for illiquid assets. These estimates require professional judgment informed by industry knowledge, the company's specific circumstances, and an understanding of how auditors will evaluate the reasonableness of management's estimate.

AI can generate a proposed number based on historical data. The historical data is a useful input, not the answer. A company whose industry is undergoing structural change, whose customer base has shifted, or whose operational conditions are meaningfully different from the historical period may need an estimate that diverges significantly from what historical patterns suggest.

The Controller who has worked in the industry for fifteen years and understands why this company's historical bad debt rate is no longer representative brings something that no AI system currently replicates.

Interpreting novel contract structures

AI performs well on contracts that resemble contracts it has seen. A standard SaaS subscription, a typical professional services agreement, or a plain-vanilla equipment lease all process well.

When a contract is genuinely novel, such as a bespoke revenue-sharing arrangement, a complex joint venture agreement, or a contract with contingent milestones tied to third-party events, AI extracts what it can and flags what it cannot handle. A finance professional still needs to analyze the economics and determine the accounting treatment. The more unusual the structure, the less useful the AI extraction is.

Tax positions

Tax law is jurisdiction-specific, regularly updated, and interpreted differently across practitioners. Uncertain tax positions require analysis that weighs the technical merits of the position, the probability of a favorable outcome, and the disclosure requirements under ASC 740. This is specialized work that a tax professional performs. Accounting AI systems do not take tax positions.

What AI does do in the tax context: prepare supporting schedules for the tax provision, organize transaction data for the tax team, and flag transactions that may have tax implications for review. This is genuinely useful and reduces the prep time for the tax provision. It is not tax advice and should not be treated as such.

Anything requiring understanding of strategy or relationships

Does this expense belong in R&D or COGS? It depends on management's intent and how the company classifies its development activities, not on the expense description. Should this customer's account be placed on credit hold? That depends on the relationship, the strategic value of the account, and what leadership has decided about collection policy. Is this intercompany pricing arrangement at arm's length? That requires understanding the business rationale for the structure.

These questions have accounting answers, but arriving at the right answer requires knowledge of the organization that does not live in the transaction data.


The Correct Division of Labor

Given the above, the correct structure for an AI-assisted or AI-autonomous accounting function is:

AI handles: Volume work (transaction classification, bank reconciliation, document extraction, matching), rule-based posting (recurring entries, depreciation, standard accruals), schedule generation from defined inputs (revenue recognition, lease amortization), anomaly detection and flagging, close status monitoring, and workpaper preparation.

Humans handle: Judgment calls (estimates, uncertain liabilities, complex contract interpretation), exception review (AI flags, human decides), policy decisions (SSP setting, useful life determinations, accounting policy choices), audit liaison, tax positions, and strategic questions.

The ratio in practice: at a company with moderate complexity, AI handles roughly 70–80% of the transactions and entries by count. The 20–30% that remains human-led contains most of the judgment and almost all of the risk.

This is not a failure of AI. It is the correct division of labor. The value proposition is not that AI makes judgment calls better than experienced accountants. It is that AI handles the mechanical volume so that experienced accountants can focus entirely on the judgment calls, which is where their value is concentrated.


What This Means for How You Staff

A finance function using AI-autonomous accounting does not eliminate accounting staff. It changes what they do.

The tasks that go away: manual transaction entry, reconciliation execution, recurring journal entry processing, chasing down AP approvals for standard invoices, maintaining close checklists by polling team members. These are real tasks that currently consume substantial time.

The tasks that remain and expand: exception review, variance analysis, audit support, business partnership (explaining what happened and why), policy governance (setting the rules that AI executes), complex accounting analysis, and reporting to leadership.

The net effect on headcount depends on growth. A company that is growing its transaction volume and complexity may be able to avoid adding accounting headcount as it scales. A company with flat transaction volume may be able to reduce it. Neither outcome is automatic — it depends on what the displaced time is redirected toward.

The more important implication is for hiring. A team member who spent 60% of her time on transaction processing and 40% on judgment work now has the inverse allocation. The value of hiring someone who is exceptionally good at judgment, analysis, and communication increases. The value of hiring someone who is fast at data entry decreases.


How to Evaluate Claims from AI Accounting Vendors

A few filters that help separate credible vendors from hype:

They acknowledge limits. Any vendor who tells you their system handles all accounting decisions without human input is either not being honest about what the system does or is not aware of what good accounting requires. The credible vendors are specific about what goes to AI and what stays with humans.

The demo matches your actual workload. Demos show the happy path: clean invoices from well-formatted PDFs, standard transactions, simple contracts. Ask the vendor to show you how the system handles your messy cases: unusual vendor formats, invoices with unclear line items, contracts with variable pricing. The response tells you a lot.

References at your company size and complexity. A vendor whose references are all 10-person startups is not validated at $100M complexity. Ask for references at similar revenue, transaction volume, and entity count to yours.

They can explain the error rate and how errors surface. Errors are inevitable. The question is how quickly they surface and whether the review workflow is designed to catch them before they matter. Good vendors have clear answers about error rates by task type and how exceptions are routed.

The honest answer from a credible AI accounting vendor is: "Our system handles most of the volume accurately and surfaces the rest for your review. You still need experienced accountants. You need fewer hours of their time on mechanical tasks."

That is what is true, and it is a good enough value proposition on its own.