How to build flexible automations that don't break
Key Takeaways
Traditional automation typically handles about 70% of cases, leaving you holding the bag on the manual work for the rest
Agentic AI is built on reasoning rather than brittle rules, handling exceptions case by case so automations become more dynamic
It works best on high-volume processes with lots of variation
The 70/30 Issue That Is Killing Your Productivity
I used to work at Razorpay, a payment gateway that served millions of SMBs. I saw the same routine repeat itself over and over.
SMBs would spend hours building automations in no-code tools like n8n and Zapier. The automation would handle the happy path fine, then hit an exception that took hours to repair, even when the logic was only slightly off.
An ERP invoice system extracts data well, until a vendor changes their format. Your payment matching is fine until there is a discrepancy to research. Your lead routing is fine, until a customer doesn't fit your predefined segments.
That manual work brings back most of the cost of running these functions.
An example: a $15M B2B company had decent automation for bills. Standard three-way match, auto-approve for orders under $5K, escalation for anything over. It functioned flawlessly.
Until one vendor issued a combined bill covering three separate purchase orders, with some items on different approval paths. The automation flagged it, and the team spent 90 minutes digging around:
Which POs the bill covered, why some of the amounts differed, etc.
This happened 3-4 times a week. At 90 minutes each, that's about 6 hours of someone's week spent fixing automation failures.
Classic automation isn't bad; it's just not end-to-end.
What "Agentic" Actually Means (No Hype)
I know, another AI buzzword. But "agentic" means something quite specific: an LLM running in a loop, taking actions until it accomplishes a goal.
Traditional automation: "What's my next instruction?"
Agentic AI: "What's the best way to accomplish this goal based on what I'm perceiving?"
It's the difference between preprogramming every contingency and having agents that are actually useful.
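To make the "LLM in a loop" idea concrete, here's a minimal sketch in Python. It is not how any particular product works: `choose_action` is a hypothetical stand-in for an LLM call, and the tool name `lookup_po` and its return value are made up for illustration.

```python
def run_agent(goal, choose_action, tools, max_steps=10):
    """Ask the policy for the next action, run the tool, feed the result back in."""
    history = []
    for _ in range(max_steps):
        action, arg = choose_action(goal, history)
        if action == "finish":
            return arg, history
        observation = tools[action](arg)
        history.append((action, arg, observation))
    # Step budget exhausted: hand off to a human instead of looping forever.
    return None, history


# Hypothetical stub standing in for an LLM deciding what to do next.
def stub_policy(goal, history):
    if not history:
        return "lookup_po", "PO-1001"
    return "finish", f"matched {history[-1][2]}"


# Hypothetical tool; a real one would query an ERP or accounting system.
tools = {"lookup_po": lambda po_id: {"po": po_id, "amount": 1200}}

result, trace = run_agent("match invoice to PO", stub_policy, tools)
```

The `max_steps` cap is the important design choice here: a loop that can't reach the goal should escalate rather than keep trying.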
Three Capabilities That Change Everything
1. Genuine Decision-Making (Not Just Rule-Following)
Traditional invoice handling:
Invoice matches PO? Approve
Otherwise, hold for manual review
Agentic handling:
Check whether the invoice matches the PO
If not, determine why:
Price variation? Check contract terms
Quantity variation? Check receiving history
New item? Check email trail
Determine whether the variance is within tolerance
Auto-adjust, flag with context, or forward to correct approver
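The branching above can be sketched roughly as follows. This is an illustration, not a production matcher: the 2% price tolerance, the field names, and the rule that a billed quantity matching the received quantity is safe to auto-adjust are all assumptions I've made for the example.

```python
from dataclasses import dataclass

PRICE_TOLERANCE = 0.02  # assumption: up to 2% price drift is acceptable


@dataclass
class Line:
    po_id: str
    unit_price: float
    quantity: int


def triage(invoice, po, received_qty):
    """Return (decision, notes) so a reviewer can see why the call was made."""
    decision, notes = "approve", []

    if invoice.unit_price != po.unit_price:
        drift = abs(invoice.unit_price - po.unit_price) / po.unit_price
        if drift > PRICE_TOLERANCE:
            return "route_to_approver", [
                f"price variance {drift:.1%} exceeds tolerance; check contract terms"
            ]
        decision = "auto_adjust"
        notes.append(f"price variance {drift:.1%} within tolerance")

    if invoice.quantity != po.quantity:
        if invoice.quantity != received_qty:
            return "flag_with_context", notes + [
                f"billed {invoice.quantity}, received {received_qty}, ordered {po.quantity}"
            ]
        decision = "auto_adjust"
        notes.append("billed quantity matches receiving history")

    return decision, notes or ["invoice matches PO"]
```

For example, `triage(Line("PO-1", 10.1, 5), Line("PO-1", 10.0, 5), received_qty=5)` comes back as an auto-adjust with a note explaining the 1% price drift, while a quantity that matches neither the PO nor the receiving record is flagged with all three numbers attached.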
We applied this at a manufacturing company. Before: 40-50 invoice variances flagged each week, each taking 20-30 minutes to clear. After: the system cleared 35 automatically and flagged 5 with enough context that clearing each one took 5 minutes.
ROI: ~15 hours/week saved = $30K+ yearly.
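Here's the back-of-the-envelope arithmetic behind that figure, as a quick Python sketch. It uses 40 variances per week (the 35 cleared plus 5 flagged) and the midpoint of the 20-30 minute range. The $40 hourly cost is my assumption for illustration; the figures above don't state one.

```python
# Before: every variance is cleared by hand.
variances_per_week = 40        # 35 auto-cleared + 5 flagged after the change
minutes_each_before = 25       # midpoint of the 20-30 minute range
hours_before = variances_per_week * minutes_each_before / 60

# After: only the flagged ones need a person, and each is quick.
flagged_after = 5
minutes_each_after = 5
hours_after = flagged_after * minutes_each_after / 60

hours_saved_per_week = hours_before - hours_after

ASSUMED_HOURLY_COST = 40       # assumption, not a figure from the case
# Use the rounded 15 hours/week to stay conservative.
yearly_savings = 15 * 52 * ASSUMED_HOURLY_COST

print(f"{hours_saved_per_week:.2f} hours/week saved, ~${yearly_savings:,} per year")
```

At that assumed rate, the rounded 15 hours a week works out to a little over $30K a year, which is consistent with the estimate.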
2. Multi-Tool Orchestration
Here's what I discovered at the payment gateway: SMBs are not working in a single integrated, clean environment. They operate 8-12 different tools, and the meaningful information is scattered across all of them.
I remember meeting a $10M e-commerce store. Their customer refund process alone touched:
Payment gateway (transaction information)
Shopify (order information)
Zendesk (support tickets)
Google Sheets (fraud analysis)
Slack (approvals)
QuickBooks (accounting)
Their ops person would literally have all of those tabs open, switching between them a dozen times with every refund. It was painful to watch.
Agentic workflows pull information from wherever it lives and use the right tool for the job. Not because you've exhaustively mapped all integrations, but because the system is aware of what's necessary.
Example: a customer question arrives by email. An agentic system:
Queries HubSpot for customer history
Searches your knowledge base in Notion
Pulls transaction information from accounting
Confirms ongoing projects in your spreadsheet
Builds response from all that context
Routes to the proper team member if need be
The system knows which steps to take because it understands the goal.
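A rough sketch of that idea in Python: tools are registered once, and the set of tools used is chosen per request instead of being hard-wired. Everything here is hypothetical. The tool functions are stubs that return canned data (they do not call HubSpot, Notion, or any accounting API), and `plan_tools` uses a keyword check where a real system would ask an LLM.

```python
# Hypothetical stubs; real versions would call each system's API.
def crm_history(email):
    return {"customer": email, "plan": "pro"}


def kb_search(question):
    return ["Refunds are processed within 5 business days."]


def accounting_transactions(email):
    return [{"id": "txn_1", "amount": 49.0}]


TOOLS = {
    "crm": crm_history,
    "kb": kb_search,
    "accounting": accounting_transactions,
}


def plan_tools(question):
    """Stand-in for the model deciding which sources this question needs."""
    needed = ["crm"]
    if any(word in question.lower() for word in ("refund", "charge")):
        needed += ["kb", "accounting"]
    return needed


def gather_context(email, question):
    """Call only the tools the plan asks for and collect their results."""
    context = {}
    for name in plan_tools(question):
        arg = question if name == "kb" else email
        context[name] = TOOLS[name](arg)
    return context


context = gather_context("pat@example.com", "Where is my refund?")
```

A refund question pulls from all three sources, while a simple greeting would only touch the CRM. The point is that the mapping from request to tools is decided at run time.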
3. Self-Correction That Builds Trust
One of the biggest complaints I heard from SMBs at the gateway was about chargebacks and disputes. They'd have software that would flag potential fraud, but either it was too aggressive (flagging legitimate customers) or too lax (letting fraud slip through).
The problem? The system never informed you why it made its choice. It just made a simple binary decision, and if you disputed it, bad luck—you had to overrule it by hand.
Before committing to a decision, agentic systems check their own work. This matters a great deal when you have to trust automation with mission-critical processes.
Financial close example: AI processes journal entries, but before posting:
Checks that entries are reasonable given account balances
Screens for common errors (debit/credit transpositions, decimal issues)
Flags anything out of the ordinary
Records its reasoning for every entry
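Those pre-posting checks might look something like this minimal Python sketch. The `AccountProfile` structure and the "10x the typical amount" threshold for a suspected decimal slip are assumptions made for the example, not rules from any accounting system.

```python
from dataclasses import dataclass


@dataclass
class JournalEntry:
    account: str
    side: str        # "debit" or "credit"
    amount: float


@dataclass
class AccountProfile:
    normal_side: str       # the side this account is usually posted on
    typical_amount: float  # assumed to come from the account's history


def review_entry(entry, profile):
    """Return (ok_to_post, reasons); reasons are recorded either way."""
    reasons = []
    if entry.side != profile.normal_side:
        reasons.append(
            f"{entry.side} is unusual for {entry.account}; "
            "possible debit/credit transposition"
        )
    if entry.amount >= 10 * profile.typical_amount:
        reasons.append("amount is 10x+ the typical value; possible decimal error")
    ok = not reasons
    if ok:
        reasons.append("side and amount consistent with account history")
    return ok, reasons


supplies = AccountProfile(normal_side="debit", typical_amount=120.0)
ok, why = review_entry(JournalEntry("Office Supplies", "debit", 12500.0), supplies)
```

Note that the function returns its reasoning even when the entry passes. That is what lets a reviewer see the work instead of just the verdict.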
One CFO told us: "I trust it more than our previous junior accountant. Not because it's intelligent, but because it actually shows its work and puts doubt where it should, rather than charging ahead."
That's the difference between automation you monitor every step of the way and automation you can trust.
Applications in the Real World
At the payment gateway, reconciliation was the biggest complaint from our merchants. We'd send them daily settlement reports: a PDF of all their transactions, fees, refunds, chargebacks, etc.
Easy reports for small merchants. But our mid-sized customers? Theirs were 50+ page reports with multiple currencies, different fee schedules for different types of cards, reserve holdings, rolling reserves—a complete mess.
I watched a finance employee at a $20M merchant spend 4 hours every Monday manually reconciling our settlement report against their accounting system. She had built an enormous Excel template with formulas, but whenever we changed the format of the report (which we did quarterly), she'd have to rebuild her entire process.
Real operations documents are messy:
Vendor agreements with terms scattered throughout
Invoices referencing several contracts
Financial reports with tables, charts, and written descriptions
Compliance reports with cross-references
Agentic approach:
Look at document structure first
Decide on extraction strategy by section
Cross-reference against similar documents
Add context to data from other systems
Check completeness before committing
Financial Operations Handling Complexity
Large companies have financial data everywhere: accounting systems, budget spreadsheets, unstructured docs, email threads.
Here's a real one I saw over and over: A merchant would receive a charge disputed by a customer. To settle, their ops person would need to:
Retrieve the transaction from our gateway
Look up the original order in their online store
See if the product shipped (shipping interface)
Read any customer communication (email/support tickets)
Read their refund policy (buried in a PDF somewhere)
See if this customer had a history of disputes (spreadsheet tracker)
Make a judgment call on whether to refund
All that took 30-45 minutes per dispute. 10-20 disputes a week, and you're looking at 10+ hours of just switching between systems.
Agentic systems:
Query for payment history
Refer to contracts for terms
Cross-check budget information
Check approval chains from email
Make payment suggestions with full context
The same goes for bank reconciliation. Beyond matching transactions, the system investigates differences by analyzing numerous sources, understanding business context, and flagging only what actually needs human attention.
Outcome: What took 8 hours/month previously now takes 2 with better accuracy.
Adaptive Process Routing
Traditional automation treats all cases alike. Agentic systems understand context.
I remember one merchant at the payment gateway that was processing about $2M/month in volume. They had a complicated spreadsheet for flagging suspicious orders. It had about 30 different rules: order value, ship-to address, customer history, hour of day, IP location, all of that.
The problem? All of the flagged orders were treated the same way: put into a manual review queue. A $5,000 order by a new customer at 2am was given the same priority as a $50 order where billing and shipping addresses didn't match.
Their team was overloaded. Large-dollar legitimate orders were being held up (angry customers), and outright fraud was sitting in the same queue.
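Even before adding any AI, the flat queue is the first thing to fix. Here's a minimal Python sketch of ordering flagged orders by exposure instead of arrival time. The scoring weights are assumptions I picked for illustration, not the merchant's actual rules.

```python
import heapq


def risk_priority(order):
    """Assumed scoring: more money at stake and more signals mean earlier review."""
    score = order["value"]
    if order["new_customer"]:
        score *= 2
    if order["address_mismatch"]:
        score *= 1.5
    return score


def build_review_queue(flagged_orders):
    """Return order ids, highest priority first."""
    # heapq is a min-heap, so negate the score; the index breaks ties.
    heap = [(-risk_priority(o), i, o["id"]) for i, o in enumerate(flagged_orders)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


flagged = [
    {"id": "A", "value": 50, "new_customer": False, "address_mismatch": True},
    {"id": "B", "value": 5000, "new_customer": True, "address_mismatch": False},
]
queue = build_review_queue(flagged)
```

With these weights the $5,000 order from a new customer lands ahead of the $50 address mismatch. An agentic system goes further by deciding what each case needs, but even this small change stops high-value orders from waiting behind trivial ones.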
With agentic routing, when a customer query comes in, the system can:
Categorize inquiry type (billing, feature request, urgent support)
Identify best sources of information
Tune response approach to customer segment
Route accordingly or handle end-to-end
Not just quicker, but smarter: it makes the same judgment calls your best team members would, at scale.
How to Actually Do This
Start with High-Volume, High-Variance Processes
Ideal candidates:
High volume (100+ items/week)
Require judgment because they vary
Draw on many data sources
Well-defined success criteria, variable paths
Common starting points: AP/AR, customer onboarding, multi-step approvals, compliance documents.
In my payment gateway days, the sweet spot was always processes with that blend of high volume and high variation. Payment reconciliation was perfect: hundreds of transactions per day, but each merchant had slightly different settings. Customer disputes were another: a typical pattern, but each case had its own facts.
Build for Transparency
You need to be able to observe what the system is doing:
Decision logs showing data sources used
Confidence scores on auto-decisions
Clear escalation paths
Standard audit processes
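As a sketch of what a decision log with confidence-based escalation could look like, here's a small Python example. The 0.8 threshold and the record fields are assumptions for illustration; pick whatever your auditors actually need to see.

```python
import json
from dataclasses import dataclass, asdict

CONFIDENCE_THRESHOLD = 0.8  # assumption: below this, a human reviews


@dataclass
class Decision:
    item_id: str
    action: str
    confidence: float
    sources: list      # which systems the decision drew on
    reasoning: str


def log_and_route(decision, log):
    """Append an audit record, then either act or escalate."""
    record = asdict(decision)
    record["escalated"] = decision.confidence < CONFIDENCE_THRESHOLD
    log.append(json.dumps(record))
    return "human_review" if record["escalated"] else decision.action


audit_log = []
first = log_and_route(
    Decision("inv-1", "approve", 0.95, ["erp", "contract"], "price within tolerance"),
    audit_log,
)
second = log_and_route(
    Decision("inv-2", "approve", 0.55, ["erp"], "no matching PO line found"),
    audit_log,
)
```

Every decision gets a record, including the ones that were auto-approved. That way an audit can sample confident decisions too, not only the escalations.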
Transparency is worth more than perfect precision.
This was crucial with our merchants. The ones who relied on their automation the most weren't the ones who had the most accurate systems. They were the ones who could look at exactly what the system was doing and why.
Integrate With Your Existing Stack
Work with your tools; don't replace them. Most platforms connect with common collaboration tools like Slack, email, Google Sheets, HubSpot, Notion, and Google Calendar.
Something the payment gateway taught us: SMBs won't swap out their core systems for your solution, no matter how awesome it is. They have QuickBooks, Shopify, whatever. You'll have to work with what they have.
Monitor ROI
Establish baselines and monitor:
Time per process (hours/week)
Error rates and cost of corrections
Process completion time
Frequency of manual intervention
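Tracking these against a baseline can be as simple as the Python sketch below. The numbers are placeholders invented to show the calculation, not measurements from any of the companies mentioned here.

```python
def percent_change(baseline, current):
    """Negative values mean the metric went down relative to the baseline."""
    return (current - baseline) / baseline * 100


# Placeholder values for illustration only.
baseline = {"hours_per_week": 16.0, "error_rate": 0.05, "manual_interventions": 45}
current = {"hours_per_week": 4.0, "error_rate": 0.04, "manual_interventions": 5}

report = {
    metric: round(percent_change(baseline[metric], current[metric]), 1)
    for metric in baseline
}

for metric, change in report.items():
    print(f"{metric}: {change:+.1f}%")
```

The important part is capturing the baseline before the pilot starts. Without it, there's nothing to compare against.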
Typical ROI timeline: positive within 3-6 months, accelerating as you scale.
Common Pitfalls
Trying to Automate Everything At Once
Start with one process, show value, then scale. This builds team confidence.
I saw so many merchants at the gateway attempting to automate everything at once. They'd buy some expensive tool, spend three months implementing it, and then not use it because it was too complex. The ones who succeeded started with one cumbersome process, usually invoice handling or reconciliation, showed it worked, then scaled.
Ignoring Change Management
Frame it as "task automation, not job automation." These systems automate repetitive labor so humans get to work on strategy and relationships.
Inadequate Data Quality
Make source data accessible and structured. The system can handle variation, but it cannot fix fundamentally broken data.
One customer I consulted for wanted to automate their accounts receivable, but their data was in disarray. Customers entered three different ways, invoices with no PO number, payment terms buried in email threads. We spent two months cleaning their data before there was even a chance of automating.
Getting Started
Identify highest-impact process: High volume, significant variation, clear ROI
Map your data landscape: Document systems and data sources
Establish success metrics: Measure current baseline performance
Pilot one process: Prove the concept
Build team expertise: Develop capability to work with AI workflows
Scale thoughtfully: Expand based on pilot insights
FAQs
Why is this new?
Classic automation is rule-based. Agentic workflows decide based on context, addressing the 30% of cases that break normal automation.
What's a reasonable success rate?
Expect a 60-80% reduction in manual hours. Not 100%, since these systems still need supervision, but markedly better than the usual 70% coverage.
Which processes do we automate first?
Start with high-volume judgment processes: AR, invoices, customer onboarding. Avoid starting with low-volume, high-complexity processes.
In my experience, the best place to start is whatever process your ops team complains about the most. If they are spending hours a week on some clunky, repetitive process that takes just a bit of judgment every time, then that's your candidate.
How long before we begin to realize ROI?
Most teams will begin to get positive ROI in 3-6 months.
Do we need to replace the tools we're using now?
No. Agentic workflows integrate with your existing stack. They add to your tools rather than replace them.
What about errors?
Agentic systems will err. But with proper transparency (decision logs, confidence scores, audit processes), error rates tend to be lower than with manual processing.
At the payment gateway, I learned that absolute accuracy is not the goal. The goal is accuracy comparable to people's, but far faster, with a record you can trace to see how each decision was reached.
How technical is it?
Less than conventional automation. You tell it in plain language what you want instead of programming the rules. But you do need somebody who understands your processes.
Can it handle financial tasks?
Yes. These systems excel at reasoning across multiple sources of data: accounting systems, contracts, budgets, approval chains.
Integrations?
Most platforms support key tools. Verify individual integrations prior to purchase.
How do we know it's successful?
Track time saved per process, reduction in errors, frequency of manual action, and process completion times. Compare against your baselines.

