LLMs in production are nothing like LLMs in demos. We integrated AI into Intellis ERP, CrewHRM, and BookMyDoctor — and learned hard lessons about latency, hallucination, user trust, and when AI genuinely helps versus when it's just theatre.
Everyone is adding AI to their products in 2025–2026. The demos look impressive. The announcements get LinkedIn engagement. The reality of running AI features in production, serving real users in Bangladesh, is considerably more sobering.
We integrated AI into three products: Intellis ERP (an AI assistant for financial analysis), CrewHRM (an AI-powered performance review summariser), and BookMyDoctor (a symptom-based appointment routing feature). Here's what actually happened.
A natural language query interface for ERP financial data. The user types "What were our five biggest expenses last quarter and how did they compare to the same quarter last year?" and gets a structured answer with a mini chart.
The core use case — answering ad hoc financial questions that would otherwise require pulling a report, exporting to Excel, and spending 20 minutes on analysis — worked well for straightforward questions. Senior finance staff adopted it quickly. Time to insight for common questions dropped from ~25 minutes to under 60 seconds.
Ambiguous questions produced confident but wrong answers. "Show me underperforming products" requires defining "underperforming" — but the model would make an assumption and present it as fact without surfacing the assumption. We had one instance where a sales manager used an AI-generated analysis in a meeting without realising the comparison period was wrong.
We added explicit assumption display ("I've assumed this means sales below last year's average for the category") and a confidence indicator. This helped, but it also revealed that many users skipped reading the assumption when they were in a hurry — the exact situation where the assumption matters most.
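One way to make assumptions harder to skip is to carry them as a required part of the response and render them before the answer. The sketch below is a minimal illustration, not the Intellis ERP implementation: `AssistantAnswer`, `render`, and the three-level confidence scale are hypothetical names and choices introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class AssistantAnswer:
    """Hypothetical response shape: the answer always travels with its assumptions."""
    answer: str
    assumptions: list[str] = field(default_factory=list)
    confidence: str = "medium"  # assumed scale: "low" | "medium" | "high"


def render(result: AssistantAnswer) -> str:
    """Build the display text with assumptions placed above the answer."""
    lines = []
    if result.assumptions:
        lines.append("Assumptions:")
        lines.extend(f"  - {item}" for item in result.assumptions)
    lines.append(f"Confidence: {result.confidence}")
    lines.append("")  # blank line between the header block and the answer
    lines.append(result.answer)
    return "\n".join(lines)


if __name__ == "__main__":
    example = AssistantAnswer(
        answer="Three products are below target.",
        assumptions=[
            "'Underperforming' means sales below last year's average for the category."
        ],
        confidence="low",
    )
    print(render(example))
```

Placing the assumptions first is a layout choice, not a guarantee that anyone reads them; a hurried user can still skip straight to the numbers.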
API latency to Claude and GPT-4 from Bangladesh is higher than from the US or Europe: typically 800–1,500 ms for a simple query and up to 4–5 seconds for complex ones. For a feature that replaces a 25-minute process, 5 seconds is fine in absolute terms, but the wait still felt slow next to the rest of the application. We added streaming responses, which reduced perceived latency significantly.
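The effect of streaming on perceived latency can be sketched without any provider SDK. In the example below, `fake_model_stream` is a hypothetical stand-in for a real streaming API, and `stream_to_ui` forwards each chunk as it arrives while recording the time to the first chunk, which is the wait the user actually notices, alongside the total time.

```python
import time
from typing import Callable, Iterable, Iterator


def fake_model_stream(text: str, delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming API: yields one word at a time."""
    for word in text.split():
        time.sleep(delay_s)  # simulated network and generation delay per chunk
        yield word + " "


def stream_to_ui(chunks: Iterable[str], emit: Callable[[str], None]) -> dict:
    """Forward chunks as they arrive; record time to first chunk and total time."""
    start = time.monotonic()
    first_chunk_s = None
    parts = []
    for chunk in chunks:
        if first_chunk_s is None:
            first_chunk_s = time.monotonic() - start
        emit(chunk)  # in a real app this would push the chunk to the browser
        parts.append(chunk)
    total_s = time.monotonic() - start
    return {
        "text": "".join(parts).strip(),
        "first_chunk_s": first_chunk_s,
        "total_s": total_s,
    }


if __name__ == "__main__":
    stats = stream_to_ui(
        fake_model_stream("Top five expenses last quarter"),
        lambda chunk: print(chunk, end="", flush=True),
    )
    print(f"\nfirst chunk after {stats['first_chunk_s']:.3f}s, done after {stats['total_s']:.3f}s")
```

Total time is unchanged by streaming; only the gap before the first visible output shrinks.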
Managers in CrewHRM complete structured performance reviews for each direct report. The AI feature takes the structured inputs — ratings, free-text comments, goal progress — and generates a draft narrative summary for the HR record.
This was our most successful AI feature. The use case was clean: take structured data, produce structured narrative. The output quality was consistently good. Managers spent 5–8 minutes reviewing and editing the AI draft rather than 20–30 minutes writing from scratch.
Adoption was high because it solved a real pain point — managers universally find performance review documentation tedious — without adding new complexity to the workflow.
Two cultural issues we hadn't anticipated. First, some Bangladeshi managers were uncomfortable with AI-generated text in formal HR records — the concern was about authenticity and defensibility if the review was ever challenged. We added a clear "AI-assisted draft" watermark and editing trail.
Second, tone calibration for a South Asian professional context. The AI would generate reviews with direct critical feedback phrased in ways that felt harsh in Bangladeshi professional culture, where critical feedback is typically more indirect. We added a tone parameter and made "constructive" the default.
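A tone parameter with a "constructive" default might look like the sketch below. This is an assumption about the general shape, not CrewHRM's actual prompt: `TONE_INSTRUCTIONS`, `build_summary_prompt`, the "direct" option, and the instruction wording are all hypothetical.

```python
# Hypothetical tone options; only "constructive" as the default comes from the article.
TONE_INSTRUCTIONS = {
    "constructive": (
        "Phrase critical feedback indirectly and pair it with a concrete "
        "suggestion for improvement."
    ),
    "direct": "State critical feedback plainly and concisely.",
}


def build_summary_prompt(
    ratings: dict,
    comments: str,
    goal_progress: str,
    tone: str = "constructive",
) -> str:
    """Assemble the drafting prompt from the structured review inputs."""
    if tone not in TONE_INSTRUCTIONS:
        raise ValueError(f"Unknown tone: {tone!r}")
    rating_lines = "\n".join(f"- {name}: {score}" for name, score in ratings.items())
    return (
        "Draft a narrative performance review summary from the inputs below.\n"
        f"Tone: {TONE_INSTRUCTIONS[tone]}\n\n"
        f"Ratings:\n{rating_lines}\n\n"
        f"Manager comments:\n{comments}\n\n"
        f"Goal progress:\n{goal_progress}\n"
    )


if __name__ == "__main__":
    print(build_summary_prompt(
        {"Communication": 3, "Delivery": 4},
        "Misses some deadlines but quality is high.",
        "2 of 3 goals met",
    ))
```

Rejecting unknown tone values keeps a typo from silently falling back to an unintended style in a formal HR record.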
When a patient books an appointment on BookMyDoctor, they can optionally describe their symptoms. The AI suggests the most relevant specialty based on the description, to help patients who don't know whether they need a cardiologist or a gastroenterologist.
This feature was removed from production after six months. The problems were fundamental.
We replaced it with a simple symptom-to-specialty lookup table — hand-curated, medically reviewed, no AI — that does 90% of what the AI version did with zero hallucination risk.
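A lookup of this kind can be very small. The sketch below assumes simple phrase matching against the patient's description; the three entries are illustrative placeholders only, not the medically reviewed table used in BookMyDoctor, and `SYMPTOM_TO_SPECIALTY` and `suggest_specialty` are hypothetical names.

```python
from typing import Optional

# Illustrative placeholder entries only. These are NOT medically reviewed
# and are not the production table; a real table needs clinical sign-off.
SYMPTOM_TO_SPECIALTY = {
    "chest pain": "Cardiology",
    "heartburn": "Gastroenterology",
    "skin rash": "Dermatology",
}


def suggest_specialty(description: str) -> Optional[str]:
    """Return the specialty for the first known phrase found, or None if nothing matches."""
    text = description.lower()
    for phrase, specialty in SYMPTOM_TO_SPECIALTY.items():
        if phrase in text:
            return specialty
    return None  # no match: make no suggestion rather than guess


if __name__ == "__main__":
    print(suggest_specialty("I have had chest pain since this morning"))
    print(suggest_specialty("I feel tired all the time"))
```

The property that matters is the `None` branch: when no curated entry applies, the function makes no suggestion instead of producing a plausible-sounding one.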
AI features still in production: 2 of 3 (after 12-month review)
User adoption (ERP assistant): 61% of active finance users
Time saved per review (CrewHRM): 22 min per performance review
Time to build vs maintain: 1:3 (ongoing maintenance is real work)