ai-agents

Scaling AI-Generated Materials Lists Across 7+ Trade Services Without Hallucinating Products

Ilya PrudnikauMarch 21, 2026 12 min read

ai-agentsprompt-engineeringstructured-datacase-study

Scaling AI-Generated Materials Lists Across 7+ Trade Services Without Hallucinating Products

AI is brilliant at generating text. Ask it for a blog post, a summary, or a customer email — you'll get something usable in seconds. But ask it to generate structured data for a domain-specific business? That's where things fall apart.

We learned this the hard way while building an instant price estimation platform for trade businesses. The system needed to auto-generate materials and labour lists when a tradesperson completed onboarding. No uploads, no spreadsheets. Just tell us your industry and services, and we'll populate your pricing database with realistic items, correct units, and sensible prices.

Sounds straightforward. It wasn't.

The Problem: AI Loves to Hallucinate Products

Here's what happens when you give a large language model a simple prompt like "Generate a materials list for an electrical business that installs EV chargers and does switchboard upgrades":

You get EV chargers on the list. Solar panels. Air conditioning units. Switchboard enclosures priced at $2,500 each.

These aren't materials. They're products — items a business sells to customers, not consumables used during installation. The AI doesn't understand the distinction between a $12 cable gland and a $3,000 EV charger. Both are "related to EV charger installation," so both end up on the list.

This isn't a minor inconvenience. These materials feed directly into our AI estimation engine. If a $3,000 charger appears as a "material," every estimate balloons by thousands of dollars. The entire pricing model breaks.

We needed the AI to generate accurate, granular, domain-specific structured data — and we needed it to do this reliably across multiple industries and dozens of service types.

The Solution: 60% Constraints, 40% Instructions

After weeks of iteration, we arrived at a counterintuitive insight: the majority of an effective structured-data prompt isn't telling the AI what to do — it's telling it what NOT to do.

Our final production prompt for materials generation is substantial. Roughly 60% of it is constraints, guardrails, and exclusion rules. Here's the approach we took.

1. Product Exclusion Rules

The single most important rule in our prompt:

DO NOT include physical products that would be sold to customers
(e.g., EV chargers, solar panels, air conditioners, appliances).

ONLY include consumable materials, installation components, and labour rates.

We reinforce this rule multiple times in the prompt — in the system message, in the user message, and in a negative examples section. Redundancy isn't elegant, but it works. Without reinforcement, the model occasionally slips a solar panel or heat pump into the list.

We also maintain a strict type taxonomy, mapping AI-generated types to a normalised set. For example, when the model returns an industry name like "electrical" as the item type (instead of "material"), our post-processing layer catches and corrects it automatically.

2. Dynamic Scaling Based on Service Count

A tradesperson who only offers one service doesn't need dozens of materials. Someone offering seven services does. Static item counts produce either sparse lists or bloated ones.

We implemented a scaling approach that adjusts generation targets based on the number of services the business offers. Fewer services → fewer items. More services → proportionally more items.

Why ranges instead of exact counts? Because exact counts create pressure to pad. If you ask for exactly 30 items and the model naturally produces 24 good ones, you get 6 filler entries. Ranges let the model stop when the list is comprehensive.

3. Granular Unit Pricing

This was a subtle but critical rule. Early versions of our materials lists had entries like:

Sarking/underlay: $180/roll
Paint: $65/can
Concrete mix: $11.50/bag

These prices are technically correct, but they're useless for estimation. If a roofing job needs 47 square metres of sarking, how does the estimation engine calculate cost from "$180/roll"? How many square metres is a roll?

The fix: always price at the smallest practical unit of measurement.

For example:

Sarking/underlay → price per sqm, NOT per roll
Paint → price per litre, NOT per can
Cable → price per metre, NOT per roll

This lets our estimation engine do simple multiplication: 47 sqm × $4.50/sqm = $211.50. No unit conversion required.

4. Unit Normalisation

Even with strict unit instructions, the AI produces variations. Sometimes it returns "meter", sometimes "metres", sometimes "m", sometimes "per length". All mean the same thing.

We built a normalisation layer that maps dozens of variations to a canonical set of units. Empty or null values default to a sensible fallback. Unrecognised values are flagged rather than silently accepted, so we can catch new variations we haven't mapped yet.

5. Localised Language

Our platform serves a specific regional market. Using American English spellings like "labor" instead of "labour" or "meter" instead of "metre" erodes trust — it signals the content was generated by an AI model not built for their market.

We enforce regional spelling in the prompt and handle residual variations in the normalisation layer.

Want us to build this for you?

We've done this 70+ times — from concept to production-ready AI product in 2–4 weeks.

Book a Free Consultation

What We Learned

1. Treat AI-generated structured data like user input. Validate everything. Normalise everything. Never trust the shape or content of the response, even when the prompt is airtight.

2. Negative examples outperform positive examples. Telling the AI what NOT to include works better than listing what to include. The model generalises from negative examples more reliably.

3. Redundancy in prompts isn't a code smell — it's a reliability strategy. Repeating critical rules in multiple places reduces the failure rate by a measurable margin.

4. Post-processing isn't a fallback — it's a requirement. No prompt is 100% reliable. Production systems need deterministic validation on top of AI generation.

5. Domain expertise is the moat. Anyone can call an AI API. The value is in knowing the domain — what units to use, what types are valid, what prices are realistic. This knowledge lives in the prompt, the normalisation maps, and the scaling logic — not in the model.

The Bigger Picture

We've applied these same principles across our entire platform: form generation, estimation calculations, email templates. The pattern is always the same: use AI for reasoning and generation, but wrap it in deterministic validation, normalisation, and business logic. The AI is the engine, not the chassis.

At IT Flow AI, we build AI systems that handle real-world complexity — not just demos. If you're building AI for a domain-specific use case, book a free consultation at itflowai.com/book.

FAQ

Q: Why not just let users upload their own materials lists instead of generating them with AI? A: We support uploads too. But many users completing onboarding don't have a ready-made list. AI generation provides a working starting point they can edit, reducing onboarding friction from hours to minutes.

Q: How do you handle industries you haven't seen before? A: We support predefined industries plus a custom "other" category. For custom industries, the AI generates materials based on the services the user enters. The normalisation and validation still apply — the constraints are industry-agnostic.

Q: How do you prevent the AI from generating unrealistic prices? A: The prompt specifies realistic regional prices, and the model is reasonably accurate for common trade materials. However, prices are always editable by the user. The generated list is a starting point, not a locked-in database.

ai-agents

AI Agents in Production: What We Learned Building 16+ AI Products

We built 16+ AI agent systems for real businesses. Here are the hard-won lessons about architecture, deployment, and keeping AI reliable in production.

Mar 10 12 min

Book a Free AI Consultation