
Building AI for Industries That Don't Exist on the Internet

I want you to take the example in the next paragraph and really think about the engineering challenge it poses.
Nike is about to invest $50 million in a new running shoe. But first, they need answers from 10,000 consumers across 12 countries.
A survey programmer at a market research agency is writing code to make this happen.
But she's not using Python or JavaScript.
She's programming in Decipher: a proprietary language with barely any Stack Overflow posts or GitHub repos, and exactly zero presence in GPT-4's training data.
Her survey code must handle conditional logic like:
- If age < 35 AND runs_per_week > 3, show questions A through F
- If purchased_shoes_last_year < 2, branch to section X
Meanwhile, it must enforce hard quotas: exactly 300 responses per age group, balanced across income levels, drawn from specific sources.
One bug in this routing logic, and Nike bases a $50 million decision on corrupted data.
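To make the shape of the problem concrete, here is a minimal Python sketch of that routing and quota logic. Real studies are scripted in Decipher's own proprietary syntax, not Python, and the field names, thresholds, and quota targets below are illustrative only.

```python
# Illustrative sketch of survey routing and quota logic in Python.
# Real surveys are written in a proprietary DSL such as Decipher;
# field names, thresholds, and quota targets here are made up.

AGE_GROUP_QUOTA = 300  # e.g. exactly 300 completes per age group

def route(respondent: dict) -> list:
    """Return the survey sections this respondent should see."""
    sections = []
    if respondent["age"] < 35 and respondent["runs_per_week"] > 3:
        sections += ["A", "B", "C", "D", "E", "F"]
    if respondent["purchased_shoes_last_year"] < 2:
        sections.append("X")
    return sections

def qualifies(completes_by_age_group: dict, age_group: str) -> bool:
    """A respondent only counts while their age group is still under quota."""
    return completes_by_age_group.get(age_group, 0) < AGE_GROUP_QUOTA
```

An inverted comparison or a misplaced branch in logic like this doesn't throw an error; it silently changes who sees which questions, which is how a dataset gets corrupted without anyone noticing.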
Here’s the fun part: every product you use, from your iPhone to your breakfast cereal, was shaped by market research.
Yet the entire industry runs on proprietary tools and domain-specific languages that might as well not exist as far as modern AI is concerned.
Now imagine trying to build an AI assistant for her.
You quickly discover that ChatGPT, despite its brilliance at writing React apps, has no idea what Decipher syntax looks like.
This is the fascinating engineering challenge we're solving at Metaforms.
The Paradox of a Tech-Enabled Traditional Industry
Market research is simultaneously sophisticated and archaic.
Agencies use advanced statistical methods, deploy surveys to millions of respondents globally, and generate insights that drive billion-dollar business decisions.
Yet the core tools they rely on were often written decades ago.
- Survey programming happens in tools like Decipher or Confirmit with their own proprietary DSL syntaxes.
- Data processing relies heavily on SPSS—statistical software that's been around since 1968.
- Some agencies still use tools so old they only run on 32-bit Windows systems.
- File formats are proprietary, workflows are desktop-centric, and much of the tribal knowledge exists only in the minds of senior practitioners.
This creates a unique situation for the industry: it needs automation but operates with systems that modern AI barely understands.
The Data Problem: When LLMs Meet Legacy Workflows
When we started building AI agents for market research, we quickly discovered that large language models, despite their impressive capabilities on general tasks, were limited in this domain.
They could write elegant React apps with fully functional Python backends, but ask them to generate Decipher survey logic or troubleshoot SPSS syntax errors, and they would struggle.
This isn't because market research problems are inherently harder than software development. It's because the knowledge simply isn't in the training data.
Survey programming languages, data processing workflows, statistical methodologies, quality control procedures—none of this made it into the internet corpus that trained these models.
The fundamental challenge became: How do you build AI agents for a domain where off-the-shelf models have limited knowledge?
Our first instinct was to fine-tune existing models, but we quickly realized we lacked the high-quality training data needed. Customer data was off-limits for obvious privacy reasons, and we couldn't find sufficient public datasets that captured the nuances of real market research workflows.
This led to our first unconventional decision: we established a dedicated office in Mumbai focused entirely on domain expertise. We hired senior survey programmers, data processing specialists, and others—as consultants and advisors who could help us understand the industry from the inside.
These experts became our "ground truth" generators. They would walk through real project scenarios, explain their decision-making processes, and help us create synthetic datasets that captured the complexity of actual market research work. More importantly, they became our early adopters, stress-testing our AI agents against real-world requirements and edge cases.
This pragmatic approach taught us something crucial: building AI for specialized domains isn't just about better algorithms—it's about deeply understanding the cognitive patterns of domain experts.
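As a rough illustration of what one of those ground-truth examples can capture, here is a hypothetical record pairing a task specification with the expert's solution and reasoning. The schema and field names are assumptions for illustration, not our production format.

```python
# Hypothetical shape of a single expert-authored training example.
# The schema, field names, and contents are illustrative assumptions,
# not the actual format or data used in production.
ground_truth_example = {
    "task": "Show the running-habits block only to frequent runners under 35.",
    "platform": "Decipher",
    "expert_solution": "<survey script written by a senior programmer>",
    "expert_rationale": "Branch after the screener so quota cells stay balanced.",
    "edge_cases": [
        "respondent skips the runs-per-week question",
        "respondent is exactly 35",
    ],
}
```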
The Human-AI Collaboration Challenge
While full AI autonomy is certainly our long-term goal, current AI models aren't quite there yet for the complex, high-stakes workflows in market research. But we know we're getting closer with every new model release. The challenge for us is building the right abstraction at every stage of this evolution.
What might be optimal for a fully autonomous agent isn't necessarily best for an agent that requires handholding or guidance. After every major model release, the UX often needs to be reimagined from the ground up, not just incrementally improved.
This is especially critical when building products for automation and efficiency improvements in workflows that are highly time-sensitive and deadline-driven. The UX for interacting with AI needs to be less exploratory and more systematic—almost like a playbook that can be followed consistently.
Consider this example: many AI tools like ChatGPT have a "Think Longer" mode that users can toggle. But when building an AI copilot for a survey programmer working under tight deadlines, you have to ask fundamental questions: How would this person know when to choose that mode versus not? If the decision criteria were deterministic enough, why doesn't the tool make that choice automatically? If it's not deterministic, how would the survey programmer decide when to use it? Do they have to try both modes each time and compare results? That approach doesn't make sense for professional workflows where time is money and decisions need to be made quickly and confidently.
These are the fun UX problems we get to solve at Metaforms—designing human-AI interaction patterns that evolve with rapidly improving AI capabilities while serving the real-world constraints of professional workflows.
Real-World Performance: Lessons from Production
After deploying our agents with several market research agencies, we've learned some crucial lessons that have shaped our engineering approach:
Build High-Quality Tools for AI Agents: We realized that giving our agents effective tools to solve problems is just as important as the AI models themselves. An AI agent that does survey programming should be able to run lint checks on the scripts it writes to catch and correct syntax errors automatically. It should be able to generate test survey links and simulate user interactions—clicking through the survey flow to verify logical branching and catch routing errors.
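As a sketch of what such tools can look like from the agent's side, the snippet below exposes a lint check and a simulated walkthrough as plain functions the agent can call after each edit. The names, signatures, and the placeholder checks are assumptions, not our actual tooling.

```python
# Sketch of tools a survey-programming agent could call; names, signatures,
# and the placeholder checks are illustrative assumptions, not a real API.
from dataclasses import dataclass, field

@dataclass
class LintResult:
    passed: bool
    errors: list = field(default_factory=list)

def lint_survey_script(script: str) -> LintResult:
    """Run syntax and structural checks on a generated survey script."""
    errors = []
    if not script.strip():
        errors.append("script is empty")
    # ... real checks would validate the platform's actual syntax rules
    return LintResult(passed=not errors, errors=errors)

def simulate_respondent(script: str, answers: dict) -> list:
    """Click a simulated respondent through the survey and return the
    sections they were routed to, so branching can be verified."""
    raise NotImplementedError("walkthrough engine elided in this sketch")
```

The point of tools like these is that the agent treats a failed lint or an unexpected walkthrough result the same way a human programmer would: as a signal to revise before shipping.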
Study Expert Workflows Exhaustively: Oliver Wendell Holmes said, "I would not give a fig for the simplicity that lies on this side of complexity, but I would give my life for the simplicity that lies on the other side of complexity." This perfectly captures our approach to building AI for market research. Survey programmers have developed subtle techniques over decades—specific ways they structure code for maintainability, particular validation patterns they use for different question types, edge case handling that isn't documented anywhere. We spend significant time observing, interviewing, and learning from these experts before building our abstractions. Understanding the intrinsic complexity of the domain is essential before attempting to simplify it through automation.
Invest Heavily in Data Quality: Especially as models progress and become exceptional at instruction following, we’ve seen that the bottleneck quickly shifts to dataset quality. Small inconsistencies and errors lead to large deviations in output quality. There are several steps you can take to minimise this: manual verification by domain experts, AI-assisted verification systems that cross-check outputs against multiple quality criteria, and continuous monitoring for edge cases that agents haven't encountered before. The investment in quality infrastructure is substantial, but we’ve repeatedly learnt how important it is.
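A minimal sketch of that layered verification, assuming a simple record format and placeholder criteria (neither reflects our production checks):

```python
# Sketch of layered dataset verification: cheap automated checks run first,
# and anything flagged is routed to a domain expert instead of silently
# entering the training set. Criteria here are placeholders.

def automated_checks(example: dict) -> list:
    issues = []
    if not example.get("expert_solution"):
        issues.append("missing expert solution")
    if not example.get("task", "").strip():
        issues.append("empty task description")
    return issues

def verify(example: dict, expert_review) -> bool:
    issues = automated_checks(example)
    if issues:
        return expert_review(example, issues)  # human decides: fix or reject
    return True
```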
What's Ahead: From AI Agents to AI Crew
We've successfully built AI agents for survey programming and data processing, but our vision extends far beyond individual tools. We're working toward an integrated AI crew that can handle the complete market research lifecycle—from initial project scoping to final insight delivery.
Imagine a future where multiple specialized agents collaborate seamlessly:
Bidding Manager Agent: Analyzes RFPs (Request for Proposals), estimates project complexity, and generates competitive bid responses with detailed methodology sections.
Project Manager Agent: Orchestrates project timelines, identifies potential bottlenecks, and coordinates resource allocation across multiple concurrent studies.
Research Manager Agent: Designs sampling strategies, chooses appropriate methodologies, writes survey questionnaires and discussion guides, ensures statistical validity requirements are met, and finally identifies and reports insights.
Charting Agent: Creates visualisations and charts from given cross tabulations, following the report storyline and branding guidelines.
Each agent would have deep specialization in its domain while seamlessly collaborating with others. The project manager would understand what the survey programmer discovered about sample complexity. The data processor would know what quality controls were implemented during survey design. And so on…
This creates fascinating distributed systems challenges ahead of us:
State Management: How do you maintain consistency across agents when project requirements change mid-stream? What happens when the client requests survey modifications that affect sampling strategy?
Error Propagation: If the survey programming agent makes an optimization that inadvertently affects data processing requirements, how do downstream agents detect and adapt to this change?
Context Preservation: How do you ensure that critical project context (client preferences, methodological constraints, budget limitations) propagates appropriately through the agent pipeline?
Failure Recovery: When one agent encounters an edge case it can't handle, how do you gracefully hand off to human experts while preserving the work already completed?
Workflow Orchestration: Different project types require different agent collaboration patterns. A brand tracking study follows a different workflow than a concept testing project or a segmentation study.
The technical complexity multiplies when you consider that different agencies have different workflow preferences, different quality standards, and different client requirements. We need to build systems that are both standardized enough for reliable automation and flexible enough for agency-specific customization.
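One way to frame the state-management and context-preservation questions above is a shared, versioned project context that every agent reads from and writes to. The sketch below is purely illustrative; the field names are assumptions, and a real coordination layer would be considerably more involved.

```python
# Illustrative sketch of a shared, versioned project context. When one agent
# changes something others depend on (say, the sampling strategy), downstream
# agents can compare versions, inspect the change log, and re-validate their
# own outputs. Field names are assumptions for illustration.
from dataclasses import dataclass, field

@dataclass
class ProjectContext:
    version: int = 0
    client_preferences: dict = field(default_factory=dict)
    sampling_strategy: dict = field(default_factory=dict)
    quotas: dict = field(default_factory=dict)
    methodological_constraints: list = field(default_factory=list)
    change_log: list = field(default_factory=list)

    def update(self, agent: str, key: str, value) -> None:
        setattr(self, key, value)
        self.version += 1
        self.change_log.append((self.version, agent, key))
```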
The Broader Impact: Patterns for Industry Transformation
While we're focused on market research, the engineering patterns we're developing have broader applicability to other traditional industries ready for AI transformation.
Domain-Specific Data Generation: Most specialized industries face the same challenge we encountered—their workflows and knowledge aren't well-represented in general AI training data. The approaches we've developed for creating high-quality synthetic datasets, working with domain experts, and iterative refinement could be applied to legal document review, medical diagnosis, financial analysis, or any field where expertise is tacit and distributed.
Expert-in-the-Loop Systems: The balance between AI autonomy and human oversight that we've achieved in market research translates directly to other professional services. Accountants, consultants, analysts, and other knowledge workers face similar challenges: they need AI to handle routine tasks while maintaining control over nuanced decisions.
Legacy System Integration: Many industries rely on decades-old software with proprietary formats and non-standard workflows. The techniques we've developed for parsing legacy data formats, translating between different system languages, and building APIs around closed systems could accelerate AI adoption across numerous traditional industries.
Workflow Orchestration: Professional services often involve complex, multi-step processes where different specialists hand off work between stages. The multi-agent coordination patterns we've developed could be adapted to legal case management, medical treatment planning, financial advisory processes, or any domain that requires specialized expertise at different stages.
The Human Element: Building AI That Augments Rather Than Replaces
One of our core philosophical principles is that the goal isn't to replace market research professionals, but to amplify their capabilities. The most successful AI implementations we've seen don't eliminate human judgment—they free humans to focus on higher-value activities.
A survey programmer using our AI can spend less time on routine syntax debugging and more time on strategic questionnaire design. A data processor can focus on interpreting statistical patterns rather than cleaning data formatting issues. A project manager can concentrate on client relationship management while the AI handles scheduling optimization.
This human-centric approach has influenced our technical architecture in important ways. Our agents are designed to be transparent about their limitations, to gracefully hand off complex edge cases to humans, and to learn from human corrections over time.
Engineering Culture: First Principles and Pragmatic Execution
Building AI for a traditional industry requires a different engineering mindset than building consumer applications or enterprise SaaS products. You can't rely on standard patterns, existing libraries, or well-documented APIs. You have to solve problems from first principles while maintaining practical focus on real business value.
Our team culture reflects this balance. We're deeply curious—we dive into complex workflows and systems and understand them exhaustively before automating them. But we're also intensely pragmatic—every technical decision is evaluated against real-world usage patterns and measurable business outcomes.
When we realized that data quality was a bottleneck, we hired domain experts, built ground-truth datasets, and created feedback loops with actual users. When we discovered that perfect AI autonomy wasn't the goal, we redesigned our entire UX paradigm around human-AI collaboration.
Why This Matters: The Intersection of AI and Domain Expertise
I believe we're at an inflection point where AI capabilities are finally sophisticated enough to tackle complex professional workflows, but general-purpose models aren't sufficient for specialized domains. The real value lies at the intersection of cutting-edge AI techniques and deep domain expertise.
The frontier AI labs are solving important general problems—language understanding, reasoning, code generation. But there's enormous opportunity in applying these capabilities to specific industries with complex, established workflows and high-value professional activities.
Market research is just one example. Legal research, medical diagnosis, financial analysis, engineering design, scientific research—all of these domains have similar characteristics: they require deep expertise, involve complex workflows, generate high business value, and have been underserved by generic AI solutions.
The engineering challenges in these domains are fascinating: How do you build reliable AI systems when the cost of errors is measured in real business impact? How do you integrate AI capabilities with decades of established professional practices? How do you design human-AI interfaces that enhance rather than disrupt expert workflows?
These aren't just technical problems—they're sociotechnical challenges that require understanding both the technological possibilities and the human realities of professional work.
The Road Ahead
We're still early in our journey at Metaforms, but the problems ahead are incredibly exciting. As AI capabilities continue to advance rapidly, we have the opportunity to fundamentally transform how professional services operate.
The next wave of challenges will involve deeper statistical reasoning, more sophisticated human-AI collaboration, and expansion into additional workflow areas. We're exploring how to build AI systems that can reason about experimental design, detect subtle data quality issues that humans miss, and generate insights that surprise even experienced researchers.
But the most interesting problems aren't just technical—they're about understanding how AI can best augment human expertise in complex professional domains. This requires not just building better algorithms, but developing new paradigms for human-computer collaboration in high-stakes environments.
If you want to work on AI challenges that require deep domain understanding, build systems that handle the complexity of professional workflows, and see your code directly impact how billion-dollar decisions get made—let's connect.
The challenges are hard, the domain is fascinating, and the impact is immediate. We're not just building impressive demos—we're creating AI systems that solve real problems for real businesses, with measurable impacts on how professional work gets done.
We're actively hiring AI engineers, full-stack developers, and technical product managers. If these problems sound interesting, let's talk about what we're building and where we're headed.