How often do your biggest product decisions go untested because full-stack feels too costly to spin up?
Pricing logic, eligibility rules, feature access, rollout behavior. These decisions live deep in the stack and compound quickly, but they rarely get the same attention as surface-level changes.
Meet Convert's new full-stack testing templates. They package common decision-level experiments into plug-and-play patterns that work directly in your stack.
This guide walks through what's included, who these templates are for, and how to start testing across the stack in a way that's practical from day one.
Introducing: Convert's Full-Stack Testing Templates
Our new full-stack testing templates turn common decision-level experiments into plug-and-play patterns that run directly in your stack.
Each template comes preconfigured with SDK logic, event tracking, and guardrails, so you can move from test idea to setup in minutes.
This first release focuses on the experiments teams across growth stages tend to reach for first as they push experimentation upstream. The library will continue to grow as full-stack testing becomes part of how teams ship and learn.
Get Started with the Full-Stack Templates
Jumpstart your full-stack testing program with Convert's new plug-and-play templates. Get started with ideas and the production-ready code snippets to launch them.
Why Convert Is Bullish on Full-Stack Testing
The impact of UI improvements fades as decisions move deeper into the stack. Pricing logic, personalization rules, eligibility checks, search ranking, and feature access are resolved before a page even loads. Testing only the UI means testing after those decisions have already been made.
Convert's bet on full-stack testing starts there.
Learn More: Client-side Testing vs. Server-side Testing
As George, Lead Architect at Convert and Product Owner of Convert Full-Stack, explains:
The future of testing is about validating the entire customer journey from the server to the screen. At Convert, we are bullish on full-stack because it moves testing from a 'frontend polish' exercise to the core of our decision-making process.
George Crewe
But what really is full-stack testing? And why should you care about it?
What Is Full-Stack Testing?
Full-stack testing is experimentation that runs across the customer journey from the server to the screen, allowing teams to test backend decisions such as logic, APIs, rules, and algorithms alongside the user experience.
Instead of limiting experiments to visible UI changes, full-stack testing extends them to the systems that determine what users see, get, and experience across channels.
It covers:
- Server-side or edge-side variation assignment, where experiments are decided before content reaches the browser or app.
- SDK-based implementation that integrates experiments directly into backend services using native SDKs.
- API-driven testing for orchestration and data flows. APIs control experiment logic, variation delivery, and event logging.
- Experimentation on backend logic for testing pricing rules, search ranking, feature eligibility, personalization logic, etc.
- Consistent delivery across channels (web, mobile apps, email, and offline touchpoints).
As experimentation moves deeper into systems, APIs, and orchestration layers, we're intentionally building in this direction to support how modern teams test, learn, and scale decision-making.
APIs and MCP servers, in particular, unlock new levels of experimentation program maturity.
API-driven Experimentation Improves Control and Data Integrity
Running experiments through an API expands what you can automate, monitor, and operationalize in your experimentation program. That's why we built Convert API v2 - to give you programmatic access to everything in your account: account data, project configuration, experiences, audiences, goals, reporting, and live data.
Here's what that unlocks:
- Pipe results into your own dashboards and pull aggregated and daily reports where you're most comfortable
- Automate experiment operations: update traffic allocations over time, pause or start experiments on a schedule, trigger notifications when milestones are hit, and more
- Programmatically manage projects, collaborators, and domains, verify tracking codes, and fetch live logs for quick debugging
- Pull change history, see when changes happened, and keep an audit trail for internal accountability
And for full-stack testing specifically, it also changes how you handle variation assignment and data collection.
Instead of relying on client-side scripts that execute after page load, you can assign variations earlier in the request lifecycle. This reduces visual inconsistencies, avoids flicker, and produces cleaner data when performance, latency, or ordering matters.
MCP Servers Make Experimentation Operational At Scale
As experimentation ventures beyond the UI, teams need better ways to manage complexity.
MCP (Model Context Protocol) servers act as connectors between experimentation workflows, internal tools, and AI systems. They make it easier to automate repetitive tasks such as experiment setup, QA checks, analysis retrieval, and guardrail monitoring, without hard-wiring brittle integrations.
What this means is that you can open an AI assistant like Claude, ask it to analyze results from your latest experiment, surface insights, and even help configure a follow-up test using live Convert data through the MCP server.
Convert's MCP Server connects your Convert data directly to any MCP-compatible AI assistant and exposes Convert's REST API v2 through a consistent interface, adds built-in knowledge from Convert documentation, and keeps access controlled through clear permission levels.
You can then, in natural language, ask it:
- "Show me the performance of all active experiments."
- "Summarize experiments with conversion rate above 5% in the last 30 days."
- "Generate a summary report of completed experiments this quarter."
And get answers as you would in a typical AI conversation, right in your IDE.
You can also create tests, update traffic splits, and pause experiments.
How to run Convert MCP Server
- Install/run locally: 'npx @convertcom/mcp-server@latest' (requires Node.js 20+)
- Start with 'TOOLS_FOR_CLIENT=reporting' for insight workflows
- Escalate to 'readOnly' or 'all' when operational control becomes useful
{
"mcpServers": {
"convert": {
"command": "npx",
"args": ["-y", "@convertcom/mcp-server@latest"],
"env": {
"CONVERT_APPLICATION_ID": "your_app_id",
"CONVERT_SECRET_KEY": "your_secret_key",
"TOOLS_FOR_CLIENT": "all"
}
}
}
}
Learn More: How to Use the Convert MCP Server
Worried about its security? Don't be. Requests are signed (HMAC-SHA256), expire quickly, and credentials stay local. Also, permissions are narrow by default and expand only when needed.
How the Convert SDK and MCP Server work together
The Convert Full-Stack SDK runs inside your application. It's responsible for assigning variations, delivering experiences consistently, and tracking conversions in production code.
The Convert MCP Server, on the other hand, connects your experimentation data to AI assistants like Claude or Cursor. It gives you a conversational way to work with your experimentation data and configuration.
You use the MCP Server to plan, analyze, and guide experimentation decisions, and then you rely on the SDK to execute and measure those decisions reliably inside your product.
None of This Means Full-Stack Testing Cancels Out UI Testing
UI testing still has a solid place in experimentation, and Convert continues to support it. Our point is simply that UI testing stops at the decision surface.
Full-stack testing (emphasis on "full") extends experimentation to the decisions that shape every channel, touchpoint, and outcome.
The 12 Most Popular Convert Full-Stack Templates (With Step-by-Step Implementation)
The jump from full-stack testing theory to actual execution feels expensive. You're worried about new SDKs, backend coordination, unfamiliar workflows, and unclear blast radius. But don't fret: these Convert full-stack templates make it easier to get it right.
Each recipe packages a common full-stack experimentation pattern into a repeatable implementation. This way, you're starting from a known, working structure.
Below are 12 of the most commonly used full-stack templates, based on real usage patterns across Convert customers. Each recipe includes what it's for, how it works, and when to use it.
How we selected these templates
This first release of Convert's full-stack template library reflects the experiment patterns we've seen teams request most as they adopt full-stack testing.
While the catalog is new and expanding, this roundup focuses on broadly applicable templates that span activation, monetization, retention, and feature adoption, and that rely on backend or API-level decision-making. It's a practical starting point. We plan to release more specialized patterns as the library grows.
Activation
● Free trial length test
● Product onboarding walkthrough test
● First project default
● User role onboarding
Monetization
● Pricing page test
● Decoy price tier test
● Annual discount anchor test
● Feature metering threshold
Retention
● Re-engagement nudge test
● Notification digest frequency test
Feature Adoption
● Feature flag / kill switch test
● New feature rollout test
1. Free Trial Length Test
This template powers a full-stack experiment that tests different free-trial durations (7, 14, or 30 days) by assigning the trial length at signup, before any UI is rendered.
What you can test
- Trial duration tied to user creation
- Downstream conversion from trial to paid
- Impact of shorter vs longer exposure on activation quality
How to implement the free trial length test template
The code below shows a complete implementation of the Free Trial Length Test template using Convert's Full-Stack SDK in a Node.js environment.
Start by downloading the preconfigured JSON file for this template (available on the template's page) and importing it into your Convert project via Configuration > Import, which automatically sets up the experience and goal tracking.
In this example, the experiment assigns a trial length using the 'trialDays' value returned by the experience. That value is then persisted in your database to calculate the trial expiration date. This logic should run once, immediately after sign-up, so the decision remains consistent throughout the user's trial.
While the implementation code below is in Node.js, equivalent implementations are available in TypeScript right now. Support for Python, Go, and Java will follow as the template library expands.
const convert = new ConvertSDK({
sdkKey: 'YOUR_SDK_KEY',
dataStore: null // Use built-in memory store
});
const context = convert.createContext({
id: user.id,
attributes: {
// Add user attributes here
}
});
const variationValue = context.runExperience('trial-length-test', { userId: user.id });
// Implement logic based on variation
let trialDays = 14; // Default to control if logic fails
if (variationValue === 7) {
trialDays = 7;
} else if (variationValue === 30) {
trialDays = 30;
}
// Apply to the user object in your database
user.trialEnds = new Date(Date.now() + trialDays * 24 * 60 * 60 * 1000);
Other templates in the library follow the same core flow:
● Initialize the Convert SDK (here's how to integrate Convert's full-stack SDK)
● Choose the template you want to run
● Download the template's pre-configured JSON file and import it into your Convert project
● Create a persistent user context (the user ID should remain stable across sessions)
● Run the experience and track the relevant conversion event
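The flow above can be sketched in one place. This is a hedged outline rather than template code: the SDK instance is passed in as a parameter, and `trackConversion` is assumed as the goal-tracking call (check the SDK reference for the exact method name).

```javascript
// Generic shape shared by all templates: create a context, run the
// experience, apply the variation, then track the goal when it converts.
function runTemplate(sdk, user, experienceKey, goalKey) {
  // 1. Persistent context — the user ID must stay stable across sessions.
  const context = sdk.createContext({ id: user.id, attributes: {} });

  // 2. Run the experience to get this user's variation value.
  const variationValue = context.runExperience(experienceKey, { userId: user.id });

  // 3. Apply the variation in your own business logic, then fire the goal
  //    when the relevant conversion actually happens.
  return {
    variationValue,
    track: () => context.trackConversion(goalKey),
  };
}
```

Keeping the flow in one helper makes each new template a matter of swapping the experience key, goal key, and the business logic that consumes the variation value.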
Each template includes its own implementation notes, example code, and setup steps, available when you click the "Use this template" link below each one.
When to use the free trial length test template
- Trial expiration, billing, or entitlement logic is enforced server-side
- You want consistent trial behavior across web, app, and email
Real-world use
This pattern lets teams test trial strategy without UI hacks or race conditions. Trial length is decided once, stored once, and enforced everywhere.
Use the Free Trial Length Test template.
2. Product Onboarding Walkthrough
This full-stack experiment template lets you determine which onboarding walkthrough a user sees, with variants that differ by format (e.g., video walkthrough vs. tooltips).
What you can test
- Walkthrough format effectiveness
- Whether onboarding helps users or distracts
- Activation impact of guided vs unguided starts
When to use it
- Onboarding state needs to persist across sessions
- Walkthrough logic is reused across surfaces
- You want clean attribution for onboarding completion
Real-world use
Because the decision is made via full-stack logic, the same user always gets the same onboarding experience, regardless of device or session.
Use the Product Walkthrough Test template.
3. First Project Default
A server-side experiment that decides whether a new user gets a pre-populated "seed" project created automatically after signup.
What you can test
- Blank slate vs guided first experience
- Impact of pre-filled data on activation
- Time-to-value for first-time users
When to use it
- Activation depends on users completing a first action
- Creating a default object requires backend logic
- You want to avoid conditional UI hacks
Real-world use
This recipe tests a decision, not a layout. The UI simply reflects whether a project exists.
Use the First Project Default template.
4. User Role Onboarding
This is the template for routing users into different onboarding flows based on their selected role (e.g. Developer, Marketer).
What you can test
- Role-based onboarding paths
- Personalized feature exposure
- Activation differences by job-to-be-done
When to use it
- Role selection happens early
- Onboarding logic branches meaningfully
- You want consistent routing logic across channels
Real-world use
Because routing happens server-side, role logic stays centralized and doesn't drift across frontend implementations.
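Here's a minimal sketch of what that centralized routing can look like, with hypothetical role and flow names (the real mapping lives in your template configuration):

```javascript
// Illustrative role → flow map; names are not part of the template itself.
const ONBOARDING_FLOWS = {
  developer: { control: 'generic-tour', treatment: 'api-first-setup' },
  marketer: { control: 'generic-tour', treatment: 'campaign-walkthrough' },
};

// One server-side function owns the routing decision, so web, mobile, and
// email all resolve the same flow for the same user.
function resolveOnboardingFlow(role, variation) {
  const flows = ONBOARDING_FLOWS[role];
  if (!flows) return 'generic-tour'; // Unknown role: fall back to control.
  return flows[variation] ?? flows.control;
}
```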
Use the User Role Onboarding template.
5. Pricing Page Test
A full-stack experiment template that tests different pricing page configurations, such as layout structure or which plans are highlighted, with the decision made before the page renders.
What you can test
- Pricing page layouts
- Highlighted or default plans
- Tier emphasis based on user attributes
When to use it
- Pricing logic depends on user context
- You want consistent pricing behavior across sessions
- Client-side swaps risk flicker or misattribution
Real-world use
This pattern keeps pricing decisions deterministic and repeatable, avoiding last-second UI overrides that distort revenue data.
Use the Pricing Page Test template.
6. Decoy Price Tier
This is for full-stack experiments that introduce or remove a high-priced "decoy" tier to influence plan selection behavior.
What you can test
- Presence of a decoy tier
- Tier ordering and positioning
- Impact on middle-tier selection
When to use it
- Pricing tiers are generated server-side
- Entitlements and plans must stay in sync
- You want clean purchase attribution
Real-world use
Because tier configuration is controlled centrally, pricing logic remains consistent across checkout, invoices, and billing systems.
Use the Decoy Price Tier template.
7. Annual Discount Anchor
A full-stack experiment that tests how annual pricing discounts are framed, such as displaying a percentage off versus showing the absolute amount saved.
What you can test
- "Save 20%" vs "Save $50" messaging
- Discount framing on upgrade or pricing pages
- Perceived value of annual plans
When to use it
- Annual pricing logic is handled server-side
- Discount calculations must stay consistent across surfaces
- You want to influence plan selection without changing actual prices
Real-world use
This is 'perception testing', not a pricing test. The underlying billing logic remains untouched, while messaging adapts dynamically based on the experiment outcome.
Use the Annual Discount Anchor template.
8. Feature Metering Threshold
A server-side experiment that tests different usage limits, such as API calls or project counts, before requiring an upgrade.
What you can test
- Usage caps (e.g., 50 vs 100 units)
- Upgrade trigger timing
- Monetization pressure points
When to use it
- Usage is tracked in backend services
- Limits must be enforced reliably
- Client-side checks would be bypassable
Real-world use
This recipe lets teams test monetization boundaries without hard-coding assumptions into business logic.
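As an illustration, the sketch below assumes the experience returns the usage cap directly, the way the trial-length example returns days; the cap values and function names are hypothetical:

```javascript
const DEFAULT_CAP = 50; // Control value if the experiment is unavailable.

// Only accept caps the experiment is actually configured to test.
function resolveUsageCap(variationValue) {
  return [50, 100].includes(variationValue) ? variationValue : DEFAULT_CAP;
}

// Enforcement lives server-side, so the limit cannot be bypassed by
// tampering with the client.
function canPerformAction(currentUsage, variationValue) {
  return currentUsage < resolveUsageCap(variationValue);
}
```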
Use the Feature Metering Threshold template.
9. Re-Engagement Nudge
This test template helps you determine which re-engagement notification an inactive user receives.
What you can test
- Message type (discount vs reminder)
- Trigger logic based on inactivity
- Return-to-product behavior
When to use it
- Re-engagement depends on backend signals
- Messaging logic must be coordinated across systems
- You want consistent treatment per user
Real-world use
Because the decision is made server-side, users receive consistent messaging even if notifications are sent through different channels.
Use the Re-Engagement Nudge template.
10. Notification Digest Frequency
This full-stack experiment tests how often users receive notification digests, such as daily versus weekly, based on a server-side scheduling decision.
What you can test
- Daily vs weekly notification cadence
- Engagement without fatigue
- Optimal reminder frequency for retention
When to use it
- Notification delivery is orchestrated server-side
- Schedules are stored centrally
- You want consistent behavior across channels and sessions
Real-world use
This approach lets teams test notification strategy without duplicating logic across cron jobs, queues, or messaging services.
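A hedged sketch of that single source of truth for cadence, assuming the experience returns a cadence key like 'daily' or 'weekly':

```javascript
// One scheduling function consumed by every delivery channel, so cron
// jobs and queues never duplicate the cadence logic.
const CADENCES_MS = {
  daily: 24 * 60 * 60 * 1000,
  weekly: 7 * 24 * 60 * 60 * 1000,
};

function nextDigestAt(lastSentAt, variationValue) {
  // Fall back to weekly (control) for any unexpected variation value.
  const interval = CADENCES_MS[variationValue] ?? CADENCES_MS.weekly;
  return new Date(lastSentAt.getTime() + interval);
}
```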
Use the Notification Digest Frequency template.
11. Feature Flag / Kill Switch
This full-stack feature flag template controls whether new functionality is enabled, allowing it to be disabled instantly if issues arise.
What you can test
- Gradual feature exposure
- Risk-free rollout strategies
- Fallback behavior when features are disabled
When to use it
- New logic must be safely gated
- Rollbacks need to be immediate
- UI-only toggles are too risky
Real-world use
This recipe expands experimentation from growth tactic to safety mechanism. Feature exposure becomes reversible by design.
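The reversible-by-design idea can be sketched as follows; the in-memory store stands in for wherever you'd persist the override (cache, config service), and the names are illustrative:

```javascript
const killSwitchStore = new Map(); // featureKey -> true when disabled

// The kill switch always wins, regardless of experiment assignment, so a
// rollback takes effect on the very next request.
function isFeatureEnabled(featureKey, variationValue) {
  if (killSwitchStore.get(featureKey)) return false;
  return variationValue === 'enabled';
}

function killFeature(featureKey) {
  killSwitchStore.set(featureKey, true);
}
```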
Use the Feature Flag / Kill Switch template.
12. New Feature Rollout
This template allows you to test how a new feature is announced to users, such as via a modal, banner, or tooltip.
What you can test
- Announcement format effectiveness
- Feature discovery patterns
- Engagement with newly released functionality
When to use it
- Feature exposure should be controlled
- Messaging logic must stay consistent
- You want clean attribution for feature engagement
Real-world use
Because the announcement decision is made server-side, teams can coordinate launches across product, marketing, and lifecycle surfaces.
Use the New Feature Rollout template.
Quick Tips to Get Started With Full-Stack Testing
If you're exploring full-stack testing, you're likely balancing curiosity with caution.
You want to test decisions that live deeper in your product, and you're also aware of how easily that can turn into architecture debates, blurred ownership, or weeks of setup.
Teams that get started and build momentum here tend to do less, more deliberately, at the start.
Begin with a decision your backend already owns, such as trial length, feature limits, eligibility rules, and pricing logic. These are familiar, contained, and already enforced server-side, which removes half the friction before you even start.
Pick a single surface to work on, either one backend service or one web flow, and keep everything else out of scope until you've seen a result in your own environment.
And, importantly, before changing any logic, make sure your signals are solid. Define what success looks like, add one guardrail metric, and confirm both fire reliably.
Then run something low-risk. Early full-stack experiments tend to work best when they're operationally boring (think: trial duration, feature exposure, announcement style, usage thresholds). Once you've shipped one clean experiment end-to-end, the rest tends to move faster with far less resistance.
Who full-stack testing fits best (right now)
Full-stack testing pays off fastest for teams where real leverage already sits behind the UI:
- Product-led SaaS with trials, plans, usage limits, or gated features
- Ecommerce teams making pricing, merchandising, or search decisions upstream
- Marketplaces running ranking, matching, or allocation logic
If your product is mostly static pages with minimal personalization, UI testing may still suffice for a while.
Where teams usually stumble
These patterns keep showing up:
- Ownership gaps: Growth wants the outcome. Product owns the logic. Engineering owns the system. When no one owns the experiment end-to-end, it drifts.
- Logging as an afterthought: Backend logic changes without updating events. Suddenly, you're arguing about results instead of learning from them.
- Trying to prove too much at once: Teams try to justify full-stack testing with a "big win" experiment. The complexity kills momentum.
The teams that succeed do the opposite. Small scope, clear ownership, and clean signals. That's it.
Once you've run a couple of these, full-stack testing stops feeling daunting and starts to feel like a new default way of learning.
Conclusion: Experiment Where Your Product's Decisions Take Shape
As products grow more complex, the decisions that matter most tend to live deeper in the stack. These systems shape outcomes long before anything reaches your users' eyes.
The templates in this library are an on-ramp, a practical way to bring experimentation into those decision points while reducing setup time by reusing pre-wired logic for assignment, tracking, and guardrails.
Start small, prove the value, and expand as your experimentation program matures. Gradually, full-stack testing will become a natural extension of how your team learns, ships, and adapts.
Full-Stack Testing FAQs
Q1. What's the difference between client-side and server-side A/B testing?
Client-side testing changes what the user sees after the page loads. Server-side testing decides what the user gets before anything renders.
That distinction sounds subtle until you test something like pricing, eligibility, or feature access. Once the decision lives in the backend, trying to test it in the browser becomes fragile fast.
If the UI is just reflecting a decision that already happened, the experiment belongs upstream.
Q2. Does full-stack testing replace UI testing?
No. And it shouldn't.
UI testing is still the fastest way to learn about copy, layout, and interaction design. Full-stack testing picks up where UI testing stops. It covers the decisions that shape those interfaces in the first place.
Most mature teams end up running both, each where it makes sense.
Q3. What's an MCP server, in plain terms, and why does it show up in experimentation?
At a high level, an MCP server is a way to connect experimentation workflows with tools and AI systems through a shared context.
In practice, that means things like:
- Automating experiment setup or QA checks
- Pulling experiment context into analysis tools
- Reducing one-off scripts that nobody wants to maintain
We have an MCP server (@convertcom/mcp-server) that exposes our API to AI assistants such as Claude, Cursor, etc. Once set up, you can ask things like “show me all active experiments” or “which variation is winning” directly in your IDE.
You don’t need MCP to run experiments. The SDK does that. MCP adds a conversational layer for managing and analyzing experiments.
Q4. Do I need engineers to run full-stack experiments?
At the beginning, yes.
Someone has to add the SDK, define where decisions happen, and make sure guardrails are in place.
The good news is that once those patterns exist, a lot of experimentation stops requiring constant engineering involvement. The upfront work pays off by reducing friction later.
Q5. How do SDKs change implementation and governance?
SDKs move experimentation out of brittle scripts and into real application code.
That has a few practical effects:
- Logic is versioned and reviewable
- Rollbacks are cleaner
- Experiments don't drift across implementations
It's less about control and more about predictability. When experiments touch core systems, predictability matters.