How often do your biggest product decisions go untested because full-stack feels too costly to spin up?
Pricing logic, eligibility rules, feature access, rollout behavior. These decisions live deep in the stack and compound quickly, but they rarely get the same attention as surface-level changes.
Meet Convert's new full-stack testing templates. They package common decision-level experiments into plug-and-play patterns that work directly in your stack.
This guide walks through what's included, who these templates are for, and how to start testing across the stack in a way that's practical from day one.
Introducing: Convert's Full-Stack Testing Templates
Our new full-stack testing templates turn common decision-level experiments into plug-and-play patterns that run directly in your stack.
Each template comes preconfigured with SDK logic, event tracking, and guardrails, so you can move from test idea to setup in minutes.
This first release focuses on the experiments teams across growth stages tend to reach for first as they push experimentation upstream. The library will continue to grow as full-stack testing becomes part of how teams ship and learn.
Get Started with the Full-Stack Templates
Jumpstart your full-stack testing program with Convert's new plug-and-play templates. Get started with ideas and the production-ready code snippets to launch them.
Why Convert Is Bullish on Full-Stack Testing
The impact of UI improvements fades as decisions move deeper into the stack. Pricing logic, personalization rules, eligibility checks, search ranking, and feature access are resolved before a page even loads. Testing only the UI means testing after those decisions have already been made.
Convert's bet on full-stack testing starts there.
Learn More: Client-side Testing vs. Server-side Testing
As George, Lead Architect at Convert and Product Owner of Convert Full-Stack, explains:
The future of testing is about validating the entire customer journey from the server to the screen. At Convert, we are bullish on full-stack because it moves testing from a 'frontend polish' exercise to the core of our decision-making process.
George Crewe
But what really is full-stack testing? And why should you care about it?
What Is Full-Stack Testing?
Full-stack testing is experimentation that runs across the customer journey from the server to the screen, allowing teams to test backend decisions such as logic, APIs, rules, and algorithms alongside the user experience.
Instead of limiting experiments to visible UI changes, full-stack testing extends them to the systems that determine what users see, get, and experience across channels.
It covers:
- Server-side or edge-side variation assignment, where experiments are decided before content reaches the browser or app.
- SDK-based implementation that integrates experiments directly into backend services using native SDKs.
- API-driven testing for orchestration and data flows. APIs control experiment logic, variation delivery, and event logging.
- Experimentation on backend logic for testing pricing rules, search ranking, feature eligibility, personalization logic, etc.
- Consistent delivery across channels (web, mobile apps, email, and offline touchpoints).
As experimentation moves deeper into systems, APIs, and orchestration layers, we're intentionally building in this direction to support how modern teams test, learn, and scale decision-making.
APIs and MCP servers, in particular, unlock new levels of experimentation program maturity.
API-driven Experimentation Improves Control and Data Integrity
Running experiments through an API expands what you can automate, monitor, and operationalize in your experimentation program. That's why we built Convert API v2 - to give you programmatic access to everything in your account: account data, project configuration, experiences, audiences, goals, reporting, and live data.
Here's what that unlocks:
- Pipe results into your own dashboards and pull aggregated and daily reports where you're most comfortable
- Automate experiment operations: update traffic allocations over time, pause or start experiments on a schedule, trigger notifications when milestones are hit, and more
- Programmatically manage projects, collaborators, and domains, verify tracking codes, and fetch live logs for quick debugging
- Pull change history, see when changes happened, and keep an audit trail for internal accountability
And for full-stack testing specifically, it also changes how you handle variation assignment and data collection.
Instead of relying on client-side scripts that execute after page load, you can assign variations earlier in the request lifecycle. This reduces visual inconsistencies, avoids flicker, and produces cleaner data when performance, latency, or ordering matters.
MCP Servers Make Experimentation Operational At Scale
As experimentation ventures beyond the UI, teams need better ways to manage complexity.
MCP (Model Context Protocol) servers act as connectors between experimentation workflows, internal tools, and AI systems. They make it easier to automate repetitive tasks such as experiment setup, QA checks, analysis retrieval, and guardrail monitoring, without hard-wiring brittle integrations.
What this means is that you can open an AI assistant like Claude, ask it to analyze results from your latest experiment, surface insights, and even help configure a follow-up test using live Convert data through the MCP server.
Convert's MCP Server connects your Convert data directly to any MCP-compatible AI assistant and exposes Convert's REST API v2 through a consistent interface, adds built-in knowledge from Convert documentation, and keeps access controlled through clear permission levels.
You can then, in natural language, ask it:
- "Show me the performance of all active experiments."
- "Summarize experiments with conversion rate above 5% in the last 30 days."
- "Generate a summary report of completed experiments this quarter."
And get answers as you would in a typical AI conversation, right in your IDE.
You can also create tests, update traffic splits, and pause experiments.
How to run Convert MCP Server
- Install/run locally: 'npx @convertcom/mcp-server@latest' (requires Node.js 20+)
- Start with 'TOOLS_FOR_CLIENT=reporting' for insight workflows
- Escalate to 'readOnly' or 'all' when operational control becomes useful
{
"mcpServers": {
"convert": {
"command": "npx",
"args": ["-y", "@convertcom/mcp-server@latest"],
"env": {
"CONVERT_APPLICATION_ID": "your_app_id",
"CONVERT_SECRET_KEY": "your_secret_key",
"TOOLS_FOR_CLIENT": "all"
}
}
}
}
Learn More: How to Use the Convert MCP Server
Worried about its security? Don't be. Requests are signed (HMAC-SHA256), expire quickly, and credentials stay local. Also, permissions are narrow by default and expand only when needed.
How the Convert SDK and MCP Server work together
The Convert Full-Stack SDK runs inside your application. It's responsible for assigning variations, delivering experiences consistently, and tracking conversions in production code.
The Convert MCP Server, on the other hand, connects your experimentation data to AI assistants like Claude or Cursor. It gives you a conversational way to work with your experimentation data and configuration.
You use the MCP Server to plan, analyze, and guide experimentation decisions, and then you rely on the SDK to execute and measure those decisions reliably inside your product.
None of This Means Full-Stack Testing Cancels Out UI Testing
UI testing still has a solid place in experimentation, and Convert continues to support it. Our point is simply that UI testing stops at the decision surface.
Full-stack testing (emphasis on "full") extends experimentation to the decisions that shape every channel, touchpoint, and outcome.
The 12 Most Popular Convert Full-Stack Templates (With Step-by-Step Implementation)
The jump from full-stack testing theory to actual execution feels expensive. You're worried about new SDKs, backend coordination, unfamiliar workflows, and unclear blast radius. But don't fret: these Convert full-stack templates make it easier to get it right.
Each recipe packages a common full-stack experimentation pattern into a repeatable implementation. This way, you're starting from a known, working structure.
Below are 12 of the most commonly used full-stack templates, based on real usage patterns across Convert customers. Each recipe includes what it's for, how it works, and when to use it.
How we selected these templates
This first release of Convert's full-stack template library reflects the experiment patterns we've seen teams request most as they adopt full-stack testing.
While the catalog is new and expanding, this roundup focuses on broadly applicable templates that span activation, monetization, retention, and feature adoption, and that rely on backend or API-level decision-making. It's a practical starting point. We plan to release more specialized patterns as the library grows.
Activation
● Free trial length test
● Product onboarding walkthrough test
● First project default
● User role onboarding
Monetization
● Pricing page test
● Decoy price tier test
● Annual discount anchor test
● Feature metering threshold
Retention
● Re-engagement nudge test
● Notification digest frequency test
Feature Adoption
● Feature flag / kill switch test
● New feature rollout test
1. Free Trial Length Test
This template powers a full-stack experiment that tests different free-trial durations (7, 14, or 30 days) by assigning the trial length at signup, before any UI is rendered.
What you can test
- Trial duration tied to user creation
- Downstream conversion from trial to paid
- Impact of shorter vs longer exposure on activation quality
How to implement the free trial length test template
The code below shows a complete implementation of the Free Trial Length Test template using Convert's Full-Stack SDK in a Node.js environment.
Start by downloading the preconfigured JSON file for this template (available on the template's page) and importing it into your Convert project via Configuration > Import, which automatically sets up the experience and goal tracking.
In this example, the experiment assigns a trial length using the 'trialDays' value returned by the experience. That value is then persisted in your database to calculate the trial expiration date. This logic should run once, immediately after sign-up, so the decision remains consistent throughout the user's trial.
While the implementation code below is in Node.js, equivalent implementations are available in TypeScript right now. Support for Python, Go, and Java will follow as the template library expands.
const convert = new ConvertSDK({
sdkKey: 'YOUR_SDK_KEY',
dataStore: null // Use built-in memory store
});
const context = convert.createContext({
id: user.id,
attributes: {
// Add user attributes here
}
});
const variationValue = context.runExperience('trial-length-test', { userId: user.id });
// Implement logic based on variation
let trialDays = 14; // Default to control if logic fails
if (variationValue === 7) {
trialDays = 7;
} else if (variationValue === 30) {
trialDays = 30;
}
// Apply to the user object in your database
user.trialEnds = new Date(Date.now() + trialDays * 24 * 60 * 60 * 1000);
Other templates in the library follow the same core flow:
● Initialize the Convert SDK (here's how to integrate Convert's full-stack SDK)
● Choose the template you want to run
● Download the template's pre-configured JSON file and import it into your Convert project
● Create a persistent user context (the user ID should remain stable across sessions)
● Run the experience and track the relevant conversion event
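The flow above can be sketched in one place. This is a hedged outline rather than template code: the SDK instance is passed in as a parameter, and `trackConversion` is assumed as the goal-tracking call (check the SDK reference for the exact method name).

```javascript
// Generic shape shared by all templates: create a context, run the
// experience, apply the variation, then track the goal when it converts.
function runTemplate(sdk, user, experienceKey, goalKey) {
  // 1. Persistent context — the user ID must stay stable across sessions.
  const context = sdk.createContext({ id: user.id, attributes: {} });

  // 2. Run the experience to get this user's variation value.
  const variationValue = context.runExperience(experienceKey, { userId: user.id });

  // 3. Apply the variation in your own business logic, then fire the goal
  //    when the relevant conversion actually happens.
  return {
    variationValue,
    track: () => context.trackConversion(goalKey),
  };
}
```

Keeping the flow in one helper makes each new template a matter of swapping the experience key, goal key, and the business logic that consumes the variation value.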
Each template includes its own implementation notes, example code, and setup steps, available when you click the "Use this template" link below each one.
When to use the free trial length test template
- Trial expiration, billing, or entitlement logic is enforced server-side
- You want consistent trial behavior across web, app, and email
Real-world use
This pattern lets teams test trial strategy without UI hacks or race conditions. Trial length is decided once, stored once, and enforced everywhere.
Use the Free Trial Length Test template.
2. Product Onboarding Walkthrough
This full-stack experiment template lets you determine which onboarding walkthrough a user sees, with variants that differ by format (e.g., video walkthrough vs. tooltips).
What you can test
- Walkthrough format effectiveness
- Whether onboarding helps users or distracts
- Activation impact of guided vs unguided starts
When to use it
- Onboarding state needs to persist across sessions
- Walkthrough logic is reused across surfaces
- You want clean attribution for onboarding completion
Real-world use
Because the decision is made via full-stack logic, the same user always gets the same onboarding experience, regardless of device or session.
Use the Product Walkthrough Test template.
3. First Project Default
A server-side experiment that decides whether a new user gets a pre-populated "seed" project created automatically after signup.
What you can test
- Blank slate vs guided first experience
- Impact of pre-filled data on activation
- Time-to-value for first-time users
When to use it
- Activation depends on users completing a first action
- Creating a default object requires backend logic
- You want to avoid conditional UI hacks
Real-world use
This recipe tests a decision, not a layout. The UI simply reflects whether a project exists.
Use the First Project Default template.
4. User Role Onboarding
This is the template for routing users into different onboarding flows based on their selected role (e.g. Developer, Marketer).
What you can test
- Role-based onboarding paths
- Personalized feature exposure
- Activation differences by job-to-be-done
When to use it
- Role selection happens early
- Onboarding logic branches meaningfully
- You want consistent routing logic across channels
Real-world use
Because routing happens server-side, role logic stays centralized and doesn't drift across frontend implementations.
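Here's a minimal sketch of what that centralized routing can look like, with hypothetical role and flow names (the real mapping lives in your template configuration):

```javascript
// Illustrative role → flow map; names are not part of the template itself.
const ONBOARDING_FLOWS = {
  developer: { control: 'generic-tour', treatment: 'api-first-setup' },
  marketer: { control: 'generic-tour', treatment: 'campaign-walkthrough' },
};

// One server-side function owns the routing decision, so web, mobile, and
// email all resolve the same flow for the same user.
function resolveOnboardingFlow(role, variation) {
  const flows = ONBOARDING_FLOWS[role];
  if (!flows) return 'generic-tour'; // Unknown role: fall back to control.
  return flows[variation] ?? flows.control;
}
```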
Use the User Role Onboarding template.
5. Pricing Page Test
A full-stack experiment template that tests different pricing page configurations, such as layout structure or which plans are highlighted, with the decision made before the page renders.
What you can test
- Pricing page layouts
- Highlighted or default plans
- Tier emphasis based on user attributes
When to use it
- Pricing logic depends on user context
- You want consistent pricing behavior across sessions
- Client-side swaps risk flicker or misattribution
Real-world use
This pattern keeps pricing decisions deterministic and repeatable, avoiding last-second UI overrides that distort revenue data.
Use the Pricing Page Test template.
6. Decoy Price Tier
This is for full-stack experiments that introduce or remove a high-priced "decoy" tier to influence plan selection behavior.
What you can test
- Presence of a decoy tier
- Tier ordering and positioning
- Impact on middle-tier selection
When to use it
- Pricing tiers are generated server-side
- Entitlements and plans must stay in sync
- You want clean purchase attribution
Real-world use
Because tier configuration is controlled centrally, pricing logic remains consistent across checkout, invoices, and billing systems.
Use the Decoy Price Tier template.
7. Annual Discount Anchor
A full-stack experiment that tests how annual pricing discounts are framed, such as displaying a percentage off versus showing the absolute amount saved.
What you can test
- "Save 20%" vs "Save $50" messaging
- Discount framing on upgrade or pricing pages
- Perceived value of annual plans
When to use it
- Annual pricing logic is handled server-side
- Discount calculations must stay consistent across surfaces
- You want to influence plan selection without changing actual prices
Real-world use
This is 'perception testing', not a pricing test. The underlying billing logic remains untouched, while messaging adapts dynamically based on the experiment outcome.
Use the Annual Discount Anchor template.
8. Feature Metering Threshold
A server-side experiment that tests different usage limits, such as API calls or project counts, before requiring an upgrade.
What you can test
- Usage caps (e.g., 50 vs 100 units)
- Upgrade trigger timing
- Monetization pressure points
When to use it
- Usage is tracked in backend services
- Limits must be enforced reliably
- Client-side checks would be bypassable
Real-world use
This recipe lets teams test monetization boundaries without hard-coding assumptions into business logic.
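As an illustration, the sketch below assumes the experience returns the usage cap directly, the way the trial-length example returns days; the cap values and function names are hypothetical:

```javascript
const DEFAULT_CAP = 50; // Control value if the experiment is unavailable.

// Only accept caps the experiment is actually configured to test.
function resolveUsageCap(variationValue) {
  return [50, 100].includes(variationValue) ? variationValue : DEFAULT_CAP;
}

// Enforcement lives server-side, so the limit cannot be bypassed by
// tampering with the client.
function canPerformAction(currentUsage, variationValue) {
  return currentUsage < resolveUsageCap(variationValue);
}
```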
Use the Feature Metering Threshold template.
9. Re-Engagement Nudge
This test template helps you determine which re-engagement notification an inactive user receives.
What you can test
- Message type (discount vs reminder)
- Trigger logic based on inactivity
- Return-to-product behavior
When to use it
- Re-engagement depends on backend signals
- Messaging logic must be coordinated across systems
- You want consistent treatment per user
Real-world use
Because the decision is made server-side, users receive consistent messaging even if notifications are sent through different channels.
Use the Re-Engagement Nudge template.
10. Notification Digest Frequency
This full-stack experiment tests how often users receive notification digests, such as daily versus weekly, based on a server-side scheduling decision.
What you can test
- Daily vs weekly notification cadence
- Engagement without fatigue
- Optimal reminder frequency for retention
When to use it
- Notification delivery is orchestrated server-side
- Schedules are stored centrally
- You want consistent behavior across channels and sessions
Real-world use
This approach lets teams test notification strategy without duplicating logic across cron jobs, queues, or messaging services.
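A hedged sketch of that single source of truth for cadence, assuming the experience returns a cadence key like 'daily' or 'weekly':

```javascript
// One scheduling function consumed by every delivery channel, so cron
// jobs and queues never duplicate the cadence logic.
const CADENCES_MS = {
  daily: 24 * 60 * 60 * 1000,
  weekly: 7 * 24 * 60 * 60 * 1000,
};

function nextDigestAt(lastSentAt, variationValue) {
  // Fall back to weekly (control) for any unexpected variation value.
  const interval = CADENCES_MS[variationValue] ?? CADENCES_MS.weekly;
  return new Date(lastSentAt.getTime() + interval);
}
```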
Use the Notification Digest Frequency template.
11. Feature Flag / Kill Switch
This full-stack feature flag template controls whether new functionality is enabled, allowing it to be disabled instantly if issues arise.
What you can test
- Gradual feature exposure
- Risk-free rollout strategies
- Fallback behavior when features are disabled
When to use it
- New logic must be safely gated
- Rollbacks need to be immediate
- UI-only toggles are too risky
Real-world use
This recipe expands experimentation from growth tactic to safety mechanism. Feature exposure becomes reversible by design.
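The reversible-by-design idea can be sketched as follows; the in-memory store stands in for wherever you'd persist the override (cache, config service), and the names are illustrative:

```javascript
const killSwitchStore = new Map(); // featureKey -> true when disabled

// The kill switch always wins, regardless of experiment assignment, so a
// rollback takes effect on the very next request.
function isFeatureEnabled(featureKey, variationValue) {
  if (killSwitchStore.get(featureKey)) return false;
  return variationValue === 'enabled';
}

function killFeature(featureKey) {
  killSwitchStore.set(featureKey, true);
}
```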
Use the Feature Flag / Kill Switch template.
12. New Feature Rollout
This template allows you to test how a new feature is announced to users, such as via a modal, banner, or tooltip.
What you can test
- Announcement format effectiveness
- Feature discovery patterns
- Engagement with newly released functionality
When to use it
- Feature exposure should be controlled
- Messaging logic must stay consistent
- You want clean attribution for feature engagement
Real-world use
Because the announcement decision is made server-side, teams can coordinate launches across product, marketing, and lifecycle surfaces.
Use the New Feature Rollout template.
Quick Tips to Get Started With Full-Stack Testing
If you're exploring full-stack testing, you're likely balancing curiosity with caution.
You want to test decisions that live deeper in your product, and you're also aware of how easily that can turn into architecture debates, blurred ownership, or weeks of setup.
Teams that get started and build momentum here tend to do less, more deliberately, at the start.
Begin with a decision your backend already owns, such as trial length, feature limits, eligibility rules, and pricing logic. These are familiar, contained, and already enforced server-side, which removes half the friction before you even start.
Pick a single surface to work on, either one backend service or one web flow, and keep everything else out of scope until you've seen a result in your own environment.
And, importantly, before changing any logic, make sure your signals are solid. Define what success looks like, add one guardrail metric, and confirm both fire reliably.
Then run something low-risk. Early full-stack experiments tend to work best when they're operationally boring (think: trial duration, feature exposure, announcement style, usage thresholds). Once you've shipped one clean experiment end-to-end, the rest tends to move faster with far less resistance.
Who full-stack testing fits best (right now)
Full-stack testing pays off fastest for teams where real leverage already sits behind the UI:
- Product-led SaaS with trials, plans, usage limits, or gated features
- Ecommerce teams making pricing, merchandising, or search decisions upstream
- Marketplaces running ranking, matching, or allocation logic
If your product is mostly static pages with minimal personalization, UI testing may still suffice for a while.
Where teams usually stumble
These patterns keep showing up:
- Ownership gaps: Growth wants the outcome. Product owns the logic. Engineering owns the system. When no one owns the experiment end-to-end, it drifts.
- Logging as an afterthought: Backend logic changes without updating events. Suddenly, you're arguing about results instead of learning from them.
- Trying to prove too much at once: Teams try to justify full-stack testing with a "big win" experiment. The complexity kills momentum.
The teams that succeed do the opposite. Small scope, clear ownership, and clean signals. That's it.
Once you've run a couple of these, full-stack testing stops feeling daunting and starts to feel like a new default way of learning.
Conclusion: Experiment Where Your Product's Decisions Take Shape
As products grow more complex, the decisions that matter most tend to live deeper in the stack. These systems shape outcomes long before anything reaches your users' eyes.
The templates in this library are an on-ramp, a practical way to bring experimentation into those decision points while reducing setup time by reusing pre-wired logic for assignment, tracking, and guardrails.
Start small, prove the value, and expand as your experimentation program matures. Gradually, full-stack testing will become a natural extension of how your team learns, ships, and adapts.
Full-Stack Testing FAQs
Q1. What's the difference between client-side and server-side A/B testing?
Client-side testing changes what the user sees after the page loads. Server-side testing decides what the user gets before anything renders.
That distinction sounds subtle until you test something like pricing, eligibility, or feature access. Once the decision lives in the backend, trying to test it in the browser becomes fragile fast.
If the UI is just reflecting a decision that already happened, the experiment belongs upstream.
Q2. Does full-stack testing replace UI testing?
No. And it shouldn't.
UI testing is still the fastest way to learn about copy, layout, and interaction design. Full-stack testing picks up where UI testing stops. It covers the decisions that shape those interfaces in the first place.
Most mature teams end up running both, each where it makes sense.
Q3. What's an MCP server, in plain terms, and why does it show up in experimentation?
At a high level, an MCP server is a way to connect experimentation workflows with tools and AI systems through a shared context.
In practice, that means things like:
- Automating experiment setup or QA checks
- Pulling experiment context into analysis tools
- Reducing one-off scripts that nobody wants to maintain
We have an MCP server (@convertcom/mcp-server) that exposes our API to AI assistants such as Claude, Cursor, etc. Once set up, you can ask things like “show me all active experiments” or “which variation is winning” directly in your IDE.
You don’t need MCP to run experiments. The SDK does that. MCP adds a conversational layer for managing and analyzing experiments.
Q4. Do I need engineers to run full-stack experiments?
At the beginning, yes.
Someone has to add the SDK, define where decisions happen, and make sure guardrails are in place.
The good news is that once those patterns exist, a lot of experimentation stops requiring constant engineering involvement. The upfront work pays off by reducing friction later.
Q5. How do SDKs change implementation and governance?
SDKs move experimentation out of brittle scripts and into real application code.
That has a few practical effects:
- Logic is versioned and reviewable
- Rollbacks are cleaner
- Experiments don't drift across implementations
It's less about control and more about predictability. When experiments touch core systems, predictability matters.