Pipeline API concepts

This page explains the data model behind the Pipeline API from the inside out — the "why" behind each concept. For complete field tables and validation rules, see the object reference.

For hands-on examples, start with the Quickstart or Building a custom pipeline.

The big picture

The Pipeline API operates on generic events. A pipeline definition answers: "Of all the events in a date range, how many unique visitors (or sessions) matched each of my defined steps?"

Steps are independent: there is no funnel requirement, so a visitor can be counted in a later step without having matched an earlier one. Each step counts distinct values of a chosen field across all events that passed its filters.

Pipeline definition
│
├── step 0: "Widget started"   → count distinct visitor_id WHERE category='widget' AND action='start'
├── step 1: "Widget clicked"   → count distinct visitor_id WHERE category='widget' AND action='click'
└── step 2: "Goals reached"    → count distinct visitor_id WHERE category='goal'  AND action='trigger'
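The counting model in the diagram can be sketched in a few lines of Python. This is only an illustration over hypothetical in-memory events, not how the API is implemented; the `count_step` helper and the sample data are assumptions made for the example.

```python
# Hypothetical events; field names follow the diagram above.
events = [
    {"visitor_id": "v1", "category": "widget", "action": "start"},
    {"visitor_id": "v1", "category": "widget", "action": "click"},
    {"visitor_id": "v2", "category": "widget", "action": "start"},
    {"visitor_id": "v3", "category": "goal", "action": "trigger"},
    {"visitor_id": "v1", "category": "widget", "action": "start"},  # repeat event, same visitor
]

steps = [
    ("Widget started", {"category": "widget", "action": "start"}),
    ("Widget clicked", {"category": "widget", "action": "click"}),
    ("Goals reached", {"category": "goal", "action": "trigger"}),
]


def count_step(events, conditions, count_field="visitor_id"):
    """Count distinct count_field values over events that satisfy every condition."""
    return len({
        e[count_field]
        for e in events
        if all(e.get(field) == value for field, value in conditions.items())
    })


# Each step is evaluated against all events, independently of the other steps.
counts = {name: count_step(events, conditions) for name, conditions in steps}
print(counts)
```

Visitor `v3` is counted in "Goals reached" without ever appearing in the widget steps, which is what "no funnel requirement" means in practice.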

On top of this foundation, three optional features add depth:

Metrics — a step can also return grouped breakdowns (e.g. unique visitors by goal name).
Dimensions — slice all step counts at query time by a shared property (e.g. interaction, goal) without changing the definition.
Value sets — store named lists of IDs that are swapped into event filters by reference, so the definition never needs updating when your interaction or goal lists change.

Steps and event filters

A step's count_field sets which field is counted distinctly (visitor_id, session_id, or event_id). Its events list is a set of event filters: conditions are AND-ed within one filter and OR-ed across filters. When alternatives differ only in the values of a single field, give that field a list of values instead of writing multiple filter objects:

{ "category": "widget", "action": ["openchat", "leadform", "starttask"], "label": "$interaction-ids" }
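The AND/OR rule can be read as the following matching logic. This is a minimal sketch under the assumption that a list value means "any of these values"; `matches_filter` and `matches_step` are hypothetical helper names, and value set references such as `$interaction-ids` are left out here.

```python
def matches_filter(event, event_filter):
    """Every condition in one filter must hold (AND); a list value means any-of."""
    for field, expected in event_filter.items():
        allowed = expected if isinstance(expected, list) else [expected]
        if event.get(field) not in allowed:
            return False
    return True


def matches_step(event, event_filters):
    """An event matches the step if at least one filter matches (OR)."""
    return any(matches_filter(event, f) for f in event_filters)


step_filters = [
    {"category": "widget", "action": ["openchat", "leadform", "starttask"]},
    {"category": "goal", "action": "trigger"},
]

lead = {"category": "widget", "action": "leadform"}
goal = {"category": "goal", "action": "trigger"}
mixed = {"category": "goal", "action": "leadform"}  # satisfies neither filter completely

results = [matches_step(e, step_filters) for e in (lead, goal, mixed)]
print(results)
```

The `mixed` event shows why conditions are not pooled across filters: it has a category from one filter and an action from the other, yet matches neither.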

See the event filters reference for supported fields, value set syntax, and properties filter behaviour.

Value sets

A value set is a named list of strings stored per organization. It decouples "which IDs to track" from the pipeline definition — the same definition produces different counts for each org because $interaction-ids resolves to that org's list at query time:

Organization A:  "interaction-ids" → ["id-1", "id-2", "id-3"]
Organization B:  "interaction-ids" → ["id-99", "id-100"]

Info

If the value set doesn't exist, the filter condition is omitted and the step is unconstrained on that field.
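Resolution of a `$name` reference, including the missing-set behaviour described in the note, might look like the sketch below. The `resolve_filter` helper and the `value_sets` store layout are assumptions made for illustration; the real storage and lookup are internal to the API.

```python
# Hypothetical per-organization store of value sets.
value_sets = {
    "org-a": {"interaction-ids": ["id-1", "id-2", "id-3"]},
    "org-b": {"interaction-ids": ["id-99", "id-100"]},
    "org-c": {},  # this organization has not defined the value set
}


def resolve_filter(event_filter, org_value_sets):
    """Replace "$name" references with the organization's list of values.

    A reference to a value set that does not exist drops the condition,
    leaving the filter unconstrained on that field.
    """
    resolved = {}
    for field, value in event_filter.items():
        if isinstance(value, str) and value.startswith("$"):
            ids = org_value_sets.get(value[1:])
            if ids is None:
                continue  # missing value set: omit the condition
            resolved[field] = list(ids)
        else:
            resolved[field] = value
    return resolved


definition_filter = {"category": "widget", "action": "start", "label": "$interaction-ids"}

resolved_a = resolve_filter(definition_filter, value_sets["org-a"])
resolved_c = resolve_filter(definition_filter, value_sets["org-c"])
print(resolved_a)
print(resolved_c)
```

For `org-c` the `label` condition disappears entirely, so the step would count every widget start regardless of interaction. That is worth keeping in mind when a count looks unexpectedly high.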

Metrics

Metrics are optional sub-queries on a step, only evaluated by the step-detail endpoint. Where a step's count answers "how many visitors matched?", a metric answers "how were they distributed?" — e.g. unique visitors by goal name, or total cart value.

A metric has three parts: measure (what to aggregate), group_by (which fields to split by), and optional mapping_functions (which translate raw IDs in group_by results to human-readable names).
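As a rough mental model, a count_distinct metric grouped by one or more fields behaves like the sketch below. The `evaluate_metric` function and the `name_lookups` dictionary are hypothetical stand-ins: in the API, name translation is done by the function IDs given in mapping_functions, not by a dictionary supplied by the caller.

```python
from collections import defaultdict


def evaluate_metric(events, field, group_by, name_lookups=None):
    """Count distinct `field` values per group, translating group keys to names.

    name_lookups maps a group_by field to an {id: name} dictionary.
    """
    name_lookups = name_lookups or {}
    groups = defaultdict(set)
    for e in events:
        key = tuple(e.get(g) for g in group_by)
        groups[key].add(e[field])
    return {
        tuple(name_lookups.get(g, {}).get(k, k) for g, k in zip(group_by, key)): len(values)
        for key, values in groups.items()
    }


# Hypothetical events that already passed the step's event filters.
goal_events = [
    {"visitor_id": "v1", "label": "goal-1"},
    {"visitor_id": "v2", "label": "goal-1"},
    {"visitor_id": "v1", "label": "goal-2"},
    {"visitor_id": "v1", "label": "goal-1"},  # same visitor and goal, counted once
]

users_by_goal = evaluate_metric(
    goal_events,
    field="visitor_id",
    group_by=["label"],
    name_lookups={"label": {"goal-1": "Signup", "goal-2": "Purchase"}},
)
print(users_by_goal)
```

The step count for these events would be 2 (visitors `v1` and `v2`), while the metric shows how those visitors are spread across goals.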

See the metrics reference for supported fields, aggregation functions, and mapping_functions IDs.

Dimensions

Why dimensions?

Filtering by a property like "interaction" across a multi-step pipeline has two complications:

The same logical property maps to different values on different event types. Widget events use label for interaction IDs; goal events use label for goal IDs. Blindly filtering every step on label = "int-abc-123" would match the wrong events on the goal step.
Some steps don't carry the property at all. There is no field to filter on directly — yet you still want those steps restricted to only the sessions that matched on another step.

A dimension declaration solves both at definition time: for each step you specify either which field to match directly, or which other step's matching sessions to inherit. At query time you pass a single filter value and the API applies the correct logic per step automatically.

This separates what to measure (the definition) from how to slice it (the query-time filter value).

Two mapping modes

Each entry in step_mappings uses one of two modes:

Direct field mapping (field) — filter this step's events on the named field. Used on steps whose events carry the filterable value directly (e.g. widget steps where label = interaction ID).
Cohort mapping (cohort_from_step) — instead of filtering by field, inherit the session_ids of sessions that matched the filter on another step. Used on steps whose events don't carry the value (e.g. a goal step where label = goal ID, not interaction ID).
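The two modes can be illustrated with a small sketch. It assumes each step's events have already been matched by that step's event filters, and that a cohort mapping always points at a step with a direct mapping; `apply_dimension` is a hypothetical helper, not part of the API.

```python
def apply_dimension(step_events, step_mappings, filter_values):
    """Return, per step index, the events left after applying one dimension filter."""
    allowed = set(filter_values)
    result = {}
    # Direct mappings first, so cohort steps can inherit from them.
    for m in step_mappings:
        if "field" in m:
            i = m["step_index"]
            result[i] = [e for e in step_events[i] if e.get(m["field"]) in allowed]
    # Cohort mappings keep only events from sessions that survived on the source step.
    for m in step_mappings:
        if "cohort_from_step" in m:
            i = m["step_index"]
            cohort = {e["session_id"] for e in result[m["cohort_from_step"]]}
            result[i] = [e for e in step_events[i] if e["session_id"] in cohort]
    return result


step_events = {
    0: [{"session_id": "s1", "label": "int-a"}, {"session_id": "s2", "label": "int-b"}],
    1: [{"session_id": "s1", "label": "int-a"}],
    2: [{"session_id": "s1", "label": "goal-1"}, {"session_id": "s2", "label": "goal-1"}],
}
step_mappings = [
    {"step_index": 0, "field": "label"},
    {"step_index": 1, "field": "label"},
    {"step_index": 2, "cohort_from_step": 0},
]

sliced = apply_dimension(step_events, step_mappings, ["int-a"])
print({i: [e["session_id"] for e in evts] for i, evts in sliced.items()})
```

Step 2 never compares its own `label` with `int-a`. It keeps only the goal event from session `s1`, because `s1` is the only session that matched the filter on step 0.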

A full dimension must have exactly one mapping per step. See the dimensions reference for complete field tables, validation rules, and worked examples.
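A client-side pre-check for the coverage rule could look like this. It is a sketch only: the authoritative validation rules live in the dimensions reference, and `validate_dimension` is a hypothetical helper that checks just the two rules stated on this page.

```python
def validate_dimension(dimension, step_count):
    """Return a list of problems; an empty list means the mappings look valid."""
    problems = []
    mappings = dimension["step_mappings"]
    for m in mappings:
        # Each entry uses exactly one of the two mapping modes.
        if ("field" in m) == ("cohort_from_step" in m):
            problems.append(
                f"step {m['step_index']}: set exactly one of field or cohort_from_step"
            )
    if not dimension.get("partial", False):
        # A full dimension maps every step exactly once.
        covered = sorted(m["step_index"] for m in mappings)
        if covered != list(range(step_count)):
            problems.append(
                f"full dimension must map each of {step_count} steps once, got {covered}"
            )
    return problems


full = {
    "name": "interaction",
    "step_mappings": [
        {"step_index": 0, "field": "label"},
        {"step_index": 1, "field": "label"},
        {"step_index": 2, "cohort_from_step": 0},
    ],
}
partial = {"name": "goal", "partial": True, "step_mappings": [{"step_index": 2, "field": "label"}]}
incomplete = {"name": "goal", "step_mappings": [{"step_index": 2, "field": "label"}]}

print(validate_dimension(full, 3))
print(validate_dimension(partial, 3))
print(validate_dimension(incomplete, 3))
```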

Partial dimensions and filtering "by project"

A partial dimension ("partial": true) covers only a subset of steps — steps not listed are completely unaffected when it is used as a filter. The intended pattern is to define multiple partial dimensions that together cover all steps, so applying them simultaneously acts as a single compound filter.

A concrete example: define one partial dimension for interaction IDs covering widget steps, and another for goal IDs covering the goal step. Pass both at query time, each populated with IDs from a single project, and the entire pipeline is effectively filtered "by project" — no cohort wiring between steps needed.
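The "by project" pattern can be sketched as two partial dimensions applied together. The dimension names, the sample events, and the `apply_partial_dimensions` helper are all hypothetical; the sketch handles direct field mappings only.

```python
def apply_partial_dimensions(step_events, dimensions, active_filters):
    """Apply each active dimension to only the steps it lists; other steps are untouched."""
    result = {i: list(evts) for i, evts in step_events.items()}
    for dim in dimensions:
        values = active_filters.get(dim["name"])
        if values is None:
            continue  # dimension not used in this query
        allowed = set(values)
        for m in dim["step_mappings"]:
            i = m["step_index"]
            result[i] = [e for e in result[i] if e.get(m["field"]) in allowed]
    return result


dimensions = [
    {"name": "interaction", "partial": True, "step_mappings": [
        {"step_index": 0, "field": "label"},
        {"step_index": 1, "field": "label"},
    ]},
    {"name": "goal", "partial": True, "step_mappings": [
        {"step_index": 2, "field": "label"},
    ]},
]

step_events = {
    0: [{"label": "int-a"}, {"label": "int-b"}],
    1: [{"label": "int-a"}],
    2: [{"label": "goal-1"}, {"label": "goal-2"}],
}

# IDs belonging to one hypothetical project.
by_project = apply_partial_dimensions(
    step_events, dimensions, {"interaction": ["int-a"], "goal": ["goal-1"]}
)
goal_only = apply_partial_dimensions(step_events, dimensions, {"goal": ["goal-1"]})

print({i: len(evts) for i, evts in by_project.items()})
print({i: len(evts) for i, evts in goal_only.items()})
```

With both filters active every step is restricted to the project's IDs. With only the goal filter, steps 0 and 1 keep all their events.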

See the partial dimensions reference for the full worked example.

Filters at query time

Filters are passed in the request body and don't change the definition. There are two types:

base_filters: a global pre-filter applied to all steps before any step-specific matching. Supported on raw event fields: geo_country, geo_city, room_ids.
filters: dimension-specific filters that reference a named dimension declared on the pipeline. They apply different filtering logic per step, based on that step's mapping (direct or cohort).

Both types accept apply_to_pageview to also restrict pageview counts.

See the filters reference for field tables and examples.

A complete annotated example

{
  "name": "Impressions to goals (filtered)",
  "pageview_source_field": "referrer_medium",

  "steps": [

    {
      "name": "Impressions",
      "count_field": "visitor_id",
      "events": [
        {
          "category": "widget",
          "action": "start",
          "label": "$interaction-ids"   // resolved at query time from the org's value set
        }
      ],
      "metrics": [
        {
          "name": "Impressions by interaction",
          "measure": { "field": "visitor_id", "aggregations": ["count_distinct"] },
          "group_by": ["label"],
          "mapping_functions": { "label": "f$interactions" }  // translate IDs to names
        }
      ]
    },

    {
      "name": "Engaged with giosg",
      "count_field": "visitor_id",
      "events": [
        { "category": "widget", "action": "click", "label": "$interaction-ids" }
      ]
      // no metrics on this step
    },

    {
      "name": "Goals reached",
      "count_field": "visitor_id",
      "events": [
        { "category": "goal", "action": "trigger", "label": "$goal-ids" }
      ],
      "metrics": [
        {
          "name": "Users by goal",
          "measure": { "field": "visitor_id", "aggregations": ["count_distinct"] },
          "group_by": ["label"],
          "mapping_functions": { "label": "f$goals" }
        }
      ]
    }

  ],

  "dimensions": [

    {
      "name": "interaction",
      "label": "Interaction",
      "step_mappings": [
        { "step_index": 0, "field": "label" },          // direct: widget events carry interaction IDs in label
        { "step_index": 1, "field": "label" },          // direct: same
        { "step_index": 2, "cohort_from_step": 0 }      // cohort: goal events use label for goal IDs, not interaction IDs
      ],
      "pageview_cohort_from_step": 0
    },

    {
      "name": "goal",
      "label": "Goal",
      "partial": true,                                  // only covers step 2
      "step_mappings": [
        { "step_index": 2, "field": "label" }           // goal events carry goal IDs in label
      ]
      // steps 0 and 1 are unaffected when this dimension is used
    }

  ]
}

At query time, base_filters pre-filter all steps by a raw event field (e.g. geo_country), while filters reference a named dimension — here goal is partial, so only step 2 is affected. The step-detail endpoint for step 2 additionally returns the Users by goal metric with IDs translated to names via mapping_functions. See the query endpoints reference for full payload and response shapes.

Summary

Concept	Defined at	Evaluated at	Purpose
Step	Definition	Counts / step-detail	What to count and how to filter it
Event filter	Definition	Counts / step-detail	Which events match a step
Value set	Org-level store	Counts / step-detail	Reusable ID lists, swapped in at query time
Metric	Definition	Step-detail only	Grouped sub-aggregations within a step
mapping_functions	Definition	Step-detail only	Translate ID values to names in metric results
Dimension	Definition	Counts / step-detail	Declare how a property maps across steps for slicing
Partial dimension	Definition	Counts / step-detail	Like a dimension, but covers only selected steps
Base filter	Query request	Counts / step-detail	Global event pre-filter applied to all steps
Active dimension filter	Query request	Counts / step-detail	Per-step slice using a declared dimension