Generic Event query examples for Target Level creation

Querying generic events¶

After you have successfully sent Generic Events, you can start querying your data and create Conversational Reporting Targets. Manual queries work by making POST requests to https://api.giosg.com/api/events/v2/orgs/<organization_uuid>/fetch/ or https://api.giosg.com/api/events/v2/users/<user_id>/fetch/

For example, if you would like a report of the daily impressions (views) of different widgets for the previous week, you could query it like this: HTTP POST https://api.giosg.com/api/events/v2/orgs/<organization_uuid>/fetch/

{
    "sources": ["untrusted"],
    "interval": {
        "start": "2019-01-07T00:00:00.000Z",
        "end": "2019-01-14T00:00:00.000Z",
        "time_zone": "UTC"
    },
    "granularity": "day",
    "group_by": ["category", "action", "label"],
    "vendor": "your.company.xyz",
    "aggregations": ["sum"],
    "filters": {
        "type": "and",
        "fields": [
            {"type": "selector", "dimension": "category", "value": "widget"},
            {"type": "selector", "dimension": "action", "value": "impression"}
        ]
    }
}

You would get the following response back from the API. See the documentation below for more details about the query request payload and response format.

{
    "fields": [
        {
            "name": "timestamp",
            "type": "dimension"
        },
        {
            "name": "category",
            "type": "dimension"
        },
        {
            "name": "action",
            "type": "dimension"
        },
        {
            "name": "label",
            "type": "dimension"
        },
        {
            "name": "sum_value",
            "type": "metric"
        }
    ],
    "data": [
        [
            "2019-01-07T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-1",
            212
        ],
        [
            "2019-01-07T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-2",
            2
        ],
        [
            "2019-01-08T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-2",
            94
        ],
        [
            "2019-01-09T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-1",
            128
        ],
        [
            "2019-01-09T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-3",
            122
        ],
        [
            "2019-01-10T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-1",
            378
        ],
        [
            "2019-01-11T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-1",
            489
        ],
        [
            "2019-01-12T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-1",
            489
        ],
        [
            "2019-01-13T00:00:00.000Z",
            "widget",
            "impression",
            "xyz-widget-1",
            489
        ]
    ]
}
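The response separates column names (fields) from row values (data), so it is often convenient to zip them back together on the client. The following is a minimal sketch of that idea; the function name `rows_to_dicts` and the shortened sample response are illustrative and not part of the API.

```python
from typing import Any


def rows_to_dicts(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Pair each row in "data" with the column names listed in "fields"."""
    names = [field["name"] for field in response["fields"]]
    return [dict(zip(names, row)) for row in response["data"]]


# Shortened sample in the same shape as the response shown above.
example = {
    "fields": [
        {"name": "timestamp", "type": "dimension"},
        {"name": "label", "type": "dimension"},
        {"name": "sum_value", "type": "metric"},
    ],
    "data": [
        ["2019-01-07T00:00:00.000Z", "xyz-widget-1", 212],
        ["2019-01-07T00:00:00.000Z", "xyz-widget-2", 2],
    ],
}

for record in rows_to_dicts(example):
    print(record["timestamp"], record["label"], record["sum_value"])
```

Each record can then be read by column name instead of by position, which keeps client code stable if the order of group_by dimensions changes.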

Query request payload¶

An HTTP POST request to https://api.giosg.com/api/events/v2/orgs/<organization_uuid>/fetch/, or the query part of a Conversational Reporting Target definition, requires a JSON payload with the following schema.

Name	Type	Required	Description
interval	`Object`	required	Interval object containing `start` and `end` timestamps, and `time_zone`. See "Interval object" below.
granularity	`String`	required	Time granularity of the response. Choices: `all`, `minute`, `fifteen_minute`, `thirty_minute`, `hour`, `day`, `week`, `month`, or `year`. See "Granularity choices" below.
vendor	`String`	required	Vendor name of event sender you want to query. For example: `your.company.xyz`.
aggregations	`Array<String>`	required	List of aggregations you want to get. Choices: `min`, `max`, `sum`, `count`, `avg`, `category_uniq`, `action_uniq`, `label_uniq`, `visitor_id_uniq`, `session_id_uniq`, `user_id_uniq`.
sources	`Array<String>`	required	List of event sources you want to include in response. Choices: `trusted`, `untrusted`.
group_by	`Array<String>`	optional (Default: `[]`)	List of dimension columns you want to group by. For example: `["category", "action", "browser_name"]`. See "List of dimension columns" below.
filters	`Object`	optional (Default: `{}`)	Filter object specifying how to filter the data. See "Filter object" and "List of dimension columns" below.
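As a rough illustration of the schema above, the sketch below checks the required fields and prepares a POST request with Python's standard library. It only builds the request and does not send it. The helper name `build_fetch_request`, the all-zero organization UUID, and the placeholder Authorization value are assumptions made for this example; this section does not specify the authorization scheme, so the header value is passed in as-is.

```python
import json
import urllib.request

API_ROOT = "https://api.giosg.com/api/events/v2"
REQUIRED = ("interval", "granularity", "vendor", "aggregations", "sources")


def build_fetch_request(organization_id, payload, authorization):
    """Return an unsent POST request for the organization-level fetch endpoint.

    `authorization` is the complete Authorization header value; its format
    is not described here, so the caller has to supply it.
    """
    missing = [name for name in REQUIRED if name not in payload]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return urllib.request.Request(
        url=f"{API_ROOT}/orgs/{organization_id}/fetch/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": authorization},
        method="POST",
    )


payload = {
    "sources": ["untrusted"],
    "interval": {
        "start": "2019-01-07T00:00:00.000Z",
        "end": "2019-01-14T00:00:00.000Z",
        "time_zone": "UTC",
    },
    "granularity": "day",
    "vendor": "your.company.xyz",
    "aggregations": ["sum"],
}

# Placeholder organization UUID and header value, for illustration only.
request = build_fetch_request(
    "00000000-0000-0000-0000-000000000000", payload, "<authorization header value>"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual call; that step is left out here so the sketch runs without network access or credentials.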

Querying user owned data¶

An HTTP POST request to https://api.giosg.com/api/events/v1/users/<user_id>/fetch/ fetches user-owned data, that is, data that belongs to a user rather than to any organization. This endpoint therefore only returns data where organization_id is null. Filters and aggregations work exactly the same as on the organization-level endpoint.

Interval object¶

The interval object specifies the time range to query. The start timestamp is inclusive: if you specify 2019-01-01T00:00:00.000Z, events that happened during that exact millisecond are included. The end timestamp is exclusive: if you specify 2019-01-02T00:00:00.000Z, you get events up to and including 2019-01-01T23:59:59.999Z. The interval object also contains a time_zone field for time zone information. If you leave time zone information out of a timestamp, it defaults to UTC.

Name	Type	Required	Description
start	`String`	required	Start timestamp of the interval to query. Value should be timestamp in ISO8601 format. You can give timestamp with offset, for example: `2019-01-30T16:32:17.879+02:00`. Start time is inclusive.
end	`String`	required	End timestamp of the interval to query. Value should be timestamp in ISO8601 format. You can give timestamp with offset, for example: `2019-01-30T16:32:17.879-02:00`. End time is exclusive.
time_zone	`String`	optional (Default: `UTC`)	The time zone used to calculate time bucket divider moments in different query granularities. E.g. `Europe/Helsinki` with granularity of `day` would produce buckets similar to `2019-01-01T00:00:00.000+02:00`, `2019-01-02T00:00:00.000+02:00`, `2019-01-03T00:00:00.000+02:00`.
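Because start is inclusive and end is exclusive, a full week is expressed as Monday 00:00 to the following Monday 00:00. The sketch below computes such an interval for the week before a given date. The helper names are hypothetical, and treating Monday as the first day of the week is a choice made for this example, not something the API prescribes.

```python
from datetime import date, datetime, timedelta, timezone


def to_api_timestamp(moment: datetime) -> str:
    """Format an aware datetime as ISO 8601 in UTC with millisecond precision."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def previous_week_interval(today: date) -> dict[str, str]:
    """Half-open interval [Monday, next Monday) for the week before `today`."""
    this_monday = today - timedelta(days=today.weekday())
    start = datetime.combine(
        this_monday - timedelta(days=7), datetime.min.time(), tzinfo=timezone.utc
    )
    end = start + timedelta(days=7)
    return {
        "start": to_api_timestamp(start),
        "end": to_api_timestamp(end),
        "time_zone": "UTC",
    }


print(previous_week_interval(date(2019, 1, 16)))
```

For 16 January 2019 this yields the same 2019-01-07 to 2019-01-14 range used in the query example at the top of this page.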

Aggregations¶

The aggregations field of the request payload takes a list of the aggregates you want. max returns the maximum of the events' value field per group, min returns the minimum, and sum returns the sum of all values. count returns the number of events, which is useful when you have used a number other than 1 in the value field. avg returns the average of the value field per group for the given granularity. For example, if there are three events with values 1, 2 and 3, avg returns 2.0. Adding avg to aggregations also adds sum and count, as those are needed for the calculation.

You can also request unique counts of certain dimensions by using the dimension name followed by an underscore (_) and the word uniq, e.g. category_uniq. This returns the number of unique values for that field in the data set. The valid unique-count aggregations are category_uniq, action_uniq, label_uniq, visitor_id_uniq, session_id_uniq and user_id_uniq. The counts are calculated using the HyperLogLog algorithm, which has a maximum error rate of 2%, so they should not be used when precise results are needed.
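To make the relationship between the value aggregates concrete, the sketch below computes them locally for one group of event values. It only illustrates the arithmetic described above; it is not how the service computes its results, and the function name `aggregate` is made up for this example.

```python
def aggregate(values: list[float]) -> dict[str, float]:
    """Compute the value aggregates for one group of events locally."""
    total = sum(values)
    count = len(values)
    return {
        "min": min(values),
        "max": max(values),
        "sum": total,
        "count": count,
        "avg": total / count,  # avg is derived from sum and count
    }


# The three events with values 1, 2 and 3 from the text above.
print(aggregate([1, 2, 3]))
```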

Filter object¶

Filter object specifies how data should be filtered before calculating requested aggregations.

Filter object supports multiple different types of filters. These include: selector, and, or, not and regex. Filters can be nested to make complex queries.

Selector¶

Type selector should be used for simple equality comparisons.

Below example would return rows where category matches value widget.

{
    "filters": {
        "type": "selector",
        "dimension": "category",
        "value": "widget"
    }
}

And¶

Type and should be used for checking multiple conditions.

Below example would return rows where dimension category value matches widget and dimension action value matches created.

{
    "filters": {
        "type": "and",
        "fields": [
            {"type": "selector", "dimension": "category", "value": "widget"},
            {"type": "selector", "dimension": "action", "value": "created"}
        ]
    }
}

Or¶

Type or should be used when at least one of several conditions needs to match.

Below example would return rows where dimension category value matches widget or form.

{
    "filters": {
        "type": "or",
        "fields": [
            {"type": "selector", "dimension": "category", "value": "widget"},
            {"type": "selector", "dimension": "category", "value": "form"}
        ]
    }
}

Regex¶

Type regex can be used to execute regular expression queries.

Below example would return rows where dimension action value matches one of updated, deleted, or created.

{
    "filters": {
        "type": "regex",
        "dimension": "action",
        "pattern": "updated|deleted|created"
    }
}

Not¶

Type not can be used to execute not queries. This returns results which do not match with the filter.

Below example would return rows where dimension action value does not match created.

{
    "filters": {
        "type": "not",
        "field": {
            "type": "selector",
            "dimension": "action",
            "value": "created"
        }
    }
}

Example of nesting filters¶

Below query would match events whose dimension category is either video or movie, dimension action is either play or watch and dimension label contains either Back to the Future or Back to the Future Part II.

{
    "filters": {
        "type": "and",
        "fields": [
            {
                "type": "or",
                "fields": [
                    {"type": "selector", "dimension": "category", "value": "movie"},
                    {"type": "selector", "dimension": "category", "value": "video"}
                ]
            },
            {
                "type": "or",
                "fields": [
                    {"type": "selector", "dimension": "action", "value": "play"},
                    {"type": "selector", "dimension": "action", "value": "watch"}
                ]
            },
            {"type": "regex", "dimension": "label", "pattern": "Back to the Future( Part II|)" }
        ]
    }
}
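Nested filters are easier to write and review when they are composed from small helper functions. The sketch below builds the same filter as the example above and adds a local evaluator that mimics the described semantics, which can be handy for testing filter logic against sample events. All helper names are hypothetical. The evaluator assumes that regex patterns match anywhere in the value (a search rather than a full-string match); this section does not confirm how the service applies patterns.

```python
import re


def selector(dimension, value):
    return {"type": "selector", "dimension": dimension, "value": value}


def regex(dimension, pattern):
    return {"type": "regex", "dimension": dimension, "pattern": pattern}


def all_of(*fields):
    return {"type": "and", "fields": list(fields)}


def any_of(*fields):
    return {"type": "or", "fields": list(fields)}


def negate(field):
    return {"type": "not", "field": field}


def matches(flt, event):
    """Evaluate a filter object against one event dict (local approximation)."""
    kind = flt["type"]
    if kind == "selector":
        return event.get(flt["dimension"]) == flt["value"]
    if kind == "regex":
        # Assumption: the pattern may match anywhere in the value.
        return re.search(flt["pattern"], event.get(flt["dimension"]) or "") is not None
    if kind == "and":
        return all(matches(field, event) for field in flt["fields"])
    if kind == "or":
        return any(matches(field, event) for field in flt["fields"])
    if kind == "not":
        return not matches(flt["field"], event)
    raise ValueError(f"unknown filter type: {kind}")


nested = all_of(
    any_of(selector("category", "movie"), selector("category", "video")),
    any_of(selector("action", "play"), selector("action", "watch")),
    regex("label", "Back to the Future( Part II|)"),
)
event = {"category": "video", "action": "play", "label": "Back to the Future Part II"}
print(matches(nested, event))
```

The resulting `nested` dict is what goes under the "filters" key of the request payload.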

Group by¶

Group by queries can be used when you want to answer questions like "How many times each widget was shown to visitors?" The query could look something like this.

{
    "sources": ["untrusted"],
    "interval": {
        "start": "2019-01-07T00:00:00.000Z",
        "end": "2019-01-14T00:00:00.000Z"
    },
    "granularity": "day",
    "group_by": ["label"],
    "vendor": "your.company.xyz",
    "aggregations": ["sum"],
    "filters": {
        "type": "selector",
        "dimension": "category",
        "value": "widget"
    }
}

Notice the group by label which gives sums for each different label.
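With granularity day, the response contains one row per label per day. If you also want a single total per label, you can add the rows up on the client. The sketch below does that; the function name `totals_by` and the sample response are illustrative, and the response is assumed to have the columns timestamp, label and sum_value, by analogy with the first example on this page.

```python
from collections import defaultdict


def totals_by(response, dimension, metric):
    """Sum `metric` over all time buckets, separately for each `dimension` value."""
    names = [field["name"] for field in response["fields"]]
    dimension_index = names.index(dimension)
    metric_index = names.index(metric)
    totals = defaultdict(int)
    for row in response["data"]:
        totals[row[dimension_index]] += row[metric_index]
    return dict(totals)


# Assumed response shape for the group-by query above (shortened sample).
sample = {
    "fields": [
        {"name": "timestamp", "type": "dimension"},
        {"name": "label", "type": "dimension"},
        {"name": "sum_value", "type": "metric"},
    ],
    "data": [
        ["2019-01-07T00:00:00.000Z", "xyz-widget-1", 212],
        ["2019-01-07T00:00:00.000Z", "xyz-widget-2", 2],
        ["2019-01-08T00:00:00.000Z", "xyz-widget-2", 94],
        ["2019-01-09T00:00:00.000Z", "xyz-widget-1", 128],
    ],
}

print(totals_by(sample, "label", "sum_value"))
```

Requesting granularity all instead of day would let the service return one row per label directly; the client-side sum is useful when you need both the daily breakdown and the totals from one request.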

Using custom dimensions¶

Custom dimensions are best used when you need to group by information that does not fit into the named fields.

Let's say you want to answer the question "How many users made it to each step on widget X?". That means you need to record which step the visitor was on when the event was generated, and you also want to group by that information. You should not add the information to the properties list, because grouping by a list can lead to confusing results. Instead, use custom dimensions, which can be easily grouped by: add the visitor's step to dim1 and then query with group by dim1.
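A query for the step example could be assembled as in the sketch below. It rests on several assumptions that are not stated in this section: that the events use category widget, that the widget is identified by the label dimension, that the step is stored in dim1, and that unique visitors (visitor_id_uniq) is the measure you want. The function name is hypothetical.

```python
def step_funnel_query(vendor, widget_label, start, end):
    """Build a query payload counting unique visitors per step stored in dim1."""
    return {
        "sources": ["untrusted"],
        "interval": {"start": start, "end": end, "time_zone": "UTC"},
        "granularity": "all",
        "group_by": ["dim1"],  # one result row per step value
        "vendor": vendor,
        "aggregations": ["visitor_id_uniq"],
        "filters": {
            "type": "and",
            "fields": [
                {"type": "selector", "dimension": "category", "value": "widget"},
                {"type": "selector", "dimension": "label", "value": widget_label},
            ],
        },
    }


query = step_funnel_query(
    "your.company.xyz", "X", "2019-01-07T00:00:00.000Z", "2019-01-14T00:00:00.000Z"
)
print(query["group_by"], query["aggregations"])
```

Keep in mind that unique counts are approximate, so the per-step numbers should be read as estimates.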

Granularity choices¶

Granularity specifies how query results should be time bucketed.

Supported granularity choices are all, minute, fifteen_minute, thirty_minute, hour, day, week, month, or year.

Let's assume that you made a query for the interval 2019-01-01T00:00:00.000Z - 2019-01-11T00:00:00.000Z, with time_zone: "UTC", and see how the results would be time bucketed with different granularities.

all¶

If granularity all is used, the whole interval is treated as a single bucket and you get one row of data back.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            123
        ]
    ]
}

minute¶

If granularity minute is used, the whole interval is divided into one-minute buckets.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            123
        ],
        [
            "2019-01-01T00:01:00.000Z",
            456
        ],[
            "2019-01-01T00:02:00.000Z",
            922
        ]
        ...
        [
            "2019-01-10T23:59:00.000Z",
            922
        ]
    ]
}

fifteen_minute¶

If granularity fifteen_minute is used, the whole interval is divided into fifteen-minute buckets.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            123
        ],
        [
            "2019-01-01T00:15:00.000Z",
            341
        ],[
            "2019-01-01T00:30:00.000Z",
            922
        ]
        ...
        [
            "2019-01-10T23:45:00.000Z",
            922
        ]
    ]
}

thirty_minute¶

If granularity thirty_minute is used, the whole interval is divided into 30-minute buckets.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            23
        ],
        [
            "2019-01-01T00:30:00.000Z",
            55
        ],[
            "2019-01-01T01:00:00.000Z",
            12
        ]
        ...
        [
            "2019-01-10T23:30:00.000Z",
            922
        ]
    ]
}

hour¶

If granularity hour is used, the whole interval is divided into one-hour buckets.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            23
        ],
        [
            "2019-01-01T01:00:00.000Z",
            55
        ],[
            "2019-01-01T02:00:00.000Z",
            12
        ]
        ...
        [
            "2019-01-10T23:00:00.000Z",
            12
        ]
    ]
}

day¶

If granularity day is used, the whole interval is divided into one-day buckets.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            23
        ],
        [
            "2019-01-02T00:00:00.000Z",
            55
        ],[
            "2019-01-03T00:00:00.000Z",
            12
        ]
        ...
        [
            "2019-01-10T00:00:00.000Z",
            922
        ]
    ]
}

week¶

If granularity week is used then whole interval is divided in weekly buckets. With our interval there would be only one week.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            23
        ]
    ]
}

month¶

If granularity month is used then whole interval is divided in monthly buckets. With our interval there would be only one month.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            23
        ]
    ]
}

year¶

If granularity year is used then whole interval is divided in yearly buckets. With our interval there would be only one year.

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            23
        ]
    ]
}
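For the fixed-size granularities, the bucket timestamps in the examples above can be reproduced by stepping through the half-open interval. The sketch below does this for all, minute, fifteen_minute, thirty_minute, hour and day. It assumes UTC and a start time that already lies on a bucket boundary, and it leaves out week, month and year because those follow calendar boundaries. The names `STEP` and `bucket_starts` are illustrative.

```python
from datetime import datetime, timedelta

# Bucket sizes for the fixed-length granularities only.
STEP = {
    "minute": timedelta(minutes=1),
    "fifteen_minute": timedelta(minutes=15),
    "thirty_minute": timedelta(minutes=30),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def bucket_starts(start, end, granularity):
    """List bucket start times for the half-open interval [start, end)."""
    if granularity == "all":
        return [start]
    step = STEP[granularity]
    buckets = []
    current = start
    while current < end:
        buckets.append(current)
        current += step
    return buckets


start = datetime(2019, 1, 1)
end = datetime(2019, 1, 11)
for name in ("all", "hour", "day"):
    buckets = bucket_starts(start, end, name)
    print(name, len(buckets), buckets[-1].isoformat())
```

Note that this lists every possible bucket in the interval; an actual response may contain fewer rows when buckets without data are omitted.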

Query response¶

Response schema of successful report request.

Name	Type	Description
fields	`Array<Object>`	Array of field definition objects. See "Fields array" below.
data	`Array<Array>`	Array of aggregated data points. See "Data array" below.

Fields array¶

Fields array defines column names for data points in data array of response.

Name	Type	Description
name	`String`	Name of the column.
type	`String`	Type of column, either `dimension` or `metric`.

Example of fields array in response:

{
    "fields": [
        {
            "name": "timestamp",
            "type": "dimension"
        },
        {
            "name": "category",
            "type": "dimension"
        },
        {
            "name": "sum_value",
            "type": "metric"
        }
    ]
}

Data array¶

The response contains a data attribute, which is an array of the actual data points. For example, if you requested the sum and max aggregations (using aggregations), grouped by category (using group_by), with granularity day, the result would look like this:

{
    "data": [
        [
            "2019-01-01T00:00:00.000Z",
            "widget",
            212,
            45
        ],
        [
            "2019-01-01T00:00:00.000Z",
            "form",
            22,
            8
        ],
        [
            "2019-01-02T00:00:00.000Z",
            "widget",
            432,
            67
        ],
        [
            "2019-01-02T00:00:00.000Z",
            "form",
            332,
            33
        ],
        [
            "2019-01-03T00:00:00.000Z",
            "form",
            128,
            19
        ]
    ]
}

Buckets without data that lie between buckets with data are returned with zero values. Empty buckets before the first or after the last bucket with data are omitted. For example, suppose the following data is pushed to the system.

[
    {
        "timestamp": "2019-02-10T00:00:00.000Z",
        "category": "widget",
        "value": 2
    },
    {
        "timestamp": "2019-02-10T05:00:00.000Z",
        "category": "widget",
        "value": 4
    },
    {
        "timestamp": "2019-02-10T06:00:00.000Z",
        "category": "widget",
        "value": 8
    }
]

And the query has the following interval:

{
   "start": "2019-02-10T00:00:00.000Z",
   "end": "2019-02-11T00:00:00.000Z"
}

If the query filter selects only data where category equals widget, aggregations is set to sum, and granularity is set to hour, the result would look like the following:

{
    "data": [
        [
            "2019-02-10T00:00:00.000Z",
            2
        ],
        [
            "2019-02-10T01:00:00.000Z",
            0
        ],
        [
            "2019-02-10T02:00:00.000Z",
            0
        ],
        [
            "2019-02-10T03:00:00.000Z",
            0
        ],
        [
            "2019-02-10T04:00:00.000Z",
            0
        ],
        [
            "2019-02-10T05:00:00.000Z",
            4
        ],
        [
            "2019-02-10T06:00:00.000Z",
            8
        ]
    ]
}

Notice that there aren't any buckets for hours from 07 to 23 as there is no data and they aren't between data points.
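If a chart or table needs a row for every hour of the interval, the omitted leading and trailing buckets can be filled in on the client. The sketch below pads an hourly result with zeros. It handles only the simple case shown above (rows of the form [timestamp, value], UTC timestamps with a trailing Z), and the helper names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_ts(text):
    """Parse a UTC timestamp such as 2019-02-10T05:00:00.000Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")


def format_ts(moment):
    """Format a datetime back into the millisecond-precision form used above."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{moment.microsecond // 1000:03d}Z"


def pad_hourly(rows, start, end):
    """Return one [timestamp, value] row per hour in [start, end), using 0 for gaps."""
    known = {row[0]: row[1] for row in rows}
    padded = []
    current = parse_ts(start)
    stop = parse_ts(end)
    while current < stop:
        key = format_ts(current)
        padded.append([key, known.get(key, 0)])
        current += timedelta(hours=1)
    return padded


rows = [
    ["2019-02-10T00:00:00.000Z", 2],
    ["2019-02-10T01:00:00.000Z", 0],
    ["2019-02-10T02:00:00.000Z", 0],
    ["2019-02-10T03:00:00.000Z", 0],
    ["2019-02-10T04:00:00.000Z", 0],
    ["2019-02-10T05:00:00.000Z", 4],
    ["2019-02-10T06:00:00.000Z", 8],
]
padded = pad_hourly(rows, "2019-02-10T00:00:00.000Z", "2019-02-11T00:00:00.000Z")
print(len(padded), padded[-1])
```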

Each item in the data array holds values that map to the columns defined in the fields array.

A single item of the data array would look like this:

[
    "2019-01-03T00:00:00.000Z",  // timestamp
    "form",  // Value from category dimension
    128,  // Sum aggregation for group
    19  // Max aggregation for group
]

The matching fields array would look like this:

{
    "fields": [
        {
            "name": "timestamp",
            "type": "dimension"
        },
        {
            "name": "category",
            "type": "dimension"
        },
        {
            "name": "sum_value",
            "type": "metric"
        },
        {
            "name": "max_value",
            "type": "metric"
        }
    ]
}

Response codes and meanings¶

HTTP 200 OK¶

Response code 200 is always returned if the request was successful.

HTTP 400 Bad Request¶

Response code 400 will be used if the request was unsuccessful due to missing fields or data validation issues. You should retry with fixed payload.

Response body will contain validation errors.

Example:

{"granularity": ["This field is required."]}

HTTP 401 Unauthorized¶

Response code 401 will be returned if you did not provide Authorization header or if it was incorrect.
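The status codes above can be handled in one place on the client. The sketch below maps a status code and response body to either parsed data or an exception. `QueryError` and `interpret_response` are names invented for this example, and the 400 body is assumed to have the field-to-messages shape shown in the example above.

```python
import json


class QueryError(Exception):
    """Raised when a query request did not succeed."""


def interpret_response(status: int, body: str):
    """Return the parsed report for HTTP 200, otherwise raise QueryError."""
    if status == 200:
        return json.loads(body)
    if status == 400:
        # Validation errors arrive as {"field": ["message", ...]}.
        errors = json.loads(body)
        details = "; ".join(
            f"{field}: {' '.join(messages)}" for field, messages in errors.items()
        )
        raise QueryError(f"invalid payload ({details})")
    if status == 401:
        raise QueryError("missing or incorrect Authorization header")
    raise QueryError(f"unexpected status {status}")


try:
    interpret_response(400, '{"granularity": ["This field is required."]}')
except QueryError as error:
    print(error)
```

A 400 response calls for fixing the payload before retrying, whereas a 401 response points to the Authorization header rather than the query itself.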

CORS support¶

Note that CORS (https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) is supported, so you can make requests from any domain.

List of dimension columns¶

Name	Type	Description
event_version	`Int`	Event version
vendor	`String`	Vendor of event
source	`String`	Source of event, either "trusted" or "untrusted"
category	`String`	Event category
label	`String`	Event label
action	`String`	Event action
organization_id	`String`	Owner organization UUID
properties	`Array<String>`	List of string properties describing event
dim1	`String`	Custom dimension
dim2	`String`	Custom dimension
dim3	`String`	Custom dimension
dim4	`String`	Custom dimension
dim5	`String`	Custom dimension
visitor_id	`String`	Giosg visitor identifier
session_id	`String`	Giosg visitor session identifier
user_id	`String`	Giosg user id
browser_name	`String`	Visitor browser name
browser_version	`String`	Visitor browser version number
device_screen_height	`String`	Visitor screen height
device_screen_width	`String`	Visitor screen width
device_type	`String`	Visitor device type
geo_city	`String`	Visitor city
geo_country	`String`	Visitor country
ip_organization	`String`	Visitor company or name of the ISP
os_name	`String`	Visitor operating system
os_version	`String`	Visitor operating system version