Sisu provides weighted metric types to allow you to understand a column’s sum or average when the occurrence of certain row “events” is unequal or unbalanced, as opposed to simply calculating the aggregate. This allows you to “weight” certain rows of your data table when some rows in the data are more important or significant than others.
Related Articles:
Weighted Average Metrics
Weighting provides a way to incorporate logic and reality beyond the raw data into Sisu to inform the Metric’s calculation.
A common use of the "Weighted Average" or "Weighted Sum" metric types is managing a aggregated or rolled up dataset. These data sets have multiple occurrences of an event represented as one row. Defining the Metric (average or sum) based on this aggregated dataset would not provide a realistic picture, so using a weighted Metric will capture the context lost when the dataset was “rolled up”. Each value in the weight column corresponds to the number of observations of the given row in the Metric column.
Here are some typical examples:
- In marketing, a sample Metric might be cost per lead weighted by number of leads
- When investigating financial fraud, a sample Metric might be the fraudulent transaction amount per cohort weighted by number of transactions per cohort
- For customer success, a sample Metric might be average hours spent on customer per product/service weighted by number of customers of that product/service
Weighted Examples
Let’s look at an example of these Metrics using Weighted Average and Sum for a rollup of Marketing data, using the sample data below.
Dataset aggregated by lead |
|||
Campaign ID |
Cost per lead |
Number of leads |
Source |
Campaign 0001 |
20 |
20 |
Facebook Ads |
Campaign 0002 |
5 |
30 |
Facebook Ads |
Campaign 0003 |
2.5 |
10 |
Google Ads |
Campaign 0004 |
8 |
50 |
Google Ads |
Campaign 0005 |
5 |
10 |
Google Ads |
Weighted Sum Example
To identify the subgroups increasing or decreasing the average cost per lead, we would calculate the metric as follows
Weighted average of cost per lead =
SUM(Cost per Lead * Number of Leads) / SUM(Number of Leads)
Therefore,
Average Cost Per Lead = (20*20 + 5*30 + 2.5*10 + 8*50 + 5*10) / (20+30+10+50+10) = 8.54
An example Fact that Sisu might return with the Analysis is a negative impact on weighted average:
Source = Facebook Ads → Metric = (20*20 + 5*30) / (20+30) = 11
Weighted Sum Example
In a similar fashion, weighted sum could be used to identify the total cost of leads.
Weighted sum of Cost per Lead =
SUM(Cost per Lead * Number of Leads) for each lead
Total Cost of Leads = (20*20 + 5*30 + 2.5*10 + 8*50 + 5*10) = $1,025
Configuring a Weighted Average Metric
To configure a weighted average Metric in Sisu:
- Create a new Metric, choose your data, and select “Weighted average” or “Weighted sum” as the metric calculation type.
- Define the metric dimension and weight dimension. See the equation below the menus to view how the metric is calculated. Sisu automatically performs the weighting calculation on the metric dimension.
The weight dimension should be the dimension with the unit of observations (e.g. transaction count, number of events, etc.) for the results to be statistically significant.
Using other types of dimensions as weights (e.g. transaction amount on a dataset of financial transactions) might yield results that over-focus on weights at the expense of statistical significance (e.g. focus on high-value transactions on a financial dataset). |
Interpreting a Weighted Metric's Results
The metric value represents the weighted average or weighted sum of the metric dimension weighted by the selected dimension. Another summary statistic indicates the total weight of the metric (equivalent to the row count for a non-weighted metric, i.e. non-aggregated dataset).
Refer to Understanding Impact for more details. |
For example, the fact “ORDER_STORE_CITY = Los Angeles” in the example below analyzing a weighted average Order Value metric can be described as:
Where Store City is Los Angeles,
The weighted average of Order Value weighted by Number of Items is
51 in the first time period and 39.6 in the second time period
Given the relative weight of 9.7%, Store City is Los Angeles contributed to a 1.1% decrease in the overall metric value.
The same analysis results for a weighted sum (total) Order Value metric defined on the same dimensions are shown below.