Roll Ups
Roll up functions allow you to summarise your data by aggregating it. They are broadly split into two categories.
For tools which handle hierarchical data well, like OrgVue, a Roll up applies an aggregation (e.g. sum or average) to all of a node's descendants, such as the people who report into a manager. This gives you a holistic picture of a metric, e.g. total annual compensation.
For tools which don't handle hierarchical data so well, Roll ups are normally aggregations, reducing the number of rows and applying some sum/average calculation. Typically this is achieved through some version of a pivot table.
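The hierarchical kind of Roll up can be illustrated with a minimal Python sketch. The `Node` class, its field names, and the sample figures are hypothetical, chosen only for illustration. The sketch also assumes the Roll up covers descendants only and excludes the node's own value; individual tools may treat this differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical org-chart node; not any tool's real data model."""
    name: str
    compensation: float
    reports: list = field(default_factory=list)  # direct reports (child nodes)


def roll_up_sum(node: Node) -> float:
    """Sum compensation over all descendants of `node` (the node itself is excluded)."""
    total = 0.0
    for child in node.reports:
        # Add the child's own value plus everything beneath the child.
        total += child.compensation + roll_up_sum(child)
    return total


analyst = Node("Analyst", 60.0)
vp = Node("VP", 120.0, [analyst])
manager = Node("Manager", 90.0)
ceo = Node("CEO", 200.0, [vp, manager])

ceo_total = roll_up_sum(ceo)  # VP + Analyst + Manager
vp_total = roll_up_sum(vp)    # Analyst only
```

Swapping the addition for a different combining step (for example, collecting values into a list and averaging them) would give other aggregators.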
Excel does not handle hierarchical data well, but you can use a Pivot Table to show various levels of detail and aggregations in a tabular format. Select a range of cells > go to the Insert tab > Pivot Table and follow the instructions in the dialogue.
Whilst Tableau has no inbuilt hierarchy rollup function, other options are available.
Option 1: Go to Analysis tab > Totals > Show Grand Totals / add Subtotals
Option 2: Use 'level of detail' (LOD) calculations to show a different level of aggregation to what is displayed in the view. For more information on LOD calcs, see Tableau's helpful blog.
In Alteryx, use the Summarize tool (in the Transform palette) to group by a field and then sum, average, or count a measure.
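For comparison, the same "group by a field, then sum/average/count a measure" pattern can be sketched in pandas. The column names and values below are hypothetical, and this is only an analogy for what a summarising step does, not a description of any tool's internals.

```python
import pandas as pd

# Hypothetical sample data: one row per employee.
df = pd.DataFrame({
    "Department": ["Sales", "Sales", "HR"],
    "Salary": [50000, 70000, 60000],
})

# Group by the field, then apply several aggregations to the measure.
summary = df.groupby("Department")["Salary"].agg(["sum", "mean", "count"])
print(summary)
```

The result has one row per department and one column per aggregation, so the row count drops from the number of employees to the number of groups.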
node.rollUp('‹aggregator›', '‹measure›') OR node.rollUp(‹aggregator›, n=>n.‹measure›)
GROUP BY ROLLUP(‹dimension›) OR GROUP BY ‹dimension› WITH ROLLUP
NB. The ROLLUP function allows SQL Server to create subtotals and grand totals while it groups data using the GROUP BY clause.
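To show the shape of result that ROLLUP produces (detail rows, subtotals per group, and a grand total), here is a rough pandas emulation. It is a sketch under assumptions: the data is hypothetical, and the string "ALL" stands in for the NULL that SQL uses to mark a rolled-up level.

```python
import pandas as pd

# Hypothetical sample data.
sales = pd.DataFrame({
    "State": ["NY", "NY", "CA", "CA"],
    "City": ["Albany", "Buffalo", "Fresno", "Oakland"],
    "Sales": [10, 20, 30, 40],
})

# Detail level: one row per (State, City).
detail = sales.groupby(["State", "City"], as_index=False)["Sales"].sum()

# Subtotal level: one row per State, with City rolled up.
subtotals = sales.groupby("State", as_index=False)["Sales"].sum()
subtotals["City"] = "ALL"

# Grand total: both dimensions rolled up.
grand_total = pd.DataFrame(
    {"State": ["ALL"], "City": ["ALL"], "Sales": [sales["Sales"].sum()]}
)

rollup = pd.concat([detail, subtotals, grand_total], ignore_index=True)
print(rollup)
```

The combined frame has four detail rows, two state subtotals, and one grand total row.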
Python does not support a native Rollup function, but you can aggregate using a pivot table operation:
Which outputs:
Thanks to Wes McKinney's answer on Stack Overflow for the example.
NB. This requires the pandas library (import pandas as pd). Once it is installed, the above example assumes you have done the following:
Imported the numpy library as 'np': import numpy as np
Defined a data frame as 'df', e.g.:
where 'State', 'City', 'SalesMTD', 'SalesToday', and 'SalesYTD' are all columns.
NB. Including r before the start of the file path makes Python treat it as a raw string, avoiding any issues caused by \ characters in the path.
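As a self-contained illustration of a pivot table roll-up in pandas, the sketch below builds a small data frame in memory rather than reading a file. The values are hypothetical, and this is not the original example from the Stack Overflow answer; it only uses the column names mentioned above.

```python
import pandas as pd

# Hypothetical sample data using the column names described above.
df = pd.DataFrame({
    "State": ["NY", "NY", "CA", "CA"],
    "City": ["Albany", "Buffalo", "Fresno", "Oakland"],
    "SalesMTD": [100, 200, 300, 400],
    "SalesToday": [10, 20, 30, 40],
    "SalesYTD": [1000, 2000, 3000, 4000],
})

# Sum each sales measure per State; margins=True appends an "All" row
# containing the grand totals.
table = pd.pivot_table(
    df,
    values=["SalesToday", "SalesMTD", "SalesYTD"],
    index="State",
    aggfunc="sum",
    margins=True,
)
print(table)
```

Passing a list such as ["State", "City"] as the index would keep city-level detail in the rows instead of collapsing it.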