History | Log In     View a printable version of the current page.  
Issue Details (XML | Word | Printable)

Key: MONDRIAN-552
Type: Bug Bug
Status: Open Open
Priority: Medium Medium
Assignee: Julian Hyde
Reporter: Nick Goodman
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
Pentaho Analysis {Mondrian}

Avg() on CalculatedMember spikes CPUs and takes a very long time

Created: 28/Apr/09 01:58 PM   Updated: 17/Jun/09 01:10 PM
Component/s: Caching
Affects Version/s: 3.0.5
Fix Version/s: Future Release

Time Tracking:
Not Specified


 Description  « Hide
It appears as if something is off with Mondrians evaluation of calculated members based on other calculated members that can produce extraordinarily long query times on relatively small datasets in memory.

Found in customer site using 3.0.4 (Pentaho 3.0 RC) and reproduced on Mondrian 3.0.2 embedded / Foodmart.

 - Query takes 15 seconds to execute
 - CPU spikes to 100%
 - Cache is filled in - I see no additional SQL being executed. All execution time is Mondrian only. The first time takes longer (as cache fills in, but it takes 15 seconds without any SQL executing).
 - It's not a heap issue - Heap has plenty of room so this isn't a GC CPU spike kind of thing

The issue is doing a calculated member (Avg) on top of a running total calculated member.
 - Avg on top of a standard measure works fine (< 1 sec)
 - Running total is fine ( < 1 sec)

Put them together, and you get a big CPU spike, no SQL, and a really really long MDX execution time.


MDX on Foodmart:
with member [Measures].[Profit Change] as '([Measures].[Profit], [Time].CurrentMember) - ([Measures].[Profit], [Time].PrevMember)'
member [Measures].[Running Total] as '
([Measures].[Profit], [Time].CurrentMember) + ([Measures].[Running Total], [Time].PrevMember)
'
member [Measures].[Average Daily Running Total] as '
Avg(Descendants([Time].CurrentMember, [Time.Weekly].[Day]), [Measures].[Running Total])

'
 select {[Measures].[Unit Sales], [Measures].[Store Cost], [Measures].[Store Sales], [Measures].[Profit Change], [Measures].[Running Total], [Measures].[Average Daily Running Total]} ON COLUMNS,
  {[Time.Weekly].[Week].Members} ON ROWS
from [Sales]

 All   Comments   Work Log   Change History   Transitions      Sort Order: Ascending order - Click to sort in descending order
Julian Hyde - 28/Apr/09 04:12 PM
I have created a test case for this a PerformanceTest.testBugMondrian552 (not checked in yet) and it reproduces the problem. A stack dump reveals a lot of recursion:

"main" prio=6 tid=0x009f9400 nid=0x13e4 runnable [0x00380000..0x0039fe28]
   java.lang.Thread.State: RUNNABLE
at mondrian.olap.HierarchyBase.getDimension(HierarchyBase.java:124)
at mondrian.olap.LevelBase.getDimension(LevelBase.java:84)
at mondrian.olap.MemberBase.getDimension(MemberBase.java:126)
at mondrian.rolap.RolapEvaluator.checkRecursion(RolapEvaluator.java:521)
at mondrian.rolap.RolapEvaluator.setExpanding(RolapEvaluator.java:461)
at mondrian.rolap.RolapEvaluator.evaluateCurrent(RolapEvaluator.java:446)
at mondrian.calc.impl.MemberValueCalc.evaluate(MemberValueCalc.java:72)
at mondrian.calc.impl.GenericCalc.evaluateDouble(GenericCalc.java:55)
at mondrian.olap.fun.BuiltinFunTable$51$1.evaluateDouble(BuiltinFunTable.java:1407)
at mondrian.calc.impl.AbstractDoubleCalc.evaluate(AbstractDoubleCalc.java:42)
at mondrian.rolap.RolapEvaluator.evaluateCurrent(RolapEvaluator.java:449)
at mondrian.calc.impl.MemberValueCalc.evaluate(MemberValueCalc.java:72)
at mondrian.calc.impl.GenericCalc.evaluateDouble(GenericCalc.java:55)
at mondrian.olap.fun.BuiltinFunTable$51$1.evaluateDouble(BuiltinFunTable.java:1407)
...

Looks like mondrian is evaluating [Running Total] recursively. Excessive recursion is a concern because it can blow the stack (in this case, the stack depth will be the number of weeks - about 100). Worse, in this case, is that each cell is used several times, and that gives an O(n^2) performance.

Mondrian supports expression caching, where each expression is computed at most once. This is done automatically by the Rank function, for example, but it is not used for general expressions because it increases the cost of evaluating an expression by 2x or 3x.

Workaround is to use the Cache function:

with member [Measures].[Profit Change] as '([Measures].[Profit], [Time].CurrentMember) - ([Measures].[Profit], [Time].PrevMember)'
member [Measures].[Running Total] as '
([Measures].[Profit], [Time].CurrentMember) + ([Measures].[Running Total], [Time].PrevMember)
' , CACHE = 1
member [Measures].[Average Daily Running Total] as '
Avg(Descendants([Time].CurrentMember, [Time.Weekly].[Day]), [Measures].[Running Total])

'
 select {[Measures].[Unit Sales], [Measures].[Store Cost], [Measures].[Store Sales], [Measures].[Profit Change], [Measures].[Running Total], [Measures].[Average Daily Running Total]} ON COLUMNS,
  {[Time.Weekly].[Week].Members} ON ROWS
from [Sales]

This takes the query from to 24s to 2.8s on my machine.

In future I would like to add a member property CACHE. Values:
 * 0 - do not cache
 * 1 - always cache
 * null (or not specified) - mondrian should decide whether to cache, based on (a) it being used in another calculated member or set, (b) having a non-trivial formula.

In this case, you could just add 'CACHE = 1' to the definition of [Running Total]. It would have similar effect to applying the Cache function. But automatic expression caching would help prevent these problems from arising.

We also need to deal with excessive stack depth. No idea how to deal with that right now.

Julian Hyde - 28/Apr/09 04:13 PM
Reducing severity since we have a workaround.

Nick Goodman - 11/May/09 08:01 PM
Julian - I tried this workaround at the customer and it didn't work - well... that's what we thought until I realized that you had the wrong buffer for the copy and paste.

What I think you wanted to write:
--- BEGIN ---
Workaround is to use the Cache function:

with member [Measures].[Profit Change] as 'Cache(([Measures].[Profit], [Time].CurrentMember) - ([Measures].[Profit], [Time].PrevMember))'
member [Measures].[Running Total] as 'Cache(
([Measures].[Profit], [Time].CurrentMember) + ([Measures].[Running Total], [Time].PrevMember)
)' ,
member [Measures].[Average Daily Running Total] as '
Cache(
Avg(Descendants([Time].CurrentMember, [Time.Weekly].[Day]), [Measures].[Running Total])
)
'
 select {[Measures].[Unit Sales], [Measures].[Store Cost], [Measures].[Store Sales], [Measures].[Profit Change], [Measures].[Running Total], [Measures].[Average Daily Running Total]} ON COLUMNS,
  {[Time.Weekly].[Week].Members} ON ROWS
from [Sales]
--- END ---

This actually does improve the performance of the query. I think your buffer was the "proposed" new cache member property. Can you confirm that this is the case? Workaround is to actually use the Cache() function not the CACHE=1 in the original buffer paste?

Julian Hyde - 11/May/09 08:36 PM
Oops, yes, you guessed right. Wrap the expression in the Cache() function.

I wrote about this issue in my blog [ http://julianhyde.blogspot.com/2009/04/how-mondrian-evaluates-expressions.html ] but only did slightly better - I wrote the 'before' query but didn't bother to write the 'after' query including the Cache function.