Hi Eliot - Could you please elaborate more on this use case? Specifically, when we encounter bad values for the inputs, what's the expectation?
- break out and don't perform the calculation - similar to If BadValue('Input') then NoOutput() else ...
- substitute a user-defined value for the input
- use the last good value
- other?
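To make the options concrete, here is a small Python sketch of the three strategies. The function and sentinel names are hypothetical illustrations; PI's native form is the `If BadValue('Input') then NoOutput() else ...` expression shown above.

```python
# Sketch of the bad-value handling options above (hypothetical names,
# not the PI expression language itself).
SKIP = object()  # sentinel meaning "don't write an output"


def handle_input(value, is_bad, strategy, substitute=None, last_good=None):
    """Return the value to use for the calculation, or SKIP."""
    if not is_bad:
        return value
    if strategy == "skip":          # like: If BadValue('Input') then NoOutput()
        return SKIP
    if strategy == "substitute":    # user-defined replacement value
        return substitute
    if strategy == "last_good":     # carry forward the last good value
        return last_good
    raise ValueError(f"unknown strategy: {strategy}")
```

Each strategy trades correctness against availability: skipping produces no possibly-wrong output, while substitution and last-good keep downstream calculations running.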
In response to Zev Arnold, "Suppose that, for instance: * I introduc..."
Hi Zev - We are currently working on supporting an automatic recalculation feature for Analytics. When available, this should address two of the three scenarios you described above: (1) updated input/source data and (2) late-arriving or out-of-order data. The PI Analysis service will monitor such updates to analysis inputs and trigger recalculations as required.
The third use case you described (introducing a new sub-element) is a bit tricky and is different from the other two: (1) and (2) will in most cases only require recalculating a limited range of data (depending on the range of updated data and the calculation logic), while (3) requires recalculating the analysis for the entire duration it has been running. This can be really expensive, and may not even be desirable in all cases. Currently, we are not planning to address automatic recalculation for configuration changes to analyses - i.e. updates to calculation logic or to the list of roll-up inputs - but we would love to hear feedback from the community.
Unfortunately, due to unforeseen circumstances, this has not been completed. It needs to be re-prioritized for a later release.
In response to Roger Palmen, "In cases where there is specific busines..."
@Arie - I agree with your use case - when the analysis triggers again for the same timestamp, potentially with new input data, it should evaluate again and replace the previously written result. We would handle this when we implement support for auto-recalculation.
@Roger - That's a good point. Just to add to it, I think there is a difference between the following two use cases:
1. Evaluate analyses as soon as input data is received. When more/new input data becomes available (say, with a delay), automatically recalculate, replacing previously written results. In this approach, consumers of calculation results have to accept that in some cases results may be temporarily incorrect, but will eventually be correct once all input data is available. The advantage is that calculation results are available immediately (using the best information available for the inputs at the time), which could be useful (or even required) in some cases.
2. Use some sort of validation logic (like you described) to only write outputs when all input data is available. In this case, consumers of calculation results can be confident that the results are correct. However, this approach requires handling cases where validation may never pass or could take a really long time (e.g. one of the inputs never arrives, or arrives a day or a week late).

I can see both being useful in different situations. The Analytics implementation allows you to do (1) [when auto-recalculation is supported]. While (2) may be possible using the approach you described, it can get a bit tedious since it's not that well supported out of the box.
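The two patterns can be sketched in Python. This is a hypothetical simplification for illustration, not PI Analysis internals; the function names are invented.

```python
# Hypothetical sketch of the two write patterns described above.


def write_eager(results, timestamp, value):
    """Pattern (1): write immediately, replace on recalculation.
    Writing again for the same timestamp overwrites the earlier result,
    so consumers eventually see the corrected value."""
    results[timestamp] = value


def write_validated(results, timestamp, value, inputs, required):
    """Pattern (2): only write once every required input has arrived.
    Returns True if the result was written."""
    if not all(name in inputs for name in required):
        return False  # caller must handle inputs that never arrive
    results[timestamp] = value
    return True
```

Note that pattern (2) pushes the hard problem onto the caller: deciding how long to wait for an input that may never arrive.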
In response to Arie van Boven, "Hi Nitin, What you describe here, is e..."
Hi Arie - As Roger Palmen mentioned below, there is a built-in delay (default = 5 seconds), so that multiple events (from different attributes) with the same timestamp do not trigger the analysis more than once. Currently this wait time cannot be configured per analysis, but even if it could be, there would always be cases where some events are received late. We intend to handle these use cases more generally when we implement automatic recalculation for late-arriving or out-of-order events. The use case you described appears to be a special case of the same issue, and we will keep it in mind.
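As an illustration only, here is a Python sketch of this kind of trigger coalescing. The 5-second default comes from the thread; everything else (function name, event representation) is a hypothetical simplification, not how the PI Analysis service is implemented.

```python
# Hypothetical sketch of coalescing triggers for the same data timestamp.


def coalesce_triggers(events, wait=5.0):
    """Given (arrival_time, data_timestamp) events, coalesce events for the
    same data timestamp that arrive within `wait` seconds of the first one
    into a single evaluation. Events arriving after the window has expired
    fire a separate (re-)evaluation for that timestamp."""
    pending = {}       # data timestamp -> arrival time of first pending event
    evaluations = []
    for arrival, ts in sorted(events):
        first = pending.get(ts)
        if first is None:
            pending[ts] = arrival
        elif arrival - first > wait:
            evaluations.append(ts)   # window expired: earlier evaluation fired
            pending[ts] = arrival    # this late event starts a new evaluation
        # else: within the window, coalesced into the pending evaluation
    evaluations.extend(pending)      # flush remaining pending evaluations
    return evaluations
```

In this sketch, two attributes updating within the window produce one evaluation, while an event arriving after the window (like the 14:02 input in the example below) produces a second evaluation for the same timestamp.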
Hello Arie - I am trying to understand your use case. Does the analysis summarize the two inputs independently and simply write the results to two different outputs, or does your final result depend on both summaries? In the first case, you could write these as two independent analyses that would trigger independently. The second case is more interesting. If you do in fact require both inputs to produce the final output, then when the analysis evaluated at 14:00 the first time, the result was essentially incorrect, as it had not yet received the data for the other input (which arrives late, at 14:02). If we were to evaluate the analysis again (when the second input arrives at 14:02), would you expect the new (correct) output value to replace the one that was previously written?