
Conversation

@zer0n (Contributor) commented Jul 11, 2018

No description provided.

@zer0n (Contributor Author) left a comment:

Why don't I see any deletion?

return window


def nab_score(y_true, y_pred):
@zer0n (Contributor Author):

Let's not call it nab score. It's a fairly standard metric.

Collaborator:

I haven't deleted the earlier evaluation (mAP) module. For the new metric, I have created an additional file. I'll change the name from nab_score to weighted_EarlyDetection_score.

@zer0n self-assigned this Jul 18, 2018
dtseng and others added 2 commits July 19, 2018 17:14
@zer0n (Contributor Author) left a comment:

@satyanshukla Please update this

@param window_scale_limit (float): The largest factor by which the windows may be expanded. For example, for
                                    window_scale_limit=2, the new windows will be at most
                                    2 * (current window size).
@param goal_sparsity (int): The goal sparsity of the window after increasing the window size.
@zer0n (Contributor Author):

max_sparsity would be more precise

Collaborator:

Done.

"""

for i, window in enumerate(windows):
    if index <= window[1] and index >= window[0]:
@zer0n (Contributor Author):

window[0] <= index and index <= window[1] is easier to read.

Collaborator:

Done.

if index <= window[1] and index >= window[0]:
    return window
elif index > window[1] and index < windows[i+1][0]:
    return window
@zer0n (Contributor Author):

You return window in both cases? Don't you want to return None here?

@zer0n (Contributor Author):

What do you return at the end of the loop?

Collaborator:

  1. I had these two cases for clarity; I have combined them into one.
  2. The loop will never reach the end; those cases have already been taken care of before this function is called.
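
A minimal sketch of how the combined check could read, keeping the names from the snippet above and making the caller-side assumption explicit:

def getCorrespondingWindow(index, windows):
    # Assumes the caller has already handled points that fall before the
    # first window or after the last one, so a window is always found.
    for i, window in enumerate(windows):
        in_window = window[0] <= index <= window[1]
        in_gap_after_window = index > window[1] and (
            i + 1 == len(windows) or index < windows[i + 1][0])
        if in_window or in_gap_after_window:
            return window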

import numpy as np

def scaledSigmoid(relativePositionInWindow):
    if relativePositionInWindow > 3.0:
@zer0n (Contributor Author):

Will it ever happen? The condition for this function to be called is that the prediction point is properly within the window, right?

Collaborator:

Yes, if the point is predicted right outside the window, the scaled sigmoid is used to calculate its score.
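
For context, NAB's scaled sigmoid is sigma(y) = 2 / (1 + e^(5y)) - 1, where y is the detection's position relative to the window (negative inside the window, 0 at its right edge, positive past it). A minimal sketch assuming that form, with the 3.0 cutoff from the snippet above:

import numpy as np

def scaledSigmoid(relativePositionInWindow):
    # Detections far past the window get the maximum penalty.
    if relativePositionInWindow > 3.0:
        return -1.0
    # Inside the window (negative positions) the value approaches +1 for
    # earlier detections, is 0 at the right edge, and decays toward -1
    # for points just outside the window.
    return 2.0 / (1.0 + np.exp(5.0 * relativePositionInWindow)) - 1.0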

sparsity = sum(y_true)/float(len(y_true))
label_windows = add_buffer_to_label(sparsity, label_windows, 0, len(y_true))

detection_info = {}
@zer0n (Contributor Author):

Use a set instead.
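
For instance, a set of already-detected windows could look like this (a sketch only; it assumes the windows are hashable, e.g. (start, end) tuples):

detected_windows = set()   # replaces detection_info = {}

def first_detection_in(window, detected_windows):
    # True only the first time a prediction falls in this window, so
    # later points in the same window do not add to the score again.
    if window in detected_windows:
        return False
    detected_windows.add(window)
    return True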

tp_score = 0
fp_score = 0
fn_score = 0
for i in range(len(y_pred)):
@zer0n (Contributor Author):

Is i the index of the time series? If so, I would name it t to be clearer.

Collaborator:

Done.

if i < label_windows[0][0]:
    fp_score += -1.0*fp_weight
elif i > label_windows[-1][1]:
    position = abs(label_windows[-1][1]-i)/float(label_windows[-1][1]-label_windows[-1][0])
@zer0n (Contributor Author):

Based on the latest discussion, should this be an fp_weight score?

Collaborator:

Yeah, you cannot give the same weight to fp_weight and fn_weight. For the standard profile, NAB suggests tp_weight = 1.0, fp_weight = 0.11, and fn_weight = 1.0. This compensates for the fact that it is okay to have a couple of false positives if the algorithm catches some true positives.
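
A quick illustration with the weights quoted above (ignoring the sigmoid position scaling, which would make the true-positive term slightly smaller than 1):

tp_weight, fp_weight, fn_weight = 1.0, 0.11, 1.0   # NAB standard profile

# One detected window plus three stray false alarms still scores well,
# while a single missed window cancels a detected one.
score_with_noise = 1 * tp_weight - 3 * fp_weight   # ~0.67
score_with_miss = 1 * tp_weight - 1 * fn_weight    # 0.0
print(score_with_noise, score_with_miss)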


else:
    cWindow = getCorrespondingWindow(i, label_windows)
    if i <= cWindow[1] and i >= cWindow[0] and detection_info[cWindow] == 0:
@zer0n (Contributor Author):

If there are several points falling in the same window, we should not overcount them.

A cleaner implementation would be (see the sketch after this list):

1. Go through the windows.
2. For each window:
   - Find the first prediction index in that window (from window[0] to window[1]).
   - Change the subsequent 1's in the predictions, within the window range, to 0.
   - Compute the window's score; if no anomaly prediction is found, use the false negative weight.
   - Return the window's score.
3. The remaining prediction points are false positives. Add those to the total score.
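
A minimal sketch of that flow (the function name, signature, and the linear earliness factor are placeholders; the metric in this PR scores the first detection with the scaled sigmoid instead):

import numpy as np

def window_first_score(label_windows, y_pred,
                       tp_weight=1.0, fp_weight=0.11, fn_weight=1.0):
    # label_windows: list of inclusive (start, end) index pairs.
    # y_pred: 0/1 array of predictions over the whole series.
    y_pred = np.asarray(y_pred).copy()
    score = 0.0
    for start, end in label_windows:
        hits = np.flatnonzero(y_pred[start:end + 1])
        # Clear every prediction inside the window so whatever remains
        # afterwards is a genuine false positive.
        y_pred[start:end + 1] = 0
        if hits.size == 0:
            score -= fn_weight                      # window missed entirely
        else:
            earliness = 1.0 - hits[0] / float(end - start + 1)
            score += tp_weight * earliness          # only the first hit counts
    # Whatever is still flagged lies outside every labeled window.
    score -= fp_weight * int(y_pred.sum())
    return score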

Collaborator:

For false positives we also need to find the preceding window. If several points fall in the same window, that is handled by the detection_info dictionary: once a window has been detected, its entry turns to 1 and subsequent points falling in that window do not contribute to the score.

@zer0n (Contributor Author):

It's unclear whether you're explaining that the existing code already handles this or whether you're going to submit an update commit.
