Add normal interval to plotratio #182

paulgessinger · 2019-10-17T09:12:32Z

I was thinking about adding the efficiency error calculation as it is implemented in ROOT's TEfficiency. Before I write comments and tests, I wanted to ask whether this makes sense, since there is already the num method.

One thing I noticed with the way the errors are calculated (see https://root.cern.ch/doc/master/TEfficiency_8cxx_source.html#l02515 and below) is that if num == denom, the error will be 0. Does that even make sense? Technically, scipy.stats.norm.ppf complains if it is called with sigma = 0, so I'm masking out the zero elements and manually setting their errors to zero.
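
For readers following along, the calculation described above might look roughly like the sketch below. It is an illustration written from the description in this thread, not the merged code: the function name `normal_interval_sketch`, the constant `COVERAGE_1SD`, and the argument names are assumptions, and the variance formula is the weighted form obtained by propagating errors through pass/(pass + fail).

```python
import numpy as np
from scipy.stats import norm

# One-standard-deviation central coverage, about 68.27% (an assumed default).
COVERAGE_1SD = norm.cdf(1) - norm.cdf(-1)


def normal_interval_sketch(pw, tw, pw2, tw2, coverage=COVERAGE_1SD):
    """Normal-approximation errors on eff = pw / tw, possibly weighted.

    pw, tw: sums of weights of passing entries and of all entries.
    pw2, tw2: sums of squared weights of passing entries and of all entries.
    Returns an array of shape (2, n): the distance from eff down to the
    lower bound and the distance from eff up to the upper bound.
    """
    pw, tw, pw2, tw2 = (
        np.atleast_1d(np.asarray(a, dtype=float)) for a in (pw, tw, pw2, tw2)
    )
    with np.errstate(divide="ignore", invalid="ignore"):
        eff = pw / tw
        variance = (pw2 * (1 - 2 * eff) + tw2 * eff**2) / tw**2
    sigma = np.sqrt(variance)

    # Lower-tail quantile of a normal with width sigma; negative by construction.
    prob = 0.5 * (1 - coverage)
    delta = np.zeros_like(sigma)
    nonzero = sigma != 0  # scale == 0 is not a valid argument for norm.ppf
    delta[nonzero] = norm.ppf(prob, scale=sigma[nonzero])

    # Clip the interval to the physical range [0, 1].
    lower = np.maximum(eff + delta, 0.0)
    upper = np.minimum(eff - delta, 1.0)
    return np.array([eff - lower, upper - eff])
```

With unweighted counts (pw2 == pw, tw2 == tw) the variance reduces to the binomial eff * (1 - eff) / tw, so a bin with num == denom gets sigma == 0 and falls into the masked branch.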

Thoughts?

lgray · 2019-10-17T14:01:20Z

Please fix flake8 violations:

$ python setup.py flake8
running flake8
coffea/hist/plot.py:80:1: E302 expected 2 blank lines, found 1
coffea/hist/plot.py:83:13: E226 missing whitespace around arithmetic operator
coffea/hist/plot.py:85:17: E201 whitespace after '('
coffea/hist/plot.py:85:30: E226 missing whitespace around arithmetic operator
coffea/hist/plot.py:85:50: E202 whitespace before ')'
coffea/hist/plot.py:88:15: E226 missing whitespace around arithmetic operator
coffea/hist/plot.py:88:18: E226 missing whitespace around arithmetic operator
The command "python setup.py flake8" exited with 1.

lgray · 2019-10-17T14:04:41Z

Since the error on an efficiency is derived from two correlated numbers, when the numerator and denominator are equal, their variations are 100% correlated and of the same size, so the statistical error on an efficiency value of 1 is zero.

This manifestly undercovers, and there are many recipes for getting good coverage, each with its own tradeoffs. Typically Clopper-Pearson is favored, but other implementations of efficiency errors are welcome.
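
As a point of comparison, a Clopper-Pearson interval for integer counts can be sketched with beta-distribution quantiles. This is a minimal illustration, not the implementation already available in coffea; the function name and the 68% default coverage are assumptions.

```python
import numpy as np
from scipy.stats import beta


def clopper_pearson_sketch(passed, total, coverage=0.68):
    """Central Clopper-Pearson interval for integer counts passed out of total."""
    k = np.atleast_1d(np.asarray(passed, dtype=float))
    n = np.atleast_1d(np.asarray(total, dtype=float))
    alpha = 1.0 - coverage
    # Beta quantiles; a shape parameter of 0 yields nan, which is replaced below.
    lower = beta.ppf(alpha / 2, k, n - k + 1)
    upper = beta.ppf(1 - alpha / 2, k + 1, n - k)
    lower = np.where(k == 0, 0.0, lower)  # no successes: lower bound is 0
    upper = np.where(k == n, 1.0, upper)  # no failures: upper bound is 1
    return lower, upper
```

For passed == total the upper bound is 1 but the lower bound stays strictly below 1, so the interval keeps a finite width where the normal approximation collapses to zero.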

lgray · 2019-10-17T14:05:04Z

Please add a test for this new function.

nsmith- · 2019-10-17T15:37:13Z

So, of course, Clopper-Pearson is already available.
What this looks like is an attempt to find the frequentist coverage for a ratio of independent normal distributions, is that correct? If so, can you cite https://en.wikipedia.org/wiki/Ratio_distribution#Uncorrelated_noncentral_normal_ratio rather than a line in ROOT with no context?

paulgessinger · 2019-10-17T16:00:22Z

My understanding was that Clopper-Pearson is not valid for non-integer numerators and denominators.

I think what you're describing is basically what is happening there. The reason I link to the ROOT code is that it is literally where I got the algorithm from, and I think it makes sense to disclose that. I can certainly add the Wikipedia link you provided, but I'd prefer to keep the other URL in there as well.

lgray · 2019-10-17T16:07:46Z

Ah yeah, fair enough, this would be needed for efficiencies derived from weighted data (which is kind of rare, but definitely happens).
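
To make the weighted case concrete, here is a small sketch of how the per-bin inputs arise from weighted events. The toy data, the variable names, and the binning are all invented for illustration; only `numpy.histogram` and its `weights` argument are real API.

```python
import numpy as np

rng = np.random.default_rng(0)
n_events = 1000
x = rng.uniform(0.0, 1.0, size=n_events)        # toy observable
w = rng.uniform(0.5, 1.5, size=n_events)        # toy per-event weights
passed = rng.uniform(size=n_events) < 0.8       # toy selection flag

bins = np.linspace(0.0, 1.0, 5)

# Denominator: sum of weights and sum of squared weights per bin.
tw, _ = np.histogram(x, bins=bins, weights=w)
tw2, _ = np.histogram(x, bins=bins, weights=w**2)

# Numerator: the same sums restricted to events passing the selection.
pw, _ = np.histogram(x[passed], bins=bins, weights=w[passed])
pw2, _ = np.histogram(x[passed], bins=bins, weights=w[passed] ** 2)

eff = pw / tw  # per-bin weighted efficiency
```

The bin contents `pw` and `tw` are generally not integers, which is why an interval built on binomial counts does not directly apply here.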

paulgessinger · 2019-10-17T16:09:37Z

Well, I have this specific case right now; that's why I thought of adding it 😄

nsmith- · 2019-10-17T16:37:13Z

Ah, never mind, this is just the result of expanding eff = p/(p+f) in partial derivatives with respect to p and f.
Maybe we should just allow a callback function to be passed into plotratio.
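
Written out, the expansion mentioned above goes as follows. The notation is chosen here for illustration and is not taken from the PR: $p$ and $f$ are the (weighted) passed and failed sums, assumed independent, $t = p + f$, and $p_2$, $t_2$ are the sums of squared weights for passed and for all entries, so that $\sigma_p^2 = p_2$ and $\sigma_f^2 = t_2 - p_2$.

```latex
\begin{align}
\varepsilon &= \frac{p}{p+f}, \qquad
\frac{\partial\varepsilon}{\partial p} = \frac{f}{(p+f)^2}, \qquad
\frac{\partial\varepsilon}{\partial f} = -\frac{p}{(p+f)^2} \\[4pt]
\sigma_\varepsilon^2
  &= \frac{f^2\,\sigma_p^2 + p^2\,\sigma_f^2}{(p+f)^4}
   = \frac{(t-p)^2\,p_2 + p^2\,(t_2 - p_2)}{t^4} \\[4pt]
  &= \frac{p_2\,(1 - 2\varepsilon) + t_2\,\varepsilon^2}{t^2}
\end{align}
```

For unweighted counts ($p_2 = p$, $t_2 = t$) this reduces to the binomial $\varepsilon(1-\varepsilon)/t$, and for $f = 0$ with no failed entries ($t_2 = p_2$) it vanishes, which is the zero error at num == denom discussed earlier in the thread.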

coffea/hist/plot.py

nsmith- · 2019-10-17T20:37:47Z

I think we should go ahead with this PR. @paulgessinger can you fix the flake8 complaints?

paulgessinger · 2019-10-17T20:51:58Z

Will do.

lgray · 2019-10-18T08:17:17Z

@paulgessinger can you please add a test for normal_interval? Thanks!

paulgessinger · 2019-10-18T08:19:45Z

I'm working on it.

paulgessinger · 2019-10-18T11:19:19Z

Flake8 should be fixed. I added a test that compares against output I got from TEfficiency. The test also covers edge cases like passed == total and passed == total == 0 (which results in a nan).
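
The edge cases named above can be checked along these lines. This is a sketch rather than the test added in the PR: `efficiency_sigma` is a hypothetical stand-in that computes only the efficiency and its standard deviation, and no TEfficiency reference values are reproduced here.

```python
import numpy as np


def efficiency_sigma(pw, tw, pw2, tw2):
    """Hypothetical stand-in: efficiency and its normal-approximation sigma."""
    with np.errstate(divide="ignore", invalid="ignore"):
        eff = pw / tw
        variance = (pw2 * (1 - 2 * eff) + tw2 * eff**2) / tw**2
    return eff, np.sqrt(variance)


def test_edge_cases():
    # Unweighted counts, so the sums of squared weights equal the counts.
    passed = np.array([10.0, 0.0, 5.0])
    total = np.array([10.0, 0.0, 10.0])
    eff, sigma = efficiency_sigma(passed, total, passed, total)

    # passed == total: efficiency of 1 with zero error.
    assert eff[0] == 1.0 and sigma[0] == 0.0
    # passed == total == 0: 0/0 propagates as nan.
    assert np.isnan(eff[1]) and np.isnan(sigma[1])
    # Ordinary bin: matches the binomial sqrt(eff * (1 - eff) / total).
    np.testing.assert_allclose(sigma[2], np.sqrt(0.5 * 0.5 / 10.0))
```

The function is written in pytest style, so it can be collected by a test runner or called directly.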

lgray · 2019-10-18T11:42:32Z

Looks good, thank you!

lgray · 2019-10-18T11:54:19Z

I added a docstring to the function after the fact.
Can you take a look at:
https://github.com/CoffeaTeam/coffea/blob/master/coffea/hist/plot.py#L82-L98
and let me know if it's ok?

paulgessinger · 2019-10-18T12:41:21Z

Yeah, looks good, I think.

add normal interrval to plotratio

77389e8

paulgessinger changed the title ~~Add normal interrval to plotratio~~ Add normal interval to plotratio Oct 17, 2019

Merge branch 'master' into normal-interval-eff-error

983a861

nsmith- reviewed Oct 17, 2019

coffea/hist/plot.py (outdated)

paulgessinger added 2 commits October 18, 2019 10:12

flake8 fixes

c6971fa

s/_coverage1sd/coverage/g

217ed3e

paulgessinger added 2 commits October 18, 2019 13:18

flip sign

a4fd53a

add test for normal_interval

a8a0ddc

Merge branch 'master' into normal-interval-eff-error

74e1d7d

lgray merged commit 97cb79b into scikit-hep:master Oct 18, 2019

paulgessinger deleted the normal-interval-eff-error branch October 18, 2019 11:49

nsmith- mentioned this pull request Apr 8, 2021

feat: Add frequentist coverage intervals module scikit-hep/hist#176

Merged
