Using the ROC Curve to Measure Sensitivity & Specificity

by Irina 16. October 2007 11:58


Two indices are used to evaluate the accuracy of a test that predicts dichotomous outcomes (e.g. logistic regression) – sensitivity and specificity. They describe how well a test discriminates between cases with and without a certain condition.

Sensitivity - the proportion of true positives or the proportion of cases correctly identified by the test as meeting a certain condition (e.g. in mammography testing, the proportion of patients with cancer who test positive).

Specificity - the proportion of true negatives or the proportion of cases correctly identified by the test as not meeting a certain condition (e.g. in mammography testing, the proportion of patients without cancer who test negative).

Lift - a measure of a predictive model's performance, calculated as the ratio between the results obtained with and without the predictive model.


Choosing a Cut-off

The position of the cut-off determines the number of true positives, true negatives, false positives, and false negatives. As you increase sensitivity and identify more cases with a certain condition, you sacrifice accuracy in identifying those without the condition (specificity). A cut-off value (C) that balances the two can be estimated by maximizing the index J:

J = MAX over C of (Sensitivity(C) + Specificity(C))

Maximizing this sum is equivalent to maximizing Youden's index, Sensitivity(C) + Specificity(C) - 1.
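To make the cut-off search concrete, here is a minimal sketch in Python (not SAS, purely for illustration). The scores and labels are invented toy data, and the function names are hypothetical; a case is called positive when its score is at or above the cut-off.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, labels):
    """Try every observed score as a cut-off and keep the first one that maximizes J."""
    best = None
    for c in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, c)
        j = sens + spec
        if best is None or j > best[1]:
            best = (c, j)
    return best


# Toy data (assumed for illustration): predicted probabilities and true outcomes.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j = best_cutoff(scores, labels)
print(cutoff, j)
```

When several cut-offs tie on J, this sketch keeps the lowest one; which tie-breaking rule is appropriate depends on the relative cost of false positives and false negatives.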

Receiver Operating Characteristic (ROC) Curve

A Receiver Operating Characteristic (ROC) curve is a graphical representation of the trade-off between the false negative and false positive rates for every possible cut-off. By tradition, the plot shows the false positive rate (1 - specificity) on the X axis and the true positive rate (sensitivity, or 1 - the false negative rate) on the Y axis. The accuracy of a test (i.e. the ability of the test to correctly classify cases with a certain condition and cases without the condition) is measured by the area under the ROC curve. An area of 1 represents a perfect test, while an area of 0.5 represents a worthless test. The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test: the true positive rate is high and the false positive rate is low. Statistically, more area under the curve means that the test identifies more true positives while minimizing the number or percentage of false positives.

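  ods select parameterestimates association;
  proc logistic data=data1;
     /* events/trials syntax: disease events out of n trials at each age */
     model disease/n=age / outroc=roc1 roceps=0;
     output out=outp p=phat;
     ods output association=assoc;
  run;

  /* Store the c statistic (area under the ROC curve) in a macro variable */
  data _null_;
     set assoc;
     if label2='c' then call symput("area",cvalue2);
  run;

  /* The TITLE must follow the completed DATA step so that &area is defined */
  title "area=&area";
  proc gplot data=roc1;
     plot _sensit_*_1mspec_;
  run;
  quit;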

It is important to use the ROCEPS=0 option in the MODEL statement of PROC LOGISTIC when you fit your model because this option allows all the unique predicted values to be output to the OUTROC= data set. Otherwise, the values may be rounded yielding fewer points on the ROC plot.
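To show what the area under the ROC curve measures, the following Python sketch (an illustration, not a replacement for the SAS code) builds one ROC point per unique predicted value and integrates with the trapezoidal rule. The data and function names are assumptions made for the example.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per unique cut-off."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step, so both rates grow from 0 to 1.
    for c in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (assumed for illustration).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

auc = area_under_curve(roc_points(scores, labels))
print(auc)
```

Because every unique score contributes a point, this mirrors the effect of keeping all predicted values in the OUTROC= data set; grouping nearby values would give fewer points and a coarser curve.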

A scorecard for Logistic Regression models

by Irina 25. August 2007 08:58

Scorecards are a common way of displaying the patterns found by a logistic regression model. They display in a clear, intuitive way the regression coefficients and can be used to perform risk evaluation operations (simplified predictions). For one particular state, y1, we start by extracting the coefficients (c0,c1, ...) that describe the logistic regression formula for that state.

Within each variable, we shift the coefficients so that the smallest one becomes 0 while the differences between the coefficients remain the same. The shifted coefficients are then normalized to a range of, say, 0 to 1000, giving an intuitive perspective on the relative importance of each coefficient. As each coefficient corresponds to a state of an input attribute, the normalized values also describe the relative importance of each input attribute state. The scorecard presented here computes these relative importance scores. A scorecard checks certain conditions, and if those conditions are met, points are added to an overall score.

proc logistic data=Panel OutModel= ModelParam   namelen=200
descend    ;
class &groupp
/ 	param=glm ;
model target=&groupp/selection=stepwise;
output out=toz_LOGISTIC_2 p=phat_new xbeta=xb;
ods output ParameterEstimates =  coeff_est;
run;

proc sql;
create table score_card as
select
   b.*,
   /* sum over all variables of each variable's largest shifted coefficient */
   sum(max_est1/counter) as sum_max,
   /* flag the level that carries the most points within its variable */
   case when est1=max_est1 then 1 else 0 end as max_cat,
   round(1000*(est1/calculated sum_max)) as score
from (select
         a.*,
         max(est1) as max_est1
      from (select *,
               min(Estimate) as min_est,
               count(*) as counter,
               /* shift so that the smallest coefficient in each variable is 0 */
               case when calculated min_est=Estimate
                    then 0
                    else Estimate - calculated min_est end as est1
            from coeff_est
            where variable ne 'Intercept'
            group by variable) a
      group by variable) b
;
quit;
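The shift-and-normalize logic of the query above can be sketched in a few lines of Python (for illustration only). The coefficient values, variable names and the `build_scorecard` helper are all hypothetical.

```python
def build_scorecard(coefficients, scale=1000):
    """coefficients: {variable: {level: estimate}}, intercept excluded.

    Shifts each variable so its smallest coefficient is 0, then rescales so
    that the best level of every variable together adds up to `scale` points.
    """
    shifted = {
        var: {lvl: est - min(levels.values()) for lvl, est in levels.items()}
        for var, levels in coefficients.items()
    }
    total_max = sum(max(levels.values()) for levels in shifted.values())
    return {
        var: {lvl: round(scale * value / total_max) for lvl, value in levels.items()}
        for var, levels in shifted.items()
    }


# Hypothetical coefficients for two class variables.
coefficients = {
    "age": {"<30": 0.0, "30-50": 0.6, ">50": 1.2},
    "income": {"low": -0.3, "high": 0.5},
}

card = build_scorecard(coefficients)
# Points for one case: add the points of the levels that apply to it.
total = card["age"][">50"] + card["income"]["high"]
print(card, total)
```

A case that falls in the highest-scoring level of every variable receives the full 1000 points, which is what dividing by the sum of the per-variable maxima achieves.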

Rare Event Data

by Irina 6. July 2007 10:35
The literature identifies two problems that make rare events difficult to predict:
  • Popular statistical procedures, such as logistic regression, can sharply underestimate the probability of rare events
  • Commonly used data collection strategies are inefficient for rare event data

    Solution:

    More efficient sampling designs exist for making valid inferences,
    for example sampling all available events and a tiny fraction of nonevents.
    This can save as much as 99% of data collection costs and/or make it possible to collect much more meaningful (expensive) feature variables.

    Sampling:

    Examples (x, y, s)

  • s controls the selection of examples (1 means selected, 0 means not selected)
  • We only have access to the s=1 examples
  • s is independent of x given y
  • P(s|x,y)=P(s|y)
  • Selected examples are biased
  • The bias depends only on the label y
  • This corresponds to a change in the prior probabilities of the labels

  • This kind of sampling is also called oversampling, retrospective sampling, biased sampling, or choice-based sampling.
    The oversampling method has been widely used in signal detection theory; it consists of resampling the small class at random until it contains as many examples as the other class.
    The downsizing (undersampling) method consists of randomly removing samples from the majority class until the minority class reaches some specified percentage of the majority class.
    This produced two different datasets for each time step: one with a churner/nonchurner ratio 1/1 and the other with a ratio 2/3.

    In the biological sciences, studies using this kind of sampling are known as case-control studies. Parameter and odds ratio estimates of the covariates (and their confidence limits) are unaffected by this type of stratified sampling. However, the intercept estimate is affected by the sampling, so any computation that is based on the full set of parameter estimates is incorrect, such as the predicted event probabilities and differences or ratios of event probabilities. If you know the probabilities of events and nonevents in the population, then you can adjust the intercept either by weighting or by using an offset.

    Adjusting the Intercept
    To adjust by weighting, add a variable to your data set that takes the value p1/r1 in event observations, and the value (1-p1)/(1-r1) in nonevent observations, where p1 is the probability of an event in the population and r1 is the proportion of events in your data set. Specify this variable in the WEIGHT statement in PROC LOGISTIC. Or, to adjust by using an offset, add a variable to your data set defined as log[(r1*(1-p1)) / ((1-r1)*p1)], where log represents the natural logarithm. Specify this variable in the OFFSET= option of the MODEL statement in PROC LOGISTIC.
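As a quick numeric check of the two formulas, here is a small Python sketch (illustrative only; the function names and the values p1 = 0.05 and r1 = 0.5 are assumptions). It computes the weights and the offset, and shows how subtracting the offset from a linear predictor fitted on the oversampled data moves the probability back to the population scale.

```python
import math


def sampling_weights(p1, r1):
    """Return (weight for events, weight for nonevents)."""
    return p1 / r1, (1 - p1) / (1 - r1)


def sampling_offset(p1, r1):
    """Natural log of the ratio of sample odds to population odds of an event."""
    return math.log((r1 * (1 - p1)) / ((1 - r1) * p1))


def adjusted_probability(xbeta_sample, p1, r1):
    """Population-scale probability from a linear predictor fitted on the sample."""
    z = xbeta_sample - sampling_offset(p1, r1)
    return 1 / (1 + math.exp(-z))


# Assumed rates: 5% events in the population, 50% events in the sample.
p1, r1 = 0.05, 0.5
w_event, w_nonevent = sampling_weights(p1, r1)
off = sampling_offset(p1, r1)

# A case with even odds in the balanced sample has a 5% probability in the population.
p_pop = adjusted_probability(0.0, p1, r1)
print(w_event, w_nonevent, off, p_pop)
```

With these rates the offset equals log(19), so the intercept fitted on the sample overstates the population log odds by that amount.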

    Example:

            data full;
            do i=1 to 1000;
              x=rannor(12342);
              p=1/(1+exp(-(-3.35+2*x)));
              y=ranbin(98435,1,p);
              drop i;
              output;
            end;
            run;
    
          data sub;
            set full;
            if y=1 or (y=0 and ranuni(75302)<1/9) then output;
            run;
    
          proc freq data=full;
            table y / out=fullpct(where=(y=1) rename=(percent=fullpct));
            title "response counts in full data set";
            run;
          proc freq data=sub;
            table y / out=subpct(where=(y=1) rename=(percent=subpct));
            title "Response counts in oversampled, subset data set";
            run;
          data sub;
            set sub;
            if _n_=1 then set fullpct(keep=fullpct);
            if _n_=1 then set subpct(keep=subpct);
            p1=fullpct/100; r1=subpct/100;
            w=p1/r1; if y=0 then w=(1-p1)/(1-r1);
            off=log( (r1*(1-p1)) / ((1-r1)*p1) );
            run;
    
          ods select parameterestimates(persist);
          proc logistic data=sub;
            model y(event="1")=x;
            output out=out p=pnowt;
            title "True Parameters: -3.35 (intercept), 2 (X)";
            title2 "Unadjusted Model";
            run;
          proc logistic data=out;
            model y(event="1")=x; weight w;
            output out=out p=pwt;
            title2 "Weight-adjusted Model";
            run;
         proc logistic data=out;
            model y(event="1")=x / offset=off;
            output out=out xbeta=xboff;
            title2 "Offset-adjusted Model";
            run;
          data out;
            set out;
            poff=logistic(xboff-off);
            run;
          proc freq data=full noprint;
            table y / out=priors(drop=percent rename=(count=_prior_));
            run;
          proc logistic data=out;
            model y(event="1")=x;
            score data=sub prior=priors out=out2;
            title2 "Unadjusted Model; Prior-adjusted probabilities";
            run;
    
    

    Scoring a New Data Set and interpreting of results

    by Irina 7. May 2007 11:25

    There are situations where we want not only to produce predicted probabilities but also to understand why we received a particular result. Or we may want to divide the high-scoring population into groups according to the parameters that contributed most to the score.

    The beginning of the process: Automatic building of logistic model.

    Using the SAS code to do this: Interpret logistic score.

     


    Automatic building of logistic model.

    by Irina 21. April 2007 09:19

    Description: The data is a collection of information on colleges and universities (for illustration only; it does not pretend to be real). The primary interest is in predicting graduation. Potential predictor variables are tuition, income, wealth and grades in different subjects (200 rows).

    The process:

    1. The first section of code splits the file into modeling and validation data sets. Validation sets constructed from a 50/50 stratified sample should be adequate for the purposes of this exercise. I used a 95/5 split here only as an example, because the data set is very small.

    				
    						
    DATA model_college;
     SCAN: SET college END=eof;
           N+1;
           IF NOT eof THEN GOTO SCAN;
           K=0.95*N;       * K IS THE NUMBER TO RANDOMLY SELECT
                             IT MAY BE A FUNCTION OF N,
                             E.G.: K=.05*N FOR A 5 PERCENT SAMPLE;
     LOOP: SET college;
           PROB=K/N;       * PROB IS THE CURRENT SELECT PROBABILITY;
           IF RANUNI(123467)>PROB THEN GOTO NEXT;
           OUTPUT;
           K=K-1;          * THE OBSERVATION IS SELECTED;
     NEXT: N=N-1;
           IF N>0 THEN GOTO LOOP;
    RUN;

    For the next steps go to: Automatic building of logistic model
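The DATA step above uses sequential selection sampling: each row is kept with probability (rows still needed) / (rows still to be examined). A minimal Python sketch of the same idea follows, for illustration only. The function name is hypothetical, k is assumed to be an integer, and Python's random generator will not select the same rows as RANUNI even with the same seed.

```python
import random


def selection_sample(rows, k, seed=123467):
    """Pick exactly k rows (k an integer, 0 <= k <= len(rows)) in one pass, keeping order."""
    rng = random.Random(seed)
    remaining = len(rows)
    selected = []
    for row in rows:
        # Current selection probability: rows still needed / rows still to examine.
        if rng.random() < k / remaining:
            selected.append(row)
            k -= 1
        remaining -= 1
    return selected


# 200 row ids standing in for the college data; keep 95% for modeling.
rows = list(range(200))
model_rows = selection_sample(rows, 190)
kept = set(model_rows)
validation_rows = [r for r in rows if r not in kept]
print(len(model_rows), len(validation_rows))
```

Because the probability becomes 1 when the rows still needed equal the rows left, and 0 once enough rows have been taken, the sample size is exact rather than approximate.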

    Data for model:   college.txt (10.17 kb)


    Hosmer and Lemeshow Test

    by Irina 14. April 2007 12:13

    The Hosmer-Lemeshow goodness-of-fit test can be requested with the lackfit option in the model statement. The test divides subjects into deciles based on predicted probabilities, then computes a chi-square statistic from the observed and expected frequencies. It tests the null hypothesis that there is no difference between the observed and predicted values of the response variable. Therefore, when the test is not significant, as in this example, we cannot reject the null hypothesis, which suggests that the model fits the data adequately. We can also request the generalized R-square measure for the model with the rsquare option in the model statement. SAS gives the likelihood-based pseudo R-square measure and its rescaled version.

     Categorical Data Analysis Using The SAS System, by M. Stokes, C. Davis and G. Koch offers more details on how the generalized R-square measures that you can request are constructed and how to interpret them.

    proc logistic data = hsb2;
      class prog(ref='1') / param = ref;
      model hiwrite(event='1') = female prog read math / rsq lackfit;
    run;
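To illustrate how the statistic behind the lackfit option is built, here is a simplified Python sketch. It sorts cases by predicted probability, cuts them into equal-sized groups and sums (observed - expected)^2 / (n * pbar * (1 - pbar)) over the groups. The function name and data are hypothetical, and the way SAS forms groups (for example when predicted values are tied) may differ from this sketch.

```python
def hosmer_lemeshow_statistic(probs, outcomes, groups=10):
    """Chi-square statistic comparing observed and expected events per group.

    The statistic is usually referred to a chi-square distribution with
    groups - 2 degrees of freedom. Assumes no group has a mean predicted
    probability of exactly 0 or 1.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        pbar = expected / size
        stat += (observed - expected) ** 2 / (size * pbar * (1 - pbar))
    return stat


# Toy data (assumed): two groups of four cases each.
probs = [0.25] * 4 + [0.75] * 4
well_calibrated = [0, 0, 0, 1, 1, 1, 1, 0]
overconfident = [0, 0, 0, 0, 1, 1, 1, 1]

print(hosmer_lemeshow_statistic(probs, well_calibrated, groups=2))
print(hosmer_lemeshow_statistic(probs, overconfident, groups=2))
```

A statistic near 0 means the observed event counts match the predicted ones in every group; larger values point to lack of fit.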

    Confidence intervals for the predicted values - logistic regression

    by Irina 14. April 2007 08:36
    Using predict after logistic to get predicted probabilities and confidence intervals is somewhat tricky. The following two commands will give you predicted probabilities:
            . logistic ...
            . predict phat
    
    The following does not give you the standard error of the predicted probabilities:
            . logistic ...
            . predict se_phat, stdp
    
    Despite the name we chose, se_phat does not contain the standard error of phat. What does it contain? The standard error of the predicted index. The index is the linear combination of the estimated coefficients and the values of the independent variable for each observation in the dataset. Suppose we fit the following logistic regression model:
            . logistic y x 
    
    This model estimates b0 and b1 of the following model: P(y = 1) = exp(b0+b1*x)/(1 + exp(b0+b1*x)). Here the index is b0 + b1*x. We could get predicted values of the index and its standard error as follows:
            . logistic y x
            . predict lr_index, xb
            . predict se_index, stdp
    
    We could transform our predicted value of the index into a predicted probability as follows:
    . gen p_hat = exp(lr_index)/(1+exp(lr_index))
    
    This is just what predict does by default after a logistic regression if no options are specified. Using a similar procedure, we can get a 95% confidence interval for our predicted probabilities by first generating the lower and upper bounds of a 95% confidence interval for the index and then converting these to probabilities:
    
    . gen lb = lr_index - invnorm(0.975)*se_index
    . gen ub = lr_index + invnorm(0.975)*se_index
    . gen plb = exp(lb)/(1+exp(lb))
    . gen pub = exp(ub)/(1+exp(ub))
    
    Generating the confidence intervals for the index and then converting them to probabilities to get confidence intervals for the predicted probabilities is better than estimating the standard error of the predicted probabilities and then generating the confidence intervals directly from that standard error. The distribution of the predicted index is closer to normality than the predicted probability.
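The same computation can be written outside Stata. The Python sketch below (illustrative; the function names and the input values are assumptions) builds the interval on the index scale and then transforms both bounds to probabilities.

```python
import math
from statistics import NormalDist


def logistic(z):
    """Inverse logit: map an index value to a probability."""
    return 1 / (1 + math.exp(-z))


def probability_interval(index, se_index, level=0.95):
    """Return (lower, estimate, upper) on the probability scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    lower = index - z * se_index
    upper = index + z * se_index
    return logistic(lower), logistic(index), logistic(upper)


# Hypothetical index (b0 + b1*x) and its standard error for one observation.
plb, p_hat, pub = probability_interval(0.0, 0.5)
print(plb, p_hat, pub)
```

Because the bounds are transformed rather than computed directly on the probability scale, the interval always stays between 0 and 1 and is generally not symmetric around the predicted probability.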
  • About the author

    Irina Spivak
    Team Leader at G-Stat.

      Disclaimer

      The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

      © Copyright 2013
