Hi,
Is the cutoff point in logistic regression related to the event rate in the dataset? Your input on this would be highly appreciated.
Best,
Hari
Permalink Reply by Sandeep Sunkara on February 15, 2011 at 1:33am Yes it is.
If you have an event rate of 10%, the predicted probabilities will average about 0.1, and hence a sensible cut-off point will usually also be around 0.1.
If you have an event rate of 70%, the predicted probabilities will average about 0.7, and hence a sensible cut-off point will usually also be around 0.7.
Hope it helps.
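To illustrate the point above, here is a minimal sketch in plain Python. It assumes a one-variable logistic model with an intercept, fitted by maximum likelihood on simulated data; the coefficients, sample size, and function names are made up for illustration. For such a fit, the average predicted probability matches the observed event rate, which is why cut-offs tend to sit near the event rate.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, iters=25):
    """Fit p = sigmoid(b0 + b1 * x) by Newton's method (maximum likelihood)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1)
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the 2x2 matrix X'WX (negative Hessian)
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(X'WX) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Simulated data (assumption): true intercept -2.5, slope 1.0, giving a low event rate
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
y = [1 if random.random() < sigmoid(-2.5 + 1.0 * xi) else 0 for xi in x]

b0, b1 = fit_logistic(x, y)
event_rate = sum(y) / len(y)
mean_pred = sum(sigmoid(b0 + b1 * xi) for xi in x) / len(x)

print(f"event rate:                 {event_rate:.4f}")
print(f"mean predicted probability: {mean_pred:.4f}")
```

The two printed numbers agree because the intercept's likelihood equation forces the residuals to sum to zero. Individual predictions can still spread well away from the event rate, so the best cut-off is not guaranteed to equal it.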
Permalink Reply by Minethedata on February 17, 2011 at 12:30am
Permalink Reply by Sandeep Sunkara on February 17, 2011 at 12:38am
Permalink Reply by Jozo Kovac on March 26, 2011 at 7:07am
Permalink Reply by Triveni Hiremath on March 30, 2011 at 2:22am Hi Hari,
If you are talking about the cutoff point for the probability value, you can decide it in two ways.
1. Calculate the misclassification cost for different probability values, and choose the one with the lowest misclassification cost.
2. Draw a lift chart of the probability values, showing the number of accurately classified events per decile (more precisely, the results of the KS test). The point where you get the greatest distance is the cutoff point for your probability.
Hope this helps
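The two approaches above can be sketched roughly as follows. This is only an illustration: the function names, the toy labels and scores, and the cost values are assumptions, and the KS distance is taken here as the true positive rate minus the false positive rate at each candidate cutoff rather than read off a decile chart.

```python
def cost_cutoff(y, p, cost_fp=1.0, cost_fn=1.0, grid=None):
    """Return the cutoff on the grid with the lowest total misclassification cost."""
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]

    def total_cost(c):
        # A row is flagged as an event when its probability is >= c
        fp = sum(1 for yi, pi in zip(y, p) if pi >= c and yi == 0)
        fn = sum(1 for yi, pi in zip(y, p) if pi < c and yi == 1)
        return cost_fp * fp + cost_fn * fn

    return min(grid, key=total_cost)

def ks_cutoff(y, p, grid=None):
    """Return the cutoff on the grid where TPR - FPR (the KS distance) is largest."""
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]
    n_event = sum(y)
    n_nonevent = len(y) - n_event

    def ks_distance(c):
        tpr = sum(1 for yi, pi in zip(y, p) if pi >= c and yi == 1) / n_event
        fpr = sum(1 for yi, pi in zip(y, p) if pi >= c and yi == 0) / n_nonevent
        return tpr - fpr

    return max(grid, key=ks_distance)

# Toy data (made up): actual outcomes and model probabilities
y = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]
p = [0.03, 0.05, 0.08, 0.10, 0.12, 0.15, 0.22, 0.30, 0.45, 0.60]

equal_cost = cost_cutoff(y, p)                    # a miss costs the same as a false alarm
costly_miss = cost_cutoff(y, p, cost_fn=5.0)      # a missed event costs 5x a false alarm
ks_point = ks_cutoff(y, p)

print("equal-cost cutoff: ", equal_cost)
print("costly-miss cutoff:", costly_miss)
print("KS cutoff:         ", ks_point)
```

Making a missed event more expensive pulls the cost-based cutoff down, which is the usual situation when events are rare but important.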
Permalink Reply by Arun on March 30, 2011 at 10:45am If your event rate is around 17% and you say that at 50% cutoff you're getting a very good classification, there's something fishy! How can a logistic model trained to fit only 17% be better than what information the dataset has?
Unless your measure of accuracy of fit is something other than misclassification! Remember, the model usually fits the remaining 83% well, so the misclassification there will be low compared to the 17%. But I'm unsure how you're getting a 50% cutoff to be more accurate in terms of misclassification, since a decrease in one group is going to increase it in the other.
The best way to find the cutoff is by plotting results for different values, as already suggested, but it usually has to be around the event rate! In cases where you fit multiple logistic models for homogeneous segments, you can generally raise the cutoff point; otherwise, in my experience, you can't.
Would be interesting to know what you find out...
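A small worked example of why overall accuracy can look good at a high cutoff with a 17% event rate. The counts below are hypothetical: 1,000 rows, 170 events, and a cutoff so high that no row is flagged as an event.

```python
def accuracy_and_sensitivity(tp, fp, tn, fn):
    """Overall accuracy and sensitivity (share of actual events caught)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity

# Hypothetical confusion counts: nothing is predicted as an event
acc, sens = accuracy_and_sensitivity(tp=0, fp=0, tn=830, fn=170)

print(f"accuracy:    {acc:.2f}")
print(f"sensitivity: {sens:.2f}")
```

Accuracy comes out at 83% purely from the non-events, while none of the events are caught, so a low misclassification rate on its own says little about how well the 17% is being identified.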
© 2013 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC