I would like to know how one goes about deciding on an optimal sample size before building a model. For example, let's say I am planning to build a credit risk scorecard using logistic regression on a database of 1 million customers, and the bad rate is 8%. I decide to build the model not on the entire population (i.e. 1 million in this case) but only take a random sample and then furth…
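
To make the setup concrete, here is a minimal sketch of the sampling step in Python (standard library only). The population size and bad rate come from the example above. The candidate sample sizes, the function names, and the way bads are labelled by index are assumptions made purely for illustration. The sketch does not say what the optimal size is. It only shows how many bads a simple random sample of a given size would be expected to contain.

```python
import random

POPULATION_SIZE = 1_000_000  # from the example in the question
BAD_RATE = 0.08              # from the example in the question


def expected_bads(sample_size, bad_rate=BAD_RATE):
    """Expected number of bad accounts in a simple random sample."""
    return sample_size * bad_rate


def draw_sample_bad_count(sample_size, seed=0):
    """Draw a simple random sample without replacement and count the bads.

    Customers are represented by an index. Indices below the total number
    of bads are treated as bad (an arbitrary labelling, for illustration).
    """
    rng = random.Random(seed)
    n_bads = round(POPULATION_SIZE * BAD_RATE)
    sample = rng.sample(range(POPULATION_SIZE), sample_size)
    return sum(1 for i in sample if i < n_bads)


if __name__ == "__main__":
    # Hypothetical candidate sample sizes, not recommendations.
    for n in (10_000, 50_000, 100_000):
        print(n, expected_bads(n), draw_sample_bad_count(n))
```

The observed count in any one draw will vary around the expected value, and that variation shrinks relative to the count as the sample grows, which is part of what the sample size question is about.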