I agree with Jozo. If your historical response rate is relatively low, you can get away with building your dataset using the known responders as the 'ones' in terms of Y and everyone else as the zeros. The reason th…
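As a rough sketch of that labeling step in Python: the customer IDs and the function name `build_response_labels` are made up for illustration, and it assumes you have a list of everyone contacted plus a list of known responders.

```python
def build_response_labels(all_customer_ids, responder_ids):
    """Return {customer_id: 1 or 0}, treating every non-responder as a zero."""
    responders = set(responder_ids)
    return {cid: int(cid in responders) for cid in all_customer_ids}


# Made-up IDs, for illustration only.
mailed = ["c1", "c2", "c3", "c4", "c5"]
responded = ["c2", "c5"]

labels = build_response_labels(mailed, responded)   # Y: 1 = responder, 0 = everyone else
response_rate = sum(labels.values()) / len(labels)  # share of ones in the dataset

print(labels)
print(response_rate)
```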
Yi-Chun,
It's less a matter of which technique to use and more a matter of what data you have available to build the model. What historical data do you have on your customers (e.g., credit/default history, demographic, financial, and credit data)?
Great points, Biswajit. The more information, the better. The only problem with some of the variables (wearing shoes, etc.) is that you might have that information from accident reports in some cases, but even if you had that information for an acci…
I agree with you, Steffen, that some cars are still safer than others because of their technical capabilities. Vincent was trying to explore whether there might be other influencing factors, aside from the car's technical s…
Good points, Matt. If a difference in a statistic between two groups is found to be “statistically significant,” it simply means that, based on sample size, variation, and the value of the measured statistic, the difference you have seen is not likely…
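To make the role of sample size concrete, here is a minimal Python sketch. It assumes a simple two-sample z-test with a normal approximation (my choice for illustration, not a test anyone in this thread specified), and all the numbers are made up. The same difference in means comes out non-significant with small groups and highly significant with large ones.

```python
import math


def two_sample_z(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Two-sided z-test for a difference in means (normal approximation)."""
    # Standard error of the difference shrinks as the sample sizes grow.
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / se
    # Two-sided p-value: P(|Z| >= |z|) for a standard normal Z.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Same difference (2 units) and same spread (sd = 10), different sample sizes.
z_small, p_small = two_sample_z(52, 50, 10, 10, 20, 20)
z_large, p_large = two_sample_z(52, 50, 10, 10, 2000, 2000)

print(f"n=20 per group:   z={z_small:.2f}, p={p_small:.3g}")
print(f"n=2000 per group: z={z_large:.2f}, p={p_large:.3g}")
```

With small groups you would usually reach for a t-test instead; the z-test is used here only to keep the sketch dependency-free.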