Jason Liu
  • Wheeling, IL
  • United States

Jason Liu's Discussions

Does this requirement make sense to you?
9 Replies

A prospect asked us to build a predictive model. They will provide some tagged data for model development and some untagged data for testing. We were required to send back the untagged data with our…

Started this discussion. Last reply by Jason Liu Oct 24, 2012.

 

Jason Liu's Page

Latest Activity

Jason Liu replied to Jason Liu's discussion Does this requirement make sense to you?
"Edmund, I totally agree with you. If the client asks for a score instead of a decision in the returned data, I don't have to worry about which evaluation criteria they are going to use."
Oct 24, 2012
Edmund Freeman replied to Jason Liu's discussion Does this requirement make sense to you?
"I don't see the false positive issue you're talking about, but I do think the client can do a lot better than what they are asking for. 2x2 confusion matrices are a poor way to judge model performance. For instance, if there is a low…"
Oct 24, 2012
Ralph Winters replied to Jason Liu's discussion Does this requirement make sense to you?
"Lynne, this is a modeling contest and the judging criteria can be whatever they want them to be. Take a look at some of the evaluation criteria used by Kaggle and you will see why. https://www.kaggle.com/wiki/Metrics"
Oct 23, 2012
Jason Liu replied to Jason Liu's discussion Does this requirement make sense to you?
"Thanks, Lynne. I agree with you theoretically, but in practice, the prospect should have a single measure to evaluate models. There is no way to minimize type 1 error and type 2 error at the same time; we need to find a balance between these two, and…"
Oct 22, 2012
Lynne Mysliwiec replied to Jason Liu's discussion Does this requirement make sense to you?
"It's a classification model - correct classification rate where type 1 and type 2 error are minimized is the most important measure of model fitness.  The client is actually doing it the right way. Our job as statisticians and modelers is…"
Oct 21, 2012
Ralph Winters replied to Jason Liu's discussion Does this requirement make sense to you?
"Sounds like your prospect is uninformed about the criteria for a good predictive model.  But then again, a lot of clients can be swayed by these model "competitions" (e.g. Netflix), in which the best model is judged via simple…"
Oct 20, 2012
Charlie Greenbacker replied to Jason Liu's discussion Does this requirement make sense to you?
"This sounds like an information extraction task: find something in unstructured data and tag it with the right label. The organizers will probably score submissions according to F-measure."
Oct 19, 2012
Rod Tjoelker replied to Jason Liu's discussion Does this requirement make sense to you?
"In many situations both the false positives and the false negatives are important.  There is a trade-off between increased "recall" (finding all the cases of interest) and increased "precision" (reduced false…"
Oct 18, 2012
Lynne Mysliwiec replied to Jason Liu's discussion Does this requirement make sense to you?
"The output they're requesting seems consistent with what I've seen for similar competitions in the past.  The idea is accuracy - if your models are going to be used for real time credit decisions, then the only thing they're…"
Oct 18, 2012
Jason Liu's discussion was featured

Does this requirement make sense to you?

A prospect asked us to build a predictive model. They will provide some tagged data for model development and some untagged data for testing. We were required to send back the untagged data with our decision: the data will have only two columns, ID and decision (good/bad). This is a modeling contest and our results will be compared with those of our competitors, but the returned data required by the prospect does not make sense to me. If our model rejects 1000 and false positive is 20%,…
Oct 16, 2012
Jason Liu posted a discussion

Does this requirement make sense to you?

Oct 16, 2012
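
The replies above mention several ways a two-column submission (ID and decision) could be scored: a 2x2 confusion matrix, precision and recall, and F-measure. The thread does not say which of these the prospect actually uses, so the following is only a minimal sketch of how such a file might be evaluated. It assumes the organizer holds the true label for each ID and that "bad" is treated as the positive class; the function names and the five sample records are hypothetical and exist only for illustration.

```python
def confusion_counts(truth, decisions, positive="bad"):
    """Count the cells of a 2x2 confusion matrix.

    truth and decisions map ID -> "good" or "bad".
    Treating "bad" as the positive class is an assumption.
    """
    tp = fp = fn = tn = 0
    for record_id, actual in truth.items():
        predicted = decisions[record_id]
        if predicted == positive and actual == positive:
            tp += 1  # correctly rejected
        elif predicted == positive:
            fp += 1  # rejected, but actually good
        elif actual == positive:
            fn += 1  # accepted, but actually bad
        else:
            tn += 1  # correctly accepted
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and balanced F-measure (F1) from the counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: labels held back by the organizer, and one submission.
truth = {1: "bad", 2: "good", 3: "bad", 4: "good", 5: "good"}
decisions = {1: "bad", 2: "bad", 3: "good", 4: "good", 5: "good"}

tp, fp, fn, tn = confusion_counts(truth, decisions)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn)
print(precision, recall, f1)
```

Because the submission carries only a hard decision, each competitor is scored at a single operating point. If a score were returned instead, as suggested in the replies, the evaluator could choose the cutoff and compare models across the whole trade-off between the two error types.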

Profile Information

Field of Expertise:
Business Analytics, Predictive Modeling, Data Mining, Marketing Databases, Statistical Programming
Professional Status:
Manager
Interests:
Networking

© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC