General Technical Questions (125)

Discussions Replies Latest Activity

Computation of Weight of Evidence when either the number of bads or goods in a class of a variable is 0

Hi All! I want to understand the ways in which Weight of Evidence (WoE) is computed or adjusted in the following scenarios: 1. When numbe…

Started by Sharath Dandamudi

3 Dec 11, 2012
Reply by kiran chapidi

Weight statement and oversampling or undersampling

I need to understand how the Weight statement is used. Background: I had to build a logistic regression response model on rare event…

Started by RockyRambo

2 Oct 25, 2012
Reply by RockyRambo

WOE vs. using continuous variables as such

What are the advantages/disadvantages of using the WOE approach vis-à-vis using continuous variables in their original form? For example, making b…

Started by RockyRambo

2 Oct 24, 2012
Reply by RockyRambo

Hadoop and Data Mining

Hadoop and Big Data are buzzwords these days. How do they affect data mining workers? Should it be completely transparent for people only u…

Started by Jason Monte

4 Aug 15, 2012
Reply by Ingo Mierswa

How do you estimate the proportion of bogus accounts on Facebook?

Facebook has 800MM users. Out of these 800MM "users", how many are duplicate (or triplicate), fake, dummy, inactive, decoy, stolen IDs, non…

Started by Capri

11 Jul 24, 2012
Reply by Lynne Mysliwiec

How To Determine If A Sample Is Representative

If the sample is obtained through simple random sampling, would it be automatically representative of the population? If not, what is the w…

Started by Jason Monte

5 Jul 24, 2012
Reply by Lynne Mysliwiec

Independent variables need to be normally distributed in multiple regression?

Below is a quote regarding logistic regression. It seems it is saying OLS regression requires independent variables to be normally distribu…

Started by Jason Monte

5 Jul 20, 2012
Reply by Sean Flanigan

How are database joins optimized? How can you do better to handle big data?

Joins are typically Cartesian products and in many database systems, can be very slow. What are the best solutions to optimize joins From…

Started by Vincent Granville

3 Jul 15, 2012
Reply by Ralph Winters

How do you quantify data as large, big, or huge?

How big your data is depends on the quantity of information that it contains (measured using entropy metrics), rather than the number of te…

Started by Vincent Granville

4 Jul 1, 2012
Reply by Rex Pruitt

Weekly Seasonal Adjustment factors for Annualized Sales?

I know that for a monthly series of data, there are two well-known programs that are free and reliable enough to use: 1) TRAMO-SEATS 2) X-12 A…

Started by Dane Sorensen

1 Jun 12, 2012
Reply by Dane Sorensen

© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
