Information

Data Mining

Architecture, algorithms, statistical techniques, real life problems, real time, distributed systems, software. Emphasis on very large data sets.

Members: 402
Latest Activity: Feb 1

Discussion Forum

what programming language to pick up 15 Replies

Started by Yi-Chun Tsai. Last reply by Jonathan Seller Nov 18, 2011.

Comparison on RapidMiner, SAS Enterprise Miner, R and orange 13 Replies

Started by jadelim. Last reply by Nissim Matatov Jul 19, 2012.

Prediction mode for Resource (Employee) utilization 12 Replies

Started by DataMiner. Last reply by DataMiner Aug 11, 2010.

plotting of graph in R 11 Replies

Started by jadelim. Last reply by jadelim Jul 30, 2012.

Modeling when only event data is available 11 Replies

Started by Rahul V. Last reply by Steffen Springer Nov 26, 2009.

what other data mining techniques can I use 10 Replies

Started by Yi-Chun Tsai. Last reply by Yi-Chun Tsai Oct 30, 2009.

any way of getting SAS Enterprise Miner 9 Replies

Started by Yi-Chun Tsai. Last reply by moharam Oct 28, 2011.

plot a graph by first letter of variable in R 8 Replies

Started by jadelim. Last reply by jadelim Jul 22, 2012.

Comprehensive Listing of Data Mining Methodologies (by Dr. Saed Sayad) 8 Replies

Started by Titus. Last reply by Irina Sered Mar 14, 2011.

Initialization methods for k-means clustering. 8 Replies

Started by DataMiner. Last reply by YONGYANG HUO Feb 9, 2011.

Decreasing Dataset Dimensionality 7 Replies

Started by Paul Wilson. Last reply by Ralph Winters Jun 17, 2010.

Data Mining Graduate Certificate Options 7 Replies

Started by Sandra Donlon. Last reply by Tom Wolfer Sep 15, 2010.

K-means, SOM, k-nn or classical clustering methods? 7 Replies

Started by Urko. Last reply by Urko Jan 27, 2011.

Different approaches to simple counting question 7 Replies

Started by Vincent Granville. Last reply by Emory Creel Aug 18, 2008.

Transductive SVM for semi supervised learning 6 Replies

Started by Paul Wilson. Last reply by Paul Wilson Jun 22, 2010.

Lift Chart 5 Replies

Started by Paul Wilson. Last reply by Paul Wilson Feb 1, 2010.

How to get R started 4 Replies

Started by jadelim. Last reply by jadelim Jul 10, 2012.

Good R square FMCG industry 3 Replies

Started by Mindy Scott. Last reply by Jarkko Venna Feb 7, 2012.

Data mining in human resources 3 Replies

Started by yaser yadekar. Last reply by Zakaria Y. AL-Jammal Nov 9, 2009.

Comparison of predictive Data Mining Tools 2 Replies

Started by KHELOUFI Tarik. Last reply by Jozo Kovac Feb 1, 2011.

Comment Wall

Comment by Greg Makowski on August 16, 2012 at 12:15am

Join our Data Mining Hackathon on Big Data. We are having a kickoff meeting this Saturday in Sunnyvale, CA (but you can join the competition at any time). The task is product recommendation on the Best Buy mobile web site. We are working with www.Kaggle.com.

http://www.sfbayacm.org/DM-Hackathon-2012-10

Our competitions include:

* Hadoop sized prediction

* PC sized prediction

* Data visualization

Three cloud computing vendors are donating compute time to get you started.

The awards will be given at the Data Mining Camp, Sat Oct 13 in San Jose, CA.

I also work with sponsors for www.SFbayACM.org events.

Thanks, Greg

Comment by dabsy on July 5, 2012 at 6:05pm

Hi all, is anybody interested in sharing the cost of the SAS Certification Practice Exam: Predictive Modeling Using SAS Enterprise Miner 6 with me? It costs only $55 for 180 days. If interested, please message me and we can go from there.

Comment by Vera Klimkovsky on September 29, 2011 at 5:55pm

Join us for the FREE ACM Data Mining Camp on October 15 at eBay San Jose.

Learn more, watch the video

http://www.youtube.com/watch?v=aEcW9qwdopw

Comment by Nishant Modi on October 19, 2010 at 9:21am
That depends on the situation and what you want to do. If you want to make substantial changes to the algorithms or tweak them, as in a research environment, then writing your own code is the better choice. But if getting results is your daily job, then going with the implementations provided by the various vendors would be the preferred option.
Comment by Yi-Chun Tsai on October 19, 2010 at 7:02am
Thanks for sharing this. I will check it out. I guess the next question is whether it is better to code your own data mining algorithms or to use what is already available, either commercial products like SAS/SPSS or open-source tools like R/RapidMiner. What do you think?
Comment by Nishant Modi on October 19, 2010 at 6:30am
Well, it essentially depends on your programming background. But I would suggest Java, for two reasons: 1) you will not have to deal with hassles like pointers in C++ and other issues you would rather stay away from; 2) Java has a very large set of well-documented APIs, with many operations (mathematical, data structures, etc.) already implemented. So you will not have to start from scratch and can focus mainly on the data mining part.

Also, data mining APIs were supposed to be made available through JDM (Java Data Mining), but I am not sure whether they have been released.

Hope this helps.
Comment by Yi-Chun Tsai on October 19, 2010 at 5:51am
Hi, Nishant:
You are welcome. I hope it helps. By the way, what programming language would you suggest if I want to code data mining algorithms myself instead of using off-the-shelf packages such as SAS EM or SPSS Clementine? C++, Java, or Python? Thanks.
Comment by Nishant Modi on October 19, 2010 at 5:47am
Thanks Yi-Chun, your suggestions are helpful.
Comment by Yi-Chun Tsai on October 19, 2010 at 5:35am
Hi, Nishant:
Yes, that was what I meant.
Comment by Nishant Modi on October 19, 2010 at 5:30am
Thanks Yi-Chun for your suggestions.
Sorry, but I just want to be sure I have fully understood your point. So you use clustering to get a few clusters of attributes.
Then, from each cluster, you pick out an attribute, keeping in mind aspects like multicollinearity, confounding, etc., and finally you get a set of attributes with which you go ahead and model.
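
For anyone following along, here is a rough Python sketch of that idea. It is only an illustration under my own assumptions, not necessarily the procedure Yi-Chun uses: attributes are grouped when their absolute Pearson correlation reaches a hypothetical threshold of 0.8, and the attribute most correlated with the rest of its group is kept as a simple stand-in for the judgment call about multicollinearity and confounding. The function names and the toy data are made up for the example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster_attributes(data, threshold=0.8):
    """Group attributes whose absolute correlation is at least `threshold`.

    `data` maps attribute name -> list of values. Groups are the connected
    components of the "highly correlated" graph (single-linkage style).
    """
    names = list(data)
    parent = {name: name for name in names}

    def find(a):
        # Follow parent links to the group root, compressing the path.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in combinations(names, 2):
        if abs(pearson(data[a], data[b])) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


def pick_representatives(data, clusters):
    """From each cluster keep the attribute most correlated with its peers."""
    chosen = []
    for members in clusters:
        if len(members) == 1:
            chosen.append(members[0])
            continue

        def mean_abs_corr(m):
            others = [o for o in members if o != m]
            return sum(abs(pearson(data[m], data[o])) for o in others) / len(others)

        chosen.append(max(members, key=mean_abs_corr))
    return chosen


if __name__ == "__main__":
    # Toy data (made up): two near-duplicate attributes and one unrelated one.
    data = {
        "height_cm": [150, 160, 170, 180, 190],
        "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
        "age": [40, 22, 35, 58, 29],
    }
    clusters = cluster_attributes(data, threshold=0.8)
    print(clusters)
    print(pick_representatives(data, clusters))
```

In practice the choice within each cluster would be made by looking at the attributes themselves; the "most correlated with its peers" rule is just a mechanical placeholder for that step.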

Thanks
Nishant

Cheers :))
 


© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
