Hi,
I am trying to perform clustering on my customer files with about 80K customers and 50 variables.
Instead of using only a hierarchical or only a non-hierarchical method in SAS, I first tried to determine the "optimal" number of clusters and their seeds using PROC CLUSTER.
Next, I planned to feed these seeds into PROC FASTCLUS to refine the clusters. This was the recommendation someone gave me: use a hierarchical method first to get the seeds, then feed the seeds to a non-hierarchical method to fine-tune the clusters.
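To make the two-stage idea concrete, here is a minimal, toy-sized sketch in Python. It is not SAS code and does not claim to reproduce what PROC CLUSTER or PROC FASTCLUS do internally. The function names (`hierarchical_seeds`, `kmeans_refine`), the choice of centroid linkage, and the nine sample points are all assumptions made for illustration only.

```python
import math


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def hierarchical_seeds(points, k):
    """Stage 1 (hypothetical helper): naive centroid-linkage agglomerative
    clustering, merging the two closest clusters until k remain.
    Returns the k cluster centroids to use as seeds.
    Compares every pair of clusters on every merge, so it only suits toy data.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]


def kmeans_refine(points, seeds, max_iter=100):
    """Stage 2 (hypothetical helper): k-means started from the given seeds."""
    centers = list(seeds)
    labels = []
    for _ in range(max_iter):
        # Assign each point to its nearest current center.
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Toy data: three well-separated groups of three points each.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
    (9.0, 1.0), (9.2, 0.9), (8.9, 1.2),
]
seeds = hierarchical_seeds(points, k=3)
centers, labels = kmeans_refine(points, seeds)
print(labels)
```

The split mirrors the recommendation: the hierarchical pass only supplies starting centers, and the k-means pass does the final assignment of every point.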
However, PROC CLUSTER took forever to even create clusters for my 80K customers. I had to abandon it before it returned any results.
Can anyone suggest a way to deal with a big data set like mine? Thanks.
Reply by Prashant on October 1, 2010 at 7:48am
Reply by Tom Wolfer on October 2, 2010 at 4:27pm
Reply by Tom Wolfer on October 2, 2010 at 4:51pm
Reply by Ralph Winters on October 4, 2010 at 11:08am
Reply by Tom Wolfer on October 5, 2010 at 10:01am
Reply by Ralph Winters on October 6, 2010 at 1:00pm
Reply by Tom Wolfer on October 7, 2010 at 11:28am
Reply by paul d on October 11, 2010 at 8:01pm
Reply by Tom Wolfer on October 12, 2010 at 10:06am
Reply by paul d on October 12, 2010 at 2:19pm
Reply by Ralph Winters on October 13, 2010 at 1:46pm
Reply by paul d on October 13, 2010 at 5:10pm
© 2013 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC