Data Intelligence, Business Analytics
Hi,
I am trying to cluster my customer file, which has about 80K customers and 50 variables.
Instead of using only a hierarchical or only a non-hierarchical method in SAS, I first tried to determine the "optimal" number of clusters and their seeds using PROC CLUSTER.
Next, I plan to feed those seeds into PROC FASTCLUS to refine the clusters. This follows a recommendation I was given: use a hierarchical method first to get the seeds, then feed the seeds to a non-hierarchical method to fine-tune the clusters.
However, PROC CLUSTER took forever on my 80K customers, and I had to abandon the run before it returned any result.
Can anyone suggest a way to deal with a large data set like mine? Thanks.
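
For reference, the two-stage workflow described in this post could be sketched in SAS roughly as follows. This is only a sketch under stated assumptions, not a tested recipe: the data set name work.customers, the ID variable cust_id, the variable names x1-x50, the sample size of 5000, and the choice of 5 clusters are all hypothetical placeholders. Running PROC CLUSTER on a random sample rather than on all 80K rows is also an assumption added here to keep the hierarchical step tractable; it is not something the original post did.

```sas
/* Hypothetical inputs: work.customers with ID cust_id and variables x1-x50 */

/* Standardize so that no single variable dominates the distances */
proc stdize data=work.customers out=work.cust_std method=std;
   var x1-x50;
run;

/* Assumption: draw a simple random sample for the hierarchical step */
proc surveyselect data=work.cust_std out=work.cust_sample
                  method=srs n=5000 seed=12345;
run;

/* Hierarchical clustering on the sample only; CCC and PSEUDO print
   statistics that can help judge the number of clusters */
proc cluster data=work.cust_sample method=ward ccc pseudo print=15
             outtree=work.tree;
   var x1-x50;
   id cust_id;
run;

/* Cut the tree at a chosen number of clusters (5 is a placeholder) */
proc tree data=work.tree nclusters=5 out=work.sample_clus noprint;
   copy x1-x50;
run;

/* Cluster means of the sample become the seeds */
proc means data=work.sample_clus noprint nway;
   class cluster;
   var x1-x50;
   output out=work.seeds(drop=_type_ _freq_) mean=;
run;

/* Refine on the full standardized file, starting from those seeds */
proc fastclus data=work.cust_std seed=work.seeds maxclusters=5
              maxiter=50 out=work.final_clus;
   var x1-x50;
run;
```

The value passed to NCLUSTERS= and MAXCLUSTERS= should come from inspecting the PROC CLUSTER output, and the sample size can be raised as far as run time allows.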
Permalink Reply by Tom Wolfer on September 30, 2010 at 5:29pm
Permalink Reply by Ralph Winters on September 30, 2010 at 8:32pm
Permalink Reply by Yi-Chun Tsai on October 1, 2010 at 10:50am
Permalink Reply by Kumud Joseph Kujur on October 1, 2010 at 1:22pm
Permalink Reply by Hariharan Sunder on October 5, 2010 at 7:49am
Permalink Reply by Yi-Chun Tsai on October 5, 2010 at 8:49am
Permalink Reply by Kumud Joseph Kujur on October 6, 2010 at 12:39pm
Permalink Reply by Yi-Chun Tsai on October 6, 2010 at 1:07pm
Permalink Reply by Hariharan Sunder on October 6, 2010 at 2:17pm
Permalink Reply by Tom Wolfer on October 6, 2010 at 3:30pm
Permalink Reply by Kumud Joseph Kujur on October 7, 2010 at 11:35am
Permalink Reply by deepa bharti on June 9, 2011 at 1:06am
Hi Kumud,
I need some clarification. I know that clustering can be used with binary-transformed variables via a distance matrix, but can PROC FASTCLUS be used in the same fashion? Please let me know your thoughts on this.
Thanks,
Deepa
© 2013 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC