AnalyticBridge

Social Network For Analytic Professionals

Hello,

I am working on a project involving time series data. I need to segment customers who have similar time series trends. Can someone guide me on how to approach this kind of time series clustering problem?

Tags: cluster, patterns, time

Replies to This Discussion

Hi,

This is Biswajit, representing StatSoft India (www.StatSoft.com). We would appreciate it if you could share the details of your project so that we can discuss it at length. We expect that we may be able to add value to your current project.

Thanks!
Biswajit.

I assume that by "trend" you mean a long-run stochastic trend. The only way I know of to extract long-run stochastic trend information is through cointegration techniques; moving-average or ARIMA models will not do it. However, you need a lot of observations for the identification of cointegrated time series to be meaningful. The data can be analyzed in an unrestricted VAR model and then tested for cointegration. The CATS for RATS program will do this, as will several other econometric packages. Good luck. -jr

Rubin, thank you for your suggestions and your time. I will try those models and contact you if I need further advice.

Biswajit, the data is a consumer transaction variable collected every month over a 12-month period. We have thousands of consumers. We need to see whether clustering techniques reveal common patterns in consumer spending.

Hi Karteek,

This is Biswajit, representing StatSoft India, creator and publisher of STATISTICA since 1984. We have approximately 600,000 users globally and 1,000 in India who trust and rely on our products. We have a product called STATISTICA AUTOMATED NEURAL NETWORK (SANN) for predictive analytics, which covers regression, time series, clustering, and classification. For a full technical overview, you may reach us at biswajit@statsoftindia.com or +91 98913 93138 (M).

Thanks!
Biswajit.

Why don't you try correspondence analysis? Suppose there are 1000 customers and 12 months of transaction data. First, set up the data in a 1000 x 12 matrix of customers by months. Second, scale each customer's data so that the sum of its 12 months of transaction data equals 1. The rescaling removes size differences among customers. Then perform a correspondence analysis on the scaled matrix.
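
The rescaling step could be sketched roughly as follows in Python, assuming NumPy is available. The function name `to_row_profiles` and the two sample customers are made up for illustration; they are not from the data discussed in this thread.

```python
import numpy as np

def to_row_profiles(spend):
    """Scale each customer's row so that its monthly values sum to 1."""
    spend = np.asarray(spend, dtype=float)
    totals = spend.sum(axis=1, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("every customer needs a positive total spend")
    return spend / totals

# Two made-up customers with the same seasonal shape but a 10x size difference.
small = [10] * 11 + [40]
large = [100] * 11 + [400]
profiles = to_row_profiles([small, large])

print(profiles.sum(axis=1))                   # each row now sums to 1
print(np.allclose(profiles[0], profiles[1]))  # the size difference is gone
```

After scaling, the two customers have identical rows, which is the sense in which the size difference is removed.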

Suppose that customers have one of 3 different linear trends and there is no other variation in the data. If this is the case the first 3 right eigenvectors will identify these trends. Of course, there is always other variation in the data so the entire set of right eigenvectors must be examined.

The correspondence analysis identifies all sorts of patterns. For example, if one of the customer segments has a quadratic trend, an eigenvector with that pattern will show up. The customers in that segment will have a large coordinate on that component while customers in the other segments will not. Also, if there is a seasonal pattern, correspondence analysis will identify it.
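
As a rough illustration only, here is a small Python sketch (assuming NumPy) of a simple correspondence analysis computed through an SVD of the standardized residuals. The simulated customers, the noise level, and the function name `correspondence_analysis` are all assumptions made for the example; a statistics package would normally do this step for you.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis of a non-negative table via SVD.

    Returns the singular values, the row principal coordinates, and the
    right singular vectors (one column per dimension).
    """
    P = np.asarray(table, dtype=float)
    P = P / P.sum()
    r = P.sum(axis=1)  # row masses
    c = P.sum(axis=0)  # column masses
    # Standardized residuals from the independence model r * c'.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    return sv, row_coords, Vt.T

# Simulated data (an assumption): 30 customers whose spending rises
# linearly over the year and 30 whose spending falls linearly.
rng = np.random.default_rng(0)
months = np.arange(12)
rising = 50 + 5 * months
falling = 105 - 5 * months
spend = np.vstack([rising] * 30 + [falling] * 30).astype(float)
spend += rng.uniform(0, 5, size=spend.shape)  # small positive noise

# Scale each customer to sum to 1, then run the analysis.
profiles = spend / spend.sum(axis=1, keepdims=True)
sv, row_coords, right_vecs = correspondence_analysis(profiles)

first_axis = right_vecs[:, 0]
print(np.round(first_axis, 2))        # pattern across the 12 months
print(np.sign(row_coords[:, 0]))      # which side each customer falls on
```

In this simulated case the first right singular vector comes out close to a straight line across the months, and the sign of each customer's coordinate on that axis separates the rising group from the falling group.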

Hope this helps.

Thanks for your suggestion, Goldrick. I used correspondence analysis as you suggested, but I'm not sure how well it works for mining a large database. Could you please give me some examples or applications involving time series data?

Atul, my data has the following layout:

Cust_ID, amount spent in M1 M2 M3 M4 ... M12 (one column per month).

I need to cluster customers who have common spending patterns, i.e. I expect to see a cluster of customers who spend more during Christmas.
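
For reference, a minimal Python sketch of reading this layout and doing a quick sanity check before any clustering. The sample rows, the CSV format, and the assumption that M12 is December are all hypothetical. The December share below is only a simple screening number, not a clustering method.

```python
import csv
import io

# Hypothetical sample in the layout described above: Cust_ID, then M1..M12.
SAMPLE = """Cust_ID,M1,M2,M3,M4,M5,M6,M7,M8,M9,M10,M11,M12
C001,50,50,50,50,50,50,50,50,50,50,50,50
C002,20,20,20,20,20,20,20,20,20,20,40,140
"""

def load_spend(text):
    """Parse the CSV text into a list of customer IDs and a list of 12-month rows."""
    reader = csv.DictReader(io.StringIO(text))
    ids, rows = [], []
    for record in reader:
        ids.append(record["Cust_ID"])
        rows.append([float(record[f"M{m}"]) for m in range(1, 13)])
    return ids, rows

def december_share(row):
    """Fraction of the year's spending that falls in M12 (assumed to be December)."""
    total = sum(row)
    return row[-1] / total if total > 0 else 0.0

ids, rows = load_spend(SAMPLE)
shares = {cid: december_share(row) for cid, row in zip(ids, rows)}
for cid, share in shares.items():
    print(cid, round(share, 3))
```

A customer with flat spending has a December share of 1/12, so values well above that hint at the Christmas pattern that the clustering is expected to pick up.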

Thanks for your time.

Karteek:

Okay. In your application, the correspondence analysis generates 11 eigenvectors in order of their explanatory power, from highest to lowest. The next thing to do is perform a "biplot" of the first 2 dimensions. The biplot will show both customers and months on the same 2-dimensional graph. If customers actually do cluster meaningfully, you will see it in this plot. You will also see which customer clusters are "close" to which months. This will tell you something about which customers increase their purchases around Christmas.
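
To make the biplot step concrete, here is a hedged Python sketch (assuming NumPy) that computes the customer and month coordinates one would plot. The simulated customers and the function name `ca_biplot_coordinates` are assumptions for illustration; the plotting itself is left out, since any charting tool can scatter the first two columns of `rows` and `cols` on the same axes.

```python
import numpy as np

def ca_biplot_coordinates(table):
    """Row and column principal coordinates plus each dimension's share of inertia."""
    P = np.asarray(table, dtype=float)
    P = P / P.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    keep = min(P.shape) - 1  # the last dimension is trivial (singular value ~ 0)
    U, sv, Vt = U[:, :keep], sv[:keep], Vt[:keep]
    rows = (U * sv) / np.sqrt(r)[:, None]     # one row per customer
    cols = (Vt.T * sv) / np.sqrt(c)[:, None]  # one row per month
    share = sv**2 / np.sum(sv**2)
    return rows, cols, share

# Simulated data (an assumption): 40 customers with flat spending and
# 20 customers who spend three times as much in month 12.
rng = np.random.default_rng(1)
flat = np.full((40, 12), 100.0)
xmas = np.full((20, 12), 100.0)
xmas[:, 11] = 300.0
spend = np.vstack([flat, xmas]) + rng.uniform(0, 5, size=(60, 12))
profiles = spend / spend.sum(axis=1, keepdims=True)

rows, cols, share = ca_biplot_coordinates(profiles)
print(rows.shape, cols.shape)  # 11 dimensions for 12 months
print(np.round(share[:3], 3))  # explanatory power of the leading dimensions

# Customers on the same side of dimension 1 as month 12.
same_side = np.sign(rows[:, 0]) == np.sign(cols[11, 0])
print(same_side.sum())
```

In this simulated case month 12 has the largest coordinate on the first dimension, and the customers on its side of the axis are the ones with the month-12 peak.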

Now, it may be the case that customers cluster on one or more of the other 9 dimensions that have less explanatory power than the first 2 dimensions. You can check for this by performing a biplot on these other dimensions, for example, 3 and 4, 5 and 7, etc.

You will have to look at the patterns of the column eigenvectors to interpret them. For example, if there are 2 column eigenvectors that are monotonically increasing or decreasing from January to December, this probably indicates that a large number of customers follow 1 of 2 different trends. If you perform a biplot of these 2 dimensions you'll be able to see which customers cluster near which trend.

If there are too many customers for the biplot to be visually informative, you can perform hierarchical agglomerative clustering on the row "eigenvectors" for the subset of dimensions of interest (trend, Christmas, etc.).
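
A minimal sketch of that clustering step in Python (assuming NumPy), using a naive average-linkage procedure written out by hand. The two-dimensional points below are made-up stand-ins for row coordinates on two dimensions of interest, and `agglomerative_clusters` is a hypothetical helper. For thousands of customers a library routine (for example the ones in `scipy.cluster.hierarchy`) would be the more practical choice.

```python
import numpy as np

def agglomerative_clusters(points, n_clusters):
    """Naive average-linkage agglomerative clustering (O(n^3), small inputs only)."""
    points = np.asarray(points, dtype=float)
    clusters = [[i] for i in range(len(points))]
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average distance between members of the two clusters.
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = np.empty(len(points), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels

# Made-up coordinates: three groups of 10 customers on two dimensions.
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
coords = np.vstack([c + rng.normal(0, 0.05, size=(10, 2)) for c in centers])

labels = agglomerative_clusters(coords, 3)
print(labels)
```

Each customer ends up with a cluster label, which can then be compared against the months or trends that the chosen dimensions represent.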

Hope this helps.

Sorry, Goldrick, I was working on another project and was not able to respond. I tried hierarchical clustering on the eigenvectors, since we have a lot of consumer data. Our results show that there are peak transactions in almost every month. We should validate these results by looking into other parameters.

Anyway, thanks for your guidance.

Hello Karteek,

Here is my suggestion for this problem: first make clusters of your customers, then make a 3D histogram with the cluster number on one axis and the month on the other (any software will do this for you). This may give you a picture of the spending patterns of each group of customers in different months.
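
A small Python sketch (assuming NumPy) of the table behind such a 3D histogram. Using the mean spend per cluster and month as the bar height is an assumption, and the five customers and their cluster labels are made up for illustration.

```python
import numpy as np

def cluster_month_table(spend, labels):
    """Mean spend per cluster (rows) and month (columns)."""
    spend = np.asarray(spend, dtype=float)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels)
    table = np.vstack([spend[labels == k].mean(axis=0) for k in cluster_ids])
    return cluster_ids, table

# Made-up data: three flat spenders in cluster 0, two month-12 spenders in cluster 1.
spend = [
    [100] * 12,
    [110] * 12,
    [90] * 12,
    [50] * 11 + [150],
    [70] * 11 + [250],
]
labels = [0, 0, 0, 1, 1]

cluster_ids, table = cluster_month_table(spend, labels)
for k, row in zip(cluster_ids, table):
    print(k, row)
```

Each cell of `table` is the height of one bar, with cluster number on one axis and month on the other.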

Hi,

I would really appreciate it if you could tell me whether you want to cluster the consumers or the consumption.

Did you mention what platform you are on? Microsoft's Sequence Clustering algorithm in SQL Server Analysis Services is designed to address that kind of problem: sequence patterns. Also, I would think that web analytics software might have something to offer, as sequences of clicks on websites are critical there, but I have very limited experience with that.

© 2010   Created by Vincent Granville
