"Hi Manish,
It is not totally clear what you are trying to do. Depending on the situation there are several different strategies that you could try.
If you have some kind of cost associated with the vector values, you can use it to sort the…"

A Data Science Central Community

Jarkko Venna replied to Manish's discussion How to rank vectors based on their features? in the group Analytical Techniques

"Hi Manish,
It is not totally clear what you are trying to do. Depending on the situation there are several different strategies that you could try.
If you have some kind of cost associated with the vector values, you can use it to sort the…"

Apr 15, 2012

Jarkko Venna replied to Mindy Scott's discussion Good R square FMCG industry in the group Data Mining

"Hi,
Sometimes it is not possible to avoid multicollinearity, especially if the correlated variables are categorical. One thing that you could try is to transform the correlated variables using, for example, PCA to form new variables that are not…"

Feb 7, 2012

Jarkko Venna replied to Mindy Scott's discussion Good R square FMCG industry in the group Data Mining

"Hi Mindy,
I would say that an R square of 50 would indicate that the covariates are related to the dependent variable, but this does not say that the model is accurate enough for your situation.
There is no general way of saying when the model…"

Feb 7, 2012

Jarkko Venna replied to Mark Nasila (PhD)'s discussion Excluding variables from a logistic regression model based on correlation in the group Analytical Techniques

"Hi,
If most of your correlated variables are categorical, like citizenship and nationality, then the reason that having both in the model gives better results is that the difference in the model is meaningful for the case. That citizenship and…"

Feb 2, 2012

Jarkko Venna replied to Mark Nasila (PhD)'s discussion Excluding variables from a logistic regression model based on correlation in the group Analytical Techniques

"Hi,
Regression models can become unstable if the variables included have strong correlations.
If you want to include all the variables but want to avoid the problems that come from correlated variables, you could use principal component analysis…"

Feb 1, 2012
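
As a rough illustration of the principal component idea in the reply above, the following sketch builds two strongly correlated numeric variables and replaces them with principal component scores, which are uncorrelated with each other. The function name `pca_scores`, the synthetic data, and the use of Python with NumPy are assumptions made for this example and do not come from the original post; categorical variables would first need a numeric encoding, which is not shown.

```python
import numpy as np

def pca_scores(X):
    """Project the rows of X onto its principal components (hypothetical helper)."""
    Xc = X - X.mean(axis=0)                  # center each column
    cov = np.cov(Xc, rowvar=False)           # covariance matrix of the columns
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]        # largest variance first
    return Xc @ eigvecs[:, order], eigvals[order]

# Synthetic data (assumption): x2 is x1 plus a little noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
X = np.column_stack([x1, x2])

scores, variances = pca_scores(X)
input_corr = np.corrcoef(X, rowvar=False)[0, 1]       # close to 1
score_corr = np.corrcoef(scores, rowvar=False)[0, 1]  # zero up to rounding
```

The columns of `scores` could then be used as regressors in place of the original variables, at the cost of coefficients that are harder to interpret.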

Jarkko Venna replied to Daming's discussion Calculate the trustworthiness and continuity after the dimensionality reduction

"The r(i,j) is the rank( index in the ordered list of items) that you get when you order the distances from point i to other points.
In matlab you could put the distances from point i to the other points to a vector and then sort the vector base on…"

Jan 2, 2012
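
The rank r(i,j) described in the reply above can be sketched in a few lines. This is a minimal Python version rather than the Matlab the post mentions; the function name `neighbor_ranks`, the example points, the use of Euclidean distance, the 1-based ranks, and breaking ties by point index are all assumptions made for illustration.

```python
import math

def neighbor_ranks(points, i):
    """Map each other index j to r(i, j): the 1-based position of j
    when all other points are ordered by distance from point i."""
    # Collect (distance, index) pairs for every point except i itself.
    dists = [(math.dist(points[i], p), j)
             for j, p in enumerate(points) if j != i]
    dists.sort()  # ascending distance; ties fall back to the smaller index
    return {j: rank for rank, (_, j) in enumerate(dists, start=1)}

# Hypothetical example data.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (0.0, 2.0)]
ranks = neighbor_ranks(points, 0)
# Point 1 is nearest to point 0, then point 3, then point 2.
```

Running the same function on the original data and on the low-dimensional projection would give the two sets of ranks that a comparison between the spaces needs.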

- Short Bio:
- I have a PhD in Computer Science, and my academic work has been mostly related to information visualization, especially nonlinear projections and how to estimate the quality of the visualizations they produce.

Currently I'm working as a consultant dealing with all kinds of analytical (and some not so analytical :) problems our customers have.

- Field of Expertise:
- Business Analytics, Predictive Modeling, Data Mining, Visualization, Other, SAS

- Years of Experience in Analytical Role:
- 14

- Professional Status:
- Consultant

- Interests:
- Networking, Other

- Your Company:
- Numos Ltd
