
We have 2+ million products that need to be assigned to an already-defined taxonomy (an online ecommerce catalog).

 

I suspect the best approach would be a Bayesian classifier.

 

I would really like an open source tool. We need to run it on our own backend, not as an ASP (hosted) model.

 

Any info would be greatly appreciated.

 

Best

Greg

 

 


Replies to This Discussion

Hi Greg -- See the discussion on "Classification of items into product categories", at http://www.analyticbridge.com/forum/topics/classification-of-items-....
Thanks Vincent. Yes, I had reviewed that but I'm looking for a tool currently deployed/available. Much appreciated.
I can think of three open source products that use the Naive Bayes algorithm: Weka, RapidMiner, and GATE.

-Ralph Winters
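
For readers who want to see what a Naive Bayes categorizer does before picking a tool, here is a minimal sketch in plain Python. It is not how Weka, RapidMiner, or GATE implement it; it is a multinomial Naive Bayes over word tokens with add-one smoothing, and the class name, product titles, and category labels are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    """Multinomial Naive Bayes over word tokens with add-one smoothing."""

    def __init__(self):
        self.category_counts = Counter()          # titles seen per category
        self.token_counts = defaultdict(Counter)  # category -> token -> count
        self.vocabulary = set()                   # every token seen in training

    @staticmethod
    def tokenize(text):
        # Lowercase and keep alphanumeric runs; real catalogs need more care.
        return re.findall(r"[a-z0-9]+", text.lower())

    def update(self, title, category):
        """Fold one labeled product title into the counts."""
        self.category_counts[category] += 1
        for token in self.tokenize(title):
            self.token_counts[category][token] += 1
            self.vocabulary.add(token)

    def classify(self, title):
        """Return the category with the highest log posterior."""
        total = sum(self.category_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, seen in self.category_counts.items():
            counts = self.token_counts[category]
            denominator = sum(counts.values()) + vocab_size
            score = math.log(seen / total)  # log prior
            for token in self.tokenize(title):
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical training data: (product title, taxonomy node).
training = [
    ("Canon EOS digital SLR camera body", "Electronics > Cameras"),
    ("Nikon 50mm camera lens", "Electronics > Cameras"),
    ("Mens leather hiking boots", "Apparel > Footwear"),
    ("Womens running shoes size 8", "Apparel > Footwear"),
]

classifier = NaiveBayesCategorizer()
for title, category in training:
    classifier.update(title, category)

predicted = classifier.classify("Sony compact digital camera")
print(predicted)
```

Because training only increments counts, labeled titles can be fed in one at a time from a file or database cursor, so memory grows with the vocabulary and the number of categories rather than with the number of products.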
Ralph,
Thanks for the information. Very helpful.

In the course of researching your suggestions I came upon this list of text mining, natural language processing, and classification tools. Listing it here as a breadcrumb for others in the future.

Thanks again,
Greg
On a different note, I am looking for a very large list of products, something like 1 million product names (or even much more), to create PPC keyword campaigns. Product names are usually good keywords to purchase (depending on your goal).

Greg, do you know where I could find very large product lists?

Thank you,
Vincent

Has anyone actually used Weka on a large dataset? It seems to throw OutOfMemory (heap) exceptions no matter how much memory the machine has and regardless of how much heap you specify on the command line. The heap settings do not seem to carry over to things like the evaluation classes. Using it has been a very frustrating experience.

The same is true for RWeka.


© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
