Dave Lewis
  • Chicago, IL
  • United States

Dave Lewis's Page

Latest Activity

Dave Lewis replied to Andy Adamiec's discussion key sentence extraction in the group Text Mining
"There's a large research literature on this topic under the heading of text summarization, automatic abstracting, etc.  Here's an open source summarizer you could try out:  http://libots.sourceforge.net/"
Oct 6, 2011
Dave Lewis replied to Dave Lewis's discussion any Powerpoint decks introducing PMML available? in the group PMML
"p.s. Feel free to email me directly if you'd prefer.  My email address can be found at www.DavidDLewis.com"
Sep 12, 2011
Dave Lewis added a discussion to the group PMML

any Powerpoint decks introducing PMML available?

PMML is the discussion topic of the next Chicago Machine Learning Meetup in a couple of days (Sept. 14).  Since there will be quite a few people there who have no prior experience with PMML, I'm wondering if there's a good 15-20 minute Powerpoint overview out there that I could borrow and present to get people up to speed.
Sep 12, 2011
Dave Lewis added a discussion to the group PMML

high dimensional feature sets and the Data Dictionary

The Data Dictionary for a PMML model requires quite a bit of metadata for each field.  With sparse, high dimensional data the Data Dictionary could be many times larger than either the training data or the trained model.   Has anyone developed a standard extension to the PMML syntax that, for instance, just says that all fields have the same metadata?
Sep 2, 2011
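
To make the size concern above concrete, here is a minimal sketch in Python. It assumes a hypothetical sparse linear model with 100,000 possible features, of which 50 have nonzero weights, and compares the serialized size of a Data Dictionary that lists every field against a bare "name:weight" dump of the model. The feature names, the counts, and the helper build_data_dictionary are illustrative assumptions, not taken from any real system, and the DataField attributes shown are only the basic name, optype, and dataType.

```python
import xml.etree.ElementTree as ET


def build_data_dictionary(field_names, optype="continuous", data_type="double"):
    """Build a PMML-style DataDictionary with one DataField per feature."""
    dd = ET.Element("DataDictionary", numberOfFields=str(len(field_names)))
    for name in field_names:
        # Every field repeats the same metadata, which is the redundancy at issue.
        ET.SubElement(dd, "DataField", name=name, optype=optype, dataType=data_type)
    return dd


# Hypothetical feature space: 100,000 fields, 50 with nonzero weight.
all_features = [f"f{i}" for i in range(100_000)]
nonzero = {f"f{i}": 0.1 for i in range(0, 100_000, 2_000)}

data_dictionary = build_data_dictionary(all_features)
dd_bytes = len(ET.tostring(data_dictionary))

# Size of a bare "name:weight" listing of the nonzero coefficients.
model_bytes = sum(len(f"{name}:{weight}\n") for name, weight in nonzero.items())

print("DataDictionary bytes:", dd_bytes)
print("sparse model bytes:  ", model_bytes)
```

In this toy setup the dictionary is several orders of magnitude larger than the model, since each of the 100,000 fields carries its own copy of identical metadata.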
Dave Lewis replied to Dave Lewis's discussion simple example of wrapping an open source learner in PMML? in the group PMML
"Yes, I'm hoping to find an example of something simple like SVMlight being wrapped for PMML.  But it's looking like people only take on PMML when the effort can be amortized over some huge system.   Well, fools rush in...we're…"
Sep 1, 2011
Michael Berthold replied to Dave Lewis's discussion simple example of wrapping an open source learner in PMML? in the group PMML
"KNIME (also open source) recently added PMML preprocessing support, so you can now add preprocessing to the PMML model using the graphical workflow editor (see also our KDD-PMML Workshop paper). The internals wrap the data dictionary around the model…"
Aug 29, 2011
Dave Lewis added a discussion to the group PMML

examples or best practices for use of MiningBuildTask element

The optional MiningBuildTask element in PMML, which allows free-form content, seems like a great idea to me.  I often want to record information for future reference about the parameter settings used in training and potentially statistics about the training process.  I have my own thoughts about what to include in this element, but would be interested in pointers to best practices, or good concrete examples, of using this element.  Surprisingly, this element is not even discussed in the…
Aug 21, 2011
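
As one possible illustration of the idea in the post above, the following Python sketch records training parameters and training statistics inside a MiningBuildTask element as PMML Extension children with name and value attributes. This is not an established best practice. The "param:" and "stat:" name prefixes, the extender string, the example values, and the helper mining_build_task are all assumptions made up for this sketch.

```python
import xml.etree.ElementTree as ET


def mining_build_task(params, stats, extender="example-trainer"):
    """Record training settings and statistics as Extension elements."""
    task = ET.Element("MiningBuildTask")
    for group, entries in (("param", params), ("stat", stats)):
        for key, value in sorted(entries.items()):
            ET.SubElement(
                task,
                "Extension",
                extender=extender,          # hypothetical tool name
                name=f"{group}:{key}",      # hypothetical naming convention
                value=str(value),
            )
    return task


# Hypothetical settings and statistics from one training run.
task = mining_build_task(
    params={"regularization": "l2", "C": 1.0},
    stats={"iterations": 37, "training_examples": 12000},
)
print(ET.tostring(task, encoding="unicode"))
```

Keeping each setting in its own Extension element means a reader of the file can recover the values with any XML parser, without needing the tool that wrote them.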
Dave Lewis added a discussion to the group PMML

simple example of wrapping an open source learner in PMML?

Does anyone know of examples of wrapping an existing piece of supervised learning software to output models in PMML format?  Of particular interest are learners that just take in labeled vectors of numbers as training data and put out models that are pretty much just coefficient vectors (liblinear, SVMlight, BXRtrain, BOW, etc.).  That is, they don't have any smarts about data types, ranges of legal values of features, etc.: something else is assumed to deal with that and present the learner…
Aug 9, 2011
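
A rough sketch of what such a wrapper might look like, in Python: it takes a coefficient vector and an intercept and writes them as a PMML RegressionModel with a DataDictionary, a MiningSchema, and a RegressionTable of NumericPredictor entries. It does not call any of the learners named above. The function coefficients_to_pmml, the feature names, and the weights are hypothetical, every field is assumed to be a continuous double, and the XML namespace and schema validation are left out for brevity.

```python
import xml.etree.ElementTree as ET


def coefficients_to_pmml(weights, intercept, target="y", version="4.0"):
    """Wrap a linear model (feature name -> weight) as a PMML RegressionModel."""
    pmml = ET.Element("PMML", version=version)
    ET.SubElement(pmml, "Header", description="linear model wrapped from a coefficient vector")

    # Data Dictionary: the target plus one field per feature, all assumed numeric.
    dd = ET.SubElement(pmml, "DataDictionary", numberOfFields=str(len(weights) + 1))
    ET.SubElement(dd, "DataField", name=target, optype="continuous", dataType="double")
    for name in weights:
        ET.SubElement(dd, "DataField", name=name, optype="continuous", dataType="double")

    model = ET.SubElement(pmml, "RegressionModel", functionName="regression")
    schema = ET.SubElement(model, "MiningSchema")
    ET.SubElement(schema, "MiningField", name=target, usageType="predicted")
    for name in weights:
        ET.SubElement(schema, "MiningField", name=name)

    # The learned model itself: an intercept and one coefficient per feature.
    table = ET.SubElement(model, "RegressionTable", intercept=repr(float(intercept)))
    for name, weight in weights.items():
        ET.SubElement(table, "NumericPredictor", name=name, coefficient=repr(float(weight)))

    return ET.tostring(pmml, encoding="unicode")


# Hypothetical coefficient vector, as a learner might emit it.
pmml_text = coefficients_to_pmml({"x1": 0.5, "x2": -1.25}, intercept=0.1)
print(pmml_text)
```

Most of the output is bookkeeping about fields rather than the model, which matches the observation that the learner itself knows nothing about data types or legal value ranges.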

Profile Information

Short Bio:
Freelance computer scientist. From a disciplinary standpoint, I work in the areas of information retrieval, machine learning, computational linguistics, and applied statistics. From a task standpoint, I've worked on retrieval, categorization, clustering, filtering, routing, mining, extraction, authorship attribution, parsing, and pretty much anything else you can do to textual or partially textual data. Industries I've consulted for include web search, computational advertising, information security, data mining, venture capital, biotech, defense, nonprofits, government, and a range of startup companies.

Over the past few years, I've gotten particularly involved in work related to the law, both as a consulting expert on patent cases and through a variety of electronic discovery activities. I am one of the founders of the TREC Legal Track, and created the first large-scale public test collection for experimentation on e-discovery. I've consulted on e-discovery for both industry and law firms, designed machine learning algorithms for an e-discovery service provider, and developed training materials on e-discovery for a Fortune 100 company.
My Website or LinkedIn Profile (URL):
http://www.linkedin.com/profile/view?id=215096
Field of Expertise:
Predictive Modeling, Data Mining, Web Analytics, Statistical Consulting, Artificial Intelligence, Other
Years of Experience in Analytical Role:
26
Professional Status:
Consultant
Interests:
Networking, New Venture, Other
What is your Favorite Data Mining or Analytical Website?
http://hunch.net
What Other Analytical Website do you Recommend?
http://metaoptimize.com
Your Company:
www.DavidDLewis.com
Industry:
Consulting
How did you find out about AnalyticBridge?
Link from Zementis PMML Resources

© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC