Vincent's answers to data science questions - Part 2

Q7. What are the keys to operationalizing a machine learning ranking system from an organization and/or engineering management point of view? (Quora) - http://www.quora.com/What-are-the-keys-to-operationalizing-a-machin...

A. The very first thing is to have the right database model and architecture from the very beginning, months before the system goes into testing. Many companies went wrong just by choosing SQL Server for their production platform. Once you realize you have scalability or speed issues, it is very expensive to switch to Hadoop or some other architecture.

Q8. Can you patent mathematical procedures? Really? (LinkedIn) - http://www.linkedin.com/groupAnswers?viewQuestionAndAnswers=&di...

A. Yes, I created and sold mathematical patents on scoring technology - used e.g. in the context of fraud detection or transaction scoring. You can patent pretty much anything if you have the right lawyers, and nothing if you do not - the system is very unfair.

I no longer create patents, at least not in the classical sense of the word. Instead, I make my inventions widely available to everyone by publishing them, e.g. in my free e-Book on data science. You can call them "open patents", just like "open source". And yes, I still make money, although very indirectly.

Q9. If you could add or edit 3 features in your modeling or solution development software, what would they be? (LinkedIn) - http://www.linkedin.com/groups/If-you-could-add-edit-35222.S.113756...

A. They would be:

  • Ability to run in batch mode and to process large data sets
  • Ability to be controlled remotely via an API (AaaS: Analytics as a Service)
  • Flexible programming language - maybe Perl

Q10. What is machine learning? (LinkedIn) - http://www.linkedin.com/groupAnswers?viewQuestionAndAnswers=&di...

A. I think it depends on what your context is. Your context is CS (computer science), mine is CS (computational statistics), and thus to me machine learning relates to techniques (decision trees, regression, SVM, neural networks, pattern recognition) that perform what is, at the end of the day, supervised clustering.

Back in 1993, when I completed my PhD on clustering techniques applied to image analysis, in Belgium, in a stats (not computer science) research lab, supervised clustering meant (at least in our stats lab) supervised classification, and unsupervised clustering meant unsupervised classification. Maybe this was not the correct terminology (none of us was a native English speaker), but those are the words that we used. Also, we never used the term "data mining"; we used "computational statistics" instead. We never used "vector"; we used "feature". Here, by contrast, a vector is a set of attributes (what I call a feature), while a feature is a variable.

We should create a synonym dictionary, or maybe a translation dictionary :-) English to English.
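To make the terminology concrete, here is a minimal Python sketch of the two settings. It is only an illustration under my own assumptions: the toy data, the function names, and the choice of a nearest-centroid rule and plain k-means are all hypothetical, and neither is the method from my PhD work. With labels you have supervised classification (what we called supervised clustering); without labels the groups must be discovered from the data (unsupervised classification, or clustering).

```python
import random

# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a"] * 4 + ["b"] * 4  # only the supervised setting uses these


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest_centroid_fit(points, labels):
    """Supervised: labels are known, so each class centroid is a simple mean."""
    sums, counts = {}, {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}


def predict(centroids, point):
    """Assign a new point to the class with the closest centroid."""
    return min(centroids, key=lambda label: sq_dist(centroids[label], point))


def kmeans(points, k, iterations=20, seed=0):
    """Unsupervised: no labels are used; groups are found from the data alone."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(centers[i], p))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group (keep it if empty).
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
    assignment = [min(range(k), key=lambda i: sq_dist(centers[i], p))
                  for p in points]
    return centers, assignment


if __name__ == "__main__":
    centroids = nearest_centroid_fit(points, labels)
    print("supervised prediction for (9, 9):", predict(centroids, (9, 9)))
    print("unsupervised assignment:", kmeans(points, 2)[1])
```

Note that the cluster numbers returned by `kmeans` are arbitrary identifiers: the procedure recovers the groups but not their names, which is exactly what the missing labels imply.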

Q11. If I want to do Data Science, would LinkedIn or Twitter be a better place to start work? (Quora) - http://www.quora.com/If-I-want-to-do-Data-Science-would-LinkedIn-or...

A. What about declining both offers and becoming a start-up yourself? Since I left the corporate world, I've never been so happy. My only regret is having waited so long before becoming independent. Not only is my revenue higher and more diversified than when I worked for one company (and it gets more diversified every day), but I also have a feeling of creating great things and helping people pretty much every day, without any limitation on what I do (other than the limitations imposed by the market, which all companies are also subject to).

Here are some highlights about the joys of working for my startup:

  • Creation of one of the first automated, code-free news feed systems - http://www.analyticbridge.com/profiles/blogs/automated-news-feed
  • Creation of a new publishing business model: selling a book via sponsorship. You can download my "Data Science eBook by Analyticbridge" for free or buy it on Amazon; it's the first book on Data Science, see http://www.analyticbridge.com/group/data-science/forum/topics/data-...
  • Deployment of very successful proprietary computational marketing techniques (sorry, no link for that, but I'm writing a tutorial for the interns that I hire)
  • Building the leading analytic / data science community, with very interesting and exciting sources of revenue, and self-funded (thanks to previous well managed investments including in real estate, thanks to my analytic mind and vision of when to buy/sell)
  • Leveraging very significant unfair competitive advantages instead of leveraging Dad's money to get an MIT education
  • I can do stuff that larger companies can't for fear of litigation, and that's also something we leverage.
  • My work is very diversified: sales, data science, SEO/SEM, traffic and membership growth, traffic quality, product development, vendor and consultant relationships (worldwide), blog posting, operations, finance, chief creator, etc.
  • I no longer contribute to systems that are inefficient and corrupt, such as healthcare (it is very difficult to escape health insurance if you work for LinkedIn or Twitter)
  • I can publish or write whatever I want without fear of losing my job. I've created my own country and religion and sometimes discuss it publicly (www.mathematology.com), something you can't do if you work for a corporation.
  • I create "open patents", that is, patentable systems that I make available to the entire analytic community for free, via my data science eBook that has been downloaded 15,000 times so far. In doing so I help the world get better, I help other entrepreneurs create start-ups. If you work for a company, you would be fired immediately for doing this, and possibly sued by your employer.
  • Having access to massive data is not an issue: I run web crawlers that extract vast amounts of data from the web. Besides, I own the largest data set available about the analytic community, and produce regular reports such as the top 1,000 analytic blogs and websites, 2012 vs. 2010, see e.g. http://www.analyticbridge.com/profiles/blogs/top-analytic-blogs-and...
  • Some very talented people found great jobs or received PhD scholarships thanks to working with me and being named "member of the month" in our community - http://www.analyticbridge.com/group/memberofthemonth
  • Finally, if you need ideas for an analytic start-up, check  http://www.analyticbridge.com/profiles/blogs/connecting-executives

 

Read part 1 (Q1 to Q6) of this series at http://www.analyticbridge.com/profiles/blogs/new-series-vincent-s-a...

© 2013   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
