
While the book is not yet finished, we wanted to share with you the 86 pages that we have written so far. The reasons are as follows:

  • We want your feedback about the style and content.
  • We want to attract sponsors and affiliates to submit contributions. 

Affiliates submit a 2-3 page article in exchange for making the book available for download on their website. Contributing to the book will drive traffic to your blog or website, as clickable links are available throughout the book and can be added in your article as well. We are looking for authors to submit contributions to Part I or Part II.

So far, most articles are from Vincent Granville or Analyticbridge staff for the following reasons:

  • We own copyright for our own articles
  • We don't have to spend much time to select and review our own articles
  • We know that the links associated with our contributions are permanent, making our book more robust

However, we would love to include contributions from external authors in Part I and II, and contributions from sponsors (e.g. vendors) and affiliates in Part I, II and III. So feel free to send articles for possible inclusion. You can check and download the "book in progress" by clicking on the link below:

 

Download the draft version by clicking on ABbook5.pdf

 

About the book:

Our Data Science e-Book provides recipes, intriguing discussions and resources for data scientists and executives or decision makers. You don't need an advanced degree to understand the concepts. Most of the material is written in simple English; however, it offers simple, better and patentable solutions to many modern business problems, especially about how to leverage big data.

Emphasis is on providing high-level information that executives can easily understand, while being detailed enough so that data scientists can easily implement our proposed solutions. Unlike most other data science books, we do not favor any specific analytic method nor any particular programming language: we stay one level above practical implementations. But we do provide recommendations about which methods to use when necessary.

Most of the material is original, and can be used to develop better systems, derive patents or write scientific articles. We also provide several rules of thumb and details about craftsmanship used to avoid traditional pitfalls when working with data sets. The book also contains interviews with analytic leaders, and material about what should be included in a business analytics curriculum, or about how to efficiently optimize a search to fill an analytic position.

Among the more technical contributions, you will find notes on

  • How to determine the number of clusters
  • How to implement a system to detect plagiarism
  • How to build an ad relevancy algorithm
  • What is a data dictionary, and how to use it
  • Tutorial on how to design successful stock trading strategies
  • New fast and efficient random number generator
  • How to detect patterns vs. randomness

The book has three parts:

  • Part I: Data science recipes
  • Part II: Data science discussions
  • Part III: Data science resources

Part I and II mostly consist of the best Analyticbridge posts by Dr. Vincent Granville, founder of Analyticbridge. Part III consists of sponsored vendor contributions as well as contributions by organizations (affiliates offering software, conferences, training, books, etc.) who make our free e-book available for download on their web site. To become a sponsor or affiliate, please contact us at vincentg@datashaping.com.



Replies to This Discussion

Also, I believe that

  • This is the first book about data science
  • This is the first analytic book with content mostly coming from a social network
  • This is the first free e-book generating revenue via sponsors (vendor contributions) and where marketing is both internal and via contributors offering the book for download on their website

Indeed, we believe that this is the new way to sell and market a book. In many ways, it is the exact opposite of what traditional publishers still do as of today: selling the book for a fee, not having sponsors, and having very expensive marketing strategies. Eliminate all of this by proceeding as follows:

This new book publishing model has two components:

  1. Identify your own best posts on your network - and publish them in PDF format with tons of clickable links to your web site.
  2. Have a "Resources from vendors" (AKA sponsors) section - these are the guys who will pay you money, but you should also offer free contributions (from "affiliates") in exchange for having your book available for download on affiliate web sites.

Make the book available for free, use your network to market it.

This creates an explosive mix where you generate traffic very fast, at no cost, and generate revenue both directly from the book (via the sponsors) and indirectly from your network (web site) due to increased traffic and thus increased ad revenue.

(Note: we think that one day, we'll make a paper copy of our e-book - but the original version will be digital) 

 

I read ABbook5.pdf expecting to see a data mining version of "Numerical Recipes". I am a big fan of the "Numerical Recipes" books as they provide accessible introductions to very complex and rich topics.

However, the description you offer here indicates that this is a different type of book: a new kind coming from a social network. So having properly reset my expectations, here is my feedback:

- as different problems have different scope, the recipes also widely differ in scope and applicability. As a service to the reader, if there can be some editorial control applied to the descriptions to add/unify that scope and applicability, the recipes will become much much better. Some supporting links like Wikipedia would help as well.

- the three parts are as different as they come, so I would suggest that you truly separate the parts into their own three distinct books. I do not see a reason to combine them in the same book, nor would I use the book in that way. By separating them you gain clarity and agility, and I would say from an operational/advertising/sponsor point of view, you gain more revenue and opportunity.

 

- love the idea of creating a new publishing format, although it doesn't seem to go as far as Wikipedia, or platforms like StackExchange or Quora (Q&A driven sites), or a marketplace like Spiceworks. I am excited to see if you can make this eBook work as it sits between Wikipedia and StackExchange.

 

- Take a look at Spiceworks as a model: right now I don't believe that the world of analytics is big enough to support something like Spiceworks for Analytics, but as big data and self-service BI are becoming essential elements in the arsenal of modern business, the class of knowledge workers that deal with analytics will steadily grow. IMHO, fundamentally, Spiceworks works because IT professionals have a big say in the budgets they need to operate. Analytics professionals have a much smaller impact on budgets in the traditional enterprise, although they do sit at the strategy table in the analytics startups that we all love. If that trend continues, the Spiceworks model could work.

I have uploaded a new version (56 pages, 22 contributions, updated bio). If you cannot download it, refresh your browser, then try again. The URL is the same. Many new articles will be added in the next 2 weeks.

Just curious (even after reading all the benefits of PDF publishing): would it be a good idea to do an iBook too? The population of iPad owners is not small, and it might be a good place to publish. I played with iBooks Author a little, and it seems interesting due to its ability to add interactive elements.

The fact that the book is published in PDF format offers new advertising opportunities: clickable banner ads. That would be the first time that pay-per-click or pay-per-lead advertising is sold in an e-book, and used to finance the cost of publishing the book.

While others point out the contents, may I add suggestions on style? 

After reading the articles it seems to me that short magazine style layout might suit. Here is one template we can probably use: 

http://graphicriver.net/item/24-page-indesign-magazine-a4/129724?WT...

PDFs render beautiful electronic versions of paper books or magazines. Still, they are optimized to be read on paper (or maybe on a huge monitor while sitting at your desk).
However, electronic documents are best consumed on a lightweight e-reader or tablet. Since these are smaller devices, the text must reflow to the margins of the particular e-reader in order to fit a large font. PDF is not appropriate for this use. EPUB and other proprietary e-reader formats can reflow and are better suited for this purpose.

Thanks for all the comments. I really need to work on the format, and maybe hire someone, since I don't have much time available (otherwise I would have used TeX rather than Word as word processor). I also need help to proofread, optimize navigation, and add structure to the book, an index, etc.

The nice thing with PDF is that it has clickable links to the actual discussions, so you can access the most up-to-date versions of the discussions on AB, with all the fresh comments. But PDF takes lots of bandwidth and storage. Also the page size is too large. But it looks very nice when printed :-)

Updated content as of 2/14/2012:

Introduction

Part I - Data Science Recipes

  1. New random number generator: simple, strong and fast
  2. Lifetime value of an e-mail blast: much longer than you think
  3. Two great ideas to create a much better search engine
  4. Identifying the number of clusters: finally a solution
  5. Online advertising: a solution to optimize ad relevancy
  6. Example of architecture for AaaS (Analytics as a Service)
  7. Why and how to build a data dictionary for big data sets
  8. Hidden decision trees: a modern scoring methodology
  9. Scorecards: Logistic, Ridge and Logic Regression
  10. Iterative Algorithm for Linear Regression
  11. Approximate Solutions to Linear Regression Problems
  12. Theorems for Traders
  13. Preserving metric and score consistency over time and across clients
  14. Advertising: reach and frequency mathematical formulas
  15. Real Life Example of Text Mining to Detect Fraudulent Buyers
  16. Discount optimization problem in retail analytics
  17. Sales forecasts: how to improve accuracy while simplifying models?
  18. How could Amazon increase sales by redefining relevancy?
  19. How to build simple, accurate, data-driven, model-free confidence intervals
  20. Comprehensive list of Excel errors, inaccuracies and use of non-standard statistical definitions
  21. 10+ Great Metrics and Strategies for Email Campaign Optimization

Part II - Data Science Discussions

  1. Statisticians Have Large Role to Play in Web Analytics (AMSTAT interview)
  2. Future of Web Analytics: Interview with Dr. Vincent Granville
  3. Connecting with the Social Analytics Experts
  4. Interesting note and questions on mathematical patents
  5. Big data versus smart data: who will win?
  6. Creativity vs. Analytics: Are These Two Skills Incompatible?
  7. Barriers to hiring analytic people
  8. Salary report for selected analytical job titles
  9. Are we detail-oriented or do we think "big picture", or both?
  10. Why you should stay away from the stock market
  11. Gartner Executive Programs' Worldwide Survey of More Than 2,300 CIOs
  12. Analysts Explore Cloud Analytics at Gartner Business Intelligence Summit 2012
  13. One Third of Organizations Plan to Use Cloud Offerings to Augment BI Capabilities
  14. Twenty Questions about Big Data and Data Sciences
  15. Interview with Drew Rockwell, CEO of Lavastorm

Part III - Data Science Resources

  1. Vincent’s list

Excellent reference, and thank you for making it available. My company is caught right at Part II Section 6. We are a very creative software company focused on analytics of big data in real time; however, we lack the analytic know-how. Based on discussions we have had, we think our technology is interesting because it offers true real-time analytics (getting answers as they happen), but turning technology into a product and understanding what is important in a product is very difficult. Your work here has really inspired me and has helped me quite a bit.

Thanks!


© 2012   Created by Vincent Granville.
