All Blog Posts (2,009)

Great Saturday Reading

Here is our selection of featured articles and resources posted today:

Added by Vincent Granville on March 25, 2017 at 11:00am — No Comments

Great Saturday Reading

Here is our selection of featured articles and resources posted since Thursday.

Added by Vincent Granville on March 18, 2017 at 12:19pm — No Comments

Advanced Machine Learning with Basic Excel

In this article, I present a few modern techniques that have been used in various business contexts, comparing performance with traditional methods. The advanced techniques in question are math-free, innovative, efficiently process large amounts of unstructured data, and are robust and scalable. Implementations in Python, R, Julia and Perl are provided, but here we focus on an Excel version that does not even require any Excel macros, coding, plug-ins, or anything other than the most…

Added by Vincent Granville on March 10, 2017 at 1:13pm — No Comments

Math Challenge: Computing the Average Rotational Speed of Earth

Or of any celestial body. Here I discuss a solution that can be explained to high school students, to get them interested in mathematics, statistics and probabilities. A few interesting related problems further enhance the pedagogical value of this discussion.  

I stumbled upon this kind of problem when learning advanced mathematics in my postgraduate studies, in a course entitled stochastic geometry. Just formulating the problem required advanced knowledge of sophisticated…

Added by Vincent Granville on March 3, 2017 at 1:00am — No Comments

Data Science Summarized in One Picture

After posting Machine Learning Summarized in One Picture, we posted a picture for data science. You can check it out here.

Other articles published since Monday…

Added by Vincent Granville on February 28, 2017 at 9:30am — No Comments

Monte Carlo Analysis and Simulation

The Monte Carlo method is a simple way to solve very difficult probabilistic problems. This text is a very simple, didactic introduction to this subject, a mixture of history, mathematics and mythology.

The method has its origins in World War II, when it was proposed by the Polish American mathematician Stanislaw Ulam and the Hungarian American mathematician John von Neumann.…
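
As a small illustration of the idea (not taken from the article itself), a Monte Carlo estimate replaces an exact calculation with repeated random sampling. The sketch below estimates pi by drawing random points in the unit square and counting how many fall inside the quarter circle; the function name `estimate_pi` and the seed parameter are assumptions made for this example.

```python
import random


def estimate_pi(n_samples: int, seed: int = 0) -> float:
    """Estimate pi by sampling points uniformly in the unit square.

    Illustrative sketch only: the fraction of points that land inside
    the quarter circle of radius 1 approximates pi / 4.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    print(estimate_pi(200_000))
```

The error of such an estimate typically shrinks in proportion to one over the square root of the number of samples, so more samples give a slowly improving answer.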

Added by Arnaldo Gunzi on February 25, 2017 at 4:00pm — No Comments

Great Saturday Reading

Here is our selection of featured articles and resources for the weekend:

Added by Vincent Granville on February 25, 2017 at 3:00pm — No Comments

Python vs R: 4 Implementations of Same Machine Learning Technique

Actually, this is about two R versions (standard and improved), a Python version, and a Perl version of a new machine learning technique recently published here. We asked for help to translate the original Perl script to Python and R, and finally decided to work with Naveenkumar Ramaraju, who is currently pursuing a master's in Data Science…

Added by Vincent Granville on February 22, 2017 at 11:30am — No Comments

Great Friday Reading

Here is our selection of articles and resources for today:

Added by Vincent Granville on February 10, 2017 at 3:52pm — No Comments

State-of-the-Art Machine Learning Automation with HDT

In this article, we discuss a general machine learning technique to make predictions or score transactional data, applicable to very big, streaming data. This hybrid technique combines different algorithms to boost accuracy, outperforming each algorithm taken separately, yet it is simple enough to be reliably automated. It is illustrated in the context of predicting the performance of articles published in media outlets or blogs, and has been used by the author to build an AI…

Added by Vincent Granville on February 10, 2017 at 12:00am — No Comments

Great Saturday Reading

Our selection of best articles today:

Added by Vincent Granville on February 4, 2017 at 5:34pm — No Comments

How to Handle Outliers in Regression Problems

New featured content for data scientists:

Data Science in Python: Pandas Cheat Sheet -- This cheat sheet, along with explanations, was first published on DataCamp. Click on the picture to zoom in. To view other cheat sheets (Python, R, Machine Learning, Probability, Visualizations, Deep Learning, Data Science, and so on) click here. To read the article,…

Added by Vincent Granville on January 31, 2017 at 10:30pm — No Comments

Tutorial: Neutralizing Outliers in Any Dimension

The main focus of this article is on computing the point that minimizes the sum of the "distances" to n points in a d-dimensional space, called the centroid or center, in the presence of outliers.

This long article has several sections.

Content

1. A related physics problem

2. Algorithm to find the centroid

  • Source code to generate points and compute centroid, using Monte…
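
To make the problem statement concrete, here is a minimal sketch of one standard way to find a point that minimizes the sum of Euclidean distances to a set of points: Weiszfeld's iterative algorithm for the geometric median. This is a swapped-in technique for illustration, not the article's own source code, and the choice of plain Euclidean distance, the function name `geometric_median` and its parameters are assumptions.

```python
import math


def geometric_median(points, n_iter=200, eps=1e-9):
    """Approximate the point minimizing the sum of Euclidean distances.

    Illustrative sketch using Weiszfeld's algorithm (an assumption,
    not the method from the article). `points` is a sequence of
    equal-length coordinate tuples.
    """
    d = len(points[0])
    # Start from the arithmetic mean, which outliers pull away strongly.
    center = [sum(p[k] for p in points) / len(points) for k in range(d)]
    for _ in range(n_iter):
        num = [0.0] * d
        denom = 0.0
        for p in points:
            # Weight each point by the inverse of its distance to the
            # current center; eps guards against division by zero.
            w = 1.0 / max(math.dist(p, center), eps)
            denom += w
            for k in range(d):
                num[k] += w * p[k]
        new_center = [v / denom for v in num]
        if math.dist(new_center, center) < eps:
            return new_center
        center = new_center
    return center


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)]  # last point is an outlier
    print(geometric_median(pts))
```

For the sample points, the arithmetic mean is (20.4, 20.4), while the point returned by this sketch stays inside the unit square, which shows why a sum-of-distances criterion is less sensitive to a single outlier than the mean.
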

Added by Vincent Granville on January 30, 2017 at 2:30pm — No Comments

The Ultimate Guide for Choosing Algorithms for Predictive Modeling

There are three ways to look at data. The first is analytics. This is when you look at data from the (potentially very recent) past. Think analytics. It allows you to explore the questions "what happened?" and "why did it happen?" The second is monitoring. This is looking at things as they happen. In many…

Added by Steven M. Mehler on January 30, 2017 at 12:00am — 1 Comment

46 SQL Job Interview Questions for Data Scientists

Here is our updated selection of featured articles and resources posted over the weekend:

Added by Vincent Granville on January 15, 2017 at 7:49pm — No Comments

In Japan, "Artificial Intelligence" comes to be a super star while "Data Scientist" is fading away

I published a post about the current status of "Data Scientist" in Japan, a periodic follow-up to an analysis I started two years ago. The trend continues, but it has gone beyond what I anticipated at that time.

Indeed, the growing trend of "Artificial Intelligence" in Japan is steeper than…

Added by Takashi J. OZAKI on January 13, 2017 at 6:30am — 1 Comment

Ten Simple Rules for Effective Statistical Practice

This article, written by Kass RE, Caffo BS, Davidian M, Meng X-L, Yu B, and Reid N, contains the following rules:

  • Statistical Methods Should Enable Data to Answer Scientific Questions
  • Signals Always Come with Noise
  • Plan Ahead, Really Ahead
  • Worry about Data Quality
  • Statistical Analysis Is More Than a Set of Computations
  • Keep it Simple
  • Provide Assessments of Variability
  • Check Your Assumptions
  • When Possible,…

Added by Vincent Granville on January 10, 2017 at 11:16am — No Comments

How to build a search engine: Part 4

This post is the fourth part of the multi-part series on how to build a search engine.

Added by Vivek Kalyanarangan on January 10, 2017 at 1:00am — No Comments

7 Traps to Avoid Being Fooled by Statistical Randomness

Randomness is all around us. Its existence sends fear into the hearts of predictive analytics specialists everywhere -- if a process is truly random, then it is not predictable, in the analytic sense of that term.  Randomness refers to the absence of patterns, order, coherence, and predictability in a system. 

Unfortunately, we…

Added by Kirk Borne on January 9, 2017 at 6:00pm — 5 Comments

© 2017 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC