Featured Blog Posts (1,447)

Math Challenge: Computing the Average Rotational Speed of Earth

Or of any celestial body. Here I discuss a solution that can be explained to high school students, to get them interested in mathematics, statistics and probabilities. A few interesting related problems further enhance the pedagogical value of this discussion.

I stumbled upon this kind of problem when learning advanced mathematics in my postgraduate studies, in a course entitled stochastic geometry. Just formulating the problem required advanced knowledge of sophisticated…

Added by Vincent Granville on March 3, 2017 at 1:00am — No Comments

Monte Carlo Analysis and Simulation

The Monte Carlo method is a simple way to solve very difficult probabilistic problems. This text is a very simple, didactic introduction to the subject: a mixture of history, mathematics and mythology.

The method originated during World War II, when it was proposed by the Polish-American mathematician Stanislaw Ulam and the Hungarian-American mathematician John von Neumann.…
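
To make the idea concrete, here is a minimal sketch of a Monte Carlo estimate, not taken from the article itself. It assumes the classic textbook example of estimating pi by sampling random points in the unit square and counting how many fall inside the quarter circle; the function name `estimate_pi` and the `seed` parameter are illustrative choices.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi by sampling points uniformly in the unit square.

    The fraction of points with x^2 + y^2 <= 1 approximates the area of
    a quarter circle of radius 1, which is pi / 4.
    """
    rng = random.Random(seed)  # seeded generator, so runs are repeatable
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


# The estimate tends to get closer to pi as the number of samples grows.
for n in (100, 10_000, 100_000):
    print(n, estimate_pi(n))
```

The error shrinks roughly in proportion to one over the square root of the number of samples, which is why the method is simple to apply but slow to converge.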

Added by Arnaldo Gunzi on February 25, 2017 at 4:00pm — No Comments

The Ultimate Guide for Choosing Algorithms for Predictive Modeling

There are three ways to look at data. The first is analytics: looking at data from the (potentially very recent) past. It lets you explore the questions "What happened?" and "Why did it happen?" The second is monitoring: looking at things as they happen. In many…

Added by Steven M. Mehler on January 30, 2017 at 12:00am — 1 Comment

In Japan, "Artificial Intelligence" comes to be a super star while "Data Scientist" is fading away

I published a post about the current status of "Data Scientist" in Japan, a periodic follow-up to an analysis I began two years ago. The trend continues, but it has gone beyond what I anticipated at that time.

Indeed, the growth of "Artificial Intelligence" in Japan is steeper than…

Added by Takashi J. OZAKI on January 13, 2017 at 6:30am — 1 Comment

12 Statistical and Machine Learning Methods that Every Data Scientist Should Know

Below is my personal list of statistical and machine learning methods that every data scientist should know in 2016.

1. Statistical Hypothesis Testing (t-test, chi-squared test & ANOVA)
2. Multiple Regression (Linear Models)
3. Generalized Linear Models (GLM: Logistic Regression, Poisson Regression)
4. Random Forest
5. XGBoost (eXtreme Gradient Boosted Trees)
6. Deep Learning
7. Bayesian Modeling with…

Added by Takashi J. OZAKI on January 8, 2017 at 6:30am — 1 Comment

Deep Learning in Python: Getting Started

Deep learning is all the rage. You hear about it in the news, you read about it in the news and it’s all over popular culture as well.…

Added by Malia Keirsey on December 5, 2016 at 12:00pm — No Comments

7 Traps to Avoid Being Fooled by Statistical Randomness

Randomness is all around us. Its existence sends fear into the hearts of predictive analytics specialists everywhere -- if a process is truly random, then it is not predictable, in the analytic sense of that term.  Randomness refers to the absence of patterns, order, coherence, and predictability in a system.

Unfortunately, we…

Added by Kirk Borne on January 9, 2017 at 6:00pm — 5 Comments

Blog - R vs Python. Which one has higher demand on the job market? A short study

R vs Python. Which language should you choose?

R is great for mathematical people. Think of R as spreadsheets on steroids. A lot of people progress from spreadsheets to R. These people are usually statisticians at heart.

Python, on the other hand, is more…

Added by Olga on September 27, 2016 at 7:30pm — No Comments

How to build a search engine: Part 4

This post is the fourth part of the multi-part series on how to build a search engine –

Added by Vivek Kalyanarangan on January 10, 2017 at 1:00am — No Comments

How to build a search engine: Part 3

This post is the third part of the multi-part series on how to build a search engine –

Added by Vivek Kalyanarangan on December 30, 2016 at 6:00am — No Comments

1. Introduction

Most tasks in Machine Learning can be reduced to classification tasks. For example, we have a medical dataset and we want to classify who has diabetes (positive class) and who doesn’t (negative class). We have a dataset from the financial world and want to know which customers will default on their credit (positive class) and which customers will not (negative class).

To do this, we can train a Classifier with a ‘training dataset’ and after such a Classifier is…
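
As a rough illustration of training a classifier on a training dataset and then using it on new cases, here is a minimal sketch. It is not the method from the post: it assumes a very simple nearest-centroid classifier, and the feature values (a glucose reading and a BMI per patient) are made-up toy numbers, not real medical data. The names `train_centroids` and `predict` are hypothetical.

```python
from statistics import mean


def train_centroids(samples, labels):
    """Compute one centroid (feature-wise mean) per class label."""
    by_class = {}
    for features, label in zip(samples, labels):
        by_class.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_class.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy training dataset: (glucose, BMI) pairs, invented for illustration only.
train_x = [(85, 22.0), (90, 24.5), (95, 23.0), (150, 31.0), (160, 33.5), (170, 35.0)]
train_y = ["negative", "negative", "negative", "positive", "positive", "positive"]

centroids = train_centroids(train_x, train_y)
print(predict(centroids, (155, 32.0)))  # closest to the "positive" centroid
print(predict(centroids, (88, 23.0)))   # closest to the "negative" centroid
```

In practice the features would be scaled first and the classifier evaluated on held-out data; this sketch only shows the train-then-predict pattern.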

Added by ahmet taspinar on December 22, 2016 at 10:30am — No Comments

How to build a search engine - Part 2: Configuring elasticsearch

This post is the second part of the multi-part series on how to build a search engine –

Added by Vivek Kalyanarangan on December 23, 2016 at 10:30am — No Comments

How to build a search engine: Part 1

In this multi-part series, we will explore how to build a search engine. It will be quite powerful and industrial-strength. The first part will focus on getting the right tools and getting the technology stack ready. We will build this search engine with an AngularJS front end and use elasticsearch as the computation back end.

This post is the first part of the multi-part series on how to build a search engine –

• How to build a search engine – Part 1: Installing the tools and…

Added by Vivek Kalyanarangan on December 16, 2016 at 2:00am — No Comments

The Generating New Probability Theorems

The purpose of this article is to generate new theorems of probability and to find some applications of these theorems. Suppose that we have a covered basket that contains many dice. In many blind tests, we reach in, pull out a die, and set it on the table in a row from left to right. Clearly, each die has six possible events (choices): 1, 2, 3, 4, 5, and 6.

What is the application of these theorems (1 and 2)?

Let me…

Added by Gholamreza Soleimani on November 16, 2016 at 3:00am — No Comments

10 Tools For Working With Big Data For Successful Analytics

Traditional computer systems and software applications don’t have what it takes to support big data. If you want to collect, store, refine, or analyze big data, you have to have the right tools. Check out the following ten tools that are specifically designed with big data in mind.

If you know, or are willing to…

Added by Diana Beyer on November 17, 2016 at 8:30am — 1 Comment

How can organizations successfully convert big data into real-world decisions?

The World Wide Web is turning into a colossal heap of data that is being stored at hundreds of thousands of datacenters across the world. According to recent research by…

Added by Rick Riddle on November 10, 2016 at 10:00am — No Comments

The 15 Essential Database Marketing Techniques

Database marketing is the process of collecting, analyzing, and identifying important information about your customers.

The information comes from different channels such as sales records, email, warranty cards, etc.

With this information, you can sharpen your marketing campaigns to focus on what’s important: the client.

• Improving all your marketing efforts
• Increasing sales
• Retaining and getting more…

Added by Diana Beyer on November 9, 2016 at 8:00am — 1 Comment

Introduction

There’s a lot of buzz around the term “Sentiment Analysis” and the various ways of doing it. Great! So you report, with reasonable accuracy, what the sentiment about a particular brand or product is.

Opinion Mining and Sentiment Analysis

After publishing this report, your client comes back to you and…

Added by Vivek Kalyanarangan on November 4, 2016 at 8:24am — No Comments

R, Python or SAS: Which one should you learn first?

Here is our selection of articles and resources featured today. Enjoy the reading!

How We Combined Different Methods to Create Advanced Time Series Prediction

Today, businesses need to be able to predict demand and trends to stay in line with any sudden market changes and economic swings. This is exactly where forecasting tools, powered by Data Science, come into play,…

Added by Vincent Granville on November 4, 2016 at 11:30am — No Comments

How Bayesian Inference Works: Tutorial

Guest blog by Brandon Rohrer. Brandon is an author and deep learning developer. He has worked as Principal Data Scientist at Microsoft, as well as for DuPont Pioneer and Sandia National Laboratories. Brandon earned a Ph.D. in Mechanical Engineering from the Massachusetts Institute of Technology.

Bayesian inference is a way to get sharper predictions from your data. It’s particularly useful when you don’t have as much data as you would like and want to juice every last…
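
A small worked example may help show how a prior belief is combined with limited data. This is only a sketch under stated assumptions, not the tutorial's own example: it assumes a Beta prior on an unknown success rate, updated with a handful of observed successes and failures (the standard Beta-Binomial conjugate update). The helper names `update_beta` and `beta_mean` and the numbers used are illustrative.

```python
def update_beta(alpha, beta, successes, failures):
    """Return the posterior Beta parameters after observing new outcomes.

    With a Beta(alpha, beta) prior on a success probability, observing
    `successes` and `failures` gives a Beta(alpha + successes,
    beta + failures) posterior.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed prior: Beta(2, 2), a mild belief that the rate is near 0.5.
prior = (2, 2)

# Assumed data: only 4 trials, 3 successes and 1 failure.
posterior = update_beta(*prior, successes=3, failures=1)

print("prior mean:", beta_mean(*prior))          # 0.5
print("posterior:", posterior)                   # (5, 3)
print("posterior mean:", beta_mean(*posterior))  # 0.625
```

With so little data, the raw success rate would be 0.75; the prior pulls the estimate toward 0.5, and its influence fades as more observations arrive.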

Added by Vincent Granville on November 2, 2016 at 5:03pm — No Comments
