Dan Kellett
  • Nottingham
  • United Kingdom

Dan Kellett's Friends

  • Dr Venugopala Rao
 

Dan Kellett's Page

Latest Activity

Dan Kellett's blog post was featured

Making data science accessible – Data Munging

By Dan Kellett, Director of Data Science, Capital One UK. Over the past few months my blogs have attempted to demystify some of the techniques used by Data Scientists to build models or process large amounts of data. For all the flashy techniques and algorithms, this is not where Data Scientists spend 90% of their time. The hard yards…
Aug 13, 2016
enkhbayar mungunbat commented on Dan Kellett's blog post Making data science accessible – Logistic Regression
"interesting. your articles are very down to earth compared to those generally speakers."
Jul 29, 2016
Dan Kellett's blog post was featured

Making data science accessible – HDFS

By Dan Kellett, Director of Data Science, Capital One UK. Disclaimer: This is my attempt to explain some of the ‘Big Data’ concepts using basic analogies. There are inevitably nuances my analogy misses. What is HDFS? When people talk about ‘Hadoop’ they are usually referring to either the efficient storing or processing of large amounts of data. MapReduce is a framework for efficient processing using a parallel, distributed algorithm (see my previous blog…
Jul 22, 2016
Dan Kellett's blog post was featured

Making data science accessible – Neural Networks

By Dan Kellett, Director of Data Science, Capital One UK. What are Neural Networks? Neural Networks are a family of Machine Learning techniques modelled on the human brain. Being able to extract hidden patterns within data is a key ability for any Data Scientist and Neural Network approaches may be especially useful for extracting…
Jul 6, 2016
Dan Kellett posted a blog post

Making data science accessible – Text Mining

What is Text Mining? Text Mining is a general catch-all for a range of techniques for extracting information from text strings. Being able to extract, clean and summarize text data is a key ability for any Data Scientist. The following blog aims to highlight some of the process steps I use to clean text data as well as some summarization methods. Initial cleaning: To illustrate some of the approaches to text mining I am going to use the full text of 1984 by George Orwell. This data was extracted…
Jun 18, 2016
Dan Kellett's blog post was featured

Making data science accessible – Logistic Regression

What is Logistic Regression? Regression is a modelling technique for predicting the values of an outcome variable from one or more explanatory variables. Logistic Regression is a specific approach for describing a binary outcome variable (for example yes/no). Let’s assume you own a new boutique shop. You have a list of potential clients you are thinking of inviting to a special event with the aim of maximizing the number of sales. Who should you invite? Data on previous events you have run…
May 29, 2016
Dan Kellett's blog post was featured

Making data science accessible - Markov Chains

What are Markov Chains? A Markov chain is a random process with the property that the next state depends only on the current state. For example: if you choose between red and blue twice, the process is Markovian if, each time you choose, the decision has nothing to do with your previous choice (see diagram below). How can Markov Chains help us?…
May 4, 2016
Dr Venugopala Rao commented on Dan Kellett's blog post Making data science accessible - Machine Learning – Tree Methods
"Good One"
Apr 19, 2016
Dan Kellett posted a blog post

Making data science accessible - Machine Learning – Tree Methods

What are Tree Methods? Tree methods are commonly used in data science to understand patterns within data and to build predictive models. The term Tree Methods covers a variety of techniques with different levels of complexity, but my aim is to highlight three I find useful. To set the problem up, let’s assume we have a census dataset containing age, education, employment status and so on. Given all this information we want to see if we can predict whether a person earns more than $50k per year.…
Apr 12, 2016
Dan Kellett posted a blog post

Making data science accessible - MapReduce

What is MapReduce? When people talk about ‘Hadoop’ they are usually referring to either the efficient storing or processing of large amounts of data. The standard approach to reliable, scalable data storage in Hadoop is through the use of HDFS (Hadoop Distributed File System), which may be a topic for a future blog. MapReduce is a framework for efficient processing using a parallel, distributed algorithm. Over the past 18 months we have used MapReduce for a variety of analytic needs, building up…
Mar 25, 2016

Profile Information

Short Bio:
I love data.

There’s nothing more exciting than using the information we have all around us to understand more about how people interact with the world through their financial products.

As head of the UK Data Lab at Capital One I lead a cross-functional team that is changing the way we do business. We use the latest distributed computing technologies and operate across many billions of customer transactions. The models and data products that my team build unlock big opportunities for the business and help UK consumers save money and reduce friction in their financial lives.

Capital One is a founder-led company on a mission to change banking for good by bringing simplicity and humanity to all that we do. We’re a business dedicated to helping our customers succeed through digital solutions. We pride ourselves on attracting and developing the best talent. Our people are the key to our success and ultimately what makes Capital One the number one Great Place to Work in the UK for the past 3 years!

I'm interested in connecting with other Data Scientists, Statisticians or Data Visualisers to discuss advances in the fields.

I'm also on the look-out for ambitious, talented people to join my team as we build out our capabilities, using data to Change Banking for Good!
My Website or LinkedIn Profile (URL):
http://uk.linkedin.com/in/kellettdan
Field of Expertise:
Business Analytics, Predictive Modeling, Data Mining, Visualization, Statistical Consulting
Years of Experience in Analytical Role:
16
Professional Status:
Director
Interests:
Networking, Recruiting
Your Company:
Capital One
Industry:
Banking
Your Job Title:
Director of Data Science

Dan Kellett's Blog

Making data science accessible – Data Munging

By Dan Kellett, Director of Data Science, Capital One UK

Over the past few months my blogs have attempted to demystify some of the techniques used by Data Scientists to build models or process large amounts of data. For all the flashy techniques and algorithms, this is not where Data Scientists spend…

Posted on August 9, 2016 at 1:00am

Making data science accessible – HDFS

By Dan Kellett, Director of Data Science, Capital One UK

Disclaimer: This is my attempt to explain some of the ‘Big Data’ concepts using basic analogies. There are inevitably nuances my analogy misses.

What is HDFS?

When people talk about ‘Hadoop’ they are usually referring to either the efficient storing or processing of large amounts of data. MapReduce is a framework for efficient processing using a parallel, distributed algorithm…

Posted on July 21, 2016 at 2:00am

Making data science accessible – Neural Networks

By Dan Kellett, Director of Data Science, Capital One UK

What are Neural Networks?

Neural Networks are a family of Machine Learning techniques modelled on the human brain. Being able to extract hidden patterns within data is a key ability for any Data Scientist and Neural…

Posted on July 5, 2016 at 10:47am

Making data science accessible – Text Mining

What is Text Mining?

Text Mining is a general catch-all for a range of techniques for extracting information from text strings. Being able to extract, clean and summarize text data is a key ability for any Data Scientist. The following blog aims to highlight some of the process steps I use to clean text data as well as some summarization methods.

Initial cleaning

To illustrate some of the approaches to text…

Posted on June 14, 2016 at 8:06am

© 2017 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC