At BigML we believe that over the next few years automated, data-driven decisions and data-driven applications are going to change the world. In fact, we think it will be the biggest shift in business efficiency since the dawn of the office calculator, when individuals had “Computer” listed as the title on their business card. We want to help people rapidly and easily create predictive models using their datasets, no matter what size they are. Our easy-to-use, public API is a great step in that direction, but a few bindings for popular languages are obviously a big bonus.
Thus, we are very happy to announce an open source Python binding to BigML.io, the BigML REST API. You can find it and fork it on GitHub.
The BigML Python module makes it extremely easy to programmatically manage BigML sources, datasets, models and predictions. The snippet below sketches how you can create a source, dataset, model and then a prediction for a new object.
from bigml.api import BigML

api = BigML()

source = api.create_source('yourdata.csv')
dataset = api.create_dataset(source)
model = api.create_model(dataset)
prediction = api.create_prediction(model, new_object)
Just like magic!
We have tried to build a very simple binding that just wraps all the HTTP requests and responses to BigML.io within one class. Over the next few weeks we’ll see how to add more layers of abstraction so that you can have different ways to exploit all the information provided by datasets, models and predictions. You can see a few more examples on the GitHub page.
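To make the "one class wrapping all the HTTP requests" idea concrete, here is a minimal sketch of that design pattern. It is not the actual bigml.api.BigML code: the class name TinyClient, the fake_transport helper, the placeholder URL and the payload keys are all assumptions made up for illustration, and the transport is a stand-in so the sketch runs without a network connection or credentials.

```python
from typing import Any, Callable, Dict

# Hypothetical transport signature: (method, url, body) -> decoded JSON dict.
Transport = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


class TinyClient:
    """Illustrative one-class wrapper; NOT the real bigml.api.BigML class."""

    def __init__(self, base_url: str, transport: Transport) -> None:
        self.base_url = base_url.rstrip("/")
        self.transport = transport

    def _create(self, kind: str, body: Dict[str, Any]) -> Dict[str, Any]:
        # Every create_* method funnels through this single request helper.
        return self.transport("POST", f"{self.base_url}/{kind}", body)

    def create_dataset(self, source: Dict[str, Any]) -> Dict[str, Any]:
        return self._create("dataset", {"source": source["resource"]})

    def create_model(self, dataset: Dict[str, Any]) -> Dict[str, Any]:
        return self._create("model", {"dataset": dataset["resource"]})

    def create_prediction(
        self, model: Dict[str, Any], input_data: Dict[str, Any]
    ) -> Dict[str, Any]:
        body = {"model": model["resource"], "input_data": input_data}
        return self._create("prediction", body)


def fake_transport(method: str, url: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real HTTP call; echoes back an assumed response shape."""
    kind = url.rsplit("/", 1)[-1]
    return {"resource": f"{kind}/0001", "method": method, "sent": body}


# Placeholder URL (assumption), never contacted because of fake_transport.
client = TinyClient("https://api.example.invalid", fake_transport)
dataset = client.create_dataset({"resource": "source/0001"})
model = client.create_model(dataset)
print(model["resource"])
```

Keeping every request behind one private helper is what makes it easy to add further layers of abstraction later: higher-level objects can call the same few methods without knowing anything about HTTP.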
Getting back to the example above, imagine all the steps you would need in order to create a predictive model using another ML or statistical package. There are several Machine Learning libraries written specifically for Python, such as PyML, PyBrain and Orange, to name a few. Of course there are also the fabulous SciPy and NumPy libraries. They are great tools that can be the perfect complement or supplement to your BigML application. But using them is still non-trivial, and one needs to pay attention to lots of nitty-gritty details to model any realistic problem.
On the other hand, there are a few advantages to a cloud-based machine learning service like BigML that you need to bear in mind:
There are also two key specific advantages to building BigML predictive models:
The website is very intuitively designed: you can create a dataset from an uploaded file in one click, and you can create a Decision Tree model in one click as well. I wish other cloud computing websites like Google Prediction API made their design so intuitive and easy to understand.
So when you use the BigML API or the Python binding to create new models programmatically, you can later access them through the BigML interface, where you can nicely visualize and explore the models in your dashboard.
Also, unlike Google Prediction API, the models are not black-box models but have a description that can be understood.
In other words, your model is always an API call away, allowing you to download it locally and deploy it however you choose.
When we first started BigML we spent some time making our models exportable to PMML. However, we soon saw that creating a light-weight version in JSON was the way to go. Not only are the models smaller, simpler and easier to read, but they are also directly translatable to PMML.
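As a rough illustration of why a light-weight JSON tree is convenient to deploy locally, the sketch below walks a tiny decision tree stored as JSON and returns a prediction. The JSON layout, the field names and the toy tree itself are invented for this example and are not BigML's actual model schema; only the general idea (a nested tree of predicates that plain Python can evaluate) is taken from the text.

```python
import json
import operator
from typing import Any, Dict

# Hypothetical JSON layout and toy values, invented for illustration only.
MODEL_JSON = """
{
  "output": "setosa",
  "children": [
    {"predicate": {"field": "petal length", "op": "<=", "value": 2.45},
     "output": "setosa", "children": []},
    {"predicate": {"field": "petal length", "op": ">", "value": 2.45},
     "output": "versicolor", "children": []}
  ]
}
"""

# Map the operator strings used in the toy JSON to Python comparisons.
OPERATORS = {"<=": operator.le, ">": operator.gt, "=": operator.eq}


def predict(node: Dict[str, Any], inputs: Dict[str, Any]) -> str:
    """Descend into the first child whose predicate matches the inputs."""
    for child in node["children"]:
        pred = child["predicate"]
        value = inputs.get(pred["field"])
        if value is not None and OPERATORS[pred["op"]](value, pred["value"]):
            return predict(child, inputs)
    # Leaf reached, or no child matched: fall back to this node's output.
    return node["output"]


tree = json.loads(MODEL_JSON)
print(predict(tree, {"petal length": 1.4}))
print(predict(tree, {"petal length": 5.0}))
```

Because the whole model is just nested dictionaries and lists after `json.loads`, the prediction logic needs nothing beyond the standard library, which is the kind of portability the JSON format is meant to provide.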
Finally, we didn’t want to finish this post without giving you a sneak peek of things that are coming soon:
We are doing our best to make “machine learning for everyone” a reality. Now it’s time to unleash your inner hacker and show the world what machine learning can do in your application. If you don’t have a BigML account yet or need more credits, just send an email to support@bigml.com and we’ll be happy to move you to the top of our invite list.
© 2013 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC