It's time to create your own artificial intelligence! More specifically, a machine learning model. We've already talked about why Scikit-learn is a good choice for getting started in machine learning. Let's confirm this in practice and write our own machine learning model. Stretch your fingers, there will be a lot of code. But first, we strongly recommend that you get at least a little familiar with what Scikit-learn is and with the basics of machine learning theory. Already done? Then let's move forward!
Let's get started
The first step is to make sure that you have Python installed on your computer. If you don't, you can download it for free from the Python website. Just follow the installation instructions.
After installing Python, you need to install Scikit-learn using a package manager, a tool that helps you download and install libraries.
There are several package managers you can use, but the easiest option is pip. Pip comes bundled with recent Python installers, so you usually don't need to install it separately.
How to install Scikit-learn using pip:
- Open a command line on your computer. On Windows, press Win + R, type cmd, and press Enter. On macOS, press Command + Space, type Terminal, and press Enter.
- Once you have the command prompt open, type in the following command:
pip install scikit-learn
- You have installed Scikit-learn!
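If you want to double-check the installation, a short script along these lines can help. This is only a sketch: the helper name is_installed is made up for this example, and it relies on the fact that Scikit-learn is imported under the name sklearn.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can find a package with this import name."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("sklearn"):
    # Scikit-learn is imported as "sklearn", not "scikit-learn"
    import sklearn
    print("Scikit-learn version:", sklearn.__version__)
else:
    print("Scikit-learn is not installed yet. Run: pip install scikit-learn")
```

Save the script to a file and run it with Python. If it prints a version number, you are ready to go.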
Now you can start programming. To do this comfortably, you should install a development environment, such as Wing or PyCharm. Development environments are more convenient than the command line because they have many useful features, such as a project manager, autocomplete, a debugger, and more. The website of any integrated development environment (IDE) has detailed installation instructions. You can also use an online development environment.
How to teach robots: meet the estimators
Remember the puzzles where you had to continue a sequence, such as 2, 4, 6, 8, ...?
Estimators are subprograms that solve these kinds of tasks. They learn from examples and can make predictions about new data. So, if you give an estimator a bunch of data about houses, it can learn what factors make a house expensive and then use that information to predict how much a new house might cost.
In Python, estimators are represented by classes, many of which are already defined in Scikit-learn.
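To make the idea of learning from examples more concrete, here is a minimal sketch of an estimator continuing a number sequence. The sequence 2, 4, 6, 8 is made up for this illustration, and LinearRegression is just one of the ready-made Scikit-learn estimators that happens to suit it. Don't worry about the syntax yet.

```python
from sklearn.linear_model import LinearRegression

# Made-up example: the position of each number in the sequence...
positions = [[1], [2], [3], [4]]
# ...and the number found at that position
values = [2, 4, 6, 8]

# The estimator learns the pattern from the examples
estimator = LinearRegression()
estimator.fit(positions, values)

# Ask the estimator which number comes fifth
next_value = estimator.predict([[5]])[0]
print(f"Predicted fifth number: {next_value:.1f}")  # Predicted fifth number: 10.0
```

The estimator was never told the rule "add two each time". It worked it out from the four examples.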
Classes and objects
In programming, you often have to deal with objects and classes. Cats will help us get the idea.
These are Mr. Pumpkin and Luna. They are cats. In terms of object-oriented programming, they are two different objects of the same class: cats. A class is a kind of template for creating similar objects.
Classes have methods and attributes. Methods are functions (subprograms that perform a specific task) that belong to a class and are called on its objects. So, a Cat class can have the methods .eat(), .sleep(), and .meow().
If methods are what a class object can do, then attributes are what it is. Cat class attributes can be name, weight, color, breed, and age.
Let's create our own cats:
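Here is a minimal sketch of what such a class might look like. The attribute and method names come from the description above. The text returned by each method is our own choice for this illustration.

```python
class Cat:
    """A template for creating cat objects."""

    def __init__(self, name, weight, color, breed, age):
        # Attributes: what a cat is
        self.name = name
        self.weight = weight
        self.color = color
        self.breed = breed
        self.age = age

    # Methods: what a cat can do
    def eat(self):
        return f"{self.name} is eating."

    def sleep(self):
        return f"{self.name} is sleeping."

    def meow(self):
        return f"{self.name} says: Meow!"
```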
We just created and described a class. Now we can create objects — instances of this class:
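A sketch of creating the two cats could look like this. The class is repeated in a shortened form so that the snippet runs on its own, and the colors are invented for the example.

```python
class Cat:
    """Shortened Cat class: two attributes and one method."""

    def __init__(self, name, color):
        self.name = name
        self.color = color

    def meow(self):
        return f"{self.name} says: Meow!"


# Two different objects (instances) of the same class
mr_pumpkin = Cat("Mr. Pumpkin", "ginger")  # hypothetical color
luna = Cat("Luna", "black")                # hypothetical color

print(f"{mr_pumpkin.name} is a {mr_pumpkin.color} cat.")  # Mr. Pumpkin is a ginger cat.
print(luna.meow())                                        # Luna says: Meow!
```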
Running the code in an editor, the result looks like this:
Now you know how to create your own class! Let's go back to the estimators and create a simple one. You don't have to understand all of the following code. Just see how similar in structure the creation of the Cat and Estimator classes is.
As you can see, the MyEstimator class, unlike the Cat class, has no .meow(), .sleep(), or .eat() methods, but does have .fit() and .predict().
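A minimal sketch of such an estimator might look like the following. BaseEstimator is a real Scikit-learn class. What .predict() returns here (a zero for every input) is an assumption made only to keep the example tiny.

```python
import numpy as np
from sklearn.base import BaseEstimator


class MyEstimator(BaseEstimator):
    """A toy estimator that shows the structure but learns nothing."""

    def fit(self, X, y=None):
        # A real estimator would learn from X and y here.
        # This one does nothing and simply returns itself.
        return self

    def predict(self, X):
        # Assumption for this sketch: always predict class 0,
        # one prediction for each input in X.
        return np.zeros(len(X), dtype=int)


estimator = MyEstimator()
estimator.fit([[1], [2], [3]], [0, 1, 0])
print(estimator.predict([[4], [5]]))
```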
.fit() is used to train an estimator on a set of data. When you call the fit method on an object of the MyEstimator class, you provide it with some input data (X) and optional target data (y). The fit method doesn't do anything with this data — it just returns the object itself.
.predict() is used to make predictions based on the data that the estimator was trained on. When you call the predict method on an object of the MyEstimator class, you provide it with some input data (X). The predict method returns a prediction for each input in X.
We also say that the MyEstimator class inherits from the BaseEstimator class. This means that our class has some features of another class. Don't overthink it — that's just how things work in Python.
Basic Python vs Scikit-learn
Another cool thing about Scikit-learn is that you don't have to create your own estimator classes to create a machine learning model — you can simply load a ready-made class from the library itself. This shortens your code considerably. Check this out:
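The following sketch shows one way to build such a model with ready-made Scikit-learn classes. The split size, the random_state, the max_iter setting, and the measurements of the new flower are our own choices for this example, so your accuracy may differ if you change them.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the Iris dataset: features (X) and species labels (y)
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create the estimator and train it on the training set only
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Classify the flowers in the test set and measure the accuracy
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Classify a new flower:
# sepal length, sepal width, petal length, petal width (in cm)
new_flower = [[5.1, 3.5, 1.4, 0.2]]
predicted = model.predict(new_flower)[0]
print("Predicted species:", iris.target_names[predicted])
```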
Let's break down exactly what this code does.
First, we loaded Iris, a classic dataset with 150 samples of iris flowers. Each sample has four features (sepal length, sepal width, petal length, and petal width) and a target variable indicating the species of iris (setosa, versicolor, or virginica). We then split the data into two parts: a training set and a test set.
The training set actually teaches the algorithm how to classify a flower. The test set is used to test how well the algorithm works on new data it has not yet seen. It's like the answers at the end of a problem book, with which we compare the solution.
Next, we created an estimator. It is an implementation of logistic regression, a machine learning algorithm that predicts the probability that a sample belongs to each class. In this case, we use the length and width of the sepal and the length and width of the petal to predict the iris species.
We then used the .fit() method to train our estimator on the training set. After that, using the .predict() method, the model applied what it had learned to classify the iris flowers in the test set.
After that, we evaluated the effectiveness of the algorithm by calculating its accuracy.
Finally, we passed in the measurements of a new flower and used the model to classify it.
Is it possible to create a machine-learning model without a special library? Yes, but it is not very handy, as you’ll see in our example below, using basic Python:
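Here is a simplified sketch of logistic regression written in basic Python, with no machine learning library. To keep it short, it makes several assumptions: it tells apart only two species instead of three, it uses only two features (petal length and petal width), and it works on a handful of illustrative measurements typed in by hand instead of the full dataset. All function names are our own.

```python
import math

# Illustrative measurements: [petal length, petal width] in cm
# Label 0 = setosa, label 1 = versicolor
X_train = [[1.4, 0.2], [1.3, 0.2], [1.5, 0.2], [1.7, 0.4],
           [4.7, 1.4], [4.5, 1.5], [4.9, 1.5], [4.0, 1.3]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

X_test = [[1.4, 0.3], [4.6, 1.5]]
y_test = [0, 1]


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability that a flower belongs to class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit(X, y, learning_rate=0.1, epochs=1000):
    """Learn weights and bias with stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, X):
    """Return 1 if the probability is at least 0.5, otherwise 0."""
    return [1 if predict_proba(weights, bias, f) >= 0.5 else 0 for f in X]


weights, bias = fit(X_train, y_train)
y_pred = predict(weights, bias, X_test)
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print("Predictions:", y_pred)
print(f"Accuracy: {accuracy:.2f}")
```

Even this cut-down version needs its own functions for the math, the training loop, and the predictions. Scikit-learn provides all of that in a few lines.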
In both cases, you should get comparable results if you use the same dataset and the same algorithm, but with Scikit-learn it is easier and faster. Speed is especially important when you use more complex algorithms and work with large datasets.
Wow, you really did it! Now you have your own machine learning model. You can be really proud of yourself!
Know what else is cool? Once you understand how to work with Scikit-learn, you can master more advanced machine learning tools. Maybe the artificial intelligence you write will make the world a better place?
We want to make the world a better place, too, and we're big fans of artificial intelligence, so we put together TripleTen's Data Science Bootcamp. There you'll learn the basics of Python, delve deeper into working with Scikit-learn, and master working with more complex and advanced machine-learning libraries. If you're ready, join the team of like-minded folks.