Multilabel classification in sklearn
Multilabel classification predicts multiple classes for a single input. In image object detection, for example, one image can contain several objects that you want to identify. In simple (multiclass) classification you predict a single digit per image, as in the MNIST data set; in multilabel classification you predict every digit that appears in the image.
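To make the difference concrete, here is a minimal sketch of what a multilabel target looks like. The label sets below are made up for illustration (they are not from the Kaggle data), and MultiLabelBinarizer is just one convenient way to build the 0/1 matrix.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical labels: each image may contain several digits at once.
labels_per_image = [{1, 7}, {3}, {0, 3, 7}]

# Turn the label sets into a 0/1 indicator matrix:
# one row per image, one column per distinct label.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels_per_image)

print(mlb.classes_)  # the distinct labels, in sorted order
print(y)             # a 1 marks a label that is present in that image
```

In multiclass classification each row would carry exactly one label; here a row can have any number of 1s.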
Data
The data is from Kaggle; you can download it from here.
import pandas as pd
train_df = pd.read_csv('./train_features.csv')
test_df = pd.read_csv('./test_features.csv')
y_train = pd.read_csv('./train_targets_scored.csv')
Next we convert the categorical variables to their class codes. To do that we first merge the train and test data, so that the same category gets the same code in both sets.
grouped_df = pd.concat([train_df, test_df], axis=0)
grouped_df = grouped_df.drop(['sig_id'], axis=1)
Converting:
grouped_df['cp_type'] = grouped_df['cp_type'].astype('category')
grouped_df['cp_type'] = grouped_df['cp_type'].cat.codes
grouped_df['cp_dose'] = grouped_df['cp_dose'].astype('category')
grouped_df['cp_dose'] = grouped_df['cp_dose'].cat.codes
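The reason for merging before encoding can be seen on a toy example. This is only a sketch: the frames and the column name dose are hypothetical stand-ins for the real cp_type and cp_dose columns.

```python
import pandas as pd

# Hypothetical toy frames; the test frame is missing the category 'D1'.
toy_train = pd.DataFrame({"dose": ["D1", "D2", "D1"]})
toy_test = pd.DataFrame({"dose": ["D2", "D2"]})

# Encoding each frame on its own: codes depend on which categories
# happen to be present, so 'D2' gets a different code in each frame.
train_codes_separate = toy_train["dose"].astype("category").cat.codes.tolist()
test_codes_separate = toy_test["dose"].astype("category").cat.codes.tolist()

# Encoding after concatenation: every category gets one shared code.
combined = pd.concat([toy_train, toy_test], axis=0)
combined_codes = combined["dose"].astype("category").cat.codes.tolist()

print(train_codes_separate, test_codes_separate)
print(combined_codes)
```

Encoded separately, 'D2' is 1 in the train frame but 0 in the test frame; encoded together, it is 1 everywhere.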
Split into train and test sets
n_train = len(train_df)  # number of training rows (23814 in this data set)
x_train = grouped_df[:n_train]
x_test = grouped_df[n_train:]
y_train = y_train.drop(['sig_id'], axis=1)
We skip the validation set because this is just a simple example.
Modeling
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier
svc = LinearSVC(max_iter=1500)
Now comes the new part.
OneVsRestClassifier:
It builds one model per class, and each model is trained as a binary classifier for that class (this class versus the rest). In the end it predicts 0 or 1 for each class.
clf = OneVsRestClassifier(svc)
clf.fit(x_train, y_train)
The prediction is an array of 0s and 1s, and for every 1 we output the corresponding class.
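The mapping from the 0/1 array back to class names can be sketched as follows. To keep the example self-contained it uses synthetic data from make_multilabel_classification instead of the Kaggle files, and the names in class_names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic multilabel data: Y is a 0/1 matrix with one column per class.
X, Y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=4, random_state=0
)
class_names = np.array(["class_a", "class_b", "class_c", "class_d"])  # hypothetical

clf = OneVsRestClassifier(LinearSVC(max_iter=1500))
clf.fit(X, Y)

# One row per sample, one 0/1 column per class.
pred = clf.predict(X[:3])

# For every 1 in a row, output the name of the corresponding class.
for row in pred:
    print(class_names[row == 1].tolist())
```

With the data from this post, the column names of y_train would play the role of class_names.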
Drawbacks:
Because it creates a new model for each class, training can take a long time when there are many classes.
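The one-model-per-class behaviour can be checked directly on a fitted classifier. The sketch below again uses synthetic data rather than the Kaggle files. It also passes n_jobs=-1, a OneVsRestClassifier parameter that asks scikit-learn to fit the per-class models in parallel; whether that helps noticeably depends on the machine and the data, so treat it as something to try rather than a guaranteed fix.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic multilabel data with 5 classes.
X, Y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=5, random_state=0
)

# n_jobs=-1 uses all available cores to fit the per-class models.
clf = OneVsRestClassifier(LinearSVC(max_iter=1500), n_jobs=-1)
clf.fit(X, Y)

# One fitted binary model per class, so training cost grows with the class count.
print(len(clf.estimators_), "models for", Y.shape[1], "classes")
```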