Using sklearn to cross-validate a Bayesian network classifier
This notebook shows how pyAgrum's classifier integrates with the scikit-learn ecosystem. Because BNClassifier can be passed to scikit-learn like any other estimator, scikit-learn's cross-validation tools can be applied directly to a pyAgrum Bayesian network classifier.
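To make the idea concrete, here is a minimal, library-free sketch of what a k-fold cross-validation loop does with any object that exposes fit and predict. MajorityClassifier and k_fold_scores are hypothetical names used only for illustration; they are not part of pyAgrum or scikit-learn. scikit-learn's actual cross_validate does more than this: among other things it clones the estimator for each fold and, for classifiers given an integer cv, uses stratified folds.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical toy estimator: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def k_fold_scores(model, X, y, k):
    """Plain (unstratified, unshuffled) k-fold: return one accuracy per fold."""
    n = len(X)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample is held out in this fold
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        X_test = [X[i] for i in sorted(test_idx)]
        y_test = [y[i] for i in sorted(test_idx)]
        pred = model.fit(X_train, y_train).predict(X_test)
        scores.append(sum(p == t for p, t in zip(pred, y_test)) / len(y_test))
    return scores


# Toy data: 12 samples, 8 of class 0 and 4 of class 1.
X_toy = [[i] for i in range(12)]
y_toy = [0] * 8 + [1] * 4
toy_scores = k_fold_scores(MajorityClassifier(), X_toy, y_toy, k=4)
print(toy_scores)
```

Any estimator with the same two methods could be dropped into this loop, which is the property that lets BNClassifier be handed to scikit-learn's tooling in the cells that follow.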
In [1]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
from pyAgrum.skbn import BNClassifier
In [2]:
from sklearn.model_selection import cross_validate
from sklearn import datasets
# get iris data
iris = datasets.load_iris()
X = iris.data
y = iris.target
In [3]:
model = BNClassifier(learningMethod='MIIC', prior='Smoothing', priorWeight=1,
                     discretizationNbBins=3, discretizationStrategy="kmeans",
                     discretizationThreshold=10)
In [4]:
cv = cross_validate(model, X, y, cv=30)
print(f"scores with cross-folding : {cv['test_score']}")
print()
print(f"mean score : {cv['test_score'].mean()}")
scores with cross-folding : [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.8 1. 1. 1. 0.8 1. 1. 0.8 1. 1. 1. 0.8 1. 1. 1. 1. 1. 1. 1. 1. ]

mean score : 0.9733333333333333
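The fold scores above only take the values 1 and 0.8. A short sketch of the arithmetic, assuming the 150 iris samples are split into 30 equal-sized folds (which the 0.8 values suggest, but which this notebook does not state explicitly): each fold then holds 5 test samples, so a single misclassification already costs 0.2. The helper possible_fold_scores is a hypothetical name introduced for this illustration.

```python
from fractions import Fraction


def possible_fold_scores(n_samples, n_folds):
    """Return (fold size, attainable accuracies), assuming equal-sized folds."""
    fold_size, remainder = divmod(n_samples, n_folds)
    if remainder:
        raise ValueError("this sketch assumes n_samples is divisible by n_folds")
    attainable = [Fraction(correct, fold_size) for correct in range(fold_size + 1)]
    return fold_size, attainable


# iris has 150 samples; the cell above uses cv=30.
fold_size, attainable = possible_fold_scores(150, 30)
print(fold_size)
print([float(s) for s in attainable])
```

With so few test samples per fold, individual fold scores are very coarse, so the mean over folds is the more informative number.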
In [5]:
cv = cross_validate(model, X, y, cv=50)
print(f"scores with cross-folding : {cv['test_score']}")
print()
print(f"mean score : {cv['test_score'].mean()}")
scores with cross-folding : [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.66666667 1. 1. 1. 1. 1. 1. 0.66666667 1. 1. 1. 1. 1. 0.66666667 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. ]

mean score : 0.98
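To compare the two runs, it can help to convert the fold scores back into counts of misclassified samples. The sketch below does this under the assumption that every fold has the same size (150 / 30 = 5 and 150 / 50 = 3 test samples); misclassified is a hypothetical helper name, and the score lists are regrouped copies of the printed outputs, since the order of folds does not affect the count.

```python
n_samples = 150  # size of the iris dataset

# Fold scores copied from the two outputs above (order does not matter here).
scores_30 = [1.0] * 26 + [0.8] * 4
scores_50 = [1.0] * 47 + [2 / 3] * 3


def misclassified(scores, n_samples):
    """Total number of wrong predictions, assuming equal-sized folds."""
    fold_size = n_samples // len(scores)
    return sum(round((1 - s) * fold_size) for s in scores)


errors_30 = misclassified(scores_30, n_samples)
errors_50 = misclassified(scores_50, n_samples)
print(errors_30, 1 - errors_30 / n_samples)
print(errors_50, 1 - errors_50 / n_samples)
```

Under that assumption, the two mean scores differ by a single misclassified sample out of 150, so the gap between them should not be read as evidence that one fold count is better than the other.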