2.8. Using cleverhans within SecML

In this tutorial we will show how to craft evasion attacks against machine learning models in SecML through the Cleverhans interface.

Warning

Requires installation of the pytorch and cleverhans extra dependencies. See extra components for more information.

[1]:
%%capture --no-stderr --no-display
# NBVAL_IGNORE_OUTPUT

try:
  import secml
  import torch
  import cleverhans
except ImportError:
  %pip install git+https://gitlab.com/secml/secml#egg=secml[pytorch,cleverhans]

2.8.1. Training the model

The first part is the same as in the first notebook: we load a 2D dataset, so that we can easily plot the attack starting point and path.

[2]:
random_state = 999

n_features = 2  # Number of features
n_samples = 1100  # Number of samples
centers = [[-2, 0], [2, -2], [2, 2]]  # Centers of the clusters
cluster_std = 0.75  # Standard deviation of the clusters
n_classes = len(centers)

from secml.data.loader import CDLRandomBlobs

dataset = CDLRandomBlobs(n_features=n_features,
                         centers=centers,
                         cluster_std=cluster_std,
                         n_samples=n_samples,
                         random_state=random_state).load()

n_tr = 1000  # Number of training set samples
n_ts = 100  # Number of test set samples

# Split in training and test
from secml.data.splitter import CTrainTestSplit

splitter = CTrainTestSplit(
    train_size=n_tr, test_size=n_ts, random_state=random_state)
tr, ts = splitter.split(dataset)

# Normalize the data
from secml.ml.features import CNormalizerMinMax

nmz = CNormalizerMinMax()
tr.X = nmz.fit_transform(tr.X)
ts.X = nmz.transform(ts.X)

# Metric to use for training and performance evaluation
from secml.ml.peval.metrics import CMetricAccuracy

metric = CMetricAccuracy()

# Creation of the multiclass classifier
import torch
from torch import nn

class Net(nn.Module):
    """
    Model with input size (-1, 5) for blobs dataset
    with 5 features
    """
    def __init__(self, n_features, n_classes):
        """Example network."""
        super(Net, self).__init__()
        self.fc1 = nn.Linear(n_features, 50)
        self.fc2 = nn.Linear(50, n_classes)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Random seed for PyTorch
torch.manual_seed(random_state)

# torch model creation
net = Net(n_features=n_features, n_classes=n_classes)

from torch import optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(),
                      lr=0.001, momentum=0.9)

# wrap torch model in CClassifierPyTorch class
from secml.ml.classifiers import CClassifierPyTorch
clf = CClassifierPyTorch(model=net,
                         loss=criterion,
                         optimizer=optimizer,
                         input_shape=(n_features,),
                         epochs=5,
                         random_state=random_state)

# We can now fit the classifier
clf.fit(tr.X, tr.Y)

# Compute predictions on a test set
y_pred = clf.predict(ts.X)

# Evaluate the accuracy of the classifier
acc = metric.performance_score(y_true=ts.Y, y_pred=y_pred)

print("Accuracy on test set: {:.2%}".format(acc))
Accuracy on test set: 100.00%
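
Since the dataset is two-dimensional, we can also visualize the decision regions learned by the classifier together with the test samples. This is an optional sketch using the plot_decision_regions and plot_ds APIs of the CFigure module (the same module used later to plot the attacks):

[ ]:
from secml.figure import CFigure
# Only required for visualization in notebooks
%matplotlib inline

# Optional check: plot the decision regions of the trained
# classifier and overlay the test set samples
fig = CFigure(width=5, height=5)
fig.sp.plot_decision_regions(clf, n_grid_points=100)
fig.sp.plot_ds(ts)
fig.show()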

2.8.2. Preparing the attacks

Now that we have the model, we can prepare the attacks. We will test several attack algorithms from the CleverHans library.

We can specify a starting point for the attacks: here we select a point from class 1, which lies in the lower-right corner of the 2D plane. As always, we can define a box for the attack in order to comply with the feature range ([0, 1]) and a maximum distance, as well as the target class (specifying y_target=None will instead produce an untargeted attack).

[3]:
from secml.array import CArray

# x0, y0 = ts[5, :].X, ts[5, :].Y  # Initial sample
x0, y0 = CArray([0.7, 0.4]), CArray([1])
lb, ub = 0, 1
dmax = 0.4
y_target = 2
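
Before running any attack, we can quickly verify that the classifier actually assigns the starting point to class 1 (a small sanity check, not part of the attack setup itself):

[ ]:
# Sanity check: the starting point should be predicted as class 1
print("Predicted label for x0:", clf.predict(x0).item())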

We can finally specify the parameters for the attacks, so that we can compare their paths; the available parameters for each attack are described in the CleverHans documentation.

We are going to use the following attacks:

  • FGM: Goodfellow IJ, Shlens J, Szegedy C. Explaining and Harnessing Adversarial Examples. arXiv:1412.6572 [cs, stat], 2014.

  • PGD: Kurakin A, Goodfellow I, Bengio S. Adversarial examples in the physical world. arXiv:1607.02533 [cs, stat], 2017.

  • MIM: Dong Y, Liao F, Pang T, Su H, Zhu J, Hu X, et al. Boosting Adversarial Attacks with Momentum. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT: IEEE; 2018.

  • CW: Carlini N, Wagner D. Towards Evaluating the Robustness of Neural Networks. arXiv:1608.04644 [cs], 2016.

[4]:
from cleverhans.attacks import CarliniWagnerL2, ProjectedGradientDescent, \
    MomentumIterativeMethod, FastGradientMethod

from collections import namedtuple
Attack = namedtuple('Attack', 'attack_cls short_name attack_params')

attacks = [
    Attack(FastGradientMethod, 'FGM', {'eps': dmax,
                                       'clip_max': ub,
                                       'clip_min': lb,
                                       'ord': 2}),
    Attack(ProjectedGradientDescent, 'PGD', {'eps': dmax,
                                             'eps_iter': 0.05,
                                             'nb_iter': 50,
                                             'clip_max': ub,
                                             'clip_min': lb,
                                             'ord': 2,
                                             'rand_init': False}),
    Attack(MomentumIterativeMethod, 'MIM', {'eps': dmax,
                                            'eps_iter': 0.05,
                                            'nb_iter': 50,
                                            'clip_max': ub,
                                            'clip_min': lb,
                                            'ord': 2,
                                            'decay_factor': 1}),
    Attack(CarliniWagnerL2, 'CW2', {'binary_search_steps': 1,
                                    'initial_const': 0.2,
                                    'confidence': 10,
                                    'abort_early': True,
                                    'clip_min': lb,
                                    'clip_max': ub,
                                    'max_iterations': 50,
                                    'learning_rate': 0.1})]
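
Each entry can be unpacked directly into the SecML wrapper class introduced in the next section. As a preview, a single FGM attack instance could be created as follows (the loop in the next section applies this same pattern to all four attacks):

[ ]:
from secml.adv.attacks import CAttackEvasionCleverhans

# Example: wrap a single CleverHans attack class (FGM) in SecML;
# the loop in the next section applies the same pattern to every attack
fgm_attack = CAttackEvasionCleverhans(
    classifier=clf,
    y_target=y_target,
    clvh_attack_class=attacks[0].attack_cls,
    **attacks[0].attack_params)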

2.8.3. Running the attacks

We can now run the attacks by passing them to the CAttackEvasionCleverhans class, which handles the attack optimization and provides useful outputs such as the attack path and the objective function. We can plot this information with the help of the CFigure module and its plot_fun and plot_path APIs.

[5]:
from secml.figure import CFigure
# Only required for visualization in notebooks
%matplotlib inline

from secml.adv.attacks import CAttackEvasionCleverhans

fig = CFigure(width=20, height=15)

for i, attack in enumerate(attacks):
    fig.subplot(2, 2, i + 1)

    fig.sp.plot_decision_regions(clf,
                                 plot_background=False,
                                 n_grid_points=100)

    cleverhans_attack = CAttackEvasionCleverhans(
        classifier=clf,
        y_target=y_target,
        clvh_attack_class=attack.attack_cls,
        **attack.attack_params)

    # Run the evasion attack on x0
    print("Attack {:} started...".format(attack.short_name))
    y_pred_CH, _, adv_ds_CH, _ = cleverhans_attack.run(x0, y0)
    print("Attack finished!")

    fig.sp.plot_fun(cleverhans_attack.objective_function,
                    multipoint=True, plot_levels=False,
                    n_grid_points=50, alpha=0.6)

    print("Original x0 label: ", y0.item())
    print("Adversarial example label ({:}): "
          "".format(attack.attack_cls.__name__), y_pred_CH.item())

    print("Number of classifier function evaluations: {:}"
          "".format(cleverhans_attack.f_eval))
    print("Number of classifier gradient evaluations: {:}"
          "".format(cleverhans_attack.grad_eval))

    fig.sp.plot_path(cleverhans_attack.x_seq)
    fig.sp.title(attack.short_name)
    fig.sp.text(0.2, 0.92, "f_eval:{}\ngrad_eval:{}"
                           "".format(cleverhans_attack.f_eval,
                                     cleverhans_attack.grad_eval),
                bbox=dict(facecolor='white'), horizontalalignment='right')
fig.show()
Attack FGM started...
Attack finished!
Original x0 label:  1
Adversarial example label (FastGradientMethod):  2
Number of classifier function evaluations: 1
Number of classifier gradient evaluations: 1
Attack PGD started...
Attack finished!
Original x0 label:  1
Adversarial example label (ProjectedGradientDescent):  2
Number of classifier function evaluations: 50
Number of classifier gradient evaluations: 50
Attack MIM started...
Attack finished!
Original x0 label:  1
Adversarial example label (MomentumIterativeMethod):  2
Number of classifier function evaluations: 50
Number of classifier gradient evaluations: 50
Attack CW2 started...
Attack finished!
Original x0 label:  1
Adversarial example label (CarliniWagnerL2):  2
Number of classifier function evaluations: 46
Number of classifier gradient evaluations: 46
[Figure: decision regions with the attack objective function and evasion path for each of the four attacks (FGM, PGD, MIM, CW2).]
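
As a final check, we can measure how far the last adversarial example moved from the starting point. This is a minimal sketch: after the loop, adv_ds_CH holds the adversarial dataset returned by the last attack run (CW2), and the CArray norm method gives the L2 distance:

[ ]:
# Measure the L2 perturbation of the last adversarial example;
# adv_ds_CH holds the result of the last attack run (CW2)
adv_x = adv_ds_CH.X
print("L2 distance from x0: {:.4f}".format((adv_x - x0).ravel().norm()))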