secml.data.selection

CPrototypesSelector

class secml.data.selection.c_prototypes_selector.CPrototypesSelector

Bases: secml.core.c_creator.CCreator

Selection of Prototypes.

Prototype selection methods help reduce the number of samples in a dataset by carefully selecting a subset of prototypes.

A good selection strategy should satisfy the following three conditions [1]. First, if some prototypes are similar (that is, if they are close in the space of strings), their distances to a sample string should vary only a little. Hence, in this case, some of the respective vector components are redundant. Consequently, a selection algorithm should avoid redundancies. Secondly, to include as much structural information as possible in the prototypes, they should be uniformly distributed over the whole set of patterns. Thirdly, since outliers are likely to introduce noise and distortions, a selection algorithm should disregard outliers.

References

[1] Spillmann, Barbara, et al. “Transforming strings to vector spaces using prototype selection.” Structural, Syntactic, and Statistical Pattern Recognition. Springer Berlin Heidelberg, 2006. 287-296.

Attributes
class_type

Defines class type.

logger

Logger for current object.

sel_idx

Returns an array with the indices of the selected prototypes.

verbose

Verbosity level of logger output.

Methods

copy(self)

Returns a shallow copy of current class.

create([class_item])

This method creates an instance of a class with given type.

deepcopy(self)

Returns a deep copy of current class.

get_class_from_type(class_type)

Return the class associated with input type.

get_params(self)

Returns the dictionary of class parameters.

get_state(self)

Returns the object state dictionary.

get_subclasses()

Get all the subclasses of the calling class.

list_class_types()

Lists the types of all available subclasses of the calling class.

load(path)

Loads object from file.

load_state(self, path)

Sets the object state from file.

save(self, path)

Save class object to file.

save_state(self, path)

Store the object state to file.

select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

set(self, param_name, param_value[, copy])

Set a parameter of the class.

set_params(self, params_dict[, copy])

Set all parameters passed as a dictionary {key: value}.

set_state(self, state_dict[, copy])

Sets the object state using input dictionary.

timed([msg])

Timer decorator.

property sel_idx

Returns an array with the indices of the selected prototypes.

abstract select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

Parameters
dataset : CDataset

Dataset from which prototypes should be selected.

n_prototypes : int

Number of prototypes to be selected.

Returns
reduced_ds : CDataset

Dataset with selected prototypes.
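The contract described above (an abstract select that returns the reduced dataset, plus a sel_idx property exposing the chosen indices) can be sketched in plain Python. This is only an illustration under stated assumptions: PrototypeSelectorSketch and FirstNSelector are hypothetical names, plain lists stand in for CDataset, and nothing here is secml's actual implementation.

```python
from abc import ABC, abstractmethod


class PrototypeSelectorSketch(ABC):
    """Hypothetical stand-in for a prototype selector (not the secml class)."""

    def __init__(self):
        self._sel_idx = None  # filled in by select()

    @property
    def sel_idx(self):
        """Indices of the prototypes chosen by the last select() call."""
        return self._sel_idx

    @abstractmethod
    def select(self, dataset, n_prototypes):
        """Return the reduced dataset holding n_prototypes samples."""


class FirstNSelector(PrototypeSelectorSketch):
    """Toy strategy (an assumption for illustration): keep the first samples."""

    def select(self, dataset, n_prototypes):
        if not 0 < n_prototypes <= len(dataset):
            raise ValueError("n_prototypes must be in [1, len(dataset)]")
        self._sel_idx = list(range(n_prototypes))
        return [dataset[i] for i in self._sel_idx]


selector = FirstNSelector()
reduced = selector.select(["a", "b", "c", "d"], 2)
print(reduced, selector.sel_idx)
```

A concrete strategy only has to implement select and record the chosen indices; the base class cannot be instantiated on its own because select is abstract.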

CPSBorder

class secml.data.selection.c_ps_border.CPSBorder

Bases: secml.data.selection.c_prototypes_selector.CPrototypesSelector

Selection of Prototypes using border strategy.

Selects the prototypes from the borders of the dataset.
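One simple reading of the border strategy is to rank samples by their total distance to all other samples and keep those with the largest totals. The sketch below follows that reading using only the standard library; border_select is a hypothetical helper, Euclidean distance is an assumption, and the code is not taken from secml.

```python
import math


def border_select(points, n_prototypes):
    """Return indices of the n_prototypes samples farthest from the rest."""
    # Total Euclidean distance from each sample to every other sample.
    totals = [sum(math.dist(p, q) for q in points) for p in points]
    # Largest totals first: these samples sit at the margins of the data.
    order = sorted(range(len(points)), key=lambda i: totals[i], reverse=True)
    return order[:n_prototypes]


points = [(0.0,), (1.0,), (2.0,), (4.0,), (10.0,)]
border_idx = border_select(points, 2)
print(border_idx)
```

On this toy data the two extreme points (10.0 and 0.0) are picked, which matches the intuition of sampling from the borders.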

References

Spillmann, Barbara, et al. “Transforming strings to vector spaces using prototype selection.” Structural, Syntactic, and Statistical Pattern Recognition. Springer Berlin Heidelberg, 2006. 287-296.

Attributes
class_type : 'border'

Defines class type.

Methods

copy(self)

Returns a shallow copy of current class.

create([class_item])

This method creates an instance of a class with given type.

deepcopy(self)

Returns a deep copy of current class.

get_class_from_type(class_type)

Return the class associated with input type.

get_params(self)

Returns the dictionary of class parameters.

get_state(self)

Returns the object state dictionary.

get_subclasses()

Get all the subclasses of the calling class.

list_class_types()

Lists the types of all available subclasses of the calling class.

load(path)

Loads object from file.

load_state(self, path)

Sets the object state from file.

save(self, path)

Save class object to file.

save_state(self, path)

Store the object state to file.

select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

set(self, param_name, param_value[, copy])

Set a parameter of the class.

set_params(self, params_dict[, copy])

Set all parameters passed as a dictionary {key: value}.

set_state(self, state_dict[, copy])

Sets the object state using input dictionary.

timed([msg])

Timer decorator.

select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

Parameters
dataset : CDataset

Dataset from which prototypes should be selected.

n_prototypes : int

Number of prototypes to be selected.

Returns
reduced_ds : CDataset

Dataset with selected prototypes.

CPSCenter

class secml.data.selection.c_ps_center.CPSCenter

Bases: secml.data.selection.c_prototypes_selector.CPrototypesSelector

Selection of Prototypes using center strategy.

Selects the prototypes from the center of the dataset.
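A plausible reading of the center strategy is the mirror image of the border one: rank samples by their total distance to all other samples and keep those with the smallest totals. The following is a minimal sketch under that assumption; center_select is a hypothetical helper, Euclidean distance is assumed, and the code does not reproduce secml's implementation.

```python
import math


def center_select(points, n_prototypes):
    """Return indices of the n_prototypes most central samples."""
    # Total Euclidean distance from each sample to every other sample.
    totals = [sum(math.dist(p, q) for q in points) for p in points]
    # Smallest totals first: these samples lie closest to the bulk of the data.
    order = sorted(range(len(points)), key=lambda i: totals[i])
    return order[:n_prototypes]


points = [(0.0,), (1.0,), (2.0,), (4.0,), (10.0,)]
center_idx = center_select(points, 2)
print(center_idx)
```

The first index returned is the set median of the data; the outlier at 10.0 is the last sample this ranking would pick.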

References

Spillmann, Barbara, et al. “Transforming strings to vector spaces using prototype selection.” Structural, Syntactic, and Statistical Pattern Recognition. Springer Berlin Heidelberg, 2006. 287-296.

Attributes
class_type : 'center'

Defines class type.

Methods

copy(self)

Returns a shallow copy of current class.

create([class_item])

This method creates an instance of a class with given type.

deepcopy(self)

Returns a deep copy of current class.

get_class_from_type(class_type)

Return the class associated with input type.

get_params(self)

Returns the dictionary of class parameters.

get_state(self)

Returns the object state dictionary.

get_subclasses()

Get all the subclasses of the calling class.

list_class_types()

Lists the types of all available subclasses of the calling class.

load(path)

Loads object from file.

load_state(self, path)

Sets the object state from file.

save(self, path)

Save class object to file.

save_state(self, path)

Store the object state to file.

select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

set(self, param_name, param_value[, copy])

Set a parameter of the class.

set_params(self, params_dict[, copy])

Set all parameters passed as a dictionary {key: value}.

set_state(self, state_dict[, copy])

Sets the object state using input dictionary.

timed([msg])

Timer decorator.

select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

Parameters
dataset : CDataset

Dataset from which prototypes should be selected.

n_prototypes : int

Number of prototypes to be selected.

Returns
reduced_ds : CDataset

Dataset with selected prototypes.

CPSKMedians

class secml.data.selection.c_ps_kmedians.CPSKMedians

Bases: secml.data.selection.c_prototypes_selector.CPrototypesSelector

Selection of Prototypes using K-Medians strategy.

Runs k-means clustering to partition the dataset into clusters, then selects the set median of each cluster as a prototype.
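The two stages (cluster, then take each cluster's set median) can be sketched with a small standard-library k-means. This is an illustration under assumptions, not secml's code: k_medians_select and set_median are hypothetical helpers, Euclidean distance is assumed, centroids are initialised on randomly drawn samples, and the seed argument only loosely mirrors the random_state parameter.

```python
import math
import random


def set_median(points, members):
    """Index in `members` with the smallest total distance to the others."""
    return min(
        members,
        key=lambda i: sum(math.dist(points[i], points[j]) for j in members),
    )


def k_medians_select(points, n_prototypes, seed=None, max_iter=100):
    """Cluster with plain k-means, then return one set median per cluster."""
    rng = random.Random(seed)
    # Initialise the centroids on randomly drawn, distinct samples.
    centroids = [points[i] for i in rng.sample(range(len(points)), n_prototypes)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every sample to its nearest centroid.
        new_labels = [
            min(range(n_prototypes), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its cluster.
        for c in range(n_prototypes):
            cluster = [p for p, lab in zip(points, labels) if lab == c]
            if cluster:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(cluster) for x in zip(*cluster))
    # Selection step: the set median of each non-empty cluster is a prototype.
    selected = []
    for c in range(n_prototypes):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            selected.append(set_median(points, members))
    return selected


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
          (100.0, 0.0), (101.0, 0.0), (102.0, 0.0)]
kmed_idx = sorted(k_medians_select(points, 2, seed=0))
print(kmed_idx)
```

Unlike a centroid, a set median is always an actual sample, so the result can be expressed as indices into the original dataset. In this sketch a cluster that ends up empty contributes no prototype, so fewer than n_prototypes indices may be returned on harder data.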

References

Spillmann, Barbara, et al. “Transforming strings to vector spaces using prototype selection.” Structural, Syntactic, and Statistical Pattern Recognition. Springer Berlin Heidelberg, 2006. 287-296.

Attributes
class_type : 'k-medians'

Defines class type.

Methods

copy(self)

Returns a shallow copy of current class.

create([class_item])

This method creates an instance of a class with given type.

deepcopy(self)

Returns a deep copy of current class.

get_class_from_type(class_type)

Return the class associated with input type.

get_params(self)

Returns the dictionary of class parameters.

get_state(self)

Returns the object state dictionary.

get_subclasses()

Get all the subclasses of the calling class.

list_class_types()

Lists the types of all available subclasses of the calling class.

load(path)

Loads object from file.

load_state(self, path)

Sets the object state from file.

save(self, path)

Save class object to file.

save_state(self, path)

Store the object state to file.

select(self, dataset, n_prototypes[, …])

Selects the prototypes from input dataset.

set(self, param_name, param_value[, copy])

Set a parameter of the class.

set_params(self, params_dict[, copy])

Set all parameters passed as a dictionary {key: value}.

set_state(self, state_dict[, copy])

Sets the object state using input dictionary.

timed([msg])

Timer decorator.

select(self, dataset, n_prototypes, random_state=None)

Selects the prototypes from input dataset.

Parameters
dataset : CDataset

Dataset from which prototypes should be selected.

n_prototypes : int

Number of prototypes to be selected.

random_state : int, RandomState or None, optional

Determines random number generation for centroid initialization. Default None.

Returns
reduced_ds : CDataset

Dataset with selected prototypes.

CPSRandom

class secml.data.selection.c_ps_random.CPSRandom

Bases: secml.data.selection.c_prototypes_selector.CPrototypesSelector

Selection of Prototypes using random strategy.

Attributes
class_type : 'random'

Defines class type.

Methods

copy(self)

Returns a shallow copy of current class.

create([class_item])

This method creates an instance of a class with given type.

deepcopy(self)

Returns a deep copy of current class.

get_class_from_type(class_type)

Return the class associated with input type.

get_params(self)

Returns the dictionary of class parameters.

get_state(self)

Returns the object state dictionary.

get_subclasses()

Get all the subclasses of the calling class.

list_class_types()

Lists the types of all available subclasses of the calling class.

load(path)

Loads object from file.

load_state(self, path)

Sets the object state from file.

save(self, path)

Save class object to file.

save_state(self, path)

Store the object state to file.

select(self, dataset, n_prototypes[, …])

Selects the prototypes from input dataset.

set(self, param_name, param_value[, copy])

Set a parameter of the class.

set_params(self, params_dict[, copy])

Set all parameters passed as a dictionary {key: value}.

set_state(self, state_dict[, copy])

Sets the object state using input dictionary.

timed([msg])

Timer decorator.

select(self, dataset, n_prototypes, random_state=None)

Selects the prototypes from input dataset.

Parameters
dataset : CDataset

Dataset from which prototypes should be selected.

n_prototypes : int

Number of prototypes to be selected.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

Returns
reduced_ds : CDataset

Dataset with selected prototypes.
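The three accepted forms of random_state (seed, generator instance, or None) can be illustrated with a small sketch. It is written under assumptions: random_select is a hypothetical helper, and the standard library's random.Random stands in for NumPy's RandomState, so this is an analogy to the documented behaviour rather than secml's code.

```python
import random


def random_select(n_samples, n_prototypes, random_state=None):
    """Draw n_prototypes distinct sample indices uniformly at random."""
    if isinstance(random_state, random.Random):
        rng = random_state  # use the caller's generator as-is
    elif random_state is None:
        rng = random  # fall back to the module-level generator
    else:
        rng = random.Random(random_state)  # treat the value as a seed
    return rng.sample(range(n_samples), n_prototypes)


first = random_select(10, 3, random_state=42)
second = random_select(10, 3, random_state=42)
print(first == second)
```

Passing the same integer seed reproduces the same selection, which is useful when comparing selection strategies across runs; passing None gives a different draw each time unless the global generator has been seeded.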

CPSSpanning

class secml.data.selection.c_ps_spanning.CPSSpanning

Bases: secml.data.selection.c_prototypes_selector.CPrototypesSelector

Selection of Prototypes using spanning strategy.

Selects the first prototype as the dataset median, then selects the remaining ones iteratively, each time choosing the sample that maximizes the distance to the set of previously selected prototypes.
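The iterative procedure can be sketched as follows. Two points are assumptions rather than facts from this page: the distance from a sample to the set of selected prototypes is taken to be the distance to its nearest selected prototype, and Euclidean distance is used. spanning_select is a hypothetical helper and not secml's implementation.

```python
import math


def spanning_select(points, n_prototypes):
    """Pick the set median first, then repeatedly the sample farthest from the picks."""
    if not 0 < n_prototypes <= len(points):
        raise ValueError("n_prototypes must be in [1, len(points)]")
    indices = range(len(points))
    # First prototype: the sample with the smallest total distance to all others.
    first = min(indices, key=lambda i: sum(math.dist(points[i], q) for q in points))
    selected = [first]
    while len(selected) < n_prototypes:
        remaining = [i for i in indices if i not in selected]
        # Distance to the selected set = distance to its nearest member (assumed).
        nxt = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[s]) for s in selected),
        )
        selected.append(nxt)
    return selected


points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
spanning_idx = spanning_select(points, 3)
print(spanning_idx)
```

Because each new prototype is chosen to be far from all earlier ones, the selection spreads across the data instead of concentrating in one dense region.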

References

Spillmann, Barbara, et al. “Transforming strings to vector spaces using prototype selection.” Structural, Syntactic, and Statistical Pattern Recognition. Springer Berlin Heidelberg, 2006. 287-296.

Attributes
class_type : 'spanning'

Defines class type.

Methods

copy(self)

Returns a shallow copy of current class.

create([class_item])

This method creates an instance of a class with given type.

deepcopy(self)

Returns a deep copy of current class.

get_class_from_type(class_type)

Return the class associated with input type.

get_params(self)

Returns the dictionary of class parameters.

get_state(self)

Returns the object state dictionary.

get_subclasses()

Get all the subclasses of the calling class.

list_class_types()

Lists the types of all available subclasses of the calling class.

load(path)

Loads object from file.

load_state(self, path)

Sets the object state from file.

save(self, path)

Save class object to file.

save_state(self, path)

Store the object state to file.

select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

set(self, param_name, param_value[, copy])

Set a parameter of the class.

set_params(self, params_dict[, copy])

Set all parameters passed as a dictionary {key: value}.

set_state(self, state_dict[, copy])

Sets the object state using input dictionary.

timed([msg])

Timer decorator.

select(self, dataset, n_prototypes)

Selects the prototypes from input dataset.

Parameters
dataset : CDataset

Dataset from which prototypes should be selected.

n_prototypes : int

Number of prototypes to be selected.

Returns
reduced_ds : CDataset

Dataset with selected prototypes.