Module feast

The FEAST module provides a Python interface to the FEAST C library
for information-theoretic feature selection.

References: 
1) G. Brown, A. Pocock, M.-J. Zhao, and M. Lujan, "Conditional
    likelihood maximization: A unifying framework for information
    theoretic feature selection," Journal of Machine Learning 
    Research, vol. 13, pp. 27-66, 2012.


Version: 0.2.0

Author: Calvin Morrison

Copyright: Copyright 2013, EESI Laboratory

License: GPL
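Every criterion below is built from the discrete mutual information between features and labels. As background, here is a minimal pure-Python sketch of that underlying quantity; it is illustrative only and is not code from this module, which delegates the computation to the C implementation in libFSToolbox:

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Discrete mutual information I(X;Y) in bits between two
    equal-length sequences of hashable values."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

labels = [0, 0, 1, 1]
print(mutual_information([0, 0, 1, 1], labels))  # identical to labels: 1.0 bit
print(mutual_information([5, 5, 5, 5], labels))  # constant feature: 0.0 bits
```

A feature that determines the labels attains the label entropy H(Y), while a feature independent of the labels scores zero; this is the ranking signal MIM uses directly and that the other criteria extend with redundancy terms.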

Functions

list
BetaGamma(data, labels, n_select, beta=1.0, gamma=1.0)
Implements conditional mutual information feature selection; beta and gamma control the weights attached to the redundant mutual information and the conditional mutual information, respectively.
list
CIFE(data, labels, n_select)
Implements the conditional infomax feature extraction (CIFE) algorithm.
list
CMIM(data, labels, n_select)
Implements the conditional mutual information maximization feature selection algorithm.
list
CondMI(data, labels, n_select)
Implements the conditional mutual information (CondMI) feature selection algorithm.
list
Condred(data, labels, n_select)
Implements the Condred feature selection algorithm.
list
DISR(data, labels, n_select)
Implements the double input symmetrical relevance feature selection algorithm.
list
ICAP(data, labels, n_select)
Implements the interaction capping feature selection algorithm.
list
JMI(data, labels, n_select)
Implements the joint mutual information feature selection algorithm.
list
MIFS(data, labels, n_select)
Implements the mutual information feature selection (MIFS) algorithm.
list
MIM(data, labels, n_select)
Implements the mutual information maximization (MIM) algorithm.
list
mRMR(data, labels, n_select)
Implements the max-relevance min-redundancy (mRMR) feature selection algorithm.
tuple
check_data(data, labels)
Check the dimensions of the data and the labels.
Variables
  __credits__ = ['Calvin Morrison', 'Gregory Ditzler']
  __maintainer__ = 'Calvin Morrison'
  __email__ = 'mutantturkey@gmail.com'
  __status__ = 'Release'
  libFSToolbox = <CDLL 'libFSToolbox.so', handle 2be1240 at 2b4b...
  __package__ = None
Function Details

BetaGamma(data, labels, n_select, beta=1.0, gamma=1.0)

This algorithm implements conditional mutual information feature selection; beta and gamma control the weights attached to the redundant mutual information and the conditional mutual information, respectively.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features (REQUIRED)
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations. (REQUIRED)
  • n_select (integer) - number of features to select. (REQUIRED)
  • beta (float between 0 and 1.0) - penalty attached to the redundant mutual information I(X_j;X_k)
  • gamma (float between 0 and 1.0) - positive weight attached to the conditional redundancy term I(X_k;X_j|Y)
Returns: list
features in the order they were selected.
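To make the roles of beta and gamma concrete, here is a toy pure-Python sketch of the greedy scoring rule J(X_k) = I(X_k;Y) - beta * sum_j I(X_k;X_j) + gamma * sum_j I(X_k;X_j|Y), where j ranges over the features already selected. This is an illustrative reimplementation for small discrete data, not the module's actual code path, which calls libFSToolbox:

```python
from collections import Counter
from math import log2

def mi(x, y):
    """Discrete mutual information I(X;Y) in bits."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def cond_mi(x, y, z):
    """Conditional mutual information I(X;Y|Z) = sum_z p(z) I(X;Y | Z=z)."""
    n = len(z)
    total = 0.0
    for zv, cz in Counter(z).items():
        idx = [i for i in range(n) if z[i] == zv]
        total += (cz / n) * mi([x[i] for i in idx], [y[i] for i in idx])
    return total

def beta_gamma(data, labels, n_select, beta=1.0, gamma=1.0):
    """Greedily select n_select feature indices by the score
    J(k) = I(X_k;Y) - beta*sum_j I(X_k;X_j) + gamma*sum_j I(X_k;X_j|Y),
    where j ranges over the features selected so far."""
    cols = list(map(list, zip(*data)))       # column-major view of the data
    selected = []
    for _ in range(n_select):
        best, best_score = None, float("-inf")
        for k in range(len(cols)):
            if k in selected:
                continue
            score = (mi(cols[k], labels)
                     - beta * sum(mi(cols[k], cols[j]) for j in selected)
                     + gamma * sum(cond_mi(cols[k], cols[j], labels)
                                   for j in selected))
            if score > best_score:
                best, best_score = k, score
        selected.append(best)
    return selected

# Feature 1 duplicates feature 0; feature 2 is irrelevant but non-redundant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = list(zip([0, 0, 0, 1, 1, 1, 1, 1],   # feature 0
                [0, 0, 0, 1, 1, 1, 1, 1],   # feature 1 (duplicate)
                [0, 0, 1, 1, 0, 0, 1, 1]))  # feature 2
print(beta_gamma(data, labels, 2, beta=0.0, gamma=0.0))  # [0, 1]: pure relevance
print(beta_gamma(data, labels, 2, beta=1.0, gamma=0.0))  # [0, 2]: duplicate penalized
```

With beta = 0 the duplicate feature is picked second on relevance alone; raising beta penalizes its redundancy with the first pick and the independent feature wins instead.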

CIFE(data, labels, n_select)

This function implements the conditional infomax feature extraction (CIFE) algorithm, i.e. BetaGamma with beta = 1 and gamma = 1.

Parameters:
  • data (ndarray) - A Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select.
Returns: list
the features in the order they were selected.

CMIM(data, labels, n_select)

This function implements the conditional mutual information maximization feature selection algorithm. Note that this implementation does not allow the redundancy terms to be weighted, as BetaGamma does.

Parameters:
  • data (ndarray) - A Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations as the number of elements. That is len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select.
Returns: list
features in the order that they were selected.

CondMI(data, labels, n_select)

This function implements the conditional mutual information (CondMI) feature selection algorithm.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select.
Returns: list
features in the order they were selected.

Condred(data, labels, n_select)

This function implements the Condred feature selection algorithm, i.e. BetaGamma with beta = 0 and gamma = 1.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select.
Returns: list
the features in the order they were selected.

DISR(data, labels, n_select)

This function implements the double input symmetrical relevance feature selection algorithm.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select. (REQUIRED)
Returns: list
the features in the order they were selected.

ICAP(data, labels, n_select)

This function implements the interaction capping feature selection algorithm.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select. (REQUIRED)
Returns: list
the features in the order they were selected.

JMI(data, labels, n_select)

This function implements the joint mutual information feature selection algorithm.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select. (REQUIRED)
Returns: list
the features in the order they were selected.

MIFS(data, labels, n_select)

This function implements the mutual information feature selection (MIFS) algorithm, i.e. BetaGamma with beta = 1 and gamma = 0.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select. (REQUIRED)
Returns: list
the features in the order they were selected.

MIM(data, labels, n_select)

This function implements the mutual information maximization (MIM) algorithm, i.e. BetaGamma with beta = 0 and gamma = 0.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select. (REQUIRED)
Returns: list
the features in the order they were selected.

mRMR(data, labels, n_select)

This function implements the max-relevance min-redundancy (mRMR) feature selection algorithm.

Parameters:
  • data (ndarray) - data in a Numpy array such that len(data) = n_observations, and len(data.transpose()) = n_features
  • labels (ndarray) - labels represented in a numpy array with n_observations elements; that is, len(labels) = len(data) = n_observations.
  • n_select (integer) - number of features to select. (REQUIRED)
Returns: list
the features in the order they were selected.
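The mRMR criterion differs from a fixed beta by averaging the redundancy over the selected set: J(X_k) = I(X_k;Y) - (1/|S|) * sum over j in S of I(X_k;X_j). Here is a toy pure-Python sketch of that score on discrete data; it is illustrative only, not the module's C-backed implementation:

```python
from collections import Counter
from math import log2

def mi(x, y):
    """Discrete mutual information I(X;Y) in bits."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def mrmr_score(candidate, selected_cols, labels):
    """mRMR score of one candidate column: relevance I(X_k;Y) minus the
    average redundancy with the already-selected feature columns."""
    relevance = mi(candidate, labels)
    if not selected_cols:
        return relevance
    redundancy = sum(mi(candidate, s) for s in selected_cols) / len(selected_cols)
    return relevance - redundancy

labels = [0, 0, 0, 0, 1, 1, 1, 1]
f0  = [0, 0, 0, 1, 1, 1, 1, 1]   # relevant feature, assumed already selected
dup = list(f0)                   # exact duplicate of f0
f2  = [0, 0, 1, 1, 0, 0, 1, 1]   # irrelevant but not redundant
# Once f0 is selected, the duplicate scores below the non-redundant feature:
print(mrmr_score(dup, [f0], labels) < mrmr_score(f2, [f0], labels))  # True
```

Averaging keeps the redundancy penalty on the same scale as the relevance term no matter how many features have been selected, which is the main practical difference from BetaGamma with a fixed beta.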

check_data(data, labels)

Check the dimensions of the data and the labels. Raises an exception if there is a problem.

Data and labels are automatically cast as doubles before being passed to the feature selection functions.

Parameters:
  • data (ndarray) - data array with n_observations rows and n_features columns
  • labels (ndarray) - label array with n_observations elements
Returns: tuple
the data and labels, cast as doubles.
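A hypothetical sketch of this kind of validation is shown below; the shape checks and error messages are illustrative assumptions, not the module's actual implementation:

```python
import numpy as np

def check_data(data, labels):
    """Illustrative validation: verify that data and labels agree on
    n_observations, then cast both to doubles as described above."""
    data = np.asarray(data, dtype=np.float64)      # cast as doubles
    labels = np.asarray(labels, dtype=np.float64)
    if data.ndim != 2:
        raise ValueError("data must be 2-D (n_observations x n_features)")
    if labels.ndim != 1:
        raise ValueError("labels must be 1-D (n_observations,)")
    if len(data) != len(labels):
        raise ValueError("len(data) must equal len(labels)")
    return data, labels
```

The returned (data, labels) tuple is what the selection routines would then consume.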

Variables Details

libFSToolbox

Value:
<CDLL 'libFSToolbox.so', handle 2be1240 at 2b4bc10>