Feature
Tools for handling features and feature labels during classification, data preparation and evaluation
-
best.feature.augment_features(x, feature_names=None, feature_indexes=[], operation=None, mutual=False, operation_str='')
Augments features by applying the given operation. Mutual operations combine two features (e.g. *, /, +, -); non-mutual operations transform a single feature (e.g. log, exp, power).
- Parameters
x (numpy ndarray) – shape[n_samples, n_features]
feature_names (list or numpy array of strings, optional) – names of features
feature_indexes (list or numpy array) – indexes of features which will be augmented
operation (function) – callable which will be applied to the existing features.
mutual (bool) – indicates whether the operation takes a single feature (e.g. np.log10) or two features (e.g. np.divide). If mutual = True, the operation is applied to all combinations of the features specified in feature_indexes.
- Returns
- Return type
numpy ndarray -> shape[n_samples, n_features]
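The difference between mutual and non-mutual operations can be illustrated with a minimal NumPy sketch. This is not the library's code: `augment_sketch` is a hypothetical stand-in, and it assumes that "all feature combinations" means unordered pairs, that new columns are appended after the existing ones, and that the new names are built from `operation_str`. The feature names are made up for the example.

```python
from itertools import combinations

import numpy as np


def augment_sketch(x, feature_names, feature_indexes, operation,
                   mutual=False, operation_str=""):
    """Hypothetical illustration of feature augmentation (not the best API)."""
    x = np.asarray(x, dtype=float)
    new_cols, new_names = [], []
    if mutual:
        # Assumption: one new column per unordered pair of selected features.
        for i, j in combinations(feature_indexes, 2):
            new_cols.append(operation(x[:, i], x[:, j]))
            new_names.append(f"{feature_names[i]}{operation_str}{feature_names[j]}")
    else:
        # One new column per selected feature.
        for i in feature_indexes:
            new_cols.append(operation(x[:, i]))
            new_names.append(f"{operation_str}({feature_names[i]})")
    if not new_cols:
        return x, list(feature_names)
    # column_stack turns each 1-D result into a new column next to x.
    return np.column_stack([x] + new_cols), list(feature_names) + new_names


x = np.array([[1.0, 10.0, 100.0],
              [2.0, 20.0, 200.0]])
names = ["delta", "theta", "alpha"]  # made-up feature names

# Mutual: ratios between every pair of the three features.
x_ratio, names_ratio = augment_sketch(x, names, [0, 1, 2], np.divide,
                                      mutual=True, operation_str="/")

# Non-mutual: log10 of a single feature.
x_log, names_log = augment_sketch(x, names, [1], np.log10,
                                  operation_str="log10")
```

With three selected features, the mutual call adds three columns (one per pair), while the non-mutual call adds one column per selected feature.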
-
best.feature.balance_classes(x, y, std_factor=0.0)
Balances unbalanced classes in a dataset by extending the sample array with duplicated samples, optionally with added noise. Detects the classes from y and counts the samples per category, then duplicates samples from the categories with fewer samples. std_factor sets the level of noise added to the duplicated samples, relative to the standard deviation of each dimension within the given category.
- Parameters
x (numpy ndarray) – shape[n_samples, n_features]
y (list or numpy array) – string or int indexes for each category
std_factor (float) – amount of noise added to duplicated samples, relative to the standard deviation of a given feature within a category.
- Returns
numpy ndarray – x - samples
list – y - categories
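The oversampling described above could look roughly like the following sketch. It is a hypothetical re-implementation, not the library's code: it assumes every class is padded up to the size of the largest class, that duplicates are drawn at random with replacement, and that the noise is Gaussian. The `seed` parameter and the label names are additions made only to keep the example reproducible.

```python
import numpy as np


def balance_sketch(x, y, std_factor=0.0, seed=0):
    """Hypothetical illustration of class balancing by duplication."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()  # assumption: pad every class to the largest one
    xs, ys = [x], [y]
    for cls, count in zip(classes, counts):
        n_extra = target - count
        if n_extra == 0:
            continue
        members = x[y == cls]
        # Draw duplicates at random, with replacement, from this class.
        picks = members[rng.integers(0, count, size=n_extra)]
        # Noise scaled by the per-feature std within this class.
        noise = rng.normal(size=picks.shape) * members.std(axis=0) * std_factor
        xs.append(picks + noise)
        ys.append(np.full(n_extra, cls))
    return np.vstack(xs), np.concatenate(ys).tolist()


x = np.arange(12, dtype=float).reshape(6, 2)
y = ["N2"] * 4 + ["REM"] * 2  # made-up labels, 4 vs. 2 samples

x_bal, y_bal = balance_sketch(x, y, std_factor=0.0)
```

With `std_factor=0.0` the added rows are exact copies of existing minority-class rows; a positive value perturbs each copy in proportion to the spread of that feature within the class.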
-
best.feature.find_category_outliers(x, y=None)
Finds outliers for each category within the data. See https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.LocalOutlierFactor.html
- Parameters
x (numpy ndarray) – shape[n_samples, n_features]
y (list or numpy array) – string or int indexes for each category
- Returns
position index list with detected outliers
- Return type
list
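Since the description points to scikit-learn's `LocalOutlierFactor`, per-category outlier detection can be sketched as below. This is an assumption about how the pieces fit together, not the library's implementation: `category_outliers_sketch`, its `n_neighbors` parameter, and the decision to skip categories that are too small are all hypothetical.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor


def category_outliers_sketch(x, y=None, n_neighbors=5):
    """Hypothetical illustration: run LocalOutlierFactor per category."""
    x = np.asarray(x, dtype=float)
    # Without labels, treat the whole dataset as a single category.
    y = np.zeros(len(x), dtype=int) if y is None else np.asarray(y)
    outliers = []
    for cls in np.unique(y):
        idx = np.flatnonzero(y == cls)
        if len(idx) <= n_neighbors:
            continue  # assumption: too few samples to judge this category
        # fit_predict returns -1 for outliers and 1 for inliers.
        flags = LocalOutlierFactor(n_neighbors=n_neighbors).fit_predict(x[idx])
        outliers.extend(idx[flags == -1].tolist())
    return sorted(outliers)


# Two regular 5x5 grids (one per category) plus one far-away "N2" sample.
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
x = np.vstack([grid, [[50.0, 50.0]], grid + 100.0])
y = ["N2"] * 26 + ["REM"] * 25  # made-up labels

found = category_outliers_sketch(x, y)  # index 25 is the far-away sample
```

Fitting one detector per category means a sample is judged only against samples with the same label, so a point that is typical for one class is not flagged merely because it lies far from another class.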
-
best.feature.get_classification_scores(Y, YY, labels=None)
Returns a classification report. All values are already formatted as strings.
-
best.feature.print_classification_scores(Y, YY, N_merge=False)
Prints a classification report for sleep scoring labels.
-
best.feature.remove_features(x, feature_names=None, to_del=None)
Removes features.
- Parameters
x (numpy ndarray) – shape[n_samples, n_features]
feature_names (list or numpy array, optional) – names of features
to_del –
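Because `to_del` is not described here, the following sketch has to guess at its meaning: it assumes `to_del` holds integer column indexes and that the feature names are filtered in step with the columns. `remove_features_sketch` is a hypothetical stand-in built on `np.delete`, not the library's code.

```python
import numpy as np


def remove_features_sketch(x, feature_names=None, to_del=None):
    """Hypothetical illustration: drop columns and their names together."""
    x = np.asarray(x)
    # Assumption: to_del is a list of integer column indexes.
    to_del = np.asarray([] if to_del is None else to_del, dtype=int)
    x_kept = np.delete(x, to_del, axis=1)
    if feature_names is None:
        return x_kept, None
    names_kept = np.delete(np.asarray(feature_names), to_del).tolist()
    return x_kept, names_kept


x = np.arange(8).reshape(2, 4)
names = ["a", "b", "c", "d"]  # made-up feature names

x_kept, names_kept = remove_features_sketch(x, names, to_del=[1, 3])
```

Deleting from the data and the name list with the same index array keeps the two aligned, which is the main risk when features are removed by hand.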
-
best.feature.remove_samples(x, y=None, to_del=None)
Removes samples.
- Parameters
x (numpy ndarray / list / pd.DataFrame) – shape[n_samples, n_features]
y (list or numpy array, optional) – category reference for each sample
to_del –
-
best.feature.replace_annotations(Y, old_key=None, new_key=None)
Replaces annotation names in a numpy array or list.
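A minimal sketch of label replacement, assuming that every occurrence of `old_key` is swapped for `new_key` and that the input type (list or numpy array) is preserved. `replace_annotations_sketch` and the label names are hypothetical and used only for illustration.

```python
import numpy as np


def replace_annotations_sketch(Y, old_key=None, new_key=None):
    """Hypothetical illustration: rename one annotation label."""
    replaced = [new_key if label == old_key else label for label in Y]
    # Rebuild the array so its string width fits the new label.
    return np.array(replaced) if isinstance(Y, np.ndarray) else replaced


labels_list = ["AWAKE", "N2", "AWAKE", "REM"]  # made-up labels
labels_array = np.array(["N2", "N3", "N2"])

renamed_list = replace_annotations_sketch(labels_list, "AWAKE", "W")
renamed_array = replace_annotations_sketch(labels_array, "N2", "NREM2")
```

Rebuilding the array, rather than assigning into it in place, avoids silent truncation: a fixed-width NumPy string array created from short labels would cut a longer replacement label down to the original width.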
-
best.feature.zscore(x)
Calculates Z-score.
- Parameters
x (numpy ndarray) – shape[n_samples, n_features]
- Returns
normalized_features
- Return type
numpy ndarray
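For reference, a Z-score normalization of a feature matrix can be sketched as follows. It assumes that normalization is done per feature (column-wise) and uses the population standard deviation; whether the library makes the same choices is not stated here. `zscore_sketch` is a hypothetical stand-in.

```python
import numpy as np


def zscore_sketch(x):
    """Hypothetical illustration: per-feature Z-score, z = (x - mean) / std."""
    x = np.asarray(x, dtype=float)
    # Assumption: statistics are computed over samples (axis 0), ddof=0.
    return (x - x.mean(axis=0)) / x.std(axis=0)


x = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])

z = zscore_sketch(x)
```

After normalization each column has zero mean and unit standard deviation, so features measured on different scales become comparable. A constant column has zero standard deviation and would produce a division by zero in this sketch.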