Feature Selection

Feature selection chooses the set of variables (features) that appear in the data records used for model training, testing, and prediction.

Factors

  • relevance: features should be correlated with the predictive objectives of the model

  • redundancy: overlap between features that carry the same information should be minimized

  • dimensionality: the number of features should be kept small enough to be manageable

  • availability: features should be available in a sufficient number of model training records

  • accuracy: features should provide data that is as accurate as possible
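
The factors above can be sketched as a simple filter. This is a minimal illustration, not a standard algorithm: the function names (pearson, select_features), the thresholds, and the sample records are all hypothetical, and absolute Pearson correlation is assumed as a stand-in for both relevance and redundancy. It also assumes numeric features, missing values stored as None, and the same keys in every record. Accuracy is not checked, because it cannot be judged from the records alone.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n < 2:
        return 0.0  # too few points to measure a relationship
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def paired(records, a, b):
    """Values of columns a and b from records where both are present."""
    rows = [r for r in records if r.get(a) is not None and r.get(b) is not None]
    return [r[a] for r in rows], [r[b] for r in rows]


def select_features(records, target, min_coverage=0.8, min_relevance=0.3,
                    max_redundancy=0.9, max_features=10):
    """Filter features by availability, relevance, redundancy, and count.

    All thresholds are illustrative assumptions, not recommended values.
    """
    scored = []
    for name in records[0]:
        if name == target:
            continue
        xs, ys = paired(records, name, target)
        # availability: enough records must carry a value for this feature
        if len(xs) / len(records) < min_coverage:
            continue
        # relevance: the feature must correlate with the target
        score = abs(pearson(xs, ys))
        if score >= min_relevance:
            scored.append((score, name))

    scored.sort(reverse=True)  # most relevant candidates first
    selected = []
    for _, name in scored:
        # dimensionality: stop once the feature budget is used up
        if len(selected) >= max_features:
            break
        # redundancy: skip features that nearly duplicate a kept feature
        if any(abs(pearson(*paired(records, name, kept))) > max_redundancy
               for kept in selected):
            continue
        selected.append(name)
    return selected


# Hypothetical sample data: size_x10 duplicates size, noise is unrelated
# to price, and rare is missing in most records.
records = [
    {"size": 1, "size_x10": 10, "age": 9, "noise": 3, "rare": None, "price": 2},
    {"size": 2, "size_x10": 20, "age": 3, "noise": 1, "rare": None, "price": 4},
    {"size": 3, "size_x10": 30, "age": 6, "noise": 4, "rare": 1, "price": 5},
    {"size": 4, "size_x10": 40, "age": 2, "noise": 1, "rare": None, "price": 4},
    {"size": 5, "size_x10": 50, "age": 7, "noise": 5, "rare": 2, "price": 5},
    {"size": 6, "size_x10": 60, "age": 1, "noise": 2, "rare": None, "price": 7},
]
selected = select_features(records, target="price")
print(selected)
```

With this sample data, only one of size and size_x10 survives the redundancy check, noise fails the relevance check, and rare fails the availability check. In practice the correlation measure and thresholds would be chosen to suit the model and the data types involved.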

Processes

Generalized Feature Selection processes can be expressed as follows:

Notes:

References