technics list

technics list#

List of feature engineering techniques


Missing Data Imputation:

    Complete case analysis

    Mean / Median / Mode imputation

    Random Sample Imputation

    Replacement by Arbitrary Value

    Missing Value Indicator

    Multivariate imputation

Categorical Encoding:

    One hot encoding

    Count and Frequency encoding

    Target encoding / Mean encoding

    Ordinal encoding

    Weight of Evidence

    Rare label encoding

    BaseN, feature hashing and others

Variable Transformation:

    Logarithm

    Reciprocal

    Square root

    Exponential

    Yeo-Johnson

    Box-Cox

Discretisation:

    Equal frequency discretisation

    Equal length discretisation

    Discretisation with trees

    Discretisation with ChiMerge

Outlier Removal:

    Removing outliers

    Treating outliers as NaN

    Capping, Windsorisation

Feature Scaling:

    Standardisation

    MinMax Scaling

    Mean Scaling

    Max Absolute Scaling

    Unit norm-Scaling

Date and Time Engineering:

    Extracting days, months, years, quarters, time elapsed

Feature Creation:

    Sum, subtraction, mean, min, max, product, quotient of group of features

Aggregating Transaction Data:

    Same as above but in same feature over time window

Extracting features from text:

    Bag of words

    tfidf

    n-grams

    word2vec

    topic extraction

And finally extracting features from images.