Training Amorphous Models

Once you’ve created a data source, you can train a model. In the navigation, click “Models”, and then “Train Model”. Enter a descriptive name for your model, then select the data source you just created and the model type. Then click “Train”. For more details on the various model types, read on.

Model Overview

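Each trained model is listed in the model overview section, along with its Root Mean Square Error (RMSE). RMSE is the square root of the average squared difference between the predicted values and the actual values, so it is expressed in the same units as the value being predicted and penalizes large errors more heavily than small ones. Lower RMSE values indicate a better fit to the data.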
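To make the metric concrete, here is a minimal sketch of how RMSE can be computed by hand. The function name rmse and the sample numbers are illustrative assumptions, not part of Amorphous; the product computes this value for you.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    # Square each difference so that positive and negative errors do not cancel.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    # Average the squared errors, then take the square root to return
    # to the units of the original values.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up sample values, for illustration only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 7.0]
print(rmse(actual, predicted))
```

In this example the single miss of 2.0 contributes four times as much to the squared total as the miss of 1.0, which is why a few large errors raise RMSE noticeably.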

Model Types

Amorphous supports gradient-boosted decision trees for analyzing your data. Gradient-boosted decision trees are one of the most common techniques used by the machine learning community for regression and classification on tabular data, and they work well for a broad range of data. Amorphous uses several different decision tree implementations (including XGBoost and XGM) and automatically chooses the model with the lowest error rate.
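The selection step can be pictured as scoring every candidate on the same held-out data and keeping the one with the lowest error. The sketch below is an illustration only: the candidates are simple stand-in functions rather than real tree models, and the names pick_best, model_a, and model_b are hypothetical. It does not describe how Amorphous implements the comparison internally.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pick_best(candidates, features, targets):
    """Score each candidate on held-out data and return the best name and all scores."""
    scores = {
        name: rmse(targets, [predict(x) for x in features])
        for name, predict in candidates.items()
    }
    # The candidate with the lowest error wins.
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Hypothetical stand-ins for trained models: each maps one input to a prediction.
candidates = {
    "model_a": lambda x: 2.0 * x,
    "model_b": lambda x: 2.0 * x + 1.0,
}

# Made-up held-out data, for illustration only.
features = [1.0, 2.0, 3.0]
targets = [2.1, 3.9, 6.2]

best_name, scores = pick_best(candidates, features, targets)
print(best_name, scores)
```

Scoring every candidate against the same held-out rows keeps the comparison fair, since each error figure then reflects the same data.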

Encoding

Machine learning models operate on floating point numbers (e.g., 1.38172, -381.2831). Amorphous automatically encodes your data into floating point numbers so that training and analysis can work. Amorphous uses the following rules for encoding data:

Integers and floating point numbers pass through unchanged. Examples that fall into this category are age and lead score.
Boolean values (True/False) also pass through unchanged.
Categorical values that are encoded as strings are “one-hot encoded”. For example, a column Region that may have the values N. America, LATAM, EMEA, and APAC is translated into four columns, Region_N. America, Region_LATAM, Region_EMEA, and Region_APAC, with a true/false value filled in automatically for each: true in the column that matches the row’s region and false in the others.
Dates and times are translated into month, day of the year (e.g., day 5, day 200), and hour. This allows Amorphous to identify patterns tied to a particular time of year or time of day.
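The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Amorphous’s actual encoder: the function encode_row, the column names Is Customer and Created, and the suffixes _month, _day_of_year, and _hour are all hypothetical, and the sketch represents true/false as 1.0/0.0 so that every output is a floating point number.

```python
from datetime import datetime


def encode_row(row, categories):
    """Encode one record (a dict) into a flat dict of floating point numbers.

    `categories` maps each string column to the list of values it may take.
    """
    encoded = {}
    for column, value in row.items():
        # Check bool before int: in Python, bool is a subclass of int.
        if isinstance(value, bool):
            encoded[column] = float(value)  # assumption: True -> 1.0, False -> 0.0
        elif isinstance(value, (int, float)):
            encoded[column] = float(value)  # numbers pass through
        elif isinstance(value, datetime):
            # Split a timestamp into month, day of the year, and hour.
            encoded[f"{column}_month"] = float(value.month)
            encoded[f"{column}_day_of_year"] = float(value.timetuple().tm_yday)
            encoded[f"{column}_hour"] = float(value.hour)
        elif isinstance(value, str):
            # One output column per possible category value.
            for category in categories[column]:
                encoded[f"{column}_{category}"] = float(value == category)
        else:
            raise TypeError(f"unsupported type for column {column!r}")
    return encoded


# Made-up example record, for illustration only.
categories = {"Region": ["N. America", "LATAM", "EMEA", "APAC"]}
row = {
    "Age": 42,
    "Is Customer": True,
    "Region": "EMEA",
    "Created": datetime(2023, 1, 5, 14, 30),
}

encoded = encode_row(row, categories)
for name, number in encoded.items():
    print(name, number)
```

Here the single Region value expands into four columns with only Region_EMEA set, and the timestamp becomes three numeric columns, so every value handed to the model is a number.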