How to use AutoML-Matrix through the AWS SageMaker platform?

How to prepare the dataset for AutoML-Matrix

AutoML Matrix expects the data to be in CSV format (comma-separated) with no header row
Rows in the file represent data points, and columns represent the features / attributes
Feature values must be numeric (integer or floating point values)
The first column holds the class label and must contain only integer values
There is no need to split the dataset into train/validation/test sets; the algorithm does that automatically
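
The layout rules above can be sketched in Python. This is only an illustration: the function name write_automl_matrix_csv and the file name train.csv are hypothetical and not part of AutoML Matrix. It checks that labels are integers and features are numeric, then writes the label as the first column with no header row.

```python
import csv


def write_automl_matrix_csv(path, labels, features):
    """Write rows as: label, feature_1, ..., feature_n (no header row).

    Hypothetical helper for illustration; not part of AutoML Matrix.
    """
    if len(labels) != len(features):
        raise ValueError("labels and features must have the same number of rows")
    width = len(features[0]) if features else 0

    # Validate everything first so a bad row never leaves a partial file.
    for label, row in zip(labels, features):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(label, bool) or not isinstance(label, int):
            raise ValueError(f"class label must be an integer, got {label!r}")
        if len(row) != width:
            raise ValueError("every row must have the same number of features")
        for value in row:
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"feature must be numeric, got {value!r}")

    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for label, row in zip(labels, features):
            # Class label goes in the first column, features follow.
            writer.writerow([label, *row])


if __name__ == "__main__":
    write_automl_matrix_csv("train.csv", [0, 1], [[1.5, 2], [3, 4.25]])
```

The whole dataset goes into a single file, since the train/validation/test split is handled by the algorithm.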

How to train AutoML Matrix models from the AWS SageMaker Web Console

Below is a step-by-step description of how to use the AutoML Matrix algorithm on AWS:

Log into your AWS account
Go to AWS Marketplace and subscribe to the Deep Element AutoML Matrix algorithm
Create a bucket (folder) in AWS S3 storage and upload the data to that bucket
Open the web console page of AWS SageMaker
On the left pane, click the “Training Jobs” link
Click on the button “Create training job”
Enter a name for the training job
Under the “Algorithms Source” section, select “An algorithm subscription from AWS Marketplace”
Select “DeepElement AutoML-Matrix” algorithm
Select an “Instance Type” from the list of suggested instance types
Specify the value of “Maximum Runtime”.
The runtime of the algorithm depends on several factors, including the number of data samples, the number of features, and the complexity of the problem itself. A rough approximation of the Maximum Runtime based only on the number of data samples is:
- If #Samples < 100,000, then Maximum Runtime can be in range (2, 4) hrs
- If #Samples < 500,000, then Maximum Runtime can be in range (3, 6) hrs
- If #Samples < 1,000,000, then Maximum Runtime can be in range (6, 10) hrs
- If #Samples < 10,000,000, then Maximum Runtime can be in range (10, 20) hrs
Channels: The algorithm expects the training data location in the “train” channel.
- Input Mode: Select “File”
- Content Type: Type “text/csv”
- Compression Type: None
- Record Wrapper: None
- Data Source: Select “S3”
- S3 data type: S3Prefix
- S3 data distribution type: FullyReplicated
- S3 location: Specify the S3 location of the training data CSV file
- S3 output path: S3 location of the folder (or bucket) that contains the training data file
Click “Create training job”
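
The same settings can also be expressed programmatically. The sketch below is an assumption-laden illustration, not the vendor's documented method. It turns the rough runtime guide into a lookup and builds the request for the SageMaker CreateTrainingJob API (boto3 create_training_job) with the channel values listed above. The function names are hypothetical. The instance count of 1 and volume size of 30 GB are assumed defaults. The algorithm ARN and the IAM role ARN, which the API requires, must come from your own account.

```python
def suggest_max_runtime_hours(num_samples):
    """Return the (low, high) hour range from the rough guide, or None."""
    guide = [
        (100_000, (2, 4)),
        (500_000, (3, 6)),
        (1_000_000, (6, 10)),
        (10_000_000, (10, 20)),
    ]
    for limit, hours in guide:
        if num_samples < limit:
            return hours
    # The guide does not cover 10,000,000 samples or more.
    return None


def build_training_job_request(job_name, algorithm_arn, role_arn,
                               train_s3_uri, output_s3_uri,
                               instance_type, max_runtime_hours):
    """Build keyword arguments for create_training_job (hypothetical helper)."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "AlgorithmName": algorithm_arn,  # ARN of the subscribed algorithm
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "ContentType": "text/csv",
            "CompressionType": "None",
            "RecordWrapperType": "None",
            "InputMode": "File",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3DataDistributionType": "FullyReplicated",
                "S3Uri": train_s3_uri,
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,    # assumption, not stated in the steps
            "VolumeSizeInGB": 30,  # assumption, not stated in the steps
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": int(max_runtime_hours * 3600),
        },
    }


def submit_training_job(request):
    """Send the request; needs boto3, AWS credentials, and a region."""
    import boto3  # third-party package, imported only when submitting
    return boto3.client("sagemaker").create_training_job(**request)
```

For example, a dataset of 250,000 samples falls in the 3 to 6 hour band, so passing max_runtime_hours=6 sets MaxRuntimeInSeconds to 21600.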

How to deploy AutoML Matrix models from the AWS SageMaker Web Console

Below is a step-by-step description of how to deploy a trained AutoML Matrix model on AWS:

Once the training job has completed successfully, open the training jobs page
Click on “Create model package”
Specify the name of the model package
Click “Next”
Under the “Validation and Scanning” section:
- Select “No” to “Publish this model package on AWS Marketplace”
- Select “No” to “Validate this resource”
Click on “Create Model Package” - this should create a new Model Package
Click “Model Packages” on the left pane of the web page; this should show the list of Model Packages created so far
Select the model package you just created
Select “Create endpoint”
- Model Name: Specify a name for this new model
- Under “Container input options”, select “Use a model package resource”
- Click “Next”
- Endpoint Name: Specify a name for this endpoint
- Select “Create a new endpoint configuration”
- Endpoint configuration name: Specify a value
- Click “Create Endpoint Configuration”
- Click “Submit”
This should create a new endpoint that can be called to make predictions
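
Once the endpoint is in service, it can be called through the SageMaker runtime API. The following is a minimal sketch under stated assumptions. The text above does not describe the inference payload, so the example assumes the endpoint accepts "text/csv" rows that contain only the feature columns (no class label) and returns text. The function names are hypothetical.

```python
import csv
import io


def encode_rows_as_csv(rows):
    """Serialize feature rows to a CSV string with no header row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def predict(endpoint_name, rows):
    """Invoke the endpoint; needs boto3, AWS credentials, and a region.

    Assumes a text/csv request body and a text response; verify both
    against the algorithm's own documentation.
    """
    import boto3  # third-party package, imported only when invoking
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=encode_rows_as_csv(rows),
    )
    return response["Body"].read().decode("utf-8")
```

Here endpoint_name is the value entered for “Endpoint Name” in the steps above.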