Operationalize your Amazon SageMaker Studio notebooks as scheduled notebook jobs

Amazon SageMaker Studio provides a fully managed solution for data scientists to interactively build, train, and deploy machine learning (ML) models. In addition to the interactive ML experience, data workers also seek solutions to run notebooks as ephemeral jobs without the need to refactor code as Python modules or learn DevOps tools and best practices to automate their deployment infrastructure. Some common use cases for doing this include:

  • Regularly running model inference to generate reports
  • Scaling up a feature engineering step after having tested in Studio against a subset of data on a small instance
  • Retraining and deploying models on some cadence
  • Analyzing your team’s Amazon SageMaker usage on a regular cadence

Previously, when data scientists wanted to take the code they built interactively in notebooks and run it as batch jobs, they faced a steep learning curve: Amazon SageMaker Pipelines, AWS Lambda, Amazon EventBridge, and other solutions can be difficult to set up, use, and manage.

With SageMaker notebook jobs, you can now run your notebooks as is or in a parameterized fashion with just a few simple clicks from the SageMaker Studio or SageMaker Studio Lab interface. You can run these notebooks on a schedule or immediately. There’s no need for the end-user to modify their existing notebook code. When the job is complete, you can view the populated notebook cells, including any visualizations!

In this post, we share how to operationalize your SageMaker Studio notebooks as scheduled notebook jobs.

Solution overview

The following diagram illustrates our solution architecture. We utilize the pre-installed SageMaker extension to run notebooks as a job immediately or on a schedule.

In the following sections, we walk through the steps to create a notebook, parameterize cells, customize additional options, and schedule your job. We also include a sample use case.

Prerequisites

To use SageMaker notebook jobs, you need to be running a JupyterLab 3 JupyterServer app within Studio. For more information on how to upgrade to JupyterLab 3, refer to View and update the JupyterLab version of an app from the console. Be sure to Shut down and Update SageMaker Studio in order to pick up the latest updates.

To define job definitions that run notebooks on a schedule, you may need to add additional permissions to your SageMaker execution role.

First, add a trust relationship to your SageMaker execution role that allows events.amazonaws.com to assume your role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "sagemaker.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "events.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Additionally, you may need to create and attach an inline policy to your execution role. The following policy supplements the broadly permissive AmazonSageMakerFullAccess policy. For a complete and minimal set of permissions, see Install Policies and Permissions.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "events:TagResource",
                "events:DeleteRule",
                "events:PutTargets",
                "events:DescribeRule",
                "events:PutRule",
                "events:RemoveTargets",
                "events:DisableRule",
                "events:EnableRule"
            ],
            "Resource": "*",
            "Condition": {
              "StringEquals": {
                "aws:ResourceTag/sagemaker:is-scheduling-notebook-job": "true"
              }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::*:role/*",
            "Condition": {
                "StringLike": {
                    "iam:PassedToService": "events.amazonaws.com"
                }
            }
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": "sagemaker:ListTags",
            "Resource": "arn:aws:sagemaker:*:*:user-profile/*/*"
        }
    ]
}

Create a notebook job

To operationalize your notebook as a SageMaker notebook job, choose the Create a notebook job icon.

Alternatively, you can choose (right-click) your notebook on the file system and choose Create Notebook Job.

In the Create job section, simply choose the right instance type for your scheduled job based on your workload: standard instances, compute optimized instances, or accelerated computing instances that contain GPUs. You can choose any of the instances available for SageMaker training jobs. For the complete list of instances available, refer to Amazon SageMaker Pricing.

When a job is complete, you can view the output notebook file with its populated cells, as well as the underlying logs from the job runs.

Parameterize cells

When moving a notebook to a production workflow, it’s important to be able to reuse the same notebook with different sets of parameters for modularity. For example, you may want to parameterize the dataset location or the hyperparameters of your model so that you can reuse the same notebook for many distinct model trainings. SageMaker notebook jobs support this through cell tags. Simply choose the double gear icon in the right pane and choose Add Tag. Then label the tag as parameters.

By default, the notebook job run uses the parameter values specified in the notebook, but alternatively, you can modify these as a configuration for your notebook job.
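
For example, a cell tagged as parameters might contain defaults like the following (the variable name here is illustrative; the sample notebook later in this post uses number_rf_estimators):

# Contents of a cell tagged "parameters": these defaults apply when the job
# runs as is, and are overridden by values set in the notebook job configuration
number_rf_estimators = 100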

Configure additional options

When creating a notebook job, you can expand the Additional options section in order to customize your job definition. Studio will automatically detect the image or kernel you’re using in your notebook and pre-select it for you. Ensure that you have validated this selection.

You can also specify environment variables or startup scripts to customize your notebook run environment. For the full list of configurations, see Additional Options.

Schedule your job

To schedule your job, choose Run on a schedule and set an appropriate interval and time. Then you can choose the Notebook Jobs tab that is visible after choosing the home icon. After the notebook is loaded, choose the Notebook Job Definitions tab to pause or remove your schedule.

Example use case

For our example, we showcase an end-to-end ML workflow that prepares data from a ground truth source, trains a refreshed model from that time period, and then runs inference on the most recent data to generate actionable insights. In practice, you might run a complete end-to-end workflow, or simply operationalize one step of your workflow. You can schedule an AWS Glue interactive session for daily data preparation, or run a batch inference job that generates graphical results directly in your output notebook.

The full notebook for this example can be found in our SageMaker Examples GitHub repository. The use case assumes that we’re a telecommunications company that is looking to schedule a notebook that predicts probable customer churn based on a model trained with the most recent data we have available.

To start, we gather the most recently available customer data and perform some preprocessing on it:

import pandas as pd
from synthetic_data import generate_data

previous_two_weeks_data = generate_data(5000, label_known=True)
todays_data = generate_data(300, label_known=False)

# process_data is defined earlier in the example notebook (feature encoding and cleanup)
processed_prior_data = process_data(previous_two_weeks_data, label_known=True)
processed_todays_data = process_data(todays_data, label_known=False)

We train our refreshed model on this updated training data in order to make accurate predictions on todays_data:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, confusion_matrix, ConfusionMatrixDisplay

y = np.ravel(processed_prior_data[["Churn"]])
x = processed_prior_data.drop(["Churn"], axis=1)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25)

# number_rf_estimators comes from the cell tagged "parameters" described earlier
clf = RandomForestClassifier(n_estimators=int(number_rf_estimators), criterion="gini")
clf.fit(x_train, y_train)

Because we’re going to schedule this notebook as a daily report, we want to capture how well our refreshed model performed on our validation set so that we can be confident in its future predictions. The results in the following screenshot are from our scheduled inference report.
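
The following is a minimal sketch of how that validation check can be computed; f1_score and ConfusionMatrixDisplay are already imported in the preceding cell, though the exact reporting code in the notebook may differ:

# Evaluate the refreshed model on the held-out validation split
y_pred = clf.predict(x_test)
print(f"Validation F1 score: {f1_score(y_test, y_pred):.3f}")
# Visualize where the model confuses churners with non-churners
ConfusionMatrixDisplay(confusion_matrix(y_test, y_pred)).plot()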

Lastly, we capture the predicted results for today’s data in a database so that actions can be taken based on the results of this model.
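
A hedged sketch of that final step might look like the following; the output file name and column name are assumptions for illustration:

# Score today's (unlabeled) data and persist the results for downstream action
predictions = clf.predict(processed_todays_data)
results = processed_todays_data.assign(predicted_churn=predictions)
results.to_csv("todays_churn_predictions.csv", index=False)  # hypothetical destination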

Once you understand the notebook, feel free to run it as an ephemeral job using the Run now option described earlier, or test out the scheduling functionality.

Clean up

If you followed along with our example, be sure to pause or delete your notebook job’s schedule to avoid incurring ongoing charges.

Conclusion

Bringing notebooks to production with SageMaker notebook jobs vastly simplifies the undifferentiated heavy lifting required by data workers. Whether you’re scheduling end-to-end ML workflows or a piece of the puzzle, we encourage you to put some notebooks in production using SageMaker Studio or SageMaker Studio Lab! To learn more, see Notebook-based Workflows.


About the authors

Sean Morgan is a Senior ML Solutions Architect at AWS. He has experience in the semiconductor and academic research fields, and uses his experience to help customers reach their goals on AWS. In his free time, Sean is an active open-source contributor/maintainer and is the special interest group lead for TensorFlow Addons.

Sumedha Swamy is a Principal Product Manager at Amazon Web Services. He leads the SageMaker Studio team to build it into the IDE of choice for interactive data science and data engineering workflows. He has spent the past 15 years building customer-obsessed consumer and enterprise products using machine learning. In his free time, he likes photographing the amazing geology of the American Southwest.

Edward Sun is a Senior SDE working on SageMaker Studio at Amazon Web Services. He is focused on building interactive ML solutions and simplifying the customer experience to integrate SageMaker Studio with popular technologies in the data engineering and ML ecosystem. In his spare time, Edward is a big fan of camping, hiking, and fishing, and enjoys spending time with his family.

Read More

How xarvio Digital Farming Solutions accelerates its development with Amazon SageMaker geospatial capabilities

This is a guest post co-written by Julian Blau, Data Scientist at xarvio Digital Farming Solutions, BASF Digital Farming GmbH, and Antonio Rodriguez, AI/ML Specialist Solutions Architect at AWS

xarvio Digital Farming Solutions is a brand from BASF Digital Farming GmbH, which is part of BASF Agricultural Solutions division. xarvio Digital Farming Solutions offers precision digital farming products to help farmers optimize crop production. Available globally, xarvio products use machine learning (ML), image recognition technology, and advanced crop and disease models, in combination with data from satellites and weather station devices, to deliver accurate and timely agronomic recommendations to manage the needs of individual fields. xarvio products are tailored to local farming conditions, can monitor growth stages, and recognize diseases and pests. They increase efficiency, save time, reduce risks, and provide higher reliability for planning and decision-making—all while contributing to sustainable agriculture.

We work with different geospatial data, including satellite imagery of the areas where our users’ fields are located, for some of our use cases. Therefore, we use and process hundreds of large image files daily. Initially, we had to invest a lot of manual work and effort to ingest, process, and analyze this data using third-party tools, open-source libraries, or general-purpose cloud services. In some instances, this could take up to 2 months for us to build the pipelines for each specific project. Now, by utilizing the geospatial capabilities of Amazon SageMaker, we have reduced this time to just 1–2 weeks.

This time-saving is the result of automating the geospatial data pipelines to deliver our use cases more efficiently, along with using built-in reusable components for speeding up and improving similar projects in other geographical areas, while applying the same proven steps for other use cases based on similar data.

In this post, we go through an example use case to describe some of the techniques we commonly use, and show how implementing these using SageMaker geospatial functionalities in combination with other SageMaker features delivers measurable benefits. We also include code examples so you can adapt these to your own specific use cases.

Overview of solution

A typical remote sensing project for developing new solutions requires a step-by-step analysis of imagery taken by optical satellites such as Sentinel or Landsat, in combination with other data, including weather forecasts or specific field properties. The satellite images provide us with valuable information used in our digital farming solutions to help our users accomplish various tasks:

  • Detecting diseases early in their fields
  • Planning the right nutrition and treatments to be applied
  • Getting insights on weather and water for planning irrigation
  • Predicting crop yield
  • Performing other crop management tasks

To achieve these goals, our analyses typically require preprocessing of the satellite images with different techniques that are common in the geospatial domain.

To demonstrate the capabilities of SageMaker geospatial, we experimented with identifying agricultural fields through ML segmentation models. Additionally, we explored the preexisting SageMaker geospatial models and the bring your own model (BYOM) functionality on geospatial tasks such as land use and land cover classification, or crop classification, often requiring panoptic or semantic segmentation techniques as additional steps in the process.

In the following sections, we go through some examples of how to perform these steps with SageMaker geospatial capabilities. You can also follow these in the end-to-end example notebook available in the following GitHub repository.

As previously mentioned, we selected the land cover classification use case, which consists of identifying the type of physical coverage present in a given geographical area on the earth’s surface, organized into a set of classes including vegetation, water, or snow. This high-resolution classification allows us to detect the details of the location of the fields and their surroundings with high accuracy, which can later be chained with other analyses such as change detection in crop classification.

Client setup

First, let’s assume we have users with crops being cultivated in a given geographical area that we can identify within a polygon of geospatial coordinates. For this post, we define an example area over Germany. We can also define a given time range, for example in the first months of 2022. See the following code:

### Coordinates for the polygon of your area of interest...
coordinates = [
    [9.181602157004177, 53.14038825707946],
    [9.181602157004177, 52.30629767547948],
    [10.587520893823973, 52.30629767547948],
    [10.587520893823973, 53.14038825707946],
    [9.181602157004177, 53.14038825707946],
]
### Time-range of interest...
time_start = "2022-01-01T12:00:00Z"
time_end = "2022-05-01T12:00:00Z"

In our example, we work with the SageMaker geospatial SDK through programmatic or code interaction, because we’re interested in building code pipelines that can be automated with the different steps required in our process. Note that you could also work with a UI through the graphical extensions provided with SageMaker geospatial in Amazon SageMaker Studio if you prefer that approach, as shown in the following screenshots. To access the geospatial Studio UI, open the SageMaker Studio Launcher and choose Manage Geospatial resources. You can find more details in the documentation to Get Started with Amazon SageMaker Geospatial Capabilities.

Geospatial UI launcher

Geospatial UI main

Geospatial UI list of jobs

Here you can graphically create, monitor, and visualize the results of the Earth Observation jobs (EOJs) that you run with SageMaker geospatial features.

Back to our example, the first step for interacting with the SageMaker geospatial SDK is to set up the client. We can do this by establishing a session with the botocore library:

import botocore

session = botocore.session.get_session()
gsClient = session.create_client(
    service_name='sagemaker-geospatial',
    region_name=region)  # Replace with your Region, e.g., 'us-east-1'

From this point on, we can use the client for running any EOJs of interest.

Obtaining data

For this use case, we start by collecting satellite imagery for our given geographical area. Depending on the location of interest, there might be more or less frequent coverage by the available satellites, which have their imagery organized in what is usually referred to as raster collections.

With the geospatial capabilities of SageMaker, you have direct access to high-quality data sources for obtaining the geospatial data directly, including those from AWS Data Exchange and the Registry of Open Data on AWS, among others. We can run the following command to list the raster collections already provided by SageMaker:

list_raster_data_collections_resp = gsClient.list_raster_data_collections()

This returns the details for the different raster collections available, including the Landsat C2L2 Surface Reflectance (SR), the Landsat C2L2 Surface Temperature (ST), and the Sentinel 2A & 2B collections. Conveniently, Level 2A imagery is already optimized into Cloud-Optimized GeoTIFFs (COGs). See the following output:

…
{'Name': 'Sentinel 2 L2A COGs',
  'Arn': 'arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/nmqj48dcu3g7ayw8',
  'Type': 'PUBLIC',
  'Description': 'Sentinel-2a and Sentinel-2b imagery, processed to Level 2A (Surface Reflectance) and converted to Cloud-Optimized GeoTIFFs'
…

Let’s take this last one for our example, by setting our data_collection_arn parameter to the Sentinel 2 L2A COGs’ collection ARN.
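
Using the ARN returned in the listing output above:

data_collection_arn = 'arn:aws:sagemaker-geospatial:us-west-2:378778860802:raster-data-collection/public/nmqj48dcu3g7ayw8'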

We can also search the available imagery for a given geographical location by passing the coordinates of a polygon we defined as our area of interest (AOI). This allows you to visualize the image tiles available that cover the polygon you submit for the specified AOI, including the Amazon Simple Storage Service (Amazon S3) URIs for these images. Note that satellite imagery is typically provided in different bands according to the wavelength of the observation; we discuss this more later in the post.

# eoj_input_config is the query payload defined in the Processing techniques section below
response = gsClient.search_raster_data_collection(**eoj_input_config, Arn=data_collection_arn)

The preceding code returns the S3 URIs for the different image tiles available, which you can directly visualize with any library compatible with GeoTIFFs, such as rasterio. For example, let’s visualize two of the True Color Image (TCI) tiles.

…
'visual': {'Href': 'https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/U/NC/2022/3/S2A_32UNC_20220325_0_L2A/TCI.tif'},
…

True Color Image 1 | True Color Image 2
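
As a minimal sketch, you can render one of these tiles with rasterio directly from its URL (the URL comes from the search response above; assumes rasterio is installed):

import rasterio
from rasterio.plot import show

tci_url = 'https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/U/NC/2022/3/S2A_32UNC_20220325_0_L2A/TCI.tif'
with rasterio.open(tci_url) as src:  # GDAL streams the COG over HTTP
    show(src)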

Processing techniques

Some of the most common preprocessing techniques that we apply include cloud removal, geo mosaic, temporal statistics, band math, and stacking. All of these processes can now be done directly through the use of EOJs in SageMaker, without the need to write code manually or use complex and expensive third-party tools. This makes it 50% faster to build our data processing pipelines. With SageMaker geospatial capabilities, we can run these processes over different input types. For example:

  • Directly run a query for any of the raster collections included with the service through the RasterDataCollectionQuery parameter
  • Pass imagery stored in Amazon S3 as an input through the DataSourceConfig parameter
  • Simply chain the results of a previous EOJ through the PreviousEarthObservationJobArn parameter

This flexibility allows you to build any kind of processing pipeline you need.

The following diagram illustrates the processes we cover in our example.

Geospatial Processing tasks

In our example, we use a raster data collection query as input, for which we pass the coordinates of our AOI and time range of interest. We also specify a percentage of maximum cloud coverage of 2%, because we want clear and noise-free observations of our geographical area. See the following code:

eoj_input_config = {
    "RasterDataCollectionQuery": {
        "RasterDataCollectionArn": data_collection_arn,
        "AreaOfInterest": {
            "AreaOfInterestGeometry": {"PolygonGeometry": {"Coordinates": [coordinates]}}
        },
        "TimeRangeFilter": {"StartTime": time_start, "EndTime": time_end},
        "PropertyFilters": {
            "Properties": [
                {"Property": {"EoCloudCover": {"LowerBound": 0, "UpperBound": 2}}}
            ]
        },
    }
}

For more information on supported query syntax, refer to Create an Earth Observation Job.

Cloud gap removal

Satellite observations are often less useful due to high cloud coverage. Cloud gap filling or cloud removal is the process of replacing the cloudy pixels from the images, which can be done with different methods to prepare the data for further processing steps.

With SageMaker geospatial capabilities, we can achieve this by specifying a CloudRemovalConfig parameter in the configuration of our job.

eoj_config =  {
    'CloudRemovalConfig': {
        'AlgorithmName': 'INTERPOLATION',
        'InterpolationValue': '-9999'
    }
}

Note that we’re using an interpolation algorithm with a fixed value in our example, but other configurations are supported, as explained in the Create an Earth Observation Job documentation. Interpolation estimates a replacement value for the cloudy pixels by considering the surrounding pixels.

We can now run our EOJ with our input and job configurations:

response = gsClient.start_earth_observation_job(
    Name =  'cloudremovaljob',
    ExecutionRoleArn = role,
    InputConfig = eoj_input_config,
    JobConfig = eoj_config,
)

This job takes a few minutes to complete, depending on the input area and processing parameters.
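
While you wait, you can poll the job status with the get_earth_observation_job API; the following is a minimal sketch that assumes the start response carries the job’s Arn:

import time

cr_eoj_arn = response['Arn']  # ARN of the cloud removal EOJ started above
# Poll until the job reaches a terminal state
while gsClient.get_earth_observation_job(Arn=cr_eoj_arn)['Status'] not in ('COMPLETED', 'FAILED', 'STOPPED'):
    time.sleep(30)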

When it’s complete, the results of the EOJ are stored in a service-owned location, from where we can either export the results to Amazon S3, or chain these as input for another EOJ. In our example, we export the results to Amazon S3 by running the following code:

response = gsClient.export_earth_observation_job(
    Arn = cr_eoj_arn,
    ExecutionRoleArn = role,
    OutputConfig = {
        'S3Data': {
            'S3Uri': f's3://{bucket}/{prefix}/cloud_removal/',
            'KmsKeyId': ''
        }
    }
)

Now we’re able to visualize the resulting imagery stored in our specified Amazon S3 location for the individual spectral bands. For example, let’s inspect two of the blue band images returned.

Alternatively, you can also check the results of the EOJ graphically by using the geospatial extensions available in Studio, as shown in the following screenshots.

Cloud Removal UI 1   Cloud Removal UI 2

Temporal statistics

Because satellites continuously orbit Earth, images of a given geographical area of interest are taken at specific time frames with a specific temporal frequency, such as daily, every 5 days, or every 2 weeks, depending on the satellite. The temporal statistics process combines different observations taken at different times to produce an aggregated view, such as a yearly mean, or the mean of all observations in a specific time range, for the given area.

With SageMaker geospatial capabilities, we can do this by setting the TemporalStatisticsConfig parameter. In our example, we obtain the yearly mean aggregation for the Near Infrared (NIR) band, because this band can reveal vegetation density differences below the top of the canopies:

eoj_config =  {
    'TemporalStatisticsConfig': {
        'GroupBy': 'YEARLY',
        'Statistics': ['MEAN'],
        'TargetBands': ['nir']
    }
}

After a few minutes running an EOJ with this config, we can export the results to Amazon S3 to obtain imagery like the following examples, in which we can observe the different vegetation densities represented with different color intensities. Note that the EOJ can produce multiple images as tiles, depending on the satellite data available for the time range and coordinates specified.

Temporal Statistics 1 | Temporal Statistics 2

Band math

Earth observation satellites are designed to detect light in different wavelengths, some of which are invisible to the human eye. Each range contains specific bands of the light spectrum at different wavelengths, which combined with arithmetic can produce images with rich information about characteristics of the field such as vegetation health, temperature, or presence of clouds, among many others. This is performed in a process commonly called band math or band arithmetic.

With SageMaker geospatial capabilities, we can run this by setting the BandMathConfig parameter. For example, let’s obtain the moisture index images by running the following code:

eoj_config =  {
    'BandMathConfig': {
        'CustomIndices': {
            'Operations': [
                {
                    'Name': 'moisture',
                    'Equation': '(B8A - B11) / (B8A + B11)'
                }
            ]
        }
    }
}

After a few minutes running an EOJ with this config, we can export the results and obtain images, such as the following two examples.

Moisture index 1 | Moisture index 2 | Moisture index legend

Stacking

Similar to band math, stacking is the process of combining bands to produce composite images from the original bands. For example, we could stack the red, blue, and green light bands of a satellite image to produce the true color image of the AOI.

With SageMaker geospatial capabilities, we can do this by setting the StackConfig parameter. Let’s stack the RGB bands as per the previous example with the following command:

eoj_config =  {
    'StackConfig': {
        'OutputResolution': {
            'Predefined': 'HIGHEST'
        },
        'TargetBands': ['red', 'green', 'blue']
    }
}

After a few minutes running an EOJ with this config, we can export the results and obtain images.

Stacking TCI 1 | Stacking TCI 2

Semantic segmentation models

As part of our work, we commonly use ML models to run inferences over the preprocessed imagery, such as detecting cloudy areas or classifying the type of land in each area of the images.

With SageMaker geospatial capabilities, you can do this by relying on the built-in segmentation models.

For our example, let’s use the land cover segmentation model by specifying the LandCoverSegmentationConfig parameter. This runs inferences on the input by using the built-in model, without the need to train or host any infrastructure in SageMaker:

response = gsClient.start_earth_observation_job(
    Name =  'landcovermodeljob',
    ExecutionRoleArn = role,
    InputConfig = eoj_input_config,
    JobConfig = {
        'LandCoverSegmentationConfig': {},
    },
)

After a few minutes running a job with this config, we can export the results and obtain images.

Land Cover 1 | Land Cover 2 | Land Cover 3 | Land Cover 4

In the preceding examples, each pixel in the images corresponds to a land type class, as shown in the following legend.

Land Cover legend

This allows us to directly identify the specific types of areas in the scene such as vegetation or water, providing valuable insights for additional analyses.

Bring your own model with SageMaker

If the state-of-the-art geospatial models provided with SageMaker aren’t enough for our use case, we can also chain the results of any of the preprocessing steps shown so far with any custom model onboarded to SageMaker for inference, as explained in this SageMaker Script Mode example. We can do this with any of the inference modes supported in SageMaker: synchronous with real-time SageMaker endpoints, asynchronous with SageMaker asynchronous endpoints, batch or offline with SageMaker batch transforms, and serverless with SageMaker serverless inference. You can find further details about these modes in the Deploy Models for Inference documentation. The following diagram illustrates the workflow at a high level.

Inference flow options

For our example, let’s assume we have onboarded two models for performing a land cover classification and crop type classification.

We just have to point to our trained model artifact, in our example a PyTorch model, similar to the following code:

import sagemaker
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    name=model_name, ### Set a model name
    model_data=MODEL_S3_PATH, ### Location of the custom model in S3
    role=role,
    entry_point='inference.py', ### Your inference entry-point script
    source_dir='code', ### Folder with any dependencies
    image_uri=image_uri, ### URI for your AWS DLC or custom container
    env={
        'TS_MAX_REQUEST_SIZE': '100000000',
        'TS_MAX_RESPONSE_SIZE': '100000000',
        'TS_DEFAULT_RESPONSE_TIMEOUT': '1000',
    }, ### Optional – Set environment variables for max size and timeout
)

predictor = model.deploy(
    initial_instance_count = 1, ### Your number of instances
    instance_type = 'ml.g4dn.8xlarge', ### Your instance type
    async_inference_config=sagemaker.async_inference.AsyncInferenceConfig(
        output_path=f"s3://{bucket}/{prefix}/output",
        max_concurrent_invocations_per_instance=2,
    ), ### Optional – Async config if using SageMaker Async Endpoints
)

predictor.predict(data) ### Your images for inference

This allows you to obtain the resulting images after inference, depending on the model you’re using.

In our example, when running a custom land cover segmentation, the model produces images similar to the following, where we compare the input and prediction images with their corresponding legend.

Land Cover Segmentation 1 | Land Cover Segmentation 2 | Land Cover Segmentation legend

The following is another example from a crop classification model, where we show the comparison of the original vs. the resulting panoptic and semantic segmentation results, with their corresponding legend.

Crop Classification

Automating geospatial pipelines

Finally, we can also automate the previous steps by building geospatial data processing and inference pipelines with Amazon SageMaker Pipelines. We simply chain each preprocessing step required through the use of Lambda Steps and Callback Steps in Pipelines. For example, you could also add a final inference step using a Transform Step, or directly through another combination of Lambda Steps and Callback Steps, for running an EOJ with one of the built-in semantic segmentation models in SageMaker geospatial features.

Note that we’re using Lambda Steps and Callback Steps in Pipelines because EOJs are asynchronous; these step types allow us to monitor the run of the processing job and resume the pipeline when it’s complete through messages in an Amazon Simple Queue Service (Amazon SQS) queue.
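
The following sketch shows how such a pair of steps could be wired together; the Lambda function ARN, queue URL, and output names are hypothetical:

from sagemaker.lambda_helper import Lambda
from sagemaker.workflow.lambda_step import LambdaStep, LambdaOutput, LambdaOutputTypeEnum
from sagemaker.workflow.callback_step import CallbackStep, CallbackOutput, CallbackOutputTypeEnum

# LambdaStep starts the asynchronous EOJ and returns its ARN
start_eoj_step = LambdaStep(
    name="StartCloudRemovalEOJ",
    lambda_func=Lambda(function_arn="arn:aws:lambda:us-east-1:111122223333:function:start-eoj"),  # hypothetical function
    outputs=[LambdaOutput(output_name="eoj_arn", output_type=LambdaOutputTypeEnum.String)],
)

# CallbackStep pauses the pipeline until a completion message arrives in SQS
wait_eoj_step = CallbackStep(
    name="WaitForCloudRemovalEOJ",
    sqs_queue_url="https://sqs.us-east-1.amazonaws.com/111122223333/eoj-callback",  # hypothetical queue
    inputs={"eoj_arn": start_eoj_step.properties.Outputs["eoj_arn"]},
    outputs=[CallbackOutput(output_name="eoj_status", output_type=CallbackOutputTypeEnum.String)],
)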

Geospatial Pipeline

You can check the notebook in the GitHub repository for a detailed example of this code.

Now we can visualize the diagram of our geospatial pipeline through Studio and monitor the runs in Pipelines, as shown in the following screenshot.

Geospatial Pipeline UI

Conclusion

In this post, we presented a summary of the processes we implemented with SageMaker geospatial capabilities for building geospatial data pipelines for our advanced products from xarvio Digital Farming Solutions. Using SageMaker geospatial increased the efficiency of our geospatial work by more than 50%, through the use of pre-built APIs that accelerate and simplify our preprocessing and modeling steps for ML.

As a next step, we’re onboarding more models from our catalog to SageMaker to continue the automation of our solution pipelines, and will continue utilizing more geospatial features of SageMaker as the service evolves.

We encourage you to try SageMaker geospatial capabilities by adapting the end-to-end example notebook provided in this post, and learning more about the service in What is Amazon SageMaker Geospatial Capabilities?.


About the Authors

Julian Blau is a Data Scientist at BASF Digital Farming GmbH, located in Cologne, Germany. He develops digital solutions for agriculture, addressing the needs of BASF’s global customer base by using geospatial data and machine learning. Outside work, he enjoys traveling and being outdoors with friends and family.

Antonio Rodriguez is an Artificial Intelligence and Machine Learning Specialist Solutions Architect in Amazon Web Services, based out of Spain. He helps companies of all sizes solve their challenges through innovation, and creates new business opportunities with AWS Cloud and AI/ML services. Apart from work, he loves to spend time with his family and play sports with his friends.

Read More

Protecting Consumers and Promoting Innovation – AI Regulation and Building Trust in Responsible AI

Artificial intelligence (AI) is one of the most transformational technologies of our generation and provides huge opportunities to be a force for good and drive economic growth. It can help scientists cure terminal diseases, engineers build inconceivable structures, and farmers yield more crops. AI allows us to make sense of our world as never before—and build products and services to address some of our most challenging problems, like climate change and responding to humanitarian disasters. AI is also helping industries innovate and overcome more commonplace challenges. Manufacturers are deploying AI to avoid equipment downtime through predictive maintenance and streamlining their logistics and distribution channels through supply chain optimization. Airlines are taking advantage of AI technologies to enhance the customer booking experience, assist with crew scheduling, and transport passengers with greater fuel efficiency by simulating routes based on distance, aircraft weight, and weather.

While the benefits of AI are already plain to see and improving our lives each day, unlocking AI’s full potential will require building greater confidence among consumers. That means earning public trust that AI will be used responsibly and in a manner that is consistent with the rule of law, human rights, and the values of equity, privacy, and fairness.

Understanding the important need for public trust, we work closely with policymakers across the country and around the world as they assess whether existing consumer protections remain fit-for-purpose in an AI era. An important baseline for any regulation must be to differentiate between high-risk AI applications and those that pose low-to-no risk. The great majority of AI applications fall in the latter category, and their widespread adoption provides opportunities for immense productivity gains and, ultimately, improvements in human well-being. If we are to inspire public confidence in AI’s overwhelmingly beneficial uses, businesses must demonstrate that they can confidently mitigate the potential risks of high-risk AI. The public should be confident that these sorts of high-risk systems are safe, fair, appropriately transparent, privacy protective, and subject to appropriate oversight.

At AWS, we recognize that we are well positioned to deliver on this vision and are proud to support our customers as they invent, build, and deploy AI systems to solve real-world problems. As AWS offers the broadest and deepest set of AI services and the supporting cloud infrastructure, we are committed to developing fair and accurate AI services and providing customers with the tools and guidance needed to build applications responsibly. We recognize that responsible AI is the shared responsibility of all organizations that develop and deploy AI systems.

We are committed to providing tools and resources to aid customers using our AI and machine learning (ML) services. Earlier this year, we launched our Responsible Use of Machine Learning guide, providing considerations and recommendations for responsibly using ML across all phases of the ML lifecycle. In addition, at our 2020 AWS re:Invent conference, we rolled out Amazon SageMaker Clarify, a service that provides developers with greater insights into their data and models, helping them understand why an ML model made a specific prediction and whether the predictions were impacted by bias. Additional resources, access to AI/ML experts, and education and training can also be found on our Responsible use of artificial intelligence and machine learning page.

We continue to expand efforts to provide guidance and support to customers and the broader community in the responsible use space. This week at our re:Invent 2022 conference, we announced the launch of AWS AI Service Cards, a new transparency resource to help customers better understand our AWS AI services. The new AI Service Cards deliver a form of responsible AI documentation that provides customers with a single place to find information.

Each AI Service Card covers four key topics to help you better understand the service or service features, including intended use cases and limitations, responsible AI design considerations, and guidance on deployment and performance optimization. The content of the AI Service Cards addresses a broad audience of customers, technologists, researchers, and other stakeholders who seek to better understand key considerations in the responsible design and use of an AI service.

Conversations among policymakers regarding AI regulations continue as the technologies become more established. AWS is focused on not only offering the best-in-class tools and services to provide for the responsible development and deployment of AI services, but also continuing our engagement with lawmakers to promote strong consumer protections while encouraging the fast pace of innovation.


About the Author

Nicole Foster is Director of AWS Global AI/ML and Canada Public Policy at Amazon, where she leads the direction and strategy of artificial intelligence public policy for Amazon Web Services (AWS) around the world as well as the company’s public policy efforts in support of the AWS business in Canada. In this role, she focuses on issues related to emerging technology, digital modernization, cloud computing, cyber security, data protection and privacy, government procurement, economic development, skilled immigration, workforce development, and renewable energy policy.

Read More

Stability AI builds foundation models on Amazon SageMaker

We’re thrilled to announce that Stability AI has selected AWS as its preferred cloud provider to power its state-of-the-art AI models for image, language, audio, video, and 3D content generation. Stability AI is a community-driven, open-source artificial intelligence (AI) company developing breakthrough technologies. With Amazon SageMaker, Stability AI will build AI models on compute clusters with thousands of GPUs or AWS Trainium chips, reducing training time and cost by 58%. Stability AI will also collaborate with AWS to enable students, researchers, startups, and enterprises around the world to use its open-source tools and models.

“Our mission at Stability AI is to build the foundation to activate humanity’s potential through AI. AWS has been an integral partner in scaling our open-source foundation models across modalities, and we are delighted to bring these to SageMaker to enable tens of thousands of developers and millions of users to take advantage of them. We look forward to seeing the amazing things built on these models and helping our customers customize and scale their models and solutions.”

-Emad Mostaque, Founder and CEO of Stability AI.

Generative AI models and Stable Diffusion

Generative AI models can create text, images, audio, video, code, and more from simple text instructions. For example, I created the following image by giving this text prompt to the model: “Four people riding a bicycle in the Swiss Alps, renaissance painting, epic breathtaking nature scene, diffused light.” I used a Jupyter notebook in Amazon SageMaker Studio to generate this image with Stable Diffusion.

Stability AI also announced a distilled stable diffusion model, which can generate coherent images up to ten times faster than before. This latest open-source release also introduces models to upscale an image’s resolution and infer depth information to generate new images. The following images show an example of how you can use the new depth2img model to generate new images while preserving the depth and coherence of the original image.

We’re excited by the potential of these generative AI models and by what our customers will create. From inpainting to textual inversion to modifiers, the community continues to innovate and build better open-source models and tools in generative AI.

Training foundation models at scale with SageMaker

Foundation models—large models that are adaptable to a variety of downstream tasks in domains such as language, image, audio, and video—are hard to train because they require a high-performance compute cluster with thousands of GPUs or Trainium chips, along with software to efficiently utilize the cluster.

Stability AI picked AWS as its preferred cloud provider to provision one of the largest-ever clusters of GPUs in the public cloud. Using SageMaker’s managed infrastructure and optimization libraries, Stability is able to make its model training more resilient and performant. For example, with models such as GPT NeoX, Stability AI was able to reduce training time and cost by 58% using SageMaker and its model parallel library. These optimizations and performance improvements apply to models with tens or hundreds of billions of parameters.

Get started with Stable Diffusion

Stable Diffusion 2.0 is available today on Amazon SageMaker JumpStart. JumpStart is the machine learning (ML) hub of SageMaker that provides hundreds of built-in algorithms, pre-trained models, and end-to-end solution templates to help you quickly get started with ML.

Get started today with Stable Diffusion 2.0.


About the authors

Aditya Bindal is a Principal Product Manager for AWS Deep Learning. He works on software and tools to make large-scale training and inference easier for customers. In his spare time, he enjoys spending time with his daughter, playing tennis, reading historical fiction, and traveling.

Read More

Launch Amazon SageMaker Autopilot experiments directly from within Amazon SageMaker Pipelines to easily automate MLOps workflows

Amazon SageMaker Autopilot, a low-code machine learning (ML) service that automatically builds, trains, and tunes the best ML models based on tabular data, is now integrated with Amazon SageMaker Pipelines, the first purpose-built continuous integration and continuous delivery (CI/CD) service for ML. This enables the automation of an end-to-end flow of building ML models using Autopilot and integrating models into subsequent CI/CD steps.

Previously, to launch an Autopilot experiment within Pipelines, you had to build a model-building workflow by writing custom integration code with Pipelines Lambda or Processing steps. For more information, see Move Amazon SageMaker Autopilot ML models from experimentation to production using Amazon SageMaker Pipelines.

With the support for Autopilot as a native step within Pipelines, you can now add an automated training step (AutoMLStep) in Pipelines and invoke an Autopilot experiment with ensembling training mode. For example, if you’re building a training and evaluation ML workflow for a fraud detection use case with Pipelines, you can now launch an Autopilot experiment using the AutoML step, which automatically runs multiple trials to find the best model on a given input dataset. After the best model is created using the Model step, its performance can be evaluated on test data using the Transform step and a Processing step for a custom evaluation script within Pipelines. Finally, the model can be registered in the SageMaker model registry using the Model step in combination with a Condition step.

In this post, we show how to create an end-to-end ML workflow to train and evaluate a SageMaker generated ML model using the newly launched AutoML step in Pipelines and register it with the SageMaker model registry. The ML model with the best performance can be deployed to a SageMaker endpoint.

Dataset overview

We use the publicly available UCI Adult 1994 Census Income dataset to predict if a person has an annual income of greater than $50,000 per year. This is a binary classification problem; the options for the income target variable are either <=50K or >50K.

The dataset contains 32,561 rows for training and validation and 16,281 rows for testing, with 15 columns each. It includes demographic information about individuals, with class as the target column indicating the income class.

The columns are as follows:

  • age: continuous
  • workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked
  • fnlwgt: continuous
  • education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool
  • education-num: continuous
  • marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse
  • occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces
  • relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried
  • race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black
  • sex: Female, Male
  • capital-gain: continuous
  • capital-loss: continuous
  • hours-per-week: continuous
  • native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands
  • class: income class, either <=50K or >50K

Solution overview

We use Pipelines to orchestrate different pipeline steps required to train an Autopilot model. We create and run an Autopilot experiment as part of an AutoML step as described in this tutorial.

The following steps are required for this end-to-end Autopilot training process:

  • Create and monitor an Autopilot training job using the AutoMLStep.
  • Create a SageMaker model using ModelStep. This step fetches the best model’s metadata and artifacts rendered by Autopilot in the previous step.
  • Evaluate the trained Autopilot model on a test dataset using TransformStep.
  • Compare the output from the previously run TransformStep with the actual target labels using ProcessingStep.
  • Register the ML model to the SageMaker model registry using ModelStep, if the previously obtained evaluation metric exceeds a predefined threshold in ConditionStep.
  • Deploy the ML model as a SageMaker endpoint for testing purposes.

Architecture

The following architecture diagram illustrates the pipeline steps necessary to package the workflow into a reproducible, automated, and scalable SageMaker Autopilot training pipeline. The data files are read from the S3 bucket, and the pipeline steps are called sequentially.

Walkthrough

This post provides a detailed explanation of the pipeline steps. We review the code and discuss the components of each step. To deploy the solution, refer to the example notebook, which provides step-by-step instructions for implementing an Autopilot MLOps workflow using Pipelines.

Prerequisites

Before you begin, complete the prerequisites described in the example notebook.

When the dataset is ready to use, we need to set up Pipelines to establish a repeatable process to automatically build and train ML models using Autopilot. We use the SageMaker SDK to programmatically define, run, and track an end-to-end ML training pipeline.

Pipeline Steps

In the following sections, we go through the different steps in the SageMaker pipeline, including AutoML training, model creation, batch inference, evaluation, and conditional registration of the best model. The following diagram illustrates the entire pipeline flow.

AutoML training step

An AutoML object is used to define the Autopilot training job run and can be added to the SageMaker pipeline by using the AutoMLStep class, as shown in the following code. The ensembling training mode needs to be specified, but other parameters can be adjusted as needed. For example, instead of letting the AutoML job automatically infer the ML problem type and objective metric, these could be hardcoded by specifying the problem_type and job_objective parameters passed to the AutoML object.

automl = AutoML(
    role=execution_role,
    target_attribute_name=target_attribute_name,
    sagemaker_session=pipeline_session,
    total_job_runtime_in_seconds=max_automl_runtime,
    mode="ENSEMBLING",
)
train_args = automl.fit(
    inputs=[
        AutoMLInput(
            inputs=s3_train_val,
            target_attribute_name=target_attribute_name,
            channel_type="training",
        )
    ]
)
step_auto_ml_training = AutoMLStep(
    name="AutoMLTrainingStep",
    step_args=train_args,
)

Model creation step

The AutoML step takes care of generating various ML model candidates, combining them, and obtaining the best ML model. Model artifacts and metadata are automatically stored and can be obtained by calling the get_best_auto_ml_model() method on the AutoML training step. These can then be used to create a SageMaker model as part of the Model step:

best_auto_ml_model = step_auto_ml_training.get_best_auto_ml_model(
    execution_role, sagemaker_session=pipeline_session
)
step_args_create_model = best_auto_ml_model.create(instance_type=instance_type)
step_create_model = ModelStep(name="ModelCreationStep", step_args=step_args_create_model)

Batch transform and evaluation steps

We use the Transformer object for batch inference on the test dataset, which can then be used for evaluation purposes. The output predictions are compared to the actual or ground truth labels using a Scikit-learn metrics function. We evaluate our results based on the F1 score. The performance metrics are saved to a JSON file, which is referenced when registering the model in the subsequent step.
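
As a sketch, the transform step could be defined as follows; the output location and the s3_test variable are assumptions modeled on the other steps in this pipeline:

from sagemaker.transformer import Transformer
from sagemaker.workflow.steps import TransformStep

# Batch inference against the SageMaker model created in the previous step
transformer = Transformer(
    model_name=step_create_model.properties.ModelName,
    instance_count=instance_count,
    instance_type=instance_type,
    output_path="s3://<your-bucket>/transform-output",  # hypothetical location
    sagemaker_session=pipeline_session,
)
step_batch_transform = TransformStep(
    name="BatchTransformStep",
    step_args=transformer.transform(data=s3_test, content_type="text/csv"),  # s3_test: test dataset URI
)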

Conditional registration steps

In this step, we register our new Autopilot model to the SageMaker model registry, if it exceeds the predefined evaluation metric threshold.
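
A sketch of this condition, assuming the evaluation step writes its F1 score to a property file (the property file, JSON path, and registration step names are illustrative):

from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.functions import JsonGet

# Compare the evaluation metric against the registration threshold parameter
cond_gte = ConditionGreaterThanOrEqualTo(
    left=JsonGet(
        step_name=step_evaluation.name,
        property_file=evaluation_report,  # PropertyFile produced by the evaluation step
        json_path="classification_metrics.weighted_f1.value",  # illustrative path
    ),
    right=model_registration_metric_threshold,
)
step_conditional_registration = ConditionStep(
    name="ConditionalRegistrationStep",
    conditions=[cond_gte],
    if_steps=[step_register_model],  # ModelStep that registers the model
    else_steps=[],
)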

Create and run the pipeline

After we define the steps, we combine them into a SageMaker pipeline:

pipeline = Pipeline(
    name="AutoMLTrainingPipeline",
    parameters=[
        instance_count,
        instance_type,
        max_automl_runtime,
        model_approval_status,
        model_package_group_name,
        model_registration_metric_threshold,
        s3_bucket,
        target_attribute_name,
    ],
    steps=[
        step_auto_ml_training,
        step_create_model,
        step_batch_transform,
        step_evaluation,
        step_conditional_registration,
    ],
    sagemaker_session=pipeline_session,
)

The steps run in sequential order: the pipeline trains a model with the Autopilot AutoML job, evaluates it, and registers it with the model registry.
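
Creating (or updating) and starting the pipeline follows the usual SageMaker SDK pattern:

pipeline.upsert(role_arn=execution_role)  # create or update the pipeline definition
execution = pipeline.start()
execution.wait()  # optionally block until the run completes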

You can view the new model by navigating to the model registry on the Studio console and opening AutoMLModelPackageGroup. Choose any version of a training job to view the objective metrics on the Model quality tab.

You can view the explainability report on the Explainability tab to understand your model’s predictions.

To view the underlying Autopilot experiment for all the models created in AutoMLStep, navigate to the AutoML page and choose the job name.

Deploy the model

After we have manually reviewed the ML model’s performance, we can deploy our newly created model to a SageMaker endpoint. For this, we can run the cells in the notebook that create the model endpoint using the model configuration saved in the SageMaker model registry.
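
For example, the deployment cells follow a pattern like this sketch; the model package ARN, session, and instance settings are placeholders:

from sagemaker import ModelPackage

# ARN of the approved model package from the SageMaker model registry (placeholder)
model_package_arn = "arn:aws:sagemaker:<region>:<account>:model-package/automlmodelpackagegroup/1"
model = ModelPackage(
    role=execution_role,
    model_package_arn=model_package_arn,
    sagemaker_session=sagemaker_session,  # a regular sagemaker.Session(), not the PipelineSession
)
model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")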

Note that this script is shared for demonstration purposes, but it’s recommended to follow a more robust CI/CD pipeline for production ML inference deployments. For more information, refer to Building, automating, managing, and scaling ML workflows using Amazon SageMaker Pipelines.

Summary

This post describes an easy-to-use ML pipeline approach to automatically train tabular ML models (AutoML) using Autopilot, Pipelines, and Studio. AutoML improves ML practitioners’ efficiency, accelerating the path from ML experimentation to production without the need for extensive ML expertise. We outline the respective pipeline steps needed for ML model creation, evaluation, and registration. Get started by trying the example notebook to train and deploy your own custom AutoML models.

For more information on Autopilot and Pipelines, refer to Automate model development with Amazon SageMaker Autopilot and Amazon SageMaker Pipelines.

Special thanks to everyone who contributed to the launch: Shenghua Yue, John He, Ao Guo, Xinlu Tu, Tian Qin, Yanda Hu, Zhankui Lu, and Dewen Qi.


About the Authors

Janisha Anand is a Senior Product Manager in the SageMaker Low/No Code ML team, which includes SageMaker Autopilot. She enjoys coffee, staying active, and spending time with her family.

Marcelo Aberle is an ML Engineer at AWS AI. He helps Amazon ML Solutions Lab customers build scalable ML(-Ops) systems and frameworks. In his spare time, he enjoys hiking and cycling in the San Francisco Bay Area.

Geremy Cohen is a Solutions Architect with AWS where he helps customers build cutting-edge, cloud-based solutions. In his spare time, he enjoys short walks on the beach, exploring the bay area with his family, fixing things around the house, breaking things around the house, and BBQing.

Shenghua Yue is a Software Development Engineer at Amazon SageMaker. She focuses on building ML tools and products for customers. Outside of work, she enjoys the outdoors, yoga, and hiking.

Read More

AI21 Jurassic-1 foundation model is now available on Amazon SageMaker

Today we are excited to announce that AI21 Jurassic-1 (J1) foundation models are available for customers using Amazon SageMaker. Jurassic-1 models are highly versatile, capable of human-like text generation as well as solving complex tasks such as question answering, text classification, and many others. You can easily try out this model and use it with Amazon SageMaker JumpStart. JumpStart is the machine learning (ML) hub of SageMaker that provides access to foundation models in addition to built-in algorithms and end-to-end solution templates to help you quickly get started with ML.

In this post, we walk through how to use the Jurassic-1 Grande model in SageMaker.

Foundation models in SageMaker

JumpStart provides access to a range of models from popular model hubs including Hugging Face, PyTorch Hub, and TensorFlow Hub, which you can use within your ML development workflow in SageMaker. Recent advances in ML have given rise to a new class of models known as foundation models, which typically have billions of parameters and are adaptable to a wide category of use cases, such as text summarization, generating digital art, and language translation. Because these models are expensive to train, customers want to use existing pre-trained foundation models and fine-tune them as needed, rather than train these models themselves. SageMaker provides a curated list of models that you can choose from on the SageMaker console.

You can now find foundation models from different model providers within JumpStart, enabling you to get started with foundation models quickly. You can find foundation models based on different tasks or model providers, and easily review model characteristics and usage terms. You can also try out these models using a test UI widget. When you want to use a foundation model at scale, you can do so easily without leaving SageMaker by using pre-built notebooks from model providers. Because the models are hosted and deployed on AWS, you can rest assured that your data, whether used for evaluating or using the model at scale, is never shared with third parties.

Jurassic-1 foundation model

Jurassic-1 is the first generation in a series of large language models trained and made widely accessible by AI21 Labs. For a complete description of Jurassic-1, including benchmarks and quantitative comparisons with other models, refer to the technical paper. All J1 models were trained on a massive corpus of English text, making them highly versatile general-purpose text generators, capable of composing human-like text and solving complex tasks such as question answering and text classification. J1 can be applied to virtually any language task by crafting a suitable prompt that contains a description of the task and a few examples, a process commonly known as prompt engineering. Popular use cases include generating marketing copy, powering chatbots, and assisting creative writing.

“We are building world-class foundation models for text and want to help our customers innovate with the latest Jurassic-1 models. Amazon SageMaker offers the deepest and broadest set of ML services, and we’re excited to collaborate with Amazon SageMaker so that customers will be able to use these foundation models on SageMaker within their development environment. Now customers can rapidly innovate, lower time-to-value, and drive efficiency in their businesses.”

-Ori Goshen, co-CEO of AI21 Labs.

Walkthrough

Let’s take you on a tour to test the J1-Grande model in SageMaker. You can try out the experience in three simple steps:

  1. Choose the Jurassic-1 model on the SageMaker console.
  2. Evaluate the model using a test widget.
  3. Use a notebook associated with the foundation model to deploy it in your environment.

Let’s expand each step in detail.

Choose the Jurassic-1 model on the SageMaker console

The first step is to log in to the AWS Management Console for Amazon SageMaker and request access to the list of foundation models from the foundation model category under JumpStart here:

After your account is allowlisted, you can see a list of models on this page. You can quickly search for the Jurassic-1 Grande model from the same view.

Evaluate the Jurassic-1 Grande model with a test widget

On the Jurassic-1 Grande listing, choose View Model. You will see a description of the model and the tasks that you can perform. Read through the EULA for the model before proceeding.

Let’s first try out the model for text summarization. Choose Try out model.

You’re taken to a page in a separate browser tab where you can give sample prompts to the J1-Grande model and view the output.

The following example generates a summary about a restaurant based on reviews.

Note that foundation models and their output are from the model provider, and AWS is not responsible for the content or accuracy therein.

The model output may vary depending on the settings and the prompt. You can generate text from the model using simple instructions, but by providing the model with more examples in the prompt, just as a human would, it can produce completions that are more aligned with your intentions. The best way to guide the model is to provide several examples of input/output pairs in the prompt. This establishes a pattern for the model to mimic. Then add the input for a query example and let the model complete it with an appropriate generation.
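To make this concrete, here is a minimal sketch of a few-shot prompt for the review-summarization use case; the instruction text and reviews are invented for illustration:

# A hypothetical few-shot prompt: two input/output pairs establish the
# pattern, and the final input is left open for the model to complete.
prompt = """Summarize the reviews into a single sentence.

Reviews: "Great pasta, slow service." | "Loved the tiramisu, but the room was noisy."
Summary: Tasty Italian food, though service and noise could improve.

Reviews: "Fresh sushi, friendly staff." | "A bit pricey, but worth it."
Summary:"""

The model would then be expected to complete the final Summary line in the same style as the two examples.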

After you have played with the model, it’s time to use the notebook and deploy it as an endpoint in your environment.

Deploy the foundation model from a notebook

Go back to the model listing shown earlier and choose View notebook. You should see the Jurassic-1 Grande Jupyter notebook with the walkthrough to deploy the model.

Let’s use this notebook from Amazon SageMaker Studio. Open Studio and pull in the notebook using the Git repo URL https://github.com/AI21Labs/SageMaker.git.

The notebook example uses both the Boto3 SDK and the AI21 SDK to deploy and interact with the endpoint.

Note that this example uses an ml.g5.12xlarge instance. If your default limit for your AWS account is 0, you need to request a limit increase for this GPU instance.
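If you’re unsure of your current quota, you can check it programmatically before deploying. The following sketch uses the Service Quotas API; matching on the quota name string is an assumption, so verify the exact name in the Service Quotas console.

import boto3

# List SageMaker quotas and print any that mention the ml.g5.12xlarge
# instance for endpoint usage. The name-matching logic is an assumption;
# verify the exact quota name in the Service Quotas console.
client = boto3.client("service-quotas")
paginator = client.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        if "ml.g5.12xlarge" in quota["QuotaName"] and "endpoint" in quota["QuotaName"]:
            print(quota["QuotaName"], "=>", quota["Value"])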

Let’s create the endpoint using SageMaker inference. First we set the necessary variables, then we deploy the model from the model package:

model_name = "j1-grande"

content_type = "application/json"

real_time_inference_instance_type = (
    "ml.g5.12xlarge"
)

# create a deployable model from the model package.
model = ModelPackage(
    role=role, model_package_arn=model_package_arn, sagemaker_session=sagemaker_session
)

# Deploy the model
predictor = model.deploy(1, real_time_inference_instance_type, endpoint_name=model_name, 
                         model_data_download_timeout=3600,
                         container_startup_health_check_timeout=600,
                        )

After the endpoint is deployed, you can run inference queries against the model.

You can think of Jurassic-1 Grande as a smart auto-completion algorithm: it’s very good at latching on to hints and patterns expressed in plain English, and generating text that follows the same patterns. After the model is deployed, you can interact with the deployed endpoint using the following code snippet:

import ai21  # AI21 SDK, which provides a SageMaker integration

# Run a short completion request against the deployed endpoint
response = ai21.Completion.execute(sm_endpoint="j1-grande",
                                   prompt="To be or",
                                   maxTokens=4,
                                   temperature=0,
                                   numResults=1)

print(response['completions'][0]['data']['text'])

The notebook also contains a walkthrough on how you can run inference queries with the AI21 SDK.
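Because the endpoint is a standard SageMaker real-time endpoint, you can also call it with the Boto3 SDK directly. The sketch below mirrors the AI21 completion parameters shown earlier; treat the JSON request shape as illustrative rather than authoritative.

import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Payload fields mirror the AI21 completion parameters used above;
# the exact request schema is an assumption based on that example.
payload = {"prompt": "To be or", "maxTokens": 4, "temperature": 0, "numResults": 1}

response = runtime.invoke_endpoint(
    EndpointName="j1-grande",
    ContentType="application/json",
    Body=json.dumps(payload),
)

result = json.loads(response["Body"].read())
print(result["completions"][0]["data"]["text"])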

The following video walks through the workflow.

Clean up

After you have tested the endpoint, make sure you delete the SageMaker inference endpoint and delete the model to avoid incurring charges.
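With the predictor object from the deployment step, cleanup is two SDK calls, as in this minimal sketch:

# Delete the endpoint (and its endpoint configuration), then the model,
# so the account stops accruing real-time inference charges
predictor.delete_endpoint()
predictor.delete_model()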

Conclusion

In this post, we showed you how you can test and use AI21’s Jurassic-1 Grande model using Amazon SageMaker. Request access, try out the foundation model in SageMaker today, and let us know your feedback!


About the authors

Karthik Bharathy is the product leader for the Amazon SageMaker team with over a decade of product management, product strategy, execution, and launch experience.

Tomer Asida is an algorithm team lead at AI21 Labs, where he heads the algorithm development efforts for the AI21 Studio developer platform, including the Jurassic-1 models and associated APIs.

Introducing AWS AI Service Cards: A new resource to enhance transparency and advance responsible AI

Artificial intelligence (AI) and machine learning (ML) are some of the most transformative technologies we will encounter in our generation—to tackle business and societal problems, improve customer experiences, and spur innovation. Along with the widespread use and growing scale of AI comes the recognition that we must all build responsibly. At AWS, we think responsible AI encompasses a number of core dimensions including:

  • Fairness and bias – How a system impacts different subpopulations of users (e.g., by gender, ethnicity)
  • Explainability – Mechanisms to understand and evaluate the outputs of an AI system
  • Privacy and security – Data protected from theft and exposure
  • Robustness – Mechanisms to ensure an AI system operates reliably
  • Governance – Processes to define, implement, and enforce responsible AI practices within an organization
  • Transparency – Communicating information about an AI system so stakeholders can make informed choices about their use of the system

Our commitment to developing AI and ML in a responsible way is integral to how we build our services, engage with customers, and drive innovation. We are also committed to providing customers with tools and resources to develop and use AI/ML responsibly, from enabling ML builders with a fully managed development environment to helping customers embed AI services into common business use cases.

Providing customers with more transparency

Our customers want to know that the technology they are using was developed in a responsible way. They want resources and guidance to implement that technology responsibly at their own organization. And most importantly, they want to ensure that the technology they roll out is for everyone’s benefit, especially their end-users’. At AWS, we want to help them bring this vision to life.

To deliver the transparency that customers are asking for, we are excited to launch AWS AI Service Cards, a new resource to help customers better understand our AWS AI services. AI Service Cards are a form of responsible AI documentation that provide customers with a single place to find information on the intended use cases and limitations, responsible AI design choices, and deployment and performance optimization best practices for our AI services. They are part of a comprehensive development process we undertake to build our services in a responsible way that addresses fairness and bias, explainability, robustness, governance, transparency, privacy, and security. At AWS re:Invent 2022 we’re making the first three AI Service Cards available: Amazon Rekognition – Face Matching, Amazon Textract – AnalyzeID, and Amazon Transcribe – Batch (English-US).

Components of the AI Service Cards

Each AI Service Card contains four sections covering:

  • Basic concepts to help customers better understand the service or service features
  • Intended use cases and limitations
  • Responsible AI design considerations
  • Guidance on deployment and performance optimization

The content of the AI Service Cards addresses a broad audience of customers, technologists, researchers, and other stakeholders who seek to better understand key considerations in the responsible design and use of an AI service.

Our customers use AI in an increasingly diverse set of applications. The intended use cases and limitations section provides information about common uses for a service, and helps customers assess whether a service is a good fit for their application. For example, in the Amazon Transcribe – Batch (English-US) Card we describe the service use case of transcribing general-purpose vocabulary spoken in US English from an audio file. If a company wants a solution that automatically transcribes a domain-specific event, such as an international neuroscience conference, they can add custom vocabularies and language models to include scientific vocabulary in order to increase the accuracy of the transcription.
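As an illustration, such a job might be configured as in the sketch below using the Transcribe API; the vocabulary name, language model name, and S3 URI are hypothetical and would need to be created beforehand.

import boto3

transcribe = boto3.client("transcribe")

# Hypothetical transcription job that attaches a custom vocabulary and a
# custom language model (both placeholders, created ahead of time)
transcribe.start_transcription_job(
    TranscriptionJobName="neuroscience-conference-keynote",
    LanguageCode="en-US",
    Media={"MediaFileUri": "s3://my-bucket/talks/keynote.wav"},
    Settings={"VocabularyName": "neuroscience-terms"},
    ModelSettings={"LanguageModelName": "neuroscience-clm"},
)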

In the design section of each AI Service Card, we explain key responsible AI design considerations across important areas, such as our test-driven methodology, fairness and bias, explainability, and performance expectations. We provide example performance results on an evaluation dataset that is representative of a common use case. This example is just a starting point though, as we encourage customers to test on their own datasets to better understand how the service will perform on their own content and use cases in order to deliver the best experience for their end customers. And this is not a one-time evaluation. To build in a responsible way, we recommend an iterative approach where customers periodically test and evaluate their applications for accuracy or potential bias.

In the best practices for deployment and performance optimization section, we lay out key levers that customers should consider to optimize the performance of their application for real-world deployment. It’s important to explain how customers can optimize the performance of an AI system that acts as a component of their overall application or workflow to get the maximum benefit. For example, in the Amazon Rekognition Face Matching Card that covers adding face recognition capabilities to identity verification applications, we share steps customers can take to increase the quality of the face matching predictions incorporated into their workflow.

Delivering responsible AI resources and capabilities

Offering our customers the resources and tools they need to transform responsible AI from theory to practice is an ongoing priority for AWS. Earlier this year we launched our Responsible Use of Machine Learning guide that provides considerations and recommendations for responsibly using ML across all phases of the ML lifecycle. AI Service Cards complement our existing developer guides and blog posts, which provide builders with descriptions of service features and detailed instructions for using our service APIs. And with Amazon SageMaker Clarify and Amazon SageMaker Model Monitor, we offer capabilities to help detect bias in datasets and models and better monitor and review model predictions through automation and human oversight.

At the same time, we continue to advance responsible AI across other key dimensions, such as governance. At re:Invent today we launched a new set of purpose-built tools to help customers improve governance of their ML projects with Amazon SageMaker Role Manager, Amazon SageMaker Model Cards, and Amazon SageMaker Model Dashboard. Learn more on the AWS News blog and website about how these tools help to streamline ML governance processes.

Education is another key resource that helps advance responsible AI. At AWS we are committed to building the next generation of developers and data scientists in AI with the AI and ML Scholarship Program and AWS Machine Learning University (MLU). This week at re:Invent we launched a new, public MLU course on fairness considerations and bias mitigation across the ML lifecycle. Taught by the same Amazon data scientists who train AWS employees on ML, this free course features 9 hours of lectures and hands-on exercises, making it easy to get started.

AI Service Cards: A new resource—and an ongoing commitment

We are excited to bring a new transparency resource to our customers and the broader community and provide additional information on the intended uses, limitations, design, and optimization of our AI services, informed by our rigorous approach to building AWS AI services in a responsible way. Our hope is that AI Service Cards will act as a useful transparency resource and an important step in the evolving landscape of responsible AI. AI Service Cards will continue to evolve and expand as we engage with our customers and the broader community to gather feedback and continually iterate on our approach.

Contact our group of responsible AI experts to start a conversation.


About the authors

Vasi Philomin is currently a Vice President in the AWS AI team for services in the language and speech technologies areas, such as Amazon Lex, Amazon Polly, Amazon Translate, Amazon Transcribe/Transcribe Medical, Amazon Comprehend, Amazon Kendra, Amazon CodeWhisperer, Amazon Monitron, Amazon Lookout for Equipment, and Contact Lens/Voice ID for Amazon Connect, as well as the Machine Learning Solutions Lab and Responsible AI.

Peter Hallinan leads initiatives in the science and practice of Responsible AI at AWS AI, alongside a team of responsible AI experts. He has deep expertise in AI (PhD, Harvard) and entrepreneurship (Blindsight, sold to Amazon). His volunteer activities have included serving as a consulting professor at the Stanford University School of Medicine, and as the president of the American Chamber of Commerce in Madagascar. When possible, he’s off in the mountains with his children: skiing, climbing, hiking, and rafting.
