April 2022 – Page 2

How Searchmetrics uses Amazon SageMaker to automatically find relevant keywords and make their human analysts 20% faster

Searchmetrics is a global provider of search data, software, and consulting solutions, helping customers turn search data into unique business insights. To date, Searchmetrics has helped more than 1,000 companies such as McKinsey & Company, Lowe’s, and AXA find an advantage in the hyper-competitive search landscape.

In 2021, Searchmetrics turned to AWS to help with artificial intelligence (AI) usage to further improve their search insights capabilities.

In this post, we share how Searchmetrics built an AI solution that increased the efficiency of its human workforce by 20% by automatically finding relevant search keywords for any given topic, using Amazon SageMaker and its native integration with Hugging Face.

“Amazon SageMaker made it a breeze to evaluate and integrate Hugging Face’s state-of-the-art NLP models into our systems.
The solution we built makes us more efficient and greatly improves our user experience.” – Ioannis Foukarakis, Head of Data, Searchmetrics

Using AI to identify relevance from a list of keywords

A key part of Searchmetrics’ insights offering is its ability to identify the most relevant search keywords for a given topic or search intent.

To do this, Searchmetrics has a team of analysts assessing the potential relevance of certain keywords given a specific seed word. Analysts use an internal tool to review a keyword within a given topic and a generated list of potentially related keywords, and they must then select one or more related keywords that are relevant to that topic.

This manual filtering and selection process was time consuming and slowed down Searchmetrics’s ability to deliver insights to its customers.

To improve this process, Searchmetrics sought to build an AI solution that could use natural language processing (NLP) to understand the intent of a given search topic and automatically rank an unseen list of potential keywords by relevance.

Using SageMaker and Hugging Face to quickly build advanced NLP capabilities

To solve this, Searchmetrics’ engineering team turned to SageMaker, an end-to-end machine learning (ML) platform that helps developers and data scientists quickly and easily build, train, and deploy ML models.

SageMaker accelerates the deployment of ML workloads by simplifying the ML build process. It provides a broad set of ML capabilities on top of a fully managed infrastructure. This removes the undifferentiated heavy lifting that too-often hinders ML development.

Searchmetrics chose SageMaker because of the full range of capabilities it provided at every step of the ML development process:

SageMaker notebooks enabled the Searchmetrics team to quickly spin up fully managed ML development environments, perform data preprocessing, and experiment with different approaches
The batch transform capabilities in SageMaker enabled Searchmetrics to efficiently process its inference payloads in bulk, as well as easily integrate into its existing web service in production

Searchmetrics was also particularly interested in the native integration of SageMaker with Hugging Face, an exciting NLP startup that provides easy access to more than 7,000 pre-trained language models through its popular Transformers library.

SageMaker provides a direct integration with Hugging Face through a dedicated Hugging Face estimator in the SageMaker SDK. This makes it easy to run Hugging Face models on the fully managed SageMaker infrastructure.

With this integration, Searchmetrics was able to test and experiment with a range of different models and approaches to find the best-performing approach to their use case.

The end solution uses a zero-shot classification pipeline to identify the most relevant keywords. Different pre-trained models and query strategies were evaluated, with facebook/bart-large-mnli providing the most promising results.
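The post does not show Searchmetrics' code. As a minimal sketch of how a zero-shot classification pipeline can rank keywords, the hypothetical helper below treats each candidate keyword as a label for the topic text and sorts by the returned score. The function names, the topic-as-sequence query strategy, and the word-overlap stand-in classifier are assumptions for illustration only. In practice the classifier could be `transformers.pipeline("zero-shot-classification", model="facebook/bart-large-mnli")`, which returns `labels` and `scores` in the same shape.

```python
from typing import Callable, Dict, List, Tuple


def rank_keywords(topic: str,
                  keywords: List[str],
                  classifier: Callable[..., Dict]) -> List[Tuple[str, float]]:
    """Rank candidate keywords by zero-shot relevance to a topic.

    `classifier` is expected to behave like the Hugging Face
    zero-shot-classification pipeline: called with a text and
    candidate labels, it returns a dict with parallel 'labels'
    and 'scores' lists.
    """
    result = classifier(topic, candidate_labels=keywords, multi_label=True)
    pairs = zip(result["labels"], result["scores"])
    # Sort explicitly so the ranking does not rely on the classifier's ordering.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def fake_classifier(text, candidate_labels, multi_label=False):
    """Stand-in scorer (assumption): word overlap instead of a real NLI model."""
    topic_words = set(text.lower().split())
    scores = [len(topic_words & set(label.lower().split())) / len(label.split())
              for label in candidate_labels]
    return {"sequence": text, "labels": list(candidate_labels), "scores": scores}


ranked = rank_keywords("running shoes",
                       ["trail running shoes", "office chairs", "shoes"],
                       fake_classifier)
print(ranked)
```

Passing the classifier in as an argument keeps the ranking logic independent of the model, so different pre-trained models or query strategies can be swapped in and compared.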

Using AWS to improve operational efficiency and find new innovation opportunities

With SageMaker and its native integration with Hugging Face, Searchmetrics was able to build, train, and deploy an NLP solution that could understand a given topic and accurately rank an unseen list of keywords based on their relevance. The toolset offered by SageMaker made it easier to experiment and deploy.

When integrated with Searchmetrics’s existing internal tool, this AI capability delivered an average reduction of 20% in the time taken for human analysts to complete their job. This resulted in higher throughput, improved user experience, and faster onboarding of new users.

This initial success has not only improved the operational performance of Searchmetrics’s search analysts, but has also helped Searchmetrics chart a clearer path to deploying more comprehensive automation solutions using AI in its business.

These exciting new innovation opportunities help Searchmetrics continue to improve their insights capabilities, and also help them ensure that customers continue to stay ahead in the hyper-competitive search landscape.

In addition, Hugging Face and AWS announced a partnership earlier in 2022 that makes it even easier to train Hugging Face models on SageMaker. This functionality is available through the development of Hugging Face AWS Deep Learning Containers (DLCs). These containers include Hugging Face Transformers, Tokenizers, and the Datasets library, which allows us to use these resources for training and inference jobs.

For a list of the available DLC images, see available Deep Learning Containers Images, which are maintained and regularly updated with security patches. You can find many examples of how to train Hugging Face models with these DLCs and the Hugging Face Python SDK in the following GitHub repo.

Learn more about how you can accelerate your ability to innovate with AI/ML by visiting Getting Started with Amazon SageMaker, getting hands-on learning content by reviewing the Amazon SageMaker developer resources, or visiting Hugging Face on Amazon SageMaker.

About the Author

Daniel Burke is the European lead for AI and ML in the Private Equity group at AWS. Daniel works directly with Private Equity funds and their portfolio companies, helping them accelerate their AI and ML adoption to improve innovation and increase enterprise value.

Identify paraphrased text with Hugging Face on Amazon SageMaker

Identifying paraphrased text has business value in many use cases. For example, by identifying sentence paraphrases, a text summarization system could remove redundant information. Another application is to identify plagiarized documents. In this post, we fine-tune a Hugging Face transformer on Amazon SageMaker to identify paraphrased sentence pairs in a few steps.

A truly robust model can identify paraphrased text when the language used may be completely different, and also identify differences when the language used has high lexical overlap. In this post, we focus on the latter aspect. Specifically, we look at whether we can train a model that can identify the difference between two sentences that have high lexical overlap and very different or opposite meanings. For example, the following sentences have the exact same words but opposite meanings:

I took a flight from New York to Paris
I took a flight from Paris to New York
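To make the idea of high lexical overlap concrete, the short check below compares the two sentences as bags of words. It is only an illustration of why word counts alone cannot separate this pair; the variable names are ours.

```python
from collections import Counter

sentence_a = "I took a flight from New York to Paris"
sentence_b = "I took a flight from Paris to New York"

# A bag of words keeps word counts but discards word order.
bag_a = Counter(sentence_a.lower().split())
bag_b = Counter(sentence_b.lower().split())

same_bag = bag_a == bag_b                            # identical word counts
same_order = sentence_a.split() == sentence_b.split()  # identical word sequence
print(same_bag, same_order)  # True False
```

Any model that relies only on which words appear would treat these sentences as identical, which is why the model has to be sensitive to word order.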

Solution overview

We walk you through the following high-level steps:

Set up the environment.
Prepare the data.
Tokenize the dataset.
Fine-tune the model.
Deploy the model and perform inference.
Evaluate model performance.

If you want to skip setting up the environment, you can use the following notebook on GitHub and run the code in SageMaker.

Hugging Face and AWS announced a partnership earlier in 2022 that makes it even easier to train Hugging Face models on SageMaker. This functionality is available through the development of Hugging Face AWS Deep Learning Containers (DLCs). These containers include Hugging Face Transformers, Tokenizers, and the Datasets library, which allows us to use these resources for training and inference jobs. For a list of the available DLC images, see Available Deep Learning Containers Images. They are maintained and regularly updated with security patches. You can find many examples of how to train Hugging Face models with these DLCs and the Hugging Face Python SDK in the following GitHub repo.

The PAWS dataset

Because few sentence-pair datasets exhibited high lexical overlap without being paraphrases, the original PAWS dataset, released in 2019, aimed to give the natural language processing (NLP) community a new resource for training and evaluating paraphrase detection models. PAWS sentence pairs are generated in two steps using Wikipedia and the Quora Question Pairs (QQP) dataset. A language model first swaps words in a sentence to generate a sentence pair with the same bag of words (BOW). A back translation step then generates paraphrases with high BOW overlap but a different word order. The final PAWS dataset contains a total of 108,000 human-labeled and 656,000 noisily labeled pairs.

In this post, we use the PAWS-Wiki Labeled (Final) dataset from Hugging Face. Hugging Face has already performed the data split for us, which results in 49,000 sentence pairs in the training dataset, and 8,000 sentence pairs each for the validation and test datasets. Two sentence pair examples from the training dataset are shown in the following example. A label of 1 indicates that the two sentences are paraphrases of each other.

Sentence 1	Sentence 2	Label
Although interchangeable, the body pieces on the 2 cars are not similar.	Although similar, the body parts are not interchangeable on the 2 cars.	0
Katz was born in Sweden in 1947 and moved to New York City at the age of 1.	Katz was born in 1947 in Sweden and moved to New York at the age of one.	1

Prerequisites

You need to complete the following prerequisites:

Sign up for an AWS account if you don’t have one. For more information, see Set Up Amazon SageMaker Prerequisites.
Get started using SageMaker notebook instances.
Set up the right AWS Identity and Access Management (IAM) permissions. For more information, see SageMaker Roles.

Set up the environment

Before we begin examining and preparing our data for model fine-tuning, we need to set up our environment. Let’s start by spinning up a SageMaker notebook instance. Choose an AWS Region in your AWS account and follow the instructions to create a SageMaker notebook instance. The notebook instance may take a few minutes to spin up.

When the notebook instance is running, choose conda_pytorch_p38 as your kernel type. To use the Hugging Face dataset, we first need to install and import the Hugging Face library:

!pip --quiet install "sagemaker" "transformers==4.17.0" "datasets==1.18.4" --upgrade
!pip --quiet install sentence-transformers

import sagemaker.huggingface
import sagemaker
from datasets import load_dataset

Next, let’s establish a SageMaker session. We use the default Amazon Simple Storage Service (Amazon S3) bucket associated with the SageMaker session to store the PAWS dataset and model artifacts:

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = sess.default_bucket()

Prepare the data

We can load the Hugging Face version of the PAWS dataset with its load_dataset() command. This call downloads and imports the PAWS Python processing script from the Hugging Face GitHub repository, which then downloads the PAWS dataset from the original URL stored in the script and caches the data as an Arrow table on the drive. See the following code:

dataset_train, dataset_val, dataset_test = load_dataset("paws", "labeled_final", split=['train', 'validation', 'test'])

Before we begin fine-tuning our pre-trained BERT model, let’s look at our target class distribution. For our use case, the PAWS dataset has binary labels (0 indicates the sentence pair is not a paraphrase, and 1 indicates it is). Let’s create a column chart to view the class distribution, as shown in the following code. We see that there is a slight class imbalance issue in our training set (56% negative samples vs. 44% positive samples). However, the imbalance is small enough to avoid employing class imbalance mitigation techniques.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = dataset_train.to_pandas()

ax = sns.countplot(x="label", data=df)
ax.set_title('Label Count for PAWS Dataset', fontsize=15)
for p in ax.patches:
    ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.4, p.get_height()), ha='center', va='top', color='white', size=13)
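For a numeric check of the class balance alongside the chart, a small helper like the following can compute each label's share. This is a sketch: `label_shares` is a hypothetical name and the toy labels are made up for illustration. With the dataset loaded, `dataset_train["label"]` could be passed instead.

```python
from collections import Counter


def label_shares(labels):
    """Return each label's share of the dataset as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in sorted(counts.items())}


# Toy labels for illustration only, not PAWS data.
toy_labels = [0, 0, 0, 1, 1]
print(label_shares(toy_labels))  # {0: 60.0, 1: 40.0}
```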

Tokenize the dataset

Before we can begin fine-tuning, we need to tokenize our dataset. As a starting point, let’s say we want to fine-tune and evaluate the roberta-base transformer. We selected roberta-base because it’s a general-purpose transformer that was pre-trained on a large corpus of English data and has frequently shown high performance on a variety of NLP tasks. The model was originally introduced in the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach.

We perform tokenization on the sentences with a roberta-base tokenizer from Hugging Face, which uses byte-level Byte Pair Encoding to split the document into tokens. For more details about the RoBERTa tokenizer, refer to RobertaTokenizer. Because our inputs are sentence pairs, we need to tokenize both sentences simultaneously. Because most BERT models require the input to have a fixed tokenized input length, we set the following parameters: max_len=128 and truncation=True. See the following code:

from transformers import AutoTokenizer
tokenizer_and_model_name = 'roberta-base'

# Download tokenizer
tokenizer = AutoTokenizer.from_pretrained(tokenizer_and_model_name)

# Tokenizer helper function
def tokenize(batch, max_len=128):
    return tokenizer(batch['sentence1'], batch['sentence2'], max_length=max_len, truncation=True)

dataset_train_tokenized = dataset_train.map(tokenize, batched=True, batch_size=len(dataset_train))
dataset_val_tokenized = dataset_val.map(tokenize, batched=True, batch_size=len(dataset_val))

The last preprocessing step for fine-tuning our BERT model is to convert the tokenized train and validation datasets into PyTorch tensors and upload them to our S3 bucket:

import botocore
from datasets.filesystems import S3FileSystem

s3 = S3FileSystem()
s3_prefix = 'sts-sbert-paws/sts-paws-datasets'

# convert and save train_dataset to s3
training_input_path = f's3://{sess.default_bucket()}/{s3_prefix}/train'
dataset_train_tokenized = dataset_train_tokenized.rename_column("label", "labels")
dataset_train_tokenized.set_format('torch', columns=['input_ids', 'attention_mask', 'labels'])
dataset_train_tokenized.save_to_disk(training_input_path,fs=s3)

# convert and save val_dataset to s3
val_input_path = f's3://{sess.default_bucket()}/{s3_prefix}/val'
dataset_val_tokenized = dataset_val_tokenized.rename_column("label", "labels")
dataset_val_tokenized.set_format('torch', columns=['input_ids', 'attention_mask', 'labels'])
dataset_val_tokenized.save_to_disk(val_input_path,fs=s3)

Fine-tune the model

Now that we’re done with data preparation, we’re ready to fine-tune our pre-trained roberta-base model on the paraphrase identification task. We can use the SageMaker Hugging Face Estimator class to initiate the fine-tuning process in two steps. The first step is to specify the training hyperparameters and metric definitions. The metric definitions variable tells the Hugging Face Estimator what types of metrics to extract from the model’s training logs. Here, we’re primarily interested in extracting validation set metrics at each training epoch.

# Step 1: specify training hyperparameters and metric definitions
hyperparameters = {'epochs': 4,
                   'train_batch_size': 16,
                   'model_name': tokenizer_and_model_name}
                   
metric_definitions=[
    {'Name': 'loss', 'Regex': "'loss': ([0-9]+(.|e-)[0-9]+),?"},
    {'Name': 'eval_loss', 'Regex': "'eval_loss': ([0-9]+(.|e-)[0-9]+),?"},
    {'Name': 'eval_accuracy', 'Regex': "'eval_accuracy': ([0-9]+(.|e-)[0-9]+),?"},
    {'Name': 'eval_f1', 'Regex': "'eval_f1': ([0-9]+(.|e-)[0-9]+),?"},
    {'Name': 'eval_precision', 'Regex': "'eval_precision': ([0-9]+(.|e-)[0-9]+),?"},
    {'Name': 'eval_recall', 'Regex': "'eval_recall': ([0-9]+(.|e-)[0-9]+),?"},
    {'Name': 'epoch', 'Regex': "'epoch': ([0-9]+(.|e-)[0-9]+),?"}
]
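To see what these patterns capture, the sketch below applies two of them to a single log line with Python's `re` module. The log line is hypothetical, written in the dict-style format that the regexes target. The first capture group is the metric value.

```python
import re

# Same patterns as the 'eval_loss' and 'epoch' entries in metric_definitions.
eval_loss_regex = "'eval_loss': ([0-9]+(.|e-)[0-9]+),?"
epoch_regex = "'epoch': ([0-9]+(.|e-)[0-9]+),?"

# Hypothetical training log line, for illustration only.
log_line = "{'eval_loss': 0.2541, 'eval_accuracy': 0.9213, 'eval_f1': 0.9105, 'epoch': 1.0}"

# group(1) is the numeric value captured by the outer parentheses.
eval_loss = float(re.search(eval_loss_regex, log_line).group(1))
epoch = float(re.search(epoch_regex, log_line).group(1))
print(eval_loss, epoch)  # 0.2541 1.0
```

Note that the `.` in these patterns is unescaped, so it matches any character, not only a decimal point. Writing `\.` would be stricter, although the looser form works for log lines like this one.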

The second step is to instantiate the Hugging Face Estimator and start the fine-tuning process with the .fit() method:

# Step 2: instantiate estimator and begin fine-tuning
import time  # used below to build a unique job name

from sagemaker.huggingface import HuggingFace

huggingface_estimator = HuggingFace(
                            entry_point='train.py',
                            source_dir='./scripts',
                            output_path=f's3://{sess.default_bucket()}',
                            base_job_name='huggingface-sdk-extension',
                            instance_type='ml.p3.8xlarge',
                            instance_count=1,
                            volume_size=100,
                            transformers_version='4.17.0',
                            pytorch_version='1.10.2',
                            py_version='py38',
                            role=role,
                            hyperparameters=hyperparameters,
                            metric_definitions=metric_definitions
                        )
                        
huggingface_estimator.fit({'train': training_input_path, 'test': val_input_path}, 
                          wait=True, 
                          job_name='sm-sts-blog-{}'.format(int(time.time())))

The fine-tuning process takes approximately 30 minutes using the specified hyperparameters.

Deploy the model and perform inference

SageMaker offers multiple deployment options depending on your use case. For persistent, real-time endpoints that make one prediction at a time, we recommend using SageMaker real-time hosting services. If you have workloads that have idle periods between traffic spurts and can tolerate cold starts, we recommend using Serverless Inference. Serverless endpoints automatically launch compute resources and scale them in and out depending on traffic, eliminating the need to choose instance types or manage scaling policies. We demonstrate how to deploy our fine-tuned Hugging Face model to both a real-time inference endpoint and a Serverless Inference endpoint.

Deploy to a real-time inference endpoint

You can deploy a training object onto real-time inference hosting within SageMaker using the .deploy() method. For a full list of the accepted parameters, refer to Hugging Face Model. To start, let’s deploy the model to one instance, by passing in the following parameters: initial_instance_count, instance_type, and endpoint_name. See the following code:

rt_predictor = huggingface_estimator.deploy(initial_instance_count=1,
                                            instance_type="ml.g4dn.xlarge",
                                            endpoint_name="sts-sbert-paws")

The model takes a few minutes to deploy. After the model is deployed, we can submit sample records from the unseen test dataset to the endpoint for inference.

Deploy to a Serverless Inference endpoint

To deploy our training object onto a serverless endpoint, we need to first specify a serverless config file with memory_size_in_mb and max_concurrency arguments:

from sagemaker.serverless.serverless_inference_config import ServerlessInferenceConfig

serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=6144,
    max_concurrency=1,
)

memory_size_in_mb defines the total RAM size of your serverless endpoint; the minimal RAM size is 1024 MB (1 GB) and it can scale up to 6144 MB (6 GB). Generally, you should aim to choose a memory size that is at least as large as your model size. max_concurrency defines the quota for how many concurrent invocations can be processed at the same time (up to 50 concurrent invocations) for a single endpoint.
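As a rough aid for choosing memory_size_in_mb, the hypothetical helper below picks the smallest size that is at least as large as the model. It assumes the 1 GB steps between 1024 MB and 6144 MB that Serverless Inference accepted at the time of writing. The model sizes in the example are approximate and for illustration only.

```python
# Memory sizes assumed to be accepted by Serverless Inference, in MB (1 GB steps).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def pick_memory_size_mb(model_size_mb: float) -> int:
    """Return the smallest allowed memory size that is >= the model size."""
    for size in ALLOWED_MEMORY_MB:
        if size >= model_size_mb:
            return size
    raise ValueError(
        f"Model of {model_size_mb} MB exceeds the {ALLOWED_MEMORY_MB[-1]} MB limit"
    )


print(pick_memory_size_mb(500))   # 1024
print(pick_memory_size_mb(3500))  # 4096
```

The result is a lower bound rather than a recommendation. The container and tokenizer also need memory, which is one reason to choose a larger value such as the 6144 MB used here.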

We also need to supply the Hugging Face inference image URI, which you can retrieve using the following code:

image_uri = sagemaker.image_uris.retrieve(
    framework="huggingface",
    base_framework_version="pytorch1.10",
    region=sess.boto_region_name,
    version="4.17",
    py_version="py38",
    instance_type="ml.m5.large",
    image_scope="inference",
)

Now that we have the serverless config file, we can create a serverless endpoint in the same way as our real-time inference endpoint, using the .deploy() method:

sl_predictor = huggingface_estimator.deploy(
    serverless_inference_config=serverless_config, image_uri=image_uri
)

The endpoint should be created in a few minutes.

Perform model inference

To make predictions, we need to create the sentence pair by adding the [CLS] and [SEP] special tokens and subsequently submit the input to the model endpoints. The syntax for real-time inference and serverless inference is the same:

import random 

rand = random.randrange(0, 8000)

true_label = dataset_test[rand]['label']
sent_1 = dataset_test[rand]['sentence1']
sent_2 = dataset_test[rand]['sentence2']

sentence_pair = {"inputs": ['[CLS] ' + sent_1 + ' [SEP] ' + sent_2 + ' [SEP]']}


# real-time inference
# sentence_pair already has the {"inputs": [...]} shape, so pass it as-is
rt_result = rt_predictor.predict(sentence_pair)[0]
print('Sentence 1:', sent_1)
print('Sentence 2:', sent_2)
print()
print('Inference Endpoint:', rt_predictor.endpoint_name)
print('True Label:', true_label)
print('Predicted Label:', rt_result['label'])
print('Prediction Confidence:', rt_result['score'])

# serverless inference
sl_result = sl_predictor.predict(sentence_pair)[0]
print('Sentence 1:', sent_1)
print('Sentence 2:', sent_2)
print()
print('Inference Endpoint:', sl_predictor.endpoint_name)
print('True Label:', true_label)
print('Predicted Label:', sl_result['label'])
print('Prediction Confidence:', sl_result['score'])

In the following examples, we can see the model is capable of correctly classifying whether the input sentence pair contains paraphrased sentences.

The following is a real-time inference example.

The following is a Serverless Inference example.

Evaluate model performance

To evaluate the model, let’s expand the preceding code and submit all 8,000 unseen test records to the real-time endpoint:

from tqdm import tqdm

preds = []
labels = []

# Inference takes ~5 minutes for all test records using a fine-tuned roberta-base and ml.g4dn.xlarge instance

for i in tqdm(range(len(dataset_test))):
    true_label = dataset_test[i]['label']
    sent_1 = dataset_test[i]['sentence1']
    sent_2 = dataset_test[i]['sentence2']
    
    sentence_pair = {"inputs": ['[CLS] ' + sent_1 + ' [SEP] ' + sent_2 + ' [SEP]']}
    pred = rt_predictor.predict(sentence_pair)
    
    labels.append(true_label)
    preds.append(int(pred[0]['label'].split('_')[1]))

Next, we can create a classification report using the extracted predictions:

from sklearn.metrics import classification_report

print('Endpoint Name:', rt_predictor.endpoint_name)
# target_names follow sorted label order: 0 = not paraphrase, 1 = paraphrase
class_names = ['not paraphrase', 'paraphrase']
print(classification_report(labels, preds, target_names=class_names))
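For readers who want to see what the report computes, the following dependency-free sketch derives per-class precision, recall, and F1, plus the macro-average F1. The function names and toy labels are ours, and this is not the scikit-learn implementation.

```python
def per_class_scores(y_true, y_pred, positive):
    """Precision, recall and F1 when `positive` is treated as the target class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(per_class_scores(y_true, y_pred, c)[2] for c in classes) / len(classes)


# Toy labels for illustration only (0 = not paraphrase, 1 = paraphrase).
toy_true = [0, 0, 1, 1, 1, 0]
toy_pred = [0, 1, 1, 1, 0, 0]
print(per_class_scores(toy_true, toy_pred, positive=1))
print(macro_f1(toy_true, toy_pred))
```

The macro average weights both classes equally, which suits a dataset like this one where the classes are only mildly imbalanced.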

We get the following test scores.

We can observe that roberta-base has a combined macro-average F1 score of 92% and performs slightly better at detecting sentences that are paraphrases. The roberta-base model performs well, but it’s good practice to calculate model performance using at least one other model.

The following table compares roberta-base performance results on the same test set against another fine-tuned transformer called paraphrase-mpnet-base-v2, a sentence transformer pre-trained specifically for the paraphrase identification task. Both models were trained on an ml.p3.8xlarge instance.

Model	Precision	Recall	F1-score	Training time (billable)	Inference time (full test set)
roberta-base	0.92	0.93	0.92	18 minutes	2 minutes
paraphrase-mpnet-base-v2	0.92	0.91	0.91	17 minutes	2 minutes

The results show that roberta-base has a 1% higher F1 score, with very similar training and inference times using real-time inference hosting on SageMaker. The performance difference between the models is relatively minor, but roberta-base is ultimately the winner because it has marginally better performance metrics and almost identical training and inference times.

Clean up

When you’re done using the model endpoints, you can delete them to avoid incurring future charges:

rt_predictor.delete_endpoint()
sl_predictor.delete_endpoint()

Conclusion

In this post, we discussed how to rapidly build a paraphrase identification model using Hugging Face transformers on SageMaker. We fine-tuned two pre-trained transformers, roberta-base and paraphrase-mpnet-base-v2, using the PAWS dataset (which contains sentence pairs with high lexical overlap). We demonstrated and discussed the benefits of real-time inference vs. Serverless Inference deployment, the latter being a new feature that targets spiky workloads and eliminates the need to manage scaling policies. On an unseen test set with 8,000 records, we demonstrated that both models achieved an F1 score greater than 90%.

To expand on this solution, consider the following:

Try fine-tuning with your own custom dataset. If you don’t have sufficient training labels, you could evaluate the performance of a fine-tuned model like the one demonstrated in this post on a custom test dataset.
Integrate this fine-tuned model into a downstream application that requires information on whether two sentences (or blocks of text) are paraphrases of each other.

Happy building!

About the Authors

Bala Krishnamoorthy is a Data Scientist with AWS Professional Services, where he enjoys applying machine learning to solve customer business problems. He specializes in natural language processing use cases and has worked with customers in industries such as software, finance and healthcare. In his free time, he enjoys trying new food, watching comedies and documentaries, working out at Orange Theory, and being out on the water (paddle-boarding, snorkeling and hopefully diving soon).

Ivan Cui is a Data Scientist with AWS Professional Services, where he helps customers build and deploy solutions using machine learning on AWS. He has worked with customers across diverse industries, including software, finance, pharmaceutical, and healthcare. In his free time, he enjoys reading, spending time with his family, and maximizing his stock portfolio.

How Moovit turns data into insights to help passengers avoid delays using Apache Airflow and Amazon SageMaker

This is a guest post by Moovit’s Software and Cloud Architect, Sharon Dahan.

Moovit, an Intel company, is a leading Mobility as a Service (MaaS) solutions provider and creator of the top urban mobility app. Moovit serves over 1.3 billion riders in 3,500 cities around the world.

We help people everywhere get to their destination in the smoothest way possible, by combining all options for real-time trip planning and payment in one app. We provide governments, cities, transit agencies, operators, and all organizations with mobility challenges with AI-powered mobility solutions that cover planning, operations, and analytics.

In this post, we describe how Moovit built an automated pipeline to train and deploy BERT models which classify public transportation service alerts in multiple metropolitan areas using Apache Airflow and Amazon SageMaker.

The service alert challenge

One of the key features in Moovit’s urban mobility app is offering access to transit service alerts (sourced from local operators and agencies) to app users around the world.

A service alert is a text message that describes a change (which can be positive or negative) in public transit service. These alerts are typically communicated by the operator in a long textual format and need to be analyzed in order to classify their potential impact on the user’s trip plan. The service alert classification affects the way transit recommendations are shown in the app. An incorrect classification may cause users to ignore important service interruptions that may impact their trip plan.

Existing solution and classification challenges

Historically, Moovit applied both automated rule-based classification (which works well for simple logic) as well as manual human classification for more complex cases.

For example, the alert “Line 46 will arrive 10 min later as a result of an accident with a deer.” can be classified into one of the following categories:

1: "NO_SERVICE",
2: "REDUCED_SERVICE",
3: "SIGNIFICANT_DELAYS",
4: "DETOUR",
5: "ADDITIONAL_SERVICE",
6: "MODIFIED_SERVICE",
7: "OTHER_EFFECT",
9: "STOP_MOVED",

The above example should be classified as 3, which is SIGNIFICANT_DELAYS.

The existing rule-based classification solution searches the text for key phrases (for example delay or late) as illustrated in the following diagram.

While the rule-based classification engine offered accurate classifications, it was able to classify only 20% of the service alerts, requiring the other 80% to be classified manually. This was not scalable and resulted in gaps in our service alerts coverage.
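A key-phrase classifier of this kind can be sketched in a few lines. The rules below are hypothetical, since Moovit's actual rules are not given in this post. Only the category names come from the list above. Returning `None` marks alerts that no rule covers and that would fall through to manual classification.

```python
from typing import Optional

# Hypothetical key-phrase rules, for illustration only.
RULES = {
    "SIGNIFICANT_DELAYS": ("delay", "late"),
    "DETOUR": ("detour", "rerouted"),
    "NO_SERVICE": ("no service", "suspended"),
}


def classify_by_rules(alert_text: str) -> Optional[str]:
    """Return the first category whose key phrase appears in the alert, else None."""
    text = alert_text.lower()
    for category, phrases in RULES.items():
        if any(phrase in text for phrase in phrases):
            return category
    return None  # no rule matched: the alert needs manual classification


print(classify_by_rules("Line 46 will arrive 10 min later as a result of an accident with a deer."))
print(classify_by_rules("Expect longer waits for B4 and B8 buses."))
```

The second alert describes a delay without using any listed phrase, so it is left unclassified. Plain substring matching is also crude, because a phrase such as "late" matches inside unrelated words. Both limits show why key-phrase rules cover only part of the alerts.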

NLP-based classification with a BERT framework

We decided to leverage a neural network that can learn to classify service alerts and selected the BERT model for this challenge.

BERT (Bidirectional Encoder Representations from Transformers) is an open-source machine learning (ML) framework for natural language processing (NLP). BERT is designed to help computers understand the meaning of ambiguous language in the text by using surrounding text to establish context. The BERT framework was pre-trained using text from the BooksCorpus with 800M words and English Wikipedia with 2,500M words, and can be fine-tuned with question-answer datasets.

We leveraged classified data from our rule-based classification engine as ground truth for the training job and explored two possible approaches:

Approach 1: The first approach was to train using the pre-trained BERT model, which meant adding our own layers at the beginning and at the end of the pre-trained model.
Approach 2: The second approach was to use the BERT tokenizer with a standard five-layer model.

Comparison tests showed that, due to the limited amount of available ground truth data, the BERT tokenizer approach yielded better results, was less time-consuming, and required minimal compute resources for training. The model was able to successfully classify service alerts that could not be classified with the existing rule-based classification engine.

The following diagram illustrates the model’s high-level architecture.

After we have the trained model, we deploy it to a SageMaker endpoint and expose it to the Moovit backend server (with request payload being the service alert’s raw text). See the following example code:

{
   "instances": [
       "Expect longer waits for </br> B4, B8, B11, B12, B14, B17, B24, B35, B38, B47, B48, B57, B60, B61, B65, B68, B82, and B83 buses.\r\n\r\nWe're working to provide as much service as possible."
   ]
}

The response is the classification and the level of confidence:

{
   "response": [
       {
           "id": 1,
           "prediction": "SIGNIFICANT_DELAYS",
           "confidance": 0.921
       }
   ]
}
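On the calling side, building the request and reading the response might look like the following sketch. The field names are taken from the examples above, including the `confidance` key as spelled in the response. The helper names and the `min_confidence` threshold are assumptions, and the call to the endpoint itself is omitted.

```python
import json


def build_request(alert_texts):
    """Serialize raw alert texts into the request body format shown above."""
    return json.dumps({"instances": list(alert_texts)})


def parse_response(body, min_confidence=0.8):
    """Return (prediction, confidence, accepted) for each item in the response.

    `min_confidence` is a hypothetical threshold, not a value from the post.
    The key "confidance" is spelled as in the example response.
    """
    items = json.loads(body)["response"]
    return [(item["prediction"], item["confidance"], item["confidance"] >= min_confidence)
            for item in items]


sample_response = '{"response": [{"id": 1, "prediction": "SIGNIFICANT_DELAYS", "confidance": 0.921}]}'
print(build_request(["Expect longer waits for B4 buses."]))
print(parse_response(sample_response))
```

A threshold like this is one way a backend could route low-confidence predictions to manual review instead of showing them to riders.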

From research to production – overcoming operational challenges

Once we trained an NLP model, we had to overcome several challenges in order to enable our app users to access service alerts at scale and in a timely manner:

How do we deploy a model to our production environment?
How do we serve the model at scale with low latency?
How do we re-train the model in order to future-proof our solution?
How do we expand to other metropolitan areas (aka “metros”) in an efficient way?

Prior to using SageMaker, we used to take the trained ML models and manually integrate them into our backend environment. This created a dependency between the model deployment and a backend upgrade. As a result, our ability to deploy new models was very limited and resulted in extremely rare model updates.

In addition, serving an ML model can require substantial compute resources which are difficult to predict and need to be provisioned for in advance in order to ensure adherence to our strict latency requirements. When the model is served within the backend this can cause unnecessary scaling of compute resources and erratic behavior.

The solution to both these challenges was to use SageMaker endpoints for our real time inference requirements. This enabled us to (1) de-couple the model serving and deployment cycle from the backend release schedule and (2) de-couple the resource provisioning required for model serving (also in peak periods) from the backend provisioning.

Because our group already had deep experience with Airflow, we decided to automate the entire pipeline using Airflow operators in conjunction with SageMaker. As you can see below, we built a full CI/CD pipeline to automate data collection, model re-training and to manage the deployment process. This pipeline can also be leveraged to make the entire process scalable to new metropolitan areas, as we continue to increase our coverage in additional cities worldwide.

AI Lake architecture

The architecture shown in the following diagram is based on SageMaker and Airflow; all endpoints exposed to developers use Amazon API Gateway. This implementation was dubbed “AI lake”.

SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning models quickly by bringing together a broad set of capabilities purpose-built for machine learning.

Moovit uses SageMaker to automate the training and deployment process. The trained models are saved to Amazon Simple Storage Service (Amazon S3) and cataloged.

SageMaker helps us significantly reduce the need for engineering time and lets us focus more on developing features for the business and less on the infrastructure required to support the model’s lifecycle. Below you can see Moovit’s SageMaker Training Jobs.

After we train the Metro’s model, we expose it using the SageMaker endpoint. SageMaker enables us to deploy a new version seamlessly to the app, without any downtime.

Moovit uses API Gateway to expose all models under the same domain, as shown in the following screenshot.

Moovit decided to use Airflow to schedule and create a holistic workflow. Each model has its own workflow, which includes the following steps:

Dataset generation – The owner of this step is the BI team. This step automatically creates a fully balanced dataset with which to train the model. The final dataset is saved to an S3 bucket.
Train – The owner of this step is the server team. This step fetches the dataset from the previous step and trains the model using SageMaker. SageMaker takes care of the whole training process, such as provisioning the instance, running the training code, saving the model, and saving the training job results and logs.
Verify – This step is owned by the data science team. During the verification step, Moovit runs a confusion matrix and checks some of the parameters to make sure that the model is healthy and stands within proper thresholds. If the new model misses the criteria, the flow is canceled and the deploy step doesn’t run.
Deploy – The owner of this step is the DevOps team. This step triggers the deploy function for SageMaker (using Boto3) to update the existing endpoint or create a new one.
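
The verify and deploy steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than Moovit's actual code: the recall threshold of 0.8, the label names, and all helper names are hypothetical, and the post does not say which parameters the verification checks. The sm_client argument stands for a boto3.client("sagemaker") object; the sketch uses its list_endpoints, update_endpoint, and create_endpoint calls, and a small stand-in class lets it run without AWS access.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count (actual, predicted) pairs; rows are actual labels, columns predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def per_class_recall(matrix):
    """Recall per class: correct predictions divided by actual occurrences."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def model_is_healthy(matrix, min_recall=0.8):
    """Hypothetical gate: every class must reach min_recall (assumed threshold)."""
    return all(recall >= min_recall for recall in per_class_recall(matrix))


def deploy(sm_client, endpoint_name, endpoint_config_name):
    """Update the endpoint if it already exists, otherwise create it.

    sm_client is assumed to be boto3.client("sagemaker"). Pagination of
    list_endpoints is ignored to keep the sketch short.
    """
    existing = sm_client.list_endpoints(NameContains=endpoint_name)["Endpoints"]
    if any(e["EndpointName"] == endpoint_name for e in existing):
        sm_client.update_endpoint(
            EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
        )
        return "updated"
    sm_client.create_endpoint(
        EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
    )
    return "created"


def verify_then_deploy(sm_client, y_true, y_pred, labels, endpoint_name, config_name):
    """Cancel the flow when verification fails, so the deploy step never runs."""
    if not model_is_healthy(confusion_matrix(y_true, y_pred, labels)):
        return "canceled"
    return deploy(sm_client, endpoint_name, config_name)


class FakeSageMakerClient:
    """Stand-in for the boto3 client so the sketch runs without AWS access."""

    def __init__(self):
        self.endpoints = []

    def list_endpoints(self, NameContains):
        names = [n for n in self.endpoints if NameContains in n]
        return {"Endpoints": [{"EndpointName": n} for n in names]}

    def update_endpoint(self, EndpointName, EndpointConfigName):
        pass

    def create_endpoint(self, EndpointName, EndpointConfigName):
        self.endpoints.append(EndpointName)


if __name__ == "__main__":
    labels = ["SIGNIFICANT_DELAYS", "MODIFIED_SERVICE"]  # hypothetical label set
    truth = ["SIGNIFICANT_DELAYS", "SIGNIFICANT_DELAYS", "MODIFIED_SERVICE", "MODIFIED_SERVICE"]
    client = FakeSageMakerClient()
    print(verify_then_deploy(client, truth, truth, labels, "metro-nyc", "metro-nyc-v1"))
```

Keeping the gate as a plain function that returns a status makes it easy to wrap in a workflow task that short-circuits the downstream deploy task when the model misses its thresholds.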

Results

With the AI lake solution and service alert classification model, Moovit accomplished two major achievements:

Functional – In Metros where the service alert classification model was deployed, Moovit tripled the percentage of classified service alerts, from 20% to over 60%.
Operational – Moovit now has the ability to maintain and develop more ML models with less engineering effort, and with very clear and outlined best practices and responsibilities. This opens new opportunities for integrating AI and ML models into Moovit’s products and technologies.

The following charts illustrate the service alert classifications before (left) and after (right) implementing this solution – the turquoise area is the unclassified alerts (aka “modified service”).

Conclusion

In this post, we shared how Moovit used SageMaker with Airflow to triple the share of classified service alerts (a 200% increase, from 20% to over 60%). Moovit is now able to maintain and develop more ML models with less engineering effort and with clear practices and responsibilities.

About the Authors

Sharon Dahan is a Software & Cloud Architect at Moovit. He is responsible for bringing innovative and creative solutions which can stand within Moovit’s tremendous scale. In his spare time, Sharon makes tasty hoppy beer.

Miron Perel is a Senior Machine Learning Business Development Manager with Amazon Web Services. Miron helps enterprise organizations harness the power of data and Machine Learning to innovate and grow their business.

Eitan Sela is a Machine Learning Specialist Solutions Architect with Amazon Web Services. He works with AWS customers to provide guidance and technical assistance, helping them build and operate machine learning solutions on AWS. In his spare time, Eitan enjoys jogging and reading the latest machine learning articles.

This AI-enabled robotic boat cleans up harbors and rivers to keep plastic trash out of the ocean

The post This AI-enabled robotic boat cleans up harbors and rivers to keep plastic trash out of the ocean appeared first on The AI Blog.

How DNEG Helped Win Another Visual-Effects Oscar by Bringing ‘Dune’ to Life With NVIDIA RTX

Featuring stunning visuals from futuristic interstellar worlds, including colossal sand creatures, Dune captivated audiences around the world.

The sci-fi film picked up six Oscars last month at the 94th Academy Awards, including for Best Sound and Visual Effects. Adapted from Frank Herbert’s 1965 novel of the same name, Dune tells the story of Paul Atreides, a heroic character whose family travels to the dangerous planet of Arrakis.

To bring the story to life, now seven-time Academy Award-winning studio DNEG used a blend of practical and visual effects, creating spectacular graphics that capture the dystopian worlds of Dune.

The film’s production visual effects supervisor, Paul Lambert, said that his focus was on seamlessly augmenting or enhancing what was already accomplished with the beautiful production design and cinematography — grounding the visual effects in reality. DNEG contributed to 28 sequences and over 1,000 VFX shots in the film, and the artists worked from multiple locations using NVIDIA RTX Virtual Workstations.

A Mix of Sand and Simulations

Sand, inevitably, was going to play a major part in Dune, and 18 tons of it were used to make the film. But the VFX team also digitally replicated every aspect of it to perfectly blend simulated sand into the shots.

“One of the things we came up against early on was getting that essence of scale,” said Paul Salvini, global CTO of DNEG. “In simulations, each grain of sand is literally the size of a pixel, which means we needed huge volumes of sand, and that turned into petabytes of data.”

Beyond replicating the look and feel of sand, the team needed to realistically capture its movement. This became even more challenging when it came to finding a way to depict massive sandworms moving through the desert.

The artists spent months building, modeling and sculpting the creature into shape. They took inspiration from baleen whales — months of research revealed that when a colossal object moves through sand, the environment around it behaves like water, similar to how a whale moves through the ocean.

DNEG then simulated each sand particle to see how it would cascade off a sandworm, or how the dunes would ripple as the creature moved around. For the latter, the practical effects team created a sand-displacement effect by placing a vibrating metal plate under real sand, and the VFX team expanded it to simulate the effect on a much larger scale.

“It’s tricky to do, because it’s super complex and computationally expensive to figure out how one grain of sand is connected to another grain — and have all of this act on a massive scale,” said Salvini. “It was an iterative process, and it takes a long time to actually simulate all of these particles.”

DNEG used a combination of Dell Precision workstations and Dell PowerEdge R740 servers with NVIDIA RTX and server GPUs to iterate quickly and make changes, ensuring the simulations with the sandworm looked realistic.

To add more realism to the creature, the artists looked to the bristly teeth of baleen whales. The VFX team modeled different versions of the teeth and used a scattering system in the Houdini app, which allowed them to populate the worm’s mouth at render time.

Using Isotropix Clarisse and NVIDIA RTX, DNEG artists rendered graphics in hours instead of days. This allowed them to receive feedback on the visuals nearly instantly. It also helped increase their number of iterations, enabling final shots and high-quality images at a much quicker pace.

Enhancing Production Workflows With Virtualization

DNEG was one of the first studios to implement NVIDIA virtual GPUs at scale with NVIDIA RTX Virtual Workstation software. NVIDIA RTX-powered virtual workstations deliver incredible flexibility, allowing DNEG to adjust the number of users on a particular server based on the current workload.

Virtual machines are also cost effective. As newer GPUs and expanded software packages enter the data center, DNEG can deploy these to its users to maintain optimal performance for each artist.

“To give our artists more compute power, we can easily increase NVIDIA vGPU profile sizes and reduce the number of users we put on each server,” said Daire Byrne, global head of systems at DNEG. “We don’t need to replace any equipment to keep working with maximum performance.”

And because creators can securely log into RTX-powered virtual workstations from anywhere in the world, DNEG artists can work remotely, while still maintaining high productivity.

“Every show we get is different from the last, and NVIDIA RTX Virtual Workstations let us scale the memory and performance characteristics up or down to meet the needs of our artists,” said Byrne.

Living in the Future of Virtual Worlds

DNEG continues its pioneering work with NVIDIA Omniverse Enterprise as the studio looks to the future of connected, collaborative workflows.

“The virtual world is where filmmaking is going,” said Salvini. “We now have advanced tools and technologies that are capable of delivering photorealistic virtual environments and digital characters, allowing us to create incredible, beautiful stylized worlds.”

With the shift toward real-time technology and more seamless, collaborative content creation pipelines, DNEG sees greater opportunities to interact with the filmmakers and art teams across the globe. This will allow for many more iterations to accomplish artistic goals in a fraction of the time.

DNEG uses Omniverse Enterprise with Dell Precision workstations with NVIDIA RTX A6000 GPUs, and Dell PowerEdge R7525 servers with NVIDIA A40 GPUs.

Learn more about how DNEG is transforming global film production workflows in the studio’s GTC session, now available on demand.

The post How DNEG Helped Win Another Visual-Effects Oscar by Bringing ‘Dune’ to Life With NVIDIA RTX appeared first on NVIDIA Blog.

Advances in trustworthy machine learning at Alexa AI

The team’s latest research on privacy-preserving machine learning, federated learning, and bias mitigation.

How can we reduce the carbon footprint of global computing?

The voracious appetite for energy from the world’s computers and communications technology presents a clear threat for the globe’s warming climate. That was the blunt assessment from presenters in the intensive two-day Climate Implications of Computing and Communications workshop held on March 3 and 4, hosted by MIT’s Climate and Sustainability Consortium (MCSC), MIT-IBM Watson AI Lab, and the Schwarzman College of Computing.

The virtual event featured rich discussions and highlighted opportunities for collaboration among an interdisciplinary group of MIT faculty and researchers and industry leaders across multiple sectors — underscoring the power of academia and industry coming together.

“If we continue with the existing trajectory of compute energy, by 2040, we are supposed to hit the world’s energy production capacity. The increase in compute energy and demand has been increasing at a much faster rate than the world energy production capacity increase,” said Bilge Yildiz, the Breene M. Kerr Professor in the MIT departments of Nuclear Science and Engineering and Materials Science and Engineering, one of the workshop’s 18 presenters. This computing energy projection draws from the Semiconductor Research Corporation’s decadal report.

To cite just one example: Information and communications technology already accounts for more than 2 percent of global energy demand, which is on a par with the aviation industry’s emissions from fuel.

“We are at the very beginning of this data-driven world. We really need to start thinking about this and act now,” said presenter Evgeni Gousev, senior director at Qualcomm.

Innovative energy-efficiency options

To that end, the workshop presentations explored a host of energy-efficiency options, including specialized chip design, data center architecture, better algorithms, hardware modifications, and changes in consumer behavior. Industry leaders from AMD, Ericsson, Google, IBM, iRobot, NVIDIA, Qualcomm, Tertill, Texas Instruments, and Verizon outlined their companies’ energy-saving programs, while experts from across MIT provided insight into current research that could yield more efficient computing.

Panel topics ranged from “Custom hardware for efficient computing” to “Hardware for new architectures” to “Algorithms for efficient computing,” among others.

The goal, said Yildiz, is to improve energy efficiency associated with computing by more than a million-fold.

“I think part of the answer of how we make computing much more sustainable has to do with specialized architectures that have very high level of utilization,” said Darío Gil, IBM senior vice president and director of research, who stressed that solutions should be as “elegant” as possible.

For example, Gil illustrated an innovative chip design that uses vertical stacking to reduce the distance data has to travel, and thus reduces energy consumption. Surprisingly, more effective use of tape — a traditional medium for primary data storage — combined with specialized hard drives (HDD), can yield a dramatic savings in carbon dioxide emissions.

Gil and presenters Bill Dally, chief scientist and senior vice president of research of NVIDIA; Ahmad Bahai, CTO of Texas Instruments; and others zeroed in on storage. Gil compared data to a floating iceberg in which we can have fast access to the “hot data” of the smaller visible part while the “cold data,” the large underwater mass, represents data that tolerates higher latency. Think about digital photo storage, Gil said. “Honestly, are you really retrieving all of those photographs on a continuous basis?” Storage systems should provide an optimized mix of HDD for hot data and tape for cold data based on data access patterns.

Bahai stressed the significant energy saving gained from segmenting standby and full processing. “We need to learn how to do nothing better,” he said. Dally spoke of mimicking the way our brain wakes up from a deep sleep, “We can wake [computers] up much faster, so we don’t need to keep them running in full speed.”

Several workshop presenters spoke of a focus on “sparsity,” a matrix in which most of the elements are zero, as a way to improve efficiency in neural networks. Or as Dally said, “Never put off till tomorrow, where you could put off forever,” explaining efficiency is not “getting the most information with the fewest bits. It’s doing the most with the least energy.”

Holistic and multidisciplinary approaches

“We need both efficient algorithms and efficient hardware, and sometimes we need to co-design both the algorithm and the hardware for efficient computing,” said Song Han, a panel moderator and assistant professor in the Department of Electrical Engineering and Computer Science (EECS) at MIT.

Some presenters were optimistic about innovations already underway. According to Ericsson’s research, as much as 15 percent of the carbon emissions globally can be reduced through the use of existing solutions, noted Mats Pellbäck Scharp, head of sustainability at Ericsson. For example, GPUs are more efficient than CPUs for AI, and the progression from 3G to 5G networks boosts energy savings.

“5G is the most energy efficient standard ever,” said Scharp. “We can build 5G without increasing energy consumption.”

Companies such as Google are optimizing energy use at their data centers through improved design, technology, and renewable energy. “Five of our data centers around the globe are operating near or above 90 percent carbon-free energy,” said Jeff Dean, Google’s senior fellow and senior vice president of Google Research.

Yet, pointing to the possible slowdown in the doubling of transistors in an integrated circuit — or Moore’s Law — “We need new approaches to meet this compute demand,” said Sam Naffziger, AMD senior vice president, corporate fellow, and product technology architect. Naffziger spoke of addressing performance “overkill.” For example, “we’re finding in the gaming and machine learning space we can make use of lower-precision math to deliver an image that looks just as good with 16-bit computations as with 32-bit computations, and instead of legacy 32b math to train AI networks, we can use lower-energy 8b or 16b computations.”

Other presenters singled out compute at the edge as a prime energy hog.

“We also have to change the devices that are put in our customers’ hands,” said Heidi Hemmer, senior vice president of engineering at Verizon. As we think about how we use energy, it is common to jump to data centers — but it really starts at the device itself, and the energy that the devices use. Then, we can think about home web routers, distributed networks, the data centers, and the hubs. “The devices are actually the least energy-efficient out of that,” concluded Hemmer.

Some presenters had different perspectives. Several called for developing dedicated silicon chipsets for efficiency. However, panel moderator Muriel Medard, the Cecil H. Green Professor in EECS, described research at MIT, Boston University, and Maynooth University on the GRAND (Guessing Random Additive Noise Decoding) chip, saying, “rather than having obsolescence of chips as the new codes come in and in different standards, you can use one chip for all codes.”

Whatever the chip or new algorithm, Helen Greiner, CEO of Tertill (a weeding robot) and co-founder of iRobot, emphasized that to get products to market, “We have to learn to go away from wanting to get the absolute latest and greatest, the most advanced processor that usually is more expensive.” She added, “I like to say robot demos are a dime a dozen, but robot products are very infrequent.”

Greiner emphasized consumers can play a role in pushing for more energy-efficient products — just as drivers began to demand electric cars.

Dean also sees an environmental role for the end user.

“We have enabled our cloud customers to select which cloud region they want to run their computation in, and they can decide how important it is that they have a low carbon footprint,” he said, also citing other interfaces that might allow consumers to decide which air flights are more efficient or what impact installing a solar panel on their home would have.

However, Scharp said, “Prolonging the life of your smartphone or tablet is really the best climate action you can do if you want to reduce your digital carbon footprint.”

Facing increasing demands

Despite their optimism, the presenters acknowledged the world faces increasing compute demand from machine learning, AI, gaming, and especially, blockchain. Panel moderator Vivienne Sze, associate professor in EECS, noted the conundrum.

“We can do a great job in making computing and communication really efficient. But there is this tendency that once things are very efficient, people use more of it, and this might result in an overall increase in the usage of these technologies, which will then increase our overall carbon footprint,” Sze said.

Presenters saw great potential in academic/industry partnerships, particularly from research efforts on the academic side. “By combining these two forces together, you can really amplify the impact,” concluded Gousev.

Presenters at the Climate Implications of Computing and Communications workshop also included: Joel Emer, professor of the practice in EECS at MIT; David Perreault, the Joseph F. and Nancy P. Keithley Professor of EECS at MIT; Jesús del Alamo, MIT Donner Professor and professor of electrical engineering in EECS at MIT; Heike Riel, IBM Fellow and head of science and technology at IBM; and Takashi Ando, principal research staff member at IBM Research. The recorded workshop sessions are available on YouTube.

Aging Brain Initiative awards fund five new ideas to study, fight neurodegeneration

Neurodegenerative diseases are defined by an increasingly widespread and debilitating death of nervous system cells, but they also share other grim characteristics: Their cause is rarely discernible and they have all eluded cures. To spur fresh, promising approaches and to encourage new experts and expertise to join the field, MIT’s Aging Brain Initiative (ABI) this month awarded five seed grants after a competition among labs across the Institute.

Founded in 2015 by nine MIT faculty members, the ABI promotes research, symposia, and related activities to advance fundamental insights that can lead to clinical progress against neurodegenerative conditions, such as Alzheimer’s disease, with an age-related onset. With an emphasis on spurring research at an early stage before it is established enough to earn more traditional funding, the ABI derives support from philanthropic gifts.

“Solving the mysteries of how health declines in the aging brain and turning that knowledge into effective tools, treatments, and technologies is of the utmost urgency given the millions of people around the world who suffer with no meaningful treatment options,” says ABI director and co-founder Li-Huei Tsai, the Picower Professor of Neuroscience in The Picower Institute for Learning and Memory and the Department of Brain and Cognitive Sciences. “We were very pleased that many groups across MIT were eager to contribute their expertise and creativity to that goal. From here, five teams will be able to begin testing their innovative ideas and the impact they could have.”

To address the clinical challenge of accurately assessing cognitive decline during Alzheimer’s disease progression and healthy aging, a team led by Thomas Heldt, associate professor of electrical and biomedical engineering in the Department of Electrical Engineering and Computer Science (EECS) and the Institute for Medical Engineering and Science, proposes to use artificial intelligence tools to bring diagnostics based on eye movements during cognitive tasks to everyday consumer electronics such as smartphones and tablets. By moving these capabilities to common at-home platforms, the team, which also includes EECS Associate Professor Vivienne Sze, hopes to increase monitoring beyond what can only be intermittently achieved with high-end specialized equipment and dedicated staffing in specialists’ offices. The team will pilot their technology in a small study at Boston Medical Center in collaboration with neurosurgeon James Holsapple.

Institute Professor Ann Graybiel’s lab in the Department of Brain and Cognitive Sciences (BCS) and the McGovern Institute for Brain Research will test the hypothesis that mutations on a specific gene may lead to the early emergence of Alzheimer’s disease (AD) pathology in the striatum. That’s a brain region crucial for motivation and movement that is directly and severely impacted by other neurodegenerative disorders including Parkinson’s and Huntington’s diseases, but that has largely been unstudied in Alzheimer’s. By editing the mutations into normal and AD-modeling mice, Research Scientist Ayano Matsushima and Graybiel hope to determine whether and how pathology, such as the accumulation of amyloid proteins, may result. Determining that could provide new insight into the progression of disease and introduce a new biomarker in a region that virtually all other studies have overlooked.

Numerous recent studies have highlighted a potential role for immune inflammation in Alzheimer’s disease. A team led by Gloria Choi, the Mark Hyman Jr. Associate Professor in BCS and The Picower Institute for Learning and Memory, will track one potential source of such activity by determining whether the brain’s meninges, which envelop the brain, become a means for immune cells activated by gut bacteria to circulate near the brain, where they may release signaling molecules that promote Alzheimer’s pathology. Working in mice, Choi’s lab will test whether such activity is prone to increase in Alzheimer’s and whether it contributes to disease.

A collaboration led by Peter Dedon, the Singapore Professor in MIT’s Department of Biological Engineering, will explore whether Alzheimer’s pathology is driven by dysregulation of transfer RNAs (tRNAs) and the dozens of natural tRNA modifications in the epitranscriptome, which play a key role in the process by which proteins are assembled based on genetic instructions. With Benjamin Wolozin of Boston University, Sherif Rashad of Tohoku University in Japan, and Thomas Begley of the State University of New York at Albany, Dedon will assess how the tRNA pool and epitranscriptome may differ in Alzheimer’s model mice and whether genetic instructions mistranslated because of tRNA dysregulation play a role in Alzheimer’s disease.

With her seed grant, Ritu Raman, the d’Arbeloff Assistant Professor of Mechanical Engineering, is launching an investigation of possible disruption of intercellular messages in amyotrophic lateral sclerosis (ALS), a terminal condition in which motor neuron degeneration causes loss of muscle control. Equipped with a new tool to finely sample interstitial fluid within tissues, Raman’s team will be able to monitor and compare cell-cell signaling in models of the junction between nerve and muscle. These models will be engineered from stem cells derived from patients with ALS. By studying biochemical signaling at the junction, the lab hopes to discover new targets that could be therapeutically modified.

Major support for the seed grants, which provide each lab with $100,000, came from generous gifts by David Emmes SM ’76; Kathleen SM ’77, PhD ’86 and Miguel Octavio; the Estate of Margaret A. Ridge-Pappis, wife of the late James Pappis ScD ’59; the Marc Haas Foundation; and the family of former MIT President Paul Gray ’54, SM ’55, ScD ’60, with additional funding from many annual fund donors to the Aging Brain Initiative Fund.

Your Odyssey Awaits: Stream ‘Lost Ark’ to Nearly Any Device This GFN Thursday

It’s a jam-packed GFN Thursday.

This week brings the popular, free-to-play, action role-playing game Lost Ark to gamers across nearly all their devices, streaming on GeForce NOW. And that’s not all.

GFN Thursday also delivers an upgraded experience in the 2.0.40 update. M1-based MacBooks, iMacs and Mac Minis are now supported natively.

Plus, membership gift cards can now be redeemed for RTX 3080 memberships, a membership reward for the “Heroic Edition” of Guild Wars 2 is available, and 14 new games are joining the GeForce NOW library this week.

Embark on an Odyssey

Visit Arkesia, the vast and vibrant world of Lost Ark, now streaming to PC, Mac, Chromebook and more with GeForce NOW.

Explore new lands and encounter vibrant cultures, strange creatures and unexpected marvels waiting to be discovered. Seek out lost treasures in dungeons, test your mettle on epic quests, show your strength in intense raids against enemies and go head-to-head in expert PvP duels. Do it all solo, grouped with friends or matched with other players in this massive open world.

Forge your own destiny across almost all devices. Play to your fighting style with iconic character classes — each with their own distinct abilities — at up to 1440p or 1600p and 120 frames per second on PC and up to 4K on SHIELD TV with an RTX 3080 membership.

Choose weapons and gear to assist you on your journey gaming on the go with a mobile phone. Dive deep into non-combat skills, crafting, guilds, social systems and other rich features that bring the world alive, streaming even on a MacBook at up to 1600p or iMac up to 1440p.

The adventure is yours. Do it all, streaming on GeForce NOW.

Level Up With the GeForce NOW 2.0.40 Update

The newest update to the cloud enables the GeForce NOW macOS app to natively support the Apple M1 chip. This update provides lower power consumption, faster app startup times and an overall elevated GeForce NOW experience on M1-based MacBooks, iMacs and Mac Minis.

Streaming Statistics Overlay on GeForce NOW — *Enjoy PC gaming across nearly all devices.*

The upgrade brings some additional benefits to members.

Members can more easily discover new games to play in the app with the added Genre row at the bottom of the Games menu. Useful sorting options include the ability to see All Games available in specific regions and by device type, and multiple filters can help narrow down the list.

Finally, members can enjoy an improved Streaming Statistics Overlay that now includes server-side rendering frame rates. The overlay quickly toggles through Standard/Compact/Off using the hotkey Ctrl+N. Members can complete their whole log-in process on play.geforcenow.com within the same browser tab.

Give the Gift of Gaming

What could be as good as playing with the power of a GeForce RTX 3080? Try giving that power to a loved one, streaming from the cloud.

Gift Cards on GeForce NOW — *It’s the gift that keeps on giving for gamers.*

GeForce NOW RTX 3080 memberships are now available as digital gift cards with two-, three- and six-month options. GeForce NOW membership gift cards can be used to redeem an RTX 3080 membership or a Priority membership, depending on the recipient’s preference, so you can’t go wrong.

Spoil a special gamer in your life by giving them access to the cloud across their devices or bring a buddy onto the service to party up and play.

Gift cards can be added to an existing GeForce NOW account or redeemed on a new one. Existing Founders, Priority and RTX 3080 members will have the number of months added to their accounts. For more information, visit the GeForce NOW website.

Rewards of Heroic Quality

This week, members can receive the “Heroic Edition” of ArenaNet’s critically acclaimed free-to-play massively multiplayer online role-playing game Guild Wars 2. The Guild Wars 2 “Heroic Edition” comes with a full treasure trove of goodies, including the Suit of Legacy Armor, an 18-slot inventory expansion and four heroic Boosters.

Guild Wars 2 Heroic Edition on GeForce NOW — *You bring the heroics, we’ll bring the rewards in Guild Wars 2.*

Getting membership rewards for streaming games on the cloud is easy. Log in to your NVIDIA account and select “GEFORCE NOW” from the header, scroll down to “REWARDS” and click the “UPDATE REWARDS SETTINGS” button. Check the box in the dialogue window that pops up to start receiving special offers and in-game goodies.

Sign up for the GeForce NOW newsletter, including notifications for when rewards are available, by logging into your NVIDIA account and selecting “PREFERENCES” from the header. Check the “Gaming & Entertainment” box, and “GeForce NOW” under topic preferences.

No Time Like Playtime

Dune Spice Wars on GeForce NOW — *Lead your faction and battle for control over the harsh desert planet of Arrakis.*

To cap it all off, GFN Thursday has new games to play, as always. Members can look for the following 14 games ready to stream this week:

Dune: Spice Wars (New release on Steam)
Holomento (New release on Steam)
Prehistoric Kingdom (New release on Steam and Epic Games Store)
Romans: Age of Caesar (New release on Steam)
Sea of Craft (New release on Steam)
Trigon: Space Story (New release on Steam)
Vampire: The Masquerade – Bloodhunt (New release on Steam)
Conan Exiles (Epic Games Store)
Crawl (Steam)
Flashing Lights – Police, Firefighting, Emergency Services Simulator (Steam)
Galactic Civilizations II: Ultimate Edition (Steam)
Jupiter Hell (Steam)
Lost Ark (Steam)
SOL CRESTA (Steam)

Finally, we’ve got a question for you this week and are only accepting wrong answers. Let us know what you think on Twitter or in the comments below:

What does RPG stand for? Wrong answers only.

— NVIDIA GeForce NOW (@NVIDIAGFN) April 27, 2022

The post Your Odyssey Awaits: Stream ‘Lost Ark’ to Nearly Any Device This GFN Thursday appeared first on NVIDIA Blog.

When a passion for bass and brass help build better tools

We caught up with Kevin Millikin, a software engineer on the DevTools team. He’s in Salt Lake City this week to present at PyCon US, the largest annual gathering for those using and developing the open-source Python programming language.

Copyright © 2019-2025 Vedere AI. All Rights Reserved.