Come Sale Away with GFN Thursday

GFN Thursday means more games for GeForce NOW members, every single week.

This week’s list includes the day-and-date release of Spacebase Startopia, but first we want to share the scoop on some fantastic sales available across our digital game store partners that members will want to take advantage of this very moment.

Discounts for All

GeForce NOW is custom-built for PC gamers, and our open platform philosophy is a hallmark of PC gaming. Gamers are used to buying their games from whichever digital store they choose, and often jump between them during big sales.

That’s why we support multiple digital game stores, as well as the games already in your libraries. Why lock you down if there are great deals to be had?

When supported games go on sale, members can purchase and instantly play from GeForce NOW’s cloud gaming servers. Plus, they know that they’re adding the real PC version to their gaming library for playing on their local machines whenever they like.

Your Base, Your Rules

Spacebase Startopia on GeForce NOW
Keeping your inhabitants happy is half the battle in Spacebase Startopia, joining GeForce NOW on March 26.

This open platform philosophy is one of the key reasons why Kalypso made its games available on GeForce NOW. The publisher, known for hit strategy sims like the Tropico series, understands that bringing its games to GeForce NOW means another easy way to welcome new gamers, and keep them playing.

“We want gamers to be able to play our games as easily as possible,” says Marco Nier, international marketing manager at Kalypso. “If they can play on their local machine? Great! If they can use GeForce NOW on their laptop, or Mac, or phone? Even better.”

Kalypso’s newest game, Spacebase Startopia, joins the GeForce NOW library when it releases tomorrow, March 26. Developed by Realmforge Studios, the game challenges you to manage and maintain your own space station, and mixes economic simulation, strategic conquest and real-time strategy gameplay with more than a dash of humor. It’s a game we’ve been excited about for a while, and members will be able to stream every moment across all of their devices.

Spacebase Startopia on GeForce NOW
You decide how to renovate your space station, in hopes it becomes a huge interstellar hub for alien visitors.

Additionally, to celebrate the Spacebase Startopia launch, Kalypso has put some of its greatest GeForce NOW-enabled games on sale on Steam.

  • Commandos 2 – HD Remaster – 40 percent off until March 29 (Steam)
  • Immortal Realms: Vampire Wars – 50 percent off until March 29 (Steam)
  • Praetorians – HD Remaster – 40 percent off until March 29 (Steam)

More Savings

If you’re still figuring out how to fill your weekend, look no further. We’re thrilled to share additional deals our members can take advantage of:

Ubisoft

Square Enix

  • Just Cause 3 – 85 percent off until March 29 (Steam)
  • Just Cause 4: Reloaded – 80 percent off until March 29 (Steam)
  • Rise of the Tomb Raider: 20 Year Celebration – 80 percent off until March 29 (Steam)
  • Shadow of the Tomb Raider: Definitive Edition – 75 percent off until March 29 (Steam)

Deep Silver

  • Gods Will Fall – 20 percent off until April 8 (Epic Games Store)
  • Metro Exodus: Standard Edition – 66 percent off until April 8 (Epic Games Store)
  • Outward – 70 percent off until March 29 (Steam)

Other GFN Thursday Favorites

  • Car Mechanic Simulator 2018 – 60 percent off until April 2 (Steam)
  • Farm Manager 2018 – 90 percent off until April 2 (Steam)
  • Lonely Mountains: Downhill – 33 percent off until March 31 (Steam)
  • Superhot – 60 percent off until March 29 (Steam)
  • Superhot: Mind Control Delete – 60 percent off until March 29 (Steam)
  • Thief Simulator – 62 percent off until April 2 (Steam)
  • Tower of Time – 75 percent off until April 1 (Steam)

Finding offers like these and more is never out of reach. Be sure to check out the Sales and Special Offers row in the GeForce NOW app.

Play Overcooked! All You Can Eat! on GeForce NOW
Overcooked! All You Can Eat! is one of 12 games joining the GeForce NOW library this week.

Let’s Play

If all that wasn’t enough new gaming goodness, don’t forget it’s still GFN Thursday, and that means more additions to the GeForce NOW library. Members can look for the following:

  • Spacebase Startopia (day-and-date release on Steam and Epic Games Store, March 26)
  • Overcooked! All You Can Eat! (day-and-date release on Steam, March 23)
  • Paradise Lost (day-and-date release on Steam, March 24)
  • Door Kickers (Steam)
  • Evoland Legendary Edition (Steam)
  • Iron Conflict (Steam)
  • Railroad Corporation (Steam)
  • Sword and Fairy 7 Trial (Steam)
  • Thief Gold (Steam)
  • Trackmania United Forever (Steam)
  • Worms Reloaded (Steam)
  • Wrench (Steam)

What are you planning to play this weekend? Let us know on Twitter or in the comments below.


Read More

Introducing PyTorch Profiler – the new and improved performance tool

Along with PyTorch 1.8.1 release, we are excited to announce PyTorch Profiler – the new and improved performance debugging profiler for PyTorch. Developed as part of a collaboration between Microsoft and Facebook, the PyTorch Profiler is an open-source tool that enables accurate and efficient performance analysis and troubleshooting for large-scale deep learning models.

Analyzing and improving large-scale deep learning model performance is an ongoing challenge that grows in importance as model sizes increase. For a long time, PyTorch users had a hard time solving this challenge due to the lack of available tools. Standard performance debugging tools provide GPU hardware-level information but miss the PyTorch-specific context of operations. To recover that missing context, users had to combine multiple tools or manually add the minimum correlation information needed to make sense of the data. There was also the autograd profiler (torch.autograd.profiler), which captures information about PyTorch operations but does not capture detailed GPU hardware-level information and offers no support for visualization.

The new PyTorch Profiler (torch.profiler) is a tool that brings both types of information together and builds an experience that realizes the full potential of that information. This new profiler collects both GPU hardware and PyTorch-related information, correlates them, performs automatic detection of bottlenecks in the model, and generates recommendations on how to resolve these bottlenecks. All of this information from the profiler is visualized for the user in TensorBoard. The new Profiler API is natively supported in PyTorch and delivers the simplest experience available to date: users can profile their models without installing any additional packages and see results immediately in TensorBoard with the new PyTorch Profiler plugin. Below is a screenshot of PyTorch Profiler’s automatic bottleneck detection.

Getting started

PyTorch Profiler is the next version of the PyTorch autograd profiler. It has a new module namespace, torch.profiler, but maintains compatibility with the autograd profiler APIs. The Profiler uses a new GPU profiling engine, built using NVIDIA CUPTI APIs, and is able to capture GPU kernel events with high fidelity. To profile your model training loop, wrap the code in the profiler context manager as shown below.

# model, criterion, optimizer, trainloader, and device are assumed to be defined earlier.
with torch.profiler.profile(
    schedule=torch.profiler.schedule(
        wait=2,
        warmup=2,
        active=6,
        repeat=1),
    on_trace_ready=torch.profiler.tensorboard_trace_handler('./logs'),  # example log directory
    with_stack=True
) as profiler:
    for step, data in enumerate(trainloader, 0):
        print("step:{}".format(step))
        inputs, labels = data[0].to(device=device), data[1].to(device=device)

        outputs = model(inputs)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Signal the profiler that one training step has finished.
        profiler.step()

The schedule parameter allows you to limit the number of training steps included in the profile to reduce the amount of data collected and simplify visual analysis by focusing on what’s important. The tensorboard_trace_handler automatically saves profiling results to disk for analysis in TensorBoard.

To view the results of the profiling session in TensorBoard, install the PyTorch Profiler TensorBoard plugin package:

pip install torch_tb_profiler

Visual Studio Code Integration

Microsoft Visual Studio Code is one of the most popular code editors for Python developers and data scientists. The Python extension for VS Code recently added the integration of TensorBoard into the code editor, including support for the PyTorch Profiler. Once you have VS Code and the Python extension installed, you can quickly open the TensorBoard Profiler plugin by launching the Command Palette using the keyboard shortcut CTRL + SHIFT + P (CMD + SHIFT + P on a Mac) and typing the “Launch TensorBoard” command.

This integration comes with a built-in lifecycle management feature. VS Code will install the TensorBoard package and the PyTorch Profiler plugin package (coming in mid-April) automatically if you don’t have them on your system. VS Code will also launch the TensorBoard process for you and automatically look for any TensorBoard log files within your current directory. When you’re done, just close the tab and VS Code will automatically close the process. No more Terminal windows running on your system to provide a backend for the TensorBoard UI! Below is the PyTorch Profiler Trace View running in TensorBoard.

Learn more about TensorBoard support in VS Code in this blog.

Feedback

Review PyTorch Profiler documentation, give Profiler a try and let us know about your experience. Provide your feedback on PyTorch Discussion Forum or file issues on PyTorch GitHub.

Read More

Recursive Classification: Replacing Rewards with Examples in RL

Posted by Benjamin Eysenbach, Student Researcher, Google Research

A general goal of robotics research is to design systems that can assist in a variety of tasks that can potentially improve daily life. Most reinforcement learning algorithms for teaching agents to perform new tasks require a reward function, which provides positive feedback to the agent for taking actions that lead to good outcomes. However, actually specifying these reward functions can be quite tedious and can be very difficult to define for situations without a clear objective, such as whether a room is clean or if a door is sufficiently shut. Even for tasks that are easy to describe, actually measuring whether the task has been solved can be difficult and may require adding many sensors to a robot’s environment.

Alternatively, training a model using examples, called example-based control, has the potential to overcome the limitations of approaches that rely on traditional reward functions. This new problem statement is most similar to prior methods based on “success detectors”, and efficient algorithms for example-based control could enable non-expert users to teach robots to perform new tasks, without the need for coding expertise, knowledge of reward function design, or the installation of environmental sensors.

In “Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification,” we propose a machine learning algorithm for teaching agents how to solve new tasks by providing examples of success (e.g., if “success” examples show a nail embedded into a wall, the agent will learn to pick up a hammer and knock nails into the wall). This algorithm, recursive classification of examples (RCE), does not rely on hand-crafted reward functions, distance functions, or features, but rather learns to solve tasks directly from data, requiring the agent to learn how to solve the entire task by itself, without requiring examples of any intermediate states. Using a version of temporal difference learning — similar to Q-learning, but replacing the typical reward function term using only examples of success — RCE outperforms prior approaches based on imitation learning on simulated robotics tasks. Coupled with theoretical guarantees similar to those for reward-based learning, the proposed method offers a user-friendly alternative for teaching robots new tasks.

Top: To teach a robot to hammer a nail into a wall, most reinforcement learning algorithms require that the user define a reward function. Bottom: The example-based control method uses examples of what the world looks like when a task is completed to teach the robot to solve the task, e.g., examples where the nail is already hammered into the wall.

Example-Based Control vs Imitation Learning
While the example-based control method is similar to imitation learning, there is an important distinction — it does not require expert demonstrations. In fact, the user can actually be quite bad at performing the task themselves, as long as they can look back and pick out the small fraction of states where they did happen to solve the task.

Additionally, whereas previous research used a stage-wise approach in which the model first uses success examples to learn a reward function and then applies that reward function with an off-the-shelf reinforcement learning algorithm, RCE learns directly from the examples and skips the intermediate step of defining the reward function. Doing so avoids potential bugs and bypasses the process of defining the hyperparameters associated with learning a reward function (such as how often to update the reward function or how to regularize it) and, when debugging, removes the need to examine code related to learning the reward function.

Recursive Classification of Examples
The intuition behind the RCE approach is simple: the model should predict whether the agent will solve the task in the future, given the current state of the world and the action that the agent is taking. If there were data that specified which state-action pairs lead to future success and which state-action pairs lead to future failure, then one could solve this problem using standard supervised learning. However, when the only data available consists of success examples, the system doesn’t know which states and actions led to success, and while the system also has experience interacting with the environment, this experience isn’t labeled as leading to success or not.

Left: The key idea is to learn a future success classifier that predicts for every state (circle) in a trajectory whether the task will be solved in the future (thumbs up/down). Right: In the example-based control approach, the model is provided only with unlabeled experience (grey circles) and success examples (green circles), so one cannot apply standard supervised learning. Instead, the model uses the success examples to automatically label the unlabeled experience.

Nonetheless, one can piece together what these data would look like, if it were available. First, by definition, a successful example must be one that solves the given task. Second, even though it is unknown whether an arbitrary state-action pair will lead to success in solving a task, it is possible to estimate how likely it is that the task will be solved if the agent started at the next state. If the next state is likely to lead to future success, it can be assumed that the current state is also likely to lead to future success. In effect, this is recursive classification, where the labels are inferred based on predictions at the next time step.

The underlying algorithmic idea of using a model’s predictions at a future time step as a label for the current time step closely resembles existing temporal-difference methods, such as Q-learning and successor features. The key difference is that the approach described here does not require a reward function. Nonetheless, we show that this method inherits many of the same theoretical convergence guarantees as temporal difference methods. In practice, implementing RCE requires changing only a few lines of code in an existing Q-learning implementation.
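
To make the recursion concrete, here is a highly simplified sketch in PyTorch of the idea described above. It is not the authors’ RCE implementation: the network, the discount factor gamma, the equal weighting of the two loss terms, and all variable names are illustrative assumptions, and the paper derives a more careful weighting for the bootstrapped labels. The point to notice is that no reward term appears anywhere; the only supervision comes from the success examples and from the classifier’s own next-step predictions.

import torch
import torch.nn as nn

# Simplified sketch of recursive classification (not the authors' RCE code).
state_dim, action_dim, gamma = 4, 2, 0.99

# C(s, a): predicted probability that the task will be solved in the future.
classifier = nn.Sequential(
    nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),
)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
bce = nn.BCELoss()

def train_step(success_s, success_a, s, a, next_s, next_a):
    """One update: success examples are labeled 1, while unlabeled transitions
    receive a bootstrapped label from the classifier's prediction at the next step."""
    # Success examples are positives by definition.
    p_success = classifier(torch.cat([success_s, success_a], dim=-1))
    loss_pos = bce(p_success, torch.ones_like(p_success))

    # Unlabeled experience: the label is the (discounted, detached) prediction at the
    # next state-action pair -- the "recursive" part of the classification.
    with torch.no_grad():
        target = gamma * classifier(torch.cat([next_s, next_a], dim=-1))
    p = classifier(torch.cat([s, a], dim=-1))
    loss_unlabeled = bce(p, target)

    loss = loss_pos + loss_unlabeled
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()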

Evaluation
We evaluated the RCE method on a range of challenging robotic manipulation tasks. For example, in one task we required a robotic hand to pick up a hammer and hit a nail into a board. Previous research into this task [1, 2] has used a complex reward function (with terms corresponding to the distance between the hand and the hammer, the distance between the hammer and the nail, and whether the nail has been knocked into the board). In contrast, the RCE method requires only a few observations of what the world would look like if the nail were hammered into the board.

We compared the performance of RCE to a number of prior methods, including those that learn an explicit reward function and those based on imitation learning, all of which struggle to solve this task. This experiment highlights how example-based control makes it easy for users to specify even complex tasks, and demonstrates that recursive classification can successfully solve these sorts of tasks.

The RCE approach solves the task of hammering a nail into a board more reliably than prior approaches based on imitation learning [SQIL, DAC] and those that learn an explicit reward function [VICE, ORIL, PURL].

Conclusion
We have presented a method to teach autonomous agents to perform tasks by providing them with examples of success, rather than meticulously designing reward functions or collecting first-person demonstrations. An important aspect of example-based control, which we discuss in the paper, is what assumptions the system makes about the capabilities of different users. Designing variants of RCE that are robust to differences in users’ capabilities may be important for applications in real-world robotics. The code is available, and the project website contains additional videos of the learned behaviors.

Acknowledgements
We thank our co-authors, Ruslan Salakhutdinov and Sergey Levine. We also thank Surya Bhupatiraju, Kamyar Ghasemipour, Max Igl, and Harini Kannan for feedback on this post, and Tom Small for helping to design figures for this post.

Read More

Amazon Kendra adds new search connectors from AWS Partner, Perficient, to help customers search enterprise content faster

Today, Amazon Kendra is making nine new search connectors, developed by AWS Partner Perficient, available in the Amazon Kendra connector library. These include search connectors for IBM Case Manager, Adobe Experience Manager, Atlassian Jira and Confluence, and many others.

Improving the Enterprise Search Experience

These days, employees and customers expect an intuitive search experience. Search has evolved from simply being considered a feature to being considered a critical element of how every product works.

Online searches have trained us to expect specific, relevant answers, not a list of choices, when we ask a search engine a question. Furthermore, we expect this level of accuracy everywhere we perform a search, particularly in the workplace. Unfortunately, enterprise search does not typically deliver the desired experience.

Amazon Kendra is an intelligent search service powered by machine learning. Amazon Kendra uses deep learning and reading comprehension to deliver more accurate search results. It also uses natural language understanding (NLU) to deliver a search experience that is more in line with asking an expert a specific question so you can receive a direct answer, quickly.

Amazon Kendra’s intelligent search capabilities improve the search and discovery experience, but enterprises are still faced with the challenge of connecting troves of unstructured data and making that data accessible to search. Content is often unstructured and scattered across multiple repositories, making critical information hard to find and costing employees time and effort to track down the right answer.

Data Source Connectors Make Information Searchable

Due to the diversity of sources, quality of data, and variety of connection protocols for an organization’s data sources, it’s crucial for enterprises to avoid time spent building complex ETL jobs that aggregate data sources.

To achieve this, organizations can use data source connectors to quickly unify content as part of a single, searchable index, without needing to copy or move data from an existing location to a new one. This reduces the time and effort typically associated with creating a new search solution.

In addition to unifying scattered content, data source connectors that have the right source interfaces in place, and the ability to apply rules and enrich content before users begin searching, can make a search application even more accurate.

Perficient Connectors for Amazon Kendra

Perficient has years of experience developing data source connectors for a wide range of enterprise data sources. After countless one-off implementations, they identified patterns that can be repeated for any connection type. They developed their search connector platform, Handshake, and integrated it with Amazon Kendra to deliver the following:

  • Reduced setup time when creating intelligent search applications
    • Handshake’s out-of-the-box integration with Amazon Kendra indexes, extracts, and transforms metadata, making it easy to configure and get started with whichever data source connector you choose.
  • More connectors for many popular enterprise data sources
    • With Handshake, Amazon Kendra customers can license any of Perficient’s existing source connectors, including IBM Case Manager, Adobe Experience Manager, Nuxeo, Atlassian Jira and Confluence, Elastic Path, Elasticsearch, SQL databases, file systems, any REST-based API, and 30 other interfaces ready upon request.
  • Access to the most up-to-date enterprise content
    • Data source connectors can be scheduled to automatically sync an Amazon Kendra index with a data source, to ensure you’re always securely searching through the most up-to-date content.

Additionally, you can also develop your own source connectors using Perficient’s framework as an accelerator. Wherever the data lives, whatever format, Perficient can build hooks to combine, enhance, and index the critical data to augment Amazon Kendra’s intelligent search.

Perficient has found Amazon Kendra’s intelligent search capabilities, customization potential, and APIs to be powerful tools in search experience design. We are thrilled to add our growing list of data source connectors to Amazon Kendra’s connector library.

To get started with Amazon Kendra, visit the Amazon Kendra Essentials+ workshop for an interactive walkthrough. To learn more about other Amazon Kendra data source connectors, visit the Amazon Kendra connector library. For more information about Perficient’s Handshake platform, or to contact the team directly, visit their website.


About the Authors

Zach Fischer is a Solution Architect at Perficient with 10 years’ experience delivering Search and ECM implementations to over 40 customers. Recently, he has become the Product Manager for Perficient’s Handshake and Nero products. When not working, he’s playing music or DnD.

 

 

Jean-Pierre Dodel leads product management for Amazon Kendra, a new ML-powered enterprise search service from AWS. He brings 15 years of Enterprise Search and ML solutions experience to the team, having worked at Autonomy, HP, and search startups for many years prior to joining Amazon 4 years ago. JP has led the Kendra team from its inception, defining vision, roadmaps, and delivering transformative semantic search capabilities to customers like Dow Jones, Liberty Mutual, 3M, and PwC.

Read More

New Analytics API for researchers studying Facebook Page data

Today we are launching the Facebook Open Research & Transparency (FORT) Analytics API for researchers. This release includes a collection of API endpoints that helps academics identify trends on Facebook Pages and how they’ve evolved over time. You can leverage these insights to focus on specific Pages that are of interest.

We designed this API specifically for the academic community to conduct longitudinal analyses with time series data. With the launch of FORT Pages API in 2020 and now the Analytics API, we will continue to develop a product roadmap focused on sharing new types of Facebook and Instagram data with the academic research community, as well as additional analytics endpoints.

Analytics API features

With the FORT Analytics API, we are offering three new endpoints. Each returns aggregated, time-series data captured at daily intervals:

  1. Lifetime follower count (the number of users who have ever followed a Page), by country, per Page
  2. Page admin post count (applies only to posts made by Page admins, not to user posts)
  3. Page engagement count, where engagement is defined as the total number of likes, comments, clicks, and shares on posts created on that Page

Unlike other research-related APIs, these endpoints facilitate queries across the entirety of public Facebook Pages. (You can learn more about Facebook data sets that we make available to researchers here.)

For technical documentation, click here.

Important considerations

Current Analytics data: These endpoints provide aggregate metrics on Facebook Pages. They are designed to help researchers observe and analyze engagement patterns of Pages and leverage that analysis to decide which Pages to focus on, saving time and effort. Our systems take a snapshot of Page activity once per day and provide that information to you via the Pages API. Consequently, these aggregations are directionally accurate (as opposed to data which is dynamically generated for each query).

Historical Page data: When using these endpoints you may encounter small historical inaccuracies with the counts. For instance, the Centers for Disease Control and Prevention (CDC) Page seems to have lost a few hundred followers in January, most likely due to user account deletions. For now, we consider this an acceptable amount of inaccuracy to still deliver high quality research, but we continue to monitor these counts and will incorporate improvements into future updates. Note that a user deleting a post, unliking a post, or deleting a comment will not be reflected in the analytics. In other words, the data does not account for content deletions.

How to access the FORT Analytics API

If you are a grantee of Social Science One, you will have default access to this API through the FORT platform. If you are not an SS1 researcher but are interested in using this API, please apply to join the Social Science One community here.

Technical limitations

While we won’t apply any formal limit on query size, we recommend the following for best performance:

  • No more than 250 page ids in a single API call
  • Request no more than 10,000 records in a single API call

See this documentation for more details.

About the Facebook Open Research and Transparency Platform

The Facebook Open Research and Transparency (FORT) Platform facilitates responsible research by providing flexible access to valuable data. The platform is built with validated privacy and security protections, such as data access controls, and has been penetration-tested by internal and external experts.

The FORT platform runs on an opinionated version of JupyterHub, an open source tool that is widely used by the academic community. The FORT platform supports multiple standard languages, including SQL, Python, and R, and provides a specialized bridge to specific Facebook Graph APIs.

Publication guidelines

Researchers may publish research conducted using this data without Facebook’s permission. As with other research conducted under the Research Data Agreement, Facebook is entitled to review (not approve or reject) research prior to publication, and remove any confidential or personally identifiable information.


Read More

PyTorch for AMD ROCm™ Platform now available as Python package

With the PyTorch 1.8 release, we are delighted to announce a new installation option for users of PyTorch on the ROCm™ open software platform. An installable Python package is now hosted on pytorch.org, along with instructions for local installation in the same simple, selectable format as PyTorch packages for CPU-only configurations and other GPU platforms. PyTorch on ROCm includes full capability for mixed-precision and large-scale training using AMD’s MIOpen & RCCL libraries. This provides a new option for data scientists, researchers, students, and others in the community to get started with accelerated PyTorch using AMD GPUs.

The ROCm Ecosystem

ROCm is AMD’s open source software platform for GPU-accelerated high performance computing and machine learning. Since the original ROCm release in 2016, the ROCm platform has evolved to support additional libraries and tools, a wider set of Linux® distributions, and a range of new GPUs. This includes the AMD Instinct™ MI100, the first GPU based on AMD CDNA™ architecture.

The ROCm ecosystem has an established history of support for PyTorch, which was initially implemented as a fork of the PyTorch project, and more recently through ROCm support in the upstream PyTorch code. PyTorch users can install PyTorch for ROCm using AMD’s public PyTorch docker image, and can of course build PyTorch for ROCm from source. With PyTorch 1.8, these existing installation options are now complemented by the availability of an installable Python package.

The primary focus of ROCm has always been high performance computing at scale. The combined capabilities of ROCm and AMD’s Instinct family of data center GPUs are particularly suited to the challenges of HPC at data center scale. PyTorch is a natural fit for this environment, as HPC and ML workflows become more intertwined.

Getting started with PyTorch for ROCm

The scope for this build of PyTorch is AMD GPUs with ROCm support, running on Linux. The GPUs supported by ROCm include all of AMD’s Instinct family of compute-focused data center GPUs, along with some other select GPUs. A current list of supported GPUs can be found in the ROCm GitHub repository. After confirming that the target system includes supported GPUs and the current 4.0.1 release of ROCm, installation of PyTorch follows the same simple pip-based installation as any other Python package. As with PyTorch builds for other platforms, the configurator at https://pytorch.org/get-started/locally/ provides the specific command line to be run.

PyTorch for ROCm is built from the upstream PyTorch repository, and is a full featured implementation. Notably, it includes support for distributed training across multiple GPUs and supports accelerated mixed precision training.
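
Once the package is installed, a quick sanity check (shown below with illustrative output comments) confirms that PyTorch sees the AMD GPU. ROCm builds expose the device through the familiar torch.cuda interface, so existing scripts typically run unchanged.

import torch

# Quick sanity check for a ROCm build of PyTorch (output will vary by system).
print(torch.version.hip)               # HIP/ROCm runtime version string; None on CUDA or CPU-only builds

# ROCm devices are exposed through the standard torch.cuda interface.
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g., an AMD Instinct accelerator
    x = torch.randn(1024, 1024, device='cuda')
    print((x @ x).sum().item())             # runs the matmul on the AMD GPU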

More information

  • A list of ROCm supported GPUs and operating systems: https://github.com/RadeonOpenCompute/ROCm
  • General documentation on the ROCm platform: https://rocmdocs.amd.com/en/latest/
  • ROCm Learning Center: https://developer.amd.com/resources/rocm-resources/rocm-learning-center/
  • General information on AMD’s offerings for HPC and ML: https://amd.com/hpc

Feedback

An engaged user base is a tremendously important part of the PyTorch ecosystem. We would be deeply appreciative of feedback on the PyTorch for ROCm experience in the PyTorch discussion forum and, where appropriate, reporting any issues via GitHub.

Read More

Enable feature reuse across accounts and teams using Amazon SageMaker Feature Store

Amazon SageMaker Feature Store is a new capability of Amazon SageMaker that helps data scientists and machine learning (ML) engineers securely store, discover, and share curated data used in training and prediction workflows. As organizations build data-driven applications using ML, they’re constantly assembling and moving features between more and more functional teams. This constant movement of data can lead to inconsistencies in features and become a bottleneck when designing ML initiatives spanning multiple teams. For example, an ecommerce company might have several data science and engineering teams working on different aspects of their platform. The Core Search team focuses on query understanding and information retrieval tasks. The Product Success team solves problems involving customer reviews and feedback signals. The Personalization team uses clickstream and session data to create ML models for personalized recommendations. Additionally, data engineering teams like the Data Curation team can curate and validate user-specific information, which is an essential component that other teams can use. A Feature Store works as a unified interface between these teams, enabling one team to leverage the features generated by other teams, which minimizes the operational overhead of replicating and moving features across teams.

Training a production-ready ML model typically involves access to a diverse set of features that aren’t always owned and maintained by the team that is building the model. A common practice for organizations that apply ML is to think of these data science teams as individual groups that work independently with limited collaboration. This results in ML workflows with no standardized way to share features across teams, which becomes a crucial limiting factor for data science productivity and makes it harder for data scientists to build new and complex models. With a shared feature store, organizations can achieve economies of scale. As more shared features become available, it becomes easier and cheaper for teams to build and maintain new models. These models can reuse features that are already developed, tested, and offered using a centralized feature store.

This post captures the essential cross-account architecture patterns for Feature Store that can be implemented in an organization with many data engineering and data science teams operating in different AWS accounts. We share how to enable sharing of features between accounts through a step-by-step example, which you can try out yourself with the code in our GitHub repo.

SageMaker Feature Store overview

By default, a SageMaker Feature Store is local to the account in which it is created, but it can also be centralized and shared by many accounts. An organization with multiple teams can have one centralized feature store that is shared across teams, as well as separate feature stores for use by individual teams. The separate stores can either hold feature groups that are of a sensitive nature or that are specific to a unique ML workload.

In this post, you first learn about the centralized feature store pattern. This pattern prescribes a central interface through which teams can create and publish new features, and from which other teams (or systems) can consume features. It also ensures that you have a single source of truth for feature data across your organization and simplifies resource management.

Next, you learn about the combined feature store pattern, which allows teams to maintain their own feature stores local to their account, while still being able to access shared features from the centralized feature store. These local feature stores are usually built for data science experimentation. By combining shared features from the centralized store with local features, teams can derive new enhanced features that can help when building more complex ML models. You can also use the local stores to store sensitive data that can’t be shared across the organization for regulatory and compliance reasons.

Lastly, we briefly cover a less common pattern involving replication of feature data.

Centralized feature store

Organizations can maximize the benefits of a feature store when it’s centralized. The centralized feature store pattern demonstrates how feature pipelines from multiple accounts can write to one centralized feature store, and how multiple other accounts can consume these features. This is a common pattern among mid- to large-sized enterprises where multiple teams manage different types of data or different parts of an application.

The process of hypothesizing, selecting, and transforming data inputs into a usable form suitable for ML models is called feature engineering. A feature pipeline encapsulates all the steps of the feature engineering process needed to convert raw data into useful features that ML models take as input for predictions. Maintaining feature pipelines is an expensive, time-consuming, and error-prone process. Also, replicating feature recipes and transformations across accounts can lead to inconsistencies and skew in feature characteristics. Because a centralized feature store facilitates knowledge sharing, teams don’t have to recreate feature recipes and rewrite pipelines from scratch in every project.

In this pattern, instead of writing features locally to an account-specific feature store, features are written to a centralized feature store. The centralized store serves as the central vault and creates a standardized way to access and maintain features for cross-team collaboration. It acts as an enabler and accelerator for AI adoption, reducing time to market for ML solutions, and allows for centralized governance and access control to ML features. You can grant access to external accounts, users, or roles to read and write individual feature groups in keeping with your data access policies. AWS recommends enforcing least privilege access to only the feature groups that you need for your job function. This is managed by the underlying AWS Identity and Access Management (IAM) policies. You can further refine access control with feature group tags and IAM conditions to decide which principals can perform specific actions. When you’re using a centralized store at scale, it’s important to also implement proper feature governance to ensure feature groups are well designed, have feature pipelines that are documented and supported, and have processes in place to ensure feature quality. This type of governance helps earn the trust required for feature reuse across teams.

Before walking through an example, let’s identify some key feature store concepts. First, feature groups are logical groups of features, typically originating from the same feature pipeline. An offline store contains large volumes of historical feature data used to create training and testing data for model development, or by batch applications for model scoring. The purpose of the online store is to serve these same features in real time with low latency. Unlike the offline store, which is append-only, the goal of the online store is to serve the most recent feature values. Behind the scenes, Feature Store automatically carries out data synchronization between the two stores. If you ingest new feature values into the online store, they’re automatically appended to the offline store. However, you can also create offline and online stores separately if this is a requirement for your team or project.
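
To make these concepts concrete, the following sketch uses the SageMaker Python SDK to create a feature group with both an online and an offline store and to ingest a couple of records. The bucket, role ARN, feature names, and sample values are placeholders, not values from this post; real ingestion pipelines like the ones described below would operate at much larger scale.

import time
import pandas as pd
from sagemaker.session import Session
from sagemaker.feature_store.feature_group import FeatureGroup

# Placeholder values -- substitute your own bucket, role, and schema.
OFFLINE_STORE_S3_URI = 's3://<YOUR OFFLINE STORE BUCKET>/sessions'
EXECUTION_ROLE_ARN = 'arn:aws:iam::<ACCOUNT ID>:role/<SAGEMAKER EXECUTION ROLE>'

# A tiny example feature set: one row per session.
df = pd.DataFrame({
    'session_id': ['s-001', 's-002'],
    'clicks': [3, 7],
    'event_time': [1616700000.0, 1616700060.0],  # required event-time feature (Unix seconds)
})
df['session_id'] = df['session_id'].astype('string')

feature_group = FeatureGroup(name='sessions', sagemaker_session=Session())
feature_group.load_feature_definitions(data_frame=df)

# enable_online_store=True provisions both stores; records ingested to the online
# store are automatically appended to the offline store in the S3 location below.
feature_group.create(
    s3_uri=OFFLINE_STORE_S3_URI,
    record_identifier_name='session_id',
    event_time_feature_name='event_time',
    role_arn=EXECUTION_ROLE_ARN,
    enable_online_store=True,
)

# Wait for the feature group to finish creating before ingesting.
while feature_group.describe()['FeatureGroupStatus'] == 'Creating':
    time.sleep(5)

feature_group.ingest(data_frame=df, max_workers=1, wait=True)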

The following diagram depicts three functional teams, each with its own feature pipeline writing to a feature group in a centralized feature store.

The Personalization account manages user session data collected from a customer-facing application and owns a feature pipeline that produces a feature group called Sessions with features derived from the session data. This pipeline writes the generated feature values to the centralized feature store. Likewise, a feature pipeline in the Product Success account is responsible for producing features in the Reviews feature group, and the Data Curation account produces features in the Users feature group.

The centralized feature store account holds all features received from the three producer accounts, mapped to three feature groups: Sessions, Reviews, and Users. Feature pipelines can write to the centralized feature store by assuming a specific IAM role that is created in the centralized store account. We discuss how to enable this cross-account role later in this post. External accounts can also query features from the feature groups in the centralized store for training or inference, as shown in the preceding architecture diagram. For training, you can assume the IAM role from the centralized store and run cross-account Amazon Athena queries (as shown in the diagram), or initiate an Amazon EMR or SageMaker Processing job to create training datasets. In case of real-time inference, you can read online features directly via the same assumed IAM role for cross-account access.

In this model, the centralized feature store usually resides in a production account. Applications using this store can either live in this account or in other accounts with cross-account access to the centralized feature store. You can replicate this entire structure in lower environments, such as development or staging, for testing infrastructure changes before promoting them to production.

Combined feature store

In this section, we discuss a variant of the centralized feature store pattern called the combined feature store pattern. In feature engineering, a common practice is to combine existing features to derive new features. When teams combine shared features from the centralized store with local features in their own feature store, they can derive new enhanced features to help build more complex data models. We know from the previous section that the centralized store makes it easy for any data science team to access external features and use them with their existing pool of features to compound and evolve new features.

Security and compliance is another use case for teams to maintain a team-specific feature store in addition to accessing features from the centralized store. Many teams require specific access rights that aren’t granted to everyone in the organization. For example, it might not be feasible to publish features that are extracted from sensitive data to a centralized feature store within the organization.

In the following architecture diagram, the centralized feature store is the account that collects and catalogs all the features received from multiple feature pipelines into one central repository. In this example, the account of the combined store belongs to the Core Search team. This account is the consumer of the shareable features from the centralized store. In addition, this account manages user keyword data collected via a customer-facing search application.

This account maintains its own local offline and online stores. These local stores are populated by a feature pipeline set up locally to ingest user query keyword data and generate features. These features are grouped under a feature group named Keywords. Feature Store by default automatically creates an AWS Glue table for this feature group, which is registered in the AWS Glue Data Catalog in this account. The metadata of this table points to the Amazon S3 location of the feature group in this account’s offline store.

The combined store account can also access feature groups Sessions, Reviews, and Users from the centralized store. You can enable cross-account access by role, which we discuss in the next sections. Data scientists and researchers can use Athena to query feature groups created locally and join these internal features with external features derived from the centralized store for data science experiments.

Cross-account access overview

This section provides an overview of how to enable cross-account access for Feature Store between two accounts using an assumed role via AWS Security Token Service (AWS STS). AWS STS is a web service that enables you to request temporary, limited-privilege credentials for IAM users. AWS STS returns a set of temporary security credentials that you can use to access AWS resources that you might not normally have access to. These temporary credentials consist of an access key ID, secret access key, and security token.

To demonstrate this process, assume we have two accounts, A and B, as shown in the following diagram.

Account B maintains a centralized online and offline feature store. Account A needs access to both online and offline stores contained in Account B. To enable this, we create a role in Account B and let Account A assume that role using AWS STS. This enables Account A to behave like Account B, with permissions to perform specific actions identified by the role. AWS services like SageMaker (processing and training jobs, endpoints) and AWS Lambda used from Account A can assume the IAM role created in Account B by using an AWS STS client (see code block later in this post). This grants them the needed permissions to access resources like Amazon S3, Athena, and the AWS Glue Data Catalog inside Account B. After the services in Account A acquire the necessary permissions to the resources, they can access both the offline and online store in Account B. Depending on the choice of your service, you also need to add the IAM execution role for that service to the trusted policy of the cross-account IAM role in Account B. We discuss this in detail in the following section.

The preceding architecture diagram shows how Account A assumes a role from Account B to read and write to both online and offline stores contained within Account B. The seven steps in the diagram are as follows:

  1. Account B creates a role that can be assumed by others (for our use case, Account A).
  2. Account A assumes the IAM role from Account B using AWS STS. Account A can now generate temporary credentials that can be used to create AWS service clients that behave as if they are inside Account B.
  3. In Account A, SageMaker and other service clients (such as Amazon S3 and Athena) are created using the temporary credentials via the assumed role.
  4. The service clients in Account A can now create feature groups and populate feature values into Account B’s centralized online store using the AWS SDK.
  5. The online store in Account B automatically syncs with the offline store, also in Account B.
  6. The Athena service client inside Account A runs cross-account queries to read, group, and materialize feature sets using Athena tables inside Account B. Because the offline store exists in Account B, the corresponding AWS Glue tables, metadata catalog entries, and S3 objects all reside within Account B. Account A can use the assumed AWS STS role to query the offline features (S3 objects) inside Account B; a minimal sketch of such a query follows this list.
  7. Athena query results are returned back as feature datasets into Account A’s S3 bucket.
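
To make steps 6 and 7 concrete, the following is a minimal sketch of a cross-account Athena query issued from Account A. The role ARN, Glue database, table name, and bucket names are placeholders; the actual database and table names for your offline store can be found in Account B’s AWS Glue Data Catalog.

import boto3

# Assume the cross-account role created in Account B (see the setup steps below).
sts = boto3.client('sts')
creds = sts.assume_role(
    RoleArn='arn:aws:iam::<ACCOUNT B ID>:role/cross-account-assume-role',
    RoleSessionName='FeatureStoreAthenaQuery')['Credentials']

# Athena client that acts inside Account B using the temporary credentials.
athena = boto3.client('athena',
                      aws_access_key_id=creds['AccessKeyId'],
                      aws_secret_access_key=creds['SecretAccessKey'],
                      aws_session_token=creds['SessionToken'])

# Query the offline store's Glue table in Account B and write the results
# back to an S3 bucket owned by Account A (placeholders throughout).
response = athena.start_query_execution(
    QueryString='SELECT * FROM "<OFFLINE STORE TABLE NAME>" LIMIT 100',
    QueryExecutionContext={'Database': '<OFFLINE STORE GLUE DATABASE>'},
    ResultConfiguration={'OutputLocation': 's3://<ATHENA RESULTS BUCKET NAME IN ACCOUNT A>/results/'})
print(response['QueryExecutionId'])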

The temporary credentials returned by the AWS STS AssumeRole call are limited to 1 hour by default. You can extend the effective duration of your session by using RefreshableCredentials, a Botocore class that automatically refreshes the credentials so that long-running applications can keep working beyond the 1-hour timeframe. An example notebook demonstrating this is available in our GitHub repo.
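
If you don’t want to open the notebook right away, the following rough sketch shows the pattern using the same placeholder names as the rest of this post. Note that it attaches the refreshable credentials to a private attribute of the Botocore session, which is a common workaround rather than an official API.

import boto3
from botocore.credentials import RefreshableCredentials
from botocore.session import get_session

CROSS_ACCOUNT_ASSUME_ROLE = 'arn:aws:iam::<ACCOUNT B ID>:role/cross-account-assume-role'

def _refresh():
    """Fetch a fresh set of temporary credentials for the cross-account role."""
    creds = boto3.client('sts').assume_role(
        RoleArn=CROSS_ACCOUNT_ASSUME_ROLE,
        RoleSessionName='FeatureStoreLongRunningSession')['Credentials']
    return {
        'access_key': creds['AccessKeyId'],
        'secret_key': creds['SecretAccessKey'],
        'token': creds['SessionToken'],
        'expiry_time': creds['Expiration'].isoformat(),
    }

# Botocore refreshes these credentials automatically shortly before they expire.
refreshable = RefreshableCredentials.create_from_metadata(
    metadata=_refresh(), refresh_using=_refresh, method='sts-assume-role')

botocore_session = get_session()
botocore_session._credentials = refreshable  # private attribute; common workaround
session = boto3.Session(botocore_session=botocore_session)

sagemaker_client = session.client('sagemaker')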

Create cross-account access

This section details all the steps to create the cross-account access roles, policies, and permissions to enable shareability of features between Accounts A and B according to our architecture.

Create a Feature Store access role

From Account B, we create a Feature Store access role. This is the role assumed by AWS services inside Account A to gain access to resources in Account B.

  1. On the IAM console, in the navigation pane, choose Roles.
  2. Choose Create role.
  3. Choose Another AWS account.
  4. For Account ID, enter the 12-digit account ID of Account A (the account that will assume this role).
  5. Choose Next: Permissions.

  6. In the Permissions section, search for and attach the following AWS managed policies:
    • AmazonSageMakerFullAccess (you can further restrict this to least privileges based on your use case)
    • AmazonSageMakerFeatureStoreAccess
  7. Create and attach a custom policy to this new role (provide the S3 bucket name in Account A where the Athena query results collected in Account B are written):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AthenaResultsS3BucketCrossAccountAccessPolicy",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObjectAcl",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::<ATHENA RESULTS BUCKET NAME IN ACCOUNT A>",
        "arn:aws:s3:::<ATHENA RESULTS BUCKET NAME IN ACCOUNT A>/*"
      ]
    }
  ]
}

When you use this new AWS STS cross-account role from Account A, it can run Athena queries against the offline store content in Account B. The custom policy allows Athena (inside Account B) to write back the results to a results bucket in Account A. Make sure that this results bucket is created in Account A before you create the preceding policy.

Alternatively, you can let the centralized feature store in Account B maintain all the Athena query results in an S3 bucket. In this case, you have to set up cross-account Amazon S3 read access policies for external accounts to read the saved results (S3 objects).

  8. After you attach the policies, choose Next.
  9. Enter a name for this role (for example, cross-account-assume-role).
  10. On the Summary page for the created role, under Trust relationships, choose Edit trust relationship.
  11. Edit the access control policy document as shown in the following code:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<ACCOUNT A ID>:root"
        ],
        "Service": [
          "sagemaker.amazonaws.com",
          "athena.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}

The preceding code adds SageMaker and Athena as services in the Principal section. If you want more external accounts or roles to assume this role, you can add their corresponding ARNs in this section.

Create a SageMaker notebook instance

From Account A, create a SageMaker notebook instance with an IAM execution role. This role grants the SageMaker notebook in Account A the needed permissions to run actions on the feature store inside Account B. Alternatively, if you’re not using a SageMaker notebook and using Lambda instead, you need to create a role for Lambda with the same attached policies as shown in this section.

By default, the following policies are attached when you create a new execution role for a SageMaker notebook:

  • AmazonSageMaker-ExecutionPolicy
  • AmazonSageMakerFullAccess

We need to create and attach two additional custom policies. First, create a custom policy with the following code, which allows the execution role in Account A to perform certain S3 actions needed to interact with the offline store in Account B:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "FeatureStoreS3AccessPolicy",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetBucketAcl",
        "s3:GetObjectAcl"
      ],
      "Resource": [
        "arn:aws:s3:::<OFFLINE STORE BUCKET NAME IN ACCOUNT B>",
        "arn:aws:s3:::<OFFLINE STORE BUCKET NAME IN ACCOUNT B>/*"
      ]
    }
  ]
}

You can also attach the AWS managed policy AmazonSageMakerFeatureStoreAccess, if your offline store S3 bucket name contains the SageMaker keyword.

Second, create the following custom policy, which allows the SageMaker notebook in Account A to assume the role (cross-account-assume-role) created in Account B:

{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": "sts:AssumeRole",
    "Resource": "arn:aws:iam::<ACCOUNT B ID>:role/cross-account-assume-role"
  }
}

We know Account A can access the online and offline store in Account B. When Account A assumes the cross-account AWS STS role of Account B, it can run Athena queries inside Account B against its offline store. However, the results of these queries (feature datasets) need to be saved in Account A’s S3 bucket in order to enable model training. Therefore, we need to create a bucket in Account A that can store the Athena query results as well as create a bucket policy (see the following code). This policy allows the cross-account AWS STS role to write and read objects in this bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MyStatementSid",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<ACCOUNT B>:role/cross-account-assume-role"
        ]
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::<ATHENA RESULTS BUCKET NAME IN ACCOUNT A>",
        "arn:aws:s3:::<ATHENA RESULTS BUCKET NAME IN ACCOUNT A>/*"
      ]
    }
  ]
}

Modify the trust relationship policy

Because we created an IAM execution role in Account A, we use the ARN of this role to modify the trust relationships policy of the cross-account assume role in Account B:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "ARN OF SAGEMAKER EXECUTION ROLE CREATED IN ACCOUNT A"
        ],
        "Service": [
          "sagemaker.amazonaws.com",
          "athena.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}

Validate the setup process

After you set up all the roles and accompanying policies, you can validate the setup by running the example notebooks in the GitHub repo. The following code block is an excerpt from the example notebook and must be run in a SageMaker notebook running within Account A. It demonstrates how you can assume the cross-account role from Account B using AWS STS via the AssumeRole API call. This call returns a set of temporary credentials that Account A can use to create any service clients. When you use these clients, your code uses the permissions of the assumed role, and acts as if it belongs to Account B. For more information, see assume_role in the AWS SDK for Python (Boto 3) documentation.

import boto3

# Create STS client
sts = boto3.client('sts')

# Account A assumes the cross-account role created in Account B
CROSS_ACCOUNT_ASSUME_ROLE = 'arn:aws:iam::<ACCOUNT B ID>:role/cross-account-assume-role'
metadata = sts.assume_role(RoleArn=CROSS_ACCOUNT_ASSUME_ROLE,
                           RoleSessionName='FeatureStoreCrossAccountAccessDemo')

# Get temporary credentials
access_key_id = metadata['Credentials']['AccessKeyId']
secret_access_key = metadata['Credentials']['SecretAccessKey']
session_token = metadata['Credentials']['SessionToken']

region = boto3.Session().region_name
boto_session = boto3.Session(region_name=region)

# Create SageMaker client
sagemaker_client = boto3.client('sagemaker', 
                                 aws_access_key_id=access_key_id,
                                 aws_secret_access_key=secret_access_key,
                                 aws_session_token=session_token)

# Create SageMaker Feature Store runtime client
sagemaker_featurestore_runtime_client = boto3.client(service_name='sagemaker-featurestore-runtime', 
                                                     aws_access_key_id=access_key_id,
                                                     aws_secret_access_key=secret_access_key,
                                                     aws_session_token=session_token)
                                         
. . .

offline_config = {'OfflineStoreConfig': {'S3StorageConfig': {'S3Uri': f's3://{OFFLINE_STORE_BUCKET}'}}}

sagemaker_client.create_feature_group(FeatureGroupName=FEATURE_GROUP_NAME,
                                    RecordIdentifierFeatureName=record_identifier_feature_name,
                                    EventTimeFeatureName=event_time_feature_name,
                                    FeatureDefinitions=feature_definitions,
                                    Description='< DESCRIPTION >',
                                    Tags='< LIST OF TAGS >',
                                    OnlineStoreConfig={'EnableOnlineStore': True},
                                    RoleArn=CROSS_ACCOUNT_ASSUME_ROLE,
                                    **offline_config)

. . .

sagemaker_featurestore_runtime_client.put_record(FeatureGroupName=FEATURE_GROUP_NAME, 
                                                 Record=record)

After you create the SageMaker clients as per the preceding code example in Account A, you can create feature groups and populate features into Account B’s centralized online and offline store. For more information about how to create, describe, and delete feature groups, see create_feature_group in the Boto3 documentation. You can also use the Feature Store runtime client to put and get feature records to and from feature groups.
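
Continuing the example above, a record written with put_record can be read back from the centralized online store using get_record; the record identifier value here is a placeholder.

# Read the record back from Account B's online store (placeholder identifier value).
record = sagemaker_featurestore_runtime_client.get_record(
    FeatureGroupName=FEATURE_GROUP_NAME,
    RecordIdentifierValueAsString='<RECORD IDENTIFIER VALUE>')
print(record.get('Record', []))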

Offline store replication

Reproducibility is the ability to recreate an ML model exactly, so that given the same features as input, it returns the same output as the original model. This is essentially what we strive to achieve between the models we develop in a research environment and those we deploy in a production environment. Replicating feature engineering pipelines across accounts is a complex and time-consuming process that can introduce model discrepancies if not implemented properly. If the feature set used to train a model changes after the training phase, it may be difficult or impossible to reproduce the model.

Applications that reside on AWS usually have several distinct environments and accounts, such as development, testing, staging, and production. To achieve automated deployment of the application across different environments, we use CI/CD pipelines. Organizations often need to maintain isolated work environments and multiple copies of data in the same or different AWS Regions, or across different AWS accounts. In the context of Feature Store, some companies may want to replicate offline feature store data. Offline store replication via Amazon S3 replication can be a useful pattern in this case. This pattern enables isolated environments and accounts to retrain ML models using full feature sets without using cross-account roles or permissions.
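
As a rough sketch of what this pattern could look like with Boto3, the following snippet configures S3 replication from a source offline store bucket to a destination bucket that can live in another account. The bucket names, replication role, and account IDs are placeholders, and both buckets must already have versioning enabled; treat it as a starting point rather than a complete setup.

import boto3

# Sketch: replicate the offline store bucket to a destination bucket, which can
# live in a different account. All names and IDs below are placeholders, and
# versioning must already be enabled on both buckets.
s3 = boto3.client('s3')

s3.put_bucket_replication(
    Bucket='example-offline-store-bucket',
    ReplicationConfiguration={
        # IAM role (in the source account) that S3 assumes to replicate objects
        'Role': 'arn:aws:iam::<SOURCE ACCOUNT ID>:role/s3-replication-role',
        'Rules': [{
            'ID': 'replicate-offline-store',
            'Priority': 1,
            'Status': 'Enabled',
            'Filter': {'Prefix': ''},  # replicate every object in the bucket
            'DeleteMarkerReplication': {'Status': 'Disabled'},
            'Destination': {
                'Bucket': 'arn:aws:s3:::example-offline-store-replica',
                'Account': '<DESTINATION ACCOUNT ID>',
                # Hand object ownership to the destination account
                'AccessControlTranslation': {'Owner': 'Destination'}
            }
        }]
    }
)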

Conclusion

In this post, we demonstrated architecture patterns such as the centralized feature store and the combined feature store, along with other design considerations for SageMaker Feature Store that are essential to cross-functional data science collaboration. We also showed how to set up cross-account access using AWS STS.

To learn more about Feature Store capabilities and use cases, see Understanding the key capabilities of Amazon SageMaker Feature Store and Using streaming ingestion with Amazon SageMaker Feature Store to make ML-backed decisions in near-real time.

If you have any comments or questions, please leave them in the comments section.


About the Authors

Arunprasath Shankar is an Artificial Intelligence and Machine Learning (AI/ML) Specialist Solutions Architect with AWS, helping global customers scale their AI solutions effectively and efficiently in the cloud. In his spare time, Arun enjoys watching sci-fi movies and listening to classical music.

Mark Roy is a Principal Machine Learning Architect for AWS, helping AWS customers design and build AI/ML solutions. Mark’s work covers a wide range of ML use cases, with a primary interest in computer vision, deep learning, and scaling ML across the enterprise. He has helped companies in many industries, including Insurance, Financial Services, Media and Entertainment, Healthcare, Utilities, and Manufacturing. Mark holds 6 AWS certifications, including the ML Specialty Certification. Prior to joining AWS, Mark was an architect, developer, and technology leader for 25+ years, including 19 years in financial services.

Stefan Natu is a Sr. AI/ML Specialist Solutions Architect at Amazon Web Services. He is focused on helping financial services customers build end-to-end machine learning solutions on AWS. In his spare time, he enjoys reading machine learning blogs, playing the guitar, and exploring the food scene in New York City.

Read More

Core Data Science researchers discuss research award opportunity in adaptive experimentation

On February 24, Facebook launched a request for proposals (RFP) on sample-efficient sequential Bayesian decision making, which closes on April 21. With this RFP, the Facebook Core Data Science (CDS) team hopes to deepen its ties to the academic research community by seeking out innovative ideas and applications of Bayesian optimization that further advance the field. To provide an inside look from the team behind the RFP, we reached out to Eytan Bakshy and Max Balandat, who are leading the effort within CDS.
Bakshy leads the Adaptive Experimentation team, which seeks to improve the throughput of experimentation with the help of machine learning and statistical learning. Balandat supports the team’s efforts on modeling and optimization, which is primarily focused on probabilistic models and Bayesian optimization. In this Q&A, Bakshy and Balandat contextualize the RFP by sharing more information about how the work of their team relates to the areas of interest for the call.

Q: What’s the goal of this RFP?

A: Primarily, we are keen to learn more about all the great research that is going on in this area. Conversely, we are also able to share a number of really interesting real-world use cases that we hope can help inspire additional applied research, and increase interest and research activity into sample-efficient sequential Bayesian decision making. Lastly, we aim to further strengthen our ties to academia and our collaborations with academics who are at the forefront of this area.

We are both excited to dive in and learn about creative applications and approaches to Bayesian optimization that researchers come up with in their proposals.

Q: What inspired you to launch this RFP?

A: We publish quite a bit in top-tier AI/ML venues, and all our papers are informed by very practical problems we face every day in our work. The need for exploring large design spaces via experiments with a limited budget is widespread across Facebook, Instagram, and Facebook Reality Labs. Much of our team’s work focuses on applied problems to help support the company and use-inspired basic research, but it is clear that there are plenty of ideas out there that can advance the area of sample-efficient sequential decision making, such as Bayesian optimization and related techniques.

In academia, it can sometimes be challenging to understand what exactly the most relevant and impactful “real-world” problems are. Conversely, academics may have an easier time taking a step back, looking at the bigger picture, and doing more exploratory research. With this RFP, we hope to help bridge this gap and foster increased collaboration and cross-pollination between industry and academia.

Q: What is Bayesian optimization and how is it applied at Facebook?

A: Bayesian optimization is a set of methodologies for exploring large design spaces on a limited budget. While Bayesian optimization is frequently used for hyperparameter optimization in machine learning (AutoML), our team’s work had originally been motivated by the use of online experiments (A/B tests) for optimizing software and recommender systems.

Since then, the applications for Bayesian optimization have expanded tremendously in scope, with applications ranging from the design of next-generation AR/VR hardware, to bridging the gap between simulations and real-world experiments, to efforts around providing affordable connectivity solutions to developing countries.

The main idea behind Bayesian optimization is to fit a probabilistic surrogate model to the “black-box” function one is trying to optimize, and then use this model to decide at which new parameters to evaluate the function next. Doing so allows for a principled way of trading off reducing uncertainty against exploring promising regions of the parameter space. As described earlier, we apply this approach to a large variety of problems in different domains at Facebook.

Q: What is BoTorch, and how does it relate to the RFP?

A: The Adaptive Experimentation team has been investing in methodological development and tooling for Bayesian optimization for over five years. A few years ago, we found that our tooling at the time was slowing down researchers’ ability to generate new ideas and engineers’ ability to scale out Bayesian optimization use cases.

To address these problems, we developed BoTorch, a framework for Bayesian optimization research, and Ax, a robust platform for adaptive experimentation. BoTorch follows the same modular design philosophy as PyTorch, which makes it very easy for users to swap out or rearrange individual components in order to customize all aspects of their algorithm, thereby empowering researchers to do state-of-the-art research on modern Bayesian optimization methods. By exploiting modern parallel computing paradigms on both CPUs and GPUs, it is also fast.
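
To give a concrete flavor of that modular design, here is a minimal Bayesian optimization loop in the style of BoTorch’s public tutorials. The toy objective, dimensions, and evaluation budget are our own illustrative choices, and exact function names (such as fit_gpytorch_mll) follow recent BoTorch releases and may differ in older versions.

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy "black-box" objective standing in for an expensive experiment
def black_box(x):
    return -(x - 0.3).pow(2).sum(dim=-1, keepdim=True)

train_X = torch.rand(5, 2, dtype=torch.double)  # initial random design
train_Y = black_box(train_X)
bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)

for _ in range(10):
    # Fit a GP surrogate to the data observed so far
    gp = SingleTaskGP(train_X, train_Y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))

    # Maximize an acquisition function to pick the next point to evaluate
    ei = ExpectedImprovement(model=gp, best_f=train_Y.max())
    candidate, _ = optimize_acqf(ei, bounds=bounds, q=1,
                                 num_restarts=5, raw_samples=64)

    # Evaluate the objective and update the data set
    train_X = torch.cat([train_X, candidate])
    train_Y = torch.cat([train_Y, black_box(candidate)])

print(f"Best observed value: {train_Y.max().item():.4f}")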

BoTorch has really changed the way we approach Bayesian optimization research and accelerates our ability to tackle new problems. With the RFP, we hope to attract more widespread interest in this area and raise awareness of our open source tools.

Q: Where can people stay updated and learn more?

A: We actively engage with researchers on Twitter, so follow @eytan and @maxbalandat for the latest research, and always feel free to reach out to us via Twitter, email, or GitHub Issues if you have any questions or ideas.

You can find the latest and greatest of what we are working on in our open source projects, BoTorch and Ax. It also helps to keep an eye out for our papers in machine learning conferences, such as NeurIPS, ICML, and AISTATS.

—-

Applications for the RFP on sample-efficient sequential Bayesian decision making close on April 21, 2021, and winners will be announced the following month. To receive updates about new research award opportunities and deadline notifications, subscribe to our RFP email list.

The post Core Data Science researchers discuss research award opportunity in adaptive experimentation appeared first on Facebook Research.

Read More

Progress and Challenges in Long-Form Open-Domain Question Answering

Posted by Aurko Roy, Research Scientist, Google Research

Open-domain long-form question answering (LFQA) is a fundamental challenge in natural language processing (NLP) that involves retrieving documents relevant to a given question and using them to generate an elaborate paragraph-length answer. While there has been remarkable recent progress in factoid open-domain question answering (QA), where a short phrase or entity is enough to answer a question, much less work has been done in the area of long-form question answering. LFQA is nevertheless an important task, especially because it provides a testbed to measure the factuality of generative text models. But, are current benchmarks and evaluation metrics really suitable for making progress on LFQA?

In “Hurdles to Progress in Long-form Question Answering” (to appear at NAACL 2021), we present a new system for open-domain long-form question answering that leverages two recent advances in NLP: 1) state-of-the-art sparse attention models, such as Routing Transformer (RT), which allow attention-based models to scale to long sequences, and 2) retrieval-based models, such as REALM, which facilitate retrievals of Wikipedia articles related to a given query. To encourage more factual grounding, our system combines information from several retrieved Wikipedia articles related to the given question before generating an answer. It achieves a new state of the art on ELI5, the only large-scale publicly available dataset for long-form question answering.

However, while our system tops the public leaderboard, we discover several troubling trends with the ELI5 dataset and its associated evaluation metrics. In particular, we find 1) little evidence that models actually use the retrievals on which they condition; 2) that trivial baselines (e.g., input copying) beat modern systems, like RAG / BART+DPR; and 3) that there is a significant train/validation overlap in the dataset. Our paper suggests mitigation strategies for each of these issues.

Text Generation
The main workhorse of NLP models is the Transformer architecture, in which each token in a sequence attends to every other token, resulting in a model whose cost scales quadratically with sequence length. The RT model introduces a dynamic, content-based sparse attention mechanism that reduces the complexity of attention in the Transformer model from n^2 to n^1.5, where n is the sequence length, which enables it to scale to long sequences. This allows each word to attend to other relevant words anywhere in the entire piece of text, unlike methods such as Transformer-XL, where a word can only attend to words in its immediate vicinity.

The key insight of the RT work is that each token attending to every other token is often redundant, and may be approximated by a combination of local and global attention. Local attention allows each token to build up a local representation over several layers of the model, where each token attends to a local neighborhood, facilitating local consistency and fluency. Complementing local attention, the RT model also uses mini-batch k-means clustering to enable each token to attend only to a set of most relevant tokens.
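
As a rough illustration of the routing idea (and not the actual Routing Transformer implementation), the sketch below clusters tokens with a few steps of k-means in a shared query/key space and lets each token attend only within its cluster; the shapes and hyperparameters are arbitrary.

import torch
import torch.nn.functional as F

def routing_attention(q, k, v, num_clusters=4, kmeans_iters=5):
    # Simplified content-based routing attention: tokens are grouped by k-means
    # in key space, and each query attends only to tokens in its own cluster.
    n, d = q.shape
    centroids = k[torch.randperm(n)[:num_clusters]].clone()
    for _ in range(kmeans_iters):
        assign = torch.cdist(k, centroids).argmin(dim=-1)  # (n,)
        for c in range(num_clusters):
            members = k[assign == c]
            if len(members) > 0:
                centroids[c] = members.mean(dim=0)
    assign = torch.cdist(k, centroids).argmin(dim=-1)      # final cluster assignment

    out = torch.zeros_like(v)
    for c in range(num_clusters):
        idx = (assign == c).nonzero(as_tuple=True)[0]
        if len(idx) == 0:
            continue
        scores = (q[idx] @ k[idx].T) / d ** 0.5            # cluster-local attention
        out[idx] = F.softmax(scores, dim=-1) @ v[idx]
    return out

# Example: 128 tokens with 64-dimensional heads, queries and keys in a shared space
q = k = torch.randn(128, 64)
v = torch.randn(128, 64)
out = routing_attention(q, k, v)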

Attention maps for the content-based sparse attention mechanism used in Routing Transformer. The word sequence is represented by the diagonal dark colored squares. In the Transformer model (left), each token attends to every other token. The shaded squares represent the tokens in the sequence to which a given token (the dark square) is attending. The RT model uses both local attention (middle), where tokens attend only to other tokens in their local neighborhood, and routing attention (right), in which a token only attends to clusters of tokens most relevant to it in context. The dark red, green and blue tokens only attend to the corresponding color of lightly shaded tokens.

We pre-train an RT model on the Project Gutenberg (PG-19) dataset with a language modeling objective, i.e., the model learns to predict the next word given all the previous words, so that it can generate fluent, paragraph-long text.
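
The objective itself is ordinary next-token prediction; a minimal sketch, with random tensors standing in for PG-19 token IDs and the decoder’s logits, looks like this.

import torch
import torch.nn.functional as F

# Language modeling objective: position t predicts token t + 1.
# Random tensors stand in for real PG-19 token IDs and decoder logits.
vocab_size, seq_len = 100, 16
tokens = torch.randint(vocab_size, (1, seq_len))
logits = torch.randn(1, seq_len, vocab_size)

loss = F.cross_entropy(logits[:, :-1].reshape(-1, vocab_size),  # predictions at positions 0..T-2
                       tokens[:, 1:].reshape(-1))               # targets are the next tokens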

Information Retrieval
To demonstrate the effectiveness of the RT model on the task of LFQA, we combine it with retrievals from REALM. The REALM model (Guu et al., 2020) is a retrieval-based model that uses maximum inner product search (MIPS) to retrieve Wikipedia articles relevant to a particular query or question. The model was fine-tuned for factoid-based question answering on the Natural Questions dataset. REALM utilizes the BERT model to learn good representations of a question and uses ScaNN to retrieve Wikipedia articles that have a high topical similarity with the question representation. The system is then trained end-to-end to maximize the log-likelihood on the QA task.
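
In other words, retrieval scores every candidate article by the inner product between its embedding and the question embedding and keeps the top matches. The toy sketch below uses random vectors in place of BERT representations and brute-force search in place of ScaNN.

import torch

# Maximum inner product search (MIPS) over document embeddings: random vectors
# stand in for learned BERT representations, and brute force stands in for ScaNN.
num_docs, dim, top_k = 10000, 128, 5
doc_embs = torch.randn(num_docs, dim)     # precomputed article embeddings
question_emb = torch.randn(dim)           # embedding of the input question

scores = doc_embs @ question_emb          # inner product with every article
top_scores, top_idx = scores.topk(top_k)  # indices of the best-matching articles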

We further improve the quality of REALM retrievals by using a contrastive loss. The idea is to encourage the representation of a question to move closer to that of its ground-truth answer and away from the other answers in its mini-batch. This ensures that when the system retrieves relevant items using this question representation, it returns articles that are “similar” to ground-truth answers. We call this retriever contrastive-REALM, or c-REALM.
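
A common way to implement this kind of objective is an in-batch contrastive loss over question and answer embeddings. The sketch below shows that general pattern; it is our reading of the description above, not the exact c-REALM training code.

import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q_emb, a_emb, temperature=0.1):
    # q_emb, a_emb: (batch, dim); a_emb[i] is the gold answer embedding for q_emb[i].
    # The other answers in the mini-batch serve as negatives.
    logits = (q_emb @ a_emb.T) / temperature  # (batch, batch) similarity matrix
    labels = torch.arange(q_emb.size(0))      # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# Example with random embeddings
loss = in_batch_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))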

The combined RT + c-REALM system for LFQA.

Evaluation
We test the model on long-form question answering using the ELI5 dataset, which is a part of the KILT benchmark, and is the only publicly available large-scale LFQA dataset. The KILT benchmark measures text retrievals using Precision (R-Prec) and text generation using ROUGE-L. The two scores are combined to give a KILT R-L score, which determines a model’s ranking on the leaderboard. We fine-tune the pre-trained RT model together with retrievals from c-REALM on the ELI5 dataset from KILT.

Our submission tops the KILT leaderboard for long-form question answering on ELI5 with a combined KILT R-L score of 2.36. It improves on the previous leaderboard entry of BART + DPR (KILT R-L score of 1.9), while having a similar number of parameters as the other models on the leaderboard. In terms of text generation quality, we see an improvement of +4.11, +5.78 and +9.14 ROUGE-L over T5, BART + DPR and RAG, respectively.

Our results on the KILT leaderboard for long form question answering on ELI5.

Example Generations from the RT Model

  • Q: Why are almost all boats white?

    Boats are generally painted white for obvious reasons, as they are inexpensive and easy to clean. Ships are also often painted to provide a visible visual contrast to the sea. Boats also often have high visibility ports to allow for easy international trade. In addition, many boats have an air-breathing function that makes them easier to spot and stay afloat, which is a good thing if you’re a pirate or trying to get into some country’s waters. Lastly, many ships are built in the last few decades, so they are not as exotic looking as they were thirty years ago. And lastly, many ships are made out of materials that are easy to work with, and so many of them are painted white as a practical matter. There are many, many more reasons for why ships are white, but those are the ones I know of. I’m sure there are many other reasons, but those are the ones I know of. Edit: I forgot to mention that the other answers about visibility and easy to see ports are also true. Ships are painted to appear to be larger, and more expensive. That makes them appear more impressive, and so they are more likely to be bought by people who want to show off their wealth and status.

  • Q: What causes the trail behind jets at high altitude?

    The same thing that causes the clouds you see at the start of a storm to form in the first place. The air pressure changes as the jet moves through the air, causing it to cool down and condense into water droplets. These droplets are in fact a lot bigger than the water droplets in the clouds, so when the plane passes through them, they can cause the air pressure to change, causing the water droplets to condense into clouds. This is what you see when you look at a cloud formation from the ground. In addition to this, when the plane passes through the clouds, it also moves through the air, causing it to change its temperature, causing the water droplets to condense and form clouds. These clouds are often quite large, and can block out the sun. This is what you see when you look at a contrail. Finally, the water vapor that forms at high altitudes is cooled by the air above, and as it cools, it condenses into clouds. This is what you see when you look at a cumulonimbus cloud. These clouds are even larger than the water droplets in the clouds, and they block out the sun.

  • Q: How do shampoo and conditioner work?

    The key ingredients in shampoo and conditioner are usually surfactants, they tend to attract oil from your skin. Shampoo will usually contain an anti-dandruff agent (dandruff is caused by fungi in your skin which cause dandruff). They also contain some moisturizing agents (moisturizers are things that stay in your skin longer and protect your scalp from cleaning), and a lot of anti-dandruff (dandruff is a side effect of high doses of both surfactants and moisturizers) Conditioner will contain moisturizing agents, like a skin moisturizer. They tend to cause oils to be separated from your hair (e.g. wash your hair with conditioner instead of shampoo). They both have a foaming agent that makes them foam up (usually Hydrogen Peroxide). They also have a foaming agent that stops dirt from attaching to your hair. Some conditioner will also suppress the action of the shampoo (to help prevent dirt from attaching to your hair).

  • Q: When glancing at a clock, why does the first second after glancing at it sometimes feel longer than the rest?

    This is known as chronostasis. Your brain gets used to seeing the second hand move after the first, so it ignores the first one. If you were to look at the clock a lot, you’d see the second hand move a lot more than the first, and you’d be able to tell the difference between the second and the first. The second hand will appear to move much slower than the first hand because your brain is used to seeing the second hand move a lot more than the first. If you were to go on a long trip, you’d get used to the second hand moving a lot more than the first, and you’d have no problem telling the difference between the second and the first hand. It’s kind of like how you don’t notice your nose when you’re driving, but if you look at it for a while, you start to notice it. Your brain is just getting used to seeing the nose move a lot more than the other, and it can’t tell the difference between the two. It’s kind of like that with your eyes.

  • Q: Why do Britain and other English empire countries still bow to monarchs? What real purpose does the queen serve?

    Depends what you define as “purpose”. There is no specific purpose for the British monarch – rather, they are figureheads that serve a role. The role is to have a high degree of influence over a country, though the rules of the monarchy can somewhat vary from country to country, and the ceremony of “kissing the royal corpse” does not always involve kissing the monarch. Whether that’s the only reason for the custom, or if it was the only reason, I can’t say, but that’s the reasoning. When the U.K. was in full power, the monarch was a puppet of Parliament, and the powers of the monarch were transferred to the Prime Minister and thus to the Prime Minister’s deputy, who then became the Prime Minister. Since then, Parliament has been able to vote on legislation that goes through the monarch, although they may still act as the monarch’s representative in negotiating treaties, which can have very very deep consequences. The Queen’s role, as a representative of Britain, doesn’t necessarily involve her formal approval of any of the laws or legislation that goes through Parliament, though.

  • Q: What exactly is fire, in detail? How can light and heat come from something we can’t really touch?

    Fire is the chemical reaction of fuel (oxygen) with (what we call) impurities (ash, soot, oil etc). These impurities are created due to a number of things. If you were to study the chemical composition of a log (or a candle) and analyse it you would discover the chemical formula (A, C, and G) which is a simplified version of what actually happens. There is always fuel, and there is always the ash/soot. Wood is made up of mostly carbon and hydrogen and these are the fuels that create the fire. Hydrogen and oxygen are reactive, and they form water (H2O) and CO2 (CO2). If we split a log into smaller pieces, we start to see the process of combustion as it occurs. The carbon is burned, and with the explosion of CO, H2O and H2CO2, we get water and carbon dioxide (mostly). If we smash a log together, we start to see what happens when we also hit another log with another log…imagine smashing two sand castles together. We see how the sand castles and brick castles form and collapse at different rates…then it starts to burn. The smoke, ash and flames are the actual combustion of fuel.

  • Q: What goes on in those tall tower buildings owned by major banks?

    The actual buildings are not there for profit but for show. There are many real world historical buildings that have little to no effect other than being a symbol of an economic entity or symbol of an era or location, and nothing more. For example look at Sears, Sears didn’t care what went on inside, it was all about the _appearance_ of its location, the prestige of the location, the facilities and so on. It didn’t care about how long it took it to operate, it was about how much people would pay to go see it. Sears was a landmark as a cultural movement and other big companies followed suit, so if you want to see a building you’ve never seen before, you have to go see Sears, just like you have to see a Toyota Camry for Toyota Camry. They used to be all about building new factories, some of them if I recall, but now that they’re bigger, that means that more factory jobs are coming to them. You’ve probably seen them in stores as stores where people buy and sell stuff, so there aren’t that many places for them to come from. Instead, it’s just for show, a symbol of rich people.

Hurdles Towards Progress in LFQA
However, while the RT system described here tops the public leaderboard, a detailed analysis of the model and the ELI5 dataset reveals some concerning trends.

  • Many held-out questions are paraphrased in the training set. Best answer to similar train questions gets 27.4 ROUGE-L.

  • Simply retrieving answers to random unrelated training questions yields relatively high ROUGE-L, while actual gold answers underperform generations.

  • Conditioning answer generation on random documents instead of relevant ones does not measurably impact its factual correctness. Longer outputs get higher ROUGE-L.

We find little to no evidence that the model is actually grounding its text generation in the retrieved documents: fine-tuning an RT model with random retrievals from Wikipedia (i.e., random retrieval + RT) performs nearly as well as the c-REALM + RT model (24.2 vs. 24.4 ROUGE-L). We also find significant overlap in the training, validation, and test sets of ELI5 (with several questions being paraphrases of each other), which may eliminate the need for retrievals. The KILT benchmark measures the quality of retrievals and generations separately, without making sure that the text generation actually uses the retrievals.

Trivial baselines get higher ROUGE-L scores than RAG and BART + DPR.

Moreover, we find issues with the ROUGE-L metric used to evaluate the quality of text generation, with trivial nonsensical baselines, such as a Random Training Set answer and Input Copying, achieving relatively high ROUGE-L scores (even beating BART + DPR and RAG).
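
This is easy to probe directly. The sketch below scores an input-copying baseline with the rouge-score package, reusing the example question from above; the reference answer is an illustrative string, not an ELI5 annotation.

from rouge_score import rouge_scorer

# Score a trivial "input copying" baseline with ROUGE-L.
# The reference answer below is an illustrative string, not ELI5 data.
question = "Why are almost all boats white?"
reference = ("Boats are often painted white because light colors reflect "
             "sunlight and keep the hull cooler, and white is cheap to maintain.")

scorer = rouge_scorer.RougeScorer(['rougeL'], use_stemmer=True)
score = scorer.score(reference, question)['rougeL']
print(f"ROUGE-L F1 of copying the question: {score.fmeasure:.3f}")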

Conclusion
We proposed a system for long-form question answering based on Routing Transformer and REALM, which tops the KILT leaderboard on ELI5. However, a detailed analysis reveals several issues with the benchmark that preclude using it to inform meaningful modeling advances. We hope that the community works together to solve these issues so that researchers can climb the right hills and make meaningful progress on this challenging but important task.

Acknowledgments
The Routing Transformer work has been a team effort involving Aurko Roy, Mohammad Saffar, Ashish Vaswani and David Grangier. The follow-up work on open-domain long-form question answering has been a collaboration involving Kalpesh Krishna, Aurko Roy and Mohit Iyyer. We wish to thank Vidhisha Balachandran, Niki Parmar and Ashish Vaswani for several helpful discussions, and the REALM team (Kenton Lee, Kelvin Guu, Ming-Wei Chang and Zora Tung) for help with their codebase and several useful discussions, which helped us improve our experiments. We are grateful to Tu Vu for help with the QQP classifier used to detect paraphrases in ELI5 train and test sets. We thank Jules Gagnon-Marchand and Sewon Min for suggesting useful experiments on checking ROUGE-L bounds. Finally we thank Shufan Wang, Andrew Drozdov, Nader Akoury and the rest of the UMass NLP group for helpful discussions and suggestions at various stages in the project.

Read More