Deploying machine learning to improve mental health

A machine-learning expert and a psychology researcher/clinician may seem an unlikely duo. But MIT’s Rosalind Picard and Massachusetts General Hospital’s Paola Pedrelli are united by the belief that artificial intelligence may be able to help make mental health care more accessible to patients.

In her 15 years as a clinician and researcher in psychology, Pedrelli says “it’s been very, very clear that there are a number of barriers for patients with mental health disorders to accessing and receiving adequate care.” Those barriers may include figuring out when and where to seek help, finding a nearby provider who is taking patients, and obtaining financial resources and transportation to attend appointments. 

Pedrelli is an assistant professor in psychology at Harvard Medical School and the associate director of the Depression Clinical and Research Program at Massachusetts General Hospital (MGH). For more than five years, she has been collaborating with Picard, an MIT professor of media arts and sciences and a principal investigator at MIT’s Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic), on a project to develop machine-learning algorithms to help diagnose and monitor symptom changes among patients with major depressive disorder.

Machine learning is a type of AI technology where, when the machine is given lots of data and examples of good behavior (i.e., what output to produce when it sees a particular input), it can get quite good at autonomously performing a task. It can also help identify patterns that are meaningful, which humans may not have been able to find as quickly without the machine’s help. Using wearable devices and smartphones of study participants, Picard and Pedrelli can gather detailed data on participants’ skin conductance and temperature, heart rate, activity levels, socialization, personal assessment of depression, sleep patterns, and more. Their goal is to develop machine learning algorithms that can intake this tremendous amount of data, and make it meaningful — identifying when an individual may be struggling and what might be helpful to them. They hope that their algorithms will eventually equip physicians and patients with useful information about individual disease trajectory and effective treatment.

“We’re trying to build sophisticated models that have the ability to not only learn what’s common across people, but to learn categories of what’s changing in an individual’s life,” Picard says. “We want to provide those individuals who want it with the opportunity to have access to information that is evidence-based and personalized, and makes a difference for their health.”

Machine learning and mental health

Picard joined the MIT Media Lab in 1991. Three years later, she published a book, “Affective Computing,” which spurred the development of a field with that name. Affective computing is now a robust area of research concerned with developing technologies that can measure, sense, and model data related to people’s emotions. 

While early research focused on determining if machine learning could use data to identify a participant’s current emotion, Picard and Pedrelli’s current work at MIT’s Jameel Clinic goes several steps further. They want to know if machine learning can estimate disorder trajectory, identify changes in an individual’s behavior, and provide data that informs personalized medical care. 

Picard and Szymon Fedor, a research scientist in Picard’s affective computing lab, began collaborating with Pedrelli in 2016. After running a small pilot study, they are now in the fourth year of their National Institutes of Health-funded, five-year study. 

To conduct the study, the researchers recruited MGH participants with major depressive disorder who have recently changed their treatment. So far, 48 participants have enrolled in the study. For 22 hours per day, every day for 12 weeks, participants wear Empatica E4 wristbands. These wearable wristbands, designed by one of the companies Picard founded, can pick up biometric data, like electrodermal (skin) activity. Participants also download apps on their phones that collect data on texts and phone calls, location, and app usage, and prompt them to complete a biweekly depression survey.

Every week, patients check in with a clinician who evaluates their depressive symptoms. 

“We put all of that data we collected from the wearable and smartphone into our machine-learning algorithm, and we try to see how well the machine learning predicts the labels given by the doctors,” Picard says. “Right now, we are quite good at predicting those labels.” 
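
The team’s exact models aren’t published here, but the underlying setup is standard supervised learning: weekly feature vectors derived from the wearable and smartphone streams, with clinician-rated symptom scores as labels. The following is a minimal sketch of that idea, using placeholder data and an assumed per-participant cross-validation split; it is an illustration, not the study’s actual pipeline:

# Minimal sketch (not the study's pipeline): predict weekly clinician-rated
# depression scores from aggregated wearable/smartphone features.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

# Placeholder data: one row per participant-week (e.g., mean skin conductance,
# sleep hours, heart rate, mobility, phone usage), with a clinician score label.
rng = np.random.default_rng(0)
X = rng.normal(size=(480, 6))                   # 48 participants x 10 weeks x 6 features
y = rng.normal(loc=12.0, scale=5.0, size=480)   # placeholder symptom scores
groups = np.repeat(np.arange(48), 10)           # participant IDs for the splitter

# Group-wise cross-validation keeps each participant's weeks in a single fold,
# so the model is always evaluated on people it has never seen.
model = GradientBoostingRegressor()
scores = cross_val_score(model, X, y, groups=groups,
                         cv=GroupKFold(n_splits=5),
                         scoring="neg_mean_absolute_error")
print("MAE per fold:", -scores)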

Empowering users

While developing effective machine-learning algorithms is one challenge researchers face, designing a tool that will empower and uplift its users is another. Picard says, “The question we’re really focusing on now is, once you have the machine-learning algorithms, how is that going to help people?” 

Picard and her team are thinking critically about how the machine-learning algorithms may present their findings to users: through a new device, a smartphone app, or even a method of notifying a predetermined doctor or family member of how best to support the user. 

For example, imagine a technology that records that a person has recently been sleeping less, staying inside their home more, and has a faster-than-usual heart rate. These changes may be so subtle that the individual and their loved ones have not yet noticed them. Machine-learning algorithms may be able to make sense of these data, mapping them onto the individual’s past experiences and the experiences of other users. The technology may then be able to encourage the individual to engage in certain behaviors that have improved their well-being in the past, or to reach out to their physician. 

If implemented incorrectly, it’s possible that this type of technology could have adverse effects. If an app alerts someone that they’re headed toward a deep depression, that could be discouraging information that leads to further negative emotions. Pedrelli and Picard are involving real users in the design process to create a tool that’s helpful, not harmful.

“What could be effective is a tool that could tell an individual ‘The reason you’re feeling down might be the data related to your sleep has changed, and the data relate to your social activity, and you haven’t had any time with your friends, your physical activity has been cut down. The recommendation is that you find a way to increase those things,’” Picard says. The team is also prioritizing data privacy and informed consent.

Artificial intelligence and machine-learning algorithms can make connections and identify patterns in large datasets that humans aren’t as good at noticing, Picard says. “I think there’s a real compelling case to be made for technology helping people be smarter about people.”

Resolving High-Energy Impacts on Quantum Processors

Quantum processors are made of superconducting quantum bits (qubits) that — being quantum objects — are highly susceptible to even tiny amounts of environmental noise. This noise can cause errors in quantum computation that need to be addressed to continue advancing quantum computers. Our Sycamore processors are installed in specially designed cryostats, where they are sealed away from stray light and electromagnetic fields and are cooled down to very low temperatures to reduce thermal noise.

However, the world is full of high-energy radiation. In fact, there’s a tiny background of high-energy gamma rays and muons that pass through everything around us all the time. While these particles interact so weakly that they don’t cause any harm in our day-to-day lives, qubits are sensitive enough that even weak particle interactions can cause significant interference.

In “Resolving Catastrophic Error Bursts from Cosmic Rays in Large Arrays of Superconducting Qubits”, published in Nature Physics, we identify the effects of these high-energy particles when they impact the quantum processor. To detect and study individual impact events, we use new techniques in rapid, repetitive measurement to operate our processor like a particle detector. This allows us to characterize the resulting burst of errors as they spread through the chip, helping to better understand this important source of correlated errors.

The Dynamics of a High-Energy Impact
The Sycamore quantum processor is constructed with a very thin layer of superconducting aluminum on a silicon substrate, onto which a pattern is etched to define the qubits. At the center of each qubit is the Josephson junction, a superconducting component that defines the distinct energy levels of the qubit, which are used for computation. In a superconducting metal, electrons bind together into a macroscopic quantum state, which allows electrons to flow as a current with zero resistance (a supercurrent). In superconducting qubits, information is encoded in different patterns of oscillating supercurrent going back and forth through the Josephson junction.

If enough energy is added to the system, the superconducting state can be broken up to produce quasiparticles. These quasiparticles are a problem, as they can absorb energy from the oscillating supercurrent and jump across the Josephson junction, which changes the qubit state and produces errors. To prevent any energy from being absorbed by the chip and producing quasiparticles, we use extensive shielding for electric and magnetic fields, and powerful cryogenic refrigerators to keep the chip near absolute zero temperature, thus minimizing the thermal energy.

A source of energy that we can’t effectively shield against is high-energy radiation, which includes charged particles and photons that can pass straight through most materials. One source of these particles is tiny amounts of radioactive elements that can be found everywhere, e.g., in building materials, the metal that makes up our cryostats, and even in the air. Another source is cosmic rays, which are extremely energetic particles produced by supernovae and black holes. When cosmic rays impact the upper atmosphere, they create a shower of high-energy particles that can travel all the way down to the surface and through our chip. Between radioactive impurities and cosmic ray showers, we expect a high-energy particle to pass through a quantum chip every few seconds.

When a high-energy impact event occurs, energy spreads through the chip in the form of phonons. When these arrive at the superconducting qubit layer, they break up the superconducting state and produce quasiparticles, which cause the qubit errors we observe.

When one of these particles impinges on the chip, it passes straight through and deposits a small amount of its energy along its path through the substrate. Even a small amount of energy from these particles is a very large amount of energy for the qubits. Regardless of where the impact occurs, the energy quickly spreads throughout the entire chip through quantum vibrations called phonons. When these phonons hit the aluminum layer that makes up the qubits, they have more than enough energy to break the superconducting state and produce quasiparticles. So many quasiparticles are produced that the probability of the qubits interacting with one becomes very high. We see this as a sudden and significant increase in errors over the whole chip as those quasiparticles absorb energy from the qubits. Eventually, as phonons escape and the chip cools, these quasiparticles recombine back into the superconducting state, and the qubit error rates slowly return to normal.

A high-energy particle impact (at time = 0 ms) on a patch of the quantum processor, showing error rates for each qubit over time. The event starts by rapidly spreading error over the whole chip, before saturating and then slowly returning to equilibrium.

Detecting Particles with a Computer
The Sycamore processor is designed to perform quantum error correction (QEC) to improve the error rates and enable it to execute a variety of quantum algorithms. QEC provides an effective way of identifying and mitigating errors, provided they are sufficiently rare and independent. However, in the case of a high-energy particle going through the chip, all of the qubits will experience high error rates until the event cools off, producing a correlated error burst that QEC won’t be able to correct. In order to successfully perform QEC, we first have to understand what these impact events look like on the processor, which requires operating it like a particle detector.

To do so, we take advantage of recent advances in qubit state preparation and measurement to quickly prepare each qubit in their excited state, similar to flipping a classical bit from 0 to 1. We then wait for a short idle time and measure whether they are still excited. If the qubits are behaving normally, almost all of them will be. Further, the qubits that experience a decay out of their excited state won’t be correlated, meaning the qubits that have errors will be randomly distributed over the chip.

However, during the experiment we occasionally observe large error bursts, where all the qubits on the chip suddenly become more error prone all at once. This correlated error burst is a clear signature of a high-energy impact event. We also see that, while all qubits on the chip are affected by the event, the qubits with the highest error rates are all concentrated in a “hotspot” around the impact site, where slightly more energy is deposited into the qubit layer by the spreading phonons.
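
In analysis terms, an impact event shows up as a moment when the error rate jumps on many qubits simultaneously rather than on a few at random. The sketch below illustrates that detection idea on synthetic data; the qubit count, error rates, and threshold are arbitrary assumptions, not the actual Sycamore analysis code:

# Toy sketch: flag cycles where the chip-averaged error rate spikes well above
# its long-run average, which is the signature of a correlated error burst.
import numpy as np

def find_bursts(error_rates, threshold_factor=10.0):
    """Return indices of measurement cycles whose chip-wide mean error rate
    exceeds threshold_factor times the overall average."""
    chip_mean = error_rates.mean(axis=1)          # average over all qubits per cycle
    baseline = error_rates.mean()                 # long-run average error rate
    return np.flatnonzero(chip_mean > threshold_factor * baseline)

# Synthetic example: 26 qubits, a quiet 1% background, and one injected
# "impact" at cycles 800-819 where most qubits decay.
rng = np.random.default_rng(1)
errors = rng.binomial(1, 0.01, size=(2000, 26)).astype(float)
errors[800:820] = rng.binomial(1, 0.6, size=(20, 26))
print(find_bursts(errors))                        # prints cycles near 800-819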

To detect high-energy impacts, we rapidly prepare the qubits in an excited state, wait a little time, and then check if they’ve maintained their state. An impact produces a correlated error burst, where all the qubits show a significantly elevated error rate, as shown around time = 8 seconds above.

Next Steps
Because these error bursts are severe and quickly cover the whole chip, they are a type of correlated error that QEC is unable to correct. Therefore, it’s very important to find a solution to mitigate these events in future processors that are expected to rely on QEC.

Shielding against these particles is very difficult and typically requires careful engineering and design of the cryostat and many meters of shielding, which becomes more impractical as processors grow in size. Another approach is to modify the chip, allowing it to tolerate impacts without causing widespread correlated errors. This is an approach taken in other complex superconducting devices like detectors for astronomical telescopes, where it’s not possible to use shielding. Examples of such mitigation strategies include adding additional metal layers to the chip to absorb phonons and prevent them from getting to the qubit, adding barriers in the chip to prevent phonons spreading over long distances, and adding traps for quasiparticles in the qubits themselves. By employing these techniques, future processors will be much more robust to these high-energy impact events.

As the error rates of quantum processors continue to decrease, and as we make progress in building a prototype of an error-corrected logical qubit, we’re increasingly pushed to study more exotic sources of error. While QEC is a powerful tool for correcting many kinds of errors, understanding and correcting more difficult sources of correlated errors will become increasingly important. We’re looking forward to future processor designs that can handle high energy impacts and enable the first experimental demonstrations of working quantum error correction.

Acknowledgements
This work wouldn’t have been possible without the contributions of the entire Google Quantum AI Team, especially those who worked to design, fabricate, install and calibrate the Sycamore processors used for this experiment. Special thanks to Rami Barends and Lev Ioffe, who led this project.

How Logz.io accelerates ML recommendations and anomaly detection solutions with Amazon SageMaker

Logz.io is an AWS Partner Network (APN) Advanced Technology Partner with AWS Competencies in DevOps, Security, and Data & Analytics. Logz.io offers a software as a service (SaaS) observability platform based on best-in-class open-source software solutions for log, metric, and tracing analytics. Customers are sending an increasing amount of data to Logz.io from various data sources to manage the health and performance of their applications and services. It can be overwhelming for new users who are looking to navigate across the various dashboards built over time, process different alert notifications, and connect the dots when troubleshooting production issues.

Mean time to detect (MTTD) and mean time to resolution (MTTR) are key metrics for our customers. They’re calculated by measuring the time from when a user on our platform starts to investigate an issue (such as a production service being down) to the point when they stop performing actions in the platform related to that specific investigation.

To help customers reduce MTTD and MTTR, Logz.io is turning to machine learning (ML) to provide recommendations for relevant dashboards and queries and perform anomaly detection via self-learning. As a result, the average user is equipped with the aggregated experience of their entire company, leveraging the wisdom of many. We found that our solution can reduce MTTR by up to 20%.

As MTTD decreases, users can identify the problem and resolve it faster. Our data semantic layer captures when an investigation starts and stops, as well as the popularity of each action the user takes with respect to a specific alert.

In this post, we share how Logz.io used Amazon SageMaker to reduce the time and effort required for our proof of concept (POC) and for experiments spanning research to production evaluation, and how we reduced our production inference cost.

The challenge

Before Logz.io used SageMaker, the time from research to POC testing and production experiments was quite lengthy. This was because we needed to create Spark jobs to collect, clean, and normalize the data, and accessing each data source required DevOps work. DevOps and data engineering skills aren’t part of our ML team, and this created a strong dependency between the teams.

Another challenge was to provide an ML inference service for our products while achieving an optimal cost-to-performance ratio. Our ideal scenario is to support as many models as possible per compute unit, while providing high concurrency for customers with many models. We had flexibility on inference time, because the initial window of the data stream for the inference service is a 5-minute bucket of logs.

Research phase

Data science is an iterative process that requires an interactive development environment for research, data processing, and validating the data output of every iteration. Therefore, we encourage our ML researchers to use notebooks.

To accelerate the iteration cycle, we wanted to test our notebooks’ code on real production data, while running it at scale. Furthermore, we wanted to avoid the bottleneck of DevOps and data engineering during the initial test in production, while still being able to view the outputs and estimate the code runtime.

To implement this, we wanted to give our data science team full control and end-to-end responsibility, from research to the initial test in production. We needed them to easily pull data, while preserving data access management and monitoring that access. They also needed to easily deploy their custom POC notebooks into production in a scalable manner, while monitoring the runtime and expected costs.

Evaluation phase

During this phase, we evaluated a few ML platforms in order to support both training and serving requirements. We found that SageMaker is the most appropriate for our use cases because it supports both training and inference. Furthermore, it’s customizable, so we can tailor it according to our preferred research process.

Initially, we started with local notebooks, testing various libraries. We ran into problems pulling massive amounts of data from production. Later, we got stuck at a point in the modeling phase where runs took many hours on a local machine.

We evaluated many solutions and finally chose the following architecture:

  • DataPlate – The open-source version of DataPlate helped us pull and join our data easily with simple SQL, utilizing our Spark Amazon EMR clusters while monitoring data access
  • SageMaker notebook instance and processing jobs – These helped us with the scalability of runtime and the flexibility of machine types and ML frameworks, while collaborating on our code via a Git connection

Research phase solution architecture

The following diagram illustrates the solution architecture of the research phase, which consists of the following components:

  • SageMaker notebooks – Data scientists use these notebooks to conduct their research.
  • AWS Lambda function – AWS Lambda is a serverless solution that runs a processing job on demand (a sketch of launching such a job appears after this list). The job uses a Docker container with the notebook we want to run during our experiment, together with all the common files the notebook needs (requirements.txt and the multi-processing functions code in a separate notebook).
  • Amazon ECR – Amazon Elastic Container Registry (Amazon ECR) stores our Docker container.
  • SageMaker Processing job – We can run this data processing job on any ML machine, and it runs our notebook with parameters.
  • DataPlate – This service helps us use SQL and join several data sources easily. It translates the SQL to Spark code and optimizes it, while monitoring data access and helping reduce data breaches. The Xtra version provided even more capabilities.
  • Amazon EMR – This service runs our data extractions as workloads over Spark, contacting all our data resources.
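
To make the on-demand step concrete, the following is a hedged sketch of how such a processing job can be launched with boto3 from the Lambda function. The job name, role ARN, image URI, and container arguments are placeholders, not our actual values:

# Hypothetical sketch: launch a SageMaker Processing job that runs a notebook
# packaged in a Docker image stored in Amazon ECR.
import boto3

sagemaker = boto3.client("sagemaker")

sagemaker.create_processing_job(
    ProcessingJobName="research-notebook-run-2021-12-01",               # placeholder name
    RoleArn="arn:aws:iam::111111111111:role/SageMakerProcessingRole",   # placeholder role
    AppSpecification={
        "ImageUri": "111111111111.dkr.ecr.us-east-1.amazonaws.com/notebook-runner:latest",
        "ContainerArguments": ["--notebook", "pull_and_process.ipynb"],  # assumed convention
    },
    ProcessingResources={
        "ClusterConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.m5.12xlarge",
            "VolumeSizeInGB": 100,
        }
    },
)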

With the SageMaker notebook instance lifecycle configuration, we can cap the maximum notebook instance runtime, using the autostop.py template script.

After testing the ML frameworks, we chose the SageMaker MXNet kernel for our clustering and ranking phases.

To test the notebook code on our production data, we encapsulated the notebook in a Docker container stored in Amazon ECR and ran it as a processing job to validate the maximum runtime on different types of machines.

The Docker container also helps us share resources among notebook tests. In some cases, a notebook calls other notebooks to utilize multiple processes by splitting big DataFrames into smaller ones that can run simultaneously on each vCPU of a large machine type.

The real-time production inference solution

In the research phase, we used Parquet files in Amazon Simple Storage Service (Amazon S3) to store our recommendations. These are consumed once a day by our engineering pipeline, which attaches the recommendations to our alerting mechanism.

However, our roadmap requires a higher-refresh-rate solution, and pulling once a day isn’t enough in the long term, because we want to provide recommendations even during the investigation.

To implement this solution at scale, we tested most of the SageMaker endpoint options as part of our anomaly-detection research. We tested 500 pre-built models on single-endpoint machines of various instance types and used concurrent multi-threaded clients to send requests to the endpoint. We measured the response time, CPU, memory, and other metrics (for more information, see Monitor Amazon SageMaker with Amazon CloudWatch). We found that the multi-model endpoint is a perfect fit for our use cases.

A multi-model endpoint can reduce our costs dramatically compared to a single-model endpoint, or even to running Flask (or other Python) web services on Kubernetes. Our first assumption was that we would have to provide one endpoint per customer, using a small 4-vCPU machine, and on average query four dedicated models, because each vCPU serves one model. With the multi-model endpoint, we could aggregate more customers on a single multi-model endpoint machine.

We had a model and encoding files per customer, and after running load tests, we determined that we could serve 50 customers, each using 10 models, even on the smallest ml.t2.medium instance.

In this stage, we considered using multi-model endpoints. Multi-model endpoints provide a scalable and cost-effective solution to deploy a large number of models, enabling you to host multiple models with a single inference container. This reduces hosting costs by improving endpoint utilization compared to using multiple small single-model endpoints that each serve a single customer. It also reduces deployment overhead because SageMaker manages loading models in memory and scaling them based on the traffic patterns to them.

Furthermore, the multi-model endpoint advantage is that if you have a high inference rate from specific customers, its framework preserves the last serving models in memory for better performance.
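
In practice, querying a multi-model endpoint differs from a standard endpoint only by naming the target model artifact. The following is a hedged sketch with boto3; the endpoint name, artifact key, and payload format are illustrative assumptions, not our production values:

# Hypothetical sketch: invoke one customer's model on a multi-model endpoint.
# TargetModel selects which artifact (under the endpoint's S3 model prefix)
# SageMaker loads and serves for this request.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="recommendations-multi-model",       # placeholder endpoint name
    TargetModel="customer-1234/model-7.tar.gz",       # placeholder artifact key
    ContentType="application/json",
    Body=json.dumps({"alert_id": "abc-123", "features": [0.2, 0.7, 0.1]}),
)
print(json.loads(response["Body"].read()))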

After estimating costs for multi-model endpoints vs. standard endpoints, we found that the multi-model approach could lead to a cost reduction of approximately 80%.

The outcome

In this section, we review the steps and the outcome of the process.

We use the notebook instance lifecycle configuration to enable running notebooks as processing jobs, encapsulating each notebook in a Docker container so we can validate the code faster and use the autostop mechanism:

#!/bin/bash

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

set -e

# OVERVIEW
# This script installs the sagemaker_run_notebook extension package in SageMaker Notebook Instance
#
# There are two parameters you need to set:
# 1. S3_LOCATION is the place in S3 where you put the extension tarball
# 2. TARBALL is the name of the tar file that you uploaded to S3. You should just need to check
#    that you have the version right.
sudo -u ec2-user -i <<'EOF'
# PARAMETERS
VERSION=0.18.0
EXTENSION_NAME=sagemaker_run_notebook
# Set up the user setting and workspace directories
mkdir -p /home/ec2-user/SageMaker/.jupyter-user/{workspaces,user-settings}
# Run in the conda environment that the Jupyter server uses so that our changes are picked up
source /home/ec2-user/anaconda3/bin/activate JupyterSystemEnv
# Install the extension and rebuild JupyterLab so it picks up the new UI
aws s3 cp s3://aws-emr-resources-11111111-us-east-1/infra-sagemaker/sagemaker_run_notebook-0.18.0-Logz-latest.tar.gz ./sagemaker_run_notebook-0.18.0-Logz-latest.tar.gz
pip install sagemaker_run_notebook-0.18.0-Logz-latest.tar.gz

jupyter lab build
source /home/ec2-user/anaconda3/bin/deactivate
EOF

# sudo -u ec2-user -i <<'EOF'
# PARAMETERS
for PACKAGE in pandas dataplate awswrangler==2.0.0 ipynb==0.5.1 prison==0.1.3 PyMySQL==0.10.1 requests==2.25.0 scipy==1.5.4 dtaidistance joblib sagemaker_run_notebook-0.18.0-Logz-latest.tar.gz fuzzywuzzy==0.18.0; do
  echo $PACKAGE

  # Note that "base" is special environment name, include it there as well.
  for env in base /home/ec2-user/anaconda3/envs/*; do
      source /home/ec2-user/anaconda3/bin/activate $(basename "$env")
      if [ $env = 'JupyterSystemEnv' ]; then
          continue
      fi
      pip install --upgrade "$PACKAGE"
      source /home/ec2-user/anaconda3/bin/deactivate
  done
done
jupyter lab build

# Tell Jupyter to use the user-settings and workspaces directory on the EBS
# volume.
echo "export JUPYTERLAB_SETTINGS_DIR=/home/ec2-user/SageMaker/.jupyter-user/user-settings" >> /etc/profile.d/jupyter-env.sh
echo "export JUPYTERLAB_WORKSPACES_DIR=/home/ec2-user/SageMaker/.jupyter-user/workspaces" >> /etc/profile.d/jupyter-env.sh

# The Jupyter server needs to be restarted to pick up the server part of the
# extension. This needs to be done as root.
initctl restart jupyter-server --no-wait

# OVERVIEW
# This script stops a SageMaker notebook once it's idle for more than 1 hour (default time)
# You can change the idle time for stop using the environment variable below.
# If you want the notebook to stop only if no browsers are open, remove the --ignore-connections flag
#
# Note that this script will fail if either condition is not met
#   1. Ensure the Notebook Instance has internet connectivity to fetch the example config
#   2. Ensure the Notebook Instance execution role has permissions to SageMaker:StopNotebookInstance to stop the notebook
#       and SageMaker:DescribeNotebookInstance to describe the notebook.
# PARAMETERS
IDLE_TIME=3600

echo "Fetching the autostop script"
wget https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/master/scripts/auto-stop-idle/autostop.py

echo "Starting the SageMaker autostop script in cron"

(crontab -l 2>/dev/null; echo "*/5 * * * * /usr/bin/python $PWD/autostop.py --time $IDLE_TIME --ignore-connections") | crontab -

We clone the sagemaker-run-notebook GitHub project, and add the following to the container:

  • Our pip requirements
  • The ability to run notebooks from within a notebook, which enables multi-processing behavior that utilizes all the ml.m5.12xlarge instance cores

This enables us to run workflows consisting of many notebooks as processing jobs with a single line of code, while defining the instance type to run on.

Because we can add parameters to the notebook, we can scale our processing by running simultaneously at different hours, days, or months to pull and process data.
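
Under the hood, this parameterization boils down to a Papermill call like the one below; the notebook names and parameters are illustrative assumptions, not our actual files:

# Hypothetical sketch: execute a parameterized notebook with Papermill, which
# injects the parameters into a tagged cell and saves the executed copy.
import papermill as pm

pm.execute_notebook(
    "pull_and_process.ipynb",                 # placeholder input notebook
    "pull_and_process_2021-12_output.ipynb",  # executed copy with outputs
    parameters={"start_date": "2021-12-01", "end_date": "2021-12-31"},
)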

We can also create scheduled jobs that run notebooks (and even limit the run time).

We can also observe the last runs and their details, such as processing time.

With Papermill, which is used in the container, we can view the output of every run, which helps us debug in production.

Our notebook output review is in the form of a standard read-only notebook.

Multi-processing helps us scale each notebook’s processing and utilize all of its cores. We generated functions in other notebooks that can do heavy processing, such as the following (a minimal sketch of this pattern appears after the list):

  • Explode JSONs
  • Find relevant rows in a DataFrame while the main notebook splits the DataFrame into as many chunks as there are CPU cores
  • Run clustering per alert type actions simultaneously
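
The following is a minimal sketch of that fan-out pattern; the function and column names are assumptions for illustration, not our actual notebook code:

# Hypothetical sketch: split a large DataFrame into one chunk per vCPU and
# process the chunks in parallel, then recombine the results.
import multiprocessing as mp
import numpy as np
import pandas as pd

def find_relevant_rows(chunk: pd.DataFrame) -> pd.DataFrame:
    # Placeholder for the heavy per-chunk logic (exploding JSONs, filtering, ...)
    return chunk[chunk["score"] > 0.5]

if __name__ == "__main__":
    df = pd.DataFrame({"score": np.random.rand(1_000_000)})
    chunks = np.array_split(df, mp.cpu_count())        # one chunk per core
    with mp.Pool() as pool:
        results = pool.map(find_relevant_rows, chunks)
    relevant = pd.concat(results)
    print(len(relevant))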

We then add these functional notebooks into the container that runs the notebook as a processing job. See the following Docker file (notice the COPY commands):

ARG BASE_IMAGE=need_an_image
FROM $BASE_IMAGE

ENV JUPYTER_ENABLE_LAB=yes
ENV PYTHONUNBUFFERED=TRUE
ENV PATH="/opt/program:${PATH}"

# Install Papermill (used to execute the notebooks) and the project requirements
COPY requirements.txt /tmp/requirements.txt
RUN pip install papermill jupyter nteract-scrapbook boto3 requests==2.20.1
RUN pip install -r /tmp/requirements.txt

# Set up the program in the image: the helper notebooks used for
# multi-processing and the scripts that run the target notebook
COPY multiprocessDownloadNormalizeFunctions.ipynb /tmp/multiprocessDownloadNormalizeFunctions.ipynb
COPY multiprocessFunctions.ipynb /tmp/multiprocessFunctions.ipynb
COPY run_notebook execute.py /opt/program/
ENTRYPOINT ["/bin/bash"]

# Run as root because of a permissions issue when accessing the directories
USER root

Results

During the research phase, we evaluated the option of running our notebooks as is to experiment and evaluate how our code performs on all our relevant data, not just a sample of it. We found that encapsulating our notebooks as processing jobs is a great fit for us, because we don’t need to rewrite code, we can utilize the power of AWS compute-optimized and memory-optimized instances, and we can easily follow the status of the process.

During the inference assessment, we evaluated various SageMaker endpoint solutions. We found that using a multi-model endpoint can help us serve approximately 50 customers, each having multiple (approximately 10) models, on a single instance, which meets our low-latency constraints and saves us up to 80% of the cost.

With this solution architecture, we were able to reduce the MTTR of our customers, which is a main metric for measuring success in using our platform. It reduces the total time from the point of responding to our alert link, which describes an issue in their systems, to when they’re done investigating the problem using our platform. During the investigation phase, we measure users’ actions with and without our ML recommendation solution. This helps us provide recommendations for the best action to resolve the specific issue faster and pinpoint anomalies to identify the actual cause of the problem.

Conclusion and next steps

In this post, we shared how Logz.io used SageMaker to improve MTTD and MTTR.

As a next step, we’re considering expanding the solution with additional features.

We encourage you to try out SageMaker notebooks. For more examples, check out the SageMaker examples GitHub repo.


About the Authors

Amit Gross leads the research department of Logz.io, which is responsible for the AI solutions of all Logz.io products, from the research phase to the integration phase. Prior to Logz.io, Amit managed both data science and security research groups at Here Inc. and Cellebrite Inc. Amit holds an M.Sc. in computer science from Tel Aviv University.

Yaniv Vaknin is a Machine Learning Specialist at Amazon Web Services. Prior to AWS, Yaniv held leadership positions with AI startups and Enterprise including co-founder and CEO of Dipsee.ai. Yaniv works with AWS customers to harness the power of Machine Learning to solve real world tasks and derive value. In his spare time, Yaniv enjoys playing soccer with his boys.

Eitan Sela is a Machine Learning Specialist Solutions Architect with Amazon Web Services. He works with AWS customers to provide guidance and technical assistance, helping them build and operate machine learning solutions on AWS. In his spare time, Eitan enjoys jogging and reading the latest machine learning articles.

Hatch Me If You Can: Startup’s Sorting Machines Use AI to Protect Healthy Fish Eggs

Fisheries collect millions upon millions of fish eggs, protecting them from predators to increase fish yield and support the propagation of endangered species — but an issue with gathering so many eggs at once is that those infected with parasites can put healthy ones at risk.

Jensorter, an Oregon-based startup, has created AI-powered fish egg sorters that can rapidly identify healthy versus unhealthy eggs. The machines, built on the NVIDIA Jetson Nano module, can also detect egg characteristics such as size and fertility status.

The devices then automatically sort the eggs based on these characteristics, allowing Jensorter’s customers in Alaska, the Pacific Northwest and Russia to quickly separate viable eggs from unhealthy ones — and protect them accordingly.

Jensorter is a member of NVIDIA Inception, a program that nurtures cutting-edge startups revolutionizing industries with advancements in AI, data science, high performance computing and more.

Picking Out the Good Eggs

According to Curt Edmondson, patent counsel and CTO of Jensorter, many fisheries aim to quickly dispose of unhealthy eggs to lower the risk of infecting healthy ones.

Using AI, Jensorter machines look at characteristics like color to discern an egg’s health status and determine whether it’s fertilized — at a speed of about 30 milliseconds per egg.

“Our fish egg sorters are achieving a much higher accuracy with the addition of AI powered by NVIDIA Jetson, which is allowing us to create advanced capabilities,” Edmondson said.

The startup offers several machines, each tailored to varying volumes of eggs to be sorted. The Model JH device, optimal for egg volumes of three to 10 million, can sort nearly 200,000 eggs per hour, eliminating the slow and laborious process of hand-picking.

“Using AI to capture and process images of eggs in real time could have great value over the long term,” Edmondson said. “If hatcheries come together and centralize their images in a database, we could identify patterns of egg characteristics that lead to healthy eggs.”

This could help propagate salmon and trout, species that play important roles in their ecosystems and are common food sources for humans, and which are on the decline in many areas, he added.

The Oregon Hatchery Research Center recently used Jensorter devices to conduct an alpha test examining whether smaller eggs lead to healthier fish. In the spring, the center will use the machines to proceed with beta testing in hatcheries, before publishing study results.

Jensorter also plans to create next-generation sorters that are faster still and can detect, count and separate eggs based on their sex, number of zygotes and other metrics that would be useful to fisheries.

Watch a tutorial on how Jensorter equipment works and learn more about NVIDIA Inception.

Cynthia Breazeal named dean for digital learning at MIT

In a letter to the MIT community today, Vice President for Open Learning Sanjay Sarma announced the appointment of Professor Cynthia Breazeal as dean for digital learning, effective Feb. 1. As dean, she will supervise numerous business units and research initiatives centered on developing and deploying digital technologies for learning. These include MIT xPRO, Bootcamps, Horizon, the Center for Advanced Virtuality, MIT Integrated Learning Initiative, RAISE, and other strategic initiatives. Breazeal has served as senior associate dean for open learning since the fall.

As dean, Breazeal will lead corporate education efforts, helping to grow the existing portfolio of online professional courses, content libraries, and boot camps, while looking more holistically at the needs of companies and professionals to identify areas of convergence and innovation. She will also lead research efforts at MIT Open Learning into teaching, learning, and how new technologies can enhance both, with a special focus on virtual and augmented reality, artificial intelligence, and learning science. Breazeal will help infuse these new technologies and pedagogies into all of the teams’ learning offerings.

“Cynthia brings to the deanship a remarkable combination of experience and expertise. She consistently displays an outstanding facility for leadership and collaboration, bringing together people, ideas, and technologies in creative and fruitful ways,” Sarma wrote in his letter to the community. “Cynthia is an ambassador for women in STEM and a trailblazer in interdisciplinary research and community engagement.”

The director of MIT RAISE — a cross-MIT research effort on advancing AI education for K-12 and adult learners — and head of the Personal Robots research group at the MIT Media Lab, Breazeal is a professor of media arts and sciences and a pioneer in human-robot interaction and social robotics. Her research focus includes technical innovation in AI and user experience design combined with understanding the psychology of engagement to design personified AI technologies that promote human flourishing and personal growth. Over the past decade, her work has expanded to include outreach, engagement, and education in the design and use of AI, as well as AI literacy. She has placed particular emphasis on diversity and inclusion for all ages, backgrounds, and comfort levels with technology.

“The work that Open Learning is doing to extend the best of MIT’s teaching, knowledge, and technology to the world is so thrilling to me,” says Breazeal. “I’m excited to work with these teams to grow and expand their respective programs and to develop new, more integrated, potentially thematic solutions for corporations and professionals.”

TC Haldi, senior director of MIT xPRO, says, “There’s an increasing sophistication in the needs of the professional workforce, as technologies and systems grow more complex in every sector. Cynthia has a deep understanding of the intersection between research and industry, and her insights into learning and technology are invaluable.”

Breazeal will also continue to head the Personal Robots research group, whose recent work focuses on the theme of “living with AI” and understanding the long-term impact of social robots that can build relationships and provide personalized support as helpful companions in daily life. Under her continued direction, the RAISE initiative, a joint collaboration between the Media Lab, Open Learning, and the MIT Schwarzman College of Computing, is bringing AI resources and education opportunities to teachers and students across the United States and the world through workshops and professional training, hands-on activities, research, and curricula.

UK Biobank Advances Genomics Research with NVIDIA Clara Parabricks

UK Biobank is broadening scientists’ access to high-quality genomic data and analysis by making its massive dataset available in the cloud alongside NVIDIA GPU-accelerated analysis tools.

Used by more than 25,000 registered researchers around the world, UK Biobank is a large-scale biomedical database and research resource with deidentified genetic datasets, along with medical imaging and health record data, from more than 500,000 participants across the U.K.

Regeneron Genetics Center, the high-throughput sequencing center of biotech leader Regeneron, recently teamed up with UK Biobank to sequence and analyze the exomes — all protein-coding portions of the genome — of all the biobank participants.

The Regeneron team used NVIDIA Clara Parabricks, a software suite for secondary genomic analysis of next-generation sequencing data, during the exome sequencing process.

UK Biobank has released 450,000 of these exomes for access by approved researchers, and is now providing scientists with six months of free access to Clara Parabricks through its cloud-based Research Analysis Platform. The platform was developed by bioinformatics company DNAnexus and lets scientists use Clara Parabricks running on NVIDIA GPUs in the AWS cloud.

“As demonstrated by Regeneron, GPU acceleration with Clara Parabricks achieves the throughputs, speed and reproducibility needed when processing genomic datasets at scale,” said Dr. Mark Effingham, deputy CEO of UK Biobank. “There are a number of research groups in the U.K. who were pushing for these accelerated tools to be available in our platform for use with our extensive dataset.”

Regeneron Exome Research Accelerated by Clara Parabricks

Regeneron’s researchers used the DeepVariant Germline Pipeline from NVIDIA Clara Parabricks to run their analysis with a model specific to the genetic center’s workflow.

Its researchers identified 12 million coding variants and hundreds of genes associated with health-related traits — certain genes were associated with increased risk for liver disease and eye disease, and others were linked to lower risk of diabetes and asthma.

The unique set of tools the researchers used for high-quality variant detection is available to UK Biobank registered users through the Research Analysis Platform. This capability will allow scientists to harmonize their own exome data with sequenced exome data from UK Biobank by running the same bioinformatics pipeline used to generate the initial reference dataset.

Cloud-Based Platform Improves Equity of Access

Researchers deciphering the genetic codes of humans — and of the viruses and bacteria that infect humans — can often be limited by the computational resources available to them.

UK Biobank is democratizing access by making its dataset open to scientists around the world, with a focus on further extending use by early-career researchers and those in low- and middle-income countries. Instead of researchers needing to download this huge dataset to use on their own compute resources, they can instead tap into UK Biobank’s cloud platform through a web browser.

“We were being contacted by researchers and clinicians who wanted to access UK Biobank data, but were struggling with access to the basic compute needed to work with even relatively small-scale data,” said Effingham. “The cloud-based platform provides access to the world-class technology needed for large-scale exome sequencing and whole genome sequencing analysis.”

Researchers using the platform pay only for the computational cost of their analyses and for storage of new data they generate from the biobank’s petabyte-scale dataset, Effingham said.

Using Clara Parabricks on DNAnexus helps reduce both the time and cost of this genomic analysis, delivering a whole exome analysis that would take nearly an hour of computation on a 32-vCPU machine in less than five minutes — while also reducing cost by approximately 40 percent.

Exome Sequencing Provides Insights for Precision Medicine

For researchers studying links between genetics and disease, exome sequencing is a critical tool — and the UK Biobank dataset includes nearly half a million participant exomes to work with.

The exome is approximately 1.5 percent of the human genome, and consists of all the known genes and their regulatory elements. By studying genetic variation in exomes across a large, diverse population, scientists can better understand the population’s structure, helping researchers address evolutionary questions and describe how the genome works.

With a dataset as large as UK Biobank’s, it is also possible to identify the specific genetic variants associated with inherited diseases, including cardiovascular disease, neurodegenerative conditions and some kinds of cancer.

Exome sequencing can even shed light on potential genetic drivers that might increase or decrease an individual’s risk of severe disease from COVID-19 infection, Effingham said. As the pandemic continues, UK Biobank is adding COVID case data, vaccination status, imaging data and patient outcomes for thousands of participants to its database.

Get started with NVIDIA Clara Parabricks on the DNAnexus-developed UK Biobank Research Analysis Platform. Learn more about the exome sequencing project by registering for this webinar, which takes place Feb. 17 at 8am Pacific.

Subscribe to NVIDIA healthcare news here

Main image shows the freezer facility at UK Biobank where participant samples are stored. Image courtesy of UK Biobank. 

Introducing Text and Code Embeddings in the OpenAI API

We are introducing embeddings, a new endpoint in the OpenAI API that makes it easy to perform natural language and code tasks like semantic search, clustering, topic modeling, and classification. Embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts. Our embeddings outperform top models in 3 standard benchmarks, including a 20% relative improvement in code search.

Embeddings are useful for working with natural language and code, because they can be readily consumed and compared by other machine learning models and algorithms like clustering or search.

Embeddings that are numerically similar are also semantically similar. For example, the embedding vector of “canine companions say” will be more similar to the embedding vector of “woof” than that of “meow.”

The new endpoint uses neural network models, which are descendants of GPT-3, to map text and code to a vector representation—“embedding” them in a high-dimensional space. Each dimension captures some aspect of the input.

The new /embeddings endpoint in the OpenAI API provides text and code embeddings with a few lines of code:

import openai
response = openai.Embedding.create(
    input="canine companions say",
    engine="text-similarity-davinci-001")

print(response)
{
  "data": [
    {
      "embedding": [
        0.000108064,
        0.005860855,
        -0.012656143,
        ...
        -0.006642727,
        0.002583989,
        -0.012567150
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "text-similarity-babbage:001",
  "object": "list"
}

We’re releasing three families of embedding models, each tuned to perform well on different functionalities: text similarity, text search, and code search. The models take either text or code as input and return an embedding vector.

The models and their intended use cases:

  • Text similarity (captures semantic similarity between pieces of text) – Models: text-similarity-{ada, babbage, curie, davinci}-001. Use cases: clustering, regression, anomaly detection, visualization.
  • Text search (semantic information retrieval over documents) – Models: text-search-{ada, babbage, curie, davinci}-{query, doc}-001. Use cases: search, context relevance, information retrieval.
  • Code search (find relevant code with a query in natural language) – Models: code-search-{ada, babbage}-{code, text}-001. Use cases: code search and relevance.

Text Similarity Models

Text similarity models provide embeddings that capture the semantic similarity of pieces of text. These models are useful for many tasks including clustering, data visualization, and classification.

The following interactive visualization shows embeddings of text samples from the DBpedia dataset:

Embeddings from the text-similarity-babbage-001 model, applied to the DBpedia dataset. We randomly selected 100 samples from the dataset covering 5 categories, and computed the embeddings via the /embeddings endpoint. The different categories show up as 5 clear clusters in the embedding space. To visualize the embedding space, we reduced the embedding dimensionality from 2048 to 3 using PCA. The code for visualizing the embedding space in three dimensions is available here.

To compare the similarity of two pieces of text, you simply use the dot product on the text embeddings. The result is a “similarity score,” sometimes called “cosine similarity,” between –1 and 1, where a higher number means more similarity. In most applications, the embeddings can be pre-computed, and then the dot product comparison is extremely fast to carry out.

import openai, numpy as np

resp = openai.Embedding.create(
    input=["feline friends go", "meow"],
    engine="text-similarity-davinci-001")

embedding_a = resp['data'][0]['embedding']
embedding_b = resp['data'][1]['embedding']

similarity_score = np.dot(embedding_a, embedding_b)

One popular use of embeddings is to use them as features in machine learning tasks, such as classification. In machine learning literature, when using a linear classifier, this classification task is called a “linear probe.” Our text similarity models achieve new state-of-the-art results on linear probe classification in SentEval (Conneau et al., 2018), a commonly used benchmark for evaluating embedding quality.

Linear probe classification over 7 datasets:

  • Previous SOTA (Guo et al., 2021): 90.2%
  • text-similarity-davinci-001: 92.2%
  • text-similarity-curie-001: 91.5%
  • text-similarity-babbage-001: 91.1%
  • text-similarity-ada-001: 89.3%

Text Search Models

Text search models provide embeddings that enable large-scale search tasks, like finding a relevant document among a collection of documents given a text query. Embeddings for the documents and the query are produced separately, and then cosine similarity is used to compare the similarity between the query and each document.

Embedding-based search can generalize better than word overlap techniques used in classical keyword search, because it captures the semantic meaning of text and is less sensitive to exact phrases or words. We evaluate the text search model’s performance on the BEIR (Thakur, et al. 2021) search evaluation suite and obtain better search performance than previous methods. Our text search guide provides more details on using embeddings for search tasks.
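
As a concrete illustration, the search pattern can be sketched in a few lines against the /embeddings endpoint; the model names follow the table above, and the documents and query are made up for the example:

# Hedged sketch: embed documents with a -doc- model, embed the query with the
# matching -query- model, and rank documents by dot-product similarity.
import openai, numpy as np

docs = ["How to rotate API keys", "Debugging slow Elasticsearch queries"]
doc_resp = openai.Embedding.create(
    input=docs, engine="text-search-curie-doc-001")
doc_vecs = np.array([d["embedding"] for d in doc_resp["data"]])

query_resp = openai.Embedding.create(
    input="why is my search latency high",
    engine="text-search-curie-query-001")
query_vec = np.array(query_resp["data"][0]["embedding"])

scores = doc_vecs @ query_vec               # dot product as similarity score
print(docs[int(np.argmax(scores))])         # best-matching document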

Average accuracy over 11 search tasks in BEIR:

  • Previous SOTA (Izacard et al., 2021): 50.2%
  • text-search-davinci-{doc, query}-001: 52.8%
  • text-search-curie-{doc, query}-001: 50.9%
  • text-search-babbage-{doc, query}-001: 50.4%
  • text-search-ada-{doc, query}-001: 49.0%

Code Search Models

Code search models provide code and text embeddings for code search tasks. Given a collection of code blocks, the task is to find the relevant code block for a natural language query. We evaluate the code search models on the CodeSearchNet (Husain et al., 2019) evaluation suite where our embeddings achieve significantly better results than prior methods. Check out the code search guide to use embeddings for code search.

Average accuracy over 6 programming languages:

  • Previous SOTA (Guo et al., 2021): 77.4%
  • code-search-babbage-{doc, query}-001: 93.5%
  • code-search-ada-{doc, query}-001: 93.4%

Examples of the Embeddings API in Action

JetBrains Research

JetBrains Research’s Astroparticle Physics Lab analyzes data like The Astronomer’s Telegram and NASA’s GCN Circulars, which are reports that contain astronomical events that can’t be parsed by traditional algorithms.

Powered by OpenAI’s embeddings of these astronomical reports, researchers are now able to search for events like “crab pulsar bursts” across multiple databases and publications. Embeddings also achieved 99.85% accuracy on data source classification through k-means clustering.
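
The clustering step itself is ordinary k-means over the embedding vectors. The following is a minimal sketch of that idea (not JetBrains’ code, with random placeholder vectors standing in for the report embeddings):

# Hypothetical sketch: cluster report embeddings into two groups, e.g., to
# separate two data sources. Real vectors would come from the /embeddings endpoint.
import numpy as np
from sklearn.cluster import KMeans

report_embeddings = np.random.rand(200, 2048)          # placeholder embedding vectors
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(report_embeddings)
print(np.bincount(labels))                              # cluster sizes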

FineTune Learning

FineTune Learning is a company building hybrid human-AI solutions for learning, like adaptive learning loops that help students reach academic standards.

OpenAI’s embeddings significantly improved the task of finding textbook content based on learning objectives. Achieving a top-5 accuracy of 89.1%, OpenAI’s text-search-curie embeddings model outperformed previous approaches like Sentence-BERT (64.5%). While human experts are still better, the FineTune team is now able to label entire textbooks in a matter of seconds, in contrast to the hours that it took the experts.

Comparison of our embeddings with Sentence-BERT, GPT-3 search and human subject-matter experts for matching textbook content with learning objectives. We report accuracy@k, the number of times the correct answer is within the top-k predictions.

Fabius

Fabius helps companies turn customer conversations into structured insights that inform planning and prioritization. OpenAI’s embeddings allow companies to more easily find and tag customer call transcripts with feature requests.

For instance, customers might use words like “automated” or “easy to use” to ask for a better self-service platform. Previously, Fabius was using fuzzy keyword search to attempt to tag those transcripts with the self-service platform label. With OpenAI’s embeddings, they’re now able to find 2x more examples in general, and 6x–10x more examples for features with abstract use cases that don’t have a clear keyword customers might use.

All API customers can get started with the embeddings documentation for using embeddings in their applications.


Acknowledgments

Thanks to the following for their contributions to this release:

Tao Xu, Chris Hallacy, Raul Puri, Alec Radford, Jesse Michael Han, Jerry Tworek, Qiming Yuan, Nikolas Tezak, Jong Wook Kim, Johannes Heidecke, Pranav Shyam, Tyna Eloundou Nekoul, Girish Sastry, Gretchen Krueger, David Schnurr, Felipe Petroski Such, Kenny Hsu, Madeleine Thompson, Tabarak Khan, and Toki Sherbakov.

Thanks to the following for their feedback on this post: Tom Kleinpeter, Morgan Gallant, Sam Altman, Ilya Sutskever, Steve Dowling, Rachel Lim, Arun Vijayvergiya, Rajeev Nayak, Peter Welinder, Justin Jay Wang.



Animator Lets 3D Characters Get Their Groove on With NVIDIA Omniverse and Reallusion

Editor’s note: This post is a part of our Meet the Omnivore series, which features individual creators and developers who use NVIDIA Omniverse to boost their artistic or engineering processes.

Benny Dee

Benjamin Sokomba Dazhi, aka Benny Dee, has learned the ins and outs of the entertainment industry from many angles — first as a rapper, then as a music video director and now as a full-time animator.

After eight years of self-teaching, Dazhi has mastered the art of animation — landing roles as head animator for the film The Legend of Oronpoto, and as creator and director of the Cartoon Network Africa Dance Challenge, a series of dance-along animations that teaches children African-inspired choreography.

Based in north-central Nigeria, Dazhi is building a team for his indie animation studio, JUST ART, which creates animation films focused on action, sci-fi, horror and humor.

Dazhi uses NVIDIA Omniverse — a physically accurate 3D design collaboration platform available with RTX-powered GPUs and part of the NVIDIA Studio suite of tools for creators — with Reallusion’s iClone and Character Creator to supercharge his artistic workflow.

He uses Omniverse Connectors for Reallusion apps for character and prop creation and animation, set dressing and cinematics.

Music, Movies and Masterful Rendering

From animated music videos to clips for action films, Dazhi has a multitude of projects — and accompanying deadlines.

“The main challenges I faced when trying to meet deadlines were long render times and difficulties with software compatibility, but using an Omniverse Connector for Reallusion’s iClone app has been game-changing for my workflow,” he said.

Using Omniverse, Dazhi accomplishes lighting and materials setup, rendering, simulation and post-production processes.

With these tools, it took Dazhi just four minutes to render this clip of a flying car — a task, he said, that would have otherwise taken hours.

“The rendering speed and photorealistic output quality of Omniverse is a breakthrough — and Omniverse apps like Create and Machinima are very user-friendly,” he said.

Such 3D graphics tools are especially important for the development of indie artists, Dazhi added.

“In Nigeria, there are very few animation studios, but we are beginning to grow in number thanks to easy-to-use tools like Reallusion’s iClone, which is the main animation software I use,” he said.

Dazhi plans to soon expand his studio, working with other indie artists via Omniverse’s real-time collaboration feature. Through his films, he hopes to show viewers “that it’s more than possible to make high-end content as an indie artist or small company.”

See Dazhi’s work in the NVIDIA Omniverse Gallery, and hear more about his creative workflow live during a Twitch stream on Jan. 26 at 11 a.m. Pacific.

Creators can download NVIDIA Omniverse for free and get started with step-by-step tutorials on the Omniverse YouTube channel. For additional resources and inspiration, follow Omniverse on Instagram, Twitter and Medium. To chat with the community, check out the Omniverse forums and join our Discord Server.
