A new YouTube show: TensorFlow.js Community Show & Tell

Posted by Jason Mayes, Developer Relations Engineer for TensorFlow.js

The TensorFlow YouTube channel has a new show called “TensorFlow.js Community Show & Tell.” In this program, we highlight amazing tech demos from the TensorFlow.js community every quarter. Our next show will be on 11th December at 9AM PT over on the TensorFlow YouTube channel; if you missed the previous ones, you can find past episodes on this playlist.

About the show

Do you love great tech demos that push the boundaries of what is possible for a given industry? If that sounds like you and you’re looking for fresh inspiration, along with insights from the engineers themselves, then this may be the YouTube show for you.

After hacking with many wonderful folks in the TensorFlow.js community, it became clear to us that the creativity and the work you were producing were simply incredible. For that reason, we have put together a brand new format, the TensorFlow.js Show & Tell, to showcase the top projects being made and give developers a platform to share their work. With many subscribers who are as passionate about machine learning as we are, we figured this was a great way to do that and to connect great minds.

If you missed our latest show, you can catch the most recent broadcast here:

We have seen a whole bunch of amazing demos across the first three shows, all using machine learning in truly novel ways. From making music out of thin air with your hands and sending it over the web to a Tesla coil (yes, that actually happened), to combining machine learning models with mind-blowing mixed reality and 3D graphics, we have had a great variety of presenters from all around the world.

Who has presented so far?

We’ve had 25 presenters, and there are more in the making!

1st Show: Rogerio Chaves (Netherlands), Max Bittker (USA), Junya Ishihara (Japan), Ben Farrell (USA), Lizzie Siegle (USA), Manish Raj (India), Yiwen Lin (UK), Jimmy (Thailand)

2nd Show: Cyril Diagne (France), John Cohn (USA), Va Barbosa (USA), FollowTheDarkside (Japan), Jaume Sanchez (UK), Olesya Chernyavskaya (Russia), Alexandre Devaux (France), Amruta (India), Shan Huan (China).

3rd Show: Gant Laborde (USA), Hugo Zanini (Brazil), Charlie Gerard (Netherlands), Shivay Lamba (India), Anders Jessen (Denmark), Benson Ruan (Australia), Cristina Maillo (Spain), James Seo (USA).

How can I get on the show?

If you have made something using TensorFlow.js, simply use the #MadeWithTFJS hashtag on Twitter or LinkedIn with a post demonstrating your creation and a link to try it out. We select our top picks each quarter to be featured in the show and reach out to you directly if you are selected. You can view the existing submissions on Twitter as an example.

I missed an episode – where can I view them?

Catch up on previous live episodes via this playlist. If you prefer shorter, bite-sized videos you can watch over a coffee break, the Made With TensorFlow.js playlist lets you watch just the ones that are relevant to you, on demand.

When is the next show?

Join us in December for the next show and tell – we have 6 wonderful new demos lined up. We’re aiming for 11th December, 9AM PT, over on the TensorFlow YouTube channel.

Be sure to subscribe and click the bell icon to set notifications for when new videos are posted (we are aiming for every quarter), or add a Google calendar reminder and see you there!

Also, if you enjoyed the show and would like to see this format replicated for TensorFlow Core, TensorFlow Lite, and more, let us know! You can drop me a message on Twitter or LinkedIn – we would love to see what you have made.

Acknowledgements

A huge thank you to all 25 of our show & tell presenters thus far, and to the amazing community who have submitted projects or helped tag amazing finds for us to reach out to. You truly make this show what it is; we are super excited to see even more in the future and look forward to seeing you all again soon.

Bringing Enterprise Medical Imaging to Life: RSNA Highlights What’s Next for Radiology

As the healthcare world battles the pandemic, the medical-imaging field is gaining ground with AI, forging new partnerships and funding startup innovation. It will all be on display at RSNA, the Radiological Society of North America’s annual meeting, taking place Nov. 29 – Dec. 5.

Radiologists, healthcare organizations, developers and instrument makers at RSNA will share their latest advancements and what’s coming next — with an eye on the growing ability of AI models to integrate with medical-imaging workflows. More than half of informatics abstracts submitted to this year’s virtual conference involve AI.

In a special public address at RSNA, Kimberly Powell, NVIDIA’s VP of healthcare, will discuss how we’re working with research institutions, the healthcare industry and AI startups to bring workflow acceleration, deep learning models and deployment platforms to the medical imaging ecosystem.

Healthcare and AI experts worldwide are putting monumental effort into developing models that can help radiologists determine the severity of COVID cases from lung scans. They’re also building platforms to smoothly integrate AI into daily workflows, and developing federated learning techniques that help hospitals work together on more robust AI models.

The NVIDIA Clara Imaging application framework is poised to advance this work with NVIDIA GPUs and AI models that can accelerate each step of the radiology workflow, including image acquisition, scan annotation, triage and reporting.

Delivering Tools to Radiologists, Developers, Hospitals

AI developers are working to bridge the gap between their models and the systems radiologists already use, with the goal of creating seamless integration of deep learning insights into tools like PACS digital archiving systems. Here’s how NVIDIA is supporting their work:

  • We’ve strengthened the NVIDIA Clara application framework’s full-stack GPU-accelerated libraries and SDKs for imaging, with new pretrained models available on the NGC software hub. NVIDIA and the U.S. National Institutes of Health jointly developed AI models that can help researchers classify COVID cases from chest CT scans, and evaluate the severity of these cases.
  • Using the NVIDIA Clara Deploy SDK, Mass General Brigham researchers are testing a risk assessment model that analyzes chest X-rays to determine the severity of lung disease. The tool was developed by the Athinoula A. Martinos Center for Biomedical Imaging, which has adopted NVIDIA DGX A100 systems to power its research.
  • Together with King’s College London, we introduced MONAI earlier this year, an open-source AI framework for medical imaging. Based on the Ignite and PyTorch deep learning frameworks, the modular MONAI code can be easily ported to researchers’ existing AI pipelines. So far, the GitHub project has dozens of contributors and over 1,500 stars.
  • NVIDIA Clara Federated Learning enables researchers to collaborate on training robust AI models without sharing patient information. It’s been used by hospitals and academic medical centers to train models for mammogram assessment, and to assess the likelihood that patients with COVID-19 symptoms will need supplemental oxygen.

NVIDIA at RSNA

RSNA attendees can check out NVIDIA’s digital booth to discover more about GPU-accelerated AI in medical imaging. Hands-on training courses from the NVIDIA Deep Learning Institute are also available, covering medical imaging topics including image classification, coarse-to-fine contextual memory and data augmentation with generative networks. Several events at the show also feature NVIDIA speakers.

Over 50 members of NVIDIA Inception — our accelerator program for AI startups — will be exhibiting at RSNA, including Subtle Medical, which developed the first AI tools for medical imaging enhancement to receive FDA clearance and this week announced $12 million in Series A funding.

Another Inception member, TrainingData.io, used the NVIDIA Clara SDK to train a segmentation AI model that analyzes COVID-19 disease progression in chest CT scans. And South Korean startup Lunit recently received the European CE mark and partnered with GE Healthcare on an AI tool that flags abnormalities on chest X-rays for radiologists’ review.

Visit the NVIDIA at RSNA webpage for a full list of activities at the show. Email to request a meeting with our deep learning experts.

Subscribe to NVIDIA healthcare news here.

Learning State Abstractions for Long-Horizon Planning

Many tasks that we do on a regular basis, such as navigating a city, cooking a meal, or loading a dishwasher, require planning over extended periods of time. Accomplishing these tasks may seem simple to us; however, reasoning over long time horizons remains a major challenge for today’s Reinforcement Learning (RL) algorithms. While unable to plan over long horizons, deep RL algorithms excel at learning policies for short horizon tasks, such as robotic grasping, directly from pixels. At the same time, classical planning methods such as Dijkstra’s algorithm and A$^*$ search can plan over long time horizons, but they require hand-specified or task-specific abstract representations of the environment as input.

To achieve the best of both worlds, state-of-the-art visual navigation methods have applied classical search methods to learned graphs. In particular, SPTM [2] and SoRB [3] use a replay buffer of observations as nodes in a graph and learn a parametric distance function to draw edges in the graph. These methods have been successfully applied to long-horizon simulated navigation tasks that were too challenging for previous methods to solve.

Automating business processes with machine learning in the COVID-19 pandemic

COVID-19 has changed our world significantly. All of this change has been almost instantaneous, forcing companies to pivot quickly and find new ways to operate. Automation is playing an increasingly important role to help companies adjust. The ability to automate business processes with machine learning (ML) is unlocking new efficiencies and allowing companies to move faster where they might have otherwise been stuck using antiquated systems. What might have previously taken an organization years is now happening in weeks. In this post, we discuss how AWS customers are applying ML in areas such as document processing and forecasting to quickly respond to the challenges at hand.

Automating document processing

The ability to automate document processing remotely has proven essential as companies face new challenges in this pandemic. Demand for services like loan processing and grocery delivery has spiked in areas that no one could have predicted and the ability to quickly respond to those demands remains vital.

In April 2020, the US federal government announced the Paycheck Protection Program (PPP) to provide small businesses with funds to cover up to 8 weeks of payroll, mortgage, rent, and utility expenses. With phenomenal demand and over $349 billion allocated in just the first 13 days of the program, small business owners were scrambling to qualify.

BlueVine, a fintech company that provides small business banking, used their technology and engineering expertise to help process billions in loans. They chose Amazon Textract, a fully managed ML service that automatically extracts text and data from documents, to help automate the loan application process. In just a few days, they were up and running, analyzing tens of thousands of pages with high accuracy. In just 4 months, they were able to serve more than 155,000 small businesses with over $4.5 billion in loans. They delivered services to those who needed it most, with 68% of loans going to customers with fewer than 10 employees and 90% of loans under $60,000—serving small businesses struggling to remain afloat. BlueVine worked closely with DoorDash as their strategic partner to serve many stressed small independent restaurants, and simplify and accelerate the loan process. BlueVine used ML to automate loan application processing and scale quickly to meet the unprecedented demand. The company estimates they helped save 470,000 jobs as a result of their efforts.

Other areas of the economy were also experiencing unprecedented demand and needed to staff up quickly. However, it was a challenge to process new-hire employment paperwork at the rate required. A typical PDF form has about 50 form fields; to recreate it as a digital form, the customer had to drag and drop data to the right location on each form—a particularly time-consuming task. Enter HelloSign, a Dropbox company that automates the signature process. HelloWorks is a HelloSign product that turns PDFs into mobile-friendly forms. It uses Amazon Textract to automate document processing and save customers hundreds of hours. A popular on-demand grocery delivery service was able to onboard millions of shoppers using HelloWorks in a few weeks. HelloWorks helped the company scale their onboarding paperwork quickly by automating document processing with ML. A New York-based urgent care center started using HelloSign to register new patients, and an ambulance service started using HelloWorks to send out COVID-19 test applications. What’s more, this was all happening online. As organizations continue to limit in-person interactions, demand for HelloWorks surged, with users creating 3x more forms than before. With Amazon Textract, HelloWorks was able to automate the process and automatically create all of the fields and components, saving customers time and keeping them safe.
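
To make that concrete, here is a minimal, hypothetical sketch of the kind of Amazon Textract call that powers this sort of form automation; the bucket and document names are made up, and the real HelloWorks integration is more involved:

import boto3

textract = boto3.client('textract')

# Analyze a single form image stored in S3 and ask Textract for form key/value pairs
response = textract.analyze_document(
    Document={'S3Object': {'Bucket': 'my-onboarding-docs', 'Name': 'new-hire-form.png'}},
    FeatureTypes=['FORMS'],
)

# Detected form fields come back as KEY_VALUE_SET blocks linked to the words they contain
form_blocks = [block for block in response['Blocks'] if block['BlockType'] == 'KEY_VALUE_SET']
print(f"Detected {len(form_blocks)} form key/value blocks")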

Forecasting the pandemic

Forecasting is a growing challenge as supply chains and demand have been disrupted on a global scale. Amazon Forecast, a fully managed service that uses ML to deliver highly accurate forecasts, is helping customers forecast everything from product demand to financial performance. Many forecasting tools only look at a historical series of data to predict the future, on the assumption that the future is determined by the past. When faced with irregular trends, this approach falters, as shown by the difficulty companies have had since the beginning of the COVID-19 pandemic in developing models that accurately capture the complexities of the real world. With Amazon Forecast, you can integrate irregular trends and a multitude of other variables—on top of your historical series of data—to deliver the most accurate forecast possible, with no ML experience required.

One of the largest challenges when it comes to forecasting has been understanding the projection of the disease itself. How quickly will it spread? When will it spike next? How many hospital beds will be needed to accommodate that spike? Forecasting models have the potential to assess disease trends and the course of the COVID-19 pandemic. However, the nature of the COVID-19 time-series makes forecasting extremely challenging, given the variations we’ve observed in disease spread across multiple communities and populations. COVID-19 remains a relatively unknown disease with no historical data to predict trends, such as seasonality and vulnerable sections of the population.

To better understand and forecast the disease, Rackspace Technology, University of California Irvine, Scientific Systems, and Plan4Co have come together to introduce a new COVID-19 forecasting model to deliver greater accuracy using Amazon Forecast. The team of medical, academic, data science modeling, and forecasting experts worked together to use Amazon Forecast DeepAR+ to incorporate related time-series to build more powerful forecasting models. Their model used deep learning to learn patterns between related time-series, like mobility data, and the target time-series. As a result, the model outperformed other approaches, such as those provided by the well-known IHME model.
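
As a rough illustration of what training such a predictor involves (this is not the team's actual configuration; the predictor name, horizon, and dataset group ARN below are placeholders), a DeepAR+ predictor can be created against a dataset group that already contains the target and related time series:

import boto3

forecast = boto3.client('forecast')

# Train a DeepAR+ predictor; related time series (for example, mobility data) that were
# imported into the same dataset group are picked up automatically during training.
response = forecast.create_predictor(
    PredictorName='covid_cases_deepar_plus',
    AlgorithmArn='arn:aws:forecast:::algorithm/Deep_AR_Plus',
    ForecastHorizon=14,                                       # forecast 14 days ahead
    PerformAutoML=False,
    InputDataConfig={'DatasetGroupArn': dataset_group_arn},   # placeholder ARN
    FeaturizationConfig={'ForecastFrequency': 'D'},
)
print(response['PredictorArn'])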

With Amazon Forecast, the team was able to preprocess the time-series, train dozens of models quickly, compare model performance, and quantify the best forecasts. These forecasts can be developed on a daily and weekly basis, now available for countries, states, counties, and zip-codes. This information can, for example, help forecast what new cases will be in the short-term and long-term by learning from real-world data, like time to peak and rate of transmission. This information is critical because government agencies frequently use the occurrence of new cases in a population over a specified period of time to help determine when it’s safe to re-open sectors of the economy.

Conclusion

As the pandemic continues, new challenges will inevitably arise. When time was of the essence, these organizations looked to ML technology and automation to serve their customers’ needs and find new ways to operate. This use of new technology will not only help them respond to the pandemic today, but also set them up to thrive in the future.

To learn about other ways AWS is working toward solutions in the COVID-19 pandemic check out Introducing the COVID-19 Simulator and Machine Learning Toolkit for Predicting COVID-19 Spread and Intelligently connect to customers using machine learning in the COVID-19 pandemic.

 


About the Author

Taha A. Kass-Hout, MD, MS, is director of machine learning and chief medical officer at Amazon Web Services (AWS). Taha received his medical training at Beth Israel Deaconess Medical Center, Harvard Medical School, and during his time there, was part of the BOAT clinical trial. He holds a doctor of medicine and master’s of science (bioinformatics) from the University of Texas Health Science Center at Houston.

Science Magnified: Gordon Bell Winners Combine HPC, AI

Seven finalists, including both winners of the 2020 Gordon Bell awards, used supercomputers to see atoms, stars and more with greater clarity — all accelerated with NVIDIA technologies.

Their efforts required the traditional number crunching of high performance computing, the latest data science in graph analytics, AI techniques like deep learning or combinations of all of the above.

The Gordon Bell Prize is regarded as the Nobel Prize of the supercomputing community, attracting some of the most ambitious efforts of researchers worldwide.

AI Helps Scale Simulation 1,000x

Winners of the traditional Gordon Bell award collaborated across universities in Beijing, Berkeley and Princeton as well as Lawrence Berkeley National Laboratory (Berkeley Lab). They used a combination of HPC and neural networks they called DeePMD-kit to create complex simulations in molecular dynamics, 1,000x faster than previous work while maintaining accuracy.

In one day on the Summit supercomputer at Oak Ridge National Laboratory, they modeled 2.5 nanoseconds in the life of 127.4 million atoms, 100x more than the prior efforts.

Their work aids understanding of complex materials and of fields that make heavy use of molecular modeling, like drug discovery. In addition, it demonstrated the power of combining machine learning with physics-based modeling and simulation on future supercomputers.

Atomic-Scale HPC May Spawn New Materials 

Among the finalists, a team including members from Berkeley Lab and Stanford optimized the BerkeleyGW application to bust through the complex math needed to calculate atomic forces binding more than 1,000 atoms with 10,986 electrons, about 10x more than prior efforts.

“The idea of working on a system with tens of thousands of electrons was unheard of just 5-10 years ago,” said Jack Deslippe, a principal investigator on the project and the application performance lead at the U.S. National Energy Research Scientific Computing Center.

Their work could pave a way to new materials for better batteries, solar cells and energy harvesters as well as faster semiconductors and quantum computers.

The team used all 27,648 GPUs on the Summit supercomputer to get results in just 10 minutes, thanks to harnessing an estimated 105.9 petaflops of double-precision performance.

Developers are continuing the work, optimizing their code for Perlmutter, a next-generation system using NVIDIA A100 Tensor Core GPUs that sport hardware to accelerate 64-bit floating-point jobs.

Analytics Sifts Text to Fight COVID

Using a form of data mining called graph analytics, a team from Oak Ridge and Georgia Institute of Technology found a way to search for deep connections in medical literature using a dataset they created with 213 million relationships among 18.5 million concepts and papers.

Their DSNAPSHOT (Distributed Accelerated Semiring All-Pairs Shortest Path) algorithm, using the team’s customized CUDA code, ran on 24,576 V100 GPUs on Summit, delivering results on a graph with 4.43 million vertices in 21.3 minutes. They claimed a record for deep search in a biomedical database and showed the way for others.

Graph analytics finds deep patterns in biomedical literature related to COVID-19.

“Looking forward, we believe this novel capability will enable the mining of scholarly knowledge … (and could be used in) natural language processing workflows at scale,” Ramakrishnan Kannan, team lead for computational AI and machine learning at Oak Ridge, said in an article on the lab’s site.

Tuning in to the Stars

Another team pointed the Summit supercomputer at the stars in preparation for one of the biggest big-data projects ever tackled. They created a workflow that handled six hours of simulated output from the Square Kilometer Array (SKA), a network of thousands of radio telescopes expected to come online later this decade.

Researchers from Australia, China and the U.S. analyzed 2.6 petabytes of data on Summit to provide a proof of concept for one of SKA’s key use cases. In the process they revealed critical design factors for future radio telescopes and the supercomputers that study their output.

The team’s work generated 247 GB/s of data and spawned 925 GB/s in I/O. Like many other finalists, they relied on the fast, low-latency InfiniBand links powered by NVIDIA Mellanox networking, widely used in supercomputers like Summit to speed data among thousands of computing nodes.

Simulating the Coronavirus with HPC+AI

The four teams stand beside three other finalists who used NVIDIA technologies in a competition for a special Gordon Bell Prize for COVID-19.

The winner of that award used all the GPUs on Summit to create the largest, longest and most accurate simulation of a coronavirus to date.

“It was a total game changer for seeing the subtle protein motions that are often the important ones, that’s why we started to run all our simulations on GPUs,” said Lilian Chong, an associate professor of chemistry at the University of Pittsburgh, one of 27 researchers on the team.

“It’s no exaggeration to say what took us literally five years to do with the flu virus, we are now able to do in a few months,” said Rommie Amaro, a researcher at the University of California at San Diego who led the AI-assisted simulation.

Helping small businesses deliver personalized experiences with the Amazon Personalize extension for Magento

This is a guest post by Jeff Finkelstein, founder of Customer Paradigm, a full-service interactive media firm and Magento solutions partner.

Many small retailers use Magento, an open-source ecommerce platform, to create websites or mobile applications to sell their products online. Personalization is key to creating high-quality ecommerce experiences, but small businesses often lack access to resources required to implement a scalable, sophisticated personalization solution—especially one that is powered by machine learning (ML). An ML-based solution results in better end-user engagement, conversion, and increased sales compared to traditional rudimentary techniques like static rules. With the Amazon Personalize extension for Magento, we’re now making the same ML personalization technology used by Amazon.com accessible to small businesses that use Magento.

This is the case with Hoopologie, a small hula hoop supply company based in Boulder, Colorado. Founder Melina Rider started the business in 2013 and has an extensive product line of over 1,000 SKUs. For Hoopologie, like many other small ecommerce merchants, creating sophisticated personalized experiences has been out of reach. Hiring ML experts to implement an ML solution isn’t affordable, and using a rules-based system requires a lot of manual maintenance, limiting scale and performance.

“Creating personalized recommendations for every type of user on the site would take us hours and hours every month. It’s not affordable, and I’d rather have our limited staff help our customers in other ways,” Rider says.

Hoopologie uses Magento to power their website, and has seen an increase of 40.5% in sales and an average order value increase of just over $50 per order by using the Amazon Personalize extension for Magento. In this post, we show you how to implement the Amazon Personalize extension for Magento to start creating ML-powered personalized experiences for your customers to improve engagement, conversion, and revenue.

Amazon Personalize for Magento

Amazon Personalize is a fully managed ML service that leverages over 20 years of experience at Amazon.com to help you deliver personalized experiences faster. The Amazon Personalize extension allows Magento merchants to take advantage of the benefits of Amazon Personalize and use its algorithms to power a Magento store’s product recommendations. All data and ML models are stored privately and securely in the merchant’s AWS account.

It’s easy to install the extension on your site, create an AWS account, and authorize the extension to access your AWS account. For instructions, see Amazon Personalize for Magento 2: Installation & Configuration Instructions. After you complete those steps, you see the Amazon Personalize extension in your Magento admin area, under Stores, Configuration.

Entering configuration details

For instructions on configuring the extension, watch the video clip Amazon Personalize for Magento 2: Extension Configuration on YouTube.

On the configuration page, you can provide all the values that tie your Magento site into Amazon Personalize in your AWS account: 

  1. For License Key, enter the license key that you received for the extension.

If you installed the extension but don’t have a license key, you can start a 15-day free trial.

If your license is active, the License Active field shows as yes.

  2. If the license isn’t activated, add a valid access key.
  3. For Module Enabled, you can enable or disable the module from the admin area.

This is helpful if you need to troubleshoot a site, or if you have a copy of the site on a test server.

When your Amazon Personalize campaign (an ML-powered recommender trained on your data) is active in Amazon Personalize, Campaign Active shows as yes.

The system automatically uploads and ingests the data and trains the model. This is a read-only display; it isn’t something you can change to turn the campaign on or off.

  4. For File owner home directory, enter a file directory outside your web root (such as ../../keys).

Amazon’s security requirements mandate that your AWS credentials not be stored in the Magento database. Instead, your keys need to be stored in a directory outside the web root, usually up a level or two from where your Magento site is stored in your file system.

  5. For AWS Region, enter the Region where you want the extension to access Amazon Personalize.

Because Amazon Personalize may not be available in all Regions, be sure to enter a Region code where Amazon Personalize is available and that is located geographically closest to your Magento server’s physical location.

  6. For AWS Account Number, enter your AWS account number.
  7. For Access Key, enter the access key ID for the user that you created in the authorization part of the setup.
  8. For Secret Key, enter the secret access key for the user that you created in the authorization part of the setup.

 

For instructions on finding your AWS account number, access key, and secret key, watch the video clip Amazon Personalize for Magento 2: Creating IAM User in AWS on YouTube.

  9. Choose Save Config.

This saves the access information and writes the access key and secret key to a file outside the web root.

Starting training

To begin the training process, choose Start Process.

Make sure that your Magento cron is running. If it’s not running, it’s highly likely that the process won’t move from one step to the next.

The process includes the following high-level steps:

  • Exporting historical data from your Magento site into CSV files.
  • Creating a private Amazon Simple Storage Service (Amazon S3) bucket in your AWS account to stage CSV files.
  • Uploading the CSV files to the S3 bucket in your AWS account.
  • Instructing Amazon Personalize to create a custom solution and campaign based on your data (roughly the API calls sketched after this list). The result is a private ML model hosted in your AWS account.
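
For readers curious about what that last step looks like at the API level, the extension's solution-and-campaign creation corresponds roughly to Amazon Personalize calls like the following; the names, recipe, and dataset group ARN are illustrative, and the extension handles all of this for you:

import boto3

personalize = boto3.client('personalize')

# Create a solution (model configuration) in the dataset group the extension populated
solution = personalize.create_solution(
    name='magento-store-recommendations',
    datasetGroupArn=dataset_group_arn,  # placeholder ARN
    recipeArn='arn:aws:personalize:::recipe/aws-user-personalization',
)

# Train a solution version (the actual model artifact)
solution_version = personalize.create_solution_version(solutionArn=solution['solutionArn'])

# Deploy the trained model behind a campaign that serves real-time recommendations
campaign = personalize.create_campaign(
    name='magento-store-campaign',
    solutionVersionArn=solution_version['solutionVersionArn'],
    minProvisionedTPS=1,
)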

The following screenshot shows the progress tracker of the individual steps.

You can restart the process at any time by choosing Reset Process. This starts the process over from the beginning and may incur additional data upload and training costs.

A/B split testing

To evaluate the effectiveness of the Amazon Personalize campaign created on your Magento store, we built in an A/B split testing system. A/B testing allows you to expose two subsets of your users to two variations of a user experience on your site and measure which variation results in the highest conversion rate. The control experience is default Magento; the test experience uses personalized recommendations from your Amazon Personalize campaign.

  1. For the system to be active, make sure that Enabled is set to Yes.

If Enabled is set to No, Magento doesn’t use the extension.

  2. For Set Percentage, choose from the following A/B split test options:
    1. 100% Control / 0% test (Amazon Personalize isn’t used at all)
    2. 75% Control / 25% test (Amazon Personalize is used 25% of the time)
    3. 50% Control / 50% test (Amazon Personalize is used 50% of the time)
    4. 25% Control / 75% test (Amazon Personalize is used 75% of the time)
    5. 10% Control / 90% test (Amazon Personalize is used 90% of the time)
    6. 0% Control / 100% test (Amazon Personalize is used 100% of the time)

  3. Choose Save Config to save the settings.

Next you allow Amazon Personalize to train. Depending on the amount of historical data in your Magento system, this process can take a few hours to complete. You can revisit this page to see the progress of the training.

After you complete this process the first time, you can retrain new versions of the model while continuing to use the active campaign to provide product recommendations. To contain costs and create the most relevant dataset for your site, we’ve limited historical data to the previous 6 months. (Interaction data older than 6 months provides less value in making relevant product recommendations.)

After the Amazon Personalize system is enabled, the system automatically does the following:

  • Displays personalized product recommendations to your end-users.
  • Adds a “We also suggest” list of recommended products to your product page.
  • Automatically adds a Google Analytics tag (if your site uses Google Analytics) for any order that was placed on the site when Amazon Personalize was active. This uses the existing Google Analytics system. If you’re not using Google Analytics, this step is omitted.
  • Adds a field to the Magento database that indicates if an order was placed when Amazon Personalize was active.
  • Adds real-time data interaction indicators, allowing Amazon Personalize to learn from users when they add a product to their cart, wishlist, or complete a purchase.

To add Amazon Personalize to additional pages, choose Content, Pages, Select a Page.

For help troubleshooting, see Amazon Personalize Extension for Magento 2.

Conclusion

In just a few easy steps, you can add the Amazon Personalize extension for Magento to an ecommerce site. Hoopologie started testing the Amazon Personalize extension in January 2020. They began by using an A/B split test to show the personalized recommendations to 50% of their users. After 6 weeks, users seeing personalized recommendations spent $50.02 more per transaction, and overall revenue from users in the test was up by 42%. Hoopologie continues to use the Amazon Personalize extension to create personalized experiences for their customers. They intend to expand usage by continuously evaluating the system, retraining as new products are added, and adding personalization to additional areas of the site.

Advanced personalization is no longer beyond the reach of many ecommerce merchants using Magento. The results from Hoopologie are a testament to the positive impact of implementing a scalable, sophisticated personalization solution powered by ML. Learn more about the Amazon Personalize extension for Magento from Customer Paradigm and enjoy a complimentary 15-day free trial.

 


About the Author

Jeff Finkelstein is the founder of Customer Paradigm based in Boulder, Colorado. Founded in 2002, their team has completed more than 12,600 web development and marketing projects for ecommerce and other clients throughout the world. The Customer Paradigm team previously built an integration between Magento and Amazon Marketplace, which was incorporated into the core Magento framework in 2017. More information can be found at https://www.CustomerParadigm.com.

COVID-19 Spurs Scientific Revolution in Drug Discovery with AI

Research across global academic and commercial labs to create a more efficient drug discovery process won recognition today with a special Gordon Bell Prize for work fighting COVID-19.

A team of 27 researchers led by Rommie Amaro at the University of California at San Diego (UCSD) combined high performance computing (HPC) and AI to provide the clearest view to date of the coronavirus, winning the award.

Their work began in late March when Amaro lit up Twitter with a picture of part of a simulated SARS-CoV-2 virus that looked like an upside-down Christmas tree.

Seeing it, one remote researcher noticed how a protein seemed to reach like a crooked finger from behind a protective shield to touch a healthy human cell.

“I said, ‘holy crap, that’s crazy’… only through sharing a simulation like this with the community could you see for the first time how the virus can only strike when it’s in an open position,” said Amaro, who leads a team of biochemists and computer experts at UCSD.

Amaro shared her early results on Twitter.

The image in the tweet was taken by Amaro’s lab using what some call a computational microscope, a digital tool that links the power of HPC simulations with AI to see details beyond the capabilities of conventional instruments.

It’s one example of work around the world using AI and data analytics, accelerated by NVIDIA Clara Discovery, to slash the $2 billion in costs and ten-year time span it typically takes to bring a new drug to market.

A Virtual Microscope Enhanced with AI

In early October, Amaro’s team completed a series of more ambitious HPC+AI simulations. They showed for the first time fine details of how the spike protein moved, opened and contacted a healthy cell.

One simulation (below) packed a whopping 305 million atoms, more than twice the size of any prior simulation in molecular dynamics. It required AI and all 27,648 NVIDIA GPUs on the Summit supercomputer at Oak Ridge National Laboratory.

More than 4,000 researchers worldwide have downloaded the results that one called “critical for vaccine design” for COVID and future pathogens.

Today, it won a special Gordon Bell Prize for COVID-19, the equivalent of a Nobel Prize in the supercomputing community.

Two other teams also used NVIDIA technologies in work selected as finalists in the COVID-19 competition created by the ACM, a professional group representing more than 100,000 computing experts worldwide.

And the traditional Gordon Bell Prize went to a team from Beijing, Berkeley and Princeton that set a new milestone in molecular dynamics, also using a combination of HPC+AI on Summit.

An AI Funnel Catches Promising Drugs

Seeing how the infection process works is one of a string of pearls that scientists around the world are gathering into a new AI-assisted drug discovery process.

Another is screening the right compounds to arrest a virus from a vast field of 10^68 candidates. In a paper from part of the team behind Amaro’s work, researchers described a new AI workflow that in less than five months filtered 4.2 billion compounds down to the 40 most promising ones that are now in advanced testing.

“We were so happy to get these results because people are dying and we need to address that with a new baseline that shows what you can get with AI,” said Arvind Ramanathan, a computational biologist at Argonne National Laboratory.

Ramanathan’s team was part of an international collaboration among eight universities and supercomputer centers, each contributing unique tools to process nearly 60 terabytes of data from 21 open datasets. It fueled a set of interlocking simulations and AI predictions that ran across 160 NVIDIA A100 Tensor Core GPUs on Argonne’s Theta system with massive AI inference runs using NVIDIA TensorRT on the many more GPUs on Summit.

Docking Compounds, Proteins on a Supercomputer

Earlier this year, Ada Sedova put a pearl on the string for protein docking (described in the video below) when she described plans to test a billion drug compounds against two coronavirus spike proteins in less than 24 hours using the GPUs on Summit. Her team’s work cut what used to take 51 days down to just 21 hours, a 58x speedup.

In a related effort, colleagues at Oak Ridge used NVIDIA RAPIDS and BlazingSQL to accelerate by an order of magnitude data analytics on results like Sedova produced.

Among the other Gordon Bell finalists, Lawrence Livermore researchers used GPUs on the Sierra supercomputer to slash the training time for an AI model used to speed drug discovery from a day to just 23 minutes.

From the Lab to the Clinic

The Gordon Bell finalists are among more than 90 research efforts in a supercomputing collaboration using 50,000 GPU cores to fight the coronavirus.

They make up one front in a global war on COVID that also includes companies such as Oxford Nanopore Technologies, a genomics specialist using NVIDIA’s CUDA software to accelerate its work.

Oxford Nanopore won approval from European regulators last month for a novel system the size of a desktop printer that can be used with minimal training to perform thousands of COVID tests in a single day. Scientists worldwide have used its handheld sequencing devices to understand the transmission of the virus.

Relay Therapeutics uses NVIDIA GPUs and software to simulate with machine learning how proteins move, opening up new directions in the drug discovery process. In September, it started its first human trial of a molecule inhibitor to treat cancer.

Startup Structura uses CUDA on NVIDIA GPUs to analyze initial images of pathogens to quickly determine their 3D atomic structure, another key step in drug discovery. It’s a member of the NVIDIA Inception program, which gives startups in AI access to the latest GPU-accelerated technologies and market partners.

From Clara Discovery to Cambridge-1

NVIDIA Clara Discovery delivers a framework with AI models, GPU-optimized code and applications to accelerate every stage in the drug discovery pipeline. It provides speedups of 6-30x across jobs in genomics, protein structure prediction, virtual screening, docking, molecular simulation, imaging and natural-language processing that are all part of the drug discovery process.

It’s NVIDIA’s latest contribution to fighting SARS-CoV-2 and future pathogens.

NVIDIA Clara Discovery speeds each step of the drug discovery process using AI and data analytics.

Within hours of the shelter-at-home order in the U.S., NVIDIA gave researchers free access to a test drive of Parabricks, our genomic sequencing software. Since then, we’ve provided as part of NVIDIA Clara open access to AI models co-developed with the U.S. National Institutes of Health.

We’ve also committed to building, with partners including GSK and AstraZeneca, Europe’s largest supercomputer dedicated to driving drug discovery forward. Cambridge-1 will be an NVIDIA DGX SuperPOD system capable of delivering more than 400 petaflops of AI performance.

Next Up: A Billion-Atom Simulation

The work is just getting started.

Ramanathan of Argonne sees a future where self-driving labs learn what experiments they should launch next, like autonomous vehicles finding their own way forward.

“And I want to scale to the absolute max of screening 10^68 drug compounds, but even covering half that will be significantly harder than what we’ve done so far,” he said.

“For me, simulating a virus with a billion atoms is the next peak, and we know we will get there in 2021,” said Amaro. “Longer term, we need to learn how to use AI even more effectively to deal with coronavirus mutations and other emerging pathogens that could be even worse,” she added.

Hear NVIDIA CEO Jensen Huang describe in the video below how AI in Clara Discovery is advancing drug discovery.

At top: An image of the SARS-CoV-2 virus based on the Amaro lab’s simulation showing 305 million atoms.

Detecting hidden but non-trivial problems in transfer learning models using Amazon SageMaker Debugger

Rapid development of deep learning technology has produced an abundance of open-sourced, pre-trained models in computer vision and natural language processing. As a result, transfer learning has become a popular approach in deep learning. Transfer learning is a machine learning technique where a model pre-trained on one task is fine-tuned on a new task. Given the significant compute and time resources required to develop neural network models, adapting a pre-trained model to new data is compelling in business applications. If you’re new to deep learning, transfer learning is also a good starting point because you don’t have to build a model from scratch. For deep learning beginners, one question you may have is, how do I systematically examine model predictions to see what mistakes were made now that my data is in the form of pictures or text?

In this post, we show you an end-to-end example of doing transfer learning by using Amazon SageMaker Debugger to detect hidden problems that may have serious consequences. Debugger doesn’t incur additional costs if you’re running training on Amazon SageMaker. Moreover, you can enable the built-in rules with just a few lines of code when you call the Amazon SageMaker estimator function. For our use case, we do transfer learning using a ResNet model to recognize German traffic signs [1].

In this post, we focus on issues that occur during training. For more information about using Debugger for inference and explainability, see Detecting and analyzing incorrect model predictions with Amazon SageMaker Model Monitor and Debugger.

Setting up a transfer learning training job

For our use case, we want to adapt a pre-trained computer vision model to recognize traffic signs in Germany. We use the GTSRB dataset [1] for this new task. You can find the notebook and training script in the GitHub repo.

Applying preprocessing on the dataset

We first apply some typical preprocessing for a ResNet model on our dataset (see the complete notebook for where to download the dataset). To improve model generalization, we apply data augmentation (RandomResizedCrop and RandomHorizontalFlip). These operations ensure that an image looks different in each epoch. Lastly, we normalize the data: because the model has been pre-trained on the ImageNet dataset, we apply the same preprocessing and normalization (subtract the mean and divide by the standard deviation of the ImageNet dataset). See the following code:

from torchvision import datasets, models, transforms

# Define pre-processing
train_transform =  transforms.Compose([
                                        transforms.RandomResizedCrop(224),
                                        transforms.RandomHorizontalFlip(),
                                        transforms.ToTensor(),
                                        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                                    ])

We use PyTorch’s ImageFolder function, which takes a local folder, loads all images located in its subdirectories, and encodes each directory name as a label. Next we specify the dataloader, which takes the dataset and batch size. We use the dataloader during training to provide new batches in each iteration. See the following code:

# Apply the pre-processing to the training dataset
dataset = datasets.ImageFolder(root='GTSRB/Training', transform=train_transform)
train_dataloader = torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True)

For the validation dataset, we don’t apply data augmentation and only resize images to the appropriate size:

# Apply the pre-processing to validation dataset
val_transform = transforms.Compose([
                                        transforms.Resize(256),
                                        transforms.CenterCrop(224),
                                        transforms.ToTensor(),
                                        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                                        ])

dataset_test = datasets.ImageFolder(root='GTSRB/Final_Test', transform=val_transform)
val_dataloader = torch.utils.data.DataLoader(dataset_test, batch_size=64, shuffle=False)

Loading a pre-trained ResNet model

Because we have a limited variety of traffic signs, we pick a simpler ResNet model for this task: resnet18. You can load a ResNet18 from the PyTorch model zoo with pre-trained weights using just one line of code:

#get pretrained ResNet model
model = models.resnet18(pretrained=True)

The model has been pre-trained on the ImageNet dataset, which consists of 1,000 image classes. For our use case, we fine-tune it on a dataset that only has 43 classes. We adjust the last layer, which is a fully connected Linear layer:

#traffic sign dataset has 43 classes
nfeatures = model.fc.in_features
model.fc = torch.nn.Linear(nfeatures, 43)

Because we train a multi-class classification model, we use the cross entropy loss function:

# loss for multi-class classification
loss_function = torch.nn.CrossEntropyLoss()

Next we specify the optimizer that takes the model parameters and learning rate. Here we use the stochastic gradient descent optimizer:

# optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

Defining the training loop

The following code blocks define the training loop. We iterate over ten epochs, perform the forward and backward pass, and update the model parameters.

for epoch in range(10):  # loop over the entire dataset 10 times

    # track the cumulative loss for this epoch
    epoch_loss = 0.0

    for data in train_dataloader:

        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward pass
        outputs = model(inputs)

        # compute loss
        loss = loss_function(outputs, labels)

        # backward pass
        loss.backward()

        # optimize
        optimizer.step()

        # get predictions
        _, preds = torch.max(outputs, 1)

        # statistics
        epoch_loss += loss.item()

    print('Epoch {}/{} Loss: {:.4f}'.format(epoch, 10, epoch_loss))

If you just run the preceding code, the training runs only on your Amazon SageMaker notebook instance. To make the most out of Amazon SageMaker, you want to use the pre-built deep learning containers (DLCs), which come with optimized performance and let you access the full feature set of Debugger at no additional cost. By running on Amazon SageMaker, we can easily train our models at scale. Most deep learning models are trained on GPUs due to the computational intensity. With Amazon SageMaker, GPU instances are automatically created and torn down after training completes, so you only pay for the time the resources were used.

Making a training script compatible with Amazon SageMaker

To run training on Amazon SageMaker, you need to change the location variable in your pre-processing code to the generic Amazon SageMaker environment variables. When Amazon SageMaker spins up the training instance, it automatically downloads the training and validation data from Amazon Simple Storage Service (Amazon S3) into a local folder on the training instance. We can retrieve the local path with os.environ['SM_CHANNEL_TRAIN'] and os.environ['SM_CHANNEL_TEST']:

# update environment variable for training and testing data sets
dataset = datasets.ImageFolder(os.environ['SM_CHANNEL_TRAIN'], transform=train_transform)
dataset_test = datasets.ImageFolder(os.environ['SM_CHANNEL_TEST'], transform=val_transform)

After the change, you should save all the model code as a separate script called train.py.

Uploading data to the S3 bucket

As mentioned in the previous step, Amazon SageMaker automatically downloads training and validation data into the training instance. We need to upload the data to Amazon S3 first. You can find detailed instructions on how to do that in the notebook.
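
If you haven't done this before, a minimal sketch looks like the following; the exact local paths and S3 prefixes may differ from the ones used in the notebook:

import sagemaker

sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()

# Upload the local GTSRB folders; upload_data returns the S3 URIs used later in fit()
train_s3 = sagemaker_session.upload_data(path='GTSRB/Training', bucket=bucket, key_prefix='train')
test_s3 = sagemaker_session.upload_data(path='GTSRB/Final_Test', bucket=bucket, key_prefix='test')
print(train_s3, test_s3)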

Setting up debugger

Now that we have defined the training script and uploaded the data, we’re ready to start training on Amazon SageMaker. We run the training with several Debugger built-in rules enabled. Via the Amazon SageMaker Python SDK and the rule_configs module, we can select any of the 20 available built-in rules, which run at no additional cost. For demonstration purposes, we select loss_not_decreasing, class_imbalance and dead_relu. We can configure several parameters for these rules: for instance, most rules take a threshold parameter that can be adjusted to define when a rule should trigger. We can also define the set of tensors the rules should run on.

The class imbalance rule takes the inputs into the loss function and counts the number of samples per class that the model has seen throughout training. To create the rule, we specify rule_configs.class_imbalance() and the rule runs on the inputs of the loss function. To fine-tune the model, we use the cross entropy loss function, which takes predictions and labels and outputs a loss value. See the following code:

from sagemaker.debugger import Rule, CollectionConfig, rule_configs
class_imbalance_rule = Rule.sagemaker(base_config=rule_configs.class_imbalance(),
                                    rule_parameters={"labels_regex": "CrossEntropyLoss_input_1"}
                                    )

Next we define the loss_not_decreasing rule. It determines if the training or validation loss is decreasing and raises an issue if the loss has not decreased by a certain percentage in the last few iterations. In contrast to the previous rule, this rule runs on the outputs of the loss function (CrossEntropyLoss_output_0). See the following code:

loss_not_decreasing_rule = Rule.sagemaker(base_config=rule_configs.loss_not_decreasing(),
                             rule_parameters={"tensor_regex": "CrossEntropyLoss_output_0",
                                             "mode": "TRAIN"})

The dead_relu rule identifies how many rectified linear unit (ReLU) activations are outputting zero values. ReLU is a non-linear activation function used in many state-of-the-art models. It increases linearly for increasing positive values and outputs zero otherwise. A model can suffer from the dying ReLU problem, where the gradients become zero due to the activation output being zero. If the majority of ReLU activations output zero values, the model can’t effectively learn because weights are no longer getting updated. We instantiate the rule by specifying rule_configs.dead_relu(), and the rule runs on all tensors that captured outputs from ReLU activations:

dead_relu_rule = Rule.sagemaker(base_config=rule_configs.dead_relu(),
                                rule_parameters={"tensor_regex": "relu_output"})

To record additional tensors, we can specify a debugger hook configuration. We can either use default collections such as weights and gradients or define our own custom collection. The following collection saves model inputs and loss function inputs and outputs. We just need to specify a regular expression of tensor names. We save the tensors every 500 steps, and a step is one forward and backward pass. So we get tensors for step 0, 500, 1,000, and so on. See the following code:

from sagemaker.debugger import DebuggerHookConfig, CollectionConfig

debugger_hook_config = DebuggerHookConfig(
      collection_configs=[ 
          CollectionConfig(
                name="custom_collection",
                parameters={ "include_regex": ".*ResNet_input|.*CrossEntropyLoss",
                             "save_interval": "500" })])

For a full list of collections and rules Debugger offers, see List of Debugger Built-in Rules. Debugger captures the tensor collections you specified throughout the training steps and automatically analyzes them against the rules.

Calling the Amazon SageMaker training API

We define the PyTorch estimator that takes the separate training script we saved earlier and specify the instance type that Amazon SageMaker creates for us. To run the training with Debugger and built-in rules, we only have to pass the list of rules and the debugger hook configuration:

from sagemaker.pytorch import PyTorch

pytorch_estimator = PyTorch(entry_point='train.py',
                            role=sagemaker.get_execution_role(),
                            train_instance_type='ml.p2.xlarge',
                            train_instance_count=1,
                            framework_version='1.6.0',
                            py_version='py3',
                            debugger_hook_config=debugger_hook_config,
                            rules=[class_imbalance_rule, dead_relu_rule, loss_not_decreasing_rule]
                           )

Now we start the training on Amazon SageMaker by calling fit(). The function takes a dictionary that specifies the location of the training and validation data in Amazon S3. The keys of the dictionary are the name of data channels Amazon SageMaker creates in the training instance. See the following code:

pytorch_estimator.fit(inputs={'train': 's3://{}/train'.format(bucket), 
                              'test': 's3://{}/test'.format(bucket)}, 
                      wait=True)

While the training is in progress, we can monitor the rule status in real time in Amazon SageMaker Studio. It turns out that the loss_not_decreasing and class_imbalance rules are triggered. The training runs for 10 epochs and reaches a final test accuracy of 96.3%.
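
You can also poll the rule status from the notebook itself; a small sketch, assuming the SageMaker Python SDK's rule job summary helper:

# Each entry reports one built-in rule and whether it found issues
for summary in pytorch_estimator.latest_training_job.rule_job_summary():
    print(summary['RuleConfigurationName'], '-', summary['RuleEvaluationStatus'])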

This seems good, but why were the rules triggered? Let’s dive into the data Debugger captured to find out the root causes.

Using SageMaker Debugger rules and data to uncover hidden problems

In this section, we investigate the data to find any hidden problems, create custom rules to fix the model, and rerun the training.

Inspecting loss_not_decreasing

We use Debugger to investigate what triggered the loss_not_decreasing rule. We use the smdebug library, which provides all the functionality needed to read and access Debugger data. First we create a trial object that takes the path where the Debugger data is stored as input. This can either be a local or an Amazon S3 path. With just a few lines of code, we can retrieve and visualize the loss values as training is still in progress. With trial.steps(), we retrieve the list of recorded steps: a step is one forward and backward pass. We can also specify a mode to retrieve data from the training (modes.TRAIN) or validation phase (modes.EVAL). Debugger’s default sampling interval is 500, so we get loss values for step 0, 500, 1,000, and so on.

To access the loss values, we pass the name of the loss into the trial.tensor() function. The cross entropy loss function we picked measures the performance of a multi-classification model. It takes two inputs: the model outputs and ground truth labels. We can access its outputs via trial.tensor('CrossEntropyLoss_output_0').values():

trial.tensor('CrossEntropyLoss_output_0').values(mode=modes.TRAIN)

{0: array(3.9195325, dtype=float32),
 500: array(0.8488243, dtype=float32),
 1000: array(0.54870504, dtype=float32),
 1500: array(0.25874993, dtype=float32),
 2000: array(0.20406848, dtype=float32),
 2500: array(0.29052508, dtype=float32),
 3000: array(0.18074727, dtype=float32),
 3500: array(0.1956144, dtype=float32),
 4000: array(0.2597512, dtype=float32)}

This code returns a dictionary in which the keys are the step numbers and the values are the loss values. We can now easily visualize the loss values as training is still in progress. See the following code:

import matplotlib.pyplot as plt
from smdebug import modes
from smdebug.trials import create_trial

# Create a trial object that reads the Debugger data from the training job's output path.
path = pytorch_estimator.latest_job_debugger_artifacts_path()
trial = create_trial(path)

plt.ylabel('Train Loss')
plt.xlabel('Steps')
plt.plot(trial.steps(mode=modes.TRAIN),
         list(trial.tensor('CrossEntropyLoss_output_0').values(mode=modes.TRAIN).values()))
plt.show()

The blue curve in the following graph shows that the default training configuration ran for too long. Instead of training for 4,000 steps, early stopping should have been applied after about 1,000 steps. We can use Debugger to enable auto-termination, which stops the training when a rule triggers. For our use case, doing so reduces compute time by more than half (orange curve).

Debugger can auto-terminate training jobs. Metrics are sent to Amazon CloudWatch, so you can set up a CloudWatch alarm and AWS Lambda function that stops a training job if a rule triggers.
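As a minimal sketch, such a Lambda function could stop the job through the SageMaker API; the event field carrying the training job name is an assumption and depends on how the CloudWatch alarm is wired up:

import boto3

def lambda_handler(event, context):
    # Assumes the CloudWatch event forwards the name of the training job that triggered the rule.
    training_job_name = event["trainingJobName"]
    sagemaker_client = boto3.client("sagemaker")
    sagemaker_client.stop_training_job(TrainingJobName=training_job_name)
    return {"stopped": training_job_name}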

For more information about how the auto-termination feature helped one customer reduce compute costs by 70%, see Autodesk optimizes visual similarity search model in Fusion 360 with Amazon SageMaker Debugger.

Inspecting class_imbalance

Real-world datasets are often imbalanced and noisy. If the model training doesn’t account for these factors, it produces a model that has low or no predictive power for the classes with few samples. You can address this in different ways: during data loading, you can draw more samples from the under-represented classes, or you can adjust the loss function to assign a higher penalty to incorrect predictions using class weights.
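For example, with PyTorch’s cross entropy loss the class weights can be passed in directly; the counts below are illustrative placeholders, not values from our dataset:

import torch
import torch.nn as nn

# Illustrative per-class sample counts; in practice they come from the training labels.
class_counts = torch.tensor([1200.0, 300.0, 50.0])
# Rare classes get larger weights so mistakes on them are penalized more heavily.
class_weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=class_weights)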

To investigate the class imbalance issue, we retrieve the inputs of the loss function (previously we retrieved the outputs). The loss function takes the model predictions and the ground truth labels as inputs. We use the latter (CrossEntropyLoss_input_1) to count the number of samples the model has seen during training:

from collections import Counter

import numpy as np

labels = []
for step in trial.steps(mode=modes.TRAIN):
    labels.extend(trial.tensor("CrossEntropyLoss_input_1").value(step, mode=modes.TRAIN))

# Count how many samples of each of the 43 classes the model saw during training.
label_counts = Counter(int(label) for label in labels)
plt.bar(np.arange(0, 43), [label_counts[i] for i in range(43)])
plt.show()

The following visualization shows a high imbalance and several classes with fewer than a hundred samples.

To fix the class imbalance issue, we change the default configuration of the dataloaders to take the class weights into account and draw more samples from classes with fewer samples. Therefore, we define WeightedRandomSampler:

import torch

# `weights` holds one weight per training sample; samples from rare classes get larger weights.
sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, len(weights))

train_dataloader = torch.utils.data.DataLoader(dataset,
                                               batch_size=64,
                                               sampler=sampler)
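The per-sample weights passed to the sampler are not shown above; one way to derive them from the class frequencies is sketched below, assuming dataset[i] returns an (image, label) pair:

import numpy as np
import torch

# One weight per training sample: the inverse frequency of that sample's class.
targets = np.array([dataset[i][1] for i in range(len(dataset))])
class_counts = np.bincount(targets)
weights = torch.tensor(1.0 / class_counts[targets], dtype=torch.double)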

During training, the dataloader now draws more samples from under-represented classes. Class imbalance can produce a model that performs well on classes with many samples but poorly on those with few. Because we trained the model without WeightedRandomSampler, let’s see which classes had particularly low accuracy by looking at the confusion matrix.

Visualizing the confusion matrix in real time

To evaluate the performance of our model, we retrieve labels and predictions and create the confusion matrix:

from sklearn.metrics import confusion_matrix
import seaborn as sns

predictions = []
labels = []
for step in trial.steps(mode=modes.EVAL):
    predictions.extend(np.argmax(trial.tensor("CrossEntropyLoss_input_0").value(step, mode=modes.EVAL), axis=1))
    labels.extend(trial.tensor("CrossEntropyLoss_input_1").value(step, mode=modes.EVAL))

# Build the confusion matrix from the recorded validation labels and predictions.
cm = confusion_matrix(labels, predictions)
fig, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(cm, ax=ax, cbar=True)
plt.show()

Each row in the matrix corresponds to an actual class, and each column to a predicted class. For example, the first row shows class 0 and how often it was predicted as class 0, class 1, class 2, and so on. Ideally, we want high counts on the diagonal, because these are correctly predicted classes. Elements off the diagonal are incorrect predictions. The confusion matrix helps us determine whether particular classes in our dataset get confused with each other more often, which can happen, for instance, because samples from two different classes look very similar. Debugger also provides a confusion built-in rule that computes the confusion matrix while the training is in progress and triggers if the ratio of on-diagonal to off-diagonal values exceeds a predefined threshold.

The following image shows that in most cases our model is predicting the correct classes, but there are a few outliers. You can use Debugger to look more closely into those outliers.

Inspecting incorrect model predictions

To find out what is causing those outliers in the confusion matrix, we investigate the examples on which the model made false predictions. For this analysis, we take both inputs of the loss function into account: CrossEntropyLoss_input_0 contains the model predictions, and CrossEntropyLoss_input_1 contains the labels. We also retrieve the model inputs ResNet_input_0, which contain the input images. We perform the analysis on data recorded during the validation phase, so we specify mode=modes.EVAL.

We iterate over the predictions and model inputs saved by Debugger and select those where the label and prediction do not match. Then we plot the predictions and corresponding images:

for step in trial.steps(mode=modes.EVAL):

    predictions = np.argmax(trial.tensor('CrossEntropyLoss_input_0').value(step, mode=modes.EVAL), axis=1)
    labels = trial.tensor('CrossEntropyLoss_input_1').value(step, mode=modes.EVAL)
    images = trial.tensor('ResNet_input_0').value(step, mode=modes.EVAL)

    for prediction, label, image in zip(predictions, labels, images):
        if prediction != label:
            # signnames maps a class ID to its human-readable sign name.
            print(f"Predicted: '{signnames[prediction]}' Groundtruth: '{signnames[label]}'")
            plt.imshow(image.transpose(1, 2, 0))  # model inputs are stored in CHW layout
            plt.show()

The following images show the result of the code segment. The analysis reveals that the model is often confused by traffic signs that involve a direction. Clearly this is a severe model bug despite the model achieving a decent test accuracy of 96.3%.

The root cause of this is the data augmentation pipeline, which performs a random horizontal flip on the training data. This data augmentation step is typically used when ResNet models are trained from scratch, but it causes a problem in our use case where the dataset contains similar classes where images just differ in their direction.

Running training with a custom rule

With Debugger, we can easily write a custom rule that checks how often the model confused directions. For example, we take the sign “Dangerous curve to left” (class 19) and count how often it was mistaken for “Dangerous curve to right” (class 20), or vice versa. We just need to implement the function invoke_at_step, which Debugger calls every time data for a new step is available. As before, we access the inputs of the loss function, check whether class 19 or class 20 is present, and count how often one was mistaken for the other. If this happens more than 10 times, the rule triggers. See the following code:

import numpy as np

from smdebug.rules.rule import Rule

class MyCustomRule(Rule):
    def __init__(self, base_trial):
        super().__init__(base_trial)
        self.counter = 0

    def invoke_at_step(self, step):
        # Read the loss function inputs recorded for this step from the trial the rule runs against.
        predictions = np.argmax(self.base_trial.tensor('CrossEntropyLoss_input_0').value(step), axis=1)
        labels = self.base_trial.tensor('CrossEntropyLoss_input_1').value(step)

        for prediction, label in zip(predictions, labels):
            if (prediction == 19 and label == 20) or (prediction == 20 and label == 19):
                self.counter += 1
                if self.counter > 10:
                    self.logger.info(f'Found {self.counter} cases where class 19 was mistaken as class 20 and vice versa')
                    return True

        return False

We can easily test and run the custom rule locally by creating the trial object and invoking the rule on the data:

from smdebug.rules import invoke_rule
from smdebug.exceptions import RuleEvaluationConditionMet

rule = MyCustomRule(trial)
try:
    invoke_rule(rule, raise_eval_cond=True)
except RuleEvaluationConditionMet as e:
    print(e)

Running the code cell in the notebook gives the following output:

[2020-10-18 18:51:24.588 28f0f34b9e29:12513 INFO rule_invoker.py:15] Started execution of rule MyCustomRule at step 0
[2020-10-18 18:53:11.846 28f0f34b9e29:12513 INFO <ipython-input-69-cae132ce9a97>:19] Found 11 cases where class 19 was mistaken as class 20 and vice versa
Evaluation of the rule MyCustomRule at step 1812 resulted in the condition being met

The rule triggered at step 1812. After the rule has been tested locally, we can run it as part of our Amazon SageMaker training job. First we need to save the rule in a separate file and then define the following configuration where we indicate on which instance type the rule should run:

from sagemaker.debugger import Rule, CollectionConfig

custom_rule = Rule.custom(
    name='MyCustomRule',
    image_uri='759209512951.dkr.ecr.us-west-2.amazonaws.com/sagemaker-debugger-rule-evaluator:latest', 
    instance_type='ml.t3.medium',     
    source='my_custom_rule.py',
    volume_size_in_gb=10, 
    rule_to_invoke='MyCustomRule',     
)

After we define the configuration, we add the custom_rule to the list of rules in the estimator object.
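For example, the estimator definition from earlier then changes only in its rules argument:

pytorch_estimator = PyTorch(entry_point='train.py',
                            role=sagemaker.get_execution_role(),
                            train_instance_type='ml.p2.xlarge',
                            train_instance_count=1,
                            framework_version='1.6.0',
                            py_version='py3',
                            debugger_hook_config=debugger_hook_config,
                            rules=[class_imbalance_rule, dead_relu_rule,
                                   loss_not_decreasing_rule, custom_rule])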

Fixing the model and rerunning training

Now that Debugger has helped us identify some critical issues in our model, we apply the fixes and rerun the training. As mentioned before, weighted re-sampling allows us to fix the class imbalance problem. We also change the data augmentation pipeline and remove the horizontal flip. We reduce the number of epochs from 10 to 3, because we have seen that the loss doesn’t decrease after roughly 1,000 iterations.
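As a sketch of the adjusted data augmentation, assuming the training script builds its pipeline with torchvision transforms (the resize and normalization values here are illustrative):

from torchvision import transforms

# RandomHorizontalFlip is removed because flipping changes the meaning of direction-based signs.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])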

With Debugger, we can now compare data from different training jobs and see if the issues persist or not. We just need to create a new trial object, read data from both trials, and compare their tensors:

trial1 = create_trial("s3://bucket/training-job/debug-output")
trial2 = create_trial("s3://bucket/improved-training-job/debug-output")
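The label counts behind the following chart can be recomputed from each trial in the same way as before; a minimal sketch:

from collections import Counter

def count_labels(trial_obj):
    # Count the ground truth labels recorded during the training phase of one trial.
    labels = []
    for step in trial_obj.steps(mode=modes.TRAIN):
        labels.extend(trial_obj.tensor("CrossEntropyLoss_input_1").value(step, mode=modes.TRAIN))
    return Counter(int(label) for label in labels)

counts_original = count_labels(trial1)
counts_resampled = count_labels(trial2)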

The following visualization shows the label counts for the original training job and the one where we applied weighted re-sampling (orange). We see that there is no longer a class imbalance issue and the model sees roughly the same amount of instances per class.

We run the training with the same built-in rules as before and add our own custom rule. We monitor the status of the rules and can now see that none of them trigger.

Summary

In this post, we have shown an end-to-end example of how to use Amazon SageMaker Debugger to automatically find, inspect, and fix issues in deep neural network training.

As state-of-the-art models grow in size and complexity, finding issues early in the prototyping phase is critical to save time and costs. Model bugs aren’t always obvious; as we have shown in this post, a suboptimal model can still achieve good overall accuracy.

In our use case, we not only found critical bugs, but also reduced training time by a factor of two and improved model performance.

There is no extra cost for running Debugger built-in rules, and you benefit from having them enabled because you may discover non-obvious model issues. If you want to learn more, see the Amazon SageMaker Debugger documentation.

References

[1] Johannes Stallkamp, Marc Schlipsing, Jan Salmen, Christian Igel, The German traffic sign recognition benchmark: A multi-class classification competition, The 2011 International Joint Conference on Neural Networks, 2011

About the Authors

Nathalie Rauschmayr is an Applied Scientist at AWS, where she helps customers develop deep learning applications.

Lu Huang is a Senior Product Manager on the AWS Deep Engine team, managing Amazon SageMaker Debugger.

Satadal Bhattacharjee is Principal Product Manager at AWS AI. He leads the machine learning engine PM team, working on projects such as SageMaker and optimizing machine learning frameworks such as TensorFlow, PyTorch, and MXNet.

Read More

A Binding Decision: Startup Uses Microscopy Breakthrough to Speed Creation of COVID-19 Vaccines

A Binding Decision: Startup Uses Microscopy Breakthrough to Speed Creation of COVID-19 Vaccines

In the global race to tame the spread of COVID-19, scientific researchers and pharmaceutical companies first must understand the virus’s protein structure.

Doing so requires building detailed 3D models of protein molecules, which until recently has been an intensely time-consuming task. Structura Biotechnology’s groundbreaking software is helping speed things along.

The GPU-accelerated machine learning algorithms underlying Structura’s software drive the image processing stage of a technology called cryo-electron microscopy, or cryo-EM, a revolutionary breakthrough in biochemistry that was the subject of the 2017 Nobel Prize in chemistry.

Cryo-EM enables powerful electron microscopes to capture detailed images of biomolecules in their near-native states. These images can then be used to reconstruct a 3D model of the biomolecules.

With cryo-EM providing valuable 2D image data, Structura’s AI-infused software, called cryoSPARC, can quickly analyze the resulting microscopy data to solve the 3D atomic structures of the embedded protein molecules. That, in turn, allows researchers to more rapidly gauge how effective drugs will be in binding to those molecules, significantly speeding up the process of drug discovery.

Hundreds of labs around the world already use the three-year-old Toronto-based company’s software, with a significant, but not surprising, surge during 2020. In fact, CEO Ali Punjani states that Structura’s software has been used by scientists to visualize COVID-19 proteins in multiple publications.

“Our software helps scientists to understand what their proteins look like and how their proposed therapeutics may bind,” Punjani said. “The more they can see about the structure of the target, the easier it becomes to design or identify a molecule that locks onto that structure and stops it.”

An Intriguing Test Case

The idea for Structura came from a conversation Punjani overheard, during his undergraduate work at the University of Toronto, about trying to solve protein structures using microscopic images. He thought the topic would make an intriguing test case for his developing interest in machine learning research.

Punjani formed his team in 2017, and Structura started building its software, backed by large-scale inference and computer vision algorithms that help to recover a 3D model from 2D image data. The key, he said, is to collect and analyze — with increasing accuracy — a sufficient amount of microscopic data to enable high-quality 3D reconstructions.

“It’s a highly scientific domain with zero tolerance for error,” Punjani said. “Getting it wrong can be a huge waste of time and money.”

Structura’s software is deployed on premises, typically on customers’ hardware, which must be up to the task of processing real-time 3D microscope data. Punjani said labs often run this work on NVIDIA Quadro RTX 6000 GPUs, or something similar, while many larger pharmaceutical companies have invested in clusters of NVIDIA V100 Tensor Core GPUs accompanied by a variety of NVIDIA graphics cards.

Structura does all of its model training and software development on machines running multi-GPU nodes of V100 GPUs. Punjani said his team writes all of its GPU kernels from scratch because of the particular and exotic nature of the problem. The code that runs on Structura’s GPUs is written in CUDA, while cuDNN is used for some high-end computing tasks.

Right Software at the Right Time

Given the value of Structura’s innovations, and the importance of cryo-EM, Punjani isn’t holding back on his ambitions for the company, which recently joined NVIDIA Inception, an accelerator program designed to nurture startups revolutionizing industries with advancements in AI and data sciences.

Punjani says that any research related to living things can now make use of the information from 3D protein structures that cryo-EM offers and, as a result, there’s a lot of industry attention focused on the kind of work Structura’s software enables.

“What we’re building right now is a fundamental building block for cryo-EM to better enable structure-based drug discovery,” he said. “Cryo-EM is set to become ubiquitous throughout all biological research.”

Stay up to date with the latest healthcare news from NVIDIA.

The post A Binding Decision: Startup Uses Microscopy Breakthrough to Speed Creation of COVID-19 Vaccines appeared first on The Official NVIDIA Blog.

Read More

NVIDIA RTX Real-Time Rendering Inspires Vivid Visuals, Captivating Cinematics for Film and Television

NVIDIA RTX Real-Time Rendering Inspires Vivid Visuals, Captivating Cinematics for Film and Television

Concept art is often considered the bread and butter of filmmaking, and Ryan Church is the concept design supervisor behind the visuals of many of our favorite films.

Church has created concept art for blockbusters such as Avatar, Tomorrowland and Transformers. He’s collaborated closely with George Lucas on the Star Wars prequel and sequel trilogies. Now, he’s working on the popular series The Mandalorian.

All images courtesy of Ryan Church.

When he’s not creating unique vehicles and dazzling worlds for film and television, Church captures new visions and illustrates designs in his personal time. He’s always had a close relationship with cutting-edge technology to produce the highest-quality visuals, even when he’s working at home.

Recently, Church got his hands on an HP Z8 workstation powered by the NVIDIA Quadro RTX 6000. With the performance and speed of RTX behind his concept designs, he can render stunning images of architecture, vehicles and scenery faster than ever.

RTX Delivers More Time for Precision and Creativity

Filmmakers are always trying to figure out the quickest way to bring a concept or idea to life in a fast-paced environment.

Church says that directors nowadays don’t just want to see a drawing of a place or item for the set; they want to see the actual place or item in front of them.

To do so, Church creates his 3D models in Foundry’s Modo and turns to OctaneRender, a GPU render engine that uses NVIDIA RTX to accelerate the rendering performance for his scenes. This allows him to achieve real-time rendering, and with the large memory capacity and performance gains of NVIDIA RTX, Church can create massive worlds freely without worrying about optimizing the geometry of his scenes.

“NVIDIA RTX has allowed me to work without babysitting the geometry all along the way,” said Church. “The friction has been removed from the creation process, allowing me to stay focused on the art.”

Like Church, many concept artists are using technology to create and design complex virtual sets and elaborate 3D mattes for virtual production in real time. The large GPU memory capacities of RTX allow for free flow of art creation while working with multiple creative applications.

And when trying to find the perfect lighting, or tweaking the depth of field or reflections of a scene, the NVIDIA RTX GPU speeds up the workflow to allow for better, quicker designs. Church can do 20 to 30 passes on a scene, enabling him to iterate on his designs more often and get the look and feel he’s aiming for.

“The RTX card in the Z8 allows me to have that complex scene and really dial in much better and faster,” said Church. “With design, lighting, texturing happening all in real time, I can model and move lights around, and see it all happening in the active, updating viewport.”

When Church needs desktop-class performance on the go, he turns to his HP ZBook Studio mobile workstation. Featuring the NVIDIA Studio driver and NVIDIA Quadro RTX GPU, the ZBook Studio has been tested and certified to work with the top creative applications.

As a leading concept designer standing at the intersection between art and technology, Church has inspired countless artists, and his work will continue to inspire for generations to come.

Concept artist Ryan Church pushes boundaries of creativity with NVIDIA RTX.

Learn more about NVIDIA RTX.

The post NVIDIA RTX Real-Time Rendering Inspires Vivid Visuals, Captivating Cinematics for Film and Television appeared first on The Official NVIDIA Blog.

Read More