An illusion of predictability in scientific results: Even experts confuse inferential uncertainty and outcome variability
In many fields, practitioners focus on inference (precisely estimating an unknown quantity, such as a population average) instead of prediction (forecasting individual outcomes). In a newly published article, researchers from Microsoft demonstrate that this focus on inference over prediction can mislead readers into thinking that the results of scientific studies are more definitive than they actually are.
Through a series of randomized experiments, the researchers demonstrate that this confusion arises for one of the most basic ways of presenting statistical findings and affects even experts whose jobs involve producing and interpreting such results, including medical professionals, data scientists, and tenure-track faculty. In contrast, the paper shows that communicating both inferential and predictive information side by side provides a simple and effective alternative, leading to calibrated interpretations of scientific results.
This article was published in the Proceedings of the National Academy of Sciences (PNAS).
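To make the distinction between inferential uncertainty and outcome variability concrete, here is a minimal sketch that reports both side by side for a simulated sample. It is an illustration only, not the procedure used in the paper: the simulated data, the `summarize` helper, and the normal approximation (z = 1.96) are all assumptions made for this example.

```python
import math
import random
import statistics


def summarize(sample, z=1.96):
    """Return inferential and predictive intervals for a sample.

    Assumes a normal approximation (z = 1.96 for roughly 95% coverage);
    this is an assumption of the sketch, not a method from the paper.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)   # spread of individual outcomes
    se = sd / math.sqrt(n)          # uncertainty in the estimated mean

    # Inferential: where the population average plausibly lies.
    ci = (mean - z * se, mean + z * se)

    # Predictive: where a single new outcome plausibly falls.
    pred_sd = sd * math.sqrt(1 + 1 / n)
    pi = (mean - z * pred_sd, mean + z * pred_sd)

    return {"mean": mean, "sd": sd, "se": se, "ci": ci, "pi": pi}


# Hypothetical data: 400 simulated outcomes.
rng = random.Random(0)
sample = [rng.gauss(100, 15) for _ in range(400)]

result = summarize(sample)
ci_width = result["ci"][1] - result["ci"][0]
pi_width = result["pi"][1] - result["pi"][0]
print(f"confidence interval width: {ci_width:.1f}")
print(f"prediction interval width: {pi_width:.1f}")
```

With 400 observations, the interval for an individual outcome is about twenty times wider than the interval for the average. A reader shown only the narrow interval could easily overestimate how predictable individual outcomes are.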
FiGURe: Simple and Efficient Unsupervised Node Representations with Filter Augmentations
Contrastive learning is a powerful method for unsupervised graph representation learning. It is typically deployed on homophilic tasks, where task labels strongly correlate with the graph’s structure. However, the resulting representations struggle on heterophilic tasks, where edges tend to connect nodes with different labels.
Several papers have tackled the problem of heterophily by leveraging information from both low- and high-frequency components. Yet these methods operate in semi-supervised settings, and extending these ideas to unsupervised learning has yet to be explored.
In a new paper, FiGURe: Simple and Efficient Unsupervised Node Representations with Filter Augmentations, researchers from Microsoft propose using filter banks to learn representations that can cater to both heterophilic and homophilic tasks. They address the related computational and storage burdens by sharing the encoder across these various filter views, and by learning a low-dimensional representation that is projected to high dimensions using Random Fourier Features. FiGURe achieves a gain of up to 4.4% over state-of-the-art unsupervised models across all datasets considered, both homophilic and heterophilic.
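As a rough illustration of the projection step, the sketch below maps a low-dimensional vector to a higher-dimensional one with Random Fourier Features. It uses the standard random-cosine construction for approximating a Gaussian (RBF) kernel, which may differ from the paper's exact setup. The `make_rff_map` helper, the dimensions, the bandwidth `sigma`, and the input vectors are all hypothetical choices for this example.

```python
import math
import random


def make_rff_map(in_dim, out_dim, sigma=1.0, seed=0):
    """Build a Random Fourier Feature map from in_dim to out_dim.

    Standard construction for a Gaussian kernel with bandwidth sigma:
    z(x) = sqrt(2 / out_dim) * cos(W x + b), with W ~ N(0, 1 / sigma^2)
    and b ~ Uniform(0, 2 * pi). Names and defaults are illustrative.
    """
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0 / sigma) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(out_dim)]
    scale = math.sqrt(2.0 / out_dim)

    def project(x):
        # One cosine feature per random direction.
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return project


# Hypothetical 4-dimensional embeddings projected to 2048 dimensions.
project = make_rff_map(in_dim=4, out_dim=2048)
x = [0.1, -0.3, 0.5, 0.2]
y = [0.0, -0.1, 0.4, 0.3]

# The dot product of the projected vectors approximates the RBF kernel.
approx = sum(a * b for a, b in zip(project(x), project(y)))
exact = math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / 2.0)
print(f"RFF estimate: {approx:.3f}, RBF kernel: {exact:.3f}")
```

In this construction, only the compact low-dimensional vectors need to be stored. The random projection can be regenerated from its seed when the high-dimensional features are needed.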
Kathleen Sullivan named to Insider’s 30 under 40 in healthcare list
Microsoft Research congratulates Kathleen Sullivan for being named to Insider’s list of 30 under 40 forging a new future in healthcare. After a competitive nomination and interview, Kathleen was selected for this inspiring list of “entrepreneurs, scientists, doctors, and business leaders who are transforming the healthcare industry.”
As senior director of strategy and operations within the health and life sciences division of Microsoft Research, Sullivan helps steer the company’s investments in AI. She helped engineer a Microsoft collaboration with Nuance Technologies, a precursor to Microsoft’s acquisition of Nuance in 2021. In 2018, Sullivan helped secure Microsoft’s partnership with Adaptive Biotechnologies to map the human immune system.
Read the Insider article