We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a 1+o(1)-factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an…
Apple Machine Learning Research
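The first two steps named in the abstract above (random projection, then normalization) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Gram-Schmidt construction of the random subspace, and the target dimension `k` are all hypothetical, and the private randomizer that runs afterwards is deliberately left out because the abstract is truncated at that point.

```python
import math
import random


def random_orthonormal_rows(k, d, rng):
    """Draw k orthonormal vectors in R^d (assumed construction: Gram-Schmidt
    applied to independent Gaussian samples)."""
    if k > d:
        raise ValueError("subspace dimension k must not exceed d")
    rows = []
    while len(rows) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Remove the components along the vectors already chosen.
        for u in rows:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # skip the (measure-zero) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def project_and_normalize(x, k, rng):
    """Project x onto a random k-dimensional subspace, then rescale the
    projected vector to unit Euclidean norm."""
    basis = random_orthonormal_rows(k, len(x), rng)
    y = [sum(b * xi for b, xi in zip(row, x)) for row in basis]
    norm = math.sqrt(sum(c * c for c in y))
    if norm == 0.0:
        return y  # hypothetical handling: a zero projection is returned as-is
    return [c / norm for c in y]


rng = random.Random(0)
x = [1.0 / 4.0] * 16           # a point inside the unit ball in R^16
y = project_and_normalize(x, 4, rng)
```

Here `y` has length 4 and unit norm; whatever privacy mechanism follows would operate on this low-dimensional unit vector rather than on the original input.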
Controllable Music Production with Diffusion Models and Guidance Gradients
This paper was accepted at the NeurIPS 2023 workshop on Diffusion Models.
We demonstrate how conditional generation from diffusion models can be used to tackle a variety of realistic tasks in the production of music in 44.1 kHz stereo audio with sampling-time guidance. The scenarios we consider include continuation, inpainting and regeneration of musical audio, the creation of smooth transitions between two different music tracks, and the transfer of desired stylistic characteristics to existing audio clips. We achieve this by applying guidance at sampling time in a simple framework that…
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
This paper was accepted at the workshop I Can’t Believe It’s Not Better! (ICBINB) at NeurIPS 2023.
Recent advances in image tokenizers, such as VQ-VAE, have enabled text-to-image generation using auto-regressive methods, similar to language modeling. However, these methods have yet to leverage pre-trained language models, despite their adaptability to various downstream tasks. In this work, we explore this gap, and find that pre-trained language models offer limited help in auto-regressive text-to-image generation. We provide a two-fold explanation by analyzing tokens from each modality…
Generating Molecular Conformers with Manifold Diffusion Fields
This paper was accepted at the Generative AI and Biology workshop at NeurIPS 2023.
In this paper we tackle the problem of generating a molecule conformation in 3D space given its 2D structure. We approach this problem through the lens of a diffusion model for functions in Riemannian Manifolds. Our approach is simple and scalable, and obtains results that are on par with state-of-the-art while making no assumptions about the explicit structure of molecules.
FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
Large language models’ inability to attribute their claims to external knowledge and their tendency to hallucinate make it difficult to trust their responses. Even humans are prone to factual errors in their writing. Therefore, verifying the factual accuracy of textual information, whether generated by large language models or curated by humans, is an important task. However, manually validating and correcting factual errors tends to be a tedious and labor-intensive process. In this paper, we propose FLEEK for automatic fact verification and correction. FLEEK automatically extracts factual…
4M: Massively Multimodal Masked Modeling
Current machine learning models for vision are often highly specialized and limited to a single modality and task. In contrast, recent large language models exhibit a wide range of capabilities, hinting at a possibility for similarly versatile models in computer vision. In this paper, we take a step in this direction and propose a multimodal training scheme called 4M. It consists of training a single unified Transformer encoder-decoder using a masked modeling objective across a wide range of input/output modalities – including text, images, geometric, and semantic…
Adaptive Weight Decay
We propose adaptive weight decay, which automatically tunes the hyper-parameter for weight decay during each training iteration. For classification problems, we propose changing the value of the weight decay hyper-parameter on the fly based on the strength of updates from the classification loss (i.e., the gradient of the cross-entropy) and the regularization loss (i.e., the ℓ2-norm of the weights). We show that this simple modification can result in large improvements in adversarial robustness (an area which suffers from robust overfitting) without requiring extra data across various datasets and…
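To make the idea of retuning the decay coefficient at every iteration concrete, here is a minimal sketch. The abstract does not state the update rule, so the ratio form used below (gradient norm divided by weight norm, times a constant `scale`) is an assumption chosen for illustration, and the function names and the plain SGD step are hypothetical.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def adaptive_decay_coeff(weights, grads, scale):
    """Assumed rule: decay coefficient proportional to the ratio of the
    loss-gradient norm to the weight norm, recomputed each iteration."""
    w_norm = l2_norm(weights)
    if w_norm == 0.0:
        return 0.0  # nothing to decay when all weights are zero
    return scale * l2_norm(grads) / w_norm


def sgd_step(weights, grads, lr, scale):
    """One SGD step with the per-iteration decay coefficient applied."""
    lam = adaptive_decay_coeff(weights, grads, scale)
    new_weights = [w - lr * (g + lam * w) for w, g in zip(weights, grads)]
    return new_weights, lam


weights = [3.0, 4.0]   # toy parameters, norm 5
grads = [0.6, 0.8]     # toy cross-entropy gradient, norm 1
new_weights, lam = sgd_step(weights, grads, lr=0.1, scale=0.5)
```

With this rule, the decay term shrinks when the classification gradient is small relative to the weights and grows when it is large, which is one way to keep the two loss components in balance without a fixed coefficient.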
Federated Learning for Speech Recognition: Revisiting Current Trends Towards Large-Scale ASR
This paper was accepted at the Federated Learning in the Age of Foundation Models workshop at NeurIPS 2023.
While automatic speech recognition (ASR) has witnessed remarkable achievements in recent years, it has not garnered widespread focus within the federated learning (FL) and differential privacy (DP) communities. Meanwhile, ASR is also a well-suited benchmark for FL and DP as there is (i) a natural data split across users by using speaker information; (ii) heterogeneous data across speakers close to practical settings; (iii) interplay between acoustic and language modeling; (iv) and it…
Swap Agnostic Learning, or Characterizing Omniprediction via Multicalibration
A recent line of work shows that notions of multigroup fairness imply surprisingly strong notions of omniprediction: loss minimization guarantees that apply not just for a specific loss function, but for any loss belonging to a large family of losses. While prior work has derived various notions of omniprediction from multigroup fairness guarantees of varying strength, it was unknown whether the connection goes in both directions. In this work, we answer this question in the affirmative, establishing equivalences between notions of multicalibration and omniprediction. The new definitions that…