Apple – Page 14 – Vedere AI

Revisit Large-Scale Image–Caption Data in Pre-training Multimodal Foundation Models

April 8, 2025

by Apple

Recent advancements in multimodal models highlight the value of rewritten captions for improving performance, yet key challenges remain. Notably, the role of synthetic captions and their interaction with original web-crawled AltTexts in pre-training is still unclear. Additionally, different multimodal foundation models may have distinct preferences for specific caption formats, while efforts to study the optimal captions for each foundation model remain limited. In this work, we introduce a novel, controllable, and scalable captioning pipeline that generates diverse caption formats…

Apple Machine Learning Research

Apple Workshop on Natural Language Understanding 2024

April 7, 2025

by Apple

Progress in natural language processing enables more intuitive ways of interacting with technology. For example, many of Apple’s products and services, including Siri and search, use natural language understanding and generation to enable a fluent and seamless interface experience for users. Natural language is a rapidly moving area of machine learning research, and includes work on large-scale data curation across multiple languages, novel architectures and algorithms, and new evaluation regimes, all of which involve important issues of privacy and security, as well as of performance and…

Apple Machine Learning Research

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

April 4, 2025

by Apple

Large Language Models (LLMs) have transformed natural language processing, but face significant challenges in widespread deployment due to their high runtime cost. In this paper, we introduce SeedLM, a novel post-training compression method that uses seeds of a pseudo-random generator to encode and compress model weights. Specifically, for each block of weights, we find a seed that is fed into a Linear Feedback Shift Register (LFSR) during inference to efficiently generate a random matrix. This matrix is then linearly combined with compressed coefficients to reconstruct the weight block…

Apple Machine Learning Research
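
The seed-plus-coefficients idea described above can be illustrated with a small sketch. This is not the SeedLM implementation: the 16-bit LFSR taps, the mapping of LFSR states to values in [-1, 1), the function names (`lfsr16_states`, `random_matrix`, `reconstruct`, `search_seed_rank1`), and the single unquantized coefficient used in the seed search are all assumptions made for illustration.

```python
def lfsr16_states(seed, count):
    """Return `count` successive states of a 16-bit Fibonacci LFSR.

    Taps 16, 14, 13, 11 are a standard maximal-length choice (an assumption,
    not taken from the SeedLM abstract).
    """
    if not 0 < seed < (1 << 16):
        raise ValueError("seed must be a non-zero 16-bit integer")
    state, states = seed, []
    for _ in range(count):
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        states.append(state)
    return states


def random_matrix(seed, rows, cols):
    """Regenerate a rows x cols pseudo-random matrix from the seed alone."""
    values = [s / 32768.0 - 1.0 for s in lfsr16_states(seed, rows * cols)]
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def reconstruct(seed, coeffs, block_len):
    """Rebuild a weight block as (matrix generated from seed) @ coeffs."""
    matrix = random_matrix(seed, block_len, len(coeffs))
    return [sum(m * c for m, c in zip(row, coeffs)) for row in matrix]


def search_seed_rank1(block, candidate_seeds):
    """Pick the seed whose one-column matrix best fits `block`.

    For each seed the best single coefficient is the least-squares
    projection <block, u> / <u, u>. Returns (error, seed, coefficient).
    """
    best = None
    for seed in candidate_seeds:
        u = [row[0] for row in random_matrix(seed, len(block), 1)]
        denom = sum(x * x for x in u)
        if denom == 0.0:
            continue  # degenerate all-zero column, skip it
        coeff = sum(w * x for w, x in zip(block, u)) / denom
        err = sum((w - coeff * x) ** 2 for w, x in zip(block, u))
        if best is None or err < best[0]:
            best = (err, seed, coeff)
    return best


# Usage: compress a toy block to (seed, coefficient) and rebuild it.
toy_block = reconstruct(1234, [0.75], 8)
error, seed, coeff = search_seed_rank1(toy_block, range(1, 2000))
rebuilt = reconstruct(seed, [coeff], 8)
```

Only the seed and the coefficient need to be stored for the toy block; the pseudo-random matrix is regenerated on demand, which is the trade of extra compute for less stored data that the abstract describes.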

Interpreting and Improving Optimal Control Problems With Directional Corrections

April 3, 2025

by Apple

Many robotics tasks, such as path planning or trajectory optimization, are formulated as optimal control problems (OCPs). The key to obtaining high performance lies in the design of the OCP’s objective function. In practice, the objective function consists of a set of individual components that must be carefully modeled and traded off such that the OCP has the desired solution. It is often challenging to balance multiple components to achieve the desired solution and to understand, when the solution is undesired, the impact of individual cost components. In this paper, we present a framework…

Apple Machine Learning Research

Modeling Speech Emotion With Label Variance and Analyzing Performance Across Speakers and Unseen Acoustic Conditions

April 2, 2025

by Apple

Spontaneous speech emotion data usually contain perceptual grades, where graders assign emotion scores after listening to the speech files. Such perceptual grades introduce uncertainty in labels due to variation in grader opinion. Grader variation is addressed by using consensus grades as ground truth, where the emotion with the highest vote is selected; as a consequence, this approach fails to consider ambiguous instances where a speech sample may contain multiple emotions, as captured through grader opinion uncertainty. We demonstrate that using the probability density function of the emotion grades as…

Apple Machine Learning Research
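
The contrast between a consensus grade and a distribution over grades can be shown with a toy example. This is a minimal sketch, not the method from the paper: the function names, the emotion classes, and the vote data are hypothetical, and a simple normalized vote histogram stands in for the distribution of grades.

```python
from collections import Counter


def consensus_label(votes):
    """Majority vote: keep only the emotion with the most grader votes."""
    return Counter(votes).most_common(1)[0][0]


def label_distribution(votes, classes):
    """Soft label: the fraction of graders who chose each emotion."""
    counts = Counter(votes)
    total = len(votes)
    return {emotion: counts[emotion] / total for emotion in classes}


# Hypothetical grades from five graders for one speech sample.
classes = ["happy", "neutral", "sad"]
votes = ["happy", "happy", "neutral", "happy", "neutral"]

hard = consensus_label(votes)              # discards the two "neutral" votes
soft = label_distribution(votes, classes)  # keeps the 3-to-2 disagreement
```

The hard label treats this sample as unambiguously "happy", while the soft label preserves the fact that two of five graders heard it as "neutral".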

Universally Instance-Optimal Mechanisms for Private Statistical Estimation

April 2, 2025

by Apple

We consider the problem of instance-optimal statistical estimation under the constraint of differential privacy where mechanisms must adapt to the difficulty of the input dataset. We prove a new instance-specific lower bound using a new divergence and show it characterizes the local minimax optimal rates for private statistical estimation. We propose two new mechanisms that are universally instance-optimal for general estimation problems up to logarithmic factors. Our first mechanism, the total variation mechanism, builds on the exponential mechanism with stable approximations of the total…

Apple Machine Learning Research

Mutual Reinforcement of LLM Dialogue Synthesis and Summarization Capabilities for Few-Shot Dialogue Summarization

April 2, 2025

by Apple

In this work, we propose Mutual Reinforcing Data Synthesis (MRDS) within LLMs to improve the few-shot dialogue summarization task. Unlike prior methods that require external knowledge, we mutually reinforce the LLM’s dialogue synthesis and summarization capabilities, allowing them to complement each other during training and enhance overall performance. The dialogue synthesis capability is enhanced by directed preference optimization with preference scoring from the summarization capability. The summarization capability is enhanced by the additional high-quality dialogue-summary paired data produced…

Apple Machine Learning Research

The Role of Prosody in Spoken Question Answering

April 2, 2025

by Apple

Spoken language understanding research to date has generally carried a heavy text perspective. Most datasets are derived from text, which is then synthesized into speech, and most models typically rely on automatic transcriptions of speech. This is to the detriment of prosody: additional information carried by the speech signal beyond the phonetics of the words themselves and difficult to recover from text alone. In this work, we investigate the role of prosody in Spoken Question Answering. By isolating prosodic and lexical information on the SLUE-SQA-5 dataset, which consists of…

Apple Machine Learning Research

International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025

March 31, 2025

by Apple

Apple Machine Learning Research

VibE: A Visual Analytics Workflow for Semantic Error Analysis of CVML Models at Subgroup Level

March 31, 2025

by Apple

Effective error analysis is critical for the successful development and deployment of CVML models. One approach to understanding model errors is to summarize the common characteristics of error samples. This can be particularly challenging in tasks that utilize unstructured, complex data such as images, where patterns are not always obvious. Another method is to analyze error distributions across pre-defined categories, which requires analysts to hypothesize about potential error causes in advance. Forming such hypotheses without access to explicit labels or annotations makes it difficult to…

Apple Machine Learning Research

Copyright © 2019-2025 Vedere AI. All Rights Reserved.