Apple – Page 18 – Vedere AI

Device-Directed Speech Detection for Follow-up Conversations Using Large Language Models

November 5, 2024

by Apple

This paper was accepted at the Adaptive Foundation Models (AFM) workshop at NeurIPS Workshop 2024.
Follow-up conversations with virtual assistants (VAs) enable a user to seamlessly interact with a VA without the need to repeatedly invoke it using a keyword (after the first query). Therefore, accurate Device-Directed Speech Detection (DDSD) from the follow-up queries is critical for enabling naturalistic user experience. To this end, we explore the notion of Large Language Models (LLMs) and model the first query when making inference about the follow-ups (based on the ASR-decoded text), via…Apple Machine Learning Research

Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval

November 5, 2024

by Apple

Neural contextual biasing allows speech recognition models to leverage contextually relevant information, leading to improved transcription accuracy. However, the biasing mechanism is typically based on a cross-attention module between the audio and a catalogue of biasing entries, which means computational complexity can pose severe practical limitations on the size of the biasing catalogue and consequently on accuracy improvements. This work proposes an approximation to cross-attention scoring based on vector quantization and enables compute- and memory-efficient use of large biasing…Apple Machine Learning Research

Aggregate-and-Adapt Natural Language Prompts for Downstream Generalization of CLIP

November 4, 2024

by Apple

Large pretrained vision-language models like CLIP have shown promising generalization capability, but may struggle in specialized domains (e.g., satellite imagery) or fine-grained classification (e.g., car models) where the visual concepts are unseen or under-represented during pretraining. Prompt learning offers a parameter-efficient finetuning framework that can adapt CLIP to downstream tasks even when limited annotation data are available. In this paper, we improve prompt learning by distilling the textual knowledge from natural language prompts (either human- or LLM-generated) to provide…Apple Machine Learning Research

On Device Llama 3.1 with Core ML

November 1, 2024

by Apple

Many app developers are interested in building on device experiences that integrate increasingly capable large language models (LLMs). Running these models locally on Apple silicon enables developers to leverage the capabilities of the user’s device for cost-effective inference, without sending data to and from third party servers, which also helps protect user privacy. In order to do this, the models must be carefully optimized to effectively utilize the available system resources, because LLMs often have high demands for both memory and processing power.
This technical post details how to…Apple Machine Learning Research

Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs

October 30, 2024

by Apple

Translating text that contains entity names is a challenging task, as cultural-related references can vary significantly across languages. These variations may also be caused by transcreation, an adaptation process that entails more than transliteration and word-for-word translation. In this paper, we address the problem of cross-cultural translation on two fronts: (i) we introduce XC-Translate, the first large-scale, manually-created benchmark for machine translation that focuses on text that contains potentially culturally-nuanced entity names, and (ii) we propose KG-MT, a novel end-to-end…Apple Machine Learning Research

ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA Datasets with Large Language Models

October 29, 2024

by Apple

The rapid evolution of Large Language Models (LLMs) and conversational assistants necessitates dynamic, scalable, and configurable conversational datasets for training and evaluation. These datasets must accommodate diverse user interaction modes, including text and voice, each presenting unique modeling challenges. Knowledge Graphs (KGs), with their structured and evolving nature, offer an ideal foundation for current and precise knowledge. Although human-curated KG-based conversational datasets exist, they struggle to keep pace with the rapidly changing user information needs. We present…Apple Machine Learning Research

Computational Bottlenecks of Training Small-Scale Large Language Models

October 29, 2024

by Apple

This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) workshop at NeurIPS Workshop 2024.
While large language models (LLMs) dominate the AI landscape, Small-scale large Language Models (SLMs) are gaining attention due to cost and efficiency demands from consumers. However, there is limited research on the training behavior and computational requirements of SLMs. In this study, we explore the computational bottlenecks of training SLMs (up to 2B parameters) by examining the effects of various hyperparameters and configurations, including GPU type, batch size…Apple Machine Learning Research

Promoting Cross-Modal Representations to Improve Multimodal Foundation Models for Physiological Signals

October 28, 2024

by Apple

Many healthcare applications are inherently multimodal, involving several physiological signals. As sensors for these signals become more common, improving machine learning methods for multimodal healthcare data is crucial. Pretraining foundation models is a promising avenue for success. However, methods for developing foundation models in healthcare are still in early exploration and it is unclear which pretraining strategies are most effective given the diversity of physiological signals. This is partly due to challenges in multimodal health data: obtaining data across many patients is…Apple Machine Learning Research

Smart Audit System Empowered by LLM

October 28, 2024

by Apple

Manufacturing quality audits are pivotal for ensuring high product standards in mass production environments. Traditional auditing processes, however, are labor-intensive and heavily reliant on human expertise, posing challenges in maintaining transparency, accountability, and continuous improvement across complex global supply chains. To address these challenges, we propose a smart audit system empowered by large language models (LLMs). Our approach introduces three key innovations: a dynamic risk assessment model that streamlines audit procedures and optimizes resource allocation; a…Apple Machine Learning Research

Divide-or-Conquer? Which Part Should You Distill Your LLM?

October 25, 2024

by Apple

Recent methods have demonstrated that Large Language Models (LLMs) can solve reasoning tasks better when they are encouraged to solve subtasks of the main task first. In this paper we devise a similar strategy that breaks down reasoning tasks into a problem decomposition phase and a problem solving phase and show that the strategy is able to outperform a single stage solution. Further, we hypothesize that the decomposition should be easier to distill into a smaller model compared to the problem solving because the latter requires large amounts of domain knowledge while the former only requires…Apple Machine Learning Research

Vedere AI

Posts in category: Apple

Device-Directed Speech Detection for Follow-up Conversations Using Large Language Models

Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval

Aggregate-and-Adapt Natural Language Prompts for Downstream Generalization of CLIP

On Device Llama 3.1 with Core ML

Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs

ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA Datasets with Large Language Models

Computational Bottlenecks of Training Small-Scale Large Language Models

Promoting Cross-Modal Representations to Improve Multimodal Foundation Models for Physiological Signals

Smart Audit System Empowered by LLM

Divide-or-Conquer? Which Part Should You Distill Your LLM?

Navigation

GenAI Vision Endless Possibilities

"I'm interested in things that change the world or that affect the future and wondrous, new technology where you see it, and you're like, 'Wow, how did that even happen? How is that possible?'" -- Elon Musk

Copyright © 2019-2025 Vedere AI. All Rights Reserved.