CATEGORY

Apple

MotionPrint: Ready-to-Use, Device-Agnostic, and Location-Invariant Motion Activity Models

Wearable sensors have permeated people's lives, ushering in impactful applications in interactive systems and activity recognition. However, practitioners face significant obstacles when dealing...

Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models

While Automatic Speech Recognition (ASR) systems are widely used in many real-world applications, they often do not generalize well to new domains and...

Randomized Algorithms for Precise Measurement of Differentially-private, Personalized

This paper was accepted at The 5th AAAI Workshop on Privacy-Preserving Artificial Intelligence. Personalized recommendations form an important part of today's internet ecosystem, helping...

Merge Vision Foundation Models via Multi-Task Distillation

As the repository of publicly available pre-trained vision foundation models (VFMs) — such as CLIP, DINOv2, and SAM — grows, users face challenges...

Vision-Based Hand Gesture Customization from a Single Demonstration

Hand gesture recognition is becoming a more prevalent mode of human-computer interaction, especially as cameras proliferate across everyday devices. Despite continued progress in...

Moonwalk: Advancing Gait-Based User Recognition on Wearable Devices with Metric Learning

Personal devices have adopted diverse authentication methods, including biometric recognition and passcodes. In contrast, headphones have limited input mechanisms, depending solely on...

Humanizing Word Error Rate for ASR Transcript Readability and Accessibility

Podcasting has grown to be a popular and powerful medium for storytelling, news, and entertainment. Without transcripts, podcasts may be inaccessible to people...

VeCLIP: Improving CLIP Training via Visual-enriched Captions

Large-scale web-crawled datasets are fundamental for the success of pre-training vision-language models, such as CLIP. However, the inherent noise and potential...

Human Following in Mobile Platforms with Person Re-Identification

Human following serves as an important human-robot interaction feature, but real-world scenarios make it challenging, particularly for a mobile agent. The main challenge is...

What Can CLIP Learn From Task-specific Experts?

This paper has been accepted to the UniReps Workshop in NeurIPS 2023. Contrastive language image pretraining has become the standard approach for training vision...
