Video, Audio, Sensor Data, Accessibility Tech, Computer Vision, Assistive Technology, Smart Glasses

Egocentric Vision for Accessibility AI

1,400 hours of first-person video from blind and low-vision users with rich accessibility metadata

Clips

5,000

Hours of content

1,400

Use cases

Accessibility Tech, Computer Vision, Assistive Technology, Smart Glasses

Pricing

Custom

Overview

Comprehensive egocentric video dataset captured from 5,000+ active users who are blind or have low vision, recorded on Google Glass and Solos smart glasses. Each session includes synchronized audio, OCR text recognition outputs, conversation context transcripts, and head-mounted IMU motion data. Hands and objects are clearly visible in real-world scenarios, making the dataset well suited to training assistive AI systems, accessibility applications, and computer vision models that understand the first-person perspective of blind and low-vision users.

Highlights

  • Real-world POV from blind/low-vision users across diverse daily scenarios
  • Rich metadata: OCR outputs with bounding boxes, conversation transcripts, and head IMU motion data
  • Hands and objects clearly visible for manipulation understanding and object recognition
  • Temporal alignment: All modalities synchronized with precise timestamps (see the alignment sketch after this list)
  • Multi-device coverage: Google Glass and Solos smart glasses for device-agnostic training
  • Active collection pipeline: Can gather targeted scenarios from 5,000+ opt-in users
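
Because all modalities share precise timestamps, lining up OCR detections with a given moment of video reduces to a timestamp lookup. The following is a minimal sketch under assumed field names and file layout (a per-clip `ocr.json` with `timestamp`, `text`, `bbox`, and `confidence` entries); the actual schema is defined by the dataset's JSON deliverables.

```python
import bisect
import json

def load_ocr_events(path):
    """Load per-clip OCR detections, sorted by timestamp. Field names
    ("timestamp", "text", "bbox", "confidence") are assumptions for
    illustration, not the dataset's published schema."""
    with open(path) as f:
        return sorted(json.load(f), key=lambda r: r["timestamp"])

def detections_near(ocr_events, t, window=0.5):
    """Return OCR detections within +/- `window` seconds of video time `t`
    (seconds from clip start), relying on the shared timestamps."""
    times = [r["timestamp"] for r in ocr_events]
    lo = bisect.bisect_left(times, t - window)
    hi = bisect.bisect_right(times, t + window)
    return ocr_events[lo:hi]

# Example (hypothetical path): OCR text visible around the 12-second mark.
# events = load_ocr_events("clip_0001/ocr.json")
# print([d["text"] for d in detections_near(events, 12.0)])
```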

Deliverables

Files

  • MP4 egocentric POV video (various resolutions by device)
  • Extracted frames with hand/object visibility annotations
  • Synchronized audio tracks (WAV/MP3)
  • Conversation transcripts with timestamps
  • JSON files with recognized text, bounding boxes, confidence scores, and timestamps
  • Head motion sensor data (JSON/CSV)
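
For orientation, here is one way the per-clip deliverables could be grouped for downstream loading. The directory layout and file names (`clip.mp4`, `audio.wav`, `transcript.json`, `ocr.json`, `imu.csv`) are assumptions for illustration, not the dataset's published structure.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class ClipBundle:
    """Paths to the synchronized deliverables for one clip (names assumed)."""
    video: Path       # MP4 egocentric POV
    audio: Path       # synchronized audio track (WAV/MP3)
    transcript: Path  # conversation transcript with timestamps
    ocr: Path         # JSON: recognized text, bounding boxes, confidence, timestamps
    imu: Path         # head motion sensor data (JSON/CSV)

def bundle_clip(clip_dir: Path) -> ClipBundle:
    """Collect the expected files from a hypothetical per-clip directory."""
    return ClipBundle(
        video=clip_dir / "clip.mp4",
        audio=clip_dir / "audio.wav",
        transcript=clip_dir / "transcript.json",
        ocr=clip_dir / "ocr.json",
        imu=clip_dir / "imu.csv",
    )
```

A bundle like this keeps the timestamp-aligned files together so the alignment sketch above can be applied clip by clip.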

Labels

hand_visibility, object_detection, scene_classification, activity_labels, device_type
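
The label fields above lend themselves to clip selection before training. A short sketch follows, assuming a hypothetical `labels.csv` manifest with one row per clip and a `clip_id` column; the example label values are also assumptions.

```python
import csv

def select_clips(manifest_path, device_type=None, require_hands=False):
    """Yield clip IDs from a hypothetical labels manifest, optionally
    filtered by device_type and hand_visibility."""
    with open(manifest_path, newline="") as f:
        for row in csv.DictReader(f):
            if device_type and row.get("device_type") != device_type:
                continue
            if require_hands and row.get("hand_visibility") != "visible":
                continue
            yield row["clip_id"]

# Example (hypothetical file and values):
# ids = list(select_clips("labels.csv", device_type="solos", require_hands=True))
```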