The Hidden Revolution in Medical Documentation: How Smart Glasses Are Changing Healthcare Forever

Björn Runåker
7 min read · Jan 19, 2025

--

Smart glasses for doctors

Picture a hospital ward where doctors spend more time looking into patients’ eyes than typing on computers. Imagine a clinic where medical notes write themselves, leaving physicians free to focus entirely on patient care. This isn’t science fiction — it’s happening right now and transforming healthcare in ways we never thought possible.

The crisis in medical documentation is well-known but poorly understood. When doctors spend two hours documenting every hour of patient care, something has gone fundamentally wrong. But here’s what makes this story fascinating: the solution isn’t coming from where we expected. It’s emerging from an unlikely combination of enterprise-grade smart glasses and an innovative prototyping approach that’s turning traditional medical technology development on its head.

Think of it as a tale of two worlds colliding. On one side, we have the giants: Meta, Magic Leap, and Microsoft with its HoloLens, whose enterprise-grade smart glasses cost as much as a small car. On the other side, we have Brilliant Labs’ Frame glasses, the scrappy underdog that is changing how we approach medical innovation.

But here’s where it gets interesting: What if we could combine the best of both worlds? What if we could use affordable, hackable smart glasses to prototype revolutionary ideas and then scale the successful ones to enterprise-grade solutions? This approach isn’t just clever—it’s transforming how we think about medical technology development.

In this deep dive, we’ll explore the counterintuitive strategy that allows healthcare organizations to innovate faster and smarter than ever. We’ll examine real-world examples of this dual approach in action and the technical architecture that makes it possible. Most importantly, we’ll show you why this matters not just for healthcare providers but for everyone who has ever sat in a doctor’s office watching their physician type instead of talk.

The future of medical documentation is being written right now, and it does not look like what we expected. Let’s explore how this revolution unfolds, hands-on, by building our own affordable proof of concept!

Brilliant Labs’ Frame Glasses

The picture above shows the glasses with “Mr Power”, the charger, connected to them. You remove it, of course, before wearing the glasses. Notice the prism in the right eye: when you wear the glasses, it projects a few lines of text or an image in front of you.

A Step-by-Step Guide to the Medical Documentation Revolution

This guide walks you through getting started with Brilliant Labs’ Frame glasses on a Windows computer with an AI accelerator. If you try this, please join the discussion on GitHub.

To effectively utilize the Frame Assistant for medical documentation, follow these detailed steps based on the project’s README.md:

Set Up the Development Environment

Start by downloading the code from GitHub:
bjornrun/frame-assistant
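If you have Git installed, the quickest way to get the code is to clone the repository (the URL below is the standard GitHub address for bjornrun/frame-assistant):

git clone https://github.com/bjornrun/frame-assistant.git
cd frame-assistant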

Install Conda:

If Conda is not already installed on your system, download and install it from the official Miniconda or Anaconda websites. Follow the installation instructions specific to your operating system.

Create a Conda Environment:

Open a terminal or command prompt. Run the following command to create a new environment named frame-dev with Python 3.9:

conda create -n frame-dev python=3.9

Activate the newly created environment with:

conda activate frame-dev

Install Required Python Packages

With the frame-dev conda environment active, install the necessary packages:

pip install frame-sdk
pip install frame-utilities-for-python
pip install frameutils
pip install keyboard
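
With the packages installed, a quick import check from the active frame-dev environment confirms the basics are in place. This is a small sanity-check sketch, not part of the repository; the module names are assumed to match the package names:

# sanity_check.py: hypothetical helper, not part of the frame-assistant repo
import frame_sdk   # Brilliant Labs Frame SDK (pip install frame-sdk)
import keyboard    # global hotkey support used later for the spacebar/enter controls

print("frame_sdk and keyboard imported successfully")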

Download Whisper Standalone

The Frame Assistant uses the Whisper model for transcription. Download the Whisper standalone executable suitable for your operating system from the repository linked below.

Use the latest release from https://github.com/Purfview/whisper-standalone-win

Configure the Whisper Executable Path

The Frame Assistant needs to know the path to the Whisper executable. Update the configuration file or environment variable in the project to point to where you placed the Whisper executable.

This line must be changed:

WHISPER_EXE_PATH = r"C:\Users\bjorn\Downloads\Faster-Whisper-XXL_r239.1_windows\Faster-Whisper-XXL\faster-whisper-xxl.exe"
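
Hard-coding the path works, but a slightly more portable pattern is to read it from an environment variable with the hard-coded location as a fallback, and to fail early if the file is missing. A small sketch of that idea (the environment-variable name is just an example):

import os

# Prefer an environment variable so each machine can point at its own install;
# fall back to the hard-coded location used in the repository.
WHISPER_EXE_PATH = os.environ.get(
    "WHISPER_EXE_PATH",
    r"C:\Users\bjorn\Downloads\Faster-Whisper-XXL_r239.1_windows\Faster-Whisper-XXL\faster-whisper-xxl.exe",
)

# Fail early with a clear message if the executable is not where we expect it.
if not os.path.isfile(WHISPER_EXE_PATH):
    raise FileNotFoundError(f"Whisper executable not found at {WHISPER_EXE_PATH}")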

Connect the Frame glasses to your computer.

Ensure your Frame is charged. Then, with the charging cradle connected, use a SIM-card tool to press and hold the pairing button for 3 seconds. See the documentation for details.

On your laptop, go to Bluetooth settings and add a new device. Once it is paired, this is how it appears on my computer:

Bluetooth paired correctly

If you see this error when running the application, then you have not paired the glasses correctly:

device: Any = filtered_list[0][0]

IndexError: list index out of range
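
The IndexError above simply means the application found no paired Frame over Bluetooth. A quick way to debug this is to scan for nearby BLE devices with bleak, the library frame-sdk uses under the hood, and check whether the glasses show up at all. This is a diagnostic sketch, not part of the repository (the glasses normally advertise a name containing “Frame” while in pairing mode):

import asyncio
from bleak import BleakScanner

async def scan() -> None:
    # List every BLE device the adapter can currently see.
    devices = await BleakScanner.discover(timeout=5.0)
    for device in devices:
        print(device.address, device.name)

asyncio.run(scan())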

Run the Frame Assistant

  • With all configurations in place, you can start the Frame Assistant by running:
python main.py

The application should now be operational, allowing you to record audio and capture images hands-free, facilitating efficient medical documentation. In version 1, use the spacebar to record audio and the enter key to take a photo.
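
For context, this is roughly how such hotkeys can be wired up with the keyboard package installed earlier. The record_audio and take_photo callbacks below are hypothetical placeholders, not the actual functions in main.py:

import keyboard

def record_audio():
    # Placeholder: the real application starts an audio recording on the Frame here.
    print("Recording audio...")

def take_photo():
    # Placeholder: the real application triggers the Frame camera here.
    print("Taking a photo...")

# Spacebar records audio, Enter takes a photo, Esc ends the session.
keyboard.add_hotkey("space", record_audio)
keyboard.add_hotkey("enter", take_photo)
keyboard.wait("esc")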

The photos, audio, transcription, and speaker diarization are saved in an output directory. Here is an example transcript (yes, I am talking to myself):

[00:00.000 → 00:01.800] [SPEAKER_00]: Hello, what is your name?

[00:02.040 → 00:03.300] [SPEAKER_00]: My name is Björn.

[00:03.720 → 00:04.220] [SPEAKER_00]: Goodbye.
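
Because each transcript line follows the same pattern, the output is easy to post-process, for example to group utterances per speaker before handing them to a documentation system. A small parsing sketch, with the format inferred from the example above:

import re

# Matches lines like: [00:00.000 → 00:01.800] [SPEAKER_00]: Hello, what is your name?
LINE_RE = re.compile(
    r"\[(?P<start>[\d:.]+) → (?P<end>[\d:.]+)\] \[(?P<speaker>[^\]]+)\]: (?P<text>.*)"
)

def parse_transcript(lines):
    """Convert transcript lines into dicts with start, end, speaker, and text."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries

example = "[00:00.000 → 00:01.800] [SPEAKER_00]: Hello, what is your name?"
print(parse_transcript([example]))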

The first milliseconds of the audio contain static noise, but the transcription will still be correct.

Some static noise in the beginning
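
If the initial static ever does cause trouble, and assuming the audio is stored as plain WAV files (an assumption on my part), the first few hundred milliseconds can be trimmed with the standard library before transcription. A minimal sketch; the file names and the 200 ms figure are examples only:

import wave

def trim_wav(src: str, dst: str, skip_ms: int = 200) -> None:
    """Copy a WAV file, dropping the first skip_ms milliseconds of audio."""
    with wave.open(src, "rb") as in_wav:
        params = in_wav.getparams()
        skip_frames = int(in_wav.getframerate() * skip_ms / 1000)
        in_wav.readframes(skip_frames)  # discard the noisy start
        remaining = in_wav.readframes(in_wav.getnframes() - skip_frames)
    with wave.open(dst, "wb") as out_wav:
        out_wav.setparams(params)       # the header's frame count is fixed up on close
        out_wav.writeframes(remaining)

trim_wav("recording.wav", "recording_trimmed.wav")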

The photos, however, are not great yet; I’m told they will improve in a future firmware upgrade of the glasses:

Contributing

Contributions are welcome; the best way to get involved is through issues and pull requests on the GitHub repository.

License

  • The Frame Assistant is distributed under the MIT License. For more details, review the license information provided in the repository.

Preview: An Enhanced Medical Documentation Workflow with Streaming Audio and Video

In the next phase of this journey, we’ll take the proof of concept using Brilliant Labs’ Frame glasses to a whole new level. The upcoming iteration will focus on integrating streaming audio and video capabilities and expanding compatibility beyond smart glasses to include a variety of recording devices, such as webcams, smartphones, and other audio/video equipment. This enhanced version will create a seamless, scalable live transcription and documentation system.

Here’s a sneak peek at what you can expect in the follow-up article on Medium:

Expanding the Ecosystem: From Glasses to a Versatile Workflow

While the Brilliant Labs Frame glasses demonstrated the power of wearable technology for hands-free medical documentation, many medical professionals use other recording tools already available in their clinics, such as webcams, smartphones, or specialized medical imaging equipment. The next version of the system will integrate these tools to offer:

  • Live Transcription of Streaming Audio and Video: Real-time transcription of patient interactions or medical procedures.
  • Cross-Device Compatibility: Use any RTMP-compatible device (e.g., OBS Studio setups, smartphones) to stream to a central processing server.
  • Scalable and Modular Design: Built with modular components like GStreamer and modified whisper.cpp, making it adaptable to various clinical workflows.
  • A range of video-analytics tools that can be added to the pipeline.

Technology Stack and Features

The improved solution will combine the following technologies:

  1. OBS Studio for Video and Audio Capture
    OBS Studio will serve as a versatile video/audio recording tool. Medical professionals can set up OBS on a desktop or laptop to capture high-quality streams from webcams, smart glasses, or other devices.
  2. Simple RTMP Server for Streaming
    A lightweight RTMP server will be deployed to receive and manage live audio/video streams. This server will act as the bridge between the recording devices and the processing pipeline.
  3. GStreamer-Enhanced Whisper Transcription
    The whisper.cpp library has been modified to include a GStreamer plugin. This enhancement will enable real-time transcription of audio streams, providing immediate documentation of conversations or procedures.
  4. Multi-Modal Models for Real-Time Image Analysis
    Multi-modal models will be used to analyze captured images in real time.

How It Will Work

Here’s a high-level overview of the workflow:

Setup and Capture with OBS Studio:

  • Install and configure OBS Studio on your recording device.
  • Stream audio and video from smart glasses, webcams, or smartphones to the RTMP server.

Streaming to an RTMP Server:

  • Deploy a simple RTMP server locally or in the cloud to handle incoming streams.
  • Configure OBS or other compatible devices to stream to the server.

Transcription with GStreamer and Whisper:

  • Audio streams from the RTMP server will be piped into GStreamer, where the modified Whisper plugin will process the audio in real time (a sketch of such a pipeline follows below).
  • Transcriptions will be output to text files or fed directly into electronic health record (EHR) systems for instant documentation.
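
As a rough preview of that step, the kind of pipeline involved can already be sketched with gst-launch-1.0. The command below (example URL and file name) pulls an RTMP stream, decodes it, resamples the audio to the 16 kHz mono that Whisper expects, and writes it to a WAV file; in the full system, the final wavenc ! filesink stage would be replaced by the modified whisper.cpp GStreamer plugin:

gst-launch-1.0 rtmpsrc location="rtmp://localhost/live/stream" ! decodebin ! audioconvert ! audioresample ! "audio/x-raw,rate=16000,channels=1,format=S16LE" ! wavenc ! filesink location=capture.wav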

What’s Next?

In the follow-up article, we’ll provide a step-by-step guide for setting up this enhanced system, including:

  1. Installing and configuring OBS Studio for RTMP streaming.
  2. Deploying a lightweight RTMP server on your local machine or in the cloud.
  3. Modifying and using the enhanced whisper.cpp GStreamer plugin for live audio transcription.
  4. Best practices for adapting this workflow to various medical settings.

This next iteration will unlock new possibilities for medical documentation, making it even more accessible, flexible, and scalable. Whether you’re a tech-savvy practitioner or a developer looking to innovate in healthcare, this system will be an invaluable tool.

Stay tuned for the follow-up article, where we’ll take you through this exciting new chapter in smarter, faster, and more efficient clinical documentation. The future of medical workflow optimization is here, and it’s just getting started!
