[Research]: Explore solutions MQTT Publish camera, microphone and in-call state #1941
Conversation
Investigation for issue #1938 exploring approaches to publish camera, microphone, and in-call state to MQTT for home automation.

Key findings:
- WebRTC MediaStream monitoring recommended (proven pattern)
- Leverages existing disableAutogain.js technique
- Future enhancement with Graph API when #1832 is implemented
- Low maintenance, high stability using standard Web APIs

Recommended 2-phase approach:
1. WebRTC foundation (immediate, solves #1938)
2. Graph API enhancement (future, aligns with #1832)
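The WebRTC foundation can be illustrated with a minimal sketch. The `deriveCallState` helper and the IPC comment are hypothetical, not the project's actual code; the getUserMedia wrapping mirrors the interception technique the description attributes to disableAutogain.js.

```javascript
// Hypothetical sketch: derive the three published states from a stream's
// tracks. A track contributes to camera/microphone only while it is live
// and enabled; any live track at all means a call is in progress.
function deriveCallState(videoTracks, audioTracks) {
  const camera = videoTracks.some((t) => t.readyState === "live" && t.enabled);
  const microphone = audioTracks.some((t) => t.readyState === "live" && t.enabled);
  const inCall = [...videoTracks, ...audioTracks].some((t) => t.readyState === "live");
  return { camera, microphone, inCall };
}

// In the browser process this would hang off an intercepted getUserMedia,
// using the same wrapping pattern as disableAutogain.js:
if (typeof navigator !== "undefined" && navigator.mediaDevices) {
  const original = navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices);
  navigator.mediaDevices.getUserMedia = async (constraints) => {
    const stream = await original(constraints);
    const state = deriveCallState(stream.getVideoTracks(), stream.getAudioTracks());
    // publish `state` to the main process over IPC here
    return stream;
  };
}
```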
Summary of Changes (Gemini Code Assist): This pull request presents a thorough investigation into enhancing the existing MQTT integration to provide more detailed Teams status information for home automation. It proposes a two-phase solution: first, WebRTC MediaStream monitoring for accurate camera, microphone, and in-call states; second, a future enhancement leveraging the Microsoft Graph API for richer meeting context. The goal is to move beyond basic presence and enable more sophisticated automation based on actual device usage.
Code Review
This is a very well-written and thorough investigation document for adding extended MQTT status. The proposed two-phase approach using WebRTC and then enhancing with the Graph API is solid. The document clearly outlines the problem, different approaches, and a detailed implementation plan.
My review includes a few suggestions on the technical design, mainly focusing on the WebRTC monitoring logic and the example IPC handler implementation, to make the solution more robust and efficient. Overall, great work on this research.
Added critical section explaining interaction with existing screen sharing code that disables audio to prevent echo/feedback.

Key findings:
- Teams creates SEPARATE streams for calls vs screen sharing
- Screen sharing streams have audio disabled (injectedScreenSharing.js)
- Regular call streams (camera + mic) are NOT affected
- Solution: use the same detection logic to filter screen shares

Implementation approach:
- Detect screen share streams (chromeMediaSource === "desktop")
- Skip monitoring screen share streams (no audio tracks)
- Only monitor regular call streams for mic/camera state
- Both interceptors coexist without interference

Updated code examples to include isScreenShare filtering. Addresses concerns about issues #1871 and #1896.
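The filtering described above can be sketched as a small predicate on the getUserMedia constraints. The helper name is illustrative; the `chromeMediaSource === "desktop"` check is the detection logic the commit attributes to injectedScreenSharing.js, and it accounts for the Chromium-specific `mandatory` constraint wrapper.

```javascript
// Hypothetical helper: a desktop-capture (screen share) stream is requested
// with chromeMediaSource "desktop" in its video constraints, either at the
// top level or under the Chromium-specific `mandatory` block.
function isScreenShare(constraints) {
  const video = constraints && constraints.video;
  if (!video || typeof video !== "object") return false;
  const source =
    (video.mandatory && video.mandatory.chromeMediaSource) ||
    video.chromeMediaSource;
  return source === "desktop";
}
```

The monitoring interceptor would return early for streams where this is true, so screen-share streams (whose audio is disabled) never flip the published microphone state.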
CRITICAL FIX: MediaStreamTrack.enabled property changes do NOT fire mute/unmute events. Setting track.enabled is the standard way web apps control track state programmatically (e.g., UI button clicks, keyboard shortcuts).

Problem:
- Teams UI buttons likely use track.enabled = false/true
- This does NOT fire mute/unmute events
- Event-only monitoring would miss most state changes

Solution - hybrid monitoring approach:
1. Event listeners (mute/unmute/ended) - immediate response
2. Poll the track.enabled property at 500ms intervals - catch all changes
3. Clean up intervals when the track ends - prevent memory leaks

Why 500ms polling:
- Fast enough for human perception
- Low overhead (~4-6 checks/second for a typical call)
- Negligible CPU impact

Updated investigation document:
- Added "Critical: track.enabled vs mute/unmute Events" section
- Updated code examples with hybrid monitoring
- Added monitorTrackEnabled() helper function
- Updated implementation checklist
- Added to Open Questions with recommendation

Credit: issue identified by Gemini Code review
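The hybrid approach above can be sketched as follows. This is an illustrative shape, not the document's actual `monitorTrackEnabled()` helper; the 500ms default matches the interval discussed in the commit.

```javascript
// Hypothetical sketch of hybrid track monitoring: mute/unmute events give
// immediate response, while polling track.enabled catches programmatic
// changes (which fire no event). `stop` clears the interval so a finished
// track does not leak its timer.
function monitorTrack(track, onChange, intervalMs = 500) {
  let lastEnabled = track.enabled;
  const poll = () => {
    if (track.enabled !== lastEnabled) {
      lastEnabled = track.enabled;
      onChange({ enabled: track.enabled, via: "poll" });
    }
  };
  const timer = setInterval(poll, intervalMs);
  const stop = () => clearInterval(timer);
  if (typeof track.addEventListener === "function") {
    track.addEventListener("mute", () => onChange({ enabled: false, via: "event" }));
    track.addEventListener("unmute", () => onChange({ enabled: true, via: "event" }));
    track.addEventListener("ended", stop); // cleanup when the track ends
  }
  return { poll, stop };
}
```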
Address two critical implementation details for MQTT publishing:

1. MQTT payload format:
- MQTT payloads are strings/buffers, not JavaScript primitives
- Boolean values must be explicitly converted with String()
- Home Assistant expects "true"/"false" strings, not booleans
- Updated all examples to show String(data.camera) etc.

2. Publishing efficiency:
- Individual topic publishes are independent operations
- Use Promise.all() for parallel publishing (not sequential)
- Sequential: 200-400ms total latency (4x ~50-100ms)
- Parallel: 50-100ms total (all complete simultaneously)
- 3-4x performance improvement

Changes:
- Updated main process IPC handler with String() conversion
- Changed sequential await to Promise.all([...])
- Added new "MQTT Payload Format and Publishing Efficiency" section
- Updated MQTT topics section to clarify string format
- Updated implementation checklist with both requirements
- Added detailed performance comparison examples

Credit: issues identified by Gemini Code review
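Both details combine into a handler shaped roughly like this. The function name, topic names, and `mqttClient.publish` signature are assumptions for illustration, not the project's actual API.

```javascript
// Hypothetical main-process handler sketch: booleans are stringified
// (MQTT carries strings/buffers, and Home Assistant expects "true"/"false"),
// and the three topics are published in parallel with Promise.all rather
// than awaited one after another.
async function publishMediaStatus(mqttClient, data) {
  await Promise.all([
    mqttClient.publish("teams/camera", String(data.camera)),
    mqttClient.publish("teams/microphone", String(data.microphone)),
    mqttClient.publish("teams/in-call", String(data.inCall)),
  ]);
}
```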
Focused the investigation on what the user ACTUALLY requested:
- Camera on/off (red LED)
- Microphone on/off (orange LED)
- In-call state (yellow LED)

REMOVED (over-engineering):
- "Approach 2: Graph API Enhancement" - user didn't ask for this
- "Approach 3: DOM Fallback" - unnecessary complexity
- Phase 1 and Phase 2 - just one simple implementation
- Full-state JSON topic - user just needs three boolean topics
- Excessive publishing efficiency discussion
- Multiple "Open Questions" - simplified to essential details

KEPT (essential):
- WebRTC stream monitoring (single, focused solution)
- Hybrid track monitoring (events + track.enabled polling)
- Screen sharing filtering (necessary for existing code)
- MQTT string conversion (necessary)
- Three simple topics: camera, microphone, in-call
- Implementation checklist
- Testing steps

ADDED:
- "Future Expansion Opportunities" section (brief, at end)
- Clear path to add Graph API later WITHOUT refactoring
- Home Assistant example for the user's use case

Result: the document went from ~400 lines with 3 approaches and 2 phases to ~240 lines with 1 clear solution. It delivers exactly what the user wants, nothing more, with an expansion path for the future if needed.
Created a testing spike to verify critical assumptions before implementation.

What we're testing:
1. getUserMedia interception works alongside injectedScreenSharing.js
2. Teams uses track.enabled (not mute events) for UI buttons
3. Screen sharing detection logic correctly identifies streams
4. Track state changes are detectable via polling

Files added:
- app/browser/tools/mqttExtendedStatusSpike.js
  - Temporary verification tool (DO NOT USE IN PRODUCTION)
  - Intercepts getUserMedia calls
  - Logs all track state changes (enabled, muted, readyState)
  - Polls track.enabled every 500ms to catch UI button clicks
  - Tests screen sharing detection logic
- MQTT_EXTENDED_STATUS_SPIKE_TESTING.md
  - Comprehensive testing guide with 7 test scenarios
  - Expected outputs for each test
  - Results checklist
  - Decision framework for next steps
- Updated app/browser/preload.js to load the spike module

How to use:
1. Add "mqttExtendedStatusSpike": true to config.json
2. Run npm start
3. Open the DevTools console
4. Follow the testing guide in MQTT_EXTENDED_STATUS_SPIKE_TESTING.md
5. Join a test call and observe [MQTT_SPIKE] logs

CRITICAL TEST: Test 3 (Toggle Microphone) will tell us whether Teams uses:
- track.enabled = false (our assumption) → will see "PROPERTY CHANGE"
- mute/unmute events (alternative) → will see "EVENT"

This determines whether our hybrid approach is correct.

Next: run spike tests, document results, then implement the full solution.
Replace .forEach() with for...of loops to match project code style.

Changes:
- stream.getVideoTracks().forEach() → for...of videoTracks.entries()
- stream.getAudioTracks().forEach() → for...of audioTracks.entries()

Maintains the index for track naming (camera-0, microphone-0) while following the project's ESLint preferences.
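The style change can be sketched like this; the function and `label` field are hypothetical, but `Array.prototype.entries()` is what preserves the index that the old `.forEach()` callback provided.

```javascript
// Illustrative sketch: for...of over entries() keeps the index needed for
// names like camera-0 / microphone-0 while matching the project's
// preference for for...of over .forEach().
function nameTracks(videoTracks, audioTracks) {
  const names = [];
  for (const [index, track] of videoTracks.entries()) {
    names.push(`camera-${index}: ${track.label}`);
  }
  for (const [index, track] of audioTracks.entries()) {
    names.push(`microphone-${index}: ${track.label}`);
  }
  return names;
}
```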
Comprehensive analysis of how extended status integrates with existing
MQTT architecture using YAGNI and KISS principles.
KEY FINDINGS:
Current Architecture:
- Browser: mqttStatusMonitor.js → IPC user-status-changed
- Main: userStatusChangedHandler() → mqttClient.publishStatus()
- MQTTClient has single method publishStatus() for presence only
RECOMMENDED APPROACH (KISS + YAGNI):
1. Add Generic publish() Method:
- publish(topic, payload, options)
- Supports strings, objects (auto-stringify)
- Deduplication with optional key
- Retain and QoS options
- Backward compatible (existing publishStatus() unchanged)
2. Separate IPC Channel:
- Keep 'user-status-changed' for presence
- Add 'mqtt-extended-status-changed' for camera/mic/call
- Clear separation of concerns (no conditional logic)
3. Nested Config Structure:
- mqtt.presence { enabled, topic, checkInterval }
- mqtt.extendedStatus { enabled, topics { camera, microphone, inCall } }
- Backward compatible with migration
REJECTED APPROACHES:
❌ Specialized methods (publishCameraState, etc) - method explosion
❌ Event publisher/adapter pattern - over-engineered
❌ Flat config - too messy
❌ Reused IPC channel - mixed concerns
REQUIRED SPIKES:
Spike 1: Generic publish() method (verify no breaking changes)
Spike 2: IPC integration pattern (verify messages reach broker)
Spike 3: Config backward compatibility (verify migration works)
Next: Run 3 integration spikes before implementing production code
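The generic publish() method proposed above might look roughly like this. The class name, the `lastPayloads` dedup map, and the underlying client's `publish(topic, payload, options)` signature (node-mqtt style) are assumptions for illustration, not the project's implementation.

```javascript
// Hypothetical sketch of a generic publish() wrapper: objects are
// auto-stringified, an optional dedupKey suppresses republishing an
// unchanged payload, and retain/QoS pass through to the broker client.
// The existing publishStatus() would remain untouched for compatibility.
class MQTTPublisher {
  constructor(client) {
    this.client = client;
    this.lastPayloads = new Map(); // dedupKey -> last payload sent
  }

  publish(topic, payload, { retain = false, qos = 0, dedupKey } = {}) {
    const body = typeof payload === "string" ? payload : JSON.stringify(payload);
    if (dedupKey) {
      if (this.lastPayloads.get(dedupKey) === body) return false; // unchanged, skip
      this.lastPayloads.set(dedupKey, body);
    }
    this.client.publish(topic, body, { retain, qos });
    return true;
  }
}
```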
Changed config structure from generic "extendedStatus" to clear
semantic categories: camera, microphone, call.
PROBLEM WITH "EXTENDED":
- "Extended" is a technical term, not a semantic category
- Just means "stuff we added later"
- Unclear what "extended" contains
- Hard to understand at a glance
NEW SEMANTIC APPROACH:
{
"mqtt": {
"presence": { enabled, topic }, // Availability status
"camera": { enabled, topic }, // Video device
"microphone": { enabled, topic }, // Audio device
"call": { enabled, topic } // Call state
}
}
BENEFITS:
✅ Clear what each setting controls (no guessing)
✅ Each independently configurable
✅ Self-documenting (camera = camera, not "extended field 1")
✅ Flat structure (simpler than nested groupings)
✅ Easy to add new categories (screen, recording, etc.)
IPC CHANNEL:
- Renamed: mqtt-extended-status-changed → mqtt-media-status-changed
- Rationale: Single IPC for all stream data (camera/mic/call detected together)
- Handler selectively publishes based on config
FILES UPDATED:
- MQTT_INTEGRATION_ARCHITECTURE_ANALYSIS.md
- docs-site/docs/development/research/mqtt-extended-status-investigation.md
This is more KISS-compliant: each thing is named for what it is.
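The single mqtt-media-status-changed channel plus per-category config can be sketched as a small main-process handler. The function name and topic values are illustrative; the config shape matches the semantic structure shown above.

```javascript
// Hypothetical sketch: one IPC message carries camera, microphone and call
// state together (they are detected from the same stream), and the handler
// publishes only the categories enabled in config, each to its own topic.
function handleMediaStatus(config, publish, data) {
  const published = [];
  for (const key of ["camera", "microphone", "call"]) {
    const section = config.mqtt && config.mqtt[key];
    if (section && section.enabled) {
      publish(section.topic, String(data[key])); // MQTT wants string payloads
      published.push(key);
    }
  }
  return published;
}
```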
Document how the semantic category pattern scales to future use cases (notifications, calendar, messages, recording, reactions, etc.).

FUTURE CATEGORIES ANALYZED:

Messages & Notifications:
- messageCount, newMessage, mentions
- Detection: DOM title monitoring (already exists!)
- Topics: teams/messages/unread, teams/messages/mentions

Calendar & Meetings:
- nextMeeting, meetingStarting, meetingDuration
- Detection: Graph API (wait for #1832) or DOM
- Topics: teams/calendar/next (with JSON payload)

Screen Sharing:
- screenSharing
- Detection: IPC events (already implemented!)
- Topics: teams/screen-sharing

Reactions & Engagement:
- handRaised, reactions
- Detection: DOM monitoring
- Topics: teams/hand-raised, teams/reactions/latest

Recording & Privacy:
- recording, transcription
- Detection: DOM (recording indicator)
- Topics: teams/recording (privacy use case)

Participants:
- participantCount
- Detection: DOM roster panel
- Topics: teams/participants/count

THE PATTERN SCALES WELL:
✅ Each category = what it represents (not grouped by tech)
✅ Independently configurable (enable what you need)
✅ Self-documenting (camera = camera, not "media field 1")
✅ Privacy-friendly (opt-in per category)
✅ Generic publish() supports all future categories

IMPLEMENTATION PRIORITY:
- High: screen sharing, message count (easy - already detected)
- Medium: calendar (wait for Graph API #1832), recording
- Low: reactions, participant count (wait for user requests)

DETECTION STRATEGY: prefer stable APIs (WebRTC, IPC, Graph) over fragile DOM scraping. Add features ONLY when users request them (YAGNI).

This validates our semantic category decision - it scales!
Consolidated the MQTT semantic categories expansion roadmap into the main mqtt-extended-status-investigation.md research document.

CHANGES:
- Removed the separate MQTT_SEMANTIC_CATEGORIES_EXPANSION.md file
- Expanded the "Future Expansion Opportunities" section with comprehensive details:
  - Messages & Notifications (messageCount, newMessage, mentions)
  - Calendar & Meetings (nextMeeting, meetingStarting)
  - Screen Sharing (screenSharing)
  - Recording & Privacy (recording, transcription)
  - Reactions & Engagement (handRaised, reactions)
  - Participant Count (participantCount)
- Added a detection strategy table showing fragility levels
- Added implementation priority guidance (YAGNI)
- Added rationale for how semantic categories scale

BENEFITS:
✅ All research in one place (easier to find)
✅ Comprehensive future planning (but YAGNI - don't build yet)
✅ Clear implementation priorities (stable APIs first)
✅ Validates the semantic category decision (it scales!)

Keep all research for a feature together, not scattered.