Customizable Context Menu for PrintScreen (Merge "Text Extractor" + "Snipping Tool" + "Translate Text" + "Ask AI About..." + Etc) #25197

Open
@mdrejhon

Description

Customizable Context Menu for PrintScreen

Audience: Mainstream AND Advanced

  1. mainstream use cases
    e.g. everyday preinstalled menu items such as "Text Extractor" and "Translate" and "Magnify" etc

  2. advanced/niche use cases
    e.g. advanced users optionally adding extra menu items to PrintScreen context menu via context menu editor

Superset of both "Text Extractor" / "Snipping Tool"

Possible mockups of PrintScreen context menu (or preferred screen capture hotkey) that appears immediately after selecting rectangle:

image or image

Proposed menu would be optionally customizable by advanced users. Hotkeys are displayed as part of the menu, as a kind of a cheat sheet, for users who want to skip the context menu next time.

Proposed menu customizability by advanced users can be initially via registry/configuration, and later via easy menu editor utility (in Settings).

Customizability covers special user-specific needs such as "Copy as Text (No Line Feeds)" vs "Copy as RTF" vs "Run LaTeX-OCR" vs "Translate to French", allowing advanced users to streamline their workflows, with the use of your desired third party image-processor etc.

NOTE: This could be either (A) a new PowerToy or (B) a modification to Text Extractor / Snipping Tool. I don't know if this is a "New PowerToy" or a "Modified PowerToy" idea, or a "New Parent PowerToy to control Similar Child PowerToys" system. But I think this is such a dramatic feature request that it deserves to be a "New PowerToy", which might be a fork of the Text Extractor codebase but still chains to the existing versions of Text Extractor and Snipping Tool. Alternatively, if it is a modification of Text Extractor, it could be an optional context menu that is activated in the Text Extractor Settings.


Long Description

Copying any onscreen text

  1. Select rectangle
  2. Optional context menu pops up (with default "Copy as text..." already selected)
  3. Hit Enter (or click on "Copy as text...")

Translating any onscreen text

  1. Select rectangle
  2. Optional context menu pops up (including a "Translate to..." option)
  3. You choose "Translate to..." option
  4. An easy translator UI appears (or autolaunch of user-specified URL such as Google Translate) and the translated text appears in another pane (with one-button copy-to-clipboard)

Snipping Tool Integration

  1. Select rectangle
  2. Optional context menu pops up
  3. Select "Copy as image."
  4. Upon selection, automatically launches Snipping Tool with image already copied

These would be the obvious common ones. Possibly more niche context menu items (that can be shown/hidden from menu) can be added later:

Casual screenreading

  1. Select rectangle
  2. Optional context menu pops up
  3. Select "Speak Text"
  4. This is useful for both assistive and non-assistive -- such as dyslexia and eye-off-screen situations, like playing a web HOWTO while trying to repair something (some people work better this way)

Artificial Intelligence Integration, once they accept images

  1. Select rectangle
  2. Optional context menu pops up
  3. Select "Ask Bing Chat About..." or "Ask ChatGPT About..."
  4. I would be asked to type/speak a query. Possible useful queries could be:
  • "...Where did this image originally come from?..."
  • "...Is this fact true?..."
  • "...How do I make this text bigger in the menus of this specific app?..."
  • "...What's the best way for me to test this shader example...?"
  • "...I don't understand this strange command line error, do you know why is this happening?..."
  • "...This popup error is new. Is there a security issue?..."
  • "...Explain this command line compiler error..."
  • "...Please change the color the background to purple, and put a teddy bear on top of this application window, and a Happy Birthday message on it, I want to surprise my kid with a fancied up version of this screenshot of this new game download..."

Keep in mind that AI capable of understanding screenshots is already working in the laboratory:

image

One Main Hotkey for all rectangle selection features

I include Snipping Tool because I hate memorizing too many hotkeys for similar-function behaviors.
It is easier to have one main hotkey (e.g. the aptly named PrintScreen) for Snipping Tool, Translate, and Text Extractor.
Could be plugin-API capable, in theory.

Avoids Disruption To Existing Users: Context menu can still have "hotkeys" as reference sheet

A context menu is an intuitive reference manual for helping remember additional hotkeys for more frequently used functions!

  • Full keypress shortcut to bypass the context menu next time (Shift+Win+T for Text Extractor, Shift+Win+S for Snipping Tool)
  • Or underscored letter in a menu (e.g. the G in "Copy as Graphics");

Here are possible mock-ups of the popup context menu, with the default item already selected (activates on hitting Enter):

image

image

Scenarios when this would be used?

There are tons of use cases you can imagine -- but one use case I am missing is Translate. I list multiple scenarios below:

Instant Translation of Onscreen Text

AI translators are great nowadays.
Digital nomadism has boomed.
More WFH, more people working in different countries. I am a Canadian who also has a "Work from home" office at my winter home in Mexico. With that comes increased demand for easier integrated translation that works with any app that doesn't have translation.

In addition, I am a deaf person who uses text chat more often than audio. I chat with many people in multiple languages.
In addition, I now remotely work for multiple clients who are in Taiwan, Korea, and various parts of Europe.

Consequently, I often have to use Text Extractor + Google Translate, in a somewhat cumbersome way.

iPadOS Already Has a de facto Translate Screenshot Feature

The nice iPadOS Live Text button appears when you screenshot, and upon selecting text, has a built-in Translate context menu. On my iPad, I can screenshot any screen, and it pops up the Live Text feature that has a Translate context menu!! I already use this Apple iOS feature all the time in my chatting apps, e.g. chatting to friends who write me in Spanish, etc.

(P.S. As a deaf person, it's very hard for me to learn new spoken languages, and using the Translate Screenshot feature in any chat app without built-in translation is highly self-educational, a sort of accidental "immersion learning".)

Text Extractor is amazing!

It would be more amazing if I could instantly choose what I wanted to do, including Translate. I almost wonder, why isn't this a Windows PowerToy already? Wink, wink...

Currently, the onerous Windows workflow is that I have to open Google Translate, paste into Google Translate, manipulate the website, then sometimes copy back into chat. It would be nice to be able to skip exiting the chat app, for translating both other people's texts and my own -- in all chat apps that don't support translation. Whether it be a multi-country business text chat, or a personal WhatsApp chat with a Mexican friend.

Consistent with Precedent of the old Right Click context menu

The famous right-click context menu. Cut, Copy, Paste, etc.

Except this PowerToy is a context menu specifically for doing something smart with a screen crop rectangle.

I suspect that this hotkey could in theory become so useful to power users that some people may actually assign the PrintScreen key to it. PrintScreen might in theory become the universal "do something with this screenshotted rectangle" context menu of the future -- it's already part of Snipping Tool, so it's a natural hotkey.

This idea wasn't practical before, but it is today

This was not practical in the past, until AI came along to do AI-based OCR (which Text Extractor uses), AI-based translation (now good enough for chat), etc.

So what wasn't a good idea in the past is now a (possibly) extremely fantastic idea today. AI-OCR, AI-translate, AI-speech, Ask-AI, etc.

Some mainstream assistive / accessibility potential too

Rationale for why I added "Speak Text" and "Magnify Text" as possible everyday occasional-use features that many can appreciate:

People who struggle to read tiny text (e.g. grandma/grandpa, or me when I grow a bit older) are 100x+ more common than the fully blind. This is the kind of text that appears on the screen in some random app only once every few hours.

Assistive features ("Speak Text", "Magnify") can be added to the context menu for the partially vision impaired, such as older computer geeks who can't read tiny text but can easily aim a rectangle around it. Yes, our Gen-Xer programmer eyes are, alas, aging -- and situations can pop up where something nonzoomable appears on screen that we wish we could read better. You know, that one time that tiny text appears in one of our apps, like a tiny settings screen of some fancy card game app we just installed that didn't respect the DPI zoom setting, and then...

Or maybe we're trying to troubleshoot a problem with some object in our hands, and want to just listen to a screen reader of a few paragraphs from a repair HOWTO. Even if we don't have dyslexia (though that helps; some people listen 5x faster than read text).

Not all accessibility features need to be blatantly accessibility features -- much like iPhone vibrate mode is a mainstreamed notification accessibility feature for the deaf -- it's an everyday thing now that doesn't need an accessibility logo.

Yes -- it can still be called an accessibility feature, but it's a "full time accessibility feature that stays out of the way and doesn't interfere with users who don't need the accessibility feature" -- much like the vibrate mode of a modern smartphone, fantastically useful for the deaf but not thought of as an accessibility feature anymore.

A full time screen reader is very annoying to mainstream users (creates annoying visible behaviors), so it's never used by many of us even though we occasionally wanted it. On the other hand, a part-time screen reader is just like a "DIY Audiobook" spontaneous convenience -- like using a phone's vibrate mode instead of the ringer.

Theoretically, one asks oneself: why pigeonhole all accessibility features to only accessibility? Certain features like "Speak Text" are useful only sometimes, when we're just looking away from the monitor during some simple eyes-off-the-screen activity like textbook study or object repair. Many listen to music while studying textbooks, and love audiobooks, so drawing a screen rectangle and selecting "Speak Text" is an unobtrusive convenience feature for everyday users!

Or semi-accessibility needs. The times where 98% of the time we are just about able to read the screen without glasses, but the glasses are downstairs... Etc. We don't always want a full time screen magnifier whose hotkey we often forget, or to get annoyed by an accidental activation of an accessibility hotkey that doesn't have an obvious cancel feature (whereas a "Speak Text" window would have a very clear Cancel button). The UX workflow is much less obtrusive to mainstream users (while still providing accessibility) through my suggestion. Y'know, like a smartphone vibrate feature isn't an "accessibility-only feature" anymore these days. We want one easy-to-remember hotkey that has all the important helper features. Especially our agin' computer geek brains, y'know.

Every need of each person is different; but a universal context menu for "do something about this rectangle" -- is probably pretty darn useful. Now you're getting the idea!

Potential / Suggested Method of Configurability: URI system

(including local URIs to installed apps, or URIs to websites).

This is just a suggestion that could actually turn this into a very easy-to-create PowerToy that is highly configurable by advanced users who want to deviate from the standard/included context menu.

Although more integration is ideal, keeping this tool simple may require some thought on the correct kind of API to use in this situation. This could be a custom configurable context menu, with editable text + editable URI + editable menu hotkey (e.g. the underscored letter in a pulldown menu);

Defaults (Copy as text, Copy as image, Translate) can be preinstalled when users download and install this PowerToy. But the menu would also be driven by a modifiable configuration (on disk, in registry, etc) that more advanced tweakers can edit to add/remove context menu items and/or change the default selected context menu item (the one that executes on hitting Enter). A rudimentary "Configure Context Menu" entry could sit at the bottom of the context menu (simple menu reorder / show menu item / hide menu item), since users may submit popular context menu items to be added to future versions of the PowerToy that other users might want to hide from the context menu.
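To make the configuration idea a bit more concrete, here is a minimal sketch in Python of what such a menu configuration and its loader might look like. Everything in it is an assumption for illustration only: the JSON layout, the field names (`id`, `label`, `uri`, `visible`, `default`), the `builtin:` pseudo-URIs and the `translate.example` address are all hypothetical. The `&` marker for the underscored menu letter follows the usual Windows menu convention.

```python
import json
from dataclasses import dataclass

# Hypothetical configuration; the layout and every field name are assumptions.
CONFIG_JSON = """
{
  "default": "copy-text",
  "items": [
    {"id": "copy-text",  "label": "Copy as &text",    "uri": "builtin:copy-text", "visible": true},
    {"id": "copy-image", "label": "Copy as &image",   "uri": "ms-screenclip:",    "visible": true},
    {"id": "translate",  "label": "T&ranslate to...", "uri": "https://translate.example/?q={$text}", "visible": true},
    {"id": "speak",      "label": "&Speak text",      "uri": "builtin:speak",     "visible": false}
  ]
}
"""


@dataclass
class MenuItem:
    id: str
    label: str
    uri: str
    visible: bool = True


def load_menu(config_text):
    """Return (visible_items, default_index) from a JSON config string."""
    config = json.loads(config_text)
    items = [MenuItem(**raw) for raw in config["items"]]
    visible = [item for item in items if item.visible]
    # Fall back to the first visible item if the configured default is hidden or missing.
    default_index = next(
        (i for i, item in enumerate(visible) if item.id == config.get("default")),
        0,
    )
    return visible, default_index


def mnemonic(label):
    """Return the lowercase letter that follows '&' in a label, or None."""
    pos = label.find("&")
    if pos == -1 or pos + 1 >= len(label):
        return None
    return label[pos + 1].lower()
```

Hidden items stay in the file, so "show menu item / hide menu item" is just a flag flip, and reordering is just reordering the list.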

  • Most Translate websites support URI formats (e.g. launch Google Translate)
  • Snipping Tool supports the ms-screenclip: URI (see https://learn.microsoft.com/en-us/windows/uwp/launch-resume/launch-screen-snipping ...)
  • Bing Chat AI may be able to support URI to start the chat (once Bing Chat gets image-recognition support).
  • Other URIs can be added by advanced end users
  • The most popular URIs can be included in this extended menu system.
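As a rough illustration of the "Translate website via URI" bullet, a menu item could build a URL from the OCR'd text and hand it to the default browser. This is only a sketch: the query parameter names (`sl`, `tl`, `text`, `op`) are an assumption based on the format the Google Translate web page has accepted, and they are not a documented, stable API.

```python
from urllib.parse import urlencode


def translate_url(text, target_lang, source_lang="auto"):
    """Build a translate-website URL for the given text (parameter names are assumed)."""
    query = urlencode({
        "sl": source_lang,   # source language, "auto" to detect
        "tl": target_lang,   # target language code, e.g. "fr"
        "text": text,        # the OCR'd text, URL-encoded by urlencode
        "op": "translate",
    })
    return "https://translate.google.com/?" + query
```

The resulting URL could then be opened with the standard library's `webbrowser.open(translate_url(ocr_text, "fr"))`, which launches the user's default browser.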

For anything that does not support a URI, some geek can create the appropriate "glue" app as needed, and provide a URI to that app.

The configurability should in theory be flexible; e.g. a menu item that just copies the image to the clipboard immediately, or one that launches directly into Snipping Tool (with the image already showing).

The URI (either website or command line) could include template inserts for arguments for both text and images: the OCR'd text and the path to the screen crop, such as {$text} and {$imagepath}. Preferably both are made available to context menu editors, to let us brainstorm how to create our dream context menu. Some websites might need the image sent via an HTTP POST, but I'm not sure of the best way to provide such configurability, so let's start simple with just an image path, and let app authors create the necessary glue apps to POST the image to a URL (e.g. an AI that understands image+text, for the particular use case of asking an AI about the content of the screenshot crop).
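A minimal sketch of how the {$text} and {$imagepath} inserts might be expanded, in Python. The function name `expand_template` and the rule "URL-encode values only for http(s) templates" are my own assumptions, and `glue.exe` in the usage note is a hypothetical glue app, not a real tool.

```python
import re
from urllib.parse import quote

# Matches the two proposed inserts: {$text} and {$imagepath}
PLACEHOLDER = re.compile(r"\{\$(text|imagepath)\}")


def expand_template(template, text, image_path):
    """Fill {$text} and {$imagepath} in a URI or command-line template."""
    values = {"text": text, "imagepath": image_path}
    # Assumption: web URLs need percent-encoding, local command lines do not.
    is_web = template.lower().startswith(("http://", "https://"))

    def substitute(match):
        value = values[match.group(1)]
        return quote(value, safe="") if is_web else value

    # A single pass, so inserted text is never re-scanned for placeholders.
    return PLACEHOLDER.sub(substitute, template)
```

For example, `expand_template('glue.exe --image "{$imagepath}"', ocr_text, crop_path)` would produce a command line for a local glue app. Note that this sketch does not do any shell quoting of the OCR'd text, which a real implementation would need before running a command line.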

For those URIs using {$imagepath} ... When launching a local app or website that needs the image, the crop would autosave instantly to disk upon selection of screen crop image -- for the associated app/website of the selected context menu item.
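The autosave behaviour could be as simple as the following Python sketch: the crop is only written to disk when the selected menu item's template actually contains {$imagepath}. The function name and the choice of a temp-directory PNG file are assumptions; the PNG bytes are presumed to come from whatever capture code already exists.

```python
import os
import tempfile


def image_path_if_needed(uri_template, png_bytes):
    """Save the crop to a temp file only when the template uses {$imagepath}."""
    if "{$imagepath}" not in uri_template:
        return None  # text-only menu item: nothing is written to disk
    fd, path = tempfile.mkstemp(prefix="crop-", suffix=".png")
    with os.fdopen(fd, "wb") as handle:
        handle.write(png_bytes)
    return path
```

Text-only items such as "Copy as text" would then never touch the disk, and cleanup of the temporary file is left to the caller.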


Labels: Area-Context Menu, Idea-New PowerToy, Needs-Triage
