Feature request: Add a settings page for managing installed OCR languages #200

Open
@gexgd0419

Motivation

Some users may be confused when they see so few available OCR languages and have no idea how to install a new one.

The official way to install a new OCR language is to go to the Language page in Windows Settings and add the language to the Preferred languages list.

However, this method has several downsides.

  • Not every installable language is a valid OCR language. In other words, not every language supported by Windows is supported by the Windows OCR engine. On my system there are only 35 valid OCR languages, but "Preferred languages" offers far more installable languages than that.
  • Changing the installed OCR languages can require a reboot to take effect, but Windows Settings won't tell you this. So a newly installed language may not show up in the available OCR languages until you happen to reboot your machine.
  • Each installed "preferred language" can add a new item to your input language/IME list, the list that comes up when you press Win+Space. If you switch input languages often, having many extra entries just to be able to OCR those languages is annoying. However, you cannot remove a "preferred language" without also removing its corresponding OCR language.

So I think a separate settings page for managing only OCR languages would be helpful, but unfortunately Windows Settings doesn't have such a page. As Text Grab is a tool that relies on this perhaps not-so-well-known feature of Windows, including such a settings page would be appreciated.

How to manage installed OCR languages using PowerShell

I'm not sure how to do this in C# for now, but here's some PowerShell code.
Note that all of the following operations require elevated (administrator) privileges.
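
Since everything below needs elevation, it may help to check for it up front. This is a minimal sketch using the standard .NET `WindowsPrincipal` check; the function name `Test-IsElevated` is made up for illustration and is not part of any module.

```powershell
# Hypothetical helper: returns $true when the current session
# runs with administrator rights (Windows only).
function Test-IsElevated {
    $identity  = [Security.Principal.WindowsIdentity]::GetCurrent()
    $principal = New-Object Security.Principal.WindowsPrincipal($identity)
    return $principal.IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)
}
```

For example, `if (-not (Test-IsElevated)) { Write-Warning 'Please run this from an elevated PowerShell window.' }` before calling any of the capability cmdlets.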

Get a list of all valid OCR languages on your system

Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~")}

It returns the OCR-related Windows capabilities, one for each installable OCR language.

PowerShell output on my system
Name  : Language.OCR~~~ar-SA~0.0.1.0
State : Installed

Name  : Language.OCR~~~bg-BG~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~bs-LATN-BA~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~cs-CZ~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~da-DK~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~de-DE~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~el-GR~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~en-GB~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~en-US~0.0.1.0
State : Installed

Name  : Language.OCR~~~es-ES~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~es-MX~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~fi-FI~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~fr-CA~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~fr-FR~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~hr-HR~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~hu-HU~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~it-IT~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~ja-JP~0.0.1.0
State : Installed

Name  : Language.OCR~~~ko-KR~0.0.1.0
State : Installed

Name  : Language.OCR~~~nb-NO~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~nl-NL~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~pl-PL~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~pt-BR~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~pt-PT~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~ro-RO~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~ru-RU~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~sk-SK~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~sl-SI~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~sr-CYRL-RS~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~sr-LATN-RS~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~sv-SE~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~tr-TR~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~zh-CN~0.0.1.0
State : Installed

Name  : Language.OCR~~~zh-HK~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~zh-TW~0.0.1.0
State : NotPresent

Get a list of only installed OCR languages on your system

Just filter the items by their State property value.

Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~") -and $_.State -eq [Microsoft.Dism.Commands.PackageFeatureState]::Installed}

Get the corresponding capability of a specific OCR language

All OCR-related capabilities have names like Language.OCR~~~<language>~<version>, so you can change the filter criterion to just match a single capability. An example would be:

Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~zh-CN")}

Then you can pass it to Add-WindowsCapability or Remove-WindowsCapability to install/uninstall the OCR language.
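
If a settings page needs to show plain language tags rather than capability names, the naming pattern above can be taken apart with a regular expression. A rough sketch, assuming every name follows the Language.OCR~~~<language>~<version> shape seen in my output; `ConvertFrom-OcrCapabilityName` is a hypothetical helper name.

```powershell
# Hypothetical helper: split a capability name such as
# "Language.OCR~~~zh-CN~0.0.1.0" into its language tag and version.
# Returns nothing when the name does not follow that pattern.
function ConvertFrom-OcrCapabilityName {
    param([Parameter(Mandatory)][string]$Name)

    if ($Name -match '^Language\.OCR~~~(?<tag>[^~]+)~(?<version>.+)$') {
        [pscustomobject]@{
            Language = $Matches['tag']
            Version  = $Matches['version']
        }
    }
}

# Example with one of the names listed earlier in this issue.
ConvertFrom-OcrCapabilityName 'Language.OCR~~~zh-CN~0.0.1.0'
```

The Language value could then be passed to something like `[System.Globalization.CultureInfo]::GetCultureInfo()` to get a display name, though I haven't checked that every OCR tag resolves that way.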

Install an OCR language

For example zh-CN:

Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~zh-CN")} | Add-WindowsCapability -Online

The returned object tells you whether a restart is needed (see its RestartNeeded property).
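
To act on that result instead of just reading it off the console, the install can be wrapped in a small function. This is only a sketch: `Install-OcrLanguage` is a name invented here, it assumes an elevated session, and it assumes the object returned by Add-WindowsCapability exposes RestartNeeded as it does on my system.

```powershell
# Hypothetical helper: install one OCR language and warn when a reboot is pending.
# Requires an elevated session on Windows.
function Install-OcrLanguage {
    param([Parameter(Mandatory)][string]$LanguageTag)

    # -like is case-insensitive, so "zh-cn" also finds "zh-CN".
    # The trailing "~" keeps the match to exactly one language tag.
    $capability = Get-WindowsCapability -Online |
        Where-Object { $_.Name -like "Language.OCR~~~${LanguageTag}~*" }

    if (-not $capability) {
        throw "No OCR capability found for '$LanguageTag'."
    }

    $result = $capability | Add-WindowsCapability -Online
    if ($result.RestartNeeded) {
        Write-Warning "Restart Windows before expecting '$LanguageTag' to show up as an OCR language."
    }
    $result
}
```

Usage would be `Install-OcrLanguage zh-CN`. A settings page in Text Grab could surface the same warning in its UI.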

Uninstall an OCR language

For example zh-CN:

Get-WindowsCapability -Online | where {$_.Name.StartsWith("Language.OCR~~~zh-CN")} | Remove-WindowsCapability -Online

How those can be used in Text Grab

Of course, you can try invoking PowerShell from within the program.

As those PowerShell cmdlets use the DISM APIs under the hood, you can call those APIs directly as well.

Other notes

  • You can add an OCR language independently of the "preferred languages" setting.
  • However, it seems that if you modify the "preferred languages" setting, the system can install/uninstall OCR languages to match your "preferred languages" list, so "additional" OCR languages may get removed.
