Allows you to easily add voice recognition and synthesis to any web app with minimal code.
### Built for Browsers
This library is primarily intended for use in web browsers. Check out [watson-developer-cloud](https://www.npmjs.com/package/watson-developer-cloud) to use Watson services (speech and others) from Node.js.

However, a **server-side component is required to generate auth tokens**. The `examples/` folder includes example Node.js and Python servers, and SDKs are available for [Node.js](https://github.com/watson-developer-cloud/node-sdk#authorization), [Java](https://github.com/watson-developer-cloud/java-sdk), [Python](https://github.com/watson-developer-cloud/python-sdk/blob/master/examples/authorization_v1.py), and there is also a [REST API](https://cloud.ibm.com/docs/services/watson?topic=watson-gs-tokens-watson-tokens).
### Installation - standalone
Pre-compiled bundles are available from GitHub Releases - just download the file and drop it into your website: https://github.com/watson-developer-cloud/speech-javascript-sdk/releases

See [CHANGELOG.md](CHANGELOG.md) for a complete list of changes.
## Development
### Use examples for development
The provided examples can be used to test developmental code in action:

- `cd examples/`
- `npm run dev`

This will build the local code, move the new bundle into the `examples/` directory, and start a new server at `localhost:3000` where the examples will be running.

Note: This requires valid service credentials.
### Testing
The test suite is broken up into offline unit tests and integration tests that test against actual service instances.

- `npm test` will run the linter and the offline tests
- `npm run test-offline` will run the offline tests
- `npm run test-integration` will run the integration tests

To run the integration tests, a file with service credentials is required. This file must be called `stt-auth.json` and must be located in `/test/resources/`. There are tests for usage of both Cloud Foundry (CF) and Resource Controller (RC) service instances. For testing CF, the required keys in this configuration file are `username` and `password`. For testing RC, a key of either `iam_acess_token` or `iam_apikey` is required. Optionally, a service URL for an RC instance can be provided under the key `rc_service_url` if the service is available under a URL other than `https://stream.watsonplatform.net/speech-to-text/api`.
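As a rough illustration of the credential rules just described, the sketch below inspects an already-parsed `stt-auth.json` object and reports which kinds of instance it could exercise. The function name `describeSttAuth` and the sample values are hypothetical and not part of this repository; the key names are copied from the paragraph above exactly as spelled there.

```javascript
// Key names as given in the text above (spelling preserved as written there).
const CF_KEYS = ['username', 'password'];
const RC_KEYS = ['iam_acess_token', 'iam_apikey'];

// Reports which kinds of service instance a parsed stt-auth.json could test.
// Hypothetical helper for illustration; the test suite may check this differently.
function describeSttAuth(config) {
  const has = (key) => typeof config[key] === 'string' && config[key].length > 0;
  return {
    cf: CF_KEYS.every(has), // CF needs both username and password
    rc: RC_KEYS.some(has), // RC needs either one of the two IAM keys
    customRcUrl: has('rc_service_url'), // optional override of the default URL
  };
}

// Made-up credentials, for illustration only.
const sampleConfig = { username: 'cf-user', password: 'cf-pass', iam_apikey: 'rc-key' };
console.log(describeSttAuth(sampleConfig));
```

A check like this could run before `npm run test-integration` to give a clearer message than a failed request when a key is missing.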
The basic API is outlined below; see the complete API docs at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/

See several basic examples at http://watson-speech.mybluemix.net/ ([source](https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/))

See a more advanced example at https://speech-to-text-demo.mybluemix.net/

All API methods require an auth token that must be [generated server-side](https://github.com/watson-developer-cloud/node-sdk#authorization).
(See https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/ for a couple of basic examples in Node.js and Python.)

_NOTE_: The `token` parameter only works for CF instances of services. For RC services using IAM for authentication, the `accessToken` parameter must be used.
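One way to keep the `token` / `accessToken` distinction in a single place is a small helper that picks the option name from the instance type. This is only a sketch: `authOptions` and the `'cf'` / `'rc'` labels are invented for this example and are not something the SDK defines, and the credential string is a placeholder for a value obtained from your server.

```javascript
// Returns the option that should carry the credential for the given instance type.
// 'cf' and 'rc' are labels chosen for this sketch, not values defined by the SDK.
function authOptions(instanceType, credential) {
  if (instanceType === 'cf') {
    return { token: credential };
  }
  if (instanceType === 'rc') {
    return { accessToken: credential };
  }
  throw new Error('Unknown instance type: ' + instanceType);
}

// Merge the auth option into whatever other options a call needs.
// The credential here is a placeholder, not a real token.
const callOptions = Object.assign({ keepMicrophone: true }, authOptions('rc', 'placeholder-iam-token'));
console.log(callOptions);
```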
docs/SPEECH-TO-TEXT.md
@@ -8,47 +8,47 @@ The core of the library is the [RecognizeStream] that performs the actual transc
_NOTE_ The RecognizeStream class lives in the Watson Node SDK. Any option available on this class can be passed into the following methods. These parameters are documented at http://watson-developer-cloud.github.io/node-sdk/master/classes/recognizestream.html
Options:

- `keepMicrophone`: if true, preserves the MicrophoneStream for subsequent calls, preventing additional permissions requests in Firefox
- `mediaStream`: Optionally pass in an existing media stream rather than prompting the user for microphone access.
- Other options passed to [RecognizeStream]
- Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
- Other options passed to [FormatStream] if `options.format` is not set to false
- Other options passed to [WritableElementStream] if `options.outputElement` is set

Requires the `getUserMedia` API, so limited browser compatibility (see http://caniuse.com/#search=getusermedia)
Also note that Chrome requires https (with a few exceptions for localhost and such) - see https://www.chromium.org/Home/chromium-security/prefer-secure-origins-for-powerful-new-features

No more data will be sent after `.stop()` is called on the returned stream, but additional results may be received for already-sent data.
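The browser requirements mentioned above (a `getUserMedia` implementation and a secure origin) can be checked before offering a microphone button. The following is a minimal sketch, not part of the library: `microphoneSupport` is a hypothetical helper, and it only looks at the standard `navigator.mediaDevices.getUserMedia` and `isSecureContext`, so older prefixed implementations are not detected.

```javascript
// Decides whether microphone capture is worth attempting.
// Pass the real `window` in a browser; a plain object works for testing.
function microphoneSupport(win) {
  const nav = win.navigator || {};
  const hasGetUserMedia =
    !!nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function';
  // isSecureContext is true for https pages and for exceptions such as localhost.
  const secure = win.isSecureContext === true;
  return { hasGetUserMedia: hasGetUserMedia, secure: secure, ok: hasGetUserMedia && secure };
}

// Only runs in a browser; under Node there is no `window`.
if (typeof window !== 'undefined') {
  console.log(microphoneSupport(window));
}
```

When `ok` is false, a page could hide the microphone control or fall back to file upload instead of triggering a permissions prompt that cannot succeed.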
Can recognize and optionally attempt to play a URL, [File](https://developer.mozilla.org/en-US/docs/Web/API/File) or [Blob](https://developer.mozilla.org/en-US/docs/Web/API/Blob)
(such as from an `<input type="file"/>` or from an ajax request.)

Options:

- `file`: a String URL or a `Blob` or `File` instance. Note that [CORS] restrictions apply to URLs.
- `play`: (optional, default=`false`) Attempt to also play the file locally while uploading it for transcription
- Other options passed to [RecognizeStream]
- Other options passed to [TimingStream] if `options.realtime` is true, or unset and `options.play` is true
- Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
- Other options passed to [FormatStream] if `options.format` is not set to false
- Other options passed to [WritableElementStream] if `options.outputElement` is set

`play` requires that the browser support the format; most browsers support wav and ogg/opus, but not flac.
Will emit an `UNSUPPORTED_FORMAT` error on the RecognizeStream if playback fails. This error is special in that it does not stop the streaming of results.

Playback will automatically stop when `.stop()` is called on the returned stream.

For Mobile Safari compatibility, a URL must be provided, and `recognizeFile()` must be called in direct response to a user interaction (so the token must be pre-loaded).
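Because `play` depends on the browser being able to decode the file, one option is to ask the browser first and only set `play` when playback looks possible. The sketch below uses the standard `canPlayType()` method of audio elements; `canPlayLocally` and `fileOptions` are hypothetical helpers written for this example, not functions provided by the library.

```javascript
// Returns true when the audio element reports it can play the MIME type.
// canPlayType() answers '', 'maybe' or 'probably'; any non-empty answer counts here.
function canPlayLocally(audioElement, mimeType) {
  return audioElement.canPlayType(mimeType) !== '';
}

// Builds an options object with the `file` and `play` options described above,
// enabling `play` only when local playback looks possible.
function fileOptions(file, mimeType, audioElement) {
  return { file: file, play: canPlayLocally(audioElement, mimeType) };
}

// Browser-only usage; `document` does not exist under Node.
if (typeof document !== 'undefined') {
  const probe = document.createElement('audio');
  console.log(fileOptions('audio/sample.flac', 'audio/flac', probe));
}
```

Checking up front avoids the `UNSUPPORTED_FORMAT` error in the common case, although handling that error is still advisable since `canPlayType()` is only a hint.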
Speaks the supplied text through an automatically-created `<audio>` element.
Currently limited to text that can fit within a GET URL (this is particularly an issue on [Internet Explorer before Windows 10](http://stackoverflow.com/questions/32267442/url-length-limitation-of-microsoft-edge)
where the max length is around 1000 characters after the token is accounted for.)
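One possible workaround for the URL length limit is to split long text into pieces that each fit the budget and speak them one after another. The sketch below is an assumption-laden illustration rather than anything the library provides: `splitForGetUrl` is a hypothetical helper, it measures the URL-encoded length of each piece, and the budget you pass (for example, something near the 1000 characters mentioned above) is a rough estimate.

```javascript
// Splits text on whitespace into chunks whose URL-encoded length stays within the budget.
// Hypothetical helper; the library itself does not chunk text.
function splitForGetUrl(text, maxEncodedLength) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  let current = '';
  for (const word of words) {
    const candidate = current ? current + ' ' + word : word;
    if (encodeURIComponent(candidate).length <= maxEncodedLength) {
      current = candidate;
    } else {
      if (current) chunks.push(current);
      // A single over-long word is kept whole rather than cut mid-word.
      current = word;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

console.log(splitForGetUrl('The quick brown fox jumps over the lazy dog', 20));
```

Splitting on whitespace is the simplest choice; splitting at sentence boundaries would likely sound more natural, at the cost of a slightly more involved implementation.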
Options:

- text - the text to speak
- url - the Watson Text to Speech API URL (defaults to https://stream.watsonplatform.net/text-to-speech/api)
- voice - the desired playback voice's name - see .getVoices(). Note that the voices are language-specific.
- customization_id - GUID of a custom voice model - omit to use the voice with no customization.
- autoPlay - set to false to prevent the audio from automatically playing

Relies on browser audio support: should work reliably in Chrome and Firefox on desktop and Android. Edge works with a little help. Safari and all iOS browsers do not seem to work yet.
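To make the options above concrete, here is a small sketch that assembles an options object from them. The helper name `synthesizeOptions` is hypothetical, and the voice name is only an example value; `url` and `customization_id` are left out so the defaults described above apply. An auth option would still need to be added before passing the object to the library.

```javascript
// Builds an options object using the option names listed above.
// Hypothetical helper; values supplied in `overrides` win over the defaults.
function synthesizeOptions(text, overrides) {
  const defaults = {
    text: text,
    voice: 'en-US_MichaelVoice', // example voice name; pick one matching the text's language
    autoPlay: false, // create the audio element without starting playback
  };
  return Object.assign(defaults, overrides);
}

console.log(synthesizeOptions('Hello from the browser', { autoPlay: true }));
```

Setting `autoPlay` to false by default is a design choice for this sketch: it leaves the page in control of when playback starts, which can matter on browsers that restrict audio to user-initiated events.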