Java Speech API Frequently Asked Questions


This collection of frequently asked questions (FAQ) provides brief answers to many common questions about the Java Speech API (JSAPI).


Question Index

Download Questions

API Questions

Implementation Questions

Question Index

Where can I get the Java Speech API (JSAPI)?

The Java Speech API (JSAPI) is not part of the JDK and Sun does not ship an implementation of JSAPI. Instead, we work with third party speech companies to encourage the availability of multiple implementations.

Question Index

What is the Java Speech API (JSAPI)?

The Java Speech API allows Java applications to incorporate speech technology into their user interfaces. It defines a cross-platform API to support command and control recognizers, dictation systems and speech synthesizers.

Question Index

When will the JSAPI specification be available?

The Java Speech 1.0 API specification was released on October 26, 1998, and is available here

Question Index

What does the Java Speech API specification include?

The Java Speech API specification includes the Javadoc-style API documentation for the approximately 70 classes and interfaces in the API. The specification also includes a detailed Programmer's Guide which explains both introductory and advanced speech application programming with JSAPI. Two companion specifications are available: JSML and JSGF.

The specification is not yet provided with the .class files needed to compile applications with JSAPI.

Question Index

What are JSML and JSGF?

The Java Speech API Markup Language (JSML) and the Java Speech API Grammar Format (JSGF) are companion specifications to the Java Speech API. JSML (currently in beta) defines a standard text format for marking up text for input to a speech synthesizer. JSGF version 1.0 defines a standard text format for providing a grammar to a speech recognizer. Both specifications are available here

Question Index

How was the JSAPI specification developed?

Sun Microsystems, Inc. worked in partnership with leading speech technology companies to define the initial specification of the Java Speech API, JSML and JSGF. Sun is grateful for the contributions of:

Question Index

How does JSAPI relate to other Java APIs?

The Java Speech API is part of a family of APIs that work together as a suite to provide customers with enhanced graphics and extended communications capabilities. These APIs include the

Question Index

What JSAPI implementations are now available?

The Java Speech API is a freely available specification and therefore anyone is welcome to develop an implementation. The following implementations are known to exist.

Note: Sun Microsystems, Inc. makes no representations or warranties about the suitability of the software listed here, either express or implied, including but not limited to the implied warranties of mechantability, fitness for a particular purpose, or non-infringement. The implementations listed here have not been tested with regard to compliance to the JSAPI specification, nor does their appearance on this page imply any form of endorsement of compliance on the part of Sun.

FreeTTS on SourceForge Logo

  • Description: Open source speech synthesizer written entirely in the Java programming language.
  • Requirements: JDK 1.4. Read about more requirements on the FreeTTS web site.
IBM's "Speech for Java"
  • Description: Implementation based on IBM's ViaVoice product, which supports continuous dictation, command and control and speech synthesis. It supports all the European language versions of ViaVoice -- US & UK English, French, German, Italian and Spanish -- plus Japanese.
  • Requirements: JDK 1.1.7 or later or JDK 1.2 on Windows 95 with 32MB, or Windows NT with 48MB. Both platforms also require an installation ViaVoice 98.
IBM's "Speech for Java" on Linux
  • Description: Beta version of "Speech for Java" on Linux. Currently only supports speech recognition.
  • Requirements: Red Hat Linux 6.0 with 32MB, and Blackdown JDK 1.1.7 with native thread support.
The Cloud Garden
  • Description: Implementation for use with any recognition/TTS speech engine compliant with Microsoft's SAPI5 (with SAPI4 support for TTS engines only). An additional package allows redirection of audio data to/from Files, Lines and remote clients (using the javax.sound.sampled package). Some examples demonstrate its use in applets in Netscape and IE browsers.
  • Requirements: JDK 1.1 or better, Windows 98, Me, 2000 or NT, and any SAPI 5.1, 5.0 or 4.0 compliant speech engine (some of which can be downloaded from Microsoft's web site).

Lernout & Hauspie's TTS for Java Speech API

  • Description: Implementations based upon ASR1600 and TTS3000 engines, which support command and control and speech synthesis. Supports 10 different voices and associated whispering voices for the English language. Provides control for pitch, pitch range, speaking rate, and volume.
  • Requirements: Sun Solaris OS version 2.4 or later, JDK 1.1.5. Sun Swing package (free download) for graphical Type-n-Talk demo.
  • More information: Contact Edmund Kwan, Director of Sales, Western Region Speech and Language Technologies and Solutions (ekwan@lhs.com)

Conversa Web 3.0

  • Description: Conversa Web is a voice-enabled Web browser that provides a range of facilities for voice-navigation of the web by speech recognition and text-to-speech. The developers of Conversa Web chose to write a JSAPI implementation for the speech support.
  • Requirements: Windows 95/98 or NT 4.0 running on Intel Pentium 166 MHz processor or faster (or equivalent). Minimum of 32 MB RAM (64 MB recommended). Multimedia system: sound card and speakers. Microsoft Internet Explorer 4.0 or higher.

Festival

  • Description: Festival is a general multi-lingual speech synthesis system developed by the Centre for Speech Technology Research at the University of Edinburgh. It offers a full text to speech system with various APIs, as well an environment for development and research of speech synthesis techniques. It is written in C++ with a Scheme-based command interpreter for general control and provides a binding to the Java Speech API. Supports the English (British and American), Spanish and Welsh languages.
  • Requirements: Festival runs on Suns (SunOS and Solaris), FreeBSD, Linux, SGIs, HPs and DEC Alphas and is portable to other Unix machines. Preliminary support is available for Windows 95 and NT. For details and requirements see the Festival download page.

Elan Speech Cube

  • Description: Elan Speech Cube is a Multilingual, multichannel, cross-operating system text-to-speech software component for client-server architecture. Speech Cube is available with 2 TTS technologies (Elan Tempo : diphone concatenation and Elan Sayso : unit selection), covering 11 languages. Speech Cube native Java client supports JSAPI/JSML.
  • Requirements: JDK 1.3 or later on Windows NT/2000/XP, Linux or Solaris 2.7/2.8, Speech Cube V4.2 and higher.
  • About Elan Speech: Elan Speech is an established worldwide provider of text-to-speech technology (TTS). Elan TTS transforms any IT generated text into speech and reads it out loud.

Question Index

How do I use JSAPI in an applet?

It is possible to use JSAPI in an applet. In order to do this, users will need the Java Plug-in (see here). The reason for this is that JSAPI implementations require access to the AWT EventQueue, and the built-in JDK support in the browsers we've worked with denies any applet access to the AWT EventQueue. The Java Plug-in doesn't have this restriction, and users can configure the Java Plug-in to grant or deny applet access to the AWT Queue.

If you are using JRE 1.1:

Have your users follow these steps if your applet is based upon JDK 1.1:

  • Obtain a JDK 1.1.7 or better Java Runtime Environment (JRE). The reason for this is we have had problems with applet security being denied with JDK 1.1.6. Please note that the user needs the JRE and not the JDK. The JRE is freely available for download from the following URL:

    link

  • Before running the browser, have the user modify their CLASSPATH environment variable to include the supporting classes for JSAPI. For example, if the user has IBM's Speech for Java, have the user include the ibmjs.jar file in CLASSPATH.
  • Make sure any shared libraries for the JSAPI support are in the user's PATH. For example, if the user has IBM's Speech For Java, have the user include the ibmjs lib directory in their PATH (e.g., c:\ibmjs\lib).
  • Have the user copy the speech.properties to their home directory. A user can determine their home directory by enabling the console for the Java Plug-in. When the user accesses a page that uses the Java Plug-in, the Java Plug-in console will tell the user what it thinks the user's home directory is.
  • Use javakey to add your identity to their signature database (i.e., identitydb.obj). This will tell the Java Plug-in to trust applets signed by you.
  • Copy the identitydb.obj that was created or updated in previous step to the user's home directory (the same place where the user copied speech.properties).

Then perform these steps on your applet:

  • Use javakey to both create a signature database for your system and to sign your applet's jar file. This will allow the applet to participate in the security model.
  • Create an HTML page that uses your applet in the Plug-in.
  • If the user experiences a "checkread" exception while attempting to run your applet, it's most likely due to a mismatch between the user's identitydb.obj file and the signature on your applet's jar file. A way to remedy this is to recreate your identitydb.obj and re-sign your jar file.

If you are using JRE 1.2:

The Java 2 platform's security model allows signing as done with JDK 1.1, but it also permits finer grained access control. The following are just some examples, and we recommend you read the Java Security Architecture Specification at the following URL before deciding what to do:

link

For a quick start, have your users do the following if your applet uses the Java 2 (i.e., JDK 1.2) platform:

  • Obtain the JDK 1.2 Plug-in.
  • Before running the browser, have the user modify their CLASSPATH environment variable to include the supporting classes for JSAPI. For example, if the user has IBM's Speech for Java, have the user include the ibmjs.jar file in CLASSPATH.
  • Make sure any shared libraries for the JSAPI support are in the user's PATH. For example, if the user has IBM's Speech For Java, have the user include the ibmjs lib directory in their PATH (e.g., c:\ibmjs\lib).
  • Have the user copy the speech.properties to their home directory. A user can determine their home directory by enabling the console for the Java Plug-in. When the user accesses a page that uses the Java Plug-in, the Java Plug-in console will tell the user what it thinks the user's home directory is.
  • Have the user modify their java.policy file in the Java Plug-in's security directory appropriately. There are many ways to do this. The following are just a few possibilities:
    • Grant all applets the AllPermission property. This is extremely dangerous and is only provided as an example. To do this, have the user modify their java.policy file to contain only the following lines:
      grant {
       permission java.security.AllPermission;
      }
       
      
    • Grant permissions to a particular URL (e.g., the URL containing your applet). To do this, have the user add the following lines to their java.policy file:
      grant codeBase "http://your.url.here" {
       permission java.security.AllPermission;
      }
       
      

The information in this FAQ is not meant to be a complete tutorial on the JDK 1.1 and JDK 1.2 architecture. Instead, it is meant to be hopefully enough to get you started with running JSAPI applets in a browser. We suggest you visit the following URLs to obtain more information on the Java Security models:

Java Security Home Page:
link

Tutorial on JDK 1.1 Security:
link

Tutorial on JDK 1.2 Security:
link

Question Index

Why does Netscape Navigator or Internet Explorer throw a security exception when I use JSAPI in an applet?

JSAPI implementations require access to the AWT EventQueue. The built-in Java platform support in the browsers we've worked with denies an applet access to the AWT EventQueue. As a result, JSAPI implementations will be denied access to the AWT EventQueue. In addition, we are not aware of a way to configure the built-in Java platform support in these environments to allow access to the AWT EventQueue.

The Java Plug-in (see link), however, can be configured to allow an applet the necessary permissions it needs to use an implementation of JSAPI. As a result, we currently recommend using the Java Plug-in for applets that use JSAPI.

Question Index

I'm concerned about JSAPI applets "bugging" my office. What are the plans for JSAPI and security on JDK 1.2?

The JSAPI 1.0 specification includes the SpeechPermission class that currently only supports one SpeechPermission: javax.speech. When that permission is granted, an application or applet has access to all the capabilities provided by installed speech recognizers and synthesizers. Without that permission, an application or applet has no access to speech capabilities.

As speech technology matures it is anticipated that a finer-grained permission model will be introduced to provide access by applications and applets to some, but not all, speech capabilities.

Before granting speech permission, developers and users should consider the potential impact of the grant.

Question Index

Does JSAPI allow me to control the audio input source of a recognizer or redirect the audio output of a speech synthesizer?

This support is currently not in JSAPI. We plan to use the Java Sound API to help provide this support in the future. We purposely left room for expansion in the javax.speech.AudioManager interface and will further investigate this support after the Java Sound API is finalized.

Question Index

Left Curve
Java SDKs and Tools
Right Curve
Left Curve
Java Resources
Right Curve
JavaOne Banner
Java 8 banner (182)