
SSML Support

SSML (Speech Synthesis Markup Language) lets you add richer speech patterns to audio flashcards, such as pauses, emphasis, phonetic hints, and number or word interpretation. Because SSML is XML-based, you control how text is spoken by wrapping it in tags.

Important: SSML support in the native TTS engines on iOS and Android is limited and engine-dependent, so some tags may be ignored. Try these examples on your own device to see what works.

Basic SSML structure

All SSML content should be wrapped in a top-level <speak> element:

<speak>
Hello, this is spoken text.
</speak>

TTS engines read everything inside <speak> as speech instructions.

On some devices the <speak> wrapper can be omitted.
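Because SSML is XML, characters such as & and < in card text need to be escaped before the text is wrapped in <speak>. Here is a minimal sketch of that step in Python. The helper name wrap_speak is hypothetical and not part of any flashcard app API.

```python
from xml.sax.saxutils import escape


def wrap_speak(text: str) -> str:
    """Escape XML-special characters (&, <, >) and wrap plain text in <speak>.

    Hypothetical helper for illustration only.
    """
    return f"<speak>{escape(text)}</speak>"


print(wrap_speak("Q&A: is 3 < 5?"))
# prints: <speak>Q&amp;A: is 3 &lt; 5?</speak>
```

Escaping applies to plain text only. Text that already contains SSML tags should not be passed through this helper, since the tags themselves would be escaped.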

1. Pauses (breaks)

Description: Adds a pause in speech.

<speak>
Think... <break time="500ms"/> and then speak.
</speak>

Useful for pacing flashcards, for example "Term... definition here". Many engines support <break> even when they ignore other SSML tags.
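The "term, pause, definition" pattern can be generated rather than typed by hand. The sketch below assumes cards are stored as plain term and definition strings. The function name term_then_definition and the 500 ms default are illustrative choices, not something the app prescribes.

```python
from xml.sax.saxutils import escape


def term_then_definition(term: str, definition: str, pause_ms: int = 500) -> str:
    """Build SSML that speaks the term, pauses, then speaks the definition.

    Hypothetical helper; pause_ms is the pause length in milliseconds.
    """
    if pause_ms < 0:
        raise ValueError("pause_ms must be non-negative")
    return (
        f"<speak>{escape(term)} "
        f'<break time="{pause_ms}ms"/> '
        f"{escape(definition)}</speak>"
    )


print(term_then_definition("Mitosis", "Cell division producing two identical cells."))
```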

2. Prosody (speaking style)

Description: Adjusts rate, pitch, and volume of speech.

<speak>
Normal pace.
<prosody rate="slow" pitch="+10%">
Slower and slightly higher pitch here.
</prosody>
</speak>

Try this to make important definitions sound different. Some mobile voices may not fully honor these controls, so experiment to see what your device actually does.

3. Phoneme (pronunciation hints)

Description: Tells the TTS engine how to pronounce a word using phonetics.

<speak>
Here's a custom pronunciation:
<phoneme alphabet="x-sampa" ph="foUni:m">phoneme</phoneme>
</speak>

Useful for names, technical terms, or foreign words. Some mobile TTS engines (especially native Android or iOS) may ignore this and read the word normally.

4. "Say as" (interpretation)

Description: Tells the engine how to interpret numbers, dates, or characters.

Spell out letters:

<speak>
Here's spelled out:
<say-as interpret-as="characters">FLASH</say-as>
</speak>

Say a number as ordinal:

<speak>
This is the <say-as interpret-as="ordinal">1</say-as> card.
</speak>

Great for flashcards involving codes, numbers, and structured data. Some native engines may not reliably interpret all <say-as> formats.

How to use in flashcards

You can embed SSML directly in your flashcard text where supported. For example:

<speak>
<say-as interpret-as="characters">API</say-as> stands for
<break time="300ms"/>
Application Programming Interface.
</speak>

This will spell out API, say "stands for", pause for 300 ms, then speak the full phrase.
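Hand-written SSML is easy to get wrong, for instance by leaving a tag unclosed. One way to catch this before saving a card is a well-formedness check with Python's standard XML parser. This is a sketch. The function name is_well_formed_ssml is hypothetical, and the check only confirms that the text is valid XML with a <speak> root, not that any given engine supports the tags inside.

```python
import xml.etree.ElementTree as ET


def is_well_formed_ssml(card_text: str) -> bool:
    """Return True if card_text parses as XML and its root element is <speak>."""
    try:
        root = ET.fromstring(card_text)
    except ET.ParseError:
        return False
    # Drop an XML namespace prefix such as "{http://...}speak" if present.
    return root.tag.rsplit("}", 1)[-1] == "speak"


card = (
    '<speak><say-as interpret-as="characters">API</say-as> stands for '
    '<break time="300ms"/> Application Programming Interface.</speak>'
)
print(is_well_formed_ssml(card))                         # True
print(is_well_formed_ssml("<speak>Unclosed <prosody>"))  # False
```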

Try it on your device

SSML support is not uniform across mobile platforms:

  • iOS: Some SSML (like <break> and simple prosody) is accepted when using native speech APIs.
  • Android: Basic <break> may work, but many tags are ignored by the default TTS engine.

Best practice: Try these SSML examples in your flashcards and listen on both iOS and Android. If a tag has no audible effect, the engine most likely does not support it. Unsupported tags are typically ignored silently rather than reported as errors.
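If you also keep a plain-text version of a card for devices where SSML does not help, the tags can be stripped programmatically. The following is a sketch under that assumption. The function name ssml_to_plain_text is hypothetical, and the input must be well-formed XML.

```python
import xml.etree.ElementTree as ET


def ssml_to_plain_text(ssml: str) -> str:
    """Remove all SSML tags and collapse whitespace, keeping only the spoken text."""
    root = ET.fromstring(ssml)
    # itertext() yields the text inside and between child elements, in order.
    return " ".join("".join(root.itertext()).split())


print(ssml_to_plain_text('<speak>Think... <break time="500ms"/> and then speak.</speak>'))
# prints: Think... and then speak.
```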

Summary

SSML feature         Typical support on mobile
<break>              Usually works
<prosody>            Partly supported (depends on engine)
<phoneme>            Often ignored
<say-as>             Variable support
Advanced SSML tags   Likely ignored