A small fast portable speech synthesis system

Overview
     Flite: a small run-time speech synthesis engine
                  version 2.1-release
      Copyright Carnegie Mellon University 1999-2018
                  All rights reserved
                  http://cmuflite.org
          https://github.com/festvox/flite

Flite is an open source small fast run-time text to speech engine. It is the latest addition to the suite of free software synthesis tools including University of Edinburgh's Festival Speech Synthesis System and Carnegie Mellon University's FestVox project, tools, scripts and documentation for building synthetic voices. However, flite itself does not require either of these systems to compile and run.

The core Flite library was developed by Alan W Black (mostly in his so-called spare time) while employed in the Language Technologies Institute at Carnegie Mellon University. The name "flite", originally chosen to mean "festival-lite", is perhaps doubly appropriate, as a substantial part of the design and coding was done over 30,000ft while awb was travelling and (usually) not in meetings.

The voices, lexicon and language components of flite, both their compression techniques and their actual contents, were developed by Kevin A. Lenzo and Alan W Black.

Flite is the answer to the complaint that Festival is too big, too slow, and not portable enough.

o Flite is designed for very small devices, such as PDAs, and also for large server machines which need to serve lots of ports.

o Flite is not a replacement for Festival but an alternative run time engine for voices developed in the FestVox framework where size and speed are crucial.

o Flite is written entirely in ANSI C; it contains no C++ or Scheme, so it requires more care in programming and is harder to customize at run time.

o It is thread safe

o Voices, lexicons and language descriptions can be compiled (mostly automatically for voices and lexicons) into C representations from their FestVox formats

o All voices, lexicons and language model data are const and in the text segment (i.e. they may be put in ROM). As they are linked in at compile time, there is virtually no startup delay.

o Although the synthesized output is not exactly the same as the same voice in Festival they are effectively equivalent. That is, flite doesn't sound better or worse than the equivalent voice in festival, just faster, smaller and scalable.

o For standard diphone voices, maximum run-time memory requirements are less than about twice the memory required for the generated waveform. For 32-bit architectures this effectively means under 1M.

o The flite program supports synthesis of individual strings or files (utterance by utterance) to direct audio devices or to waveform files.

o The flite library offers simple functions suitable for use in specific applications.

Flite is distributed with a single 8KHz diphone voice (derived from the cmu_us_kal voice), a pruned lexicon (derived from cmulex) and a set of models for US English. Here are comparisons with Festival using basically the same 8KHz diphone voice:

            Flite    Festival
core code    60K      2.6M
USEnglish    100K     ??
lexicon      600K     5M
diphone      1.8M     2.1M
runtime      <1M      16-20M

On a 500MHz PIII, a timing test of the first two chapters of "Alice in Wonderland" (doc/alice) was done. This produces about 1300 seconds of speech. With flite it takes 19.128 seconds (about 70.6 times faster than real time); with Festival it takes 97 seconds (13.4 times faster than real time). On the ipaq (with the 16KHz diphones) flite synthesizes 9.79 times faster than real time.
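
The "times faster than real time" figures above are the ratio of speech duration to synthesis time. As a minimal sketch of that arithmetic (the function name realtime_factor is hypothetical and is not part of Flite):

```c
/* Real-time factor: seconds of speech produced per second of compute.
   Hypothetical helper for illustration; not part of the Flite API. */
double realtime_factor(double speech_seconds, double synth_seconds)
{
    /* A zero or negative timing has no meaningful ratio. */
    if (synth_seconds <= 0.0)
        return 0.0;
    return speech_seconds / synth_seconds;
}
```

With the rounded figures quoted above, realtime_factor(1300.0, 97.0) gives about 13.4, matching the Festival number. realtime_factor(1300.0, 19.128) gives about 68, so the 70.6 quoted for flite presumably reflects a more exact duration than the rounded 1300 seconds.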

Requirements:

o A good C compiler, some of these files are quite large and some C
  compilers might choke on these, gcc is fine.  Sun CC 3.01 has been
  tested too.  Visual C++ 6.0 is known to fail on the large diphone
  database files.  We recommend you use GCC under Windows Subsystem for
  Linux, Cygwin or mingw32 instead.

o GNU Make

o An audio device isn't required as flite can write its output to 
  a waveform file. 

Supported platforms:

We have successfully compiled and run on

o Various Intel Linux systems (and iPaq Linux), under various versions
  of GCC (2.7.2 to 6.x)

o Mac OS X

o Various Android devices

o Various openwrt devices

o FreeBSD 3.x and 4.x

o Solaris 5.7, and Solaris 9

o Windows 2000/XP and later under Cygwin 1.3.5 and later

o Windows 10 with Windows Subsystem for Linux

o Successfully compiles and runs under 64Bit Linux architectures

o OSF1 V4.0 (gives an unimportant warning about sizes when compiling cst_val.c)

o WASI has experimental support (see below for details)

Previously we supported PalmOS and Windows CE but these seem to be rare nowadays so they are no longer actively supported.

Other similar platforms should just work; we have also cross compiled on a Linux machine for StrongARM. However, note that new byte order architectures may not work directly, as there are some careful byte order constraints in some structures. These are portable but may require reordering of some fields; contact us if you are moving to a new architecture.

Cross-compiling to WASI (experimental)

To cross-compile to WASI, first head over to CraneStation/wasi-sdk and install the WASI toolchain.

Afterwards, you can cross-compile to WASI as follows:

./configure --host=wasm32-wasi \
CC=/path/to/wasi-sdk/bin/clang \
AR=/path/to/wasi-sdk/bin/llvm-ar \
RANLIB=/path/to/wasi-sdk/bin/llvm-ranlib

It is important to correctly specify the ar and ranlib that are bundled with the WASI clang. Otherwise, you will most likely experience missing symbols during linking, and you may see odd llvm errors such as

LLVM ERROR: malformed uleb128, extends past end

When cross-compiling from macOS, you might have to manually specify the sysroot. You can do this by tweaking the CC variable as follows:

CC="/path/to/wasi-sdk/bin/clang --sysroot=/path/to/wasi-sdk/share/sysroot"

After the configure step is successful, simply run as usual:

make

The generated WASI binary can then be found in the bin/ directory:

file bin/flite
> bin/flite: WebAssembly (wasm) binary module version 0x1 (MVP)

News

New in 2.2 (Oct 2018)

o Better grapheme support (Wilderness Languages) hundreds of new languages

New in 2.1 (Oct 2017)

o Improved Indic front end support (thanks to Suresh Bazaj @ Hear2Read)

o 18 English Voices (various accents)

o 12 Indian Voices (Bengali, Gujarati, Hindi, Kannada, Marathi, Panjabi,
  Tamil and Telugu) usually with bilingual (with English) support
  
o Can do byteswap architectures [again] (ar9331 yun arduino, zsun etc)

o flitecheck front-end test suite

o grapheme based festvox builds give working flitevox voices

o SAPI support for CG voices (thanks to Alok Parlikar @ Cobalt Speech and
  Language INC)
  
o gcc 6.x support

o .flitevox files (and models) 40% of previous size, but same quality

New in 2.0.0 (Dec 2014)

o Indic language support (Hindi, Tamil and Telugu)

o SSML support

o CG voices as files accessible by file:/// and http://
  (and set of 13 voices to load)
  
o random forest (multimodel support) improves voice quality

o Supports different sample rates/mgc order to tune for speed

o Kal diphone 500K smaller

o Fixed lots of API issues

o thread safe (again) [after initialization]

o Generalized tokenstreams (used in Bard Storyteller)

o simple-Pulseaudio support

o Improved Android support

o Removed PalmOS support from distribution

o Companion multilingual ebook reader Bard Storyteller 
   https://github.com/festvox/bard

New in 1.4.1 (March 2010)

o better ssml support (actually does something)

o better clunit support (smaller)

o Android support

New in 1.4 (December 2009)

o crude multi-voice selection support (may change)

o 4 basic voices are included: 3 clustergen (awb, rms and slt) plus
  the kal diphone database
  
o CMULEX now uses maximum onset for syllabification

o alsa support

o Clustergen support (including mlpg with mixed excitation) 
  But is still slow on limited processors
  
o Windows support with Visual Studio (specifically for the Olympus 
    Spoken Dialog System)
    
o WinCE support is redone with cegcc/mingw32ce with
    example TTS app: Flowm: Flite on Windows Mobile
    
o Speed-ups in feature interpretation limiting calls to alloc

o Speed-ups (and fixes) for converting clunits festvox voices

New in 1.3-release (October 2005)

o fixes to lpc residual extraction to give better quality output

o An updated lexicon (festlex_CMU from festival-2.0.95) and better
  compression: it is about 30% of the previous size, with about
  the same accuracy

o Fairly substantial code movements to better support PalmOS and
  multi-platform cross compilation builds
  
o A PalmOS 5.0 port with a small example talking app ("flop")

o runs under ix86_64 linux

New in 1.2-release (February 2003)

o A build process for diphone and clunits/ldom voices: FestVox voices
  can be converted (sometimes) automatically

o Various bug fixes

o Initial support for Mac OS X (not talking to audio device yet)
  but compiles and runs
  
o Text files can be synthesized to a single audio file

o (optional) shared library support (Linux)

Compilation

In general

tar zxvf flite-2.1-current.tar.gz

cd flite-2.1-current
./configure 
make
make get_voices

Where tar is GNU tar (gtar), and make is GNU make (gmake).

Or

git clone http://github.com/festvox/flite
cd flite
./configure
make
make get_voices

Configuration should be automatic, but may not work in all cases, especially if you have a new compiler. You can explicitly set the compiler in config/config and add any options you see fit. Configure tries to guess these, but it might be unable to do so when cross compiling. Interesting options there are

-DWORDS_BIGENDIAN=1  for bigendian machines (e.g. Sparc, M68x, ar9331)
-DNO_UNION_INITIALIZATION=1  For compilers without C99 union initialization
-DCST_AUDIO_NONE     if you don't need/want audio support
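
When configure cannot guess the byte order (for example when cross compiling), it helps to know what the target actually is before setting -DWORDS_BIGENDIAN=1. The following is a minimal sketch, not Flite's own code: host_is_big_endian, swap_u16 and swap_u32 are hypothetical helper names used only to illustrate detecting byte order and the kind of byte swapping a mismatched architecture needs.

```c
#include <stdint.h>

/* Returns 1 on a big-endian host, 0 on a little-endian one.
   Hypothetical helper for illustration; not part of Flite. */
int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    /* Look at the first byte in memory: 0x01 means most significant byte first. */
    return *(const unsigned char *)&probe == 0x01;
}

/* Reverse the two bytes of a 16-bit value (e.g. one audio sample). */
uint16_t swap_u16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap_u32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

Compiling and running a call to host_is_big_endian() on the target (or an emulator for it) tells you whether the big-endian option applies.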

There are different sets of voices and languages; you can select between them (and your own sets if you make config/XXX.lv). For example

./configure --with-langvox=transtac

Will use the languages and voices defined in config/transtac.lv

Usage:

The ./bin/flite binary contains all supported voices; you can choose between them with the -voice flag and list them with the -lv flag. Note that the kal (diphone) voice is a different technology from the others: it is much less computationally expensive but more robotic. For each voice, an additional binary containing only that voice is created as ./bin/flite_FULLVOICENAME, e.g. ./bin/flite_cmu_us_awb. You can also refer to an external clustergen .flitevox voice via a pathname argument to -voice (note the pathname must contain at least one "/").
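
The "/" rule for -voice can be sketched as a single check. This is an illustration of the rule as stated above, not Flite's actual implementation, and voice_arg_is_path is a hypothetical name:

```c
#include <string.h>

/* Returns 1 if a -voice argument would be read as a pathname or URL under
   the rule described above (it contains at least one '/'), and 0 if it
   would be read as the name of a built-in voice.
   Hypothetical helper for illustration; not part of Flite. */
int voice_arg_is_path(const char *arg)
{
    return arg != NULL && strchr(arg, '/') != NULL;
}
```

Under this reading, "rms" names a built-in voice and "voices/cmu_us_ahw.flitevox" is a file. A voice file in the current directory would presumably need to be written as "./cmu_us_ahw.flitevox" to be treated as a pathname.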

If it compiles properly, a binary will be put in bin/. Note that -g is on by default, so the binary will be bigger than is actually required.

./bin/flite "Flite is a small fast run-time synthesis engine" flite.wav

Will produce an 8KHz RIFF-headered waveform file (RIFF is Microsoft's wave format, often called .WAV).

./bin/flite doc/alice

Will play the text file doc/alice. If the first argument contains a space it is treated as text; otherwise it is treated as a filename. If a second argument is given, a waveform file is written to it; if no second argument is given, or "play" is given, flite will attempt to write directly to the audio device (if supported). If "none" is given, the audio is simply thrown away (used for benchmarking). Explicit options are also available.

./bin/flite -v doc/alice none

Will synthesize the file without playing the audio and give a summary of the speed.

./bin/flite doc/alice alice.wav

Will synthesize the whole of alice into a single file (previous versions would only give the last utterance in the file, but that is fixed now).
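
The rules for the two positional arguments described above can be summarized in code. The sketch below only mirrors the behaviour as documented here; the enum and function names are hypothetical and this is not how flite_main.c is necessarily written.

```c
#include <string.h>

/* Hypothetical names for illustration; not part of Flite. */
enum input_kind  { INPUT_TEXT, INPUT_FILE };
enum output_kind { OUTPUT_PLAY, OUTPUT_NONE, OUTPUT_WAVEFILE };

/* First argument: text if it contains a space, otherwise a filename. */
enum input_kind classify_input(const char *arg)
{
    return (arg != NULL && strchr(arg, ' ') != NULL) ? INPUT_TEXT : INPUT_FILE;
}

/* Second argument: missing or "play" goes to the audio device,
   "none" discards the audio, anything else names a waveform file. */
enum output_kind classify_output(const char *arg)
{
    if (arg == NULL || strcmp(arg, "play") == 0)
        return OUTPUT_PLAY;
    if (strcmp(arg, "none") == 0)
        return OUTPUT_NONE;
    return OUTPUT_WAVEFILE;
}
```

One consequence of the space rule is that a single word such as "hello" is taken as a filename, which is where an explicit option like -f (used in the voice examples in this document) is the safer choice.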

An additional set of feature-setting options is available; these are debug options. Voices are represented as sets of feature values (see lang/cmu_us_kal/cmu_us_kal.c) and you can override values on the command line. This can stop flite from working if malicious values are set, and therefore this facility is not intended to be made available to standard users. But these options are useful for debugging. Some typical examples are

Use simple concatenation of diphones without prosodic modification

./bin/flite --sets join_type=simple_join doc/intro

Print sentences as they are said

./bin/flite -pw doc/alice

Make it speak slower

./bin/flite --setf duration_stretch=1.5 doc/alice

Make it speak higher pitch

./bin/flite --setf int_f0_target_mean=145 doc/alice

flite_time is an example talking clock, as discussed at http://festvox.org/ldom. It requires a single argument in HH:MM form; under Unix you can call it

./bin/flite_time `date +%H:%M`
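
Where the date command is not available, the same HH:MM argument can be produced from C with the standard strftime function. This is a minimal sketch; format_hhmm is a hypothetical helper name and not part of Flite.

```c
#include <stddef.h>
#include <time.h>

/* Writes the time held in tm as "HH:MM" (24-hour clock) into buf.
   Returns 1 on success, 0 if buf is too small (it needs 6 bytes).
   Hypothetical helper for illustration; not part of Flite. */
int format_hhmm(const struct tm *tm, char *buf, size_t size)
{
    /* strftime returns the number of characters written, or 0 on overflow. */
    return strftime(buf, size, "%H:%M", tm) == 5;
}
```

A caller would typically obtain the current time with time(NULL) and localtime(), format it into a 6-byte buffer, and pass the result as the single argument to flite_time.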

List the voices linked in directly in this build

./bin/flite -lv

Speak with the US male rms voice (builtin version)

./bin/flite -voice rms -f doc/alice

Speak with the "Scottish" male awb voice (builtin version)

./bin/flite -voice awb -f doc/alice

Speak with the US female slt voice

./bin/flite -voice slt -f doc/alice

Speak with AEW voice, download on the fly from festvox.org

./bin/flite -voice http://festvox.org/flite/packed/flite-2.1/voices/cmu_us_aew.flitevox -f doc/alice

Speak with AHW voice loaded from the local file.

./bin/flite -voice voices/cmu_us_ahw.flitevox -f doc/alice

You can download the available voices into voices/

./bin/get_voices us_voices

and/or

./bin/get_voices indic_voices

Voice quality

So you've eagerly downloaded flite, compiled it and run it, and now you are disappointed that it doesn't sound wonderful. Sure, it's fast and small, but what you really hoped for was the dulcet tones of a deep baritone voice that would make you desperately hang on every phrase it mellifluously produces. Instead you get an 8KHz diphone voice that sounds like it came from the last millennium.

Well, first, you are right: it is an 8KHz diphone voice from the last millennium, and that was actually deliberate. As we developed flite we wanted a voice that was stable and that we could directly compare with that very same voice in Festival. Flite is an engine. We want to be able to take voices built with the FestVox process and compile them for flite; the result should be exactly the same quality (though of course trading the size for quality in flite is also an option). The included voice is just a sample voice that was used in the testing process.

We expect that often voices will be loaded from external files, and we have now set up a voice repository in

http://festvox.org/flite/flite-2.1/voices/*.flitevox

If you visit there with a browser you can hear the examples. You can also download the .flitevox files to your machine so you don't need a network connection every time you need to load a voice.

We are now actively adding to this list of available voices in English (16) and other languages.

Bard Storyteller: https://github.com/festvox/bard

Bard is a companion app that reads ebooks, both displaying them and actually reading them to you out loud using flite. Bard supports a wide range of fonts, and flite voices, and books in text, html and epub format. Bard is used as an evaluation of flite's capabilities and an example of a serious application using flite.

Comments
  • Built-in voice loading functions?

    Hi, I'm using flite in my linux c++ project, and I'm trying to use the built-in voice loading function

    extern "C"
    {
        cst_voice *cmu_us_slt(); // built in function
    }
    

    But there's a link error, should I add more link flags besides -lflite?
    Also, is the function name I'm using right?

    opened by teamclouday 3
  • [2.1] symbols removed but no soname bump

    While trying to package flite version 2.1 for Debian¹, I noticed that three symbols (cst_read_2d_array, cst_read_array and cst_rx_not_indic) were dropped with respect to version 2.0. I was wondering if bumping of the soname was just forgotten or if there is anything else at stake.

    Can you either bump the soname, or let me know what you think I should do instead?

    ¹ https://www.debian.org/

    opened by paulgevers 3
  • Tutorial explaining flite build process for new languages

    This is a ToDo and I am hoping to get to this the last weekend in October. The idea is to build a tutorial describing the procedure to build a deployable voice in one Indian language that can expose the API capabilities of flite.

    opened by saikrishnarallabandi 3
  • Fix Mingw32 compilation

    I fixed flite compilation using mingw32 gcc, bundled with rubyinstaller. This PR includes three commits:

    1. Fix configure when using mingw32 gcc.

      1. Set MINGWPREF by checking the CC variable value.
      2. Include windows.h when checking mmsystem.h header.
      3. Use wince audio driver when mmsystem.h is available.
    2. Fix undefined reference to c99_vsnprintf and ts_utf8_sequence_length when using mingw32 gcc. The mingw32 gcc compiler doesn't treat __inline as Visual C++ does. Use static inline instead of __inline for c99_vsnprintf. Don't use __inline for ts_utf8_sequence_length.

    3. Use DWORD_PTR instead of DWORD when DWORD_PTR is available in au_wince.c.

    The generated 64-bit flite.exe works well on Windows 10.

    opened by kubo 2
  • Why pre-training models have different sizes?

    For different speakers, I found that the sizes of the models were different, as shown below:
    

    image

    It is generally agreed that, if the model structure is determined, the model size should be independent of the training data size. However, I found that the model size is proportional to the data size. For example, the data size of slt is about 1 hour, and the model size is 11MB. The data size of ljm is about 0.5 hour, and the model size is 5.5MB. Does it mean that the more training data, the larger the model size?

    opened by rzy6461 1
  • some questions about stress

    ./t2p covina
    pau k ow v iy1 n ax pau
    
    cmudict-0.4.out
    covina nil k ow0 v iy1 n ax0
    

    Hello, I have some questions about the accent. In the training data there are ow0 and ax0; there is no ow. But when I use ./t2p to predict the word, I found that t2p prints ow (not ow0)! Could you help me? I want to know the details of how flite deals with the accent. Thanks.

    opened by bringtree 1
  • bugfix: voice builds broken for non-grapheme voices

    Flite builds for non-grapheme voices were referring to an undefined variable, breaking the builds. This commit uses build flags to only enable the extern declaration for grapheme based clustergen voices.

    Fixes #37

    opened by happyalu 1
  • Indic voice builds broken as of commit e988047

    Some changes that were introduced to the voice templates (mostly for grapheme voices) now break builds of indic voices.

    In particular, this does not get defined in indic voices, since they are part of indic lang.

    opened by happyalu 1
  • Cross-compile for wasm32-wasi target

    This commit adds experimental cross-compilation support for wasm32-wasi target. With this commit added, it is now possible to cross-compile flite to WASI which then can be run using any WASI-compatible runtime such as Wasmtime.

    This PR closes #24.

    opened by kubkon 1
  • GCC 11.2.1 "does not match original declaration" warnings with LTO

    When compiling flite-2.2 on Fedora development branch (rawhide/f36), I'm getting the following warnings:

    making ../build/x86_64-linux-gnu/lib/libflite_cmulex.so
    ../../lang/cmulex/cmu_lex.c:49:27: warning: type of 'cmu_lex_phone_table' does not match original declaration [-Wlto-type-mismatch]
       49 | extern const char * const cmu_lex_phone_table[54];
          |                           ^
    ../../lang/cmulex/cmu_lex_entries.c:14:20: note: array types have different bounds
       14 | const char * const cmu_lex_phone_table[57] =
          |                    ^
    ../../lang/cmulex/cmu_lex_entries.c:14:20: note: 'cmu_lex_phone_table' was previously declared here
    ...
    making ../build/x86_64-linux-gnu/lib/libflite_cmu_grapheme_lex.so
    ../../lang/cmu_grapheme_lex/cmu_grapheme_lex.h:47:27: warning: type of 'unicode_sampa_mapping' does not match original declaration [-Wlto-type-mismatch]
       47 | extern const char * const unicode_sampa_mapping[16663][5];
          |                           ^
    ../../lang/cmu_grapheme_lex/grapheme_unitran_tables.c:9:20: note: array types have different bounds
        9 | const char * const unicode_sampa_mapping[16798][5] =
          |                    ^
    ../../lang/cmu_grapheme_lex/grapheme_unitran_tables.c:9:20: note: 'unicode_sampa_mapping' was previously declared here
    
    opened by rathann 0
  • common_make_rules: use $(AR) instead of the native ar command

    I was trying to cross compile this for aarch64, but hit a bump in the road. I got an error that the ar command could not be found. This patch ensures that the correct ar is used when cross-compiling.

    Tested cross compilation by applying this patch in nixpkgs and doing a build with nix build .#pkgsCross.aarch64-multiplatform.flite. I've not tried if the binary actually does much, but at least it fully compiles now.

    The native binary still works fine, but shouldn't be affected by this change anyway.

    opened by Mindavi 0
  • 2.2: fails to build with make 4.4

    building with make 4.3 works, but started to fail with make 4.4 (https://lists.gnu.org/archive/html/info-gnu/2022-10/msg00008.html) with the following error:

    [...]
    making in doc ...
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -march=native -O2 -pipe  -I../include  -c -o find_sts_main.o find_sts_main.c
    making in tools ...
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -march=native -O2 -pipe  -I../include  -c -o flite_sort_main.o flite_sort_main.c
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -o ../bin/flite_sort flite_sort_main.o -L../build/x86_64-linux-gnu/lib -lflite  -Wl,-O1 -Wl,--as-needed -lm -lpulse-simple -lpulse 
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -o ../bin/find_sts find_sts_main.o -L../build/x86_64-linux-gnu/lib -lflite  -Wl,-O1 -Wl,--as-needed -lm -lpulse-simple -lpulse 
    making in main ...
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -march=native -O2 -pipe  -I../include  -c -o flite_main.o flite_main.c
    making ../build/x86_64-linux-gnu/lib/libflite.so
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -march=native -O2 -pipe  -I../include  -c -o t2p_main.o t2p_main.c
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -march=native -O2 -pipe  -I../include  -c -o compile_regexes.o compile_regexes.c
    x86_64-pc-linux-gnu-cc -march=native -O2 -pipe -Wall     -march=native -O2 -pipe  -I../include  -c -o flitevox_info_main.o flitevox_info_main.c
    make[1]: *** No rule to make target 'flite_voice_list.c', needed by 'all'.  Stop.
    make[1]: *** Waiting for unfinished jobs....
    make: *** [config/common_make_rules:133: build/x86_64-linux-gnu/obj//.make_build_dirs] Error 2
    

    complete build.log: flite-2.2-make-4.4-build-error.log

    opened by tgurr 0
  • Rust bindings?

    I'm attempting to create rust bindings for flite, but it is quickly becoming apparent that my brain is woefully smooth as every time i try to use them, it gives me a voice not found. My repo is randomairborne/flite-rs.

    opened by randomairborne 0
  • W32: prevent "dllexport" from being added to globalvardefs when building for static linking

    This PR introduces an FLITE_STATIC define.

    If FLITE_STATIC is defined at build-time, GLOBALVARDEF decorator will not use __declspec(dllexport) (on Windows). This allows using a static flite on Windows (other platforms don't have this problem).

    The FLITE_STATIC define must be set manually by whoever wants to use flite in this way.

    Closes: https://github.com/festvox/flite/issues/83

    opened by umlaeute 0
  • allow static linking on Windows

    it seems that it's impossible to create a dll that statically links against flite.

    the problem, is that for static builds you shouldn't export variables with __declspec(dllexport), as found in https://github.com/festvox/flite/blob/6c9f20dc915b17f5619340069889db0aa007fcdc/include/flite.h#L68-L73 and https://github.com/festvox/flite/blob/6c9f20dc915b17f5619340069889db0aa007fcdc/src/synth/flite.c#L47-L52

    i think the canonical way to handle this is to use a define indicating a static build (and don't use the __declspec(dllexport) decorator if it is set), like so:

    #ifdef FLITE_STATIC
    # define GLOBALVARDEF 
    #else
    # ifdef WIN32
    /* For Visual Studio 2012 global variable definitions */
    #  define GLOBALVARDEF __declspec(dllexport)
    # else
    #  define GLOBALVARDEF
    # endif
    #endif
    

    obviously, whoever wants to link flite statically, will then have to define FLITE_STATIC.

    opened by umlaeute 0
  • Problems with lex_lookup

    I'm trying to install lex_lookup on Cygwin but I keep getting an error message:

    $ cd testsuite
    $ make lex_lookup
    make: *** No rule to make target 'lex_lookup'. Stop.

    How can I get this to work?

    opened by isobely79797 0
  • Android support?

    I recently discovered this project on F-droid. The problem is that it is really old.

    https://github.com/happyalu/Flite-TTS-Engine-for-Android

    https://f-droid.org/en/packages/edu.cmu.cs.speech.tts.flite

    Is there a newer version?

    opened by Darin755 1
Releases(v2.2)
  • v2.2(Aug 13, 2020)

    Better grapheme support for hundreds of new languages as part of Wilderness project (http://www.festvox.org/cmu_wilderness/)

    Also includes updated G2P rules for indic

Owner
CMU Festvox Project
Speech Synthesis Tools
Easy and efficient audio synthesis in C++

Tonic Fast and easy audio synthesis in C++. Prefer coding to patching? Love clean syntax? Care about performance? That's how we feel too, and why we m

null 482 Dec 26, 2022
A synthesis flow for hybrid processing-in-RRAM modes

reram-synthesis A synthesis flow for hybrid processing-in-ReRAM modes This project contains three parts: digital-synthesis: a synthesis flow for the d

Feng Wang 7 Nov 12, 2021
C++ library for audio and music analysis, description and synthesis, including Python bindings

Essentia Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license.

Music Technology Group - Universitat Pompeu Fabra 2.3k Jan 7, 2023
Facebook AI Research's Automatic Speech Recognition Toolkit

wav2letter++ Important Note: wav2letter has been moved and consolidated into Flashlight in the ASR application. Future wav2letter development will occ

Facebook Research 6.2k Jan 3, 2023
eSpeak NG is a compact open source software text-to-speech synthesizer for Linux, Windows, Android and other operating systems

eSpeak NG is an open source speech synthesizer that supports more than hundred languages and accents.

null 1.7k Jan 9, 2023
Let’s Create a Speech Synthesizer

Speech Synthesizer Series Material for my video series about creating a peculiar English-language speech synthesizer with Finnish accent. Playlist: ht

Joel Yliluoma 94 Jan 1, 2023
Linear predictive coding (LPC) is an algorithm used to approximate audio signals like human speech

lpc.lv2 LPC analysis + synthesis plugin for LV2 About Linear predictive coding (LPC) is an algorithm used to approximate audio signals like human spee

null 11 Dec 17, 2022
Libsio - A runtime library for Speech Input (stt) & Output (tts)

libsio A runtime library for Speech Input (stt) & Output (tts) Speech To Text unified CTC and WFST decoding via beam search online(streaming) decoding

null 26 Nov 24, 2022
PortAudio is a portable audio I/O library designed for cross-platform support of audio

PortAudio is a cross-platform, open-source C language library for real-time audio input and output.

PortAudio 786 Jan 1, 2023
HamMessenger is a portable device that uses a ham radio and the APRS protocol as a medium to send and receive text messages.

HamMessenger is a portable, battery powered device that runs on a microcontroller and interfaces with an inexpensive ham radio to send and receive text messages and provide position updates using the APRS protocol. Messages and position updates sent via HamMessenger can be viewed on sites such as aprs.fi. HamMessenger messages are NOT encrypted!

null 210 Dec 13, 2022
The Synthesis ToolKit in C++ (STK) is a set of open source audio signal processing and algorithmic synthesis classes written in the C++ programming language.

The Synthesis ToolKit in C++ (STK) By Perry R. Cook and Gary P. Scavone, 1995--2021. This distribution of the Synthesis ToolKit in C++ (STK) contains

null 832 Jan 2, 2023
TensorVox is an application designed to enable user-friendly and lightweight neural speech synthesis in the desktop

TensorVox is an application designed to enable user-friendly and lightweight neural speech synthesis in the desktop, aimed at increasing accessibility to such technology.

null 143 Dec 15, 2022
This speech synthesizer is actually the SAM speech synthesizer in an ESP8266

SSSSAM Serial Speech Synthesizer SAM This speech synthesizer is actually the SAM speech synthesizer in an ESP8266. Where SAM was a software applicatio

Jan 12 Oct 4, 2022
Very portable voice recorder with speech recognition.

DictoFun Small wearable voice recorder. NRF52832-based. Concept Device was initiated after my frustration while using voice recorder for storing ideas

Roman 5 Dec 14, 2022
Skylark Edit is a customizable text/hex editor. Small, Portable, Fast.

Skylark Edit is written in C, a high performance text/hex editor. Embedded Database-client/Redis-client/Lua-engine. You can run Lua scripts and SQL files directly.

hua andy 265 Dec 30, 2022
Mbedcrypto - a portable, small, easy to use and fast c++14 library for cryptography.

mbedcrypto mbedcrypto is a portable, small, easy to use, feature rich and fast c++14 library for cryptography based on fantastic and clean mbedtlsnote

amir zamani 38 Nov 22, 2022
LibreSSL Portable itself. This includes the build scaffold and compatibility layer that builds portable LibreSSL from the OpenBSD source code.

LibreSSL Portable itself. This includes the build scaffold and compatibility layer that builds portable LibreSSL from the OpenBSD source code.

OpenBSD LibreSSL Portable 1.2k Jan 5, 2023
Model synthesis is a technique for generating 2D and 3D shapes from examples.

Model Synthesis Model synthesis is a technique for generating 2D and 3D shapes from examples. It is inspired by texture synthesis. Model synthesis was

Paul Merrell 82 Jan 4, 2023