My Computer related stuff...

2007-09-17

keyboard and mouse sharing over network

hmmm,

I have a laptop and a desktop, and I want to control both with one keyboard and mouse. I discovered Synergy (clients exist for Microsoft Windows, Linux, Mac OS X, ...). You can get it from SourceForge if your distribution does not package it.
I configured my laptop as the client and my desktop as the server, with the server accepting connections from the laptop.
Now I can use the desktop's keyboard and mouse for both machines.

Configuration of my Windows machine (server):

Start Synergy (on the Windows machine).
Select "share this computer's keyboard and mouse".
Press "configure".
Press the + button in the screens section once for every machine you wish to configure: add the server this way AND all the clients.
In the "Links" section, configure where each display is "logically positioned": when the mouse cursor leaves one display at a given side, on which display (and at which side) it should enter.
Also provide (if needed) a "return" link:
(DON'T THINK that configuring machine-A's left screen edge to lead to the right edge of machine-B automatically makes the right edge of machine-B lead back to the left edge of machine-A, because it doesn't.)
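
For a server that reads a plain-text configuration (synergys can do this; the Windows dialog does the same job graphically), the two-way links can be sketched as below. The screen names desktop and laptop are placeholders that must match your own machine names, and the function name is made up for this example. Note that both directions are written out explicitly.

```shell
#!/bin/bash
# Sketch: print a minimal synergy configuration in which the client screen
# sits to the right of the server screen. Names are placeholders.
write_synergy_conf() {
  local server=$1 client=$2
  cat <<EOF
section: screens
    ${server}:
    ${client}:
end
section: links
    ${server}:
        right = ${client}
    ${client}:
        left = ${server}
end
EOF
}

write_synergy_conf desktop laptop
```

Leaving out the second block under "links" gives exactly the one-way trap described above: the cursor can go to the laptop but never comes back.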

Configuring the client:

You don't have to configure the client; just launch it and tell it which server to connect to:
synergyc server
(see man synergyc for extra options).

Now that it works, I set Synergy to launch automatically.

Windows:

The server launches automatically (using the "autostart" button in the Windows Synergy application).

Linux (Ubuntu Feisty Fawn):
Following suggestions from (amongst others) the Ubuntu forums, I also launch synergyc automatically.

I created a script synergyc_start with the instructions to execute:
#!/bin/sh
synergyc servername
I added an invocation of that script in /etc/gdm/Init/Default, before the sysmodmap= line,
and another invocation in /etc/gdm/PreSession/Default, before the XSETROOT= line.

Labels: hardware, linux, network

2007-06-13

Changing MAC address for ethernet card (in linux)

I have a program that checks its license against the MAC address of an ethernet adapter.
Now the ethernet card got fried (last Friday lightning struck in the neighbourhood, and the NIC (Network Interface Card) was not the only thing fried). Luckily the hard disk survived.
So I replaced it (and the power supply), but - of course - the license would not register.
In Linux you can - if the driver supports it - alter your MAC address:
sudo ifconfig eth0 hw ether 00:01:02:03:04:05
where eth0 is the NIC and 00:01:02:03:04:05 is the MAC address you want the NIC to use henceforth.
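
A slightly more careful variant, as a sketch: it checks the address format first and takes the interface down while changing it, since some drivers refuse to change the address of an active interface. The keyword ifconfig uses for Ethernet hardware addresses is "hw ether". The function names are made up for this example; eth0 and the address are placeholders.

```shell
#!/bin/bash
# Sketch only: is_valid_mac and set_mac are hypothetical helper names.

# Succeeds when $1 looks like six colon-separated hex byte pairs.
is_valid_mac() {
  local re='^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
  [[ $1 =~ $re ]]
}

# Usage: set_mac eth0 00:01:02:03:04:05   (needs sudo rights)
set_mac() {
  local iface=$1 mac=$2
  if ! is_valid_mac "$mac"; then
    echo "not a MAC address: $mac" >&2
    return 1
  fi
  sudo ifconfig "$iface" down &&
  sudo ifconfig "$iface" hw ether "$mac" &&
  sudo ifconfig "$iface" up
}
```

The change lasts until the next reboot or driver reload, so it has to be repeated at boot time if the license check must keep passing.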

Ah well, I found a nice page with an overview of how to do 'it' in different OSes (see link under title of this blog-item).

Labels: hardware, linux, network

2007-05-24

ScanSoft RealSpeak 3.51 for Linux - problem with Debian ETCH (ext3 ?)

hi,

I updated my (quick-and-dirty) post originally called "Text To Speech (ScanSoft RealSpeak)", hoping it would become less obscure.

Today I encountered some problems with another installation (Debian ETCH, kernel 2.6.18-4, with root on ext3)... for which I found a workaround.

It seems that ScanSoft RealSpeak 3.5.1 does not function (anymore) on this distribution. The standard application (as used in the aforementioned post) "hangs" just after outputting "Initialize".

Under strace it looks like one of the internal functions of RealSpeak loses itself in a loop... it keeps looking deeper and deeper (reaallly deeper!!!) into the directory "." (so actually it's not digging deeper at all, ... but it's not coming back either).

This is probably caused by the (erroneous? obsolete?) assumption of a programmer that readdir(3) (or, less likely, getdents(2)) always returns "." and ".." as its first two items, so that there is no need to check them and the code can jump straight to item number three.

The code seems to descend into all directories below the given engine directory.

But if the result does not have . and .. up front, and they turn up at position 3 or 4 instead, these special directory entries seem to be treated like any regular directory and descended into... And if - like all good stories - the story repeats itself (very probable if e.g. we examine . as a regular directory, thus "descending" into it while actually staying put), this could take a while...
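
The robust way to walk a directory tree is to skip . and .. by name, wherever they appear in the listing, instead of by position. The following sketch is not RealSpeak's code, only an illustration of the idea; it uses ls -af to get the entries in the order the filesystem returns them. The function name walk is made up, and file names containing newlines are not handled.

```shell
#!/bin/bash
# Sketch: recursive directory walk that ignores "." and ".." by NAME,
# so their position in the directory listing does not matter.
walk() {
  local dir=$1 name
  # ls -af lists all entries unsorted, in directory order.
  ls -af "$dir" | while IFS= read -r name; do
    case $name in
      .|..) continue ;;          # never descend into these
    esac
    printf '%s\n' "$dir/$name"
    # Descend into real subdirectories only (not symlinks to directories).
    if [ -d "$dir/$name" ] && [ ! -L "$dir/$name" ]; then
      walk "$dir/$name"
    fi
  done
}

# Example: walk /opt/scansoft/tts/engine
```

Dropping the case statement and skipping the first two lines instead reproduces the suspected bug on any directory where . or .. comes later in the listing.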

Here is some of the strace output:


open("/opt/scansoft/tts/engine/../api/server"...
...
open("/opt/scansoft/tts/engine/../api/./server"...
...
open("/opt/scansoft/tts/engine/../api/././server"...
...
open("/opt/scansoft/tts/engine/../api/./././././././././server"...
...

This snippet can be explained thus:

First, when examining the engine directory, readdir returns .. at some later-than-second place. This entry is treated like any other directory, and since the code is - probably - descending recursively into all directories (looking for all available languages, for instance?), the code examines that directory.

Then, apparently, reading that .. directory (actually the directory /opt/scansoft/tts) returns the directory api (in absolute terms /opt/scansoft/tts/api, but by the faulty procedure now considered part of the engine tree).

Then Murphy strikes a second time: the directory api apparently also has its . entry at a later-than-second place, so the eager procedure goes there... But now we arrive at exactly the same spot (the directory api), which again yields . as a directory entry, and we're off for a nice long trip into ././././././././././... and so on.

Evidence for (or at least something to make plausible) this assumption?

Well, ls -f returns the list of files in the order of the filesystem, so I executed that
and got:

$ ls -af /opt/scansoft/tts/engine
libicudata.so.22  tts4sml.so  libicuuc.so.22  dec_encrypt.so  ttsengine.so  ..
dub               .           headers         libxml4c.so.50  xlit_1252.so

On other distributions/versions/systems (where the ScanSoft software does work), this command returns:

$ ls -af /opt/scansoft/tts/engine
.   dec_encrypt.so    headers     dub             libxml4c.so.50  xlit_1252.so
..  libicudata.so.22  tts4sml.so  libicuuc.so.22  ttsengine.so

P.S.

Thanks to my colleague Erik Devriendt for helping me find the origin of the failure.

Solution

A clean solution to the problem would be a bugfix by ScanSoft (Nuance now).

In the meantime, a workaround has been composed:

While looking for a workaround, I mounted (using sshfs) the engine directory of a working system over the engine directory of a non-working one, and the system functioned...
Further fine-tuning showed that especially the headers directory of the working system was important. The other files and directories could be used from the local system (using symlinks).

Anyway... In that process, I tried to compose an image file and mount it through loopback. Then I discovered that when an ext2 image was extended with a journal (tune2fs -j /dev/loop123) to ext3, and subsequently mounted, the directory order got scrambled again. Mounting it as ext2 did not solve this.

But when using a (plain ext2) image mounted on the engine directory, the initialization procedure indeed continued (it did not lose itself in that nasty recursion), but apparently the system then did not find the required files, because the initialization returned with error 23 (in api/inc/lh_err.h we find TTS_E_NO_MATCH_FOUND 23). This is the same behaviour as when you request an unknown language code...

In a last desperate attempt, I created an ext2-image (205M), mounted it as the /opt/scansoft and reinstalled all (rs-api and rs-<lang> packages).

This worked !!!

Although this system now works, I think it is safer simply to use another distribution or version that does not have this problem.

Procedure for workaround:

This procedure assumes it is executed by a user with sudo privileges for dd, losetup, mkfs.ext2, mount, mkdir and dpkg, OR executed the unsafe way: as superuser.

Create the image filesystem (first at an arbitrary location):


$ img=/opt/scansoft.img;
$ dev=$(/sbin/losetup -f);
$ sudo dd if=/dev/zero of=${img} bs=$(( 1024*1024 )) count=205;
$ sudo /sbin/losetup ${dev} ${img};
$ sudo /sbin/mkfs.ext2 ${dev};

mount it and install the software:

$ mnt=/opt/scansoft;
$ sudo mkdir -p ${mnt};
$ sudo mount ${dev} ${mnt} -t ext2;
$ sudo dpkg -i rs-*.deb;

test the directory-order in the image:

$ ls -af ${mnt}/tts/engine/headers

and (at least I did) revel at the result:

.   dec_encrypt.so    headers     dub             libxml4c.so.50  xlit_1252.so
..  libicudata.so.22  tts4sml.so  libicuuc.so.22  ttsengine.so
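
To check this without eyeballing the listing, the first two entries can be tested in a script. A small sketch (the function name dots_first is invented here); it reads a one-entry-per-line listing on standard input, which is what ls produces when its output is piped:

```shell
#!/bin/bash
# Sketch: succeed when the first two lines on stdin are "." and ".."
# (in either order), which is the ordering RealSpeak appears to rely on.
dots_first() {
  local first second
  IFS= read -r first
  IFS= read -r second
  case "$first $second" in
    ". .."|".. .") return 0 ;;
    *) return 1 ;;
  esac
}

# Example: ls -af /opt/scansoft/tts/engine | dots_first && echo "order ok"
```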

Now finally, run the standard test program:

cd /tmp;
echo "This is a simple test">test.txt;
/opt/scansoft/tts/api/demos/standard/standard 0 0 /opt/scansoft/tts/engine test.txt;

You should now see

Initialize
Process
Uninitialize

after which the current directory should hold a standard.pcm file (of filesize > 0).

To allow automatic (and user-requested) mounting and unmounting of the ScanSoft filesystem, we added a line to /etc/fstab:

/opt/scansoft.img /opt/scansoft ext2 loop,ro,users,exec 0 2

Labels: bug, debian etch, ext3, getdir, linux, Nuance, ScanSoft RealSpeak

2006-06-29

Complete changes to CAIVIAR for use with Scansoft RealSpeak

Changes in RealSpeak.h and RealSpeak.cpp allow extra keyword "realspeak.engine" in ivr.properties to reach the Scansoft Realspeak engine.

RealSpeak.h (added member "engine"):


/* RealSpeak.h

Header file for RealSpeak.cpp

Part of the caiviar package. 

Copyright (c) 2002 MobileX AG, http://www.mobilexag.de
Copyright (c) 2002 Peter Dikant <peter.dikant@mobilexag.de>
Copyright (c) 2002 Matthias Kramm <kramm@quiss.org>

This file is distributed under the GPL, see file COPYING for details. */

#ifdef WIN32
#include <windows.h>
#endif
#include "ttsso_types.h"
#include "lh_ttsso.h"
#include "../src/Log.h"
#include "../src/os.h"

#ifndef __realspeak_h__
#define __realspeak_h__

#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000

class RealSpeak: public TextToSpeech
{
public:
 static int initialize(Logger*log);
 static void finalize(Logger*log);
 
 static int setParams (const char *key, const char* value);

 virtual void text2Stream (const char *text, unsigned char **outbuffer, unsigned long *size);
 virtual void text2File (const char *text, const char *filename);
 virtual void text2Audio (const char *text);
 RealSpeak(Logger *log);
 virtual ~RealSpeak();

private:
 static TTSRETVAL sourceCallback(void *pAppData, void *databuffer, U32 buffersize, U32 *datasize);
 static VOID *destCallback(void *pAppData, U16 datatype, VOID *data, U32 datasize, U32 *buffersize);
 static TTSRETVAL CbTtsEventNotify(void *pAppData, void *buffer, U16 datasize, U16 event);
 void loadDictionary(const char* dictionary);
 HTTSINSTANCE hInst;
 TTSPARM ttsParm;
 HTTSDICT userdict;
 char* text;
 char* textstart;
 unsigned char *output;
 long outputPos;
 long outputSize;
 Mutex mutex;
 long int outputsize;

 static char* dictionary; //set by setParam, used by constructor
 static char* remote_server;
 static char* remote_service;
 static int remote_port;
 static char* language;
 static char* voice;
 static char* engine;
};

#endif

RealSpeak.cpp (introduced the language constants currently supported by RealSpeak version 3.5, and macro-fied some repetitive code to allow easier future changes):


/* 
 * Filename: RealSpeak.cpp
 * Project Caiviar "ISDN-CAPI made easy"
 * Package: RealSpeak(TM) driver.
 
 Part of the caiviar package. 

 Copyright (c) 2002,2003 Matthias Kramm <kramm@quiss.org>
 Copyright (c) 2002 Peter Dikant <peter.dikant@mobilexag.de>
 Copyright (c) 2006 Dieter Demerre <ddemerre AT googles e-mail gmail>

 This file is distributed under the GPL, see file COPYING for details. 
*/

/*===========================================================================**
** INCLUDE FILES                                                        **
**===========================================================================*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "RealSpeak.h"

/*===========================================================================**
** LOCAL MACROS                                                         **
**===========================================================================*/
#define TTS_OUTPUT_BUFFER 1048576 // 1 MByte output buffer

#define TTS_DEFAULT_LANGUAGE "German"
#define TTS_DEFAULT_VOICE "female1"
#if defined(LINUX)
#  define TTS_ENGINEPATH "./engine"
#else /* if defined(LINUX) */
#  define TTS_ENGINEPATH ".\\Engine"
#endif /* if defined(LINUX) - else*/

/* macros to easily configure TTS constant conversion table */
#define TTSCAIVIARRECORD(s,c) (char*)(s),(char*)(#c),(U16)(c)
#define NROFELEMENTS(array) (sizeof(array)/sizeof((array)[0]))
#define NROFLANGS NROFELEMENTS(langConv)
#define NROFVOICES NROFELEMENTS(voiceConv)
#define SEARCHTTSDATA(ttsstr,var,array,max) for(var=0;((var<max)&&(strcmp(ttsstr,array[var].IKnow)));var++);
#define SEARCHLANGUAGE(ttsstr,var) SEARCHTTSDATA(ttsstr,var,langConv,NROFLANGS)
#define SEARCHVOICE(ttsstr,var) SEARCHTTSDATA(ttsstr,var,voiceConv,NROFVOICES)

/**
 * check whether var equals str 
 * and if so, set mmbr member of RealSpeak to val and return 1;
 */
#define CRS_CHECKVAR(var,str,mmbr,val) \
  { \
    if (!strcmp((var),(str))) \
    { \
      RealSpeak::mmbr=(val); \
      return 1; \
    } \
  }

/**
 * log as error all elements of arr referenced by "IKnow" member.
 */
#define LOGKNOWNLIST(ct,arr,max,name) \
  { \
    for ((ct)=0;(ct)<(max);(ct)++) \
    { \
      log->logf("<error> allowed %s %d.: \"%s\".",(name),(ct)+1,(arr)[ct].IKnow); \
    } \
  }


/*===========================================================================**
** LOCAL TYPES                                                          **
**===========================================================================*/
typedef struct {
  char* IKnow;
  char* str;  
  U16 tts_var;
} T_TTS_CaiviarRecord;


// this header will be prepended to the output created by the tts system
// to create a complete wav memory structure
static unsigned char wavHeader[] = {
 0x52, 0x49, 0x46, 0x46, 0x00, 0x00, 0x00, 0x00, // |R|I|F|F|4 byte length of file - 8|
 0x57, 0x41, 0x56, 0x45, 0x66, 0x6d, 0x74, 0x20, // |W|A|V|E|f|m|t| |
 0x10, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, // |4 byte length of header|2 bytes encoding (01 = pcm)|2 bytes number of channels|
 0x40, 0x1f, 0x00, 0x00, 0x80, 0x3e, 0x00, 0x00, // |4 byte samplerate|4 byte samplerate * bytes per sample|
 0x02, 0x00, 0x10, 0x00, 0x64, 0x61, 0x74, 0x61, // |2 byte bytes per sample|2 byte bit per sample|d|a|t|a|
 0x00, 0x00, 0x00, 0x00 };

const T_TTS_CaiviarRecord langConv[] = {
/* records for compatibility with possible earlier caiviar
   ivr.properties configurations */
  TTSCAIVIARRECORD("english",TTS_LANG_US_ENGLISH),
  TTSCAIVIARRECORD("french",TTS_LANG_FRENCH),
  TTSCAIVIARRECORD("german",TTS_LANG_GERMAN),
  TTSCAIVIARRECORD("dutch",TTS_LANG_BELGIAN_DUTCH),
/* records with new, more readable names for language variables
   containing ALL constants known to Scansoft RealSpeak 3.5 */
  TTSCAIVIARRECORD("American English",TTS_LANG_US_ENGLISH),
  TTSCAIVIARRECORD("Spanish",TTS_LANG_SPANISH),
  TTSCAIVIARRECORD("French",TTS_LANG_FRENCH),
  TTSCAIVIARRECORD("Dutch Dutch",TTS_LANG_NETHERLANDS_DUTCH),
  TTSCAIVIARRECORD("Dutch",TTS_LANG_DUTCH),
  TTSCAIVIARRECORD("British English",TTS_LANG_BRITISH_ENGLISH),
  TTSCAIVIARRECORD("German",TTS_LANG_GERMAN),
  TTSCAIVIARRECORD("Italian",TTS_LANG_ITALIAN),
  TTSCAIVIARRECORD("Japanese",TTS_LANG_JAPANESE),
  TTSCAIVIARRECORD("Korean",TTS_LANG_KOREAN),
  TTSCAIVIARRECORD("Egyptian Arabic",TTS_LANG_EGYPTIAN_ARABIC),
  TTSCAIVIARRECORD("Mandarin B5",TTS_LANG_MANDARIN_B5),
  TTSCAIVIARRECORD("Brazilian Portuguese",TTS_LANG_BRAZILIAN_PORTUGUESE),
  TTSCAIVIARRECORD("Russian",TTS_LANG_RUSSIAN),
  TTSCAIVIARRECORD("Mexican Spanish",TTS_LANG_MEXICAN_SPANISH),
  TTSCAIVIARRECORD("Belgian Dutch",TTS_LANG_BELGIAN_DUTCH),
  TTSCAIVIARRECORD("Swedish",TTS_LANG_SWEDISH),
  TTSCAIVIARRECORD("Norwegian",TTS_LANG_NORWEGIAN),
  TTSCAIVIARRECORD("Mandarin GB",TTS_LANG_MANDARIN_GB),
  TTSCAIVIARRECORD("Australian English",TTS_LANG_AUSTRALIAN_ENGLISH),
  TTSCAIVIARRECORD("Canadian French",TTS_LANG_CANADIAN_FRENCH),
  TTSCAIVIARRECORD("Cantonese B5",TTS_LANG_CANTONESE_B5),
  TTSCAIVIARRECORD("Cantonese GB",TTS_LANG_CANTONESE_GB),
  TTSCAIVIARRECORD("Danish",TTS_LANG_DANISH),
  TTSCAIVIARRECORD("Portugal Portuguese",TTS_LANG_PORTUGAL_PORTUGUESE),
  TTSCAIVIARRECORD("Poland Polish",TTS_LANG_POLAND_POLISH),
  TTSCAIVIARRECORD("Armenia Armenian",TTS_LANG_ARMENIA_ARMENIAN),
  TTSCAIVIARRECORD("Ukrainian",TTS_LANG_UKRAINIAN),
  TTSCAIVIARRECORD("Greek",TTS_LANG_GREEK),
  TTSCAIVIARRECORD("Vietnamese",TTS_LANG_VIETNAMESE),
  TTSCAIVIARRECORD("malay",TTS_LANG_MALAY),
  TTSCAIVIARRECORD("Pakistan Urdu",TTS_LANG_PAKISTAN_URDU),
  TTSCAIVIARRECORD("Indonesia Bahasa",TTS_LANG_INDONESIA_BAHASA),
  TTSCAIVIARRECORD("Iran Farsi",TTS_LANG_IRAN_FARSI),
  TTSCAIVIARRECORD("Belarusian",TTS_LANG_BELARUSIAN),
  TTSCAIVIARRECORD("Czech",TTS_LANG_CZECH),
  TTSCAIVIARRECORD("Hungarian",TTS_LANG_HUNGARIAN),
  TTSCAIVIARRECORD("India Tamil",TTS_LANG_INDIA_TAMIL),
  TTSCAIVIARRECORD("Thailand Thai",TTS_LANG_THAILAND_THAI),
  TTSCAIVIARRECORD("Turkish",TTS_LANG_TURKISH),
  TTSCAIVIARRECORD("Taiwanese",TTS_LANG_TAIWANESE),
  TTSCAIVIARRECORD("India Hindi",TTS_LANG_INDIA_HINDI),
  TTSCAIVIARRECORD("Taiwan Mandarin B5",TTS_LANG_TAIWAN_MANDARIN_B5),
  TTSCAIVIARRECORD("Taiwan Mandarin GB",TTS_LANG_TAIWAN_MANDARIN_GB)
};

const T_TTS_CaiviarRecord voiceConv[] = {
  TTSCAIVIARRECORD("female1",TTS_RS_VOICE_FEMALE),
  TTSCAIVIARRECORD("female2",TTS_RS_VOICE_FEMALE2),
  TTSCAIVIARRECORD("female3",TTS_RS_VOICE_FEMALE3),
  TTSCAIVIARRECORD("female4",TTS_3000_VOICE_FEMALE),
  TTSCAIVIARRECORD("male1",TTS_RS_VOICE_MALE),
  TTSCAIVIARRECORD("male2",TTS_RS_VOICE_MALE2),
  TTSCAIVIARRECORD("male3",TTS_RS_VOICE_MALE3),
  TTSCAIVIARRECORD("male4",TTS_3000_VOICE_MALE)
};

/*===========================================================================**
** STATIC MEMBER INITIALIZATION                                         **
**===========================================================================*/
char* RealSpeak::dictionary = 0;
char* RealSpeak::remote_server = 0;
char* RealSpeak::remote_service = 0;
int RealSpeak::remote_port = 0;
char* RealSpeak::language = TTS_DEFAULT_LANGUAGE;
char* RealSpeak::voice = TTS_DEFAULT_VOICE;
char* RealSpeak::engine = TTS_ENGINEPATH;

/*===========================================================================**
** MEMBER FUNCTION IMPLEMENTATION                                       **
**===========================================================================*/
int RealSpeak::initialize(Logger*log)
{
    log->logf("<verbose> Using Realspeak as Text to Speech engine\n");
    return 1;
}

void RealSpeak::finalize(Logger*log)
{
    if(RealSpeak::dictionary)
 free(RealSpeak::dictionary);
}

/**
 * the default constructor will create the basic tts instances and also store a pointer
 * to the logging subsystem.
 * @param log a pointer to the logging object
 */
RealSpeak::RealSpeak(Logger *log)
    :TextToSpeech(log)
{
  int lct;

  log->logf("<debug> Initializing RealSpeak Engine");
  log->logf("<debug> starting with data buffer of %d bytes", TTS_OUTPUT_BUFFER);

  SEARCHLANGUAGE(language,lct);
  if (lct < NROFLANGS) 
  {
    ttsParm.nLanguage = langConv[lct].tts_var;
    log->logf("<notice> Setting language to %s.",langConv[lct].str);
  } else {
    log->logf("<error> Language not known: \"%s\"", language);
    log->logf("<error> I know following languages: ");
    LOGKNOWNLIST(lct,langConv,NROFLANGS,"language");
    log->logf("<error> Switching to default language \"%s\".",TTS_DEFAULT_LANGUAGE);
    SEARCHLANGUAGE(TTS_DEFAULT_LANGUAGE,lct);
    if (lct < NROFLANGS)
    {
      ttsParm.nLanguage = langConv[lct].tts_var;
      log->logf("<notice> Setting language to %s.",langConv[lct].str);  
    } else {
      log->logf("<error> COULD NOT FIND DEFAULT LANGUAGE (%s)",TTS_DEFAULT_LANGUAGE);
    }
  }
  
  SEARCHVOICE(voice,lct);
  if (lct < NROFVOICES)
  {
    ttsParm.nVoice = voiceConv[lct].tts_var;
    log->logf("<notice> Setting voice to %s",voiceConv[lct].str);
  } else {
    log->logf("<error> Voice not known: \"%s\"",voice);
    log->logf("<error> I know following voices: ");
    LOGKNOWNLIST(lct,voiceConv,NROFVOICES,"voice");
    log->logf("<error> Switching to default voice \"%s\".",TTS_DEFAULT_VOICE);
    SEARCHVOICE(TTS_DEFAULT_VOICE,lct);
    if (lct < NROFVOICES)
    {
      ttsParm.nVoice = voiceConv[lct].tts_var;
      log->logf("<notice> Setting voice to %s.",voiceConv[lct].str);  
    } else {
      log->logf("<error> COULD NOT FIND DEFAULT VOICE (%s) !!!",TTS_DEFAULT_VOICE);
    }
  }

  if(ttsParm.nLanguage != TTS_LANG_US_ENGLISH)
    /* only us-english supports different voices */
    ttsParm.nVoice = TTS_RS_VOICE_FEMALE;

  ttsParm.nOutputType = TTS_LINEAR_16BIT;
  ttsParm.nFrequency = TTS_FREQ_8KHZ;
  ttsParm.nInputDataType = TTS_DATA_TYPE_TEXT;
  ttsParm.nOutputDataType = TTS_DATA_TYPE_PCM;
  ttsParm.cbFuncs.TtsSourceCb = sourceCallback;
  ttsParm.cbFuncs.TtsDestCb = destCallback;
  ttsParm.cbFuncs.TtsEventCb = CbTtsEventNotify;
  ttsParm.cbFuncs.numCallbacks = 3;
  ttsParm.szLibLocation = strdup(engine);

    int ret;
   
    if(RealSpeak::remote_server && RealSpeak::remote_service && RealSpeak::remote_port) {
 
 ttsParm.szLibLocation = NULL; 

 LH_SDK_SERVER* server = new LH_SDK_SERVER;
 server->server_handle = 0;
 strcpy(server->server.IP_Address, RealSpeak::remote_server);
 strcpy(server->server.service, RealSpeak::remote_service);
 server->server.port_number = RealSpeak::remote_port;
 log->logf("<notice> Connecting to remote TTS-server <%s> %s:%d", RealSpeak::remote_service,
  RealSpeak::remote_server, RealSpeak::remote_port);
 ret = TtsCreateEngine(server);
 log->logf("<notice> Connected. Handle=%d Server=%s Service=%s Port=%d", server->server_handle,
  server->server.IP_Address, server->server.service, server->server.port_number);
 if (ret != 0) {
     log->logf("<error> Error connecting to server: %d", ret);
 }
 ret = TtsInitialize(&hInst, server, &ttsParm, (void *)this);
    } else {
 log->logf("<debug> initializing engine directory %s.", ttsParm.szLibLocation);
 ret = TtsInitialize(&hInst, NULL, &ttsParm, (void *)this);
    }
    if (ret != 0) {
 log->logf("<error> Error initializing RealSpeak Engine: %d", ret);
    } else {
 log->logf("<debug> RealSpeak Engine initialized.");
    }
   
    if(RealSpeak::dictionary)
 loadDictionary(RealSpeak::dictionary);

    output = (unsigned char*)malloc(TTS_OUTPUT_BUFFER + 44);
    outputSize = TTS_OUTPUT_BUFFER;
    memcpy(output, wavHeader, 44);
    initMutex(&mutex);
}

TTSRETVAL RealSpeak::CbTtsEventNotify (void *pAppData, void *ppBuffer, U16 nDataSize, U16 event) {
    RealSpeak *me = (RealSpeak*)pAppData;
    //me->log->logf("<debug> Received RealSpeak Event: %d", event);
    return 0;
}

TTSRETVAL RealSpeak::sourceCallback(void *pAppData, void *data, U32 len, U32 *datasize) {
    RealSpeak *me = (RealSpeak*)pAppData;
    
    if (!me->text || !*(me->text)) {
 *datasize = 0;
 if(me->text) {
     free(me->textstart);
     me->text = 0;
 }
 return TTS_ENDOFDATA;
    }
    if (len >= strlen(me->text)) {
 memcpy(data, me->text, strlen(me->text));
 *datasize = strlen(me->text);
 me->text += strlen(me->text);
    } else {
 memcpy(data, me->text, len);
     *datasize = len;
 me->text += len;
    }
    return TTS_SUCCESS;
}

void* RealSpeak::destCallback(void *pAppData, U16 nDatatype, void *data, U32 datasize, U32 *buffersize) {
    RealSpeak *me = (RealSpeak*)pAppData;
    void*ret = 0;
    //me->log->logf("<debug> TTS: DestCallback: datatype:%d data:%08x size:%d", nDatatype, data, datasize);

    while((int)(datasize+TTS_OUTPUT_BUFFER/2) > me->outputSize - me->outputPos) {
 me->outputSize += TTS_OUTPUT_BUFFER;
 void*newdata = realloc(me->output, me->outputSize + 44);
 if(!newdata) {
     // failed, cut off
     me->log->logf("<error> TTS: realloc(%d) failed", me->outputSize);
     datasize = me->outputSize - me->outputPos;
     me->outputSize -= TTS_OUTPUT_BUFFER; //revert
     break;
 } else {
     me->log->logf("<notice> TTS: expanded output buffer to %d", me->outputSize);
     me->output = (unsigned char*)newdata;
 }
    }
    *buffersize = me->outputSize - me->outputPos;

    ret = (void *)&me->output[me->outputPos + 44];
    me->outputPos += (long)datasize;
    //me->log->logf("<debug> TTS: DestCallback: returning memory for %d bytes", *buffersize);
    return ret;
}

/**
 * free used resources
 */
RealSpeak::~RealSpeak()
{
  log->logf("<debug> Deleting RealSpeak Engine");
  int ret = TtsUninitialize(hInst);
  if (ret != 0) {
    log->logf("<error> Error closing RealSpeak Engine: %d", ret);
  }
  free(output);
  destroyMutex(&mutex);
}

/*
 * load a User Dictionary
 */
void RealSpeak::loadDictionary(const char* dictionary)
{
 int ret;
 if(dictionary) {
     userdict = 0;
     log->logf("<notice> Loading Realspeak dictionary \"%s\"...", dictionary);
     ret = TtsLoadUsrDict(0, &userdict, (char*)dictionary);
     if(ret != 0) {
  log->logf("<error> Couldn't load dictionary \"%s\", error %d/%08x", dictionary, ret, userdict);
     } else {
  ret = TtsEnableUsrDict(hInst, userdict);
  if(ret != 0) {
      log->logf("<error> Couldn't enable dictionary \"%s\", error %d", dictionary, ret);
  } else {
      log->logf("<notice> Dictionary \"%s\" loaded", dictionary);
  }
     }
 }
}

/**
 * set implementation specific parameters
 */
int RealSpeak::setParams(const char *key, const char *value)
{
  /* test which parameters are for us */
  if(!strncmp(key, "realspeak.", 10))
  {
    const char*key2 = &key[10];
    
    CRS_CHECKVAR(key2,"dictionary",dictionary,strdup(value));
    CRS_CHECKVAR(key2,"server",remote_server,strdup(value));
    CRS_CHECKVAR(key2,"service",remote_service,strdup(value));
    CRS_CHECKVAR(key2,"port",remote_port,atoi(value));
    CRS_CHECKVAR(key2,"language",language,strdup(value));
    CRS_CHECKVAR(key2,"voice",voice,strdup(value));
    CRS_CHECKVAR(key2,"engine",engine,strdup(value));
  }
  return 0;
}

/**
 * this method will convert a given text to speech and output this over the microsoft
 * audio mapper, thus creating direct audio output on the soundcard.
 * @param text the text to be converted to speech
 */
void RealSpeak::text2Audio(const char *text)
{
 log->logf("<verbose> text2Audio is not implemented in RealSpeak");
}

/**
 * create speech directly to a wave file
 * @param text the text to be spoken
 * @param filename name and path of the file to be created
 */
void RealSpeak::text2File(const char *text, const char *filename)
{
    lockMutex(&mutex);
    this->text = this->textstart = strdup(text);

    // start conversion (reset the write position first, as text2Stream does)
    outputPos = 0;
    int ret = TtsProcess(hInst);
    if (ret != 0) {
 log->logf("<error> TTS: could not process input, got error: %d", ret);
 unlockMutex(&mutex); // do not leave the mutex locked on failure
 return;
    }

    log->logf("<debug> TTS: speech complete, generated %d bytes of data.", outputPos);

    unsigned int wavelen = outputPos + 36;
    memcpy(output + 4, &wavelen, 4);
    wavelen = outputPos;
    memcpy(output + 40, &wavelen, 4);

    FILE *fp = fopen(filename, "wb");
    if (fp) {
 fwrite(output, sizeof(unsigned char), outputPos + 44, fp);
 fclose(fp);
    } else {
 log->logf("<error> TTS: could not open \"%s\" for writing", filename);
    }
    unlockMutex(&mutex);
}


/**
 * this method will create speech that is stored in a memory buffer. the created speech has the format pcm mono with
 * 8kHz sampling rate and 16 bit. The memory structure can be directly used for input to the capi
 * subsystem.
 * @param text this text will be spoken
 * @param outbuffer receives a pointer to the output buffer (the memory will be allocated)
 * @param size receives the size of the allocated memory
 */
void RealSpeak::text2Stream(const char *text, unsigned char **outbuffer, unsigned long *size)
{
    lockMutex(&mutex);
    this->text = this->textstart = strdup(text);

    // start conversion
    outputPos = 0;
    int ret = TtsProcess(hInst);
    if (ret != 0) {
 log->logf("<error> TTS: could not process input, got error: %d", ret);
 *size = 0;
 unlockMutex(&mutex); // do not leave the mutex locked on failure
 return;
    }

    log->logf("<debug> TTS: speech complete, generated %d bytes of data.", outputPos);

    
    *outbuffer = (unsigned char*)malloc (outputPos + 44);
    unsigned int wavelen = outputPos + 36;
    memcpy(&output[4], &wavelen, 4);
    wavelen = outputPos;
    memcpy(&output[40], &wavelen, 4);
    memcpy(*outbuffer, output, outputPos + 44);
    *size = outputPos + 44;
    
/*    FILE *fp = fopen("temp.wav", "wb");
    fwrite(*outbuffer, sizeof(unsigned char), outputPos + 44, fp);
    fclose(fp); */
    unlockMutex(&mutex);
}
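
The byte values in wavHeader, and the two length fields patched in by text2File and text2Stream, can be checked with a little shell arithmetic. This is only a sketch to verify the numbers (8 kHz, mono, 16 bit); the helper name le32 is invented, and all fields are little-endian as in the wavHeader table.

```shell
#!/bin/bash
# Sketch: print a 32-bit value as four little-endian hex bytes.
le32() {
  local n=$1
  printf '%02x %02x %02x %02x\n' \
    $(( n & 255 )) $(( (n >> 8) & 255 )) \
    $(( (n >> 16) & 255 )) $(( (n >> 24) & 255 ))
}

rate=8000; channels=1; bits=16
byterate=$(( rate * channels * bits / 8 ))

le32 "$rate"       # sample rate field
le32 "$byterate"   # bytes-per-second field

# For N bytes of PCM data the code writes N + 36 at offset 4
# (file length minus 8) and N at offset 40 (length of the data chunk).
pcm=16000          # one second of audio, as an example
le32 $(( pcm + 36 ))
le32 "$pcm"
```

The first two lines printed are 40 1f 00 00 and 80 3e 00 00, which match the bytes in the fourth row of wavHeader.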

Labels: caiviar, linux, ScanSoft RealSpeak, telephony, TTS

CAIVIAR capiserver using realspeak

Finally....

The caiviar server source 0.3.5 has some flaws when compiling/using it with the Scansoft RealSpeak TTS engine under Linux.

Currently, I can't make the thing connect to a Scansoft TTS-server, but I can use the engine statically linked in.

In what follows, I assumed (change for your needs):

the installation of scansoft realspeak is in REALSPEAKDIR (mine: /opt/scansoft/tts)
REALSPEAKDIR=/opt/scansoft/tts/
the caiviar source is in CAIVIARDIR (mine: /tmp/caiviar-0.3.5)
CAIVIARDIR=/tmp/caiviar-0.3.5/

this is what I did:

fix ${CAIVIARDIR}server/RealSpeak.cpp

I *did* alter more than minimally needed. Most importantly:

change the ".\\Engine" string you find into the directory needed for your system (I changed it to "/opt/scansoft/tts/engine")
replace tts_language by language
replace stdup by strdup
if needed, change or adjust the code around "Setting language to" to have YOUR constant(s) (for your specific language).
You should consult ${REALSPEAKDIR}api/inc/lh_ttsso.h (or your RealSpeak documentation) for correct values/constants.

After installation (see below), check the configuration file (probably /etc/ivr.properties): the line stating realspeak.language= should hold one of the strings caught by that functionality (the default is "german").
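
A quick way to do that check from the shell, as a sketch. It assumes ivr.properties uses plain key=value lines, which is what the realspeak.language= line suggests; the helper name get_prop is made up.

```shell
#!/bin/bash
# Sketch: print the value of a key from a key=value properties file.
# Assumes one "key=value" per line; the last matching line wins.
get_prop() {
  local file=$1 key=$2
  # Match on the text before the first "=", then strip "key=" so that
  # values containing "=" are kept whole. Exit status 1 if the key is absent.
  awk -F'=' -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); v = $0; found = 1 }
                         END { if (found) print v; else exit 1 }' "$file"
}

# Example: get_prop /etc/ivr.properties realspeak.language
```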

incorporate RealSpeak includes and objects (of your RealSpeak obtained installation)

cp ${REALSPEAKDIR}api/lib/* ${CAIVIARDIR}server/TTS
cp ${REALSPEAKDIR}api/inc/* ${CAIVIARDIR}server/TTS

configure and compile caiviar (capiserver)

cd ${CAIVIARDIR} && ./configure --enable_realspeak && make

install caiviar (capiserver)

cd ${CAIVIARDIR} && sudo make install

create sysv-init script:

create /etc/init.d/capiserver


#!/bin/bash
# Filename: capiserver
# Version: 1.0
# Author: Dieter Demerre 
# Description
#   SysV-init script to launch capiserver
#
#   usually known as: /etc/init.d/capiserver
#
NAME="CAIVIAR CAPI Server"
CAIVIARCONFIG=/etc/ivr.properties
CAIVIAR_PIDFILE=/var/run/capiserver.init.pid
CAIVIAR=/usr/local/bin/capiserver
if [ ! -x ${CAIVIAR} ]; then
  # printf interprets \n; bash's builtin echo -n would print it literally
  printf '\nError: Could not find %s executable.\n' "${CAIVIAR}" >&2
  exit 5
fi
test -f ${CAIVIARCONFIG} || printf 'WARNING no %s config found\n' "${CAIVIARCONFIG}" >&2
case "$1" in
  start) startproc -f -p ${CAIVIAR_PIDFILE} ${CAIVIAR};;
  stop) killproc -p ${CAIVIAR_PIDFILE} -TERM ${CAIVIAR};;
  restart) $0 stop ; $0 start;;
  *) printf 'unknown action.\nknown actions are: (start|stop|restart)\n' >&2; exit 1;;
esac

link to /etc/init.d/capiserver from the expected sysV runlevel directories (distribution specific, maybe use system-tools to set the links).


for level in 2 3 5;
do 
  sudo ln -sf ../init.d/capiserver /etc/init.d/rc${level}.d/S99capiserver;
done

Labels: bug, caiviar, compilation, linux, ScanSoft RealSpeak, telephony, TTS

2006-04-13

Creating separate .wav files for different voices in a midi file.

This script will generate .wav files for separate tracks of a midi-file.

TiMidity++ is used for this. It should first be configured so that it can play MIDI files, so you can be certain it has a working sound configuration (for instance using eawpats).

The script uses basename, cat, dirname, echo, mkdir, sed and timidity (all but timidity should be available on any POSIX system).

The script is fed an action file. Each line of that file holds a MIDI file name followed by space-separated voice (directory) names, in the order of the tracks in the MIDI file.
For example, execute
towav.sh --no_all --action config.towav

with config.towav holding:


mymidi/file1.mid solo piano1 piano2 nothing nothing drum
file2.mid sop1 sop2 alt1 alt2 ten bar bas
...

For more info, read the code OR execute
towav.sh --help
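To see how one line of the action file is interpreted, the following sketch prints, per voice, the track number, the --mute option the script builds for it, and the output file in the default voice/file.wav layout (relative to the target directory). It only prints; it does not call timidity. describe_action_line is a name made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: explain one action-file line.
# $1: a line such as "mymidi/file1.mid solo piano1 piano2"
describe_action_line() {
  set -f            # do not glob-expand the words of the line
  set -- $1         # split the line on whitespace
  set +f
  [ $# -gt 0 ] || return 0   # nothing to do for an empty line
  midi_file=$1
  shift
  base=$(basename "$midi_file" .mid)
  track=1
  for voice in "$@"; do
    # Track N is rendered with all tracks muted (0) except track N (-N).
    printf 'track %d: --mute=0,-%d -> %s/%s.wav\n' "$track" "$track" "$voice" "$base"
    track=$((track + 1))
  done
}
```

For instance, describe_action_line "file2.mid sop1 sop2 alt1" lists three tracks, one per voice name.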

Here's the script (I call it towav.sh)


#!/bin/bash
# Filename: towav.sh
# Author: Dieter Demerre
# Revision: $Revision:$
# Project: midi to wav
# Copyright: GPL (http://www.gnu.org/licenses/licenses.html)
# History:
#
# Description:
#   This script produces single-voice .wav files (from midi files)
#
# Requirements:
#  The script expects basename, cat, dirname, echo, mkdir, sed and timidity to be reachable in the path.
#

PROGRAMNAME="MIDI to VOICE WAV"
PROGRAMVERSION="Rel. 1.0 2006-04-13"
# INITIALIZATION SECTION
# This section is NOT in function, to pre-generate global variables.
actionfile=""
traceoption=""
createdirs=0
overwrite=0
createall=1
targetdir="."
usefilevoice=0

CAT="cat"
ECHO="echo"
MKDIR="mkdir -p"
SED="sed"
TIMIDITY="timidity"

# CODE SECTION
# To view or adapt actual execution, go to end of file to see
# main function call (and just above, the implementation)
function printUsage
{
  ${CAT} <<END_OF_USAGE

Usage:
------
towav <option>+

where <option> being amongst
------
--action <file>
REQUIRED OPTION
Will read filenames to convert (and corresponding voice-names) from <file>
<file> should have lines with space-separated tokens:
e.g.
  <filename1> <voice1> <voice2>
  <filename2> <voice1>

--complete
Will create "complete" voice-files: when a track starts with silence, this silence is also put
in the .wav file.  Omit this option to have the voice-track start at its first note instead.
This option is very useful if you want to mix the voices afterwards (like with soxmix).

--createdirs
Will try to create required directories.

--help
Will output this display.

--no_all
will prevent the production of a .wav file with ALL voices.

--overwrite
will cause existing (.wav) files to be overwritten.

--targetdir <dir>
<dir> is the main output directory, below which the directories holding the output files are created.

--use_filevoice
will create file/voice.wav files instead of voice/file.wav

END_OF_USAGE
}

function test_dir
{
  # Usage:
  # test_dir <dir>
  # with <dir> directory to be created if allowed
  if [ $# -ne 1 ]; then ${ECHO} "${FUNCNAME} with invalid nr of arguments $#" >&2; exit; fi
  if [ ! -d "${1}" -a ! -z "${createdirs}" -a 0 -ne ${createdirs} ]; then
    ${MKDIR} "${1}"
  fi
}

function compose_voicefilename
{
  # Usage:
  #   compose_voicefilename <base> <voice>
  if [ $# -ne 2 ]; then ${ECHO} "${FUNCNAME} with invalid nr of arguments $#" >&2; exit; fi
  if [ ${usefilevoice} -ne 0 ]; then
    dir=${targetdir}/${1};
    voicefile=${dir}/${2}.wav;
  else
    dir=${targetdir}/${2};
    voicefile=${dir}/${1}.wav;
  fi
}

function allmidi2wav
{
  # Usage:
  #  allmidi2wav <midi-file> <voice>+
  # with <voice>+ in order they are in midi-file.
  #
  if [ $# -lt 2 ]; then ${ECHO} "${FUNCNAME} with invalid nr of arguments $#" >&2; exit; fi
  midi_file=$1;
  base=$(basename "${midi_file}" | ${SED} 's/\.mid$//;s/\.midi$//;')
  if [ ! -f "${midi_file}" ]; then
    ${ECHO} "Could not find input midi-file ${midi_file}" >&2
  else
    voicenr=1;
    shift 1;
    while [ $# -gt 0 ]; do
      voice=${1};
      compose_voicefilename "${base}" "${voice}"
      # mute all tracks (0), then unmute only track ${voicenr}
      options="${traceoption} --mute=0,-${voicenr} "
      midi_to_wav "${midi_file}" "${voicefile}" "${options}"
      shift 1;
      voicenr=$(( voicenr + 1 ))
    done
    if [ ${createall} -ne "0" ]; then
      compose_voicefilename "${base}" "all"
      options=" "
      midi_to_wav "${midi_file}" "${voicefile}" "${options}"
    fi
  fi
}

function midi_to_wav
{
  # Usage:
  #  midi_to_wav <inputfile> <outputfile> <convertoptions>
  local inputfile=$1;
  if [ ! -s "${inputfile}" ]; then
    ${ECHO} "Could not find input-file ${inputfile}." >&2;
    return
  fi
  local outputfile=$2;
  dir=$(dirname "${outputfile}")
  test_dir "${dir}"
  if [ ! -d "${dir}" ]; then
    ${ECHO} "could not find directory ${dir}." >&2;
  elif [ ${overwrite} -eq 0 -a -s "${outputfile}" ]; then
    ${ECHO} "Found already existing ${outputfile}."
  else
    local options=$3;
    # ${options} is deliberately unquoted: it may hold several options or none
    ${TIMIDITY} ${options} --output-mode=w --output-file="${outputfile}" "${inputfile}"
  fi
}

function EvaluateArguments
{
  while [ ${#} -gt 0 ]; do
    case "${1}" in
    --action)
      shift
      if [ ${#} -gt 0 ] && [ -f "${1}" ]; then
        actionfile=${1}
        ${ECHO} "Will perform actions from ${actionfile}"
      else
        ${ECHO} "Could not find action file"
        actionfile=""
      fi
      ;;
    --complete)
      ${ECHO} "Will produce silence-preceded voice-files"
      traceoption="--trace"
      ;;
    --createdirs)
      ${ECHO} "Will try to create required directories"
      createdirs=1
      ;;
    --help)
      actionfile=""
      return
      ;;
    --no_all)
      ${ECHO} "Will not generate file for all voices"
      createall=0
      ;;
    --overwrite)
      ${ECHO} "Will overwrite existing files"
      overwrite=1
      ;;
    --targetdir)
      shift
      if [ ${#} -gt 0 ]; then
        targetdir=${1}
        ${ECHO} "Will create voice-parts in directory ${targetdir}"
      else
        ${ECHO} "Could not find target directory" >&2
        targetdir=""
      fi
      ;;
    --use_filevoice)
      ${ECHO} "Will produce file/voice.wav files instead of voice/file.wav"
      usefilevoice=1
      ;;
    *)
      ${ECHO} "Unknown option $1" >&2
      ;;
    esac
    shift
  done
}

function ExecuteActions
{
  if [ -z "${actionfile}" -o ! -f "${actionfile}" ]; then
    ${ECHO} "no action ${actionfile} file specified or found" >&2
    printUsage
  else
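    ${ECHO} "Now evaluating ${actionfile}"
    # Open the action file on file descriptor 3 (exec is needed to keep it
    # open), so that commands in the loop body cannot consume it via stdin.
    exec 3<"${actionfile}"
    while read -r -a line <&3 || [ ${#line[@]} -gt 0 ]; do
      # skip empty lines
      if [ ${#line[@]} -eq 0 ]; then continue; fi
      if [ ! -f "${line[0]}" ]; then
        ${ECHO} "input midi file ${line[0]} was not found" >&2
      else
        allmidi2wav "${line[@]}"
      fi
    done
    exec 3<&-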
  fi
}

function main
{
  ${ECHO} ${PROGRAMNAME}
  ${ECHO} ${PROGRAMVERSION}
  ${ECHO} "----------------------------"
  EvaluateArguments "$@"
  test_dir "${targetdir}"
  if [ ! -d "${targetdir}" ]; then
    ${ECHO} "could not find target directory ${targetdir}" >&2
  else
    ExecuteActions
  fi
}

main "$@"

Labels: conversion, linux, midi, timidity

2006-03-24

ScanSoft RealSpeak 3.51 for Linux

hi,

A short description of how I installed (and use) it on my
Debian Sarge system.

I have since been shown that the same procedure also works on Ubuntu 6.06.

Here we go.

Install

ScanSoft RealSpeak Host 3.51 comes as two .rpm files:

rs-api-<version-stuff>.rpm (for developers)
rs-<lang>-<version-stuff>.rpm (the language data).

The <lang> code differs per language package. Belgian Dutch, for instance, is "dub".

For ease of use (especially installation) on a .deb based system (like Debian, Ubuntu, ...), we convert these packages using alien:

fakeroot alien rs-api-3.51.00.02-1.i386.rpm
fakeroot alien rs-dub-3.51.00.02-1.i386.rpm

Then install the resulting packages:

sudo dpkg -i rs-api_3.51.00.02-2_i386.deb
sudo dpkg -i rs-dub_3.51.00.02-2_i386.deb

This puts the ScanSoft (RealSpeak) files into /opt/scansoft.

To allow applications to reach the libraries, I added to /etc/ld.so.conf a line:

/opt/scansoft/engine

followed by running sudo /sbin/ldconfig to apply the change.
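If you repeat the installation, you may not want the same line to end up in the file twice. A small sketch that appends the directory only when it is not listed yet; add_lib_dir is a hypothetical helper name, and it has to run with enough rights to write the file:

```shell
#!/bin/sh
# Hypothetical helper: append a library directory to an ld.so.conf-style
# file unless an identical line is already present.
# $1: directory to add, $2: configuration file (default /etc/ld.so.conf)
add_lib_dir() {
  dir=$1
  conf="${2:-/etc/ld.so.conf}"
  # -x: match whole lines only, -F: treat the directory as a fixed string
  if grep -qxF -- "$dir" "$conf" 2>/dev/null; then
    echo "$dir already listed in $conf"
  else
    echo "$dir" >> "$conf"
    echo "$dir added to $conf"
  fi
}
```

Run as root, for example: add_lib_dir /opt/scansoft/engine && /sbin/ldconfig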

Test/Usage

ScanSoft RealSpeak comes with a couple of demos, which can be used to test that it works. One of them, standard, simply converts a text file into a raw audio file.

cd /tmp
echo "This is the text I want to be converted to speech by Scansoft." >> text.txt
/opt/scansoft/tts/api/demos/standard 0 0 /opt/scansoft/tts/engine /tmp/text.txt

The values 0 and 0 in the example select American English (first zero) and a female voice (second zero). The values for your language package may differ.

My system, which has a female Flemish (Belgian Dutch) speaker, uses 14 and 0 respectively. The correct values are in the include directory of rs-api: read /opt/scansoft/tts/api/inc/lh_ttsso.h to find them.

Look for a line like:

#define TTS_LANG_<YOURLANGUAGE> <number>

that will give you the language number.
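Instead of reading the header by eye, you can let sed list every such line. This sketch assumes the defines look exactly like the pattern above, with a plain numeric value on the same line; list_tts_languages is a name invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: list "<number> <LANGUAGE>" for every
# "#define TTS_LANG_<LANGUAGE> <number>" line in a header file.
# $1: header file (defaults to the path mentioned in the text)
list_tts_languages() {
  header="${1:-/opt/scansoft/tts/api/inc/lh_ttsso.h}"
  # Group 1 is the language name, group 2 the (decimal or hex) number.
  sed -n 's/^[[:space:]]*#[[:space:]]*define[[:space:]]\{1,\}TTS_LANG_\([A-Za-z0-9_]\{1,\}\)[[:space:]]\{1,\}\([0-9][0-9A-Fa-fxX]*\).*$/\2 \1/p' "$header"
}
```

The number in the first column is the value to pass as the first argument to the standard demo.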

The demo invocation above writes a file called standard.pcm into the /tmp directory. You can listen to this file using:

play --type=raw --channels=1 --rate=8000 -s -2 --endian=little standard.pcm

play is provided by the sox package.
I used the version dated 2007-01-31 (version 13.0.0-1).

For an older version (Debian Etch uses sox 12.17.9-1), you could try:

play -t raw -c 1 -r 8000 -f s -s w standard.pcm

or you could install the alsa-utils-package, and use aplay:

aplay --type=raw --channels=1 --rate=8000 --format=S16_LE standard.pcm

You even might want to convert the audio-file to an mp3-file:

lame -r -m m -s 8 -x standard.pcm standard.mp3

Note that the last 3 seconds of the audio might not be played. This seems to have something to do with the rudimentary way of playing them. Use another player and it might work as expected.
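To judge whether the end really goes missing during playback, it helps to know how long the file should be. With the format used in the commands above (8000 samples per second, 16-bit samples, one channel), every second of audio takes 16000 bytes, so the duration follows from the file size. A minimal sketch; pcm_duration_seconds is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the duration (whole seconds, rounded down)
# of a raw PCM file, assuming 8000 Hz, 16-bit samples, mono.
# $1: the raw PCM file
pcm_duration_seconds() {
  bytes=$(( $(wc -c < "$1") ))
  # 8000 samples per second * 2 bytes per sample * 1 channel
  echo $(( bytes / (8000 * 2 * 1) ))
}
```

Compare the result of pcm_duration_seconds /tmp/standard.pcm with what you actually hear from your player.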

Labels: linux, Nuance, ScanSoft RealSpeak, TTS

2006-03-02

MIDI to WAV conversion

Situation:
I had a couple of MIDI files that I wanted (needed) to put on music CDs. My approach was to convert them to wav or mp3 files first, and then burn those to CD using a program like Ahead Nero (available separately if it did not come with your CD writer).

Tools:
TiMidity++ is a converter that renders MIDI files into audio files (like RIFF WAVE).

These distributions came with a sufficiently configured timidity package out of the box:

SuSE 9.0 (YaST - Software - timidity)
Debian 3.1a (apt-get install freepats timidity timidity-interfaces-extra)

If you have a TiMidity installation that's well-configured (capable of playing a midi-file on your (linux-) system), you'll be able to use it to save wav files.

Short way:

timidity --output-mode=w --output-file=result.wav source.mid

Long way:
You can first check whether your player at least produces correct audio (through the system's speakers) by executing

timidity source.mid

If this doesn't produce correct (possibly musical) sound, you should check the configuration of your timidity before continuing here.

This command creates the wav file from a midi file:

timidity --output-mode=w --output-file=result.wav source.mid

To reduce the size of the files, I use a lower sampling frequency:

timidity --sampling-freq=16000 --output-mode=w --output-file=result.wav source.mid

If I want only one single voice (or mute some channels), I use:

timidity --mute=0,-4 --output-mode=w --output-file=result.wav source.mid

The previous example mutes all voices (0) and then unmutes voice 4 (in my case almost always the bass line).
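The same idea extends to every voice in turn. The sketch below only prints one timidity command per track, using the options shown above, so you can inspect the list before running anything. The number of tracks has to be supplied by you, and print_track_commands and the <name>-track<N>.wav naming are choices made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print one timidity command per track (dry run).
# $1: source midi file, $2: number of tracks to render
print_track_commands() {
  source_file=$1
  track_count=$2
  base=$(basename "$source_file" .mid)
  track=1
  while [ "$track" -le "$track_count" ]; do
    # Mute everything (0), then unmute only the current track.
    printf 'timidity --mute=0,-%d --output-mode=w --output-file=%s-track%d.wav %s\n' \
      "$track" "$base" "$track" "$source_file"
    track=$((track + 1))
  done
}
```

print_track_commands source.mid 4 prints four lines; pipe the output into sh once the commands look right.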