Speech!

  • Dekudude
    New Member
    • Jul 2008
    • 9

    Hi there!

    I'm working on a simple script, and I was wondering whether there is some way to use Microsoft's built-in speech SDKs (text to voice, and voice to text). I'd like to capture raw_input()-style data from my microphone whenever the program recognizes that something came through it. I'd also like the program to use the default text-to-voice synthesizer I have installed, currently Microsoft Mary.

    Is this possible? If so, how might I go about doing it? Are there any simple modules with functions such as "say()" or "listen()"? Thanks a lot!
  • Dekudude
    New Member
    • Jul 2008
    • 9

    #2
    Hmm, okay. I figured out text to voice. Is there some way to do voice to text, though? I already have the Microsoft speech API, SAPI, but I don't know how to make that work with Python...

    • heiro
      New Member
      • Jul 2007
      • 56

      #3
      posting again with code tags
      Last edited by heiro; Jul 3 '08, 06:00 PM. Reason: no code tags

      • heiro
        New Member
        • Jul 2007
        • 56

        #4
        Code:
        from win32com.client import constants
        import win32com.client
        import pythoncom
        class SpeechRecognition:
            """ Initialize the speech recognition with the passed in list of words """
            def __init__(self, wordsToAdd):
                # For text-to-speech
                self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
                # For speech recognition - first create a listener
                self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
                # Then a recognition context
                self.context = self.listener.CreateRecoContext()
                # which has an associated grammar
                self.grammar = self.context.CreateGrammar()
                # Do not allow free word recognition - only command and control
                # recognizing the words in the grammar only
                self.grammar.DictationSetState(0)
                # Create a new rule for the grammar, that is top level (so it begins
                # a recognition) and dynamic (ie we can change it at runtime)
                self.wordsRule = self.grammar.Rules.Add("wordsRule",
                                constants.SRATopLevel + constants.SRADynamic, 0)
                # Clear the rule (not necessary first time, but if we're changing it
                # dynamically then it's useful)
                self.wordsRule.Clear()
                # And go through the list of words, adding each to the rule
                for word in wordsToAdd:
                    self.wordsRule.InitialState.AddWordTransition(None, word)
                # Commit the changes to the grammar
                self.grammar.Rules.Commit()
                # Then set the wordsRule to be active
                self.grammar.CmdSetRuleState("wordsRule", 1)
                # And add an event handler that's called back when recognition occurs
                self.eventHandler = ContextEvents(self.context)
                # Announce we've started
                self.say("Started successfully")
            def say(self, phrase):
                self.speaker.Speak(phrase)
         
        class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
            def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
                newResult = win32com.client.Dispatch(Result)
                print("You said: " + newResult.PhraseInfo.GetText())

        if __name__ == '__main__':
            wordsToAdd = [ "One", "Two", "Three", "Four" ]
            speechReco = SpeechRecognition(wordsToAdd)
            while 1:
                pythoncom.PumpWaitingMessages()
        For text to speech:
        Code:
         
        import sys
        from win32com.client import constants
        import win32com.client
         
        speaker = win32com.client.Dispatch("SAPI.SpVoice")
        while 1:
           try:
              s = raw_input('Type word or phrase: ')
              speaker.Speak(s)
           except EOFError:
              # Exit cleanly on end of input (Ctrl-Z then Enter on Windows)
              sys.exit()

        • Dekudude
          New Member
          • Jul 2008
          • 9

          #5
          Thanks for your response.

          However, that is not exactly what I want. For starters, I already got the text-to-voice working. As for voice-to-text, I already tried that example, but it doesn't work how I want it to. As far as my beginner eyes can see, there's no way to replicate raw_input() using it, where if I said, "This is a test, hello world", it would enter that.

          Do you understand what I'm saying? If you can show me how to use that code example, that would be great, but I just want a raw_input() style function that enters data via voice-to-text.

          Thanks!
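
One possible way to get a raw_input()-style function out of the recognition example in #4 is to switch the grammar from a fixed word list to dictation and block until a phrase arrives. The following is only a minimal sketch, not a tested solution: the names `listen`, `wait_for_text` and `DictationEvents` are made up for illustration, it assumes pywin32 is installed and a SAPI recognizer is configured, and it assumes that `DictationSetState(1)` is enough to activate the default dictation topic on your machine.

```python
import time


def wait_for_text(get_text, pump, timeout=10.0, poll=0.05):
    """Call pump() repeatedly until get_text() returns a phrase.

    Returns the phrase, or None if nothing arrives within `timeout` seconds.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        pump()                      # let COM deliver pending events
        text = get_text()
        if text is not None:
            return text
        time.sleep(poll)            # avoid spinning the CPU
    return None


def listen(timeout=10.0):
    """Hypothetical raw_input()-like helper: returns one dictated phrase."""
    # Windows-only imports are deferred so the module still loads elsewhere.
    import pythoncom
    import win32com.client

    heard = []

    class DictationEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
        def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
            result = win32com.client.Dispatch(Result)
            heard.append(result.PhraseInfo.GetText())

    listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
    context = listener.CreateRecoContext()
    grammar = context.CreateGrammar()
    grammar.DictationSetState(1)        # assumption: 1 turns free dictation on
    handler = DictationEvents(context)  # keep a reference so events keep firing
    try:
        return wait_for_text(lambda: heard[0] if heard else None,
                             pythoncom.PumpWaitingMessages, timeout)
    finally:
        grammar.DictationSetState(0)    # turn dictation off again
```

With something like this in place, `s = listen()` could stand in for `s = raw_input(...)` in the text-to-speech loop from #4. The waiting loop is split out from the SAPI setup so that it can be exercised without a microphone.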
