



2. The user's audio is transcribed to text by the STT (Speech-to-Text) service.
3. The transcribed text is sent to AgentBot as input, and AgentBot returns a response in real time.
4. The response text is converted to audio by the TTS (Text-to-Speech) service, and that audio is played to the user. In this way, Voice understands the user's intent and responds in natural language.
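
The flow in steps 2 to 4 can be sketched as a single function that chains the three services. This is only an illustrative sketch, not the actual Voice or AgentBot API: `handle_turn`, `VoiceTurn`, and the `fake_*` stand-ins are hypothetical names, and the real STT, AgentBot, and TTS services are assumed to be callable in roughly this shape.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical service signatures; the real STT, AgentBot, and TTS
# interfaces are not specified in this document.
SpeechToText = Callable[[bytes], str]
Agent = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class VoiceTurn:
    """Result of one user turn: what was heard, and what was answered."""
    transcript: str
    reply_text: str
    reply_audio: bytes


def handle_turn(audio: bytes, stt: SpeechToText, agent: Agent, tts: TextToSpeech) -> VoiceTurn:
    transcript = stt(audio)         # step 2: audio -> text
    reply_text = agent(transcript)  # step 3: text -> bot response
    reply_audio = tts(reply_text)   # step 4: response text -> audio
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in services so the sketch runs without any external dependency.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_agent(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


turn = handle_turn(b"check my balance", fake_stt, fake_agent, fake_tts)
print(turn.reply_text)  # prints "You said: check my balance"
```

Passing the services in as callables keeps the pipeline independent of any particular STT or TTS provider, so each stage can be swapped or tested in isolation.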