We are seeking a highly skilled Python Developer with a strong background in audio processing and experience with communication APIs such as Twilio to lead the development of a proof of concept (POC). The project will implement and evaluate various transcription and synthesis services to identify the most effective combination for natural conversation over the phone. The successful candidate will build an AI-driven system that users interact with via phone calls and that lets callers choose, in real time, which transcription and synthesis services handle the conversation, so the options can be compared directly. The role requires technical proficiency in Python, a deep understanding of audio processing technologies, and experience integrating services such as OpenAI's Whisper and various synthesis tools.
Key Responsibilities:
Develop and implement a POC using the vocode library, integrating various transcription and synthesis services to evaluate their performance in creating natural phone conversations.
Design an interactive phone system using Twilio that lets users select different transcription and synthesis options via keypad input (1, 2, 3, 4), so they can interact directly with OpenAI technologies and other services.
Conduct extensive testing to compare the efficiency, accuracy, and naturalness of conversation using different combinations of transcription and synthesis services.
Establish criteria and benchmarks for evaluating the performance of these services in terms of voice quality, response speed, and overall conversational fluidity.
Collaborate with cross-functional teams to ensure the integration of the POC with existing systems and workflows, enhancing our voice interaction capabilities.
Stay updated with the latest advancements in audio processing, machine learning, and AI technologies to continuously improve the system's performance.
Provide expertise in the deployment of scalable and robust systems capable of handling real-time audio processing and interaction over the phone.
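To make the keypad-selection responsibility concrete, here is a minimal sketch of how the four options might map to service combinations. It rests on assumptions: the service names other than Whisper are placeholders, the `/select` route and both function names are hypothetical, and the TwiML is built with the standard library rather than with the Twilio or vocode SDKs. It relies only on the TwiML `<Gather>` and `<Say>` verbs and on the `Digits` parameter Twilio sends to the `action` URL.

```python
import xml.etree.ElementTree as ET

# Hypothetical menu: these pairings are placeholders for illustration,
# not the combinations the POC is required to test.
SERVICE_MENU = {
    "1": {"transcriber": "whisper", "synthesizer": "synthesizer_a"},
    "2": {"transcriber": "whisper", "synthesizer": "synthesizer_b"},
    "3": {"transcriber": "transcriber_b", "synthesizer": "synthesizer_a"},
    "4": {"transcriber": "transcriber_b", "synthesizer": "synthesizer_b"},
}


def build_menu_twiml(action_url: str = "/select") -> str:
    """Return TwiML that reads the menu and collects one keypad digit."""
    response = ET.Element("Response")
    # <Gather> collects DTMF digits and posts them to action_url as "Digits".
    gather = ET.SubElement(response, "Gather", numDigits="1", action=action_url)
    for digit in sorted(SERVICE_MENU):
        combo = SERVICE_MENU[digit]
        say = ET.SubElement(gather, "Say")
        say.text = (
            f"Press {digit} for {combo['transcriber']} "
            f"with {combo['synthesizer']}."
        )
    return ET.tostring(response, encoding="unicode")


def select_services(digits: str):
    """Map the digit the caller pressed to a service combination, or None."""
    return SERVICE_MENU.get(digits)


if __name__ == "__main__":
    print(build_menu_twiml())
    print(select_services("1"))
```

In a real deployment, a web handler would return `build_menu_twiml()` when the call connects, then pass the result of `select_services()` to whatever code starts the conversation with the chosen transcriber and synthesizer.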
Desirable Skills:
Experience with deploying AI and machine learning models, particularly in the context of natural language processing and speech synthesis.
Familiarity with cloud services and infrastructure that support scalable audio processing applications.
Reference:
https://docs.vocode.dev/open-source-quickstart