Edge AI SDK/GenAIChatbot

From ESS-WIKI
Revision as of 08:45, 20 May 2025 by Will.qiu (talk | contribs)
Jump to: navigation, search
Brief of GenAI Chatbot

Introduction

GenAI Chatbot is a next-generation conversational AI assistant built on the OLLAMA architecture, supporting all models compatible with OLLAMA. Designed for seamless integration with GenAI Studio, it allows users to directly import models that have been fine-tuned within GenAI Studio, enabling easy deployment and immediate use of custom models in the chatbot. At its core, GenAI Chatbot utilizes efficient Small Language Models (SLMs) to provide natural, context-aware interactions. The chatbot features advanced capabilities, including audio processing (Speech-to-Text [STT], Text-to-Speech [TTS]), Retrieval-Augmented Generation (RAG), and an embedded vector database (VectorDB), all within a flexible configuration suitable for diverse application scenarios. It is currently optimized for embedded platforms such as NVIDIA Jetson Orin Nano and Jetson Orin AGX.

SLM Chatbot

The GenAI Chatbot utilizes a Small Language Model (SLM) as its core engine. SLMs provide efficient language understanding and generation capabilities, delivering fast and accurate responses with low computational overhead. This allows the chatbot to operate smoothly on both edge devices and server environments.

Integrate with GenAI Studio

GenAI Chatbot is fully integrated with GenAI Studio, the AI LLM model management platform. Within GenAI Studio, users can:

  • Perform model fine-tuning.
  • Access a variety of model quantization methods for deployment on different hardware.
  • Convert models easily to the formats required by various platforms.
  • Customize and personalize AI models with ease.

This integration significantly simplifies the process of deploying and customizing chatbots for both enterprises and developers.

Audio

  • Speech-to-Text (STT)

GenAI Chatbot supports natural voice input, automatically converting user speech queries into text for processing. The system can flexibly integrate with a variety of Speech-to-Text (STT) engines and providers. By default, it uses the Whisper-base model and supports integration with OpenAI API, Web Browser API, Deepgram API, and Azure AI Speech API, making it easy to connect to different STT services as needed.

  • Text-to-Speech (TTS)

Chatbot responses can be automatically converted to natural-sounding speech using Text-to-Speech (TTS). Users are free to choose different voices and languages. The system supports the Web API by default and can be integrated with OpenAI API, Transformers, ElevenLabs, and Azure AI Speech API, providing flexible options for TTS services.

RAG

GenAI Chatbot supports Retrieval-Augmented Generation (RAG), combining large language models (LLMs) with external knowledge bases to greatly improve the accuracy and depth of answers. This makes it suitable for knowledge-based Q&A, document search, FAQ systems, and various information retrieval scenarios.

  • Embedded
    • The system converts various documents, knowledge sources, or custom content into vector (embedding) representations to enable efficient semantic search.
    • By default, the sentence-transformers/all-MiniLM-L6-v2 model is used for embeddings, balancing computational efficiency and semantic understanding.
    • Supports automatic embedding of multilingual documents and multiple file formats (such as txt, pdf, markdown, html, etc.).
    • Custom embedding models are supported to flexibly meet the needs of different professional domains.
  • VecorDB
    • The generated embeddings are stored in a VectorDB (vector database), supporting large-scale, efficient semantic search and matching.
    • By default, ChromaDB is used as the vector database, offering real-time retrieval and ease of deployment and management.
    • The system can be extended to support other mainstream VectorDBs, such as Pinecone, Weaviate, Qdrant, etc., making it suitable for enterprise or cloud-based deployments.
    • Built-in features such as deduplication, sharding, and data weighting ensure query efficiency and result quality.

How To

  • Download SLM Models from GenAI Studio
Genai-studio-1.png
Genai-studio-2.png
Genai-studio-3.png
Genai-studio-4.png


  • Crate a new chatbot assistant with RAG
  • Speak
    • Audio to Azure AI Speech API

Configuration

- Set Azure TTS Service