Class AssistantService
java.lang.Object
ecmwf.common.ai.AssistantService
AssistantService provides an AI assistant for OpenECPDS documentation.
This service uses a two-model setup:
- FAST model: used exclusively to rewrite user questions into optimized search queries for RAG (Retrieval-Augmented Generation).
- DEEP model: always streams the final answer to the user, based on documentation segments retrieved from the RAG index.
The class supports:
- Streaming AI responses with cancellation support.
- RAG retrieval using vector embeddings and Lucene-based indexing.
- Query rewriting for improved search coverage.
- Segment reranking and hierarchical filtering to produce concise, grounded answers.
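The rewrite, retrieve, and rerank stages listed above can be sketched as a small pipeline. This is a minimal illustration, not the actual implementation of AssistantService: the `Segment` record, the `selectSegments` helper, the stub rewriter and retriever, and the placeholder question are all hypothetical, and hierarchical filtering and compression are left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

class RagPipelineSketch {

    /** Hypothetical stand-in for a retrieved documentation segment and its relevance score. */
    record Segment(String text, double score) {}

    static List<Segment> selectSegments(String question,
                                        Function<String, String> rewriter,
                                        Function<String, List<Segment>> retriever,
                                        int topK) {
        // FAST step: turn the user question into a search query.
        String query = rewriter.apply(question);
        // Retrieval step: copy the result so the sort never mutates the retriever's list.
        List<Segment> candidates = new ArrayList<>(retriever.apply(query));
        // Rerank: highest score first.
        candidates.sort(Comparator.comparingDouble(Segment::score).reversed());
        // Keep only the top-k segments that would be passed to the DEEP prompt.
        int limit = Math.max(0, Math.min(topK, candidates.size()));
        return new ArrayList<>(candidates.subList(0, limit));
    }

    public static void main(String[] args) {
        // Stub rewriter and retriever; a real setup would call a model and an index.
        Function<String, String> rewriter = q -> q.toLowerCase().replace("?", "").trim();
        Function<String, List<Segment>> retriever = query -> List.of(
                new Segment("segment A", 0.2),
                new Segment("segment B", 0.9),
                new Segment("segment C", 0.5));
        for (Segment s : selectSegments("How do I configure a destination?", rewriter, retriever, 2)) {
            System.out.println(s.score() + " " + s.text());
        }
    }
}
```

Passing the rewriter as a function makes it easy to swap in an identity function, which mirrors using the original question directly when rewriting is disabled.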
Thread safety:
- Multiple threads can safely call askStreaming(List, String, JsonNode, java.util.function.Consumer, java.util.concurrent.atomic.AtomicBoolean).
- The streaming consumer executor handles asynchronous token delivery.
- Index building is single-threaded and performed at initialization.
Author:
Laurent Gougeon
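Field Summary
Fields: DEFAULT_DEEP_PROMPT, REWRITE_FAST_PROMPT
Constructor Summary
Constructor | Description
AssistantService() | Default constructor using base configuration from Cnf.
AssistantService(String baseUrl, String fastModelName, String deepModelName, String embeddingModelName, Path docsPath, Predicate<Path> filter, String deepSystemPrompt) | Full constructor.
Method Summary
Modifier and Type | Method | Description
String | askStreaming(List<ChatConversation.Message> history, String question, com.fasterxml.jackson.databind.JsonNode context, Consumer<String> consumer, AtomicBoolean cancelled) | Main entry point for streaming answers from the DEEP model.
void | shutdown() | Shuts down the streaming consumer executor.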
Field Details
DEFAULT_DEEP_PROMPT
REWRITE_FAST_PROMPT
Constructor Details
AssistantService
public AssistantService()
Default constructor using base configuration from Cnf. Initializes the FAST, DEEP, and embedding models and builds the RAG index if needed.
AssistantService
public AssistantService(String baseUrl, String fastModelName, String deepModelName, String embeddingModelName, Path docsPath, Predicate<Path> filter, String deepSystemPrompt)
Full constructor.
Parameters:
baseUrl - base URL for the Ollama API
fastModelName - model name for the FAST model (query rewriting)
deepModelName - model name for the DEEP model (answer streaming)
embeddingModelName - model name for the embedding model
docsPath - root path to documentation files
filter - predicate to select which files to index
deepSystemPrompt - system prompt text to guide DEEP model answers; if null, defaults to DEFAULT_DEEP_PROMPT
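The filter parameter is a plain Predicate<Path>, so any file-selection rule can be supplied. A minimal sketch of one such rule follows. The `byExtension` helper and the choice of `md` and `txt` extensions are assumptions for illustration; the source does not say which file types are indexed.

```java
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;

class DocsFilterSketch {

    /** Builds a filter that accepts files whose extension (case-insensitive) is in the given set. */
    static Predicate<Path> byExtension(Set<String> extensions) {
        return path -> {
            Path fileName = path.getFileName();
            if (fileName == null) {
                return false; // e.g. a filesystem root has no file name
            }
            String name = fileName.toString().toLowerCase(Locale.ROOT);
            int dot = name.lastIndexOf('.');
            return dot >= 0 && extensions.contains(name.substring(dot + 1));
        };
    }

    public static void main(String[] args) {
        // Hypothetical choice of extensions; adjust to the documentation tree in use.
        Predicate<Path> filter = byExtension(Set.of("md", "txt"));
        System.out.println(filter.test(Path.of("docs/guide/intro.md")));
        System.out.println(filter.test(Path.of("docs/img/logo.png")));
    }
}
```

A predicate built this way could be passed as the filter argument of the full constructor, alongside the docsPath root it is meant to screen.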
Method Details
askStreaming
public String askStreaming(List<ChatConversation.Message> history, String question, com.fasterxml.jackson.databind.JsonNode context, Consumer<String> consumer, AtomicBoolean cancelled)
Main entry point for streaming answers from the DEEP model.
Rewrites the question using the FAST model, retrieves relevant documentation segments from the RAG index, reranks, filters, and compresses them, then streams the answer.
If REWRITE_PROMPT_WITH_FAST_MODEL is false, rewriting is skipped and the original question is used directly for RAG retrieval.
Parameters:
history - multi-turn chat history
question - user question
context - optional additional user context as JSON
consumer - callback that receives tokens as they are streamed
cancelled - atomic flag that can cancel the streaming
Returns:
full answer as a string
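The consumer and cancelled parameters follow a common streaming pattern: tokens are pushed to a callback while a shared flag is checked between tokens. The sketch below imitates that contract with a hypothetical `streamTokens` stand-in and made-up tokens; it does not call AssistantService. What the real method returns after a cancellation is not stated in this documentation, so the partial-answer behavior shown here is an assumption.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

class StreamingCancelSketch {

    /**
     * Hypothetical stand-in for a streaming call: pushes tokens to the consumer
     * until the flag is set, then returns the text delivered so far.
     */
    static String streamTokens(List<String> tokens, Consumer<String> consumer, AtomicBoolean cancelled) {
        StringBuilder answer = new StringBuilder();
        for (String token : tokens) {
            if (cancelled.get()) {
                break; // stop before delivering the next token
            }
            consumer.accept(token);
            answer.append(token);
        }
        return answer.toString();
    }

    public static void main(String[] args) {
        AtomicBoolean cancelled = new AtomicBoolean(false);
        StringBuilder shown = new StringBuilder();
        // The consumer displays tokens and requests cancellation after 10 characters.
        Consumer<String> consumer = token -> {
            shown.append(token);
            if (shown.length() >= 10) {
                cancelled.set(true);
            }
        };
        String answer = streamTokens(List.of("Open", "ECPDS ", "moves ", "data."), consumer, cancelled);
        System.out.println(answer);
    }
}
```

In a real caller the flag would more typically be set from another thread, for example when the user closes the chat window, which is why an AtomicBoolean is used rather than a plain boolean.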
shutdown
public void shutdown()
Shuts down the streaming consumer executor. Should be called at application shutdown to clean up resources.
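One way to make sure a shutdown method like this runs is to register it as a JVM shutdown hook. The sketch below uses a hypothetical `StreamingService` stand-in that owns an executor; the five-second grace period and the internals of its `shutdown()` are assumptions, not a description of what AssistantService does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownSketch {

    /** Hypothetical stand-in for a service that owns an executor for token delivery. */
    static final class StreamingService {
        private final ExecutorService consumerExecutor = Executors.newSingleThreadExecutor();

        void shutdown() {
            // Stop accepting new tasks, then give running ones a short grace period.
            consumerExecutor.shutdown();
            try {
                if (!consumerExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
                    consumerExecutor.shutdownNow();
                }
            } catch (InterruptedException e) {
                consumerExecutor.shutdownNow();
                Thread.currentThread().interrupt(); // preserve the interrupt status
            }
        }

        boolean isShutdown() {
            return consumerExecutor.isShutdown();
        }
    }

    public static void main(String[] args) {
        StreamingService service = new StreamingService();
        // Run the cleanup when the JVM exits normally or on SIGTERM.
        Runtime.getRuntime().addShutdownHook(new Thread(service::shutdown, "assistant-shutdown"));
    }
}
```

In a container-managed application, the same call would usually go in the framework's own lifecycle callback instead of a raw shutdown hook.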