Model selector
Pick which LLM the bot uses for each conversation.
Pegasus exposes a model selector in the chat input area. Each conversation can use a different model, and the list is limited by your subscription plan.
Choose a model
Click the current model name in the chat input bar. A dropdown shows the models available to your plan. Once you pick one, future messages in that conversation use it. Earlier messages stay associated with whichever model produced them.
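The per-conversation behavior described above can be pictured with a small sketch. This is illustrative only, not Pegasus's implementation: the `Conversation` and `Message` classes and the model names `"standard"` and `"advanced"` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    text: str
    model: str  # the model in effect when this message was sent


@dataclass
class Conversation:
    model: str  # the model currently selected in the input bar
    messages: List[Message] = field(default_factory=list)

    def select_model(self, model: str) -> None:
        # Changing the selection affects future messages only.
        self.model = model

    def send(self, text: str) -> Message:
        # Each message records the model that was selected at send time.
        msg = Message(text=text, model=self.model)
        self.messages.append(msg)
        return msg


# Hypothetical usage: switch models partway through a conversation.
convo = Conversation(model="standard")
convo.send("First question")
convo.select_model("advanced")
convo.send("Second question")
```

In this sketch the first message stays tagged with `"standard"` after the switch, which mirrors how earlier messages keep their original model.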
Default model
A new conversation starts with your plan's default model. Pegasus picks this default as the best general-purpose option for that plan tier.
Plan gating
- Free - standard model only
- Pro - standard plus faster variants
- Ultra - advanced models that trade more credits for better performance on hard questions
Models you cannot access stay visible in the picker, but they are disabled and show an upgrade hint.
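As a rough sketch of how this gating could be modeled, the snippet below builds the picker entries for a plan. Everything in it is hypothetical: the model identifiers, the hint text, and the assumption that Ultra also includes the Pro and Free models are not confirmed by this page.

```python
# Hypothetical model identifiers; the real picker labels may differ.
ALL_MODELS = ["fast", "standard", "advanced"]

# Assumption: each higher tier also includes the models of the tiers below it.
PLAN_MODELS = {
    "free": {"standard"},
    "pro": {"standard", "fast"},
    "ultra": {"standard", "fast", "advanced"},
}


def picker_entries(plan: str) -> list:
    """Return every model, marking the ones the plan cannot use as disabled."""
    allowed = PLAN_MODELS[plan]
    entries = []
    for model in ALL_MODELS:
        enabled = model in allowed
        entries.append({
            "model": model,
            "enabled": enabled,
            # Disabled entries stay visible and carry an upgrade hint.
            "hint": None if enabled else "Upgrade to unlock",
        })
    return entries
```

For example, `picker_entries("free")` lists all three models but enables only `"standard"`.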
Model trade-offs
- Faster models cost less and start streaming sooner, but may be less accurate on complex prompts.
- Standard models balance cost and quality.
- Advanced models are slower and use more credits, but are best when answer quality matters more than speed.
If you are unsure, stay with the default.
Model availability
If an upstream provider temporarily disables a model, Pegasus disables it in the picker until it returns.
Max Mode in chat
Max Mode is configured in two places:
- The bot-level toggle in Source settings decides whether the bot can use Max Mode at all.
- The chat-level toggle in the input bar decides whether this conversation uses Max Mode right now.
Prerequisites
- Your plan must be Pro or Ultra. Free users see "Max Mode requires Pro or Ultra plan."
- The bot must already have Max Mode enabled in its own settings. Otherwise the chat toggle shows "Max Mode is not enabled for this bot."
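The two toggles and the prerequisites combine into a simple decision, sketched below. The function name, the parameter names, and the order of the checks (plan first, then bot setting) are assumptions; only the two messages are taken from this page.

```python
from typing import Optional, Tuple


def max_mode_status(
    plan: str,
    bot_max_mode_enabled: bool,
    chat_toggle_on: bool,
) -> Tuple[bool, Optional[str]]:
    """Return (max_mode_active, message_shown_if_blocked)."""
    # Prerequisite 1: the plan must be Pro or Ultra.
    if plan not in ("pro", "ultra"):
        return False, "Max Mode requires Pro or Ultra plan."
    # Prerequisite 2: the bot-level toggle must already be on.
    if not bot_max_mode_enabled:
        return False, "Max Mode is not enabled for this bot."
    # Both prerequisites hold, so the chat-level toggle decides.
    return chat_toggle_on, None
```

With both prerequisites met, the result simply follows the chat-level toggle, so two conversations with the same bot can return different values.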
Turn it on for one conversation
Use the Max Mode toggle in the chat input bar. The next message you send will use the MASS-RAG pipeline.
The toggle is per conversation. You can keep one conversation in Max Mode and another in standard mode for the same bot.
What changes when Max Mode is on
- Latency is higher - responses usually take 2-3x longer to start than in standard chat.
- Credit cost is higher - expect each message to use roughly 5-7x the daily credits of a standard message.
- Complex-answer quality is better - especially for multi-step questions or prompts that need synthesis across many document sections.
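To make the multipliers above concrete, here is a back-of-the-envelope estimate. The function and its inputs are hypothetical; the baseline credit cost and start time of a standard message are values you would supply yourself, and the ranges are approximations rather than guarantees.

```python
def max_mode_estimate(standard_credits: float, standard_start_s: float) -> dict:
    """Estimate Max Mode cost and start delay as (low, high) ranges."""
    return {
        # Roughly 5-7x the credits of a standard message.
        "credits": (5 * standard_credits, 7 * standard_credits),
        # Responses usually begin 2-3x later than in standard chat.
        "start_s": (2 * standard_start_s, 3 * standard_start_s),
    }


# Hypothetical baseline: a standard message costs 2 credits and starts after 1.5 s.
estimate = max_mode_estimate(standard_credits=2, standard_start_s=1.5)
# estimate == {"credits": (10, 14), "start_s": (3.0, 4.5)}
```

Under that assumed baseline, one Max Mode message would use about 10 to 14 credits and begin responding after about 3 to 4.5 seconds.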
When to leave Max Mode off
For short factual questions, standard RAG is usually faster, cheaper, and accurate enough. Save Max Mode for the harder prompts where you genuinely need the extra reasoning and retrieval depth.
Background: Max Mode (MASS-RAG)