Models

Configure which AI models your agents use — switch providers without changing code.

What Are Models?

In the Bosca AI system, a model is a configuration that points to a specific LLM provider and model version. It is not the AI model itself — it is a reference that tells the platform which provider to call and which version to use. By managing models as separate configurations, you can swap providers, upgrade to newer versions, or try a different model entirely without changing anything about the agent's prompt, tools, or behavior.

Because agents depend only on the configuration, provider changes stay in one place. When a new model version is released, you update the model configuration and every agent using it picks up the change on its next request. When you want to test whether a different provider produces better results for a particular task, you create a new model configuration and point the agent at it; no other changes are required.

Supported Providers

The platform supports a broad range of LLM providers, giving you flexibility to choose the right model for each task:

Provider	Available Models
Google	Gemini 2.0 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 3.0 family, and other Gemini variants. Strong for multimodal tasks, long context windows, and fast inference.
OpenAI	GPT-4o, GPT-4o mini, GPT-5 family, Codex, and other OpenAI models. Well-suited for general-purpose tasks, coding assistance, and complex reasoning.
Anthropic	Claude model family. Known for careful, nuanced responses, strong instruction following, and long-context capabilities.
Mistral AI	Mistral and Mixtral models. Offer strong performance at competitive price points, with good multilingual capabilities.
OpenRouter	A gateway to dozens of models from multiple providers through a single integration. Useful for accessing niche or specialized models.
DeepSeek	DeepSeek models. Competitive performance on reasoning and coding tasks at lower cost.

New providers and models can be added as they become available. The model configuration system is designed to be extensible — adding support for a new provider does not require changes to existing agents or prompts.

Model Properties

Each model configuration has a small set of properties:

Property	Purpose
Key	A stable, human-readable identifier that remains consistent across environments. Use keys to reference models in configurations and automation.
Name	A descriptive name shown in the admin UI. Make it clear which provider and version this configuration represents — for example, "Gemini 2.5 Pro" or "GPT-4o (June 2025)."
Description	An optional description explaining what this model is best suited for, any cost considerations, or other notes for your team.
Configuration	Optional model-specific settings such as temperature, top-p, maximum output tokens, or other provider-specific parameters that adjust the model's behavior at request time.
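
As a rough illustration of the four properties above, a model configuration could be represented like this. This is a minimal sketch, not Bosca's actual schema: the class name ModelConfig, the field types, and the setting names inside configuration are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ModelConfig:
    """Hypothetical in-memory shape of a model configuration."""

    key: str                            # stable identifier, same in every environment
    name: str                           # display name shown in the admin UI
    description: Optional[str] = None   # optional notes for the team
    # Optional provider-specific settings; the keys used below are illustrative.
    configuration: Dict[str, Any] = field(default_factory=dict)


gemini_pro = ModelConfig(
    key="gemini-2-5-pro",
    name="Gemini 2.5 Pro",
    description="Long-context analysis; higher cost per request.",
    configuration={"temperature": 0.2, "top_p": 0.9, "max_output_tokens": 2048},
)

# Only key and name are required in this sketch.
minimal = ModelConfig(key="small-classifier", name="Small classifier model")
print(gemini_pro.key, minimal.configuration)
```

Keeping the key separate from the display name means automation can keep referring to the same key even if the name is edited later.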

How Models Connect to Agents

Each agent references exactly one model. When you configure an agent in the admin UI, you select which model it should use from the list of available model configurations. The agent sends all requests to that model's provider and version.

This is a reference, not a copy: the agent points to the model configuration rather than duplicating it. If you update the model configuration (for example, changing from one model version to a newer one), the agent automatically uses the updated version on its next request.
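
The reference behavior described above can be sketched as a lookup by key at request time. Everything named here (ModelRegistry, Agent, the provider and version strings) is hypothetical and only illustrates the idea; it is not the platform's API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Model:
    key: str
    provider: str
    version: str


@dataclass
class Agent:
    name: str
    model_key: str  # the agent stores only the key, never a copy of the model


class ModelRegistry:
    """Hypothetical store of model configurations, indexed by key."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def save(self, model: Model) -> None:
        # Saving under an existing key replaces the earlier configuration.
        self._models[model.key] = model

    def resolve(self, agent: Agent) -> Model:
        # Looked up on every request, so updates are visible immediately.
        return self._models[agent.model_key]


registry = ModelRegistry()
registry.save(Model("default-chat", "example-provider", "model-v1"))

writer = Agent("writer", model_key="default-chat")
editor = Agent("editor", model_key="default-chat")
before = registry.resolve(writer).version

# Upgrade the configuration once; neither agent is touched.
registry.save(Model("default-chat", "example-provider", "model-v2"))
after_writer = registry.resolve(writer).version
after_editor = registry.resolve(editor).version
print(before, after_writer, after_editor)
```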

Switching Models

Because agents reference models rather than embedding them, switching models is a small change. Common scenarios include:

Trying a new provider. If you want to see how an agent performs with Anthropic Claude instead of OpenAI GPT, create a new model configuration for Claude and change the agent's model reference. The prompt, tools, and everything else stay the same.
Upgrading to a newer version. When a provider releases an improved model, update the model configuration to point to the new version. All agents using that model pick up the new version on their next request.
A/B testing models. Create two copies of the same agent, each pointing to a different model. Run both side by side to compare response quality, speed, and cost.
Optimizing for cost. If an agent is handling simple tasks that do not require a top-tier model, switch it to a smaller, less expensive model. Reserve powerful models for tasks that truly need them.
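
To make the A/B testing scenario concrete, the sketch below copies an agent and changes only its model reference. The Agent class, its fields, and the model keys are assumptions for illustration; in practice you would do the same thing through the admin UI.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent definition; field names are illustrative."""

    name: str
    prompt: str
    tools: Tuple[str, ...]
    model_key: str


support_a = Agent(
    name="support-a",
    prompt="You are a helpful support assistant.",
    tools=("search_docs",),
    model_key="openai-gpt",
)

# Variant B is the same agent with a different name and model reference.
support_b = replace(support_a, name="support-b", model_key="anthropic-claude")

# Prompt and tools are untouched, so any difference in output quality,
# speed, or cost can be attributed to the model.
same_prompt = support_a.prompt == support_b.prompt
same_tools = support_a.tools == support_b.tools
print(same_prompt, same_tools, support_a.model_key, support_b.model_key)
```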

Multi-Provider Support

The platform can route to different providers simultaneously. This means you can have one agent using Google Gemini, another using OpenAI GPT, and a third using Anthropic Claude — all running at the same time within the same system.

This multi-provider capability is valuable for several reasons:

Best tool for the job. Different models excel at different tasks. Use a model that is strong at code generation for your coding agent, and a model that is strong at creative writing for your content agent.
Redundancy. If one provider experiences an outage, agents using other providers continue to function normally.
Cost optimization. Mix expensive high-capability models with affordable lightweight models based on each agent's needs.
Compliance. Some organizations need to use specific providers for certain types of data. Multi-provider support makes it possible to route sensitive workloads to approved providers.
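
The redundancy point above can be illustrated with a small dispatch table: each agent is bound to one provider, so a failure in one provider's client affects only the agents bound to it. The client functions, agent names, and the simulated outage are all made up for this sketch and do not call any real provider.

```python
from typing import Callable, Dict


class ProviderUnavailable(Exception):
    """Raised by a (fake) provider client during an outage."""


def google_client(prompt: str) -> str:
    # Stand-in for a real provider call.
    return f"[google] {prompt}"


def openai_client(prompt: str) -> str:
    # Simulate an outage for this provider only.
    raise ProviderUnavailable("simulated outage")


PROVIDERS: Dict[str, Callable[[str], str]] = {
    "google": google_client,
    "openai": openai_client,
}

# Hypothetical mapping from agent to the provider of its configured model.
AGENT_PROVIDER: Dict[str, str] = {
    "content-agent": "google",
    "coding-agent": "openai",
}


def run_agent(agent: str, prompt: str) -> str:
    provider = AGENT_PROVIDER[agent]
    try:
        return PROVIDERS[provider](prompt)
    except ProviderUnavailable as exc:
        return f"{agent} unavailable: {exc}"


content_result = run_agent("content-agent", "Draft a headline")
coding_result = run_agent("coding-agent", "Refactor this function")
print(content_result)
print(coding_result)
```

Note that this sketch does not fail over to another provider; it only shows that agents on unaffected providers keep working.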

Cost and Capability Tradeoffs

Not every task needs the most powerful model available. Understanding the tradeoffs between model size, speed, capability, and cost helps you make smart choices:

Task Type	Recommended Approach
Simple classification and extraction	Use smaller, faster models. Tasks like categorizing content, extracting entities, or estimating reading time do not need advanced reasoning. Smaller models handle these well at a fraction of the cost.
Content generation and editing	Use mid-tier models. Writing assistance, summarization, and style editing benefit from some reasoning capability but do not require the most powerful models.
Complex analysis and multi-step reasoning	Use the most capable models. Tasks like data analysis, research synthesis, and multi-step problem-solving benefit significantly from advanced reasoning capabilities.
High-volume, low-complexity tasks	Use the smallest model that produces acceptable results. When processing thousands of items (like generating descriptions for a content library), even small cost savings per request add up quickly.

Start with a capable model to establish a quality baseline for your use case, then experiment with smaller, less expensive models to see if they can match that quality. It is easier to step down from a model you know works than to debug quality issues with an underpowered model.
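
One way to apply the step-down advice above is to score each candidate model on your own evaluation set, then pick the cheapest one that stays within a tolerance of the baseline. The sketch below assumes you already have such scores; the model keys, prices, quality numbers, and the 0.02 tolerance are invented for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    key: str
    cost_per_request: float  # illustrative price, not a real rate
    quality: float           # score from your own evaluation set, 0 to 1


def cheapest_acceptable(
    candidates: List[Candidate], baseline_key: str, tolerance: float = 0.02
) -> Candidate:
    """Return the cheapest candidate within `tolerance` of the baseline quality."""
    baseline = next(c for c in candidates if c.key == baseline_key)
    acceptable = [c for c in candidates if c.quality >= baseline.quality - tolerance]
    return min(acceptable, key=lambda c: c.cost_per_request)


candidates = [
    Candidate("large", cost_per_request=0.010, quality=0.92),
    Candidate("medium", cost_per_request=0.002, quality=0.91),
    Candidate("small", cost_per_request=0.0004, quality=0.84),
]

choice = cheapest_acceptable(candidates, baseline_key="large")

# Savings across a batch of 10,000 requests compared with the baseline model.
baseline_cost = candidates[0].cost_per_request
savings = (baseline_cost - choice.cost_per_request) * 10_000
print(choice.key, f"{savings:.2f}")
```

The baseline always qualifies, so the function falls back to the capable model when no cheaper candidate is good enough.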