Models
Browse the full LLM model catalog, compare features side by side, and estimate monthly costs across providers.
Overview
The Models page in the AI Gateway gives you a searchable catalog of every LLM model available through the gateway. It pulls metadata from LiteLLM's model registry, so you can browse models across all supported providers without leaving VerifyWise.
The page has three tabs: a full model catalog, a cost calculator for estimating monthly spend, and a feature comparison tool for evaluating models side by side.
Accessing the models page
- Click the AI Gateway icon in the sidebar
- Click Models in the secondary sidebar
- The page loads with the All models tab active
The page header shows the total number of models and providers available. This count updates based on the LiteLLM model registry bundled with your AI Gateway installation.
All models tab
The default tab shows a paginated table of every model in the catalog. Each page displays 25 models at a time.
Table columns
The model table shows the following information for each model:
| Column | Description |
|---|---|
| Provider | The LLM provider (OpenAI, Anthropic, Google, Mistral, etc.) shown with a provider icon |
| Model | The model identifier used when making API requests through the gateway |
| Mode | The model type: chat, embedding, image generation, audio transcription or completion |
| Context | Maximum input token window, shown in shorthand (e.g. 128K, 1M) |
| $/1M in | Cost per million input tokens in USD |
| $/1M out | Cost per million output tokens in USD |
| Features | Icons showing supported capabilities: vision, function calling, PDF input, prompt caching |
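The Model column holds the identifier you pass as the `model` field when calling the gateway. As a minimal sketch, assuming the gateway exposes the OpenAI-compatible chat completions API that LiteLLM provides (the URL path and model name below are illustrative, not confirmed values):

```python
import json

# Assumptions: the gateway listens on port 8100 (per Troubleshooting) and
# accepts OpenAI-compatible chat payloads via LiteLLM. The model identifier
# is copied from the Model column of the catalog.
GATEWAY_URL = "http://localhost:8100/v1/chat/completions"  # illustrative path

payload = {
    "model": "claude-3-5-sonnet-20241022",  # hypothetical catalog entry
    "messages": [{"role": "user", "content": "Summarize this policy."}],
}

# Serialized request body you would POST to GATEWAY_URL:
body = json.dumps(payload)
```

Any HTTP client works here; the important part is that the `model` value matches the catalog identifier exactly.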
Filtering models
The filter bar above the table gives you several ways to narrow the list:
- Search: Type in the search field to filter by model name or provider. Results update as you type.
- Provider dropdown: Select a specific provider to show only their models. Defaults to "All providers".
- Mode dropdown: Filter by model type (Chat, Embedding, Image generation, Audio transcription or Completion).
- Feature toggles: Click the Vision, Tools, PDF or Caching buttons to show only models that support those features. Active filters show a green border.
Filters can be combined. For example, search for "claude" while filtering by Chat mode and the Vision feature to find Anthropic chat models with image support.
A results count below the filters shows how many models match your current selection.
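Combined filters behave as a logical AND: a model stays in the list only if it matches every active filter. A sketch of that behavior, using simplified stand-ins for the catalog's metadata:

```python
# Simplified model records standing in for the catalog's metadata.
models = [
    {"model": "claude-3-5-sonnet", "provider": "anthropic", "mode": "chat", "vision": True},
    {"model": "claude-3-haiku", "provider": "anthropic", "mode": "chat", "vision": False},
    {"model": "text-embedding-3-small", "provider": "openai", "mode": "embedding", "vision": False},
]

def matches(m, search=None, mode=None, vision=None):
    """Every active filter must pass (logical AND); inactive filters are None."""
    if search:
        s = search.lower()
        if s not in m["model"].lower() and s not in m["provider"].lower():
            return False
    if mode and m["mode"] != mode:
        return False
    if vision is not None and m["vision"] != vision:
        return False
    return True

# Search "claude" + Chat mode + Vision feature:
results = [m for m in models if matches(m, search="claude", mode="chat", vision=True)]
```

Here only `claude-3-5-sonnet` survives all three filters, which is the count the results line would report.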
Adding a model to an endpoint
Each row in the model table has an Add button on the right side. Clicking it takes you to the Endpoints page with the model and provider pre-filled, so you can quickly create a new endpoint for that model.
Pagination
When the filtered list has more than 25 models, pagination controls appear at the bottom of the table. Use the left and right arrow buttons to move between pages. The current page number and total page count are displayed alongside the controls.
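The page math is straightforward: 25 models per page, rounded up. A small sketch of how the page count and the slice for a given page could be derived:

```python
import math

PAGE_SIZE = 25  # models shown per page

def page_count(total_models: int) -> int:
    # At least one page, even for an empty result set.
    return max(1, math.ceil(total_models / PAGE_SIZE))

def page_slice(models, page):
    # Pages are 1-indexed, matching the number shown in the UI.
    start = (page - 1) * PAGE_SIZE
    return models[start:start + PAGE_SIZE]
```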
Cost calculator tab
The cost calculator helps you estimate monthly spend across models based on your expected usage patterns. Only chat models with known pricing appear in the results.
Calculator inputs
Enter your expected usage to see cost estimates:
- Requests/day: The number of API requests you expect to make per day
- Avg input tokens: Average number of input tokens per request (prompt length)
- Avg output tokens: Average number of output tokens per request (response length)
- Provider: Optionally filter results to a single provider
Understanding the results
Results are sorted from cheapest to most expensive. The table shows:
| Column | Description |
|---|---|
| Rank | Position in the cost ranking. Top 3 get trophy, medal and award icons. |
| Model | Provider and model name. The cheapest model is tagged with a "cheapest" badge. |
| Context | Maximum input token window |
| $/req | Cost per single request based on your input/output token averages |
| Input | Total daily input cost (requests x avg input tokens x cost per token) |
| Output | Total daily output cost (requests x avg output tokens x cost per token) |
| $/day | Combined daily cost for all requests |
| $/month | Projected 30-day cost (daily cost x 30) |
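The columns above can be reproduced from the calculator's inputs with simple arithmetic. A sketch, using illustrative prices rather than actual catalog values:

```python
def estimate_costs(requests_per_day, avg_in, avg_out, in_per_m, out_per_m):
    """Reproduce the calculator's columns from its inputs.

    in_per_m / out_per_m are the catalog's $/1M token prices.
    """
    per_request = (avg_in * in_per_m + avg_out * out_per_m) / 1_000_000
    input_day = requests_per_day * avg_in * in_per_m / 1_000_000
    output_day = requests_per_day * avg_out * out_per_m / 1_000_000
    per_day = input_day + output_day
    return {
        "$/req": per_request,
        "Input": input_day,
        "Output": output_day,
        "$/day": per_day,
        "$/month": per_day * 30,  # projected 30-day cost
    }

# Example: 1,000 requests/day, 500 input and 200 output tokens per request,
# at $3/1M input and $15/1M output (illustrative prices):
costs = estimate_costs(1000, 500, 200, 3.0, 15.0)
# costs["$/day"] is 4.5, so costs["$/month"] is 135.0
```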
The top 3 results are highlighted with distinct background colors for quick identification. By default, the calculator shows the top 50 results. If more models match, a button appears to show all results.
Feature comparison tab
The feature comparison tab lets you select up to 5 models and compare their capabilities side by side in a table.
Selecting models to compare
There are two ways to add models to the comparison:
- Popular models: Click any of the pre-populated model buttons (GPT-4o, Claude Sonnet, Gemini Flash, Mistral Large, Grok and others). Selected models show a green border.
- Search: Type in the search field to find any model by name. Click a result to add it to the comparison. Up to 5 models can be compared at once.
Three models are pre-selected by default (GPT-4o, Claude Sonnet 4 and Gemini 2.0 Flash) so you see results right away. Click any selected model's button again to deselect it.
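The click behavior described above amounts to a toggle with a five-model cap. A sketch of that selection logic (the model names are illustrative defaults):

```python
MAX_COMPARE = 5

def toggle(selected, model):
    """Clicking a selected model deselects it; otherwise it is added,
    up to the 5-model comparison limit."""
    if model in selected:
        return [m for m in selected if m != model]
    if len(selected) >= MAX_COMPARE:
        return selected  # at capacity; selection unchanged
    return selected + [model]

picks = ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"]  # defaults
picks = toggle(picks, "mistral-large")  # add a fourth model
picks = toggle(picks, "gpt-4o")         # click again to deselect
```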
Compared features
The comparison table shows the following attributes for each selected model:
- Provider: Which company offers the model
- Mode: The model type (chat, embedding, etc.)
- Max input tokens: Maximum context window size
- Max output tokens: Maximum response length
- Input $/1M tokens: Cost per million input tokens
- Output $/1M tokens: Cost per million output tokens
- Vision: Whether the model can process images
- Function calling: Whether the model supports tool/function calling
- Parallel tools: Whether the model can call multiple tools in a single turn
- PDF input: Whether the model accepts PDF files directly
- Prompt caching: Whether the model supports caching repeated prompt prefixes
- Response schema: Whether the model supports structured output schemas
- System messages: Whether the model accepts system-level instructions
Best-value highlighting
The comparison table automatically highlights the best value in each row with a green background. For cost rows, the lowest price wins. For token limits and feature support, the highest value wins. This makes it easy to spot which model leads in each category.
Boolean features (vision, function calling, etc.) show a green checkmark for "Yes" and a gray X for "No".
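The highlighting rule above reduces to picking the minimum for cost rows and the maximum for everything else (booleans compare as True > False). A sketch of how the winning cell in a row could be chosen:

```python
def best_value(row_values, lower_is_better=False):
    """Return the index of the cell to highlight in one comparison row.

    Cost rows pass lower_is_better=True; token limits and boolean
    features take the highest value.
    """
    pick = min if lower_is_better else max
    return row_values.index(pick(row_values))

# Input $/1M for three models: the cheapest wins
cheapest = best_value([3.0, 0.15, 1.25], lower_is_better=True)
# Max input tokens: the largest wins
largest = best_value([128_000, 200_000, 1_000_000])
```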
Removing models from comparison
To remove a model from the comparison, click the small trash icon next to its name in the column header. You can also click its button in the popular models row to deselect it.
Supported providers
The AI Gateway supports models from all providers in the LiteLLM registry. The catalog is bundled with the gateway, so it doesn't require any API keys to browse. Common providers include:
- OpenAI: GPT-4o, GPT-4o Mini, o1, o3 and more
- Anthropic: Claude Opus, Sonnet, Haiku families
- Google: Gemini Pro, Flash, Nano models
- Mistral: Mistral Large, Medium, Small
- xAI: Grok models
- Meta: Llama models via various hosts
- Cohere: Command and Embed models
- Amazon Bedrock: All Bedrock-hosted models
- Azure OpenAI: Azure-hosted OpenAI models
Model modes
Models in the catalog are classified by their mode, which describes what kind of task they perform:
- Chat: Conversational models that accept messages and return text responses. This is the most common mode.
- Embedding: Models that convert text into vector representations for semantic search and similarity matching.
- Image generation: Models that create images from text prompts (DALL-E, Stable Diffusion, etc.).
- Audio transcription: Models that convert spoken audio into text (Whisper, etc.).
- Completion: Legacy text completion models that predict the next tokens in a sequence.
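The mode also determines the shape of the request you send through the gateway. As a sketch, assuming the OpenAI-compatible payload conventions LiteLLM uses (an assumption, with illustrative model names):

```python
def build_payload(mode, model, text):
    """Build a request body appropriate for the model's mode.

    Chat models take a messages list; embedding models take raw input.
    Other modes are omitted from this sketch.
    """
    if mode == "chat":
        return {"model": model, "messages": [{"role": "user", "content": text}]}
    if mode == "embedding":
        return {"model": model, "input": text}
    raise ValueError(f"unsupported mode in this sketch: {mode}")

chat_req = build_payload("chat", "gpt-4o", "Hello")
embed_req = build_payload("embedding", "text-embedding-3-small", "Hello")
```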
Feature icons reference
The Features column in the model table uses icons to indicate model capabilities:
| Icon | Feature | What it means |
|---|---|---|
| Eye | Vision | Model can analyze images sent alongside text prompts |
| Wrench | Function calling | Model can call external tools and functions through structured tool-use APIs |
| FileText | PDF input | Model accepts PDF files directly without pre-processing |
| Database | Prompt caching | Provider supports caching repeated prompt prefixes to reduce cost and latency |
Typical workflow
Here's how most teams use the Models page as part of their gateway setup:
- Browse the catalog: Use filters to find models that match your requirements (chat mode, vision support, etc.)
- Compare candidates: Switch to the Feature comparison tab to evaluate your shortlist side by side
- Estimate costs: Use the Cost calculator to project monthly spend based on your expected usage
- Add to endpoints: Click the Add button on your chosen model to create a gateway endpoint for it
- Test in playground: Use the AI Gateway Playground to test the endpoint before routing production traffic
Troubleshooting
If the models page shows an error or no data:
- AI Gateway not running: The page shows "Failed to load model catalog. Is the AI Gateway running?" if it can't reach the gateway service. Make sure the AI Gateway is running on port 8100.
- Empty catalog: If the gateway is running but the catalog is empty, the LiteLLM model registry may not have loaded. Restart the AI Gateway service.
- Missing models: The catalog shows models from LiteLLM's built-in registry. Custom or self-hosted models won't appear here but can still be used in endpoints by entering the model name manually.