AI Gateway

Models

Browse the full catalog of LLM models, compare features side by side, and estimate monthly costs across providers.

Overview

The Models page in the AI Gateway gives you a searchable catalog of every model available through the gateway. It pulls metadata from LiteLLM's model registry, so you can browse models across all supported providers without leaving VerifyWise.

The page has three tabs: a full model catalog, a cost calculator for estimating monthly spend and a feature comparison tool for evaluating models side by side.

Accessing the models page

  1. Click the AI Gateway icon in the sidebar
  2. Click Models in the secondary sidebar
  3. The page loads with the All models tab active

The page header shows the total number of models and providers available. This count updates based on the LiteLLM model registry bundled with your AI Gateway installation.

All models tab

The default tab shows a paginated table of every model in the catalog. Each page displays 25 models at a time.

Table columns

The model table shows the following information for each model:

  • Provider: The LLM provider (OpenAI, Anthropic, Google, Mistral, etc.), shown with a provider icon
  • Model: The model identifier used when making API requests through the gateway (see the example request below)
  • Mode: The model type: chat, embedding, image generation, audio transcription or completion
  • Context: Maximum input token window, shown in shorthand (e.g. 128K, 1M)
  • $/1M in: Cost per million input tokens in USD
  • $/1M out: Cost per million output tokens in USD
  • Features: Icons showing supported capabilities: vision, function calling, PDF input, prompt caching
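
The Model value is exactly what you send in the request body once the model sits behind an endpoint. As a rough illustration only: assuming the gateway exposes an OpenAI-compatible chat completions route on its default port (8100, per the troubleshooting section below) and you have a gateway key configured, a request might look like the sketch below. The base URL, path and key name are assumptions for this example, not something this page defines.

```python
# Minimal sketch only: the base URL, port and API key are illustrative assumptions.
# The model value comes straight from the Model column in the catalog.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8100/v1",  # assumed gateway address (default port 8100)
    api_key="YOUR_GATEWAY_KEY",           # hypothetical key for your gateway endpoint
)

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",   # a model identifier as listed in the catalog
    messages=[{"role": "user", "content": "Hello from the gateway"}],
)
print(response.choices[0].message.content)
```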

Filtering models

The filter bar above the table gives you several ways to narrow the list:

  • Search: Type in the search field to filter by model name or provider. Results update as you type.
  • Provider dropdown: Select a specific provider to show only their models. Defaults to "All providers".
  • Mode dropdown: Filter by model type: Chat, Embedding, Image generation, Audio transcription or Completion.
  • Feature toggles: Click the Vision, Tools, PDF or Caching buttons to show only models that support those features. Active filters show a green border.

Filters can be combined. For example, you can search for "claude" while filtering by the Chat mode and Vision feature to find all Anthropic chat models with image support.

A results count below the filters shows how many models match your current selection.
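
All of these filters combine with AND logic: a model has to match the search text, the selected provider and mode, and every active feature toggle to stay in the results. The sketch below shows that combination in plain Python; the field names are invented for illustration and don't mirror the gateway's internal data model.

```python
# Illustrative only: how combined catalog filters narrow the list (field names invented).
def matches(model, search="", provider=None, mode=None, required_features=()):
    haystack = (model["name"] + " " + model["provider"]).lower()
    if search and search.lower() not in haystack:
        return False
    if provider and model["provider"] != provider:
        return False
    if mode and model["mode"] != mode:
        return False
    # Every active feature toggle must be supported by the model.
    return all(model["features"].get(f, False) for f in required_features)

catalog = [
    {"name": "claude-3-5-sonnet", "provider": "anthropic", "mode": "chat",
     "features": {"vision": True, "tools": True}},
    {"name": "text-embedding-3-small", "provider": "openai", "mode": "embedding",
     "features": {}},
]
results = [m for m in catalog if matches(m, search="claude", mode="chat",
                                         required_features=("vision",))]
print(len(results), "model(s) match")  # -> 1 model(s) match
```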

Adding a model to an endpoint

Each row in the model table has an Add button on the right side. Clicking it takes you to the Endpoints page with the model and provider pre-filled, so you can quickly create a new endpoint for that model.

The Add button is a shortcut. You can also create endpoints directly from the Endpoints page and type in the model name manually.

Pagination

When the filtered list has more than 25 models, pagination controls appear at the bottom of the table. Use the left and right arrow buttons to move between pages. The current page number and total page count are displayed alongside the controls.

Cost calculator tab

The cost calculator helps you estimate monthly spend across models based on your expected usage patterns. Only chat models with known pricing appear in the results.

Calculator inputs

Enter your expected usage to see cost estimates:

  • Requests/day: The number of API requests you expect to make per day
  • Avg input tokens: Average number of input tokens per request (prompt length)
  • Avg output tokens: Average number of output tokens per request (response length)
  • Provider: Optionally filter results to a single provider

Understanding the results

Results are sorted from cheapest to most expensive. The table shows:

  • Rank: Position in the cost ranking. The top 3 get trophy, medal and award icons.
  • Model: Provider and model name. The cheapest model is tagged with a "cheapest" badge.
  • Context: Maximum input token window
  • $/req: Cost per single request based on your input/output token averages
  • Input: Total daily input cost (requests x avg input tokens x cost per token)
  • Output: Total daily output cost (requests x avg output tokens x cost per token)
  • $/day: Combined daily cost for all requests
  • $/month: Projected 30-day cost (daily cost x 30)
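
To make the arithmetic concrete, here is a small sketch that reproduces those formulas for a single model. The per-token prices are placeholder values, not current list prices for any provider.

```python
# Reproduce the cost calculator's formulas for one model.
# Prices are placeholder examples, not real provider pricing.
requests_per_day = 10_000
avg_input_tokens = 1_500
avg_output_tokens = 400

input_price_per_1m = 3.00    # $ per 1M input tokens (example value)
output_price_per_1m = 15.00  # $ per 1M output tokens (example value)

cost_per_request = (avg_input_tokens * input_price_per_1m
                    + avg_output_tokens * output_price_per_1m) / 1_000_000
daily_input_cost = requests_per_day * avg_input_tokens * input_price_per_1m / 1_000_000
daily_output_cost = requests_per_day * avg_output_tokens * output_price_per_1m / 1_000_000
daily_cost = daily_input_cost + daily_output_cost
monthly_cost = daily_cost * 30  # projected 30-day cost

print(f"$/req:   {cost_per_request:.4f}")  # 0.0105
print(f"$/day:   {daily_cost:.2f}")        # 105.00
print(f"$/month: {monthly_cost:.2f}")      # 3150.00
```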

The top 3 results are highlighted with distinct background colors for quick identification. By default, the calculator shows the top 50 results. If more models match, a button appears to show all results.

Cost estimates are based on published per-token pricing from each provider. Actual costs may differ if your provider offers volume discounts or committed use pricing, or if your real token counts vary from your estimates.

Feature comparison tab

The feature comparison tab lets you select up to 5 models and compare their capabilities side by side in a table.

Selecting models to compare

There are two ways to add models to the comparison:

  • Popular models: Click any of the pre-populated model buttons (GPT-4o, Claude Sonnet, Gemini Flash, Mistral Large, Grok and others). Selected models show a green border.
  • Search: Type in the search field to find any model by name. Click a result to add it to the comparison. Up to 5 models can be compared at once.

Three models are pre-selected by default (GPT-4o, Claude Sonnet 4 and Gemini 2.0 Flash) so you see results right away. Click any selected model's button again to deselect it.

Compared features

The comparison table shows the following attributes for each selected model:

  • Provider: Which company offers the model
  • Mode: The model type (chat, embedding, etc.)
  • Max input tokens: Maximum context window size
  • Max output tokens: Maximum response length
  • Input $/1M tokens: Cost per million input tokens
  • Output $/1M tokens: Cost per million output tokens
  • Vision: Whether the model can process images
  • Function calling: Whether the model supports tool/function calling
  • Parallel tools: Whether the model can call multiple tools in a single turn
  • PDF input: Whether the model accepts PDF files directly
  • Prompt caching: Whether the model supports caching repeated prompt prefixes
  • Response schema: Whether the model supports structured output schemas
  • System messages: Whether the model accepts system-level instructions

Best-value highlighting

The comparison table automatically highlights the best value in each row with a green background. For cost rows, the lowest price wins. For token limits and feature support, the highest value wins. This makes it easy to spot which model leads in each category.

Boolean features (vision, function calling, etc.) show a green checkmark for "Yes" and a gray X for "No".
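
In other words, the highlighting rule amounts to: take the minimum on price rows and the maximum everywhere else, counting booleans as 1 for "Yes" and 0 for "No". Here is a compact sketch of that rule; the row labels are the ones from the list above, and ties simply keep the first match, which may differ from the UI's behavior.

```python
# Sketch of the best-value rule: lowest price wins on cost rows,
# highest value wins on token limits and boolean features (True counts as 1).
COST_ROWS = {"Input $/1M tokens", "Output $/1M tokens"}

def best_column(row_name, values):
    """Return the index of the model to highlight for one comparison row."""
    numeric = [float(v) for v in values]  # booleans become 1.0 / 0.0
    target = min(numeric) if row_name in COST_ROWS else max(numeric)
    return numeric.index(target)

print(best_column("Input $/1M tokens", [2.50, 0.15, 3.00]))  # -> 1 (cheapest model)
print(best_column("Vision", [True, False, True]))            # -> 0 (first model with support)
```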

Removing models from comparison

To remove a model from the comparison, click the small trash icon next to its name in the column header. You can also click its button in the popular models row to deselect it.

Supported providers

The AI Gateway supports models from all providers in the LiteLLM registry. The catalog is bundled with the gateway, so it doesn't require any API keys to browse. Common providers include:

  • OpenAI: GPT-4o, GPT-4o Mini, o1, o3 and more
  • Anthropic: Claude Opus, Sonnet, Haiku families
  • Google: Gemini Pro, Flash, Nano models
  • Mistral: Mistral Large, Medium, Small
  • xAI: Grok models
  • Meta: Llama models via various hosts
  • Cohere: Command and Embed models
  • Amazon Bedrock: All Bedrock-hosted models
  • Azure OpenAI: Azure-hosted OpenAI models

Browsing the model catalog doesn't require provider API keys. Keys are only needed when you create an endpoint and start routing requests through the gateway.

Model modes

Models in the catalog are classified by their mode, which describes what kind of task they perform:

  • Chat: Conversational models that accept messages and return text responses. This is the most common mode.
  • Embedding: Models that convert text into vector representations for semantic search and similarity matching.
  • Image generation: Models that create images from text prompts (DALL-E, Stable Diffusion, etc.).
  • Audio transcription: Models that convert spoken audio into text (Whisper, etc.).
  • Completion: Legacy text completion models that predict the next tokens in a sequence.
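
The mode also tells you which request shape to use once a model sits behind an endpoint. As a hedged contrast with the chat sketch earlier on this page (same assumed OpenAI-compatible route, port and key, none of which this page defines), an embedding-mode model returns vectors rather than messages:

```python
# Illustrative only: embedding-mode models use the embeddings route and return vectors.
# Base URL, port and key are the same assumptions as in the earlier chat sketch.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8100/v1", api_key="YOUR_GATEWAY_KEY")

result = client.embeddings.create(
    model="text-embedding-3-small",      # an embedding-mode model from the catalog
    input="vector representations for semantic search",
)
print(len(result.data[0].embedding))     # dimensionality of the returned vector
```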

Feature icons reference

The Features column in the model table uses icons to indicate model capabilities:

  • Vision (Eye icon): Model can analyze images sent alongside text prompts
  • Function calling (Wrench icon): Model can call external tools and functions through structured tool-use APIs
  • PDF input (FileText icon): Model accepts PDF files directly without pre-processing
  • Prompt caching (Database icon): Provider supports caching repeated prompt prefixes to reduce cost and latency

Typical workflow

Here's how most teams use the Models page as part of their gateway setup:

  1. Browse the catalog: Use filters to find models that match your requirements (chat mode, vision support, etc.)
  2. Compare candidates: Switch to the Feature comparison tab to evaluate your shortlist side by side
  3. Estimate costs: Use the Cost calculator to project monthly spend based on your expected usage
  4. Add to endpoints: Click the Add button on your chosen model to create a gateway endpoint for it
  5. Test in playground: Use the AI Gateway Playground to test the endpoint before routing production traffic

Troubleshooting

If the models page shows an error or no data:

  • AI Gateway not running: The page shows "Failed to load model catalog. Is the AI Gateway running?" if it can't reach the gateway service. Make sure the AI Gateway is running on port 8100 (a quick connectivity check follows this list).
  • Empty catalog: If the gateway is running but the catalog is empty, the LiteLLM model registry may not have loaded. Restart the AI Gateway service.
  • Missing models: The catalog shows models from LiteLLM's built-in registry. Custom or self-hosted models won't appear here but can still be used in endpoints by entering the model name manually.
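
If you're not sure whether anything is listening on the gateway port at all, a plain TCP check rules out basic connectivity problems before you dig into logs. This assumes a default local install on port 8100 and is a generic socket test, not a documented gateway endpoint.

```python
# Generic TCP connectivity test for the AI Gateway's default port (not an official health check).
import socket

def port_open(host="localhost", port=8100, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if port_open():
    print("Something is listening on port 8100 - reload the Models page and check again.")
else:
    print("Nothing is listening on port 8100 - start or restart the AI Gateway service.")
```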