AI Gateway

Models

Browse the full catalog of LLM models, compare features side by side, and estimate monthly costs across providers.

Overview

The Models page in the AI Gateway gives you a searchable catalog of every model available through the gateway. It pulls metadata from LiteLLM's model registry, so you can browse models across all supported providers without leaving VerifyWise.

The page has three tabs: a full model catalog, a cost calculator for estimating monthly spend and a feature comparison tool for evaluating models side by side.

Accessing the models page

  1. Click the AI Gateway icon in the sidebar
  2. Click Models in the secondary sidebar
  3. The page loads with the All models tab active

The page header shows the total number of models and providers available. This count updates based on the LiteLLM model registry bundled with your AI Gateway installation.

All models tab

The default tab shows a paginated table of every model in the catalog. Each page displays 25 models at a time.

Table columns

The model table shows the following information for each model:

  • Provider: The LLM provider (OpenAI, Anthropic, Google, Mistral, etc.), shown with a provider icon
  • Model: The model identifier used when making API requests through the gateway (see the example request below)
  • Mode: The model type: chat, embedding, image generation, audio transcription or completion
  • Context: Maximum input token window, shown in shorthand (e.g. 128K, 1M)
  • $/1M in: Cost per million input tokens in USD
  • $/1M out: Cost per million output tokens in USD
  • Features: Icons showing supported capabilities: vision, function calling, PDF input, prompt caching
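
The Model value is exactly what you send in the request body once the model sits behind an endpoint. As a rough illustration only: assuming the gateway exposes an OpenAI-compatible chat completions route on its default port (8100, per the troubleshooting section below) and you have a gateway key configured, a request might look like the sketch below. The base URL, path and key name are assumptions for this example, not something this page defines.

```python
# Minimal sketch only: the base URL, port and API key are illustrative assumptions.
# The model value comes straight from the Model column in the catalog.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8100/v1",  # assumed gateway address (default port 8100)
    api_key="YOUR_GATEWAY_KEY",           # hypothetical key for your gateway endpoint
)

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",   # a model identifier as listed in the catalog
    messages=[{"role": "user", "content": "Hello from the gateway"}],
)
print(response.choices[0].message.content)
```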

Filtering models

The filter bar above the table gives you several ways to narrow the list:

  • Search: Type in the search field to filter by model name or provider. Results update as you type.
  • Provider dropdown: Select a specific provider to show only their models. Defaults to "All providers".
  • Mode dropdown: Filter by model type: Chat, Embedding, Image generation, Audio transcription or Completion.
  • Feature toggles: Click the Vision, Tools, PDF or Caching buttons to show only models that support those features. Active filters show a green border.

Filters can be combined. For example, you can search for "claude" while filtering by the Chat mode and Vision feature to find all Anthropic chat models with image support.

A results count below the filters shows how many models match your current selection.
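
All of these filters combine with AND logic: a model has to match the search text, the selected provider and mode, and every active feature toggle to stay in the results. The sketch below shows that combination in plain Python; the field names are invented for illustration and don't mirror the gateway's internal data model.

```python
# Illustrative only: how combined catalog filters narrow the list (field names invented).
def matches(model, search="", provider=None, mode=None, required_features=()):
    haystack = (model["name"] + " " + model["provider"]).lower()
    if search and search.lower() not in haystack:
        return False
    if provider and model["provider"] != provider:
        return False
    if mode and model["mode"] != mode:
        return False
    # Every active feature toggle must be supported by the model.
    return all(model["features"].get(f, False) for f in required_features)

catalog = [
    {"name": "claude-3-5-sonnet", "provider": "anthropic", "mode": "chat",
     "features": {"vision": True, "tools": True}},
    {"name": "text-embedding-3-small", "provider": "openai", "mode": "embedding",
     "features": {}},
]
results = [m for m in catalog if matches(m, search="claude", mode="chat",
                                         required_features=("vision",))]
print(len(results), "model(s) match")  # -> 1 model(s) match
```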

Adding a model to an endpoint

Each row in the model table has an Add button on the right side. Clicking it takes you to the Endpoints page with the model and provider pre-filled, so you can quickly create a new endpoint for that model.

The Add button is a shortcut. You can also create endpoints directly from the Endpoints page and type in the model name manually.

Pagination

When the filtered list has more than 25 models, pagination controls appear at the bottom of the table. Use the left and right arrow buttons to move between pages. The current page number and total page count are displayed alongside the controls.

Cost calculator tab

The cost calculator helps you estimate monthly spend across models based on your expected usage patterns. Only chat models with known pricing appear in the results.

Calculator inputs

Enter your expected usage to see cost estimates:

  • Requests/day: The number of API requests you expect to make per day
  • Avg input tokens: Average number of input tokens per request (prompt length)
  • Avg output tokens: Average number of output tokens per request (response length)
  • Provider: Optionally filter results to a single provider

Understanding the results

Results are sorted from cheapest to most expensive. The table shows:

  • Rank: Position in the cost ranking. The top 3 get trophy, medal and award icons.
  • Model: Provider and model name. The cheapest model is tagged with a "cheapest" badge.
  • Context: Maximum input token window
  • $/req: Cost per single request based on your input/output token averages
  • Input: Total daily input cost (requests x avg input tokens x cost per token)
  • Output: Total daily output cost (requests x avg output tokens x cost per token)
  • $/day: Combined daily cost for all requests
  • $/month: Projected 30-day cost (daily cost x 30)
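
To make the arithmetic concrete, here is a small sketch that reproduces those formulas for a single model. The per-token prices are placeholder values, not current list prices for any provider.

```python
# Reproduce the cost calculator's formulas for one model.
# Prices are placeholder examples, not real provider pricing.
requests_per_day = 10_000
avg_input_tokens = 1_500
avg_output_tokens = 400

input_price_per_1m = 3.00    # $ per 1M input tokens (example value)
output_price_per_1m = 15.00  # $ per 1M output tokens (example value)

cost_per_request = (avg_input_tokens * input_price_per_1m
                    + avg_output_tokens * output_price_per_1m) / 1_000_000
daily_input_cost = requests_per_day * avg_input_tokens * input_price_per_1m / 1_000_000
daily_output_cost = requests_per_day * avg_output_tokens * output_price_per_1m / 1_000_000
daily_cost = daily_input_cost + daily_output_cost
monthly_cost = daily_cost * 30  # projected 30-day cost

print(f"$/req:   {cost_per_request:.4f}")  # 0.0105
print(f"$/day:   {daily_cost:.2f}")        # 105.00
print(f"$/month: {monthly_cost:.2f}")      # 3150.00
```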

The top 3 results are highlighted with distinct background colors for quick identification. By default, the calculator shows the top 50 results. If more models match, a button appears to show all results.

Cost estimates are based on published per-token pricing from each provider. Actual costs may differ if your provider offers volume discounts or committed use pricing, or if your real token counts vary from your estimates.

Feature comparison tab

The feature comparison tab lets you select up to 5 models and compare their capabilities side by side in a table.

Selecting models to compare

There are two ways to add models to the comparison:

  • Popular models: Click any of the pre-populated model buttons (GPT-4o, Claude Sonnet, Gemini Flash, Mistral Large, Grok and others). Selected models show a green border.
  • Search: Type in the search field to find any model by name. Click a result to add it to the comparison. Up to 5 models can be compared at once.

Three models are pre-selected by default (GPT-4o, Claude Sonnet 4 and Gemini 2.0 Flash) so you see results right away. Click any selected model's button again to deselect it.

Compared features

The comparison table shows the following attributes for each selected model:

  • Provider: Which company offers the model
  • Mode: The model type (chat, embedding, etc.)
  • Max input tokens: Maximum context window size
  • Max output tokens: Maximum response length
  • Input $/1M tokens: Cost per million input tokens
  • Output $/1M tokens: Cost per million output tokens
  • Vision: Whether the model can process images
  • Function calling: Whether the model supports tool/function calling
  • Parallel tools: Whether the model can call multiple tools in a single turn
  • PDF input: Whether the model accepts PDF files directly
  • Prompt caching: Whether the model supports caching repeated prompt prefixes
  • Response schema: Whether the model supports structured output schemas
  • System messages: Whether the model accepts system-level instructions

Best-value highlighting

The comparison table automatically highlights the best value in each row with a green background. For cost rows, the lowest price wins. For token limits and feature support, the highest value wins. This makes it easy to spot which model leads in each category.

Boolean features (vision, function calling, etc.) show a green checkmark for "Yes" and a gray X for "No".
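
In other words, the highlighting rule amounts to: take the minimum on price rows and the maximum everywhere else, counting booleans as 1 for "Yes" and 0 for "No". Here is a compact sketch of that rule; the row labels are the ones from the list above, and ties simply keep the first match, which may differ from the UI's behavior.

```python
# Sketch of the best-value rule: lowest price wins on cost rows,
# highest value wins on token limits and boolean features (True counts as 1).
COST_ROWS = {"Input $/1M tokens", "Output $/1M tokens"}

def best_column(row_name, values):
    """Return the index of the model to highlight for one comparison row."""
    numeric = [float(v) for v in values]  # booleans become 1.0 / 0.0
    target = min(numeric) if row_name in COST_ROWS else max(numeric)
    return numeric.index(target)

print(best_column("Input $/1M tokens", [2.50, 0.15, 3.00]))  # -> 1 (cheapest model)
print(best_column("Vision", [True, False, True]))            # -> 0 (first model with support)
```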

Removing models from comparison

To remove a model from the comparison, click the small trash icon next to its name in the column header. You can also click its button in the popular models row to deselect it.

Supported providers

The AI Gateway supports models from all providers in the LiteLLM registry. The catalog is bundled with the gateway, so it doesn't require any API keys to browse. Common providers include:

  • OpenAI: GPT-4o, GPT-4o Mini, o1, o3 and more
  • Anthropic: Claude Opus, Sonnet, Haiku families
  • Google: Gemini Pro, Flash, Nano models
  • Mistral: Mistral Large, Medium, Small
  • xAI: Grok models
  • Meta: Llama models via various hosts
  • Cohere: Command and Embed models
  • Amazon Bedrock: All Bedrock-hosted models
  • Azure OpenAI: Azure-hosted OpenAI models

Browsing the model catalog doesn't require provider API keys. Keys are only needed when you create an endpoint and start routing requests through the gateway.

Model modes

Models in the catalog are classified by their mode, which describes what kind of task they perform:

  • Chat: Conversational models that accept messages and return text responses. This is the most common mode.
  • Embedding: Models that convert text into vector representations for semantic search and similarity matching.
  • Image generation: Models that create images from text prompts (DALL-E, Stable Diffusion, etc.).
  • Audio transcription: Models that convert spoken audio into text (Whisper, etc.).
  • Completion: Legacy text completion models that predict the next tokens in a sequence.
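
The mode also tells you which request shape to use once a model sits behind an endpoint. As a hedged contrast with the chat sketch earlier on this page (same assumed OpenAI-compatible route, port and key, none of which this page defines), an embedding-mode model returns vectors rather than messages:

```python
# Illustrative only: embedding-mode models use the embeddings route and return vectors.
# Base URL, port and key are the same assumptions as in the earlier chat sketch.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8100/v1", api_key="YOUR_GATEWAY_KEY")

result = client.embeddings.create(
    model="text-embedding-3-small",      # an embedding-mode model from the catalog
    input="vector representations for semantic search",
)
print(len(result.data[0].embedding))     # dimensionality of the returned vector
```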

Feature icons reference

The Features column in the model table uses icons to indicate model capabilities:

  • Vision (Eye icon): Model can analyze images sent alongside text prompts
  • Function calling (Wrench icon): Model can call external tools and functions through structured tool-use APIs
  • PDF input (FileText icon): Model accepts PDF files directly without pre-processing
  • Prompt caching (Database icon): Provider supports caching repeated prompt prefixes to reduce cost and latency

Typical workflow

Here's how most teams use the Models page as part of their gateway setup:

  1. Browse the catalog: Use filters to find models that match your requirements (chat mode, vision support, etc.)
  2. Compare candidates: Switch to the Feature comparison tab to evaluate your shortlist side by side
  3. Estimate costs: Use the Cost calculator to project monthly spend based on your expected usage
  4. Add to endpoints: Click the Add button on your chosen model to create a gateway endpoint for it
  5. Test in playground: Use the AI Gateway Playground to test the endpoint before routing production traffic

Troubleshooting

If the models page shows an error or no data:

  • AI Gateway not running: The page shows "Failed to load model catalog. Is the AI Gateway running?" if it can't reach the gateway service. Make sure the AI Gateway is running on port 8100 (a quick connectivity check follows this list).
  • Empty catalog: If the gateway is running but the catalog is empty, the LiteLLM model registry may not have loaded. Restart the AI Gateway service.
  • Missing models: The catalog shows models from LiteLLM's built-in registry. Custom or self-hosted models won't appear here but can still be used in endpoints by entering the model name manually.
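
If you're not sure whether anything is listening on the gateway port at all, a plain TCP check rules out basic connectivity problems before you dig into logs. This assumes a default local install on port 8100 and is a generic socket test, not a documented gateway endpoint.

```python
# Generic TCP connectivity test for the AI Gateway's default port (not an official health check).
import socket

def port_open(host="localhost", port=8100, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if port_open():
    print("Something is listening on port 8100 - reload the Models page and check again.")
else:
    print("Nothing is listening on port 8100 - start or restart the AI Gateway service.")
```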