Rate Limits and Costs

Understanding and managing API usage is crucial for a smooth and cost-effective experience with Roo Code. This section explains how to track your token usage and costs. Rate limits default to 0 (disabled) and typically don't need adjustment. They are now configured per profile; see the API Configuration Profiles documentation for details on how to set them if needed.

Token Usage

Roo Code interacts with AI models using tokens. Tokens are essentially pieces of words. The number of tokens used in a request and response affects both the processing time and the cost.

Input Tokens: These are the tokens in your prompt, including the system prompt, your instructions, and any context provided (e.g., file contents).
Output Tokens: These are the tokens generated by the AI model in its response.

You can see the number of input and output tokens used for each interaction in the chat history.
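As a rough illustration of the input/output split, the sketch below totals token counts across several interactions. The `Interaction` type, the `total_usage` helper, and the sample numbers are hypothetical and are not part of Roo Code; they only show how per-interaction counts add up over a task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record of one request/response pair."""
    input_tokens: int   # system prompt + instructions + context
    output_tokens: int  # tokens generated by the model


def total_usage(interactions: list[Interaction]) -> tuple[int, int]:
    """Return (total input tokens, total output tokens) for a task."""
    total_in = sum(i.input_tokens for i in interactions)
    total_out = sum(i.output_tokens for i in interactions)
    return total_in, total_out


# Made-up numbers for illustration only.
history = [
    Interaction(input_tokens=9_500, output_tokens=420),
    Interaction(input_tokens=11_200, output_tokens=1_050),
    Interaction(input_tokens=12_000, output_tokens=800),
]
total_in, total_out = total_usage(history)
print(f"Input tokens: {total_in}, output tokens: {total_out}")
```

Note that in this made-up history the input side dominates, which is typical when file contents are included as context.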

Cost Calculation

Most AI providers charge based on the number of tokens used. Pricing varies depending on the provider and the specific model.

Roo Code automatically calculates the estimated cost of each API request based on the configured model's pricing. This cost is displayed in the chat history, next to the token usage.

Note:

The cost calculation is an estimate. The actual cost may vary slightly depending on the provider's billing practices.
Some providers may offer free tiers or credits. Check your provider's documentation for details.
Some providers offer prompt caching, which can greatly lower the cost of context that is repeated across requests.
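To make the per-request estimate concrete, here is a minimal sketch of a token-based cost calculation. It assumes prices are quoted in USD per one million tokens, with separate input and output rates. The `ModelPricing` type, the `estimate_cost` function, and the prices are hypothetical; this is not Roo Code's actual implementation, and it ignores caching discounts and free tiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical per-model prices in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: ModelPricing) -> float:
    """Return the estimated cost in USD of a single API request."""
    input_cost = input_tokens / 1_000_000 * pricing.input_per_million
    output_cost = output_tokens / 1_000_000 * pricing.output_per_million
    return input_cost + output_cost


# Illustrative prices only; check your provider's pricing page.
example_pricing = ModelPricing(input_per_million=3.0, output_per_million=15.0)
cost = estimate_cost(input_tokens=12_000, output_tokens=800, pricing=example_pricing)
print(f"Estimated cost: ${cost:.4f}")
```

Because output tokens are often priced higher than input tokens, a short response can still account for a noticeable share of the total.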

Limiting Auto-Approved Requests

To further help manage API costs and prevent unexpected expenses, Roo Code includes a "Max Requests" setting for auto-approved actions. This allows you to define a specific limit on how many consecutive API calls Roo Code can make without requiring your explicit re-approval during a task.

How it works: If you set a limit (e.g., 5 requests), Roo Code will perform up to 5 auto-approved API calls. Before making the 6th call, it will pause and prompt you to "Reset and Continue." [Figure: Notification when the auto-approved request limit is met.]
Configuration: This limit is configured within the "Auto-approve actions" settings. You can set a specific number or choose "Unlimited." For detailed steps on configuring this and other auto-approval settings, see the Auto-Approving Actions documentation. [Figure: Setting the "Max Requests" for auto-approved actions.]

This feature provides an additional safeguard, particularly for complex or long-running tasks where multiple API calls might be involved.
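The pause-and-reset behavior described above can be modeled with a simple counter. The following is a minimal sketch under stated assumptions, not Roo Code's actual implementation: the `AutoApprovalCounter` class and its method names are hypothetical, and `None` stands in for the "Unlimited" option.

```python
from typing import Optional


class AutoApprovalCounter:
    """Hypothetical model of a limit on consecutive auto-approved requests."""

    def __init__(self, max_requests: Optional[int]) -> None:
        # None stands in for the "Unlimited" option.
        self.max_requests = max_requests
        self.count = 0

    def may_auto_approve(self) -> bool:
        """Return True if the next request may proceed without asking the user."""
        if self.max_requests is None:
            return True
        return self.count < self.max_requests

    def record_request(self) -> None:
        """Count one auto-approved API call."""
        self.count += 1

    def reset(self) -> None:
        """Model the user choosing "Reset and Continue"."""
        self.count = 0


counter = AutoApprovalCounter(max_requests=5)
approved = 0
while counter.may_auto_approve():
    counter.record_request()
    approved += 1

# After 5 calls the counter blocks the 6th until the user resets it.
print(f"Auto-approved calls before pausing: {approved}")
counter.reset()
print(f"Can continue after reset: {counter.may_auto_approve()}")
```

Resetting the count rather than raising the limit means each batch of auto-approved calls gets a fresh, explicit confirmation from the user.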

Tips for Optimizing Token Usage

Be Concise: Use clear and concise language in your prompts. Avoid unnecessary words or details.
Provide Only Relevant Context: Use context mentions (@file.ts, @folder/) selectively. Only include the files that are directly relevant to the task.
Break Down Tasks: Divide large tasks into smaller, more focused sub-tasks.
Use Custom Instructions: Provide custom instructions to guide Roo Code's behavior and reduce the need for lengthy explanations in each prompt.
Choose the Right Model: Some models are more cost-effective than others. Consider using a smaller, faster model for tasks that don't require the full power of a larger model.
Use Modes: Different modes can access different tools. For example, Architect mode can't modify code, which makes it a safe choice for analyzing a complex codebase without the risk of accidentally allowing expensive operations.
Disable MCP If Not Used: If you're not using MCP (Model Context Protocol) features, consider disabling MCP in the MCP settings. This significantly reduces the size of the system prompt and saves tokens.

By understanding and managing your API usage, you can use Roo Code effectively and efficiently.
