Designing effective agents

Understand and calculate agent costs

When designing your agent, it is important to understand and be able to calculate the cost of running the agent in production. In this section, you will learn the principles of how licensing and consumption costs work in Copilot Studio.

The cost of running your agent in Copilot Studio is calculated using the currency of “messages.” Everything the agent does—such as responding to a user question, retrieving information from a knowledge source, or performing an action using a tool—consumes messages. These messages are metered at runtime.

Messages are consumed at different rates depending on what the agent is doing. The main factors that affect the consumption of messages and the cost of your agent are as follows:

  • The capability being invoked

  • The orchestration model used

  • Tenant graph grounding, the channel, and the user license

The capability being invoked

The number of messages consumed depends on the amount of computing effort required for the capability being invoked. The more an agent grounds, reasons, or calls other AI tools, the more messages the interaction consumes. Using topics (classic answers) for predictable FAQs and reserving the higher-cost AI capabilities for the parts of the process that genuinely need them can help you manage these costs effectively.

  • The lightest capability is the “classic answer”: a static, authored response in a topic. The agent does not need any AI to produce this response to a user input. Classic answers are available only with classic orchestration, but using topics with generative orchestration can still reduce message consumption compared with using generative answers for everything.

  • Generative answers, where the agent is grounded in knowledge such as a website, document, or enterprise system, require the agent to use the large language model to generate a dynamic response. These answers consume more messages than topics or classic answers.

  • Specialized tools that extend the agent’s capabilities—such as AI Builder prompts, code execution, document processing, and reasoning—consume the most messages because they run dedicated models.
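The ordering described above can be turned into a rough per-conversation estimate. The following is a minimal sketch, not Microsoft's billing logic. The rate values in HYPOTHETICAL_RATES are placeholders chosen only to preserve that ordering (classic answers lowest, specialized tools highest), and the function and key names are invented for illustration. Substitute the published rates from the current licensing documentation before relying on any result.

```python
# Placeholder rates (messages per event). These are NOT published Copilot
# Studio rates; they only preserve the relative ordering described in the text.
HYPOTHETICAL_RATES = {
    "classic_answer": 1,
    "generative_answer": 2,
    "specialized_tool": 5,
}


def estimate_messages(events, rates=HYPOTHETICAL_RATES):
    """Sum the messages consumed by a mapping of {capability: event count}."""
    unknown = set(events) - set(rates)
    if unknown:
        raise KeyError(f"No rate defined for: {sorted(unknown)}")
    return sum(rates[name] * count for name, count in events.items())


# One hypothetical conversation: three topic replies, two grounded answers,
# and one call to a specialized tool.
conversation = {"classic_answer": 3, "generative_answer": 2, "specialized_tool": 1}
total = estimate_messages(conversation)
print(total)
```

Moving one of the grounded answers into an authored topic lowers the total, which is the design lever this section describes.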

The orchestration model used

As described earlier in this chapter, you can use either classic or generative orchestration for your agent. Classic orchestration uses lower-cost topics for classic answers. Generative orchestration uses the more powerful AI capabilities that are metered at a higher rate. Using classic orchestration for agents that handle narrow or predictable scenarios avoids unnecessary costs. Reserve generative orchestration for agents whose value and return on investment justify the higher rate.

When you build an agent that uses generative orchestration, the orchestrator consumes additional messages as it reasons about what it should do and when. This charge is separate from, and in addition to, the consumption cost of the action or answer itself, as described in the previous section.

For instance, if the agent reasons over a grounded knowledge source to find an answer, there is a consumption charge for the generative answer. If the agent also uses generative orchestration to decide which knowledge source to use, there is an extra consumption charge for that reasoning step.
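The way the two charges add up can be sketched in a few lines. This is an illustration under stated assumptions: the rates are placeholders rather than published values, the idea that orchestration is charged once per reasoning step is a simplification, and the function name is hypothetical.

```python
# Placeholder rates (messages). NOT published Copilot Studio rates.
HYPOTHETICAL_GENERATIVE_ANSWER_RATE = 2
HYPOTHETICAL_ORCHESTRATION_RATE = 1  # assumed charge per reasoning step


def interaction_messages(answer_rate, orchestration_rate=0, reasoning_steps=0):
    """Answer charge plus any orchestration reasoning charge for one interaction."""
    return answer_rate + orchestration_rate * reasoning_steps


# The same grounded answer under each orchestration model.
classic_total = interaction_messages(HYPOTHETICAL_GENERATIVE_ANSWER_RATE)
generative_total = interaction_messages(
    HYPOTHETICAL_GENERATIVE_ANSWER_RATE,
    orchestration_rate=HYPOTHETICAL_ORCHESTRATION_RATE,
    reasoning_steps=1,  # the orchestrator chooses which knowledge source to use
)
print(classic_total, generative_total)
```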

Tenant graph grounding, the channel, and the user license

When you have at least one Microsoft 365 Copilot license in your tenant, you have access to an additional premium feature that improves the way your agent searches in SharePoint and Microsoft 365 data sources. Deploying Microsoft 365 Copilot activates semantic search on the tenant. In Copilot Studio, this is referred to as “tenant graph grounding with semantic search.” This option is enabled by default in the settings of your agent if you have a Microsoft 365 Copilot license in the tenant.

This feature is primarily used in Copilot Studio for extending Microsoft 365 Copilot. However, you need to be aware of and careful about consumption costs if you are building an agent for users who do not have that license, or if you are publishing an agent using this capability outside a Microsoft 365 surface. A Microsoft 365 Copilot license includes tenant graph grounding with semantic search for that licensed user when the agent is published in a Microsoft 365 surface (e.g., Microsoft 365 Copilot, Teams, SharePoint).

If you build an agent grounded on SharePoint or any Microsoft 365 data source with this setting enabled, and then publish it in any of the following ways, you will incur additional message consumption costs based on usage:

  • Microsoft 365 Copilot Chat, Teams, and SharePoint for users who do not have a Microsoft 365 Copilot license

  • An internal channel that is not a Microsoft 365 surface (e.g., an intranet or custom app), even if the user has a Microsoft 365 Copilot license

  • A channel for external users
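The rules above can be condensed into a small decision function. This is a sketch of the logic as this section states it, not an official billing check. The channel labels and the function name are hypothetical, and the set of Microsoft 365 surfaces contains only the three examples named in the text.

```python
# Hypothetical channel labels; only the surfaces named in the text are listed.
M365_SURFACES = {"microsoft_365_copilot", "teams", "sharepoint"}


def grounding_is_metered(channel, user_has_m365_copilot_license, grounding_enabled=True):
    """Return True when tenant graph grounding adds message consumption.

    Grounding is included only when a licensed user reaches the agent
    through a Microsoft 365 surface. Every other combination is metered.
    """
    if not grounding_enabled:
        return False
    included = user_has_m365_copilot_license and channel in M365_SURFACES
    return not included


print(grounding_is_metered("teams", True))       # licensed user, Microsoft 365 surface
print(grounding_is_metered("teams", False))      # unlicensed user, Microsoft 365 surface
print(grounding_is_metered("intranet", True))    # licensed user, other internal channel
print(grounding_is_metered("external_site", False))  # external users
```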

The message consumption rate for this feature is relatively high, so costs add up quickly for an agent that is used very frequently. Weigh these consumption costs for an individual agent against the cost of a Microsoft 365 Copilot license for the users who will work with it. The license includes these costs as well as a wealth of other features.
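One way to frame that comparison is a simple break-even calculation. The sketch below is illustrative only. Every number in it (grounding rate, price per message, license price) is a placeholder rather than a published figure, the function name is hypothetical, and amounts are kept in whole cents to avoid rounding surprises.

```python
def compare_grounding_costs(users, queries_per_user, grounding_rate,
                            price_per_message_cents, license_price_cents):
    """Compare monthly metered grounding cost with licensing every user.

    Returns (cheaper_option, consumption_cents, license_cents).
    """
    messages = users * queries_per_user * grounding_rate
    consumption_cents = messages * price_per_message_cents
    license_cents = users * license_price_cents
    cheaper = "license" if license_cents < consumption_cents else "consumption"
    return cheaper, consumption_cents, license_cents


# Placeholder inputs, NOT published prices or rates.
light_use = compare_grounding_costs(
    users=100, queries_per_user=200, grounding_rate=10,
    price_per_message_cents=1, license_price_cents=3000)
heavy_use = compare_grounding_costs(
    users=100, queries_per_user=400, grounding_rate=10,
    price_per_message_cents=1, license_price_cents=3000)
print(light_use)
print(heavy_use)
```

With these placeholder inputs, doubling the number of grounded queries per user is enough to make licensing the cheaper option. The crossover point depends entirely on the real rates and prices.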

Purchasing message packs for your agent

There are two main ways to license Copilot Studio:

  • Prepaid packs of messages, billed monthly for the tenant. This is the better option when your consumption is high and relatively regular or predictable. You stack prepaid packs to get the volume of messages you need. Unused messages expire at the end of the month and do not roll over.

  • Post-paid “pay as you go” billing, where you pay for what you consume each month. The rate per message is higher than for the prepaid packs, but this option will be more cost-effective for agents with unpredictable, irregular, or seasonal usage.

Check the latest licensing guidelines from Microsoft for accurate and up-to-date cost and consumption calculations.
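The trade-off between the two options can be sketched with a short calculation. This is a rough model under stated assumptions, not a pricing tool. The pack size, pack price, and pay-as-you-go price are placeholders, the function names are hypothetical, and the prepaid pack count is assumed to be fixed for the year at the level needed for the busiest month. Amounts are in whole cents.

```python
def packs_needed(messages, pack_size):
    """Whole packs required to cover a month (packs stack, so round up)."""
    return -(-messages // pack_size)  # integer ceiling division


def yearly_prepaid_cents(monthly_messages, pack_size, pack_price_cents):
    """Prepaid cost when the pack count is sized for the busiest month.

    Unused messages do not roll over, so quiet months still pay full price.
    """
    packs = packs_needed(max(monthly_messages), pack_size)
    return packs * pack_price_cents * len(monthly_messages)


def yearly_payg_cents(monthly_messages, price_per_message_cents):
    """Pay-as-you-go cost: pay only for what is consumed."""
    return sum(monthly_messages) * price_per_message_cents


# Placeholder pricing, NOT published figures.
PACK_SIZE, PACK_PRICE_CENTS, PAYG_CENTS = 25_000, 20_000, 1

steady = [24_000] * 12                    # regular, predictable usage
seasonal = [2_000] * 10 + [50_000] * 2    # two busy months per year

for name, usage in (("steady", steady), ("seasonal", seasonal)):
    prepaid = yearly_prepaid_cents(usage, PACK_SIZE, PACK_PRICE_CENTS)
    payg = yearly_payg_cents(usage, PAYG_CENTS)
    print(name, prepaid, payg)
```

Under these placeholder numbers the steady profile favors prepaid packs and the seasonal profile favors pay-as-you-go, which matches the guidance in the list above.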

Using the Copilot Studio agent usage estimator

Microsoft provides an estimator to help you calculate the likely message consumption of your Copilot Studio agent. You can use this tool to estimate the costs and understand the impacts on message consumption of the different decisions you make in your agent design.

You can access and use the Copilot Studio agent usage estimator online at https://microsoft.github.io/copilot-studio-estimator/.