Add Support for Counting Tokens for Azure Code and Update in Redis #63100

Open · wants to merge 6 commits into main
Conversation

@arafatkatze (Contributor) commented Jun 5, 2024

Description:

This PR introduces support for counting tokens in the Azure code and updating those counts in Redis. The token-counting logic is embedded directly in the Azure code rather than at a standardized point shared by all providers.

Reasoning:

•	Unlike OpenAI, Azure does not currently support obtaining token usage from its streaming endpoint.
•	To enable immediate functionality, the token-counting logic is placed within the Azure code itself.
•	The implementation supports GPT-4o.

Future Considerations:

•	When Azure eventually adds support for token usage from the streaming endpoint, we will migrate to using Azure’s built-in capabilities.
•	This will ensure full utilization of Azure OpenAI features as they achieve parity with OpenAI.

Changes:

•	Added token-counting logic to the Azure code.
•	Updated Redis with the token counts (a sketch of the approach follows this list).
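
For concreteness, here is a minimal sketch of the approach. All names are hypothetical (recordTokenUsage and the Redis key layout do not appear in this PR under these names); it assumes tiktoken-go for local tokenization and go-redis for the counters:

package azureopenai

import (
	"context"

	"github.com/pkoukk/tiktoken-go"
	"github.com/redis/go-redis/v9"
)

// recordTokenUsage shows the shape of the change: tokenize the prompt and
// the completion locally (Azure's streaming endpoint does not report
// usage), then accumulate the counts in Redis.
func recordTokenUsage(ctx context.Context, rdb *redis.Client, model, prompt, completion string) error {
	enc, err := tiktoken.EncodingForModel(model) // "gpt-4o" needs tiktoken-go >= v0.1.7
	if err != nil {
		return err
	}
	input := len(enc.Encode(prompt, nil, nil))
	output := len(enc.Encode(completion, nil, nil))

	// Key layout is illustrative, not what the PR actually uses.
	if err := rdb.IncrBy(ctx, "completions:tokens:input:"+model, int64(input)).Err(); err != nil {
		return err
	}
	return rdb.IncrBy(ctx, "completions:tokens:output:"+model, int64(output)).Err()
}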

Testing:

•	Verified the implementation works with GPT-4o.

Conclusion:

This is a temporary solution to enable token counting for Azure. We will adapt our approach once Azure's streaming endpoint reports token usage.

Test plan

Tested locally with a debugger

Changelog

@cla-bot cla-bot bot added the cla-signed label Jun 5, 2024
@arafatkatze arafatkatze changed the title Add azure tokenization and token counting support Add Support for Counting Tokens for Azure Code and Update in Redis Jun 5, 2024
"gpt-4"
]
},
"azureCompletionModel": {
@arafatkatze (Contributor, Author) commented:

I hate adding extra parameters to the siteconfig and requestparams. However, I had to add them because Azure does not support naming the model directly; instead, you provide a deployment name, which can differ from the model name. Because of this discrepancy, I had to introduce extra config parameters to specify the chat model name and the completion model name.

This is necessary because Azure does not expose the underlying model name in any straightforward way. The deployment name used in requests can map to any model, and accurate token counting requires the real model name. Although there are ways to retrieve the model name from the deployment, they involve complex logic. Therefore, this approach, despite being somewhat inelegant, is the simplest solution given the constraints. (An illustrative config fragment follows.)
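
To make the deployment-vs-model distinction concrete, a site configuration might look roughly like this. The deployment names are made up; azureChatModel and azureCompletionModel are the fields this PR adds:

"completions": {
  "provider": "azure-openai",
  "chatModel": "contoso-gpt4o-chat",
  "completionModel": "contoso-gpt4o-completions",
  "azureChatModel": "gpt-4o",
  "azureCompletionModel": "gpt-4o"
}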

- github.com/pkoukk/tiktoken-go v0.1.6
- github.com/pkoukk/tiktoken-go-loader v0.0.1
+ github.com/pkoukk/tiktoken-go v0.1.7
+ github.com/pkoukk/tiktoken-go-loader v0.0.2-0.20240522064338-c17e8bc0f699
@arafatkatze (Contributor, Author) commented:
Had to update these to support GPT-4o.
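
(GPT-4o uses the newer o200k_base encoding, which tiktoken-go only recognizes from v0.1.7; the loader bump presumably ships the matching embedded BPE file so counting stays offline. A rough sketch of the offline-loader wiring, per the loader's documented usage:)

package azureopenai

import (
	"github.com/pkoukk/tiktoken-go"
	tiktoken_loader "github.com/pkoukk/tiktoken-go-loader"
)

func init() {
	// Use the embedded offline BPE loader so counting tokens never has
	// to fetch encoding files over the network at runtime.
	tiktoken.SetBpeLoader(tiktoken_loader.NewOfflineLoader())
}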

@slimsag (Member) commented Jun 6, 2024

Will review this tomorrow; apologies for the delay.

logger.Warn("Failed to count tokens with the token manager %w ", log.Error(err))
logger.Warn("Failed to count output tokens with the token manager %w ", log.Error(err))
}
if inputTokens != 0 && outputTokens != 0 {
@arafatkatze (Contributor, Author) commented:

Check again

@arafatkatze (Contributor, Author) commented Jun 6, 2024

Added the validation check in this commit as well.
[screenshot]

inputTokens, err := NumTokensFromAzureOpenAiMessages(requestParams.Messages, requestParams.AzureCompletionModel)
if err != nil {
	logger.Warn("Failed to count input tokens with the token manager", log.Error(err))
	return err
@slimsag (Member) commented:
Are you positive we want these cases returning token-counting errors? I'm unsure.

On the one hand, if token counting is failing, that is bad. But on the other hand, if errors are returned here because token counting is failing, then Cody is entirely broken for users.

It seems to me we're better off logging these errors and simply not recording token counts should there be an issue here. (A sketch of that shape follows.)
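
In code, the suggested shape would be roughly the following, where recordTokenCounts is a hypothetical stand-in for the actual Redis update:

inputTokens, err := NumTokensFromAzureOpenAiMessages(requestParams.Messages, requestParams.AzureCompletionModel)
if err != nil {
	// Token counting is best-effort: log and continue so a counting
	// failure never breaks the completion itself.
	logger.Warn("Failed to count input tokens with the token manager", log.Error(err))
} else {
	recordTokenCounts(inputTokens) // hypothetical recording helper
}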


if hasAzure && !(hasAzureChatModel && hasAzureCompletionModel) {
	invalid(NewSiteProblem(`when using the azure-openai provider, it is mandatory to set both completions.azureChatModel and completions.azureCompletionModel for proper LLM token usage`))
}
@slimsag (Member) commented Jun 10, 2024:

Ideally no validation logic would live in this centralized function, as it would otherwise grow too large.

You should add a contributed validator instead, e.g. inside an init function in azureopenai/openai.go, for example like this. (A sketch of that shape follows.)
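
A sketch of what such a contributed validator could look like, assuming Sourcegraph's conf.ContributeValidator hook; the AzureChatModel/AzureCompletionModel fields are the ones this PR adds to the completions config:

package azureopenai

import (
	"github.com/sourcegraph/sourcegraph/internal/conf"
	"github.com/sourcegraph/sourcegraph/internal/conf/conftypes"
)

func init() {
	// Registering the check here keeps the provider-specific rule out of
	// the centralized validation function.
	conf.ContributeValidator(func(c conftypes.SiteConfigQuerier) conf.Problems {
		cc := c.SiteConfig().Completions
		if cc == nil || cc.Provider != "azure-openai" {
			return nil
		}
		if cc.AzureChatModel == "" || cc.AzureCompletionModel == "" {
			return conf.NewSiteProblems("when using the azure-openai provider, both completions.azureChatModel and completions.azureCompletionModel must be set for accurate LLM token usage")
		}
		return nil
	})
}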

@slimsag (Member) commented Jun 10, 2024

Two minor comments; will be LGTM once those are addressed.
