Add Support for Counting Tokens for Azure Code and Update in Redis #63100
base: main
Conversation
"gpt-4" | ||
] | ||
}, | ||
"azureCompletionModel": { |
I hate adding extra parameters to the siteconfig and requestparams. However, I had to add them because Azure does not support naming the model directly: requests specify a deployment name, which can differ from the model name and map to any underlying model. Accurate token counting requires the real model name, and while there are ways to retrieve it from the deployment, they involve complex logic. So, despite being somewhat inelegant, introducing explicit config parameters for the chat model name and the completion model name is the simplest solution given the constraints. A config sketch is below.
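A hypothetical site-config fragment to illustrate the deployment-name/model-name split (only azureChatModel and azureCompletionModel come from this PR; the deployment names are made up for illustration):

```json
{
  "completions": {
    "provider": "azure-openai",
    "chatModel": "my-gpt4-deployment",
    "completionModel": "my-turbo-deployment",
    "azureChatModel": "gpt-4",
    "azureCompletionModel": "gpt-35-turbo"
  }
}
```

With this, token counting can be driven by "gpt-4" / "gpt-35-turbo" while requests still target the named deployments.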
- github.com/pkoukk/tiktoken-go v0.1.6
- github.com/pkoukk/tiktoken-go-loader v0.0.1
+ github.com/pkoukk/tiktoken-go v0.1.7
+ github.com/pkoukk/tiktoken-go-loader v0.0.2-0.20240522064338-c17e8bc0f699
Had to update to support gpt-4o; a quick sketch of what the bump enables is below.
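A minimal standalone sketch, assuming tiktoken-go >= v0.1.7 (which maps gpt-4o to the o200k_base encoding) and the loader's offline BPE loader:

```go
package main

import (
	"fmt"

	tiktoken "github.com/pkoukk/tiktoken-go"
	tiktoken_loader "github.com/pkoukk/tiktoken-go-loader"
)

func main() {
	// Offline BPE loader: avoids fetching encoding files over the network.
	tiktoken.SetBpeLoader(tiktoken_loader.NewOfflineLoader())

	// Resolving "gpt-4o" fails on tiktoken-go < v0.1.7, which predates o200k_base.
	enc, err := tiktoken.EncodingForModel("gpt-4o")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(enc.Encode("hello world", nil, nil))) // token count
}
```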
Will review this tomorrow; apologies for the delay.
logger.Warn("Failed to count tokens with the token manager %w ", log.Error(err)) | ||
logger.Warn("Failed to count output tokens with the token manager %w ", log.Error(err)) | ||
} | ||
if inputTokens != 0 && outputTokens != 0 { |
Check again
Added the validation check in this commit as well.
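For readers following the thread, a rough standalone sketch of the record-only-when-both-counts-are-valid idea; the go-redis client, key scheme, and helper name are hypothetical stand-ins, not the PR's actual token-manager code:

```go
package example

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// recordTokenUsage increments per-model counters, skipping the write when
// either count is zero (mirroring the PR's inputTokens/outputTokens check).
func recordTokenUsage(ctx context.Context, rdb *redis.Client, model string, inputTokens, outputTokens int) error {
	if inputTokens == 0 || outputTokens == 0 {
		return nil // counting failed or was empty; don't skew the stats
	}
	pipe := rdb.Pipeline()
	pipe.IncrBy(ctx, "llm_usage:"+model+":input", int64(inputTokens))
	pipe.IncrBy(ctx, "llm_usage:"+model+":output", int64(outputTokens))
	_, err := pipe.Exec(ctx)
	return err
}
```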
Force-pushed from 561199a to 52b92b9
inputTokens, err := NumTokensFromAzureOpenAiMessages(requestParams.Messages, requestParams.AzureCompletionModel)
if err != nil {
	logger.Warn("Failed to count input tokens with the token manager", log.Error(err))
	return err
Are you positive we want these cases returning token counting errors? I'm unsure.
On the one hand, if token counting is failing, that is bad. But on the other hand, if errors are returned here because token counting is failing... then Cody is entirely, 100%, broken for users.
Seems to me that we're better off logging these errors and simply not recording token counts should there be an issue here. Something like the sketch below.
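A minimal sketch of the suggested log-and-continue behavior; this is the reviewer's proposal, not the PR's current code, and it slots into the function shown in the diff above:

```go
inputTokens, err := NumTokensFromAzureOpenAiMessages(requestParams.Messages, requestParams.AzureCompletionModel)
if err != nil {
	// Log and fall through with a zero count instead of failing the request:
	// a broken token counter should never take down completions entirely.
	logger.Warn("Failed to count input tokens with the token manager", log.Error(err))
	inputTokens = 0
}
```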
if hasAzure && !(hasAzureChatModel && hasAzureCompletionModel) {
	invalid(NewSiteProblem(`when using the azure-openai provider it is mandatory to set both completions.azureChatModel and completions.azureCompletionModel for proper LLM token usage`))
}
Ideally no validation logic would live in this centralized function, as it then becomes too large.
You should add a contributed validator instead, e.g. inside an init function in azureopenai/openai.go. For example, like the sketch below.
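A rough sketch of that suggestion; it assumes the conf package's ContributeValidator hook and NewSiteProblems helper, and the Completions field names are approximations based on what this PR adds:

```go
package azureopenai

import (
	"github.com/sourcegraph/sourcegraph/internal/conf"
	"github.com/sourcegraph/sourcegraph/internal/conf/conftypes"
)

func init() {
	// Register an azure-openai-specific validator instead of extending the
	// central site-config validation function.
	conf.ContributeValidator(func(c conftypes.SiteConfigQuerier) conf.Problems {
		cc := c.SiteConfig().Completions
		if cc == nil || cc.Provider != "azure-openai" {
			return nil
		}
		if cc.AzureChatModel == "" || cc.AzureCompletionModel == "" {
			return conf.NewSiteProblems("when using the azure-openai provider it is mandatory to set both completions.azureChatModel and completions.azureCompletionModel for proper LLM token usage")
		}
		return nil
	})
}
```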
Two minor comments; will be LGTM once those are addressed.
Description:
This PR introduces support for counting tokens in the Azure code path and updating those counts in Redis. The token-counting logic is embedded directly in the Azure code rather than going through a single standardized point for all token-counting logic.
Reasoning:
Future Considerations:
Changes:
Testing:
Conclusion:
This is a temporary solution to enable token counting in Azure. We will adapt our approach once Azure enhances its feature set to report token usage from its streaming endpoint.
Test plan
Tested locally with a debugger
Changelog