Optional
fields: GoogleVertexAIChatInput<WebGoogleAuthOptions>
Help the model understand what an appropriate response is.
Maximum number of tokens to generate in the completion.
Model to use.
Sampling temperature to use.
Top-k changes how the model selects tokens for output.
A top-k of 1 means the selected token is the most probable among all tokens in the model’s vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).
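The top-k description above can be sketched with a toy filter. This is an illustration only, not how Vertex AI implements sampling; the `TokenProb` type, the `topKFilter` function, and the sample probabilities are hypothetical names and values chosen for the example.

```typescript
// Illustrative sketch only: a toy top-k filter, not the Vertex AI implementation.
// TokenProb and topKFilter are hypothetical names introduced for this example.
type TokenProb = { token: string; prob: number };

function topKFilter(candidates: TokenProb[], k: number): TokenProb[] {
  if (k < 1 || Math.floor(k) !== k) {
    throw new Error("k must be a positive integer");
  }
  // Sort a copy from most to least probable, then keep the first k entries.
  const sorted = candidates.slice().sort((a, b) => b.prob - a.prob);
  return sorted.slice(0, k);
}

// Made-up probabilities, deliberately listed out of order.
const vocab: TokenProb[] = [
  { token: "B", prob: 0.25 },
  { token: "A", prob: 0.5 },
  { token: "D", prob: 0.1 },
  { token: "C", prob: 0.15 },
];

// topK = 1 keeps only the single most probable token (greedy decoding).
const greedy = topKFilter(vocab, 1);
// topK = 3 keeps the three most probable tokens; the next token would then
// be sampled from this reduced set using temperature.
const top3 = topKFilter(vocab, 3);
```

With these sample values, `greedy` contains only token A, and `top3` contains A, B, and C.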
Top-p changes how the model selects tokens for output.
Tokens are selected from most probable to least until the sum of their probabilities equals the top-p value.
For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-p value is 0.5, then the model selects either A or B as the next token (using temperature) and does not consider C.
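The A, B, C example can be reproduced with a small filter. This is a sketch under stated assumptions rather than the service's actual algorithm: `Candidate` and `topPFilter` are hypothetical names, and the small tolerance used in the comparison is an addition to guard against floating-point rounding.

```typescript
// Illustrative sketch only: a toy top-p (nucleus) filter.
// Candidate and topPFilter are hypothetical names introduced for this example.
type Candidate = { token: string; prob: number };

function topPFilter(candidates: Candidate[], p: number): Candidate[] {
  // Walk tokens from most to least probable.
  const sorted = candidates.slice().sort((a, b) => b.prob - a.prob);
  const kept: Candidate[] = [];
  let cumulative = 0;
  for (const c of sorted) {
    kept.push(c);
    cumulative += c.prob;
    // Stop once the running sum reaches p (tolerance is an assumption added
    // here to absorb floating-point rounding).
    if (cumulative >= p - 1e-9) {
      break;
    }
  }
  return kept;
}

// The example from the text: A = 0.3, B = 0.2, C = 0.1 with topP = 0.5.
const nucleus = topPFilter(
  [
    { token: "A", prob: 0.3 },
    { token: "B", prob: 0.2 },
    { token: "C", prob: 0.1 },
  ],
  0.5
);
```

Here `nucleus` holds A and B, because their probabilities sum to the top-p value; C is excluded, and the next token would be sampled from A and B using temperature.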
Creates an instance of the Google Vertex AI chat model.
The messages for the model instance.
A new instance of the Google Vertex AI chat model.
Static
convert
Converts a prediction from the Google Vertex AI chat model to a chat generation.
The prediction to convert.
The converted chat generation.
Static
convert
Enables calls to Google Cloud's Vertex AI API to access Large Language Models in a chat-like fashion.
This entrypoint and class are intended to be used in web environments like Edge functions where you do not have access to the file system. It supports passing service account credentials either through the "GOOGLE_VERTEX_AI_WEB_CREDENTIALS" environment variable or directly as "authOptions.credentials".
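The two credential sources can be pictured with a small helper. This is a hypothetical sketch, not part of the library: `resolveWebCredentials` and `ServiceAccountCredentials` are invented names, the environment is passed in as a plain object so the snippet has no runtime dependencies, and both the JSON-string format of the variable and the precedence of explicit credentials are assumptions.

```typescript
// Hypothetical helper for illustration; it is not part of the library.
// Assumption: the environment variable holds the service account key as a JSON string.
type ServiceAccountCredentials = Record<string, unknown>;

function resolveWebCredentials(
  env: Record<string, string | undefined>,
  explicit?: ServiceAccountCredentials
): ServiceAccountCredentials {
  // Assumption: credentials passed directly (as in authOptions.credentials)
  // take precedence over the environment variable.
  if (explicit !== undefined) {
    return explicit;
  }
  const raw = env["GOOGLE_VERTEX_AI_WEB_CREDENTIALS"];
  if (raw === undefined || raw === "") {
    throw new Error("No service account credentials were provided");
  }
  return JSON.parse(raw) as ServiceAccountCredentials;
}

// Usage with a placeholder environment object (the values are made up).
const fromEnv = resolveWebCredentials({
  GOOGLE_VERTEX_AI_WEB_CREDENTIALS:
    '{"type":"service_account","project_id":"example-project"}',
});
```

In an Edge function the environment object would come from whatever mechanism the platform provides; the resolved object is what one would then supply as `authOptions.credentials`.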
Example