Defines the input to the Google Vertex AI chat model.

Type Parameters

  • AuthOptions

Hierarchy

  • GoogleVertexAIBaseLLMInput<AuthOptions>
    • GoogleVertexAIChatInput

Properties

apiVersion?: string

The version of the API functions to call. It forms part of the request path.

authOptions?: AuthOptions
cache?: boolean | BaseCache<Generation[]>
callbackManager?: CallbackManager

Deprecated

Use callbacks instead

callbacks?: Callbacks
concurrency?: number

Deprecated

Use maxConcurrency instead

context?: string

Instructions for how the model should respond.

endpoint?: string

Hostname for the API call

examples?: ChatExample[]

Example exchanges that help the model understand what an appropriate response looks like.

location?: string

Region where the LLM is stored

maxConcurrency?: number

The maximum number of concurrent calls that can be made. Defaults to Infinity, which means no limit.
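To illustrate what a concurrency cap means in practice, here is a minimal sketch of a limiter that runs at most `maxConcurrency` tasks at once and queues the rest. The `ConcurrencyLimiter` class and its members are hypothetical names for illustration only; this is not the library's implementation.

```typescript
// Hypothetical helper: runs at most `maxConcurrency` tasks at a time.
class ConcurrencyLimiter {
  private active = 0;
  private readonly queue: Array<() => void> = [];
  private readonly maxConcurrency: number;

  // Infinity mirrors the documented default of "no limit".
  constructor(maxConcurrency: number = Infinity) {
    this.maxConcurrency = maxConcurrency;
  }

  get activeCount(): number {
    return this.active;
  }

  get pendingCount(): number {
    return this.queue.length;
  }

  run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      // Release the slot and start the next queued task, if any.
      const release = () => {
        this.active -= 1;
        const next = this.queue.shift();
        if (next) next();
      };
      const start = () => {
        this.active += 1;
        Promise.resolve()
          .then(task)
          .then(
            (value) => {
              release();
              resolve(value);
            },
            (error) => {
              release();
              reject(error);
            }
          );
      };
      if (this.active < this.maxConcurrency) start();
      else this.queue.push(start);
    });
  }
}
```

With a limit of 2, a third call to `run` waits in the queue until one of the first two tasks settles.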

maxOutputTokens?: number

Maximum number of tokens to generate in the completion.

maxRetries?: number

The maximum number of retries that can be made for a single call, with an exponential backoff between each attempt. Defaults to 6.
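As a rough illustration of retrying with exponential backoff, the sketch below doubles the wait after each failed attempt. The names `backoffDelayMs` and `withRetries` and the 1000 ms base delay are assumptions made for this example; the library's actual delays and internals may differ. Only the default of 6 retries comes from the description above.

```typescript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseDelayMs: number = 1000): number {
  return baseDelayMs * Math.pow(2, attempt);
}

// Hypothetical wrapper: one initial call plus up to `maxRetries` retries.
async function withRetries<T>(
  call: () => Promise<T>,
  maxRetries: number = 6,
  baseDelayMs: number = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await call();
    } catch (error) {
      // Give up once the retry budget is spent.
      if (attempt >= maxRetries) throw error;
      const delay = backoffDelayMs(attempt, baseDelayMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Under these assumptions the waits grow as 1 s, 2 s, 4 s, and so on, which is why a large `maxRetries` can make a failing call take a long time to give up.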

metadata?: Record<string, unknown>
model?: string

Model to use

onFailedAttempt?: FailedAttemptHandler

Custom handler to handle failed attempts. Takes the originally thrown error object as input, and should itself throw an error if the input error is not retryable.
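A handler along these lines might look like the following sketch. It assumes, purely for illustration, that failed calls surface an error object with an optional numeric `status` field and that the listed HTTP statuses are not worth retrying. `FailedAttemptHandlerSketch` is a local stand-in, not the library's `FailedAttemptHandler` type.

```typescript
// Hypothetical stand-in for the library's FailedAttemptHandler type.
type FailedAttemptHandlerSketch = (error: unknown) => void;

// Assumption: these statuses indicate a request that will not succeed on retry.
const NON_RETRYABLE_STATUSES: number[] = [400, 401, 403, 404];

const handleFailedAttempt: FailedAttemptHandlerSketch = (error) => {
  const status =
    typeof error === "object" && error !== null
      ? (error as { status?: number }).status
      : undefined;
  // Rethrow to stop retrying; returning normally allows another attempt.
  if (status !== undefined && NON_RETRYABLE_STATUSES.indexOf(status) !== -1) {
    throw error;
  }
};
```

Returning without throwing signals that the error is retryable, so the retry logic governed by maxRetries can try again.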

tags?: string[]
temperature?: number

Sampling temperature to use

topK?: number

Top-k changes how the model selects tokens for output.

A top-k of 1 means the selected token is the most probable among all tokens in the model’s vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).
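The top-k step described above can be sketched as a filter over candidate tokens: keep the k most probable and renormalize. `TokenProb` and `topKFilter` are hypothetical names for this illustration; the model performs this selection server-side, and the final temperature-based sampling is not shown.

```typescript
interface TokenProb {
  token: string;
  prob: number;
}

// Keep the k most probable candidates and rescale them to sum to 1.
function topKFilter(candidates: TokenProb[], k: number): TokenProb[] {
  const kept = [...candidates]
    .sort((a, b) => b.prob - a.prob)
    .slice(0, Math.max(1, k));
  const total = kept.reduce((sum, c) => sum + c.prob, 0);
  return kept.map((c) => ({ token: c.token, prob: c.prob / total }));
}
```

With k set to 1 only the single most probable token survives, which is the greedy decoding case mentioned above.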

topP?: number

Top-p changes how the model selects tokens for output.

Tokens are selected from most probable to least until the sum of their probabilities equals the top-p value.

For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-p value is 0.5, then the model will select either A or B as the next token (using temperature), and C is excluded.
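The cumulative cutoff can be sketched as follows, using the A, B, C example from the text. `Candidate` and `topPFilter` are hypothetical names for this illustration, and the small epsilon is an assumption added to absorb floating-point rounding. The actual selection happens inside the model.

```typescript
interface Candidate {
  token: string;
  prob: number;
}

// Keep the most probable candidates until their cumulative probability reaches topP.
function topPFilter(candidates: Candidate[], topP: number): Candidate[] {
  const sorted = [...candidates].sort((a, b) => b.prob - a.prob);
  const kept: Candidate[] = [];
  let cumulative = 0;
  for (const candidate of sorted) {
    kept.push(candidate);
    cumulative += candidate.prob;
    // Epsilon guards against rounding error in the running sum.
    if (cumulative >= topP - 1e-9) break;
  }
  return kept;
}

const nucleus = topPFilter(
  [
    { token: "A", prob: 0.3 },
    { token: "B", prob: 0.2 },
    { token: "C", prob: 0.1 },
  ],
  0.5
);
// nucleus contains A and B; C falls outside the 0.5 cutoff.
```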

verbose?: boolean
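For orientation, the sketch below assembles a few of the properties listed above into a single options object. `ChatInputSketch` is a hypothetical local stand-in that mirrors only a subset of the interface, and every value shown (including the model name and region) is an illustrative assumption rather than a documented default.

```typescript
// Hypothetical local stand-in mirroring a subset of the properties above.
interface ChatInputSketch {
  model?: string;
  location?: string;
  context?: string;
  temperature?: number;
  maxOutputTokens?: number;
  topK?: number;
  topP?: number;
  maxRetries?: number;
  maxConcurrency?: number;
  verbose?: boolean;
}

const chatInput: ChatInputSketch = {
  model: "chat-bison", // example model name
  location: "us-central1", // example region
  context: "Answer concisely and cite sources when possible.",
  temperature: 0.2, // lower values give more deterministic output
  maxOutputTokens: 256,
  topK: 40,
  topP: 0.8,
  maxRetries: 6,
  maxConcurrency: 4,
  verbose: false,
};
```

All fields are optional, so in practice only the settings that differ from the defaults need to be supplied.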
