Interface: LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>
Type parameters
| Name | Type |
| --- | --- |
| CONTEXT_WINDOW_SIZE | extends number \| undefined |
Hierarchy
- TextGenerationModelSettings
  ↳ LlamaCppCompletionModelSettings
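For orientation, here is a minimal sketch of how these settings are typically passed when constructing a Llama.cpp completion model. The llamacpp.CompletionTextGenerator factory and the object-style generateText call are assumed from ModelFusion's usual API surface; treat this as illustrative rather than definitive:

```ts
import { llamacpp, generateText } from "modelfusion";

// Sketch: a completion model against a local llama.cpp server.
// Every property shown here is an optional setting from this
// interface (or inherited from TextGenerationModelSettings).
const model = llamacpp
  .CompletionTextGenerator({
    contextWindowSize: 4096, // must match the model loaded in your server
    maxGenerationTokens: 512,
    temperature: 0.8,
    stopSequences: ["\n\n"],
  })
  .withTextPrompt(); // enables plain-text prompts via the prompt template

const text = await generateText({
  model,
  prompt: "Write a haiku about llamas.",
});
```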
Properties
api
• Optional api: ApiConfiguration
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:44
cachePrompt
• Optional cachePrompt: boolean
Cache the prompt and generation so that the entire prompt does not need to be reprocessed when only part of it changes (default: false).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:173
contextWindowSize
• Optional contextWindowSize: CONTEXT_WINDOW_SIZE
Specify the context window size of the model that you have loaded in your Llama.cpp server.
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:50
frequencyPenalty
• Optional frequencyPenalty: number
Repeat alpha frequency penalty (default: 0.0, 0.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:112
grammar
• Optional grammar: string
Set a grammar for grammar-based sampling (default: no grammar).
See
https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:142
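As an illustration, a sketch of constraining output with a small GBNF grammar (grammar syntax per the llama.cpp grammars README linked above; the factory call is assumed as elsewhere on this page):

```ts
import { llamacpp } from "modelfusion";

// Sketch: force the completion to be exactly "yes" or "no"
// using a GBNF grammar string.
const yesNoGrammar = `root ::= "yes" | "no"`;

const constrainedModel = llamacpp.CompletionTextGenerator({
  grammar: yesNoGrammar,
  maxGenerationTokens: 4, // the grammar already bounds the output
});
```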
ignoreEos
• Optional ignoreEos: boolean
Ignore end of stream token and continue generating (default: false).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:153
logitBias
• Optional logitBias: [number, number | false][]
Modify the likelihood of a token appearing in the generated text completion. For example, use "logit_bias": [[15043,1.0]] to increase the likelihood of the token 'Hello', or "logit_bias": [[15043,-1.0]] to decrease it. Setting the value to false, as in "logit_bias": [[15043,false]], ensures that the token 'Hello' is never produced (default: []).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:162
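A sketch of the two forms described above (token id 15043 is the 'Hello' example from the description; real ids depend on your model's tokenizer):

```ts
import { llamacpp } from "modelfusion";

// Sketch: bias individual tokens by id.
const biasedModel = llamacpp.CompletionTextGenerator({
  logitBias: [[15043, -1.0]], // make token 15043 ('Hello') less likely
});

// Using false instead of a number bans the token entirely:
const bannedModel = llamacpp.CompletionTextGenerator({
  logitBias: [[15043, false]],
});
```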
maxGenerationTokens
• Optional maxGenerationTokens: number
Specifies the maximum number of tokens (words, punctuation, parts of words) that the model can generate in a single response. It helps to control the length of the output.
Does nothing if the model does not support this setting.
Example: maxGenerationTokens: 1000
Inherited from
TextGenerationModelSettings.maxGenerationTokens
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:28
minP
• Optional minP: number
The minimum probability for a token to be considered, relative to the probability of the most likely token (default: 0.05).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:70
mirostat
• Optional mirostat: number
Enable Mirostat sampling, controlling perplexity during text generation (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:125
mirostatEta
• Optional mirostatEta: number
Set the Mirostat learning rate, parameter eta (default: 0.1).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:135
mirostatTau
• Optional mirostatTau: number
Set the Mirostat target entropy, parameter tau (default: 5.0).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:130
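Taken together, the three Mirostat settings might be configured as in this sketch (note that in llama.cpp, an active Mirostat sampler is generally used in place of the regular top-k/top-p samplers):

```ts
import { llamacpp } from "modelfusion";

// Sketch: enable Mirostat 2.0 with its documented default parameters.
const mirostatModel = llamacpp.CompletionTextGenerator({
  mirostat: 2, // 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
  mirostatTau: 5.0, // target entropy
  mirostatEta: 0.1, // learning rate
});
```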
nKeep
• Optional nKeep: number
Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded. By default, this value is set to 0 (meaning no tokens are kept). Use -1 to retain all tokens from the prompt.
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:77
nProbs
• Optional nProbs: number
If greater than 0, the response also contains the probabilities of the top N tokens for each generated token (default: 0).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:168
numberOfGenerations
• Optional numberOfGenerations: number
Number of texts to generate.
Specifies the number of responses or completions the model should generate for a given prompt. This is useful when you need multiple different outputs or ideas for a single prompt. The model will generate 'n' distinct responses, each based on the same initial prompt. In a streaming model, this results in all responses being streamed back in real time.
Does nothing if the model does not support this setting.
Example: numberOfGenerations: 3
// The model will produce 3 different responses.
Inherited from
TextGenerationModelSettings.numberOfGenerations
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:55
observers
• Optional observers: FunctionObserver[]
Observers that are called when the model is used in run functions.
Inherited from
TextGenerationModelSettings.observers
Defined in
packages/modelfusion/src/model-function/Model.ts:8
penalizeNl
• Optional penalizeNl: boolean
Penalize newline tokens when applying the repeat penalty (default: true).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:102
penaltyPrompt
• Optional penaltyPrompt: string | number[]
This will replace the prompt for the purpose of the penalty evaluation. Can be either null, a string or an array of numbers representing tokens (default: null = use the original prompt).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:119
presencePenalty
• Optional presencePenalty: number
Repeat alpha presence penalty (default: 0.0, 0.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:107
promptTemplate
• Optional promptTemplate: TextGenerationPromptTemplateProvider<LlamaCppCompletionPrompt>
Prompt template provider that is used when calling .withTextPrompt(), .withInstructionPrompt(), or .withChatPrompt().
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:184
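A sketch of wiring a template provider; llamacpp.prompt.ChatML is assumed here as one of the provider's template presets (pick the one matching your model's chat format):

```ts
import { llamacpp } from "modelfusion";

// Sketch: set the template provider that matches your model's chat
// format, then use the chat-style prompt helper it enables.
const chatModel = llamacpp
  .CompletionTextGenerator({
    promptTemplate: llamacpp.prompt.ChatML, // assumed preset name
  })
  .withChatPrompt();
```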
repeatLastN
• Optional repeatLastN: number
Last n tokens to consider for penalizing repetition (default: 64, 0 = disabled, -1 = ctx-size).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:97
repeatPenalty
• Optional repeatPenalty: number
Control the repetition of token sequences in the generated text (default: 1.1).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:92
seed
• Optional seed: number
Set the random number generator (RNG) seed (default: -1, -1 = random seed).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:148
slotId
• Optional slotId: number
Assign the completion task to a specific slot. If set to -1, the task will be assigned to an idle slot (default: -1).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:179
stopSequences
• Optional stopSequences: string[]
Stop sequences to use. Stop sequences are an array of strings or a single string that the model will recognize as end-of-text indicators. The model stops generating more content when it encounters any of these strings. This is particularly useful in scripted or formatted text generation, where a specific end point is required. Stop sequences are not included in the generated text.
Does nothing if the model does not support this setting.
Example: stopSequences: ['\n', 'END']
Inherited from
TextGenerationModelSettings.stopSequences
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:41
temperature
• Optional temperature: number
Adjust the randomness of the generated text (default: 0.8).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:55
tfsZ
• Optional tfsZ: number
Enable tail free sampling with parameter z (default: 1.0, 1.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:82
topK
• Optional topK: number
Limit the next token selection to the K most probable tokens (default: 40).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:60
topP
• Optional topP: number
Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P (default: 0.95).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:65
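The sampling-related settings on this page (temperature, topK, topP, minP, and the repetition penalties) compose in a single settings object; a sketch with illustrative values:

```ts
import { llamacpp } from "modelfusion";

// Sketch: a moderately conservative sampling configuration combining
// several of the settings documented on this page. Values are
// illustrative, not tuned recommendations.
const sampledModel = llamacpp.CompletionTextGenerator({
  temperature: 0.7,
  topK: 40,
  topP: 0.9,
  minP: 0.05,
  repeatPenalty: 1.1,
});
```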
trimWhitespace
• Optional trimWhitespace: boolean
When true, the leading and trailing white space and line terminator characters are removed from the generated text.
Default: true.
Inherited from
TextGenerationModelSettings.trimWhitespace
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:63
typicalP
• Optional typicalP: number
Enable locally typical sampling with parameter p (default: 1.0, 1.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:87