Interface: LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>
Type parameters
| Name | Type |
| --- | --- |
| CONTEXT_WINDOW_SIZE | extends number \| undefined |
Hierarchy
- TextGenerationModelSettings
  ↳ LlamaCppCompletionModelSettings
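For orientation, here is a minimal sketch of how these settings are typically passed when constructing a Llama.cpp completion model. The llamacpp.CompletionTextGenerator factory and the object-style generateText call are assumed from ModelFusion's usual API surface; treat this as illustrative rather than definitive:

```ts
import { llamacpp, generateText } from "modelfusion";

// Sketch: a completion model against a local llama.cpp server.
// Every property shown here is an optional setting from this
// interface (or inherited from TextGenerationModelSettings).
const model = llamacpp
  .CompletionTextGenerator({
    contextWindowSize: 4096, // must match the model loaded in your server
    maxGenerationTokens: 512,
    temperature: 0.8,
    stopSequences: ["\n\n"],
  })
  .withTextPrompt(); // enables plain-text prompts via the prompt template

const text = await generateText({
  model,
  prompt: "Write a haiku about llamas.",
});
```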
Properties
api
• Optional api: ApiConfiguration
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:44
cachePrompt
• Optional cachePrompt: boolean
Cache the prompt and generation so that the entire prompt does not need to be reprocessed when only part of it changes (default: false).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:173
contextWindowSize
• Optional contextWindowSize: CONTEXT_WINDOW_SIZE
Specify the context window size of the model that you have loaded in your Llama.cpp server.
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:50
frequencyPenalty
• Optional frequencyPenalty: number
Repeat alpha frequency penalty (default: 0.0, 0.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:112
grammar
• Optional grammar: string
Set a grammar for grammar-based sampling (default: no grammar).
See
https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:142
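As an illustration, a sketch of constraining output with a small GBNF grammar (grammar syntax per the llama.cpp grammars README linked above; the factory call is assumed as elsewhere on this page):

```ts
import { llamacpp } from "modelfusion";

// Sketch: force the completion to be exactly "yes" or "no"
// using a GBNF grammar string.
const yesNoGrammar = `root ::= "yes" | "no"`;

const constrainedModel = llamacpp.CompletionTextGenerator({
  grammar: yesNoGrammar,
  maxGenerationTokens: 4, // the grammar already bounds the output
});
```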
ignoreEos
• Optional ignoreEos: boolean
Ignore end of stream token and continue generating (default: false).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:153
logitBias
• Optional logitBias: [number, number | false][]
Modify the likelihood of a token appearing in the generated text completion. For example, use "logit_bias": [[15043,1.0]] to increase the likelihood of the token 'Hello', or "logit_bias": [[15043,-1.0]] to decrease it. Setting the value to false, as in "logit_bias": [[15043,false]], ensures that the token 'Hello' is never produced (default: []).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:162
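A sketch of the two forms described above (token id 15043 is the 'Hello' example from the description; real ids depend on your model's tokenizer):

```ts
import { llamacpp } from "modelfusion";

// Sketch: bias individual tokens by id.
const biasedModel = llamacpp.CompletionTextGenerator({
  logitBias: [[15043, -1.0]], // make token 15043 ('Hello') less likely
});

// Using false instead of a number bans the token entirely:
const bannedModel = llamacpp.CompletionTextGenerator({
  logitBias: [[15043, false]],
});
```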
maxGenerationTokens
• Optional maxGenerationTokens: number
Specifies the maximum number of tokens (words, punctuation, parts of words) that the model can generate in a single response. It helps to control the length of the output.
Does nothing if the model does not support this setting.
Example: maxGenerationTokens: 1000
Inherited from
TextGenerationModelSettings.maxGenerationTokens
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:28
minP
• Optional minP: number
The minimum probability for a token to be considered, relative to the probability of the most likely token (default: 0.05).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:70
mirostat
• Optional mirostat: number
Enable Mirostat sampling, controlling perplexity during text generation (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:125
mirostatEta
• Optional mirostatEta: number
Set the Mirostat learning rate, parameter eta (default: 0.1).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:135
mirostatTau
• Optional mirostatTau: number
Set the Mirostat target entropy, parameter tau (default: 5.0).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:130
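Taken together, the three Mirostat settings might be configured as in this sketch (note that in llama.cpp, an active Mirostat sampler is generally used in place of the regular top-k/top-p samplers):

```ts
import { llamacpp } from "modelfusion";

// Sketch: enable Mirostat 2.0 with its documented default parameters.
const mirostatModel = llamacpp.CompletionTextGenerator({
  mirostat: 2, // 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
  mirostatTau: 5.0, // target entropy
  mirostatEta: 0.1, // learning rate
});
```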
nKeep
• Optional nKeep: number
Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded. By default, this value is set to 0 (meaning no tokens are kept). Use -1 to retain all tokens from the prompt.
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:77
nProbs
• Optional nProbs: number
If greater than 0, the response also contains the probabilities of the top N tokens for each generated token (default: 0).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:168
numberOfGenerations
• Optional numberOfGenerations: number
Number of texts to generate.
Specifies the number of responses or completions the model should generate for a given prompt. This is useful when you need multiple different outputs or ideas for a single prompt. The model will generate 'n' distinct responses, each based on the same initial prompt. In a streaming model, this results in all responses being streamed back in real time.
Does nothing if the model does not support this setting.
Example: numberOfGenerations: 3
// The model will produce 3 different responses.
Inherited from
TextGenerationModelSettings.numberOfGenerations
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:55
observers
• Optional observers: FunctionObserver[]
Observers that are called when the model is used in run functions.
Inherited from
TextGenerationModelSettings.observers
Defined in
packages/modelfusion/src/model-function/Model.ts:8
penalizeNl
• Optional penalizeNl: boolean
Penalize newline tokens when applying the repeat penalty (default: true).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:102
penaltyPrompt
• Optional penaltyPrompt: string | number[]
This will replace the prompt for the purpose of the penalty evaluation. Can be either null, a string or an array of numbers representing tokens (default: null = use the original prompt).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:119
presencePenalty
• Optional presencePenalty: number
Repeat alpha presence penalty (default: 0.0, 0.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:107
promptTemplate
• Optional promptTemplate: TextGenerationPromptTemplateProvider<LlamaCppCompletionPrompt>
Prompt template provider that is used when calling .withTextPrompt(), .withInstructionPrompt(), or .withChatPrompt().
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:184
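A sketch of wiring a template provider; llamacpp.prompt.ChatML is assumed here as one of the provider's template presets (pick the one matching your model's chat format):

```ts
import { llamacpp } from "modelfusion";

// Sketch: set the template provider that matches your model's chat
// format, then use the chat-style prompt helper it enables.
const chatModel = llamacpp
  .CompletionTextGenerator({
    promptTemplate: llamacpp.prompt.ChatML, // assumed preset name
  })
  .withChatPrompt();
```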
repeatLastN
• Optional repeatLastN: number
Last n tokens to consider for penalizing repetition (default: 64, 0 = disabled, -1 = ctx-size).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:97
repeatPenalty
• Optional repeatPenalty: number
Control the repetition of token sequences in the generated text (default: 1.1).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:92
seed
• Optional seed: number
Set the random number generator (RNG) seed (default: -1, -1 = random seed).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:148
slotId
• Optional slotId: number
Assign the completion task to a specific slot. If set to -1, the task will be assigned to an idle slot (default: -1).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:179
stopSequences
• Optional stopSequences: string[]
Stop sequences to use. Stop sequences are an array of strings or a single string that the model will recognize as end-of-text indicators. The model stops generating more content when it encounters any of these strings. This is particularly useful in scripted or formatted text generation, where a specific end point is required. Stop sequences are not included in the generated text.
Does nothing if the model does not support this setting.
Example: stopSequences: ['\n', 'END']
Inherited from
TextGenerationModelSettings.stopSequences
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:41
temperature
• Optional temperature: number
Adjust the randomness of the generated text (default: 0.8).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:55
tfsZ
• Optional tfsZ: number
Enable tail free sampling with parameter z (default: 1.0, 1.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:82
topK
• Optional topK: number
Limit the next token selection to the K most probable tokens (default: 40).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:60
topP
• Optional topP: number
Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P (default: 0.95).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:65
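The sampling-related settings on this page (temperature, topK, topP, minP, and the repetition penalties) compose in a single settings object; a sketch with illustrative values:

```ts
import { llamacpp } from "modelfusion";

// Sketch: a moderately conservative sampling configuration combining
// several of the settings documented on this page. Values are
// illustrative, not tuned recommendations.
const sampledModel = llamacpp.CompletionTextGenerator({
  temperature: 0.7,
  topK: 40,
  topP: 0.9,
  minP: 0.05,
  repeatPenalty: 1.1,
});
```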
trimWhitespace
• Optional trimWhitespace: boolean
When true, the leading and trailing white space and line terminator characters are removed from the generated text.
Default: true.
Inherited from
TextGenerationModelSettings.trimWhitespace
Defined in
packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:63
typicalP
• Optional typicalP: number
Enable locally typical sampling with parameter p (default: 1.0, 1.0 = disabled).
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:87