
Interface: LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>

Type parameters

Name                 Type
CONTEXT_WINDOW_SIZE  extends number | undefined

Hierarchy

TextGenerationModelSettings
  ↳ LlamaCppCompletionModelSettings
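
For orientation, here is a minimal sketch of how these settings are typically passed when constructing a Llama.cpp completion model. The llamacpp.CompletionTextGenerator facade and the object-style generateText call follow ModelFusion's documented usage; treat exact signatures as version-dependent:

```ts
import { generateText, llamacpp } from "modelfusion";

// Sketch: all settings below are optional properties of
// LlamaCppCompletionModelSettings. contextWindowSize must match the
// model that the llama.cpp server actually has loaded.
const model = llamacpp
  .CompletionTextGenerator({
    contextWindowSize: 4096,
    temperature: 0.7,
    maxGenerationTokens: 512,
  })
  .withTextPrompt(); // accept plain text prompts

const text = await generateText({
  model,
  prompt: "Write a haiku about llamas.",
});
```
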
Properties

api

Optional api: ApiConfiguration

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:44
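
A sketch of pointing the model at a non-default server via this setting; llamacpp.Api and the baseUrl parts object are assumptions based on ModelFusion's API-configuration pattern, and the host/port values are placeholders:

```ts
import { llamacpp } from "modelfusion";

// Reach a llama.cpp server running somewhere other than localhost:8080.
const model = llamacpp.CompletionTextGenerator({
  api: llamacpp.Api({
    baseUrl: { host: "192.168.1.10", port: "8080" },
  }),
});
```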


cachePrompt

Optional cachePrompt: boolean

Save the prompt and generation state to avoid reprocessing the entire prompt if a part of it hasn't changed (default: false).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:173


contextWindowSize

Optional contextWindowSize: CONTEXT_WINDOW_SIZE

Specify the context window size of the model that you have loaded in your Llama.cpp server.

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:50


frequencyPenalty

Optional frequencyPenalty: number

Repeat alpha frequency penalty (default: 0.0, 0.0 = disabled).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:112


grammar

Optional grammar: string

Set a grammar for grammar-based sampling (default: no grammar).

See

https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:142
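
As an illustration, a tiny GBNF grammar (the format described in the llama.cpp grammars README linked above) that constrains the completion to a yes/no answer could be supplied like this:

```ts
// GBNF: the root rule only admits the literal strings "yes" or "no".
const yesNoGrammar = `root ::= ("yes" | "no")`;

const model = llamacpp.CompletionTextGenerator({
  grammar: yesNoGrammar,
});
```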


ignoreEos

Optional ignoreEos: boolean

Ignore end of stream token and continue generating (default: false).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:153


logitBias

Optional logitBias: [number, number | false][]

Modify the likelihood of a token appearing in the generated text completion. For example, use "logit_bias": [[15043,1.0]] to increase the likelihood of the token 'Hello', or "logit_bias": [[15043,-1.0]] to decrease it. Setting the value to false, as in "logit_bias": [[15043,false]], ensures that the token 'Hello' is never produced (default: []).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:162
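
A sketch of the tuple format; token ID 15043 ('Hello') comes from the example above and is specific to the Llama tokenizer, while 12345 is a hypothetical ID:

```ts
const model = llamacpp.CompletionTextGenerator({
  logitBias: [
    [15043, 1.0],   // make token 15043 ('Hello') more likely
    [12345, false], // hypothetical token that must never be produced
  ],
});
```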


maxGenerationTokens

Optional maxGenerationTokens: number

Specifies the maximum number of tokens (words, punctuation, parts of words) that the model can generate in a single response. It helps to control the length of the output.

Does nothing if the model does not support this setting.

Example: maxGenerationTokens: 1000

Inherited from

TextGenerationModelSettings.maxGenerationTokens

Defined in

packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:28


minP

Optional minP: number

The minimum probability for a token to be considered, relative to the probability of the most likely token (default: 0.05).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:70


mirostat

Optional mirostat: number

Enable Mirostat sampling, controlling perplexity during text generation (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:125


mirostatEta

Optional mirostatEta: number

Set the Mirostat learning rate, parameter eta (default: 0.1).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:135


mirostatTau

Optional mirostatTau: number

Set the Mirostat target entropy, parameter tau (default: 5.0).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:130
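
The mirostat, mirostatTau, and mirostatEta settings act together; a sketch enabling Mirostat 2.0 with the documented defaults (note that llama.cpp sidesteps top-k/top-p style truncation while Mirostat is active):

```ts
const model = llamacpp.CompletionTextGenerator({
  mirostat: 2,      // 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
  mirostatTau: 5.0, // target entropy; lower values give more focused text
  mirostatEta: 0.1, // learning rate for the perplexity feedback loop
});
```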


nKeep

Optional nKeep: number

Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded. By default, this value is set to 0 (meaning no tokens are kept). Use -1 to retain all tokens from the prompt.

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:77


nProbs

Optional nProbs: number

If greater than 0, the response also contains the probabilities of the top N tokens for each generated token (default: 0).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:168


numberOfGenerations

Optional numberOfGenerations: number

Number of texts to generate.

Specifies the number of responses or completions the model should generate for a given prompt. This is useful when you need multiple different outputs or ideas for a single prompt. The model will generate 'n' distinct responses, each based on the same initial prompt. When streaming, all responses are streamed back in real time.

Does nothing if the model does not support this setting.

Example: numberOfGenerations: 3 // The model will produce 3 different responses.

Inherited from

TextGenerationModelSettings.numberOfGenerations

Defined in

packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:55


observers

Optional observers: FunctionObserver[]

Observers that are called when the model is used in run functions.

Inherited from

TextGenerationModelSettings.observers

Defined in

packages/modelfusion/src/model-function/Model.ts:8


penalizeNl

Optional penalizeNl: boolean

Penalize newline tokens when applying the repeat penalty (default: true).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:102


penaltyPrompt

Optional penaltyPrompt: string | number[]

This will replace the prompt for the purpose of the penalty evaluation. It can be either null, a string, or an array of numbers representing tokens (default: null = use the original prompt).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:119


presencePenalty

Optional presencePenalty: number

Repeat alpha presence penalty (default: 0.0, 0.0 = disabled).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:107


promptTemplate

Optional promptTemplate: TextGenerationPromptTemplateProvider<LlamaCppCompletionPrompt>

Prompt template provider that is used when calling .withTextPrompt(), .withInstructionPrompt(), or .withChatPrompt().

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:184
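
A sketch of overriding the template provider, assuming your ModelFusion version ships Llama.cpp prompt templates under llamacpp.prompt (the Llama2 name is such an assumption):

```ts
const model = llamacpp
  .CompletionTextGenerator({
    promptTemplate: llamacpp.prompt.Llama2,
  })
  // .withInstructionPrompt() now formats prompts using the Llama 2 template:
  .withInstructionPrompt();
```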


repeatLastN

Optional repeatLastN: number

Last n tokens to consider for penalizing repetition (default: 64, 0 = disabled, -1 = ctx-size).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:97


repeatPenalty

Optional repeatPenalty: number

Control the repetition of token sequences in the generated text (default: 1.1).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:92
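
repeatPenalty and repeatLastN work as a pair, and are often tuned together with the frequencyPenalty and presencePenalty settings above; a sketch:

```ts
const model = llamacpp.CompletionTextGenerator({
  repeatPenalty: 1.2,    // values above 1.0 discourage repeated sequences
  repeatLastN: 128,      // window of recent tokens the penalty considers
  frequencyPenalty: 0.1, // repeat alpha frequency penalty (0.0 = disabled)
  presencePenalty: 0.1,  // repeat alpha presence penalty (0.0 = disabled)
});
```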


seed

Optional seed: number

Set the random number generator (RNG) seed (default: -1, -1 = random seed).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:148


slotId

Optional slotId: number

Assign the completion task to a specific slot. If set to -1, the task will be assigned to an idle slot (default: -1).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:179


stopSequences

Optional stopSequences: string[]

Stop sequences to use. Stop sequences are an array of strings or a single string that the model will recognize as end-of-text indicators. The model stops generating more content when it encounters any of these strings. This is particularly useful in scripted or formatted text generation, where a specific end point is required. Stop sequences are not included in the generated text.

Does nothing if the model does not support this setting.

Example: stopSequences: ['\n', 'END']

Inherited from

TextGenerationModelSettings.stopSequences

Defined in

packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:41


temperature

Optional temperature: number

Adjust the randomness of the generated text (default: 0.8).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:55


tfsZ

Optional tfsZ: number

Enable tail free sampling with parameter z (default: 1.0, 1.0 = disabled).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:82


topK

Optional topK: number

Limit the next token selection to the K most probable tokens (default: 40).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:60


topP

Optional topP: number

Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P (default: 0.95).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:65
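
The truncation samplers (topK, topP, minP, tfsZ, typicalP) and temperature interact, so they are usually adjusted as a group; a sketch starting from the documented defaults:

```ts
const model = llamacpp.CompletionTextGenerator({
  temperature: 0.8, // higher values increase randomness
  topK: 40,         // keep only the 40 most probable tokens...
  topP: 0.95,       // ...within a 0.95 cumulative probability mass
  minP: 0.05,       // drop tokens below 5% of the top token's probability
});
```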


trimWhitespace

Optional trimWhitespace: boolean

When true, the leading and trailing white space and line terminator characters are removed from the generated text.

Default: true.

Inherited from

TextGenerationModelSettings.trimWhitespace

Defined in

packages/modelfusion/src/model-function/generate-text/TextGenerationModel.ts:63


typicalP

Optional typicalP: number

Enable locally typical sampling with parameter p (default: 1.0, 1.0 = disabled).

Defined in

packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:87