Class: LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>
Type parameters
Name | Type |
---|---|
CONTEXT_WINDOW_SIZE | extends number \| undefined |
Hierarchy
- AbstractModel<LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>>
  ↳ LlamaCppCompletionModel
Implements
TextStreamingBaseModel<LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>>
Accessors
contextWindowSize
• get contextWindowSize(): CONTEXT_WINDOW_SIZE
Returns
CONTEXT_WINDOW_SIZE
Implementation of
TextStreamingBaseModel.contextWindowSize
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:221
modelInformation
• get modelInformation(): ModelInformation
Returns
ModelInformation
Implementation of
TextStreamingBaseModel.modelInformation
Inherited from
AbstractModel.modelInformation
Defined in
packages/modelfusion/src/model-function/AbstractModel.ts:17
modelName
• get modelName(): null
Returns
null
Overrides
AbstractModel.modelName
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:217
settingsForEvent
• get settingsForEvent(): Partial<LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>>
Returns settings that should be recorded in observability events. Security-related settings (e.g. API keys) should not be included here.
Returns
Partial<LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>>
Implementation of
TextStreamingBaseModel.settingsForEvent
Overrides
AbstractModel.settingsForEvent
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:293
Constructors
constructor
• new LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>(settings?): LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>
Type parameters
Name | Type |
---|---|
CONTEXT_WINDOW_SIZE | extends undefined \| number |
Parameters
Name | Type |
---|---|
settings | LlamaCppCompletionModelSettings <CONTEXT_WINDOW_SIZE > |
Returns
LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>
Overrides
AbstractModel<LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>>.constructor
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:209
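A minimal construction sketch. The import path and the specific settings shown (contextWindowSize, temperature, maxGenerationTokens) are assumptions about LlamaCppCompletionModelSettings; adjust them to your setup and to what your llama.cpp server supports:

```ts
import { LlamaCppCompletionModel } from "modelfusion";

// Sketch: contextWindowSize fixes the CONTEXT_WINDOW_SIZE type parameter;
// the remaining settings are assumed to be valid LlamaCppCompletionModelSettings.
const model = new LlamaCppCompletionModel({
  contextWindowSize: 4096,
  temperature: 0.7,
  maxGenerationTokens: 512,
});
```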
Methods
asObjectGenerationModel
▸ asObjectGenerationModel<INPUT_PROMPT, LlamaCppPrompt>(promptTemplate): ObjectFromTextStreamingModel<INPUT_PROMPT, unknown, TextStreamingModel<unknown, TextGenerationModelSettings>> | ObjectFromTextStreamingModel<INPUT_PROMPT, LlamaCppPrompt, TextStreamingModel<LlamaCppPrompt, TextGenerationModelSettings>>
Type parameters
Name |
---|
INPUT_PROMPT |
LlamaCppPrompt |
Parameters
Name | Type |
---|---|
promptTemplate | ObjectFromTextPromptTemplate <INPUT_PROMPT , LlamaCppPrompt > | FlexibleObjectFromTextPromptTemplate <INPUT_PROMPT , unknown > |
Returns
ObjectFromTextStreamingModel<INPUT_PROMPT, unknown, TextStreamingModel<unknown, TextGenerationModelSettings>> | ObjectFromTextStreamingModel<INPUT_PROMPT, LlamaCppPrompt, TextStreamingModel<LlamaCppPrompt, TextGenerationModelSettings>>
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:390
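A hypothetical object-generation sketch. The generateObject, jsonObjectPrompt, and zodSchema helpers (and the zod import) are assumptions about the surrounding ModelFusion API; substitute the ObjectFromTextPromptTemplate you actually use:

```ts
import {
  generateObject,
  jsonObjectPrompt,
  zodSchema,
  LlamaCppCompletionModel,
} from "modelfusion";
import { z } from "zod";

// Sketch: wrap the completion model so it can generate structured objects.
// jsonObjectPrompt.instruction() is assumed to be a FlexibleObjectFromTextPromptTemplate.
const objectModel = new LlamaCppCompletionModel({}).asObjectGenerationModel(
  jsonObjectPrompt.instruction()
);

const character = await generateObject({
  model: objectModel,
  schema: zodSchema(z.object({ name: z.string(), age: z.number() })),
  prompt: {
    system: "You generate JSON for game characters.",
    instruction: "Invent a character.",
  },
});
```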
callAPI
▸ callAPI<RESPONSE>(prompt, callOptions, options): Promise<RESPONSE>
Type parameters
Name |
---|
RESPONSE |
Parameters
Name | Type |
---|---|
prompt | LlamaCppCompletionPrompt |
callOptions | FunctionCallOptions |
options | Object |
options.responseFormat | LlamaCppCompletionResponseFormatType <RESPONSE > |
Returns
Promise<RESPONSE>
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:227
countPromptTokens
▸ countPromptTokens(prompt): Promise<number>
Parameters
Name | Type |
---|---|
prompt | LlamaCppCompletionPrompt |
Returns
Promise<number>
Implementation of
TextStreamingBaseModel.countPromptTokens
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:332
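A small usage sketch. The { text } prompt shape is an assumption about LlamaCppCompletionPrompt:

```ts
import { LlamaCppCompletionModel } from "modelfusion";

const model = new LlamaCppCompletionModel({});

// Sketch: tokenize the prompt with the model's llama.cpp tokenizer and count the tokens.
const tokenCount = await model.countPromptTokens({
  text: "Write a short story about a robot learning to cook.",
});
console.log(tokenCount);
```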
doGenerateTexts
▸ doGenerateTexts(prompt, options): Promise<{ rawResponse: { content: string; generation_settings: { frequency_penalty: number; ignore_eos: boolean; logit_bias: number[]; mirostat: number; mirostat_eta: number; mirostat_tau: number; model: string; n_ctx: number; n_keep: number; n_predict: number; n_probs: number; penalize_nl: boolean; presence_penalty: number; repeat_last_n: number; repeat_penalty: number; seed: number; stop: string[]; stream: boolean; temperature?: number; tfs_z: number; top_k: number; top_p: number; typical_p: number }; model: string; prompt: string; stop: true; stopped_eos: boolean; stopped_limit: boolean; stopped_word: boolean; stopping_word: string; timings: { predicted_ms: number; predicted_n: number; predicted_per_second: null | number; predicted_per_token_ms: null | number; prompt_ms?: null | number; prompt_n: number; prompt_per_second: null | number; prompt_per_token_ms: null | number }; tokens_cached: number; tokens_evaluated: number; tokens_predicted: number; truncated: boolean }; textGenerationResults: { finishReason: "length" | "stop" | "unknown"; text: string = rawResponse.content }[]; usage: { completionTokens: number = rawResponse.tokens_predicted; promptTokens: number = rawResponse.tokens_evaluated; totalTokens: number } }>
Parameters
Name | Type |
---|---|
prompt | LlamaCppCompletionPrompt |
options | FunctionCallOptions |
Returns
Promise<{ rawResponse: { content: string; generation_settings: { frequency_penalty: number; ignore_eos: boolean; logit_bias: number[]; mirostat: number; mirostat_eta: number; mirostat_tau: number; model: string; n_ctx: number; n_keep: number; n_predict: number; n_probs: number; penalize_nl: boolean; presence_penalty: number; repeat_last_n: number; repeat_penalty: number; seed: number; stop: string[]; stream: boolean; temperature?: number; tfs_z: number; top_k: number; top_p: number; typical_p: number }; model: string; prompt: string; stop: true; stopped_eos: boolean; stopped_limit: boolean; stopped_word: boolean; stopping_word: string; timings: { predicted_ms: number; predicted_n: number; predicted_per_second: null | number; predicted_per_token_ms: null | number; prompt_ms?: null | number; prompt_n: number; prompt_per_second: null | number; prompt_per_token_ms: null | number }; tokens_cached: number; tokens_evaluated: number; tokens_predicted: number; truncated: boolean }; textGenerationResults: { finishReason: "length" | "stop" | "unknown"; text: string = rawResponse.content }[]; usage: { completionTokens: number = rawResponse.tokens_predicted; promptTokens: number = rawResponse.tokens_evaluated; totalTokens: number } }>
Implementation of
TextStreamingBaseModel.doGenerateTexts
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:337
doStreamText
▸ doStreamText(prompt, options): Promise<AsyncIterable<Delta<{ content: string; generation_settings: { frequency_penalty: number; ignore_eos: boolean; logit_bias: number[]; mirostat: number; mirostat_eta: number; mirostat_tau: number; model: string; n_ctx: number; n_keep: number; n_predict: number; n_probs: number; penalize_nl: boolean; presence_penalty: number; repeat_last_n: number; repeat_penalty: number; seed: number; stop: string[]; stream: boolean; temperature?: number; tfs_z: number; top_k: number; top_p: number; typical_p: number }; model: string; prompt: string; stop: true; stopped_eos: boolean; stopped_limit: boolean; stopped_word: boolean; stopping_word: string; timings: { predicted_ms: number; predicted_n: number; predicted_per_second: null | number; predicted_per_token_ms: null | number; prompt_ms?: null | number; prompt_n: number; prompt_per_second: null | number; prompt_per_token_ms: null | number }; tokens_cached: number; tokens_evaluated: number; tokens_predicted: number; truncated: boolean } | { content: string; stop: false }>>>
Parameters
Name | Type |
---|---|
prompt | LlamaCppCompletionPrompt |
options | FunctionCallOptions |
Returns
Promise<AsyncIterable<Delta<{ content: string; generation_settings: { frequency_penalty: number; ignore_eos: boolean; logit_bias: number[]; mirostat: number; mirostat_eta: number; mirostat_tau: number; model: string; n_ctx: number; n_keep: number; n_predict: number; n_probs: number; penalize_nl: boolean; presence_penalty: number; repeat_last_n: number; repeat_penalty: number; seed: number; stop: string[]; stream: boolean; temperature?: number; tfs_z: number; top_k: number; top_p: number; typical_p: number }; model: string; prompt: string; stop: true; stopped_eos: boolean; stopped_limit: boolean; stopped_word: boolean; stopping_word: string; timings: { predicted_ms: number; predicted_n: number; predicted_per_second: null | number; predicted_per_token_ms: null | number; prompt_ms?: null | number; prompt_n: number; prompt_per_second: null | number; prompt_per_token_ms: null | number }; tokens_cached: number; tokens_evaluated: number; tokens_predicted: number; truncated: boolean } | { content: string; stop: false }>>>
Implementation of
TextStreamingBaseModel.doStreamText
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:380
extractTextDelta
▸ extractTextDelta(delta): string
Parameters
Name | Type |
---|---|
delta | unknown |
Returns
string
Implementation of
TextStreamingBaseModel.extractTextDelta
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:386
processTextGenerationResponse
▸ processTextGenerationResponse(rawResponse): Object
Parameters
Name | Type |
---|---|
rawResponse | Object |
rawResponse.content | string |
rawResponse.generation_settings | Object |
rawResponse.generation_settings.frequency_penalty | number |
rawResponse.generation_settings.ignore_eos | boolean |
rawResponse.generation_settings.logit_bias | number [] |
rawResponse.generation_settings.mirostat | number |
rawResponse.generation_settings.mirostat_eta | number |
rawResponse.generation_settings.mirostat_tau | number |
rawResponse.generation_settings.model | string |
rawResponse.generation_settings.n_ctx | number |
rawResponse.generation_settings.n_keep | number |
rawResponse.generation_settings.n_predict | number |
rawResponse.generation_settings.n_probs | number |
rawResponse.generation_settings.penalize_nl | boolean |
rawResponse.generation_settings.presence_penalty | number |
rawResponse.generation_settings.repeat_last_n | number |
rawResponse.generation_settings.repeat_penalty | number |
rawResponse.generation_settings.seed | number |
rawResponse.generation_settings.stop | string [] |
rawResponse.generation_settings.stream | boolean |
rawResponse.generation_settings.temperature? | number |
rawResponse.generation_settings.tfs_z | number |
rawResponse.generation_settings.top_k | number |
rawResponse.generation_settings.top_p | number |
rawResponse.generation_settings.typical_p | number |
rawResponse.model | string |
rawResponse.prompt | string |
rawResponse.stop | true |
rawResponse.stopped_eos | boolean |
rawResponse.stopped_limit | boolean |
rawResponse.stopped_word | boolean |
rawResponse.stopping_word | string |
rawResponse.timings | Object |
rawResponse.timings.predicted_ms | number |
rawResponse.timings.predicted_n | number |
rawResponse.timings.predicted_per_second | null | number |
rawResponse.timings.predicted_per_token_ms | null | number |
rawResponse.timings.prompt_ms? | null | number |
rawResponse.timings.prompt_n | number |
rawResponse.timings.prompt_per_second | null | number |
rawResponse.timings.prompt_per_token_ms | null | number |
rawResponse.tokens_cached | number |
rawResponse.tokens_evaluated | number |
rawResponse.tokens_predicted | number |
rawResponse.truncated | boolean |
Returns
Object
Name | Type |
---|---|
rawResponse | { content : string ; generation_settings : { frequency_penalty : number ; ignore_eos : boolean ; logit_bias : number [] ; mirostat : number ; mirostat_eta : number ; mirostat_tau : number ; model : string ; n_ctx : number ; n_keep : number ; n_predict : number ; n_probs : number ; penalize_nl : boolean ; presence_penalty : number ; repeat_last_n : number ; repeat_penalty : number ; seed : number ; stop : string [] ; stream : boolean ; temperature? : number ; tfs_z : number ; top_k : number ; top_p : number ; typical_p : number } ; model : string ; prompt : string ; stop : true ; stopped_eos : boolean ; stopped_limit : boolean ; stopped_word : boolean ; stopping_word : string ; timings : { predicted_ms : number ; predicted_n : number ; predicted_per_second : null | number ; predicted_per_token_ms : null | number ; prompt_ms? : null | number ; prompt_n : number ; prompt_per_second : null | number ; prompt_per_token_ms : null | number } ; tokens_cached : number ; tokens_evaluated : number ; tokens_predicted : number ; truncated : boolean } |
rawResponse.content | string |
rawResponse.generation_settings | { frequency_penalty : number ; ignore_eos : boolean ; logit_bias : number [] ; mirostat : number ; mirostat_eta : number ; mirostat_tau : number ; model : string ; n_ctx : number ; n_keep : number ; n_predict : number ; n_probs : number ; penalize_nl : boolean ; presence_penalty : number ; repeat_last_n : number ; repeat_penalty : number ; seed : number ; stop : string [] ; stream : boolean ; temperature? : number ; tfs_z : number ; top_k : number ; top_p : number ; typical_p : number } |
rawResponse.generation_settings.frequency_penalty | number |
rawResponse.generation_settings.ignore_eos | boolean |
rawResponse.generation_settings.logit_bias | number [] |
rawResponse.generation_settings.mirostat | number |
rawResponse.generation_settings.mirostat_eta | number |
rawResponse.generation_settings.mirostat_tau | number |
rawResponse.generation_settings.model | string |
rawResponse.generation_settings.n_ctx | number |
rawResponse.generation_settings.n_keep | number |
rawResponse.generation_settings.n_predict | number |
rawResponse.generation_settings.n_probs | number |
rawResponse.generation_settings.penalize_nl | boolean |
rawResponse.generation_settings.presence_penalty | number |
rawResponse.generation_settings.repeat_last_n | number |
rawResponse.generation_settings.repeat_penalty | number |
rawResponse.generation_settings.seed | number |
rawResponse.generation_settings.stop | string [] |
rawResponse.generation_settings.stream | boolean |
rawResponse.generation_settings.temperature? | number |
rawResponse.generation_settings.tfs_z | number |
rawResponse.generation_settings.top_k | number |
rawResponse.generation_settings.top_p | number |
rawResponse.generation_settings.typical_p | number |
rawResponse.model | string |
rawResponse.prompt | string |
rawResponse.stop | true |
rawResponse.stopped_eos | boolean |
rawResponse.stopped_limit | boolean |
rawResponse.stopped_word | boolean |
rawResponse.stopping_word | string |
rawResponse.timings | { predicted_ms : number ; predicted_n : number ; predicted_per_second : null | number ; predicted_per_token_ms : null | number ; prompt_ms? : null | number ; prompt_n : number ; prompt_per_second : null | number ; prompt_per_token_ms : null | number } |
rawResponse.timings.predicted_ms | number |
rawResponse.timings.predicted_n | number |
rawResponse.timings.predicted_per_second | null | number |
rawResponse.timings.predicted_per_token_ms | null | number |
rawResponse.timings.prompt_ms? | null | number |
rawResponse.timings.prompt_n | number |
rawResponse.timings.prompt_per_second | null | number |
rawResponse.timings.prompt_per_token_ms | null | number |
rawResponse.tokens_cached | number |
rawResponse.tokens_evaluated | number |
rawResponse.tokens_predicted | number |
rawResponse.truncated | boolean |
textGenerationResults | { finishReason : "length" | "stop" | "unknown" ; text : string = rawResponse.content }[] |
usage | { completionTokens : number = rawResponse.tokens_predicted; promptTokens : number = rawResponse.tokens_evaluated; totalTokens : number } |
usage.completionTokens | number |
usage.promptTokens | number |
usage.totalTokens | number |
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:357
restoreGeneratedTexts
▸ restoreGeneratedTexts(rawResponse): Object
Parameters
Name | Type |
---|---|
rawResponse | unknown |
Returns
Object
Name | Type |
---|---|
rawResponse | { content : string ; generation_settings : { frequency_penalty : number ; ignore_eos : boolean ; logit_bias : number [] ; mirostat : number ; mirostat_eta : number ; mirostat_tau : number ; model : string ; n_ctx : number ; n_keep : number ; n_predict : number ; n_probs : number ; penalize_nl : boolean ; presence_penalty : number ; repeat_last_n : number ; repeat_penalty : number ; seed : number ; stop : string [] ; stream : boolean ; temperature? : number ; tfs_z : number ; top_k : number ; top_p : number ; typical_p : number } ; model : string ; prompt : string ; stop : true ; stopped_eos : boolean ; stopped_limit : boolean ; stopped_word : boolean ; stopping_word : string ; timings : { predicted_ms : number ; predicted_n : number ; predicted_per_second : null | number ; predicted_per_token_ms : null | number ; prompt_ms? : null | number ; prompt_n : number ; prompt_per_second : null | number ; prompt_per_token_ms : null | number } ; tokens_cached : number ; tokens_evaluated : number ; tokens_predicted : number ; truncated : boolean } |
rawResponse.content | string |
rawResponse.generation_settings | { frequency_penalty : number ; ignore_eos : boolean ; logit_bias : number [] ; mirostat : number ; mirostat_eta : number ; mirostat_tau : number ; model : string ; n_ctx : number ; n_keep : number ; n_predict : number ; n_probs : number ; penalize_nl : boolean ; presence_penalty : number ; repeat_last_n : number ; repeat_penalty : number ; seed : number ; stop : string [] ; stream : boolean ; temperature? : number ; tfs_z : number ; top_k : number ; top_p : number ; typical_p : number } |
rawResponse.generation_settings.frequency_penalty | number |
rawResponse.generation_settings.ignore_eos | boolean |
rawResponse.generation_settings.logit_bias | number [] |
rawResponse.generation_settings.mirostat | number |
rawResponse.generation_settings.mirostat_eta | number |
rawResponse.generation_settings.mirostat_tau | number |
rawResponse.generation_settings.model | string |
rawResponse.generation_settings.n_ctx | number |
rawResponse.generation_settings.n_keep | number |
rawResponse.generation_settings.n_predict | number |
rawResponse.generation_settings.n_probs | number |
rawResponse.generation_settings.penalize_nl | boolean |
rawResponse.generation_settings.presence_penalty | number |
rawResponse.generation_settings.repeat_last_n | number |
rawResponse.generation_settings.repeat_penalty | number |
rawResponse.generation_settings.seed | number |
rawResponse.generation_settings.stop | string [] |
rawResponse.generation_settings.stream | boolean |
rawResponse.generation_settings.temperature? | number |
rawResponse.generation_settings.tfs_z | number |
rawResponse.generation_settings.top_k | number |
rawResponse.generation_settings.top_p | number |
rawResponse.generation_settings.typical_p | number |
rawResponse.model | string |
rawResponse.prompt | string |
rawResponse.stop | true |
rawResponse.stopped_eos | boolean |
rawResponse.stopped_limit | boolean |
rawResponse.stopped_word | boolean |
rawResponse.stopping_word | string |
rawResponse.timings | { predicted_ms : number ; predicted_n : number ; predicted_per_second : null | number ; predicted_per_token_ms : null | number ; prompt_ms? : null | number ; prompt_n : number ; prompt_per_second : null | number ; prompt_per_token_ms : null | number } |
rawResponse.timings.predicted_ms | number |
rawResponse.timings.predicted_n | number |
rawResponse.timings.predicted_per_second | null | number |
rawResponse.timings.predicted_per_token_ms | null | number |
rawResponse.timings.prompt_ms? | null | number |
rawResponse.timings.prompt_n | number |
rawResponse.timings.prompt_per_second | null | number |
rawResponse.timings.prompt_per_token_ms | null | number |
rawResponse.tokens_cached | number |
rawResponse.tokens_evaluated | number |
rawResponse.tokens_predicted | number |
rawResponse.truncated | boolean |
textGenerationResults | { finishReason : "length" | "stop" | "unknown" ; text : string = rawResponse.content }[] |
usage | { completionTokens : number = rawResponse.tokens_predicted; promptTokens : number = rawResponse.tokens_evaluated; totalTokens : number } |
usage.completionTokens | number |
usage.promptTokens | number |
usage.totalTokens | number |
Implementation of
TextStreamingBaseModel.restoreGeneratedTexts
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:348
withChatPrompt
▸ withChatPrompt(): PromptTemplateTextStreamingModel<ChatPrompt, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Returns this model with a chat prompt template.
Returns
PromptTemplateTextStreamingModel<ChatPrompt, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Implementation of
TextStreamingBaseModel.withChatPrompt
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:441
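A streaming usage sketch. The streamText function and the { system, messages } ChatPrompt shape are assumptions about the surrounding ModelFusion API:

```ts
import { streamText, LlamaCppCompletionModel } from "modelfusion";

const model = new LlamaCppCompletionModel({});

// Sketch: stream a completion for a multi-turn chat prompt.
const textStream = await streamText({
  model: model.withChatPrompt(),
  prompt: {
    system: "You are a helpful assistant.",
    messages: [{ role: "user", content: "Explain token streaming in one sentence." }],
  },
});

for await (const textPart of textStream) {
  process.stdout.write(textPart);
}
```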
withInstructionPrompt
▸ withInstructionPrompt(): PromptTemplateTextStreamingModel<InstructionPrompt, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Returns this model with an instruction prompt template.
Returns
PromptTemplateTextStreamingModel<InstructionPrompt, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Implementation of
TextStreamingBaseModel.withInstructionPrompt
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:432
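A usage sketch. The generateText function and the { system, instruction } InstructionPrompt shape are assumptions about the surrounding ModelFusion API:

```ts
import { generateText, LlamaCppCompletionModel } from "modelfusion";

const model = new LlamaCppCompletionModel({});

// Sketch: run a single-turn instruction prompt through the model.
const text = await generateText({
  model: model.withInstructionPrompt(),
  prompt: {
    system: "You are a terse technical writer.",
    instruction: "Explain what a context window is in two sentences.",
  },
});
```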
withJsonOutput
▸ withJsonOutput(schema): this
When possible, limit the output generation to the specified JSON schema, or supersets of it (e.g. JSON in general).
Parameters
Name | Type |
---|---|
schema | Schema <unknown > & JsonSchemaProducer |
Returns
this
Implementation of
TextStreamingBaseModel.withJsonOutput
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:406
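A sketch of supplying a JSON schema. zodSchema and the zod import are assumptions about how a Schema with a JsonSchemaProducer is typically created in ModelFusion:

```ts
import { zodSchema, LlamaCppCompletionModel } from "modelfusion";
import { z } from "zod";

// Sketch: ask the model to constrain its output to this JSON schema (or a superset of it).
const jsonModel = new LlamaCppCompletionModel({}).withJsonOutput(
  zodSchema(
    z.object({
      title: z.string(),
      tags: z.array(z.string()),
    })
  )
);
```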
withPromptTemplate
▸ withPromptTemplate<INPUT_PROMPT>(promptTemplate): PromptTemplateTextStreamingModel<INPUT_PROMPT, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Maps the prompt for the full Llama.cpp prompt template (incl. image support).
Type parameters
Name |
---|
INPUT_PROMPT |
Parameters
Name | Type |
---|---|
promptTemplate | TextGenerationPromptTemplate <INPUT_PROMPT , LlamaCppCompletionPrompt > |
Returns
PromptTemplateTextStreamingModel<INPUT_PROMPT, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Implementation of
TextStreamingBaseModel.withPromptTemplate
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:453
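A sketch of a custom prompt template. The TextGenerationPromptTemplate shape ({ format, stopSequences }) and the type imports are assumptions; the instruction markup below is purely illustrative:

```ts
import {
  LlamaCppCompletionModel,
  type LlamaCppCompletionPrompt,
  type TextGenerationPromptTemplate,
} from "modelfusion";

// Sketch: map a plain instruction string onto the raw llama.cpp completion prompt.
const instructionTemplate: TextGenerationPromptTemplate<
  string,
  LlamaCppCompletionPrompt
> = {
  format: (instruction) => ({
    text: `### Instruction:\n${instruction}\n\n### Response:\n`,
  }),
  stopSequences: ["### Instruction:"],
};

const promptedModel = new LlamaCppCompletionModel({}).withPromptTemplate(
  instructionTemplate
);
```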
withSettings
▸ withSettings(additionalSettings): LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>
The withSettings method creates a new model with the same configuration as the original model, but with the specified settings changed.
Parameters
Name | Type |
---|---|
additionalSettings | Partial <LlamaCppCompletionModelSettings <CONTEXT_WINDOW_SIZE >> |
Returns
LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>
Example

```ts
const model = new OpenAICompletionModel({
  model: "gpt-3.5-turbo-instruct",
  maxGenerationTokens: 500,
});
const modelWithMoreTokens = model.withSettings({
  maxGenerationTokens: 1000,
});
```
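The same pattern applied to this class (a sketch; maxGenerationTokens is assumed to be a valid setting here as well):

```ts
// Sketch: derive a variant of an existing LlamaCppCompletionModel
// with only maxGenerationTokens changed.
const llamaModel = new LlamaCppCompletionModel({ maxGenerationTokens: 500 });
const llamaModelWithMoreTokens = llamaModel.withSettings({
  maxGenerationTokens: 1000,
});
```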
Implementation of
TextStreamingBaseModel.withSettings
Overrides
AbstractModel.withSettings
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:475
withTextPrompt
▸ withTextPrompt(): PromptTemplateTextStreamingModel<string, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Returns this model with a text prompt template.
Returns
PromptTemplateTextStreamingModel<string, LlamaCppCompletionPrompt, LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>, LlamaCppCompletionModel<CONTEXT_WINDOW_SIZE>>
Implementation of
TextStreamingBaseModel.withTextPrompt
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:423
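A usage sketch. The generateText function is an assumption about the surrounding ModelFusion API; the string is passed through as the raw completion prompt:

```ts
import { generateText, LlamaCppCompletionModel } from "modelfusion";

const model = new LlamaCppCompletionModel({});

// Sketch: with a text prompt template, the prompt is a plain string.
const completion = await generateText({
  model: model.withTextPrompt(),
  prompt: "Once upon a time, in a datacenter far away,",
});
```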
Properties
provider
• Readonly provider: "llamacpp"
Overrides
AbstractModel.provider
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:216
settings
• Readonly settings: LlamaCppCompletionModelSettings<CONTEXT_WINDOW_SIZE>
Implementation of
TextStreamingBaseModel.settings
Inherited from
AbstractModel.settings
Defined in
packages/modelfusion/src/model-function/AbstractModel.ts:7
tokenizer
• Readonly tokenizer: LlamaCppTokenizer
Implementation of
TextStreamingBaseModel.tokenizer
Defined in
packages/modelfusion/src/model-provider/llamacpp/LlamaCppCompletionModel.ts:225