Class: CohereTokenizer
Tokenizer for the Cohere models. It uses the Co.Tokenize and Co.Detokenize APIs.
Example
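// Assumes both names are exported from the package root.
import { CohereTokenizer, countTokens } from "modelfusion";

const tokenizer = new CohereTokenizer({ model: "command" });
const text = "At first, Nox didn't know what to do with the pup.";

// Count tokens, get token IDs, get IDs with their text pieces, then reverse the IDs to text.
const tokenCount = await countTokens(tokenizer, text);
const tokens = await tokenizer.tokenize(text);
const tokensAndTokenTexts = await tokenizer.tokenizeWithTexts(text);
const reconstructedText = await tokenizer.detokenize(tokens);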
Implements
FullTokenizer
Constructors
constructor
• new CohereTokenizer(settings): CohereTokenizer
Parameters
Name | Type |
---|---|
settings | CohereTokenizerSettings |
Returns
CohereTokenizer
Defined in
packages/modelfusion/src/model-provider/cohere/CohereTokenizer.ts:44
Methods
callDeTokenizeAPI
▸ callDeTokenizeAPI(tokens, callOptions?): Promise<{ meta: { api_version: { version: string } } ; text: string }>
Parameters
Name | Type |
---|---|
tokens | number[] |
callOptions? | FunctionCallOptions |
Returns
Promise<{ meta: { api_version: { version: string } } ; text: string }>
Defined in
packages/modelfusion/src/model-provider/cohere/CohereTokenizer.ts:80
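The return type above can be written down as a TypeScript type and checked at runtime. The following is a minimal sketch, not part of ModelFusion: the names `DetokenizeResponse` and `isDetokenizeResponse` are hypothetical and introduced only for illustration. It verifies that an unknown value has the documented fields before `text` is read.

```typescript
// Shape of the resolved value, copied from the signature above.
type DetokenizeResponse = {
  meta: { api_version: { version: string } };
  text: string;
};

// Hypothetical helper (not a ModelFusion API): narrows an unknown value
// to DetokenizeResponse by checking each documented field.
function isDetokenizeResponse(value: unknown): value is DetokenizeResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.text !== "string") return false;

  const meta = v.meta as Record<string, unknown> | null | undefined;
  if (typeof meta !== "object" || meta === null) return false;

  const apiVersion = meta.api_version as Record<string, unknown> | null | undefined;
  if (typeof apiVersion !== "object" || apiVersion === null) return false;

  return typeof apiVersion.version === "string";
}
```

A check like this is mainly useful when the payload comes from an untyped source such as a cached JSON file; the values in any test data are made up.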
callTokenizeAPI
▸ callTokenizeAPI(text, callOptions?): Promise<{ meta: { api_version: { version: string } } ; token_strings: string[] ; tokens: number[] }>
Parameters
Name | Type |
---|---|
text | string |
callOptions? | FunctionCallOptions |
Returns
Promise<{ meta: { api_version: { version: string } } ; token_strings: string[] ; tokens: number[] }>
Defined in
packages/modelfusion/src/model-provider/cohere/CohereTokenizer.ts:48
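The `tokenizeWithTexts` signature on this page shows `tokenTexts` taken from `response.token_strings` and `tokens` from `response.tokens`. A minimal sketch of that mapping, using the response shape documented above, might look as follows. The names `TokenizeResponse` and `toTokensAndTexts` are hypothetical, and this is not the library's actual implementation.

```typescript
// Shape of the resolved value, copied from the signature above.
type TokenizeResponse = {
  meta: { api_version: { version: string } };
  token_strings: string[];
  tokens: number[];
};

// Hypothetical helper: renames the raw API fields to the
// { tokens, tokenTexts } shape that tokenizeWithTexts returns.
function toTokensAndTexts(
  response: TokenizeResponse
): { tokens: number[]; tokenTexts: string[] } {
  return {
    tokens: response.tokens,
    tokenTexts: response.token_strings,
  };
}
```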
detokenize
▸ detokenize(tokens): Promise<string>
Asynchronously converts a sequence of numeric tokens back into text. Detokenization transforms tokens into a human-readable form; it is needed when output must be interpretable or when tokenization has to be reversible.
Parameters
Name | Type | Description |
---|---|---|
tokens | number[] | An array of numeric tokens to be converted back to text. |
Returns
Promise<string>
A promise containing a string that represents the original text corresponding to the sequence of input tokens.
Implementation of
FullTokenizer.detokenize
Defined in
packages/modelfusion/src/model-provider/cohere/CohereTokenizer.ts:125
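Reversibility can be checked with a round trip: tokenize, detokenize, compare. The sketch below runs offline against a hypothetical stand-in (`stubTokenizer`, one token per UTF-16 code unit) rather than the Cohere API. `TokenizerLike` and `roundTrips` are likewise illustrative names. Whether a real `CohereTokenizer` reproduces every input exactly is not stated on this page.

```typescript
// Minimal structural type covering the two methods used here.
interface TokenizerLike {
  tokenize(text: string): Promise<number[]>;
  detokenize(tokens: number[]): Promise<string>;
}

// Hypothetical stand-in so the sketch runs without network access:
// each UTF-16 code unit becomes one numeric token.
const stubTokenizer: TokenizerLike = {
  async tokenize(text) {
    return text.split("").map((ch) => ch.charCodeAt(0));
  },
  async detokenize(tokens) {
    return String.fromCharCode(...tokens);
  },
};

// Returns true when detokenize(tokenize(text)) reproduces the input.
async function roundTrips(
  tokenizer: TokenizerLike,
  text: string
): Promise<boolean> {
  const tokens = await tokenizer.tokenize(text);
  const restored = await tokenizer.detokenize(tokens);
  return restored === text;
}
```

Because `roundTrips` depends only on the two method signatures, any object exposing `tokenize` and `detokenize` with these types could be passed in place of the stub.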
tokenize
▸ tokenize(text): Promise<number[]>
Asynchronously tokenize the given text into a sequence of numeric tokens.
Parameters
Name | Type | Description |
---|---|---|
text | string | Input text string that needs to be tokenized. |
Returns
Promise<number[]>
A promise containing an array of numbers, where each number is a token representing a part or the whole of the input text.
Implementation of
FullTokenizer.tokenize
Defined in
packages/modelfusion/src/model-provider/cohere/CohereTokenizer.ts:112
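Since `tokenize` resolves to one number per token, the length of the array is the token count. A common use is checking text against a token budget before sending it to a model. The sketch below assumes nothing beyond the `tokenize` signature; `BasicTokenizerLike`, `fitsTokenBudget`, and the whitespace-based `whitespaceStub` are hypothetical and exist only so the example runs offline.

```typescript
// Minimal structural type: only tokenize is needed for counting.
interface BasicTokenizerLike {
  tokenize(text: string): Promise<number[]>;
}

// Returns true when the text needs at most maxTokens tokens.
async function fitsTokenBudget(
  tokenizer: BasicTokenizerLike,
  text: string,
  maxTokens: number
): Promise<boolean> {
  const tokens = await tokenizer.tokenize(text);
  return tokens.length <= maxTokens;
}

// Hypothetical stand-in: one token per whitespace-separated word,
// using the word's position as its token ID. Real tokenizers split
// text differently, so real counts will not match this stub.
const whitespaceStub: BasicTokenizerLike = {
  async tokenize(text) {
    return text
      .split(/\s+/)
      .filter((word) => word.length > 0)
      .map((_, index) => index);
  },
};
```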
tokenizeWithTexts
▸ tokenizeWithTexts(text): Promise<{ tokenTexts: string[] = response.token_strings; tokens: number[] = response.tokens }>
Asynchronously tokenize the given text, providing both the numeric tokens and their corresponding text.
Parameters
Name | Type | Description |
---|---|---|
text | string | Input text string to be tokenized. |
Returns
Promise<{ tokenTexts: string[] = response.token_strings; tokens: number[] = response.tokens }>
A promise containing an object with two arrays:
- tokens: An array of numbers where each number is a token.
- tokenTexts: An array of strings where each string represents the original text corresponding to each token.
Implementation of
FullTokenizer.tokenizeWithTexts
Defined in
packages/modelfusion/src/model-provider/cohere/CohereTokenizer.ts:116
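The two arrays are easier to inspect when combined into one list of pairs. The sketch below assumes they are parallel, meaning index `i` in `tokens` corresponds to index `i` in `tokenTexts`, which the description suggests but does not state outright. `TokensWithTexts` and `pairTokens` are hypothetical names, not ModelFusion APIs.

```typescript
// Result shape of tokenizeWithTexts, as documented above.
type TokensWithTexts = { tokens: number[]; tokenTexts: string[] };

// Hypothetical helper: zips the two arrays into { token, text } pairs.
// Throws if the arrays differ in length, since the pairing would then
// be ambiguous.
function pairTokens(
  result: TokensWithTexts
): Array<{ token: number; text: string }> {
  if (result.tokens.length !== result.tokenTexts.length) {
    throw new Error("tokens and tokenTexts have different lengths");
  }
  return result.tokens.map((token, i) => ({
    token,
    text: result.tokenTexts[i] as string,
  }));
}
```

With a real tokenizer this would be called as `pairTokens(await tokenizer.tokenizeWithTexts(text))`.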
Properties
settings
• Readonly settings: CohereTokenizerSettings
Defined in
packages/modelfusion/src/model-provider/cohere/CohereTokenizer.ts:42