Interface: FullTokenizer
Interface for a comprehensive tokenizer that extends the basic tokenization capabilities.
In addition to basic tokenization, this interface provides methods for detokenization and retrieving the original text corresponding to each token, enabling a more informative and reversible transformation process.
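As an illustration of the reversible round trip described above, here is a minimal sketch of a toy implementation. The `FullTokenizer` type below is a local copy of the three members documented on this page, and `CharCodeTokenizer` is a hypothetical class (one token per UTF-16 code unit), not something the library provides.

```typescript
// Local copy of the interface shape described on this page, so the sketch
// compiles without importing the library.
interface FullTokenizer {
  tokenize: (text: string) => PromiseLike<number[]>;
  tokenizeWithTexts: (
    text: string
  ) => PromiseLike<{ tokens: number[]; tokenTexts: string[] }>;
  detokenize: (tokens: number[]) => PromiseLike<string>;
}

// Hypothetical toy tokenizer: every UTF-16 code unit becomes one token.
class CharCodeTokenizer implements FullTokenizer {
  async tokenize(text: string): Promise<number[]> {
    return text.split("").map((ch) => ch.charCodeAt(0));
  }

  async tokenizeWithTexts(
    text: string
  ): Promise<{ tokens: number[]; tokenTexts: string[] }> {
    const tokenTexts = text.split("");
    const tokens = tokenTexts.map((ch) => ch.charCodeAt(0));
    return { tokens, tokenTexts };
  }

  async detokenize(tokens: number[]): Promise<string> {
    // Map each token back to its character and join the pieces.
    return tokens.map((t) => String.fromCharCode(t)).join("");
  }
}
```

With this toy class, `detokenize(await tokenize(text))` returns the input unchanged. Real tokenizers typically use subword vocabularies, so the token values will differ.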
Hierarchy
↳ FullTokenizer
Implemented by
Properties
detokenize
• detokenize: (tokens: number[]) => PromiseLike<string>
Asynchronously convert a sequence of numeric tokens back into text. Detokenization transforms tokens into a human-readable form; it is needed when the output must be interpretable or when tokenization has to be reversible.
Param
An array of numeric tokens to be converted back to text.
Type declaration
▸ (tokens): PromiseLike<string>
Parameters
Name | Type | Description |
---|---|---|
tokens | number[] | An array of numeric tokens to be converted back to text. |
Returns
PromiseLike<string>
A promise that resolves to the text corresponding to the sequence of input tokens.
Defined in
packages/modelfusion/src/model-function/tokenize-text/Tokenizer.ts:44
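One use of `detokenize` is cutting text down to a token budget: tokenize, keep the first tokens, and convert them back to text. The sketch below assumes only the `tokenize` and `detokenize` signatures documented here. `truncateToTokens` and `toyTokenizer` are hypothetical names introduced for illustration, and `maxTokens` is assumed to be non-negative.

```typescript
// Local type mirroring the two members used here; the real interface has more.
type TokenizeAndDetokenize = {
  tokenize: (text: string) => PromiseLike<number[]>;
  detokenize: (tokens: number[]) => PromiseLike<string>;
};

// Hypothetical helper: keep at most `maxTokens` tokens of `text`.
async function truncateToTokens(
  tokenizer: TokenizeAndDetokenize,
  text: string,
  maxTokens: number
): Promise<string> {
  const tokens = await tokenizer.tokenize(text);
  if (tokens.length <= maxTokens) {
    return text; // already within the budget, nothing to cut
  }
  return await tokenizer.detokenize(tokens.slice(0, maxTokens));
}

// Toy stand-in (one token per UTF-16 code unit), for demonstration only.
const toyTokenizer: TokenizeAndDetokenize = {
  tokenize: async (text) => text.split("").map((ch) => ch.charCodeAt(0)),
  detokenize: async (tokens) =>
    tokens.map((t) => String.fromCharCode(t)).join(""),
};
```

With the toy tokenizer, `truncateToTokens(toyTokenizer, "hello world", 5)` resolves to `"hello"`. A subword tokenizer would cut at token boundaries rather than character boundaries.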
tokenize
• tokenize: (text: string) => PromiseLike<number[]>
Asynchronously tokenize the given text into a sequence of numeric tokens.
Param
Input text string that needs to be tokenized.
Type declaration
▸ (text): PromiseLike<number[]>
Parameters
Name | Type | Description |
---|---|---|
text | string | Input text string that needs to be tokenized. |
Returns
PromiseLike<number[]>
A promise that resolves to an array of numbers, where each number is a token representing part or all of the input text.
Inherited from
Defined in
packages/modelfusion/src/model-function/tokenize-text/Tokenizer.ts:13
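Because `tokenize` resolves to an array with one entry per token, counting tokens is a matter of reading the array length. The following sketch assumes only the `tokenize` signature shown above. `countTokens`, `fitsTokenLimit`, and `toyTokenizer` are hypothetical names used for illustration.

```typescript
// Local type mirroring the single member used here.
type HasTokenize = {
  tokenize: (text: string) => PromiseLike<number[]>;
};

// Hypothetical helper: number of tokens the tokenizer produces for `text`.
async function countTokens(
  tokenizer: HasTokenize,
  text: string
): Promise<number> {
  const tokens = await tokenizer.tokenize(text);
  return tokens.length;
}

// Hypothetical helper: does `text` fit within `limit` tokens?
async function fitsTokenLimit(
  tokenizer: HasTokenize,
  text: string,
  limit: number
): Promise<boolean> {
  return (await countTokens(tokenizer, text)) <= limit;
}

// Toy stand-in (one token per UTF-16 code unit), for demonstration only.
const toyTokenizer: HasTokenize = {
  tokenize: async (text) => text.split("").map((ch) => ch.charCodeAt(0)),
};
```

A check like `fitsTokenLimit` could run before sending a prompt to a model with a fixed context size.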
tokenizeWithTexts
• tokenizeWithTexts: (text: string) => PromiseLike<{ tokenTexts: string[]; tokens: number[] }>
Asynchronously tokenize the given text, providing both the numeric tokens and their corresponding text.
Param
Input text string to be tokenized.
Type declaration
▸ (text): PromiseLike<{ tokenTexts: string[]; tokens: number[] }>
Parameters
Name | Type | Description |
---|---|---|
text | string | Input text string to be tokenized. |
Returns
PromiseLike
<{ tokenTexts
: string
[] ; tokens
: number
[] }>
A promise containing an object with two arrays:
tokens
- An array of numbers where each number is a token.tokenTexts
- An array of strings where each string represents the original text corresponding to each token.
Defined in
packages/modelfusion/src/model-function/tokenize-text/Tokenizer.ts:31
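The two arrays returned by `tokenizeWithTexts` can be combined into one list of token/text pairs, which is convenient for debugging how a string was split. This sketch assumes the arrays are parallel (same length, same order), which the description above implies but does not state outright. `pairTokensWithTexts` and `toyTokenizer` are hypothetical names.

```typescript
// Local type mirroring the single member used here.
type HasTokenizeWithTexts = {
  tokenizeWithTexts: (
    text: string
  ) => PromiseLike<{ tokens: number[]; tokenTexts: string[] }>;
};

// Hypothetical helper: pair each token with the text it stands for.
async function pairTokensWithTexts(
  tokenizer: HasTokenizeWithTexts,
  text: string
): Promise<Array<{ token: number; text: string }>> {
  const { tokens, tokenTexts } = await tokenizer.tokenizeWithTexts(text);
  // Assumption: both arrays are parallel; fail loudly if they are not.
  if (tokens.length !== tokenTexts.length) {
    throw new Error("tokens and tokenTexts have different lengths");
  }
  return tokens.map((token, i) => ({ token, text: tokenTexts[i] }));
}

// Toy stand-in (one token per UTF-16 code unit), for demonstration only.
const toyTokenizer: HasTokenizeWithTexts = {
  tokenizeWithTexts: async (text) => {
    const tokenTexts = text.split("");
    const tokens = tokenTexts.map((ch) => ch.charCodeAt(0));
    return { tokens, tokenTexts };
  },
};
```

With the toy tokenizer, the input `"ab"` yields the pairs `{ token: 97, text: "a" }` and `{ token: 98, text: "b" }`.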