torchtext.functional

to_tensor

torchtext.functional.to_tensor(input: Any, padding_value: Optional[int] = None, dtype: dtype = torch.int64) → Tensor

Convert the input to a torch tensor.

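Parameters:

input (Union[List[int], List[List[int]]]) – Sequence or batch of token ids
padding_value (Optional[int]) – Pad value used to make each input in the batch as long as the longest sequence in the batch
dtype (torch.dtype) – torch.dtype of the output tensor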

Return type:

Tensor

Tutorials using to_tensor: SST-2 Binary text classification with XLM-RoBERTa model
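As an illustration of how padding_value might be applied to a batch, the sketch below pads every sequence to the length of the longest one in plain Python. It assumes right-padding and is not torchtext's implementation; pad_batch_sketch is a hypothetical helper name.

```python
from typing import List


def pad_batch_sketch(batch: List[List[int]], padding_value: int) -> List[List[int]]:
    """Right-pad every row to the length of the longest row (assumed behavior)."""
    longest = max((len(row) for row in batch), default=0)
    return [row + [padding_value] * (longest - len(row)) for row in batch]


# Two token-id sequences of different lengths; 1 is used as the pad id here.
batch = [[0, 11, 12, 2], [0, 21, 2]]
padded = pad_batch_sketch(batch, padding_value=1)
print(padded)  # [[0, 11, 12, 2], [0, 21, 2, 1]]
```

A rectangular list such as padded can then be converted to a tensor, for example with torch.tensor(padded, dtype=torch.int64).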

truncate

torchtext.functional.truncate(input: Any, max_seq_len: int) → Any

Truncate the input sequence or batch.

Parameters:

input (Union[List[Union[str, int]], List[List[Union[str, int]]]]) – Input sequence or batch to be truncated
max_seq_len (int) – Maximum length; tokens beyond this length are discarded

Returns:

Truncated sequence

Return type:

Union[List[Union[str, int]], List[List[Union[str, int]]]]
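The truncation described above can be sketched in plain Python as follows. This is an illustrative stand-in, not the library's source; truncate_sketch is a hypothetical name, and a batch is assumed to be a list whose elements are lists.

```python
from typing import Any


def truncate_sketch(data: Any, max_seq_len: int) -> Any:
    """Keep at most max_seq_len tokens of a sequence, or of each sequence in a batch."""
    # Assumption: a batch is recognised by its first element being a list.
    if data and isinstance(data[0], list):
        return [seq[:max_seq_len] for seq in data]
    return data[:max_seq_len]


single = truncate_sketch([1, 2, 3, 4, 5], 3)
batched = truncate_sketch([["a", "b", "c"], ["d"]], 2)
print(single)   # [1, 2, 3]
print(batched)  # [['a', 'b'], ['d']]
```

Sequences already shorter than max_seq_len are left unchanged, as the second row of the batch shows.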

add_token

torchtext.functional.add_token(input: Any, token_id: Any, begin: bool = True) → Any

Add a token to the beginning or end of a sequence.

Parameters:

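input (Union[List[Union[str, int]], List[List[Union[str, int]]]]) – Input sequence or batch
token_id (Union[str, int]) – Token to be added
begin (bool, optional) – Whether to insert the token at the beginning (True) or the end (False) of the sequence; defaults to True

Returns:

Sequence or batch with token_id added at the beginning or end of the input

Return type:

Union[List[Union[str, int]], List[List[Union[str, int]]]]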
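A minimal plain-Python sketch of the behavior described above, assuming begin=True prepends and begin=False appends. The name add_token_sketch is hypothetical, and this is not the library's implementation.

```python
from typing import Any


def add_token_sketch(data: Any, token_id: Any, begin: bool = True) -> Any:
    """Add token_id to the start or end of a sequence, or of each sequence in a batch."""

    def _add(seq):
        # Build a new list so the caller's input is not modified.
        return [token_id] + seq if begin else seq + [token_id]

    # Assumption: a batch is recognised by its first element being a list.
    if data and isinstance(data[0], list):
        return [_add(seq) for seq in data]
    return _add(data)


with_bos = add_token_sketch([11, 12], 0)
with_eos = add_token_sketch([11, 12], 2, begin=False)
batch_bos = add_token_sketch([[11, 12], [21]], 0)
print(with_bos)   # [0, 11, 12]
print(with_eos)   # [11, 12, 2]
print(batch_bos)  # [[0, 11, 12], [0, 21]]
```

Calling the function twice, once with begin=True and once with begin=False, is one way to wrap a sequence in both a start and an end token.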

str_to_int

torchtext.functional.str_to_int(input: Any) → Any

Convert string tokens to integers (either a single sequence or a batch).
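The conversion can be sketched in plain Python as below. This assumes each token is a decimal string that Python's built-in int() can parse; str_to_int_sketch is a hypothetical name and not the library's implementation.

```python
from typing import Any


def str_to_int_sketch(data: Any) -> Any:
    """Convert string tokens to integers for a sequence or a batch of sequences."""
    # Assumption: a batch is recognised by its first element being a list.
    if data and isinstance(data[0], list):
        return [[int(tok) for tok in seq] for seq in data]
    return [int(tok) for tok in data]


single_ids = str_to_int_sketch(["1", "20", "300"])
batch_ids = str_to_int_sketch([["1", "2"], ["3"]])
print(single_ids)  # [1, 20, 300]
print(batch_ids)   # [[1, 2], [3]]
```

In this sketch a token that is not a valid integer string raises ValueError from int().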