torcharrow.functional

Velox Core Functions

The Velox core functions are available in torcharrow.functional.

Here is an example usage of the Velox string function lpad:

>>> import torcharrow as ta
>>> from torcharrow import functional
>>> col = ta.column(["abc", "x", "yz"])
# Velox's lpad function: https://facebookincubator.github.io/velox/functions/string.html#lpad
>>> functional.lpad(col, 5, "123")
0  '12abc'
1  '1231x'
2  '123yz'
dtype: String(nullable=True), length: 3, null_count: 0, device: cpu
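The padding rule in the example above can be reproduced with a small pure-Python sketch. This is not TorchArrow's or Velox's implementation; the helper name `lpad_sketch` is hypothetical, and the truncation and validation rules follow the Velox `lpad` documentation as I understand it.

```python
def lpad_sketch(s: str, size: int, pad: str) -> str:
    """Left-pad `s` to `size` characters with `pad` (illustrative only)."""
    if size < 0 or not pad:
        raise ValueError("size must be non-negative and pad must be non-empty")
    if len(s) >= size:
        # Assumption: strings longer than `size` are truncated to `size`.
        return s[:size]
    missing = size - len(s)
    # Repeat the pad string enough times, then cut it to the exact width.
    repeats = -(-missing // len(pad))  # ceiling division
    return (pad * repeats)[:missing] + s


padded = [lpad_sketch(s, 5, "123") for s in ["abc", "x", "yz"]]
print(padded)  # ['12abc', '1231x', '123yz']
```

The result matches the column printed by `functional.lpad(col, 5, "123")` in the example.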

Here is another example, using the Velox array function array_except:

>>> col1 = ta.column([[1, 2, 3], [1, 2, 3], [1, 2, 2], [1, 2, 2]])
>>> col2 = ta.column([[4, 5, 6], [1, 2], [1, 1, 2], [1, 3, 4]])
# Velox's array_except function: https://facebookincubator.github.io/velox/functions/array.html#array_except
>>> functional.array_except(col1, col2)
0  [1, 2, 3]
1  [3]
2  []
3  [2]
dtype: List(Int64(nullable=True), nullable=True), length: 4, null_count: 0
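To make the row-wise semantics explicit, here is a minimal pure-Python sketch that reproduces the output above. The helper `array_except_sketch` is hypothetical and is not how TorchArrow or Velox implements the function; it assumes the result keeps first-occurrence order and drops duplicates.

```python
def array_except_sketch(x, y):
    """Elements of `x` that do not appear in `y`, without duplicates."""
    exclude = set(y)
    seen = set()
    out = []
    for value in x:
        if value not in exclude and value not in seen:
            seen.add(value)
            out.append(value)
    return out


rows1 = [[1, 2, 3], [1, 2, 3], [1, 2, 2], [1, 2, 2]]
rows2 = [[4, 5, 6], [1, 2], [1, 1, 2], [1, 3, 4]]
result = [array_except_sketch(a, b) for a, b in zip(rows1, rows2)]
print(result)  # [[1, 2, 3], [3], [], [2]]
```

Note that the last row yields `[2]` rather than `[2, 2]`, because duplicates are removed from the result.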

Text Operations

add_tokens

Append or prepend a list of tokens/indices to a column.
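The append-or-prepend behavior can be pictured with plain Python lists. This is only a sketch: the helper `add_tokens_sketch` and its `begin` flag are assumptions made for illustration and should not be read as the actual `add_tokens` signature.

```python
def add_tokens_sketch(rows, tokens, begin=True):
    """Add `tokens` to every row, at the front if `begin` else at the end."""
    if begin:
        return [list(tokens) + list(row) for row in rows]
    return [list(row) + list(tokens) for row in rows]


rows = [[1, 2], [3]]
prepended = add_tokens_sketch(rows, [0], begin=True)
appended = add_tokens_sketch(rows, [9], begin=False)
print(prepended)  # [[0, 1, 2], [0, 3]]
print(appended)   # [[1, 2, 9], [3, 9]]
```

A typical use would be adding start-of-sequence or end-of-sequence indices to tokenized text before batching.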

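Recommendation Operations

`bucketize`	Apply bucketization to the input feature.
`sigrid_hash`	Apply hashing to an index or a list of indices.
`firstx`	Return the first x values from the head of the input column.
`has_id_overlap`	Return 1.0 if the two input columns overlap, otherwise 0.0.
`id_overlap_count`	Return the number of overlapping IDs between two ID lists.
`get_max_count`	If any items overlap between input_ids and matching_ids, the maximum number of instances of the overlapping IDs is counted toward the max count.
`get_jaccard_similarity`	Return the Jaccard similarity between input_ids and matching_ids.
`get_cosine_similarity`	Return the cosine similarity between the vector of input_ids weighted by input_id_scores and the vector of matching_ids weighted by matching_id_scores.
`get_score_sum`	Return the sum of all scores in matching_id_scores whose corresponding ID in matching_ids also appears in input_ids.
`get_score_min`	Return the minimum of all scores in matching_id_scores whose corresponding ID in matching_ids also appears in input_ids.
`get_score_max`	Return the maximum of all scores in matching_id_scores whose corresponding ID in matching_ids also appears in input_ids.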
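Several of the ID-list functions in this table rest on standard definitions that are easy to state in code. The sketch below works on a single row of plain Python lists, treats each ID list as a set with one score per ID, and uses hypothetical helper names; how TorchArrow handles duplicate IDs, nulls, or empty lists is not covered here and may differ.

```python
import math


def jaccard_similarity(input_ids, matching_ids):
    """|A intersect B| / |A union B| for two ID lists treated as sets."""
    a, b = set(input_ids), set(matching_ids)
    union = a | b
    # Assumption: two empty lists give 0.0 rather than an error.
    return len(a & b) / len(union) if union else 0.0


def cosine_similarity(input_ids, input_scores, matching_ids, matching_scores):
    """Cosine of the angle between two sparse score vectors keyed by ID."""
    u = dict(zip(input_ids, input_scores))
    v = dict(zip(matching_ids, matching_scores))
    dot = sum(w * v[i] for i, w in u.items() if i in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score_sum(input_ids, matching_ids, matching_scores):
    """Sum of the matching scores whose ID also appears in input_ids."""
    wanted = set(input_ids)
    return sum(s for i, s in zip(matching_ids, matching_scores) if i in wanted)


print(jaccard_similarity([1, 2, 3], [2, 3, 4]))                # 0.5
print(cosine_similarity([1, 2], [3.0, 4.0], [2, 3], [4.0, 3.0]))
print(score_sum([1, 2, 3], [2, 3, 4], [0.5, 1.5, 2.0]))        # 2.0
```

Replacing `sum` with `min` or `max` in `score_sum` gives the analogous minimum and maximum variants, provided at least one ID overlaps.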

Advanced Operations

scale_to_0_1

Return the column data scaled to the range [0,1].
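Scaling to [0,1] is usually done with min-max normalization, (v - min) / (max - min). The sketch below assumes that formula; the helper `scale_to_0_1_sketch` is hypothetical, and its handling of constant and empty inputs is an assumption of this example rather than documented behavior.

```python
def scale_to_0_1_sketch(values):
    """Min-max scale a list of numbers into the range [0, 1]."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        # Assumption: a constant column has no spread, so use the midpoint.
        return [0.5 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


scaled = scale_to_0_1_sketch([2, 4, 6, 10])
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

The smallest input always maps to 0.0 and the largest to 1.0, so the result is sensitive to outliers in the column.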
