目录

torchtune.data

Text templates

Templates for instruct prompts and chat prompts. Includes some specific formatting for difference datasets and models.

InstructTemplate

Interface for instruction templates.

AlpacaInstructTemplate

Prompt template for Alpaca-style datasets.

GrammarErrorCorrectionTemplate

Prompt template for grammar correction datasets.

SummarizeTemplate

Prompt template to format datasets for summarization tasks.

StackExchangedPairedTemplate

Prompt template for preference datasets similar to StackExchangedPaired.

ChatFormat

Interface for chat formats.

ChatMLFormat

OpenAI's Chat Markup Language used by their chat models.

Llama2ChatFormat

Chat format that formats human and system prompts with appropriate tags used in Llama2 pre-training.

MistralChatFormat

Formats according to Mistral's instruct model.

Types

Message

This dataclass represents individual messages in an instruction or chat dataset.

Converters

Converts data from common JSON formats into a torchtune Message.

get_sharegpt_messages

Convert a chat sample adhering to the ShareGPT json structure to torchtune's Message structure.

get_openai_messages

Convert a chat sample adhering to the OpenAI API json structure to torchtune's Message structure.

Helper funcs

Miscellaneous helper functions used in modifying data.

validate_messages

Given a list of messages, ensure that messages form a valid back-and-forth conversation.

truncate

Truncate a list of tokens to a maximum length.

文档

访问 PyTorch 的全面开发人员文档

查看文档

教程

获取面向初学者和高级开发人员的深入教程

查看教程

资源

查找开发资源并解答您的问题

查看资源