Custom Pre-Tokenized Dataset

How to use a custom pre-tokenized dataset.

Sample config (the type key is intentionally left empty, since the data is already tokenized):

config.yml
datasets:
  - path: /path/to/your/file.jsonl
    ds_type: json
    type:

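Sample JSONL, one record per line. Each record carries input_ids, attention_mask, and labels of equal length; a label of -100 marks a token that is ignored when the loss is computed: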

{"input_ids":[271,299,99],"attention_mask":[1,1,1],"labels":[271,-100,99]}
{"input_ids":[87,227,8383,12],"attention_mask":[1,1,1,1],"labels":[87,227,8383,12]}