foodbrake3
@foodbrake3
https://s-games.net/
Instead of rewards, we use new types of feedback, such as demonstrations (in the above example, human-written summaries), preferences (judgments about which of two summaries is better), corrections (changes to a summary that would make it better), and more. We hope that BASALT will be used by anyone who aims to learn from human feedback, whether they are working on imitation learning, learning from comparisons, or another technique.