Try training with lit-gpt, following https://sebastianraschka.com/blog/2023/optimizing-LLMs-dataset-perspective.html