Add ability to load all text files from a subdirectory for training (#1997)

* Update utils.py

Make getdatasets return individual txt files and subdirectories, so that training can run on a directory of text files

* Update training.py

Minor tweak to raw-dataset training: detect whether a directory is selected and, if so, load all the txt files in that directory for training

* Update put-trainer-datasets-here.txt

Document loading multiple raw text files from a subdirectory

* Minor change

* Use pathlib, sort by natural keys

* Space

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
kizinfo 2023-07-12 17:44:30 +03:00 committed by GitHub
parent 73a0def4af
commit 5d513eea22
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
3 changed files with 22 additions and 5 deletions

@@ -0,0 +1 @@
to load multiple raw text files create a subdirectory and put them all there
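
The behavior this commit describes (detect a directory, collect its txt files with pathlib, sort them by natural keys, and load them all) might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from training.py or utils.py: `load_raw_text` is a hypothetical name, `natural_keys` here is a stand-in written for this example, and joining files with a newline is an assumed choice.

```python
import re
from pathlib import Path


def natural_keys(text):
    # Stand-in helper (assumption): split into digit and non-digit runs
    # so that "part10.txt" sorts after "part2.txt".
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", text)]


def load_raw_text(path):
    """Hypothetical loader: accept one .txt file or a directory of them."""
    path = Path(path)
    if path.is_dir():
        # Only top-level *.txt files are collected; other files are ignored.
        files = sorted(path.glob("*.txt"), key=lambda p: natural_keys(p.name))
        # Joining with a newline is an assumption made for this sketch.
        return "\n".join(f.read_text(encoding="utf-8") for f in files)
    return path.read_text(encoding="utf-8")
```

Natural-key sorting matters when files are numbered without zero padding: a plain string sort would place `part10.txt` before `part2.txt`, which would change the order of the training text.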