I think this is a common problem: I have many data files gathered in several folders on my machine. At some point I start a new ML/DS project and pick some files from those common folders. Usually I just create a new folder for the new project and copy-paste the files there, but as the number of projects and research efforts grows, this slowly becomes a mess of duplicated files scattered all over the filesystem.
Another point: sometimes I have to mark or remember specific observations / outlier events / problems encountered at some location within a single file. Later, once I have accumulated enough such observations from many files, I can start to investigate them.
So it would be nice to have some database that can attach tags to specific files (or group them by some criteria) without copying them, arrange those tags into a tree, much like a folder tree but with a file allowed to belong to several groups at once, and then select file lists from any group for further processing.
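For what it's worth, here is a minimal sketch of the schema I have in mind, using SQLite. All table and column names are my own invention, not from any existing tool: a `tags` table with a `parent_id` forming the tree, and a `file_tags` many-to-many table so one file can sit in several groups without being copied. Selecting all files under a tag (including its subtree) is a recursive CTE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_id INTEGER REFERENCES tags(id)   -- NULL for root tags
);
CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    path TEXT UNIQUE NOT NULL               -- original location, no copy made
);
CREATE TABLE file_tags (
    file_id INTEGER REFERENCES files(id),
    tag_id  INTEGER REFERENCES tags(id),
    note    TEXT,                           -- e.g. a remark about an outlier
    PRIMARY KEY (file_id, tag_id)
);
""")

# Hypothetical sample data: a small tag tree and one file tagged twice.
conn.executemany(
    "INSERT INTO tags (id, name, parent_id) VALUES (?, ?, ?)",
    [(1, "sensors", None), (2, "vibration", 1), (3, "project-a", None)],
)
conn.execute("INSERT INTO files (id, path) VALUES (1, '/data/run42.csv')")
conn.executemany(
    "INSERT INTO file_tags (file_id, tag_id, note) VALUES (?, ?, ?)",
    [(1, 2, "outlier near t=17s"), (1, 3, None)],
)

def files_under(tag_name):
    """All file paths tagged with tag_name or any of its descendant tags."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM tags WHERE name = ?
            UNION ALL
            SELECT t.id FROM tags t JOIN subtree s ON t.parent_id = s.id
        )
        SELECT DISTINCT f.path
        FROM files f
        JOIN file_tags ft ON ft.file_id = f.id
        JOIN subtree st ON st.id = ft.tag_id
    """, (tag_name,)).fetchall()
    return [r[0] for r in rows]

print(files_under("sensors"))    # file found via the child tag "vibration"
print(files_under("project-a"))  # same file, reachable from a second group
```

This is just a sketch to make the requirements concrete; I'd rather use an existing tool than maintain this myself.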
I wonder if anybody has had similar problems, and if so, what file database did you use for such tasks?