DHitMA: Data Organization Considerations


From the very beginning, it is important to consider how one's data will be organized. Issues found and fixed early in the data hoarding process are much simpler to resolve than those found later, which is why up-front consideration is necessary. What folders will be created, and what shall they be named? This is only the simplest question; it and the more important ones are listed below:

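  1. What folders will be created, and what will they be named?
  2. How granular will one's folder categorization go (how many nested folders deep)? There is a practical limit to file path length: traditionally 260 characters on Windows and typically 4096 bytes on Linux, with each individual file or folder name limited to roughly 255 characters or bytes on most common file systems.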
  3. Will folders also be named so that they sort alphabetically or numerically?
  4. Will individual files be given descriptive names, or shall folder names and grouping suffice for this purpose?
  5. Shall folder/file names be kept short or detailed?
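
The depth and length questions above can be checked mechanically before a scheme is adopted. The following is a minimal sketch in Python, not a definitive tool: the function name path_problems and the default limits are assumptions made for illustration (260 mirrors the traditional Windows path limit, while the depth of 5 is an arbitrary choice to be adjusted to one's own scheme).

```python
from pathlib import PurePosixPath


def path_problems(rel_path, max_len=260, max_depth=5):
    """Return a list of problems found in a forward-slash relative path.

    max_len and max_depth are illustrative defaults, not universal limits.
    """
    problems = []
    # Total length of the path as written, in characters.
    if len(rel_path) > max_len:
        problems.append("too long")
    # Count the folders above the final component (the file itself).
    depth = len(PurePosixPath(rel_path).parts) - 1
    if depth > max_depth:
        problems.append("too deep")
    return problems


if __name__ == "__main__":
    for candidate in ["photos/2019/trip/img_001.jpg", "a/b/c/d/e/f/g.txt"]:
        print(candidate, path_problems(candidate))
```

Running a planned folder layout through such a check before any files are moved shows whether the chosen granularity stays within the limits one has decided on.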

By answering these questions, the data hoarder arrives at a scheme for how the data will be organized. Again, this is vastly important, because problems considered now result in less pain later. If the questions are left unanswered at the start, the data hoard may grow disorganized and cluttered, and the questions will eventually have to be answered anyway in order to remedy it.

A very important rule is to use only standard characters when naming files and folders: a-z, A-Z, 0-9, and the underscore (_). Spaces may be used; however, for maximum compatibility (especially with older software, browsers, etc.), they should be replaced with underscores. Do not use special characters, emojis, or non-English characters. Plain English letters are preferred because they are part of ASCII, which essentially every encoding scheme in use supports. If one wants their data to remain usable in the future, names built from a-z, A-Z, 0-9, and _, and nothing else, shall suffice to meet this aim.
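
The naming rule above can be expressed as a short check. This is a minimal sketch in Python under stated assumptions: the function names is_clean and sanitize_name are hypothetical, and the rule is applied to the name and to the extension separately so that the period before the extension survives, a detail the rule itself does not address.

```python
import os
import re

# Anything outside a-z, A-Z, 0-9 and underscore is disallowed.
# An explicit character class is used because \w would also match
# non-English letters.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_]")


def is_clean(stem):
    """True if stem is non-empty and uses only a-z, A-Z, 0-9 and _."""
    return re.fullmatch(r"[A-Za-z0-9_]+", stem) is not None


def sanitize_name(filename):
    """Replace spaces with underscores and drop all other disallowed characters.

    Assumption: the final extension is kept, joined by a single period.
    """
    stem, ext = os.path.splitext(filename)
    clean_stem = _DISALLOWED.sub("", stem.replace(" ", "_"))
    clean_ext = _DISALLOWED.sub("", ext)  # this also removes the leading dot
    return clean_stem + ("." + clean_ext if clean_ext else "")


if __name__ == "__main__":
    print(sanitize_name("My Photo (2019).JPG"))
```

A name made up entirely of disallowed characters comes back empty, so results are best reviewed before any file is actually renamed.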