Description
The single file format has been discussed before. The current structure, which saves each text document and metadata unit as a separate file, has three key benefits, especially for the project's text itself:
- It is safer against data loss.
- It is very file sync friendly.
- It is version control friendly.
Issue
The fact that all the data is stored in plain text files seems to imply to many users that the data is open to be edited externally, and therefore the project folder should be easy to navigate for that purpose. But the project folder is intended as a file database, not as a folder with files to be manually accessed. Because of the possibility of manual edits, a lot of extra care has to be taken when reading the data in order to account for user-introduced errors.
A few conditions must apply to a single file format:
- It must be optional (although perhaps default) for the foreseeable future.
- In order to satisfy point 1 above, the file format must retain the individual document read/write properties of the current setup, and not rely on an in-memory data buffer of all project content between open and save/close.
- The file sync stability of point 2 must be retained.
- The version control support can only be retained by using a single flat plain text file format, and even that is not ideal. Point 3 is therefore not a requirement. For users who require it, opting out of the single file format is a reasonable solution.
Implementation
I've considered a few options before, detailed in #259. While using a read/write database as storage would solve the in-place issue of point 1, database files really aren't file sync friendly or even file sync safe, which breaks with point 2.
The solution that stands out as the simplest is to keep the project as an archive file. While an archive is not really writable in the sense demanded by point 1, a temporary workspace can bypass this issue: when the project is opened, it can be extracted into a temp folder, and written back as an archive when the project is saved.
The additional benefit of this implementation is that it avoids the need to make any fundamental changes to how novelWriter reads and writes data during the writing session. All that is needed is to inject an extra step in the open and save process.
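The extra open and save step could look roughly like the sketch below. It is only an illustration under stated assumptions, not novelWriter's actual code: the function names `open_archive` and `save_archive`, the `nwproject_` temp prefix, and the write-to-temp-then-rename step are all hypothetical choices.

```python
import tempfile
import zipfile
from pathlib import Path


def open_archive(archive: Path) -> Path:
    """Extract the project archive into a fresh temp folder and return its path."""
    # Hypothetical prefix; the real extraction location is still undecided.
    workdir = Path(tempfile.mkdtemp(prefix="nwproject_"))
    with zipfile.ZipFile(archive, mode="r") as zf:
        zf.extractall(workdir)
    return workdir


def save_archive(workdir: Path, archive: Path) -> None:
    """Write the working folder back into the archive file."""
    # Assumption: build the new archive beside the old one, then swap it in,
    # so a failed save does not leave a half-written project file.
    tmp = archive.with_name(archive.name + ".tmp")
    with zipfile.ZipFile(tmp, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(workdir.rglob("*")):
            if path.is_file():
                # Store paths relative to the project root, with "/" separators.
                zf.write(path, path.relative_to(workdir).as_posix())
    tmp.replace(archive)
```

During the writing session, everything would keep reading and writing individual files in the folder returned by `open_archive`, so the existing document handling would not need to change.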
Steps
Updated on 2024-10-24
General
- Bump project format version to 2.0 and make sure a dialog pops up to say the conversion is one-way and permanent.
- The main project file should be converted to JSON from XML.
- The content files should be named .txt. See Use txt extension for text documents #1964.
- The file extension for both the archive and the folder project file should be either .nwproj or .nwprj.
- Provide Project menu options to "Save as Folder" and "Save as Archive".
Archive Format
- Implement the single file as a Python ZipFile object.
- When open, the archive is extracted and works like a folder-based project. Where to extract is still to be determined. It should either be next to the archive file, or in the novelWriter user data folder.
- Sanitise the file list on read/write to the zip file. This will make it easier to clean out deprecated files, and also ensure that files added in other ways to the archive are cleaned out.
- The internal file extensions should reflect the actual data format used, so .json, .txt, etc.
This implementation should be fairly file sync friendly as well. At least as much as a Word or Open Document file, which are built in essentially the same way.
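As one possible reading of the "sanitise the file list" item above, the sketch below filters archive entries against an allowlist before they are extracted or written. The allowlist contents and the names `ALLOWED_SUFFIXES`, `is_expected` and `sanitised_names` are assumptions made for illustration; the real set of expected files would come from the project format itself.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Assumption: only these extensions belong in a project archive.
ALLOWED_SUFFIXES = {".json", ".txt"}


def is_expected(name: str) -> bool:
    """Return True if an archive entry name looks like a project file."""
    if "\\" in name:
        return False  # reject names carrying Windows-style separators
    path = PurePosixPath(name)
    if path.is_absolute() or ".." in path.parts:
        return False  # reject entries that point outside the project root
    return path.suffix in ALLOWED_SUFFIXES


def sanitised_names(archive: Path) -> list[str]:
    """List the archive entries that pass the allowlist, in archive order."""
    with zipfile.ZipFile(archive) as zf:
        return [
            info.filename
            for info in zf.infolist()
            if not info.is_dir() and is_expected(info.filename)
        ]
```

Applying the same filter on both read and write would mean that deprecated or foreign files are simply left out the next time the archive is written.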
Additional
- Allow the user to discard the working (extracted) copy of the project when closing. See close without saving #1470.