Feature Proposal: improve Editoria's EPUB export
This feature proposal describes several steps to improve the quality of Editoria's EPUB export. While Editoria can currently export its contents as an EPUB from the Book Builder interface, there are some display and validation issues to be addressed. The following steps would produce a much more robust EPUB:
- Implement an existing UCP EPUB design as a new CSS sheet, using an existing CSS file for EPUBs from UCP as a starting point.
- Update how the EPUB component formats the HTML from Editoria, both so it works optimally with the new CSS sheet, and also to remove some legacy Vivliostyle-specific formatting.
- Validate an updated EPUB, created with the new CSS sheet, using the EPUBCheck tool to ensure that it is syntactically valid and conforms to EPUB standards.
- Include book metadata in the EPUB, drawing on the basic book metadata interface that is currently being developed.
These steps would lead to a noticeably more polished-looking EPUB, and validating it against the EPUB specifications will ensure that it works in the widest possible variety of EPUB readers.
Any user should be able to export an EPUB at any time from the Book Builder, as they can now. This proposal would not change the existing "DOWNLOAD EPUB" control in the Book Builder interface.
EPUB files are essentially zipped file directories containing HTML files with the content and other files describing how they should be linked together into a book. A CSS file controls how the different HTML elements look as the EPUB is being read.
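As a rough illustration of that container structure, here is a minimal Python sketch that zips a set of files into an EPUB-shaped archive. The `OEBPS/` folder name, the `content.opf` path, and the `build_epub` helper are assumptions made for this example and are not taken from Editoria's code. The one detail that comes from the EPUB container rules is that the `mimetype` entry is written first and stored uncompressed.

```python
import io
import zipfile

# Points reading systems at the package document. The OEBPS/content.opf
# path is an assumption chosen for this sketch.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files: dict[str, str]) -> bytes:
    """Zip `files` (archive path -> text content) into an EPUB container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        # HTML content, the CSS file, and the package document follow.
        for path, text in files.items():
            archive.writestr(path, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

The sketch only shows the packaging step. A real export would still need a complete package document and navigation file for the result to pass validation.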
This is how Editoria’s EPUB export feature currently works:
- Any user clicks the "DOWNLOAD EPUB" button on the top right of the Book Builder interface.
- The PubSweet EPUB component (pubsweet-component-epub) creates a series of HTML files from the content of the components in Editoria (1 file per chapter), and formats the HTML to work as part of an EPUB (more on this to come). It also adds other files and folders for the EPUB into a directory structure, then zips it all up into a single EPUB file.
- Finally, the complete EPUB file is delivered to the user as a download via their browser.
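The "one file per chapter" step above could be sketched roughly as follows. This is not how pubsweet-component-epub is implemented (that component is not written in Python). The `chapters_to_files` helper, the `chapter-N.xhtml` naming, and the `styles.css` link are hypothetical choices for illustration.

```python
from html import escape

# Wrapper for each chapter file. The stylesheet name is an assumption.
XHTML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>{title}</title>
  <link rel="stylesheet" type="text/css" href="styles.css"/>
</head>
<body>
{body}
</body>
</html>
"""


def chapters_to_files(chapters):
    """Turn (title, body_html) pairs into chapter files plus package entries.

    Returns (files, manifest_items, spine_itemrefs), where `files` maps
    archive paths to XHTML text.
    """
    files = {}
    manifest = []
    spine = []
    for number, (title, body) in enumerate(chapters, start=1):
        item_id = f"chapter-{number}"
        href = f"{item_id}.xhtml"
        files[f"OEBPS/{href}"] = XHTML_TEMPLATE.format(title=escape(title), body=body)
        # Every file must be declared in the manifest; the spine gives reading order.
        manifest.append(
            f'<item id="{item_id}" href="{href}" media-type="application/xhtml+xml"/>'
        )
        spine.append(f'<itemref idref="{item_id}"/>')
    return files, manifest, spine
```

The manifest and spine entries would be placed inside the package document, which is one of the "other files" that tell a reader how the chapters link together into a book.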
Create new CSS sheet
Tracked in #319 (closed). One step in this proposal to improve the EPUB is to replace the CSS file that the PS EPUB component uses for the EPUB. The CSS file Editoria currently packages with the exported EPUB causes some unexpected display issues. Furthermore, validating an Editoria-produced EPUB with EPUBCheck throws several errors and warnings related to the CSS (parsing and selector errors).
The CSS file extracted from a published UCP EPUB would make a good starting point for a new Editoria EPUB export CSS file. CSS works by using "selectors" to target specific HTML elements, then using the display properties associated with those selectors - margins, vertical space, font formatting, etc. - to control the way those HTML elements are displayed.
Right now, both Editoria’s EPUB CSS selectors and their associated display properties are different from the selectors and display properties used in UCP's EPUB stylesheet. UCP's CSS selectors are designed to work with HTML that comes out of a different book production process and is formatted differently from Editoria's HTML. For Editoria's export to have the same look as UCP's EPUB, the UCP display properties could be largely reused, but the CSS selectors associated with the display properties would need to be updated to work with Editoria’s HTML output. It's worth noting that font licensing might be an issue, but if it is, swapping fonts in and out is easy.
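To make the idea of "keep the display properties, swap the selectors" concrete, here is a small Python sketch. The selector names in `SELECTOR_MAP` are invented for illustration and do not come from either the UCP or the Editoria stylesheet. The regular expression only handles flat rules, so comments and nested blocks such as `@media` would need a proper CSS parser.

```python
import re

# Hypothetical mapping from selectors in the source stylesheet to selectors
# that match the target HTML. Real names would come from the two stylesheets.
SELECTOR_MAP = {
    "p.chapter-title": "section.chapter h1",
    "p.body-text": "section.chapter p",
}

# Leading whitespace, a selector list, then a brace-delimited declaration block.
RULE_PATTERN = re.compile(r"(\s*)([^{}]+?)\s*\{([^{}]*)\}")


def remap_selectors(css, selector_map):
    """Rewrite the selectors of flat CSS rules, leaving declarations untouched."""
    def rewrite(match):
        leading, selector_list, declarations = match.groups()
        selectors = [s.strip() for s in selector_list.split(",")]
        # Selectors without a mapping are passed through unchanged.
        mapped = [selector_map.get(s, s) for s in selectors]
        return f"{leading}{', '.join(mapped)} {{{declarations}}}"

    return RULE_PATTERN.sub(rewrite, css)
```

In practice the remapping would more likely be done by hand while reviewing each rule, but the sketch shows why the display properties themselves can carry over unchanged.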
This approach (building from an existing CSS file) would avoid duplicated effort, since the display properties for the different elements wouldn't have to be built from scratch. Further, an EPUB using UCP’s CSS file passes EPUBCheck validation with no CSS errors or warnings. Finally, once the CSS selectors are updated to work with Editoria's HTML output, new EPUB designs could be made relatively easily by changing the display properties associated with the different selectors. As new users or organizations create their own CSS templates, one could imagine an Editoria interface for selecting different CSS templates for the EPUB export.
Update the PubSweet EPUB component
Tracked in #321 (closed). When the PubSweet EPUB component was first built, it was designed to work primarily with the Vivliostyle viewer for paginating. As a result, the PS EPUB component adds some formatting to the HTML that is specific to Vivliostyle. Since Editoria is no longer being used primarily with Vivliostyle, the way the PS EPUB component creates an EPUB should be updated to remove this formatting. It also seems likely that some tweaks to the HTML/EPUB structure would be helpful in implementing a new CSS template as described in the previous step.
Validate EPUB with EPUBCheck
Editoria should export an EPUB that is valid and built according to EPUB specifications. EPUBCheck is an open source tool for EPUB validation. When run through EPUBCheck validation, Editoria's currently exported EPUB fails with a series of errors and warnings (missing fonts, undefined resources and more).
Creating a brand new CSS file for use with Editoria will result in a different EPUB, as will removing formatting specifically meant for Vivliostyle. Thus, these changes should be made before addressing any remaining warnings and errors from EPUBCheck. Once these tasks are completed, though, any major outstanding EPUBCheck validation errors should then be resolved.
In the future, it would also be worth exploring the possibility of including a job as part of the EPUB export that would run the EPUBCheck validations on the exported EPUB and deliver the results to the user along with the EPUB file (likely a PubSweet job runner job).
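A validation job of that kind might look something like the Python sketch below. It assumes that Java and an EPUBCheck jar are available at the given path, and that EPUBCheck reports each problem on its own console line starting with a severity and a message ID in parentheses. That line format is an assumption to verify against the EPUBCheck version in use. The function names are hypothetical, and an actual PubSweet job would be written in JavaScript.

```python
import re
import subprocess

# Assumed console line shape: SEVERITY(ID): location and message text.
MESSAGE_PATTERN = re.compile(r"^(FATAL|ERROR|WARNING)\(([A-Z]+-\d+[a-z]?)\):\s*(.*)$")


def parse_epubcheck_output(output):
    """Collect (severity, message_id, text) tuples from console output."""
    results = []
    for line in output.splitlines():
        match = MESSAGE_PATTERN.match(line.strip())
        if match:
            results.append(match.groups())
    return results


def run_epubcheck(epub_path, jar_path="epubcheck.jar"):
    """Run EPUBCheck on an EPUB file and return the parsed messages.

    `jar_path` is an assumed location for the EPUBCheck jar.
    """
    completed = subprocess.run(
        ["java", "-jar", jar_path, epub_path],
        capture_output=True,
        text=True,
        check=False,  # a failed validation is a result to report, not a crash
    )
    # Read both streams so messages are found wherever the tool prints them.
    return parse_epubcheck_output(completed.stdout + completed.stderr)
```

The parsed list could then be delivered to the user alongside the EPUB file, for example grouped by severity.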
Include book metadata in EPUB
Tracked in #320 (closed). While EPUBCheck isn't currently issuing any warnings or errors about missing metadata, we know that some of the basic book-level metadata is missing from Editoria and thus from the EPUB. A basic metadata interface for Editoria is currently under development. Once it's built, the PS EPUB component should be updated to include the metadata in the EPUB.
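As a sketch of what including the metadata could mean in practice, the Python example below builds the metadata block of an EPUB package document. It assumes EPUB 3, where an identifier, a title, a language, and a last-modified timestamp are the required entries. The `build_metadata` function and the `pub-id` identifier are hypothetical, and the fields Editoria's metadata interface will actually offer are not known here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Serialize with the conventional prefixes instead of ns0/ns1.
ET.register_namespace("", OPF_NS)
ET.register_namespace("dc", DC_NS)


def build_metadata(title, identifier, language, modified=None):
    """Return a <metadata> element (as text) for an EPUB 3 package document.

    `modified` should be a timezone-aware datetime; it defaults to now.
    """
    modified = (modified or datetime.now(timezone.utc)).astimezone(timezone.utc)
    metadata = ET.Element(f"{{{OPF_NS}}}metadata")
    # The package element's unique-identifier attribute would refer to this id.
    ident = ET.SubElement(metadata, f"{{{DC_NS}}}identifier", {"id": "pub-id"})
    ident.text = identifier
    ET.SubElement(metadata, f"{{{DC_NS}}}title").text = title
    ET.SubElement(metadata, f"{{{DC_NS}}}language").text = language
    stamp = ET.SubElement(metadata, f"{{{OPF_NS}}}meta", {"property": "dcterms:modified"})
    stamp.text = modified.strftime("%Y-%m-%dT%H:%M:%SZ")
    return ET.tostring(metadata, encoding="unicode")
```

Further book-level fields such as author or publisher could be added in the same way once the metadata interface defines them.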