Since much remains to be done to get to that point, this means staying vigilant for opportunities for improvement.

Keep in mind another advantage of a pipelining architecture: XSLT steps can be combined in a pipeline with transformations implemented in other languages.

## Big Questions

### What should come through and what shouldn't and how do we know?

While the transformation is lossless with respect to "main document content" (that is, data stored as the nominal "document text" in the Word file), a Word source file contains much other data that should come with it (most obviously, footnotes and figures). Because of the "database-like" organization of the .docx source data, it is difficult or impossible to ensure that a transformation will never drop data, especially since some of the information in the Word document (page headers come to mind) should arguably not come into the HTML in any case. (What can be ensured, at least theoretically, is that every new case of dropped data, once detected, is also the last time that particular case is seen.)

In addition to some means of representing and aligning specifications for the different XSLTs (such as inline documentation), we need robust mechanisms for detecting problems in data extraction, _especially lost data_; for ameliorating such problems in the instance (sometimes they may not be fatal errors); and for maintaining and improving the XSLTs so that the problems do not recur.
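
One possible shape for a lost-data check, offered only as a rough sketch and not as part of the existing pipeline: compare the word tokens found in the text nodes of a source document against those found in the transformation result, and report whatever is missing. The function names `text_tokens` and `dropped_tokens` and the tiny sample documents are hypothetical. The sketch assumes that words are not split across elements, which real Word data may violate, and it makes no allowance for content that is dropped deliberately, such as page headers.

```python
from collections import Counter
import re
import xml.etree.ElementTree as ET


def text_tokens(xml_string):
    """Return a multiset (Counter) of word tokens from all text nodes."""
    root = ET.fromstring(xml_string)
    # Join text nodes with spaces so that adjacent elements do not fuse words.
    text = " ".join(root.itertext())
    return Counter(re.findall(r"\w+", text))


def dropped_tokens(source_xml, result_xml):
    """Return the tokens that occur more often in the source than in the result."""
    # Counter subtraction keeps only tokens with a positive remaining count.
    return text_tokens(source_xml) - text_tokens(result_xml)


# Hypothetical sample data: the footnote text is missing from the result.
source = "<doc><p>Body text</p><note>A footnote</note></doc>"
result = "<html><body><p>Body text</p></body></html>"

# Reports the tokens "A" and "footnote" as dropped.
print(dropped_tokens(source, result))
```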

Operationally, what will be the best way to specify corrections and feature requests? (We could use Issues on this GitLab project.)