... | @@ -48,9 +48,7 @@ Since there is much to be done to get to that point, this means being vigilant f |
|
|
|
|
|
### What should come through and what shouldn't and how do we know?
|
|
|
|
|
|
|
|
No provision is made for passing page headers, for example, through into the HTML in any form.
|
|
While the transformation is lossless with respect to "main document content" (that is, data stored as the nominal "document text" in the Word file), a Word source file contains much other data that should come with it (most obviously, footnotes and figures). Because of the "database-like" organization of the source data, it is difficult or impossible to guarantee a perfect transformation, especially since some of the information in the Word document (page headers come to mind as an example) should arguably not come into the HTML.
|
|
|
|
|
|
However, as of this writing, no provision is yet made for handling tables, for example, and we know we will have to handle them. So we already know we will be fixing up the XSLT to work for these cases. But what about cases we haven't seen yet?
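
One way to surface cases we haven't seen yet is to inventory the source before transforming it. The following is only a sketch, assuming the source is WordprocessingML as found in a `.docx` package; the `HANDLED` set and the function name `unhandled_elements` are hypothetical and would have to be kept in step with whatever the XSLT really covers.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (assumes a .docx-style source).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical list of w: element names the XSLT is assumed to handle.
HANDLED = {"document", "body", "p", "r", "t"}


def split_tag(tag):
    """Split ElementTree's '{namespace}local' spelling into (namespace, local)."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return "", tag


def unhandled_elements(document_xml):
    """Count every element in the source that is not on the HANDLED list."""
    counts = {}
    for el in ET.fromstring(document_xml).iter():
        ns, local = split_tag(el.tag)
        if ns == W_NS and local in HANDLED:
            continue
        counts[el.tag] = counts.get(el.tag, 0) + 1
    return counts
```

Run against a document containing a table, the result would list `tbl`, `tr`, and `tc` with their counts, which flags the gap before any output is inspected by eye.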
|
|
|
|
|
|
|
|
We need robust mechanisms for detecting problems in data extraction, _especially lost data_; for ameliorating such problems in the instance (sometimes they may not be fatal errors); and for maintaining and improving the XSLTs so that such problems do not recur.
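
As an illustration of what a lost-data check might look like, here is a minimal sketch that compares the text of each source paragraph against the text of the HTML result. It assumes the source is WordprocessingML as found in a `.docx` package and that the check covers only the main document part; the function name `lost_paragraphs` is hypothetical, and the substring test is deliberately crude.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# WordprocessingML main namespace in ElementTree's '{uri}' spelling.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


class _TextCollector(HTMLParser):
    """Collect all character data from an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def _squash(text):
    """Collapse runs of whitespace so layout differences do not matter."""
    return " ".join(text.split())


def lost_paragraphs(document_xml, html):
    """Return the text of source paragraphs that cannot be found in the HTML."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    html_text = _squash("".join(collector.chunks))

    lost = []
    for p in ET.fromstring(document_xml).iter(W + "p"):
        text = _squash("".join(t.text or "" for t in p.iter(W + "t")))
        if text and text not in html_text:
            lost.append(text)
    return lost
```

An empty result does not prove the transformation is complete: footnotes and other parts stored outside the main document part are not examined here, and a paragraph that was reordered or duplicated would still pass. A non-empty result, however, is a cheap and reliable signal that something was dropped.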
|
|
|
|
|
|
|