... | ... | @@ -16,11 +16,11 @@ Not being able to provide a 100% solution, however, does not necessary make an 8 |
|
|
|
|
|
## Towards a solution
|
|
|
|
|
|
We are greedy. We know we need something that will work tolerably and usefully well, even if run with all the default settings intact and no special configuration. Running the set of stylesheets "lights out" (that is, arbitrarily, with no special inputs or supervision) should always be an option - and the outputs of such a process shouldn't be entirely useless.
|
|
|
We are ambitious. We know we need something that will work tolerably and usefully well, even if run with all the default settings intact and no special configuration. Running the set of stylesheets "lights out" (that is, arbitrarily, with no special inputs or supervision) should always be an option - and the outputs of such a process shouldn't be entirely useless.
|
|
|
|
|
|
However, we also want something that can be extended and adjusted to fit special cases and do special things - because special cases and special things are *regular* and *to be expected*.
|
|
|
However, we also want something that can be extended and adjusted to fit special cases and do special things - because special cases and special things are *regular* and *to be expected*. It is to be expected, we assert, that some loss of "semantic resolution" will occur - but the degree to which that happens can respond to the level of effort and skill put in. In other words, while a solution should give us something (reasonable) "lights out", it should also be able to do more if opened up and tinkered with.
|
|
|
|
|
|
The best way we think we can do this is at the level of the architecture. We must design from the outset so that everything in XSweet can be adapted and reused. XSweet should work like a black box, but you should also be able to open it up and rewire it -- completely, if need be.
|
|
|
The best way we think we can provide this is at the level of the architecture. We must design from the outset so that everything in XSweet can be adapted and reused. XSweet should work like a black box, but you should also be able to open it up and rewire it -- completely, if need be.
|
|
|
|
|
|
And, because we already know that 'perfect for everyone all the time' is impossible, WE AIM (first) FOR USEFUL, NOT (yet) COMPLETE OR PERFECT
|
|
|
|
... | ... | @@ -32,16 +32,16 @@ If we break up the problem into detail, what are the pieces? |
|
|
|
|
|
Can we prioritize them based on which of them are more universal / ubiquitous, vs which show the most variation (across documents, document types, publication types and workflows) and will therefore be most problematic and peculiar?
|
|
|
|
|
|
- Main text including inline features such as bold, italics (and their warrants?)
- Main text including inline features such as bold, italics (and their warrants?) as well as significant structural divisions (headers)
- Footnotes, endnotes and textual apparatus with their cross-references
- 'Textual objects' including figures, tables, structured lists (as represented typographically and by other means) with their cross-references
- Internal superstructure (parts, chapters, sections etc.) - determines scope(s) of reference(s) - w/ ToC
- Internal superstructure (parts, chapters, sections etc.) - determines scope(s) of reference(s) - ToC
- Bibliography / citations
- Specialized objects: math, formulae, drawings
- Specialized indexes
|
|
|
|
|
|
The beauty of listing them in order is that we can see that at least the low end can be addressed as "sloppy HTML" or HTML slops (messy, but nutritious) -- our target format of choice (see below).
|
|
|
The beauty of listing these in order is that we can see that at least the low end can be addressed as "sloppy HTML" or HTML slops (messy, but nutritious) -- our target format of choice (see below).
|
|
|
|
|
|
This is because WORDML IS NOT WHAT (practitioners call) GENERIC MARKUP
|
|
|
|
... | ... | @@ -53,7 +53,7 @@ Because our intermediate formats, however, will (also) be HTML, they may be imme |
|
|
|
|
|
Interestingly enough, we can do all of this with an XML-based, and specifically an XSLT-based, pipeline architecture. Not only that, but if we take care that our HTML5 outputs are also well-formed XML, we can attach the extraction component to further processes (including XSLT processes) to provide missing parts of a complete solution.
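
As a rough illustration of why well-formedness matters for chaining, here is a minimal sketch in Python (standing in for XSLT, and not XSweet's actual code). The step names `bold_to_b` and `drop_empty_paragraphs`, the `DEFAULT_STEPS` list and the `run_pipeline` function are all hypothetical, and the sample markup omits the XHTML namespace for brevity. Each step takes a parsed tree and returns a tree, so the default chain can run unattended, and any step can be swapped out or removed.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Sequence

# A step takes a parsed element tree and returns a (possibly modified) tree.
Step = Callable[[ET.Element], ET.Element]


def bold_to_b(root: ET.Element) -> ET.Element:
    """Hypothetical step: turn <span class="bold"> into <b>."""
    for el in list(root.iter("span")):
        if el.get("class") == "bold":
            el.tag = "b"
            del el.attrib["class"]
    return root


def drop_empty_paragraphs(root: ET.Element) -> ET.Element:
    """Hypothetical step: remove <p> elements with no children and no text.

    Note: any tail text following a removed <p> is dropped with it,
    which is acceptable for this sketch.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            is_empty = len(child) == 0 and not (child.text or "").strip()
            if child.tag == "p" and is_empty:
                parent.remove(child)
    return root


# The "lights out" default: an ordered list of steps that callers may replace.
DEFAULT_STEPS: Sequence[Step] = (bold_to_b, drop_empty_paragraphs)


def run_pipeline(markup: str, steps: Sequence[Step] = DEFAULT_STEPS) -> str:
    """Parse well-formed markup, apply each step in order, and serialize.

    ET.fromstring raises ET.ParseError if the input is not well-formed XML,
    which is exactly why the outputs of each stage need to be well-formed.
    """
    root = ET.fromstring(markup)
    for step in steps:
        root = step(root)
    return ET.tostring(root, encoding="unicode")


source = '<div><p>Some <span class="bold">bold</span> text</p><p> </p></div>'
print(run_pipeline(source))                      # default chain
print(run_pipeline(source, steps=[bold_to_b]))   # "rewired": one step only
```

Because every stage consumes and produces well-formed markup, the output of `run_pipeline` can be fed straight into another parser-based process; markup that is valid HTML5 but not well-formed XML (an unclosed `<br>`, say) would be rejected at the parse step.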
|
|
|
|
|
|
### Generic markup (Considered as one of the Fine Arts)
|
|
|
### Generic Markup (Considered as One of the Fine Arts)
|
|
|
|
|
|
As an illustration of our problem in general, consider a microcosmic view: an example reduced to the barest possible form. (The rest of the problem is much like this, only greatly magnified in scale and complexity.) Consider the following line:
|
|
|
|
... | ... | |