... | ... | @@ -138,7 +138,7 @@ So far so good - but what will that format actually be? It is not hard to envisi |
|
|
- `<b>Gene Roddenberry's <i>Star Trek</i></b>` (DITA)
|
|
|
- `<bold>Gene Roddenberry's <italic>Star Trek</italic></bold>` (JATS/BITS)
|
|
|
- `<emphasis role="bold">Gene Roddenberry's <emphasis>Star Trek</emphasis></emphasis>` (DocBook)
|
|
|
- ` <hi rend="bold">Gene Roddenberry's <hi rend="italic">Star Trek</hi></hi>` (TEI)
|
|
|
- `<hi rend="bold">Gene Roddenberry's <hi rend="italic">Star Trek</hi></hi>` (TEI)
|
|
|
- `<run format="b">Gene Roddenberry's <run format="i">Star Trek</run></run>` (made-for-purpose)
|
|
|
|
|
|
These are all more or less the same, or at any rate semantically equivalent, inasmuch as any one of them could be mapped to any of the others. Also note that none of them presents quite the level of semantic richness of the TEI and JATS examples above (which tell us, for example, that 'Star Trek' is a title, not merely italicized). This is merely what is called *presentational* markup. Yet maybe this is enough for a first step.
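
To make the equivalence claim concrete, here is a minimal Python sketch (not part of the project's code, which is XSLT) that maps any one of the inline notations listed above to any other. The `VOCABULARIES` table and the function names are hypothetical and cover only bold and italic.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: for each vocabulary, the (tag, attributes)
# pair that expresses each presentational feature.
VOCABULARIES = {
    "dita": {"bold": ("b", {}), "italic": ("i", {})},
    "jats": {"bold": ("bold", {}), "italic": ("italic", {})},
    "docbook": {"bold": ("emphasis", {"role": "bold"}), "italic": ("emphasis", {})},
    "tei": {"bold": ("hi", {"rend": "bold"}), "italic": ("hi", {"rend": "italic"})},
    "run": {"bold": ("run", {"format": "b"}), "italic": ("run", {"format": "i"})},
}


def feature_of(elem, vocab):
    """Return the feature ('bold' or 'italic') that elem expresses in vocab."""
    for feature, (tag, attrs) in VOCABULARIES[vocab].items():
        if elem.tag == tag and dict(elem.attrib) == attrs:
            return feature
    raise ValueError(f"unrecognized element <{elem.tag}> in vocabulary {vocab!r}")


def convert(elem, source, target):
    """Rebuild elem and its descendants using the target vocabulary."""
    tag, attrs = VOCABULARIES[target][feature_of(elem, source)]
    out = ET.Element(tag, dict(attrs))
    out.text, out.tail = elem.text, elem.tail
    for child in elem:
        out.append(convert(child, source, target))
    return out


def translate(xml_text, source, target):
    """Parse xml_text as the source vocabulary and serialize it as the target."""
    root = convert(ET.fromstring(xml_text), source, target)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(translate("<b>Gene Roddenberry's <i>Star Trek</i></b>", "dita", "tei"))
```

Because every entry in the table carries the same two features and nothing more, the mapping is lossless in both directions, which is what "semantically equivalent" amounts to here.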
|
... | ... | @@ -167,14 +167,14 @@ Why does HTML make a good target vocabulary? |
|
|
|
|
|
- We can easily leave our documents 'flat' as long as we need to - structure can come later! (This is a key distinction vs our final target format)
|
|
|
- It has `@class` and `@style`, fantastic escape hatches!
|
|
|
- One of the escape hatches gives us CSS! (and we are describing presentational features.) While the other can expose Word Styles (since it is for user-driven semantic labeling)
|
|
|
- Yet at the same time, HTML semantics are not so rich as to be very arguable (anything will do)
|
|
|
- HTML @style invites us to use CSS! (And we are describing presentational features. Perfect.) While HTML @class can expose Word Styles (since it is for user-driven semantic labeling after all).
|
|
|
- Yet at the same time, HTML semantics are not so rich as to be very arguable (anything will do); we should be able to avoid complications regarding the purported conformance/orthodoxy of our outputs to applicable standards. (Also a separable problem.)
|
|
|
- To top it off, HTML is a well-known vernacular --
|
|
|
- A custom vocabulary would have to be designed, tested, documented and learned by users; HTML lets us just fake it for now*;
|
|
|
- And since we are expecting to edit (at least initially) on an HTML platform, and go from there when it comes to other formats for interchange/archiving - we can just stick with that plan ...
|
|
|
- ... Illustrating the point: anyone can use HTML5 (especially wf XML HTML5) so let's use that
|
|
|
- And since we are expecting to work further with our data on an HTML platform, and go from there when it comes to other formats for interchange/archiving - we can just stick with that plan ...
|
|
|
- ... Illustrating the point: anyone can use HTML5 (especially well-formed XML HTML5), so let's use that.
|
|
|
|
|
|
(* Later if need be we can come back to formalize the target format as a profile of HTML5.)
|
|
|
(* Later if need be we can come back to formalize the target format as a profile of HTML5+CSS.)
|
|
|
|
|
|
Note the non-canonical and arguably deprecated heavy use of @style - we justify this on the grounds that we are going *uphill* and *by the time we reach the top* we can *cast these properties aside as nothing more than the engine that has got us there*.
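
As an illustration of the two escape hatches, the following Python sketch shows one way a run of text could be written out with its Word style name in `@class` and its presentational properties in `@style`. The `CSS_FOR_PROPERTY` table and the `span_for_run` function are hypothetical names invented for this example; the actual extraction is expected to be done in XSLT.

```python
from html import escape

# Hypothetical mapping from run-level formatting properties to CSS declarations.
CSS_FOR_PROPERTY = {
    "bold": "font-weight: bold",
    "italic": "font-style: italic",
}


def span_for_run(text, properties=(), word_style=None):
    """Serialize one run of text as an HTML span.

    The Word style name (user-assigned labeling) goes into @class;
    presentational properties go into @style as CSS.
    """
    attrs = []
    if word_style:
        attrs.append(f'class="{escape(word_style, quote=True)}"')
    css = "; ".join(CSS_FOR_PROPERTY[p] for p in properties)
    if css:
        attrs.append(f'style="{escape(css, quote=True)}"')
    attr_text = "".join(" " + a for a in attrs)
    return f"<span{attr_text}>{escape(text, quote=False)}</span>"


if __name__ == "__main__":
    print(span_for_run("Star Trek", ["italic"], word_style="title.cited"))
```

Keeping the two kinds of information in separate attributes means a later step can discard `@style` wholesale without touching the labels carried by `@class`.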
|
|
|
|
... | ... | @@ -184,3 +184,8 @@ Next to these, the fact that HTML also has an element-type semantics albeit an i |
|
|
<p class="listing">Gene Roddenberry's <span class="title.cited">Star Trek</span></p>
|
|
|
```
|
|
|
|
|
|
## How it will work
|
|
|
|
|
|
XSLT (specifically XSLT 2.0) is a great tool for this job. The most powerful, flexible and generic (hence portable) approach would combine XSLT stylesheets in a pipeline: a multi-step process beginning with "data extraction" (reading the data from the WordML and re-expressing it, as 'literally' as possible, in HTML+CSS), and then proceeding through as many subsequent "refinement" steps as necessary, in which the markup is cleaned up and enhanced. One advantage of a pipeline arrangement is how easy it is to extend and modify (by changing or adding steps only where necessary).
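
The shape of such a pipeline can be sketched in a few lines. The example below uses Python as a stand-in for XSLT, since the point is the arrangement, not the transformation language. Both refinement steps, the `STEPS` list and the assumption that `Normal` is a default style name worth dropping are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def italic_spans_to_i(root):
    """Refinement step: promote spans that are only italic to <i>."""
    for elem in root.iter("span"):
        if elem.attrib == {"style": "font-style: italic"}:
            elem.tag = "i"
            del elem.attrib["style"]
    return root


def strip_default_class(root, default="Normal"):
    """Refinement step: drop @class when it only names the (assumed) default style."""
    for elem in root.iter():
        if elem.get("class") == default:
            del elem.attrib["class"]
    return root


# The pipeline is just an ordered list of steps; extending it means
# inserting or replacing an entry here.
STEPS = [italic_spans_to_i, strip_default_class]


def run_pipeline(root, steps=STEPS):
    """Feed the output of each step into the next."""
    return reduce(lambda tree, step: step(tree), steps, root)


if __name__ == "__main__":
    extracted = ET.fromstring(
        '<p class="Normal">Gene Roddenberry\'s '
        '<span style="font-style: italic">Star Trek</span></p>'
    )
    print(ET.tostring(run_pipeline(extracted), encoding="unicode"))
```

Each step takes a tree and returns a tree, so steps can be tested in isolation and reordered or swapped without touching their neighbors.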
|
|
|
|
|
|
Pipelines are logical organizations and can be implemented in many different ways. With our code base we will offer pipeline configurations using both the Bourne-again shell (bash) and W3C XProc, for reference and component testing; we also expect to use our sibling project INK as a pipelining architecture, making XSweet available as a service to anyone using INK. |