... | ... | @@ -22,13 +22,13 @@ HTML Typescript isn't one thing, because it is a transitional format. (So the sa |
|
|
|
|
|
* Not much richness of tagging. No HTML5 'semantic' elements such as `header` or `aside`. Mostly just `p` elements with inline elements, `span` and the like.
|
|
|
|
|
|
In other words, this looks much like the kind of language you would use in a simple program to control basic print layout -- maybe something like a "word processor" except without all the application's superstructure and internal weirdnesses. (See example below.)
|
|
|
|
|
|
* The tagging will be *presentational*. For many purposes in document processing this is of course a complete no-no! But in HTML Typescript, we like to see presentational tagging as long as working with the data still includes a forensic process -- that is, as long as we are still interested in "what the author wrote (in the Word document)". (Why? Because putting that formatting in is what the author actually did.) In other words, in converting data this is information we want to hang onto, at least until we know for sure that we don't want it (because we know it is meaningless or we have captured its meaning a better way).
|
|
|
|
|
|
And of course cleaning up the cruft is exactly how we get "nicer" HTML Typescript out of "noisier" HTML Typescript.
|
|
|
|
|
|
By 'presentational' of course we mean that tagging is devoted to describing presentational or "formatting" (and generally 'visual') properties of the text, without abstract labeling of "semantic" categories. Font shifts and margins are the big ones.
|
|
|
|
|
|
* Structure will be "hidden" in presentational features. For example, margin shifts may indicate things like block quotes or excerpts.
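As a rough illustration of structure "hidden" in presentational features, the sketch below scans HTML for `p` elements whose `style` sets a left margin and lists them as *candidate* block quotes. It is only a sketch: the names `parse_style` and `IndentedParagraphFinder` and the sample markup are invented for this example, and a left margin is treated as a hint for a later cleanup step to examine, not as proof of a quotation.

```python
from html.parser import HTMLParser


def parse_style(style):
    """Split a style attribute value into a {property: value} dict."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip()
    return props


class IndentedParagraphFinder(HTMLParser):
    """Collect the text of every p element whose style sets margin-left."""

    def __init__(self):
        super().__init__()
        self.candidates = []   # text of paragraphs that carry a left margin
        self._buffer = None    # text chunks gathered while inside such a p

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            style = dict(attrs).get("style") or ""
            if "margin-left" in parse_style(style):
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            self.candidates.append("".join(self._buffer).strip())
            self._buffer = None


# Hypothetical sample input, standing in for converted Word output.
sample = (
    '<p>As the author says:</p>'
    '<p style="margin-left: 36pt">A passage set off by <i>indentation</i> alone.</p>'
    '<p>And the text resumes.</p>'
)

finder = IndentedParagraphFinder()
finder.feed(sample)
finder.close()
print(finder.candidates)
```

Note that the sketch flags any `margin-left` declaration, including a zero value; a real cleanup step would presumably compare the margin against the document's normal paragraph indent before drawing any conclusion.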
|
|
|
|
... | ... | @@ -92,10 +92,8 @@ But this is no longer HTML Typescript. (It's something more like "HTML Galley Pr |
|
|
* Cross-references are emulated for footnotes and endnotes (as regular structures created as such in the Word document), but by and large, there will be no linking or cross-referencing implemented where they are warranted in the source. Where the source data has a string, we get a string and nothing more. (We expect robust linking mechanisms to come later when data is better structured and more regular.)
|
|
|
* `class` overloading is okay. `class` is convenient for any semantic labeling. We will be more or less shameless in adding whatever values we think we need, to communicate downstream.
|
|
|
* Similarly, `style` and other *presentational* formatting is not only acceptable; for some applications it is preferred. Word has the feature that any arbitrary span or segment of text may be assigned its own properties for presentation. (And writers using Word do this a lot.) This has a straightforward analogue in the HTML `@style` attribute, which is (for ordinary everyday purposes) discouraged. It permits us to hang CSS to describe formatting basically wherever we like [example]
|
|
|
|
|
|
In effect, CSS provides us the language to expose a range of features in our source data, in a way familiar to developers and readily processed using tools they already have. So if the Word document says "1 inch left margin", we can turn that into "margin-left: 72pt" or indeed "margin-left: 1in" in CSS.
|
|
|
|
|
|
Indeed, the very reasons why `@style` and presentational tagging in general should be avoided in "good" markup (*at best* they represent *work to be done*, while at worst they are misleading cruft) are the very same reasons why presentational tagging, `i`, `b` and all that species, is acceptable and indeed preferable as a representation (as 'transparent' as possible) of Word source data. That is what the Word data has, and we need to see exactly what is there prior to "casting" it into anything.
|
|
|
|
|
|
The expectation of HTML Typescript is that we wish the thing to be a basis for improvement, not a finished thing in itself.
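To make the margin example above concrete, here is a minimal sketch of turning a Word left indent into a CSS declaration. It assumes the indent arrives as a count of twips (twentieths of a point, so 1440 twips to the inch), which is how WordprocessingML records indents; the function names are hypothetical and not part of any existing tool. CSS spells the property `margin-left`.

```python
TWIPS_PER_POINT = 20    # a twip is one twentieth of a point
POINTS_PER_INCH = 72    # CSS convention: 1in = 72pt


def format_number(value):
    """Render 72.0 as '72' and 4.5 as '4.5'."""
    return f"{value:g}"


def left_margin_css(twips, unit="pt"):
    """Return a CSS margin-left declaration for a left indent given in twips."""
    points = twips / TWIPS_PER_POINT
    if unit == "pt":
        return f"margin-left: {format_number(points)}pt"
    if unit == "in":
        return f"margin-left: {format_number(points / POINTS_PER_INCH)}in"
    raise ValueError(f"unsupported unit: {unit}")


print(left_margin_css(1440))             # margin-left: 72pt
print(left_margin_css(1440, unit="in"))  # margin-left: 1in
```

Either form says the same thing; which unit to emit is a matter of what reads best downstream, since the point is only to carry the author's formatting across without interpreting it yet.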
|
|
|
|
... | ... | |