... | ... | @@ -2,17 +2,17 @@ |
|
|
|
|
|
Not a formal subset or profile of HTML or any other formally specified language, but an *approach* to using HTML and web pages.
|
|
|
|
|
|
HTML Typescript has nothing to do with Word documents or word processing documents - except that it is designed so that recognizable HTML Typescript documents are relatively *easy* and *straightforward* to produce reliably from Word OpenXML.
|
|
|
HTML Typescript has nothing to do with Word documents or word processing documents - except that it is designed so that recognizable HTML Typescript documents are relatively *easy* and *straightforward* to produce reliably from Word OpenXML. So while there is no *formal* dependency, there is something of a practical one. Our first use for HTML Typescript is to make available data that is now locked up in Word - so it has to work for that.
|
|
|
|
|
|
In particular, this means that insofar as information available only from Word documents is ambiguous and underspecified, an HTML Typescript version must mirror (exactly) that ambiguity and underspecification. It should tell us (albeit in HTML+CSS, a language we can understand) exactly what we could know from the Word document (if we understood WordML) - and no more. It should not make logical leaps, inferences, or even much in the way of translation.
|
|
|
Among other things, this means that insofar as information available only from Word documents is ambiguous and underspecified, an HTML Typescript version must mirror (exactly) that ambiguity and underspecification. It should tell us (albeit in HTML+CSS, a language we can understand) exactly what we could know from the Word document (if we understood WordML) - and no more. It should not make logical leaps, inferences, or even do very much in the way of translation.
|
|
|
|
|
|
### Design principles
|
|
|
|
|
|
A typewriter can be used to create an artifact (namely a typed MS or typescript) amenable to a type- and print-based publication process. Although a material object, the typescript also typically provides for a kind of *encoding* (indeed in more than one modality) by which it can communicate intentions from author to editors. Of course a typescript is also a "platform" for changes, with a kind of "production loop" built around it -- wherein it might be said to evolve in form, from submitted "fair copy" to galleys to printed production.
|
|
|
A typewriter can be used to create an artifact (namely a typed MS or typescript) amenable to a type- and print-based publication process. Although a material object, the (manipulated, annotated) typescript also provides for a kind of *encoding* (indeed in more than one modality) by which it can communicate intentions, among authors and from authors to editors. Thus a typescript becomes an (almost literal) "platform" for the (print) production process, with what we would recognize as a kind of "production loop" built around it -- wherein it might be said to evolve in form, from submitted "fair copy" to galleys to printed production.
|
|
|
|
|
|
We aim to provide the same sort of "paper functionality" in HTML. It is not -- quite -- an electronic doodle pad, nor a publishing application, but something in between. What we see are recognizably documents, with the features of formatted documents. But look at the code and you'll see -- once you get past the sheer verbosity of it -- the data is not in a formally controlled or even very regular arrangement. Paradoxically, since no structure is imposed or formal control exerted, what stand out are consistencies in the *way things are made to appear* (when you 'hit print' or view in a commodity browser) -- and as it turns out, these consistencies are precisely the guideposts we want, to make further inferences.
|
|
|
We aim to provide the same sort of "paper functionality" in HTML. It is not -- quite -- an electronic doodle pad, nor a publishing application (either print or any other kind), but something in between. What we see are recognizably documents, with the features of formatted documents. But look at the code and you'll see -- once you get past the sheer verbosity of it -- the data is not in a formally controlled or even very regular arrangement. Paradoxically, since no structure is imposed or formal control exerted, what stand out are consistencies in the *way things are made to appear* (when you 'hit print' or view in a commodity browser) -- and as it turns out, these consistencies are precisely the guideposts we want, to make further inferences.
|
|
|
|
|
|
We'll do this using basic-brain-dead HTML/CSS: basically a few structural divs for framing, then `p` elements with an assortment of inline mixed content including `b`, `i`, `u`. We will resort to CSS to describe things that are not easily described using tags alone (such as margins and indents on paragraphs). But nothing should be obscure to the web developer.
|
|
|
We'll do this using basic-brain-dead HTML/CSS: basically a few structural divs for framing, then `p` elements with an assortment of inline mixed content including `b`, `i`, `u`. We will resort to CSS to describe things that are not easily described using tags alone (such as margins and indents on paragraphs). But nothing should be obscure to the web developer. It should all be familiar stuff -- a bunch of "bad habits" that have to be cleaned up and improved.
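To make the description above concrete, here is a minimal sketch in Python. The sample markup, the class name `ShapeReport`, and the function `describe` are all hypothetical illustrations, not part of any HTML Typescript tooling. The sample shows roughly what such a document might look like (one framing `div`, flat `p` elements, presentational inline tags, inline CSS for margins and indents), and the parser reports which elements appear and how deeply they nest.

```python
from html.parser import HTMLParser

# Hypothetical sample: one framing div, flat paragraphs, presentational
# inline tags, and inline CSS for what tags alone cannot express.
SAMPLE = """
<div class="body">
  <p style="margin-left: 0.5in; text-indent: -0.5in"><b>1.</b> A <i>hanging</i> indent.</p>
  <p style="text-align: center"><u>A centered, underlined line</u></p>
  <p>Plain text with a <span style="font-size: 14pt">larger run</span>.</p>
</div>
"""


class ShapeReport(HTMLParser):
    """Collects the element names used and the deepest nesting level reached.

    Assumes every element is explicitly closed; void elements such as <br>
    written without a closing slash would inflate the depth count.
    """

    def __init__(self):
        super().__init__()
        self.tags = set()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        self.depth -= 1


def describe(html):
    """Return (sorted element names, maximum nesting depth) for a document."""
    report = ShapeReport()
    report.feed(html)
    report.close()
    return sorted(report.tags), report.max_depth


tags, depth = describe(SAMPLE)
print(tags, depth)
```

A small element vocabulary and a shallow depth are what one would expect from a document in this style; a long list of element names or deep nesting would suggest the data has already moved on to something more structured.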
|
|
|
|
|
|
### Variability
|
|
|
|
... | ... | @@ -20,9 +20,9 @@ HTML Typescript isn't one thing, because it is a transitional format. (So the sa |
|
|
|
|
|
* At least early in editing, it will be mostly flat. (No application-oriented scaffolding to speak of, and certainly not lots of deeply nested divs.)
|
|
|
|
|
|
* Not much richness of tagging. No HTML5 'semantic' elements such as 'header' or 'aside'. Mostly just `p` elements with inline elements, `span` and the like.
|
|
|
* Not much richness of tagging. No HTML5 'semantic' elements such as `header` or `aside`. Mostly just `p` elements with inline elements, `span` and the like.
|
|
|
|
|
|
In other words, this looks much like the kind of language you would use in a simple program to control basic print layout -- maybe something like a "word processor" except without all the application's superstructure and internal weirdnesses. (See example below.) In effect the static is turned way way down so you can actually hear the signal.
|
|
|
In other words, this looks much like the kind of language you would use in a simple program to control basic print layout -- maybe something like a "word processor" except without all the application's superstructure and internal weirdnesses. (See example below.)
|
|
|
|
|
|
* The tagging will be *presentational*. For many purposes in document processing, of course, this is a complete no-no! But in HTML Typescript, we like to see presentational tagging as long as working with the data still includes a forensic process -- that is, as long as we are still interested in "what the author wrote (in the Word document)". (Why? Because putting that formatting in is what the author did.) In other words, in converting data this is information we want to hang onto, at least until we know for sure we don't want it (because we know it is meaningless or we have captured its meaning a better way).
|
|
|
|
... | ... | |