XSweet issues
https://gitlab.coko.foundation/XSweet/XSweet/-/issues
2023-02-21T07:09:18Z

https://gitlab.coko.foundation/XSweet/XSweet/-/issues/177
Ingest Docx files containing binary math
2023-02-21T07:09:18Z
Author: Ryan Dix-Peek

**Issue description:** the purpose of this task is to support the import of Docx files that contain binary math (in a format supported by MathType) and to **view** the math in Wax.
Potential solutions: use the XSweet pipeline to extract the gif files on import and display the formulas as images in Wax.
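
As a rough illustration of the extraction step, the sketch below reads the image files embedded in a .docx package. It assumes the equation previews are stored as ordinary media parts under `word/media/`, the usual location inside a .docx zip. Whether the previews in a given file are really GIFs (rather than, say, WMF) is not confirmed here, so the extension filter is a parameter. The function name `extract_media` is hypothetical and is not part of XSweet.

```python
import zipfile
from pathlib import PurePosixPath


def extract_media(docx, extensions=(".gif",)):
    """Return {part name: bytes} for media parts in a .docx package.

    `docx` is a path or a binary file-like object. `extensions` is an
    assumption: adjust it to whatever image types the file really holds.
    """
    wanted = tuple(ext.lower() for ext in extensions)
    found = {}
    with zipfile.ZipFile(docx) as package:
        for name in package.namelist():
            part = PurePosixPath(name)
            # Media parts conventionally live under word/media/ in the zip.
            if part.parts[:2] == ("word", "media") and part.suffix.lower() in wanted:
                found[name] = package.read(name)
    return found
```

The returned bytes could then be written out or handed to whatever step places the images in the HTML that Wax displays.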
[BinaryMath.docx](/uploads/73878e990b94ef590ae6e0e7c868a4ab/BinaryMath.docx)
Error message on import into Kotahi:
![Screenshot_2022-05-31_at_09.24.38](/uploads/b77b5e296378ad792cace63249b462ce/Screenshot_2022-05-31_at_09.24.38.png)
Wax content view:
![Screenshot_2022-05-31_at_11.54.33](/uploads/d14c37fc4857ac9ae99bd6f8ad81197b/Screenshot_2022-05-31_at_11.54.33.png)

Assignee: Bharathydasan

https://gitlab.coko.foundation/XSweet/XSweet/-/issues/176
Import PDF & convert to Kotahi's HTML profile
2022-06-08T17:04:19Z
Author: Ryan Dix-Peek

**Description:** the purpose of this task is to support the import of PDFs into Kotahi through integration with Sciencebeam. Sciencebeam supports the conversion of PDFs to XML. We require conversion of PDF to HTML (Kotahi HTML profile specifically).
Suggested solution: XSweet accepts docx, but the remaining pipelines support HTML. Convert PDF to HTML and then feed the output through XSweet for the doc clean-up: PDF -> TEI-XML -> Docx -> XSweet -> Wax
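
One of the acceptance criteria for this task is extracting the title, abstract and author names. Assuming the intermediate TEI-XML follows the common TEI header layout (`teiHeader` with `titleStmt`, `sourceDesc` and `profileDesc`), a metadata lookup might be sketched as below. The element paths are an assumption about what Sciencebeam emits and should be checked against real output; `extract_metadata` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def _text(element):
    """Join all text inside an element, normalising whitespace."""
    if element is None:
        return ""
    return " ".join(" ".join(element.itertext()).split())


def extract_metadata(tei_xml):
    """Pull title, abstract and author names out of a TEI document string."""
    root = ET.fromstring(tei_xml)
    header = root.find("tei:teiHeader", TEI_NS)
    if header is None:
        return {"title": "", "abstract": "", "authors": []}
    title = header.find(".//tei:titleStmt/tei:title", TEI_NS)
    abstract = header.find(".//tei:profileDesc/tei:abstract", TEI_NS)
    # Assumed location of author names; adjust to the actual TEI output.
    names = header.findall(".//tei:sourceDesc//tei:author/tei:persName", TEI_NS)
    return {
        "title": _text(title),
        "abstract": _text(abstract),
        "authors": [_text(name) for name in names],
    }
```

The resulting dictionary maps directly onto the submission form fields named in the acceptance criteria.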
**Acceptance criteria:**
- Ensure HTML is accessible in Wax.
- Extract manuscript metadata and populate the submission form, i.e. title, abstract and/or author name data.

Assignee: Suki Venkat

https://gitlab.coko.foundation/XSweet/XSweet/-/issues/173
Copyediting cleanups are not suitable for Spanish language
2022-05-16T09:15:37Z
Author: Sofia Olguin

## Context
The [HTMLevator copyediting cleanups](https://xsweet.org/documentation/htmlevator/) do the following:
>Any number of spaces before or after em dashes are removed
This is suitable for English-language texts, but in Spanish it generates errors in all the dialogs. In Spanish, dialogs are written like this:
—Hola —dijo el joven.
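
To make the problem concrete, here is a minimal Python sketch. It is not the HTMLevator implementation (which is XSLT): the regular expression is only an assumed stand-in for the rule quoted above, and the `lang` parameter and `KEEP_DASH_SPACES` set are hypothetical names for a per-language setting.

```python
import re

EM_DASH = "\u2014"

# Hypothetical configuration: languages whose dialog punctuation relies on
# spaces around em dashes, so the cleanup should leave them alone.
KEEP_DASH_SPACES = {"es"}

_SPACES_AROUND_DASH = re.compile(r"[ \t]*" + EM_DASH + r"[ \t]*")


def clean_em_dashes(text, lang="en"):
    """Remove spaces around em dashes unless the language opts out."""
    if lang in KEEP_DASH_SPACES:
        return text
    return _SPACES_AROUND_DASH.sub(EM_DASH, text)
```

With the default rule, the Spanish line above loses the space before its second dash; with `lang="es"` it is returned unchanged.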
To reproduce:
- Upload the attached Word file in Editoria.
- Check the chapter in Editoria and see how the space characters disappear.
[dashSpace.docx](/uploads/7af2653f4e107dcf0ba33789e6b1ceb7/dashSpace.docx)
## Suggested solution
The HTMLevator copyediting cleanups should be configurable to support:
* different use cases between languages
* different use cases between editorial house style guides

Assignee: Dione Mentis (dione@coko.foundation)

https://gitlab.coko.foundation/XSweet/XSweet/-/issues/158
Some more Word detritus
2018-10-10T04:56:20Z
Author: Wendell Piez

To catch WordML elements so far unaccounted for, we should consider the following matching and whether there is info to be captured, e.g. from `caps` or `highlight`. The rest should be cleaned up in a "scrub" phase:
```
<!-- Drop the wrapper but keep processing its content. -->
<xsl:template match="noProof | iCs">
  <xsl:apply-templates/>
</xsl:template>

<!-- Keep these as spans named after the source element. -->
<xsl:template match="caps | spacing | highlight | webHidden">
  <span class="{local-name()}">
    <xsl:apply-templates/>
  </span>
</xsl:template>
```

https://gitlab.coko.foundation/XSweet/XSweet/-/issues/141
Extract a default font from Word docs
2018-04-24T06:41:30Z
Author: Alex Theg

I believe Word applies the "Normal" style to text by default when no other style is specified. We could extract the default font and apply it to text for which no other font is specified. Currently, this text displays in the browser's default font.
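
As a rough sketch of where such a default could come from: in a .docx package, `word/styles.xml` can declare document-wide run defaults (`w:docDefaults`) and a default paragraph style (marked `w:default="1"`, normally "Normal"). The Python below reads the `w:ascii` font from either place. The function name and the lookup order (default style first, then document defaults) are assumptions made for illustration; theme fonts and style inheritance via `w:basedOn` are ignored.

```python
import xml.etree.ElementTree as ET
import zipfile

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _ascii_font(parent):
    """Return the w:ascii font named in parent/w:rPr/w:rFonts, or None."""
    if parent is None:
        return None
    fonts = parent.find("w:rPr/w:rFonts", NS)
    return fonts.get(f"{{{W}}}ascii") if fonts is not None else None


def default_font(docx):
    """Best-effort guess at the font Word uses for otherwise unstyled text."""
    with zipfile.ZipFile(docx) as package:
        styles = ET.fromstring(package.read("word/styles.xml"))
    # 1. The default paragraph style (usually "Normal").
    for style in styles.findall("w:style", NS):
        if (style.get(f"{{{W}}}type") == "paragraph"
                and style.get(f"{{{W}}}default") in ("1", "true")):
            font = _ascii_font(style)
            if font:
                return font
    # 2. Document-wide run defaults.
    return _ascii_font(styles.find("w:docDefaults/w:rPrDefault", NS))
```

The result could then be emitted as a `font-family` on the document body, so that text without an explicit font no longer falls back to the browser default.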
Putting this on hold as a future development.

https://gitlab.coko.foundation/XSweet/XSweet/-/issues/49
Boxes, borders and rules
2022-04-22T04:57:40Z
Author: Wendell Piez

We haven't seen any cases of boxes, borders or rules, but that doesn't mean they won't show up.
Since they can be expressed in CSS, there's an argument we should be capturing them.
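
For illustration only, here is how one border edge might be expressed in CSS. WordprocessingML borders carry a style (`w:val`), a width (`w:sz`, in eighths of a point) and a color (`w:color`). The style table below is deliberately partial, and the fallback to `solid` and the function name `border_to_css` are assumptions, not anything XSweet does today.

```python
# Partial, assumed mapping from WordprocessingML border styles to CSS.
BORDER_STYLES = {
    "single": "solid",
    "thick": "solid",
    "double": "double",
    "dotted": "dotted",
    "dashed": "dashed",
}


def border_to_css(side, val, sz, color="auto"):
    """Translate one Word border edge into a CSS declaration.

    side  -- "top", "left", "bottom" or "right"
    val   -- the w:val border style, e.g. "single"
    sz    -- the w:sz width in eighths of a point
    color -- the w:color value: a hex RGB string or "auto"
    """
    if val in ("nil", "none"):
        return f"border-{side}: none"
    style = BORDER_STYLES.get(val, "solid")  # assumed fallback
    width = f"{int(sz) / 8:g}pt"
    css_color = "currentColor" if color == "auto" else f"#{color}"
    return f"border-{side}: {width} {style} {css_color}"
```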
We may have to create an artificial test sample with some made-up stuff just so we can get a look.

Assignee: Wendell Piez

https://gitlab.coko.foundation/XSweet/XSweet/-/issues/42
Handle highlighting
2020-06-03T15:08:18Z
Author: Alex Theg

Opening this issue because I'm looking at an example, but I'm going to put it on hold for now as it's a low priority.
From Green, Ch 1, "Fig. 6 about here" is highlighted green in Word and comes through as a highlight tag in the HTML, but is not actually displayed as highlighted:
```html
<p style="font-weight: bold; font-size: 18pt">
<highlight>[Fig. 6 about here.]</highlight>
</p>
```
1. Should we try to catch highlighting?
2. If so, do we care about preserving the original color?
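
If the answer to both questions turns out to be yes, one possible shape for the output is sketched below. Word stores a highlight as a named color (for example `green`) on `w:highlight`. The CSS values in the table, the choice of a `<span>` with an inline `background-color`, and the function name `highlight_span` are all assumptions for illustration, not XSweet's behavior.

```python
from html import escape

# Assumed CSS equivalents for a few of Word's named highlight colors.
HIGHLIGHT_CSS = {
    "yellow": "#ffff00",
    "green": "#00ff00",
    "cyan": "#00ffff",
    "magenta": "#ff00ff",
    "red": "#ff0000",
    "lightGray": "#c0c0c0",
}


def highlight_span(text, word_color=None, keep_color=True):
    """Wrap text in a span that a browser can render as highlighted."""
    css = HIGHLIGHT_CSS.get(word_color) if keep_color else None
    if css is None:
        # Unknown color, or color deliberately dropped: leave styling to a stylesheet.
        return f'<span class="highlight">{escape(text)}</span>'
    return f'<span class="highlight" style="background-color: {css}">{escape(text)}</span>'
```

Passing `keep_color=False` corresponds to answering "no" to the second question: the highlight is still marked, but its original color is left to the stylesheet.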