XSweet, issue #40
Created Oct 20, 2016 by Alex Theg (@atheg, Owner)

Dropped spaces due to long strings of repeated tags

The book by Green has 4 parts, each with an introductory section. Part 1 comes through rinsed really nicely and very clean, but Parts 2, 3, and 4 drop almost all the spaces.

For whatever reason, large portions of this book are being extracted with each word wrapped in its own tag. A big part of Part 1 gets extracted as long strings of <iCs> tags, with empty iCs tags for spaces:

                <iCs>remained</iCs>
                <iCs></iCs>
                <iCs>the</iCs>
                <iCs></iCs>
                <iCs>majority</iCs>
                <iCs></iCs>
                <iCs>population</iCs>
                <iCs></iCs>
                <iCs>of</iCs>
                <iCs></iCs>
                <iCs>many</iCs>

These all get collapsed into a single p tag with spaces preserved, and the HTML looks great.

Parts 2, 3, and 4, though, have strings of spans instead:

                <span style="font-size: 12pt">Safavids,</span>
                <span style="font-size: 12pt"></span>
                <span style="font-size: 12pt">and</span>
                <span style="font-size: 12pt"></span>
                <span style="font-size: 12pt">Uzbeks—seized</span>
                <span style="font-size: 12pt"></span>
                <span style="font-size: 12pt">control</span>
                <span style="font-size: 12pt"></span>

When this happens, the content ends up in one long p tag with no spaces.
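
For reference, here is a minimal Python sketch of the behavior I would expect for the span case: adjacent siblings with the same tag and attributes are merged into one element, and each empty element contributes a single space. This is not how XSweet does it (the pipeline is XSLT); the function name `collapse_runs` and the wrapping `<p>` are assumptions made only for illustration, and the sketch assumes the only text between siblings is indentation whitespace.

```python
import xml.etree.ElementTree as ET


def collapse_runs(parent):
    """Merge adjacent children that share tag and attributes.

    Hypothetical helper, not part of XSweet. An empty (or whitespace-only)
    child is treated as one space. Whitespace between siblings is assumed
    to be indentation only and is discarded.
    """
    merged = []
    for child in list(parent):
        text = child.text if (child.text or "").strip() else " "
        prev = merged[-1] if merged else None
        if prev is not None and prev.tag == child.tag and prev.attrib == child.attrib:
            prev.text += text  # same run: append to the element being built
        else:
            new = ET.Element(child.tag, dict(child.attrib))
            new.text = text
            merged.append(new)
    # Swap the original children for the merged ones.
    for child in list(parent):
        parent.remove(child)
    parent.extend(merged)
    return parent


# The span sample from this issue, wrapped in a <p> for parsing.
SAMPLE = """<p>
  <span style="font-size: 12pt">Safavids,</span>
  <span style="font-size: 12pt"></span>
  <span style="font-size: 12pt">and</span>
  <span style="font-size: 12pt"></span>
  <span style="font-size: 12pt">Uzbeks—seized</span>
  <span style="font-size: 12pt"></span>
  <span style="font-size: 12pt">control</span>
  <span style="font-size: 12pt"></span>
</p>"""

p = ET.fromstring(SAMPLE)
collapse_runs(p)
print("".join(p.itertext()).strip())  # Safavids, and Uzbeks—seized control
```

The same function applied to the iCs sample should give the spaced output that Part 1 already produces, which suggests the empty spans are being discarded before (or instead of) being treated as spaces.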

The introduction has a long string of <lang> tags, similar to the iCs tags above. They don’t cause dropped spaces.

This is related to #35 (closed) but may be caused by a different underlying issue.
