
Add SILE format support #6087

@alerque


Back in 2015 I cloned Pandoc and started hacking on a fork to support The SILE Typesetter. It's now 2020 and this issue is to track the overall progress on that effort with the goal of getting it upstreamed into Pandoc. I'm hesitant to drop links here because the only thing most people are going to see is that I'm a terrible Haskell hack. I'm pretty sure the commit history will reveal that I'm not much better than a bunch of circus monkeys pushing random buttons hoping something works. But let's face it, this is never going to get contributed unless I ⓐ get some help and ⓑ am motivated by the exposure.

SILE support has taken several shapes in my hacking, and long term I think it should take on one or two more. I initially started with a copy of the LaTeX writer. As time has gone on I've been systematically stripping things from it because SILE is fundamentally simpler than LaTeX. I almost wish now I'd started with a blank slate and built it one rule at a time, and maybe that's still the way to get this contributed. I could use somebody to hold my hand through the process!

I've had this working in full scale production since 2016. I now have 3 separate publishing companies using it as their exclusive book publishing workflow, keeping full length book projects in Markdown and using Pandoc to convert them for SILE to typeset to go to press. Yes, it is a bit of a mess, but the proof of concept has far outlived the 'concept only' stage. The initial work was based on Pandoc 1.15; the current work is rebased onto master and works with 2.9.1.

Currently only the Writer is in any semblance of working condition. I hacked on a Reader, but that's going to be much more complicated than the Writer, and I have no production use for it. As far as I'm concerned a Writer with no Reader is enough to get started. I'm sure someday a Reader will help somebody, but I don't want it to hold up getting the Writer upstreamed.

To get started:

  • SILE TeX-like Writer
  • Raw SILE support in other formats
  • Documentation
  • Tests

Later:

  • Direct PDF output workflow
  • SILE XML Writer
  • SILE Reader(s)

SILE supports two input formats, XML and a TeX-like format that is much more human writable. Eventually it would be nice to support both, but I've been concentrating on the TeX-like syntax. While bearing a lot of resemblance to actual TeX, the format is a lot more consistent and flexible. It is not a circus full of magic ponies, and it is not a Turing complete language, except that you can embed Lua code, so it has that going for it.

Unlike TeX:

  • Unicode input is expected; there are no fancy substitutions or character encoding issues to muck around with. A copyright symbol is inserted with © not \copyright, --- is three hyphens not an em-dash, which would be inserted as an actual —.
  • Only 4 characters are special: \, {, }, and %. All can be escaped in input with a backslash.
  • All commands can use 'environment' or 'command' syntax interchangeably. \begin{font}foo\end{font} and \font{foo} are the same.
  • All commands use the same syntax. None of them have monkey business like extra content groupings. All of them receive arguments the same way. \command[key=val,key="val,with,commas"]{content}. Both the options block and the content are optional: \font, \font[], \font{}, and \font[]{} are always acceptable syntax variants.
  • There is no preamble. Anything can be set anywhere as long as it is set before use. Packages can be loaded at any time as long as they are loaded before any commands defined by them are used.
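
The escaping and command-syntax rules in the list above are simple enough to sketch in a few lines of plain Lua. This is only an illustration, not code from the Pandoc writer or from SILE: the helper names escape, quote, and command are made up for this example, and it assumes option values only need quoting when they contain a comma.

```lua
-- Hypothetical helpers, not part of SILE or Pandoc.

-- Escape the four special characters (\ { } %) with a backslash.
local function escape (text)
  -- "%%" is a literal percent inside a Lua pattern set; "%0" is the whole match.
  return (text:gsub("[\\{}%%]", "\\%0"))
end

-- Assumption: an option value is quoted only when it contains a comma.
local function quote (value)
  if value:find(",", 1, true) then
    return '"' .. value .. '"'
  end
  return value
end

-- Serialize \name[key=val,...]{content}. `options` is an array of
-- { key, value } pairs (to keep the output order stable); both `options`
-- and `content` may be nil, mirroring the optional parts of the syntax.
local function command (name, options, content)
  local out = "\\" .. name
  if options and #options > 0 then
    local parts = {}
    for _, pair in ipairs(options) do
      parts[#parts + 1] = pair[1] .. "=" .. quote(pair[2])
    end
    out = out .. "[" .. table.concat(parts, ",") .. "]"
  end
  if content then
    out = out .. "{" .. escape(content) .. "}"
  end
  return out
end

print(command("font"))
-- \font
print(command("command", { { "key", "val" }, { "other", "val,with,commas" } }, "50% {off}"))
-- \command[key=val,other="val,with,commas"]{50\% \{off\}}
```

The sketch treats content as plain text, so nested commands would have to be serialized before escaping in a real writer.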

In the last 5 years I've also gotten deeply involved in SILE development (and my Lua skills are definitely better than my Haskell ones). Eventually it dawned on me that rather than trying to teach Pandoc a bunch of fancy work-arounds for data types SILE didn't know much about, it would be a lot easier to build first class support for everything Pandoc knows about into SILE. This creates a bit of a cart-horse problem in that both sides need to coordinate and compatibility needs to match. The released version of SILE has a package available to cover the current state of the Pandoc writer. I can keep iterating on that to support whatever final form gets officially upstreamed, but I will aim to have the support released in SILE before the Pandoc version comes out.

As an example, let's take HorizontalRules:

$ pandoc -t latex <<< '----'
\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

That's a bunch of presentation-specific style hard coded into the output. SILE does have an \hrule command that could be used in a similar way. This is valid SILE markup:

\center{\hrule[height=0.5pt,width=50%lw]}

A slightly fancier version is available that raises the rule to the middle of the line and defaults to a height of 0.5pt at full width, so this would work too:

\center{\fullrule[width=50%lw]}

But even that includes some hard coded presentation information, so instead I've chosen to add a command to the pandoc package. While not necessarily important, the definition looks like this:

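SILE.registerCommand("HorizontalRule", function (options, _)
  -- Lift the rule off the baseline (0.8ex unless overridden).
  SILE.call("raise", { height = options.raise or "0.8ex" }, function ()
    -- Center it on the line.
    SILE.call("center", {}, function ()
      -- Draw the rule itself, with overridable height and width.
      SILE.call("hrule", {
        height = options.height or "0.5pt",
        width = options.width or "50%lw"
      })
    end)
  end)
end)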

In practice this means Pandoc's output can look a lot like its own internal AST:

$ pandoc -t sile <<< '----'
\HorizontalRule

A user could conceivably choose to style this differently (say, something other than 50% of the line width) by including their own restyled command without touching Pandoc's output.
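
As a rough sketch of such a restyling, a document could register its own HorizontalRule after the pandoc package has been loaded. This uses only the SILE.registerCommand and SILE.call functions that appear in the definition earlier; the 1pt and 80%lw values are arbitrary, and it assumes that a later registration replaces the package's definition.

```lua
-- Hypothetical user override; must run inside SILE, after the pandoc package loads.
-- Assumption: registering a command again replaces the earlier definition.
SILE.registerCommand("HorizontalRule", function (_, _)
  SILE.call("center", {}, function ()
    -- A thicker, wider rule than the package default; values are illustrative only.
    SILE.call("hrule", { height = "1pt", width = "80%lw" })
  end)
end)
```

Pandoc's output stays \HorizontalRule either way; only the SILE side changes.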
