I recently finished the second draft of my first novel and needed a way to prepare a decent-looking PDF to print and send to people so they could write all over it with red pens. Web technologies and a few Linux tools made this a fairly painless process.
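The workflow goes: write the book in Markdown, convert it to HTML, style it with CSS, make a front cover, render the body and cover to PDF with wkhtmltopdf, and combine the two PDFs with Ghostscript.
And here’s what you’ll need: a Markdown processor (I used MultiMarkdown), sed, wkhtmltopdf with patched Qt, Ghostscript, and Ruby for the build script.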
The result will be a single PDF with numbered pages and an unnumbered front cover. Adding an unnumbered back cover is left as an exercise.
I can’t help you much with the writing itself. Writing a book is hard. But I believe in your abilities.
One issue I encountered with mine was needing three modes for the body text:
So you might want to keep this in mind. I ended up using Markdown’s blockquote syntax for mode 2 and its code block syntax for mode 3. This made it easy to target those blocks with CSS.
Another issue that came up was section breaks: how to format breaks in the text without using chapter or sub-section headers. In the Markdown, I used a single % character on a line by itself. After the HTML is generated, it can be piped through sed to add a custom CSS class, e.g., to replace <p>%</p> with <p class="section-break">%</p>.
To convert the Markdown to HTML, I used MultiMarkdown, but Pandoc would also make a great choice.
Depending on the way your book is split into files, you might want to start writing a build script. Here’s an example:
parts = [
  "Talitha",
  "Imal",
  "Aunauf",
  "Empress",
  "Astronauts",
]
# Convert each chapter to an HTML snippet, then clean it up with sed.
parts.each do |part|
  system("multimarkdown -s ../#{part}/story.md | sed -E 's/_([^_]+)_/<em>\\1<\\/em>/g' | sed -E 's/<h1 .+<\\/h1>//g' | sed 's/<p>%<\\/p>/<p class=\"section-break\">%<\\/p>/g' > output-#{part}.html")
end
Those sed commands (1) add intra-word italics (like Salinger does), (2) remove redundant h1 headers (I added one to each file/chapter for reasons I can’t remember right now), and (3) fix the section breaks.
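If you would rather not shell out to sed, the same three fixes can be sketched in plain Ruby. This is only a sketch: postprocess is a made-up name, and the patterns are direct translations of the sed expressions above, applied line by line the way sed would apply them.

```ruby
# Hypothetical helper: mirrors the three sed expressions on an HTML string.
def postprocess(html)
  html.each_line.map do |line|
    line
      .gsub(/_([^_]+)_/, '<em>\1</em>')                    # (1) intra-word italics
      .gsub(/<h1 .+<\/h1>/, "")                             # (2) drop the redundant h1
      .gsub("<p>%</p>", '<p class="section-break">%</p>')   # (3) tag section breaks
  end.join
end

puts postprocess("<p>abso_lute_ly</p>\n<p>%</p>\n")
```

Doing it in Ruby avoids the double escaping that the sed commands need inside a Ruby string, at the cost of reading each chapter's HTML into memory.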
To get the page numbers right, you’ll want the HTML for every chapter in the same file. The loop above produces a separate file for each chapter, but you could also replace the output redirection with something like >> combined.html. Because >> appends, delete any old combined.html before each build or the chapters will pile up.
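Another way to get a single file is to keep the per-chapter files and join them afterwards. Here is a minimal sketch, assuming the chapter files already exist; combine_html is a made-up name, not part of any tool mentioned here.

```ruby
# Hypothetical helper: concatenate per-chapter HTML files into one file.
# File.write truncates the target first, so a stale file never lingers.
def combine_html(paths, out_path)
  html = paths.map { |path| File.read(path) }.join
  File.write(out_path, html)
  out_path
end
```

Usage would look something like combine_html(parts.map { |part| "output-#{part}.html" }, "combined.html"), run after the conversion loop.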
Unless you add custom classes, the HTML generated from the Markdown should not include classes, so your CSS will mostly need to target tag names: h1, p, blockquote, etc.
If you want to add a page break between chapters, the chapter titles will need a consistent target (I used h2 tags) and this rule: page-break-before: always;
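To make that concrete, here is a sketch of a small stylesheet, kept as a Ruby heredoc so the build script could write it out. Only the h2 page-break rule comes from the text above. Every other selector and value is a placeholder assumption, including the guess that Markdown code blocks end up as pre elements.

```ruby
# A sketch of style.css. Only the h2 rule is from the article;
# the rest are placeholder values to show tag-name targeting.
STYLESHEET = <<~CSS
  body { font-family: serif; line-height: 1.5; }
  h2 { page-break-before: always; }
  p.section-break { text-align: center; }
  blockquote { font-style: italic; }
  pre { font-family: serif; white-space: pre-wrap; }
CSS

puts STYLESHEET
```

Writing it to disk is one more line, File.write("style.css", STYLESHEET), though keeping style.css as an ordinary file works just as well.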
To make the front cover, follow the same process as with the book’s body: make the HTML, style it with CSS. You could use Markdown for this but the HTML might be simple enough that writing it by hand is an agreeable option.
You’ll want a version of wkhtmltopdf with patched Qt. If the version packaged for your distribution doesn’t have the patched Qt, then you’ll want to download and install it yourself. You can check for the patch with the -V option:
$ wkhtmltopdf -V
wkhtmltopdf 0.12.4
$ wkhtmltox/bin/wkhtmltopdf -V
wkhtmltopdf 0.12.4 (with patched qt)
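A build script could make the same check before doing any work. This is a minimal sketch that only looks for the phrase shown in the output above; patched_qt? is a made-up name.

```ruby
# Hypothetical helper: true if `wkhtmltopdf -V` output mentions the patched Qt.
def patched_qt?(version_output)
  version_output.downcase.include?("with patched qt")
end

# The two outputs shown above:
puts patched_qt?("wkhtmltopdf 0.12.4")                    # false
puts patched_qt?("wkhtmltopdf 0.12.4 (with patched qt)")  # true
```

In a real script you would pass it the captured output of the command, e.g. patched_qt?(`wkhtmltox/bin/wkhtmltopdf -V`).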
You can specify page size, top, bottom, left and right margins, stylesheet, and, for page numbers, a footer file:
wkhtmltox/bin/wkhtmltopdf -s Letter -T 1in -B 1in -L 1in -R 1in --user-style-sheet style.css --footer-html footer.html combined.html body.pdf
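Since the body and the cover share most of these options, it can help to build the argument list in one place. Below is a sketch using only the flags from the command above; wkhtmltopdf_args and its keyword names are made up for illustration.

```ruby
# Hypothetical helper: build the wkhtmltopdf argument list as an array.
def wkhtmltopdf_args(input, output, footer: nil, margin: "1in", size: "Letter")
  args = ["wkhtmltox/bin/wkhtmltopdf", "-s", size,
          "-T", margin, "-B", margin, "-L", margin, "-R", margin,
          "--user-style-sheet", "style.css"]
  args += ["--footer-html", footer] if footer  # page numbers are optional
  args + [input, output]
end

p wkhtmltopdf_args("combined.html", "body.pdf", footer: "footer.html")
```

Passing the array to system(*args) skips the shell, so file names with spaces need no quoting. The trade-off is that shell features such as ~ expansion and pipes are no longer available.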
A footer file should look something like:
<html>
<head>
<script>
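function subst() {
  // Parse the query string that wkhtmltopdf appends to the footer URL.
  var vars = {};
  var pairs = document.location.search.substring(1).split('&');
  for (var i = 0; i < pairs.length; i++) {
    var kv = pairs[i].split('=', 2);
    vars[kv[0]] = decodeURIComponent(kv[1]);
  }
  // Fill each value into the elements whose class matches its name.
  var names = ['frompage','topage','page','webpage','section','subsection','subsubsection'];
  for (var n = 0; n < names.length; n++) {
    var els = document.getElementsByClassName(names[n]);
    for (var j = 0; j < els.length; ++j) els[j].textContent = vars[names[n]];
  }
}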
</script>
<link rel="stylesheet" type="text/css" href="style.css" />
</head>
<body onload="subst()">
<span class="page"></span>
</body>
</html>
The JavaScript called onload chomps through the variables passed to the file during processing and fills their values into the elements with matching class names. You can style those elements in the CSS.
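To make the substitution less mysterious, here is the same parsing step sketched in Ruby. The query string is an invented example of what the footer file might receive for one page: the variable names come from the list in the script above, but the values are made up.

```ruby
require "uri"

# Hypothetical query string, as the footer page might see it on page 3 of 120.
query = "page=3&topage=120&section=Talitha%20I"

# Same idea as subst(): split into name/value pairs and decode the values.
vars = URI.decode_www_form(query).to_h

puts vars["page"]     # 3
puts vars["section"]  # Talitha I
```

The footer then only has to drop each value into the element whose class matches the name, which is why an empty span with class page is enough to get a page number.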
Do something similar (but leave out the footer) to generate the cover:
wkhtmltox/bin/wkhtmltopdf -s Letter -T 1in -B 1in -L 1in -R 1in --user-style-sheet style.css title.html title.pdf
Then combine the PDFs with Ghostscript:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/default -dNOPAUSE -dQUIET -dBATCH -dDetectDuplicateImages -dCompressFonts=true -r150 -sOutputFile=final.pdf title.pdf body.pdf
So, to put this all together in a build script:
$ cat make
#!/usr/bin/ruby
# Options shared by the body and cover runs.
wkhtmltopdf_cmd = "~/wkhtmltox/bin/wkhtmltopdf -s Letter -T 1in -B 1in -L 1in -R 1in --user-style-sheet style.css"
parts = [
  "Talitha",
  "Imal",
  "Aunauf",
  "Empress",
  "Astronauts",
]
# Convert each chapter to an HTML snippet, then clean it up with sed.
parts.each do |part|
  system("multimarkdown -s ../#{part}/story.md | sed -E 's/_([^_]+)_/<em>\\1<\\/em>/g' | sed -E 's/<h1 .+<\\/h1>//g' | sed 's/<p>%<\\/p>/<p class=\"section-break\">%<\\/p>/g' > output-#{part}.html")
end
# Render all chapters as one document so the page numbers run through.
htmls = parts.map { |part| "output-#{part}.html" }.join(" ")
system("cat #{htmls} | #{wkhtmltopdf_cmd} --footer-html footer.html - body.pdf")
# Render the cover without a footer, then put it in front of the body.
system("#{wkhtmltopdf_cmd} title.html title.pdf")
system("gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/default -dNOPAUSE -dQUIET -dBATCH -dDetectDuplicateImages -dCompressFonts=true -r150 -sOutputFile=final.pdf title.pdf body.pdf")
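One thing the script above does not do is stop when a step fails, so a broken conversion can still produce a final.pdf from stale files. Here is a minimal sketch of a wrapper that aborts instead; run! is a made-up name, and the command it runs here is only a stand-in so the example works anywhere Ruby does.

```ruby
require "rbconfig"

# Hypothetical helper: run one build step and abort loudly if it fails.
# Kernel#system returns true on success, false on a non-zero exit status,
# and nil if the command could not be run at all.
def run!(*cmd)
  ok = system(*cmd)
  raise "build step failed: #{cmd.join(' ')}" unless ok
  true
end

# Stand-in command; in the build script this would be a wkhtmltopdf or gs call.
run!(RbConfig.ruby, "-e", "exit 0")
```

Swapping each system(...) call in the script for run!(...) would make the build stop at the first failing command.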
You probably wouldn’t want to use this process for a final draft but it should work for all the ones you’re going to mark up anyway.