Adding pdf/epub/mobi generation from the markdown files using pandoc #67

Open
wants to merge 41 commits into
base: master
from

holgern
Contributor

holgern commented Jan 14, 2021

This PR adds the files needed to generate PDF, EPUB and MOBI documents from the markdown sources.
It uses templates for LaTeX and EPUB generation (assets/templates). Batch and bash scripts that generate the PDF/EPUB have been added, and the readme now explains what needs to be installed.
Instead of index.html, a new index_pandoc.md is used as the main file.

Lua filters are used to do the following:

  • Remove the YAML header from each markdown file and convert it into a markdown header
  • Fix the image paths
  • Fix the width and class parameters, which pandoc cannot handle
  • Fix the internal links, as pandoc cannot use the filename as a label
  • Load the markdown files referenced by includes
  • Adjust section levels (header levels cannot be skipped)
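
As a rough illustration of the first bullet, the sketch below turns a Jekyll-style YAML front matter block into a top-level markdown heading. It is plain Lua rather than the PR's actual filter code; the function name `front_matter_to_heading` and the assumption that the front matter carries a `title:` key are hypothetical.

```lua
-- Hypothetical helper, not taken from the PR: replaces a leading
-- "---" ... "---" YAML block with a "# <title>" markdown heading.
local function front_matter_to_heading(text)
  -- Capture the YAML block and the remaining body; (.-) is a lazy match.
  local yaml, body = text:match("^%-%-%-\r?\n(.-)\r?\n%-%-%-\r?\n(.*)$")
  if not yaml then
    return text -- no complete front matter found; leave the input alone
  end
  -- Prefix a newline so "title:" only matches at the start of a line.
  local title = ("\n" .. yaml):match("\ntitle:%s*([^\r\n]+)")
  if not title then
    return body -- assumption: drop front matter that has no title
  end
  -- Strip optional surrounding quotes from the YAML value.
  title = title:gsub("^[\"']", ""):gsub("[\"']%s*$", "")
  return "# " .. title .. "\n\n" .. body
end

print(front_matter_to_heading("---\ntitle: Backup\nlayout: page\n---\nBody text\n"))
```

Text without a complete front matter block is returned unchanged, so the helper could be applied to every file without special-casing.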

I also added a GitHub Action that runs every time a new PR is merged or a change is added, and does the following:

  • Creates a new tag with the current date
  • Creates the pdf/epub/mobi files and uploads them as a release

You can see what it will look like here:
https://github.com/holgern/btcguide.github.io/releases

I'm not a Lua expert (I learned Lua with this project), so the Lua code is not optimal, but it works :).

The pandoc github action is failing for this PR, as it cannot create a tag yet.

@mflaxman
Contributor

Thanks @holgern! Been getting a lot of PRs on the guide so I haven't yet given this the attention it deserves but it's really cool.

I want to get it working with GitHub so the script automatically generates the downloadable versions when content is updated.

mflaxman pushed a commit that referenced this pull request Jan 18, 2021
@mflaxman
Contributor

Hey @holgern, I spent some time playing around with this and it's pretty neat! I also cleaned up the readme/script a little bit:
https://github.com/btcguide/btcguide.github.io/tree/pdf-attempt

It appears the #s produced by this script are kind of messed up unfortunately:
[Screenshot: Screen Shot 2021-01-18 at 5 21 08 PM]

I played around with it for a while, but I wasn't 100% sure I'm following the lua code.

end


function skip_include(el)
Contributor

Extremely confusing name, either rename to something that makes sense or add a comment

element.src = fix_path(element.src)
return element
end
})
Contributor

IDK lua, but are there usually standards for indenting? This is all over the place.
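
On the indentation question: Lua does not mandate a style, but a consistent two-space indent per nesting level is one common choice. The sketch below shows an image filter laid out that way. The body of `fix_path` is a hypothetical stand-in (the PR's real implementation is not shown in this thread), and `image_filter` is an illustrative name.

```lua
-- Hypothetical stand-in for the PR's fix_path: make site-absolute
-- paths relative to the repository root and leave URLs untouched.
local function fix_path(src)
  if src:match("^%a[%w+.%-]*:") then
    return src -- has a URL scheme such as https:
  end
  if src:sub(1, 1) == "/" then
    return "." .. src
  end
  return src
end

-- One filter table, indented two spaces per nesting level.
local image_filter = {
  Image = function(element)
    element.src = fix_path(element.src)
    return element
  end,
}

-- A pandoc filter file would typically end with: return { image_filter }
```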

end
doc = pandoc.read(content)
else
io.stderr:write("Warning: --- was not found twice at: " .. title .. "count: " .. count)
Contributor

The errors this prints out appear to be nonsensical:

pandoc --pdf-engine=xelatex --template=assets/templates/eisvogel.latex --highlight-style zenburn --toc -N --lua-filter _pandoc_filter/image_link.lua --lua-filter _pandoc_filter/add_title.lua -o assets/btcguide.pdf index_pandoc.md
                           ./assets/img/setup-specter-detect-node.png                ./assets/img/setup-paper-calculate-seed.png    [WARNING] Duplicate link reference '[TODO]' at line 26 column 1
[WARNING] Duplicate link reference '[TODO]' at line 24 column 1
                                                 hw/python.md hw/shitcoins.md  hw/encouragement.md   hw/psbt.md  hosted/utxo_privacy.md    hosted/utxo_privacy.md     hw/python.md   hw/psbt.md        hw/psbt.md     hw/wired_airgap.md
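
One possible way to make this output readable is to build each warning with an explicit format string and a trailing newline, so messages do not run together on one line. This is a minimal sketch, not the PR's code: `format_warning` is a hypothetical name, and the wording assumes the check is counting `---` front-matter delimiters.

```lua
-- Hypothetical helper: one self-contained warning per line.
local function format_warning(file, count)
  return string.format(
    "Warning: expected 2 '---' front-matter delimiters in %s, found %d\n",
    file, count)
end

-- Usage inside the filter (title and count are the variables from the
-- snippet under review):
--   io.stderr:write(format_warning(title, count))
io.stderr:write(format_warning("hw/psbt.md", 1))
```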


end

function filter_content(content)
Contributor

Is there no way to DRY this out? This jekyll has already gotten far too complex for what it is :(

That's partially my own fault :(

content = string.gsub(content, '%(advanced#redundant_address_verification%)', '(#verify-receive-address-advanced)')
content = string.gsub(content, '%(/backup%-wallet%)', '(#backup-wallet)')
content = string.gsub(content, '%(/backup%-wallet/seeds%)', '(#backup-seeds)')
content = string.gsub(content, '%(/backup%-wallet/public%-keys%)', '(#backup-public-keys)')
Contributor

Can this be DRY?
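
One way this could be made less repetitive is a lookup table of plain-text link targets plus a single loop, with the pattern escaping done once in a helper instead of by hand on every line. The sketch below uses the four mappings shown above; the names `LINK_REWRITES`, `escape_pattern` and `rewrite_links` are hypothetical.

```lua
-- Plain-text (from, to) pairs; no manual "%" escaping needed here.
local LINK_REWRITES = {
  { "(advanced#redundant_address_verification)", "(#verify-receive-address-advanced)" },
  { "(/backup-wallet)", "(#backup-wallet)" },
  { "(/backup-wallet/seeds)", "(#backup-seeds)" },
  { "(/backup-wallet/public-keys)", "(#backup-public-keys)" },
}

-- Escape every Lua pattern magic character so the string matches literally.
local function escape_pattern(s)
  return (s:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

local function rewrite_links(content)
  for _, pair in ipairs(LINK_REWRITES) do
    local from, to = pair[1], pair[2]
    -- "%" is also special in the replacement string, so double it there.
    local replacement = to:gsub("%%", "%%%%")
    content = content:gsub(escape_pattern(from), replacement)
  end
  return content
end

print(rewrite_links("See [seeds](/backup-wallet/seeds) and [wallet](/backup-wallet)."))
```

Because each entry includes the closing parenthesis, `(/backup-wallet)` does not accidentally match the longer `(/backup-wallet/seeds)` target.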

function Para(elem)
if #elem.content == 1 and elem.content[1].t == "Image" then
local img = elem.content[1]
if img.classes:find('markdown',1) then
Contributor

Can this be DRY?

Also, unclear what markdownskip and these #s are (might be clear if DRY).
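
A sketch of how the check could be made self-describing: give the class name a named constant and move the "paragraph containing a single image with that class" test into one helper. The reading of the `markdown` class as an include marker is an assumption based on the PR description, and `INCLUDE_CLASS`, `has_class` and `is_include_image` are hypothetical names. (In the quoted line, the `1` passed to `find` is the index at which the search starts.)

```lua
-- Assumption: images carrying this class mark a markdown file to include.
local INCLUDE_CLASS = "markdown"

-- True if the list of class names contains `name`.
local function has_class(classes, name)
  for _, class in ipairs(classes) do
    if class == name then
      return true
    end
  end
  return false
end

-- True if the paragraph consists of exactly one include-marker image.
local function is_include_image(para)
  local first = para.content[1]
  return #para.content == 1
    and first.t == "Image"
    and has_class(first.classes, INCLUDE_CLASS)
end
```

A second marker class could then reuse `has_class` with its own named constant rather than repeating the lookup inline.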

mmikeww pushed a commit to mmikeww/btcguide.github.io that referenced this pull request Jan 31, 2021
@holgern
Contributor Author

holgern commented Mar 12, 2021

I'm still interested in proceeding.
I will go over the Lua script and try to improve it.

Things could be simplified if the header levels were fixed directly in the markdown source.
For example:

##

####

should be fixed to

##

###

Should I make a new PR in which the header levels are fixed so that the Lua script could be simplified?
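
The adjustment described above can be sketched in plain Lua as a function over a list of heading levels: each heading is placed exactly one level below its nearest shallower ancestor, so a jump from `##` to `####` becomes `##` to `###`. This is an illustration under assumptions, not the PR's filter: `close_heading_gaps` is a hypothetical name, and in a pandoc filter the levels would come from the `level` field of `Header` elements.

```lua
-- Returns a new list of levels in which no heading skips a level
-- relative to its parent. Top-level headings keep their original level.
local function close_heading_gaps(levels)
  local adjusted = {}
  local open = {} -- stack of currently open ancestors: { orig, new }
  for i, level in ipairs(levels) do
    -- Drop ancestors that are at the same depth or deeper than this heading.
    while #open > 0 and open[#open].orig >= level do
      table.remove(open)
    end
    local parent = open[#open]
    local new_level = parent and (parent.new + 1) or level
    open[#open + 1] = { orig = level, new = new_level }
    adjusted[i] = new_level
  end
  return adjusted
end

print(table.concat(close_heading_gaps({ 2, 4 }), ","))
```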

@mflaxman
Copy link
Contributor

Should I make a new PR in which the header levels are fixed so that the Lua script could be simplified?

Sorry, I'm not following what this change would do, can you give me an example page of what the before/after would look like?

Is the idea that I'm jumping from 2 #s to 4 #s and I should be going from 2 #s to 3 #s? It's hard to evaluate without an example, but generally I'm trying to avoid large text headers when they aren't needed.

2 participants