PDF2Wiki Conversion Comparison

So, some people may ask, why are you trying to convert PDF to Wiki? PDF is usually the last step in the process, so just use the original document. My response would naturally be, what if you don’t have the original document?

A Two-Step Process
Through my searching and reading on the topic, it seems there is no PDF2Wiki Converter. Every site that I have read explains converting the PDF to one of: DOC, RTF, HTML, XML first then to wiki format.

PDF2HTML
I tried a number of PDF to HTML programs, but none of them worked to my satisfaction. Most of them only converted simple formatting, such as bold and italics.  Adobe has an online conversion tool. It’s better than some of the others I’ve tried as it interprets lists and such. The resulting code is rather ugly and a lot of the code would need to be stripped before using a HTML to Wiki converter. See my previous post on HTML2Wiki for a couple of tools on tidying or stripping HTML code.

PDF2DOC
I found that a much better alternative was converting the PDF to a DOC/RTF file since it’s a lot simpler and some formatting might be lost, but you won’t have a lot of needless code that might mess up your wiki page. There are a lot of online tools that provide a PDF to DOC/RTF service, however, again, they only tend to do basic formatting.  Adobe Acrobat does a really good job, because it will change lists into formatted lists (instead of normal text).  The major downside of course is that Acrobat is a paid program though there is a 30-day trial.

Conclusion
I had a lot of problems in particular with PDF to HTML, so I thought PDF to DOC/RTF is simply. Honestly though, unless you have a PDF file which is really long and has a lot of simple formatting (bold, italics, etc.), if you cannot get your hands on Acrobat, then I suggest simply copy/paste (or alternatively save as a text file) and manually formatting it in the wiki’s editing box. Of course this depends on the wiki you’re using because ones that don’t have a toolbar to help you quickly format might be a bit of a pain. Someone please let me know if you have found a better method!

Published by

Cynthia

A librarian learning the ways of technology, accessibility, metadata, and people

Leave a Comment

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s