Reformat MS Word DOC
Some of my past ebook conversion articles mentioned editing an MS Word DOC file before proceeding to another conversion. In this case, MS Word serves as the “bridge” for another conversion, because most ebook converters can convert a DOC file directly. I use MS Word as the “bridge file” when converting Repligo to other ebook formats. I also use this procedure to convert PDF to LIT without purchasing commercial ebook converters.
The image below shows text copied from a PDF file and pasted into a blank MS Word document. More often than not, you will get the same result. Of course, we cannot convert it at this point; the text needs to be reformatted first.

The choppy lines are obvious: the text breaks in unexpected places. This happens because the original file contained encoding (if it was HTML before it became a PDF) and paragraph and/or line breaks. That is why the pasted text does not look exactly like the file it was copied from.

The Show/Hide button in MS Word, located on the toolbar, is a great tool for seeing where the breaks are. When you click it, a symbol appears at every line break, which makes it easier to see exactly where you need to edit.
Removing line breaks manually takes a lot of patience and time, especially if you are editing a large ebook file. It means going to each symbol and deleting it, a repetitive process.
The easier way is to click Edit on the toolbar, then choose Replace. A pop-up window will appear.
Find what: ^p or click special and choose paragraph mark
Replace with: ^s or click special and choose nonbreaking space
Then click the Find Next button. When the target line is reached, click the Replace button. Continue doing this until you reach the end of the document.
Other formatting depends on your file; for those cases, choose the option you need from the Special button in the Find and Replace window. If you plan to convert the DOC file to PRC afterwards, you might find Prepare MS Word DOC useful.
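If you are comfortable with a little scripting, the same idea can be tried outside of Word on the copied text before pasting it. The following is only a rough Python sketch, not part of the Word procedure above. It assumes that a blank line marks a real paragraph break and that every other line break is unwanted, which may not hold for every PDF. It also joins lines with an ordinary space rather than Word's nonbreaking space, and the function name unwrap_lines is made up for this example.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph into a single line."""
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: a blank line (two newlines, possibly with whitespace
    # between them) marks a real paragraph break that should be kept.
    paragraphs = re.split(r"\n\s*\n", text)
    joined = []
    for paragraph in paragraphs:
        # split() with no arguments breaks on any run of whitespace, so
        # rejoining with single spaces removes the mid-paragraph breaks.
        line = " ".join(paragraph.split())
        if line:
            joined.append(line)
    # Keep one blank line between the paragraphs that remain.
    return "\n\n".join(joined)


if __name__ == "__main__":
    sample = (
        "This line was cut\n"
        "in the middle of\n"
        "a sentence.\n"
        "\n"
        "Second paragraph\n"
        "starts here."
    )
    print(unwrap_lines(sample))
```

Unlike clicking Replace one match at a time, this joins every line at once, so check the result for places where a real paragraph break was lost.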