Reformat MS Word DOC

Some of my earlier ebook conversion articles mentioned editing an MS Word DOC file before moving on to another conversion. In those cases, the DOC file serves as the “bridge” between formats, because most ebook converters can convert a DOC file directly. I use a DOC as the “bridge file” when converting Repligo to other ebook formats, and I use the same procedure to convert PDF to LIT without purchasing commercial ebook converters.

The image below shows text copied from a PDF file and pasted into a blank MS Word document. More often than not, you will get the same result. Of course, we cannot convert it at this point; it needs to be reformatted first.

copied PDF to DOC

The choppy lines are obvious: they break in unexpected places. This happens because the original file had its own encoding (if it came from HTML before becoming a PDF) and its own paragraph and/or line breaks. That’s why the pasted text doesn’t look exactly like the file it was copied from.

The Show/Hide button in MS Word is a great tool for seeing break marks; it is located on the toolbar. When you click it, a symbol appears at every line break, which makes it easier to see exactly where you need to edit.
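For readers who work with the copied text outside Word, the same idea can be imitated in a few lines of Python. This is only a rough sketch, not a feature of Word: the function name `show_marks` and the sample text are made up for illustration, and it assumes the PDF text has been saved or pasted as plain text.

```python
def show_marks(text: str) -> str:
    """Append a pilcrow to the end of every line so the breaks become visible.

    Hypothetical helper: it only imitates what Word's Show/Hide view displays.
    """
    return "\n".join(line + "¶" for line in text.split("\n"))


# Made-up sample: one sentence chopped in two, a blank line, then a new paragraph.
sample = "This sentence was cut\nin the middle by the PDF.\n\nNew paragraph."
marked = show_marks(sample)
print(marked)
```

Every line, including the empty one, ends with a mark, so a break in the middle of a sentence is easy to spot.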

Removing line breaks manually takes a lot of patience and time, especially if you are editing a large ebook file. It means going to each symbol and deleting it… a repetitive process.

The easier way is to open the Edit menu and choose Replace. A pop-up window will appear.

Find what: ^p (or click Special and choose Paragraph Mark)
Replace with: ^s (or click Special and choose Nonbreaking Space)

Then click the Find Next button. When it reaches a line you want to join, click the Replace button. Continue until you reach the end of the document.
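If the copied text is available as plain text, the find-and-replace pass above can also be approximated with a short script. The sketch below is not part of the Word procedure; it assumes that a blank line marks a real paragraph boundary and that every other line break is an unwanted one. The function name `join_choppy_lines` and the sample text are hypothetical.

```python
import re


def join_choppy_lines(text: str, joiner: str = " ") -> str:
    """Join lines inside each paragraph, keeping blank-line paragraph breaks.

    Assumption: a blank line separates real paragraphs; any other line
    break is an artifact of copying from the PDF.
    """
    # Normalise Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split wherever there are one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for paragraph in paragraphs:
        # Drop empty fragments and trim stray spaces around each line.
        lines = [ln.strip() for ln in paragraph.split("\n") if ln.strip()]
        if lines:
            cleaned.append(joiner.join(lines))
    return "\n\n".join(cleaned)


# Made-up sample with two chopped paragraphs.
sample = "Choppy lines break\nin unexpected places.\n\nA second paragraph\nfollows here."
result = join_choppy_lines(sample)
print(result)

# Passing joiner="\u00a0" joins with a nonbreaking space, mirroring ^s in Word.
nbsp_result = join_choppy_lines(sample, joiner="\u00a0")
```

Unlike stepping through Find Next and Replace by hand, this joins every line in one go, so it is worth checking the result for headings or lists that should have stayed on their own lines.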

For other formatting, depending on your file, choose the appropriate item from the Special button in the Find and Replace window. If you plan to convert the DOC file to PRC afterwards, you might find Prepare MS Word DOC useful.
