NOTE: The information on this page only covers the most basic process for converting books into the HTML supported on the Kindle. My book contains more detailed information and provides you with easy-to-follow instructions on the process.

The Amazon Kindle format is very basic HTML with a small amount of support for Cascading Style Sheets (CSS). The converter built into the DTP website will accept books in Word and some other formats, but those books invariably turn out with off-the-wall formatting, missing images, and other unacceptable issues.

Amazon suggests that authors and publishers do the conversion to HTML themselves, but many of them have never even seen HTML code, much less formatted an eBook with it. For those of you who do not already know or are not interested in learning HTML, my eBook conversion services are available. For those of you who are trying to make the process work yourself, here are some pointers and hints on how to format text for the Kindle.

Step 1: Converting the document

Word Document

If you have a Microsoft Word document, you first need to convert it into usable HTML. Before you do that, you might want to go through the file and take out any page numbers, extraneous images, and other content that will not show up in the final Kindle book. Then, go to File | Save As, and save the document as “Web Page (Filtered).” The filtered mode will remove much of the extra Word-specific code that is usually included in files saved with the regular “Web Page” option.

If you do not have Word, you can sometimes get the HTML from these types of files using the View as HTML setting found in some e-mail programs (notably Google’s Gmail), which converts many files automatically when they are e-mailed. However, the HTML in that case will probably be much more bloated.

Another option is to go ahead and upload your source file to the DTP and download the HTML it creates. The HTML you get from that process will lack a lot of the original formatting, but it is still usable. You can do this at the top of the Preview page. You might also consider using OpenOffice.org to convert the file to HTML, or even importing the file into Mobipocket Creator and using the HTML it produces.

Other document formats will require their own processes. PDF files can be saved as HTML in Acrobat Professional and in some other programs. Usually it is better to save the PDF into Word first, then into HTML with the process explained above. Adobe InDesign can save as HTML in some cases, as can Quark, Microsoft Publisher, and other layout programs. If you don’t have a digital file for the book, but you do have a hardcopy, you will have to scan the book and have an Optical Character Recognition (OCR) program read the text for you. Most flatbed scanners come with a simple OCR program, but scanning each page individually can become tedious quickly. You should consider having the entire book scanned and OCR’d by a professional.

Step 2: Cleaning up the code

HTML is a very simple language to learn. All of the formatting in an HTML document is marked by “tags”. Tags consist of text enclosed in angle brackets, like this: <p>. There are great tutorials online that can help you learn HTML. Start with the W3C Tutorial.

Once you know the basics of HTML formatting you will need to look at the HTML file for your book. It is likely that you will see a lot of bloat in the document: extra tags that are not needed, <span> tags inside every paragraph to assign formatting to the text, inline CSS styles, etc. I suggest you remove as much of this as you can and get your file down to a minimum level of formatting.

An example of messy code saved from Microsoft Word can be found here. The same file, cleaned up, can be found here.
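As a rough illustration of the cleanup described in Step 2, here is a hypothetical paragraph of the sort Word tends to produce, followed by the same paragraph reduced to minimal markup. The class name, inline styles, and sample sentence are made up for this example and are not taken from the sample files mentioned above.

```html
<!-- Before: Word-style output with a class, a <span>, and inline CSS -->
<p class="MsoNormal" style="margin-bottom: 0in;"><span style="font-size: 12.0pt; font-family: 'Times New Roman';">It was a dark and stormy night.</span></p>

<!-- After: the same paragraph with the extra formatting removed -->
<p>It was a dark and stormy night.</p>
```

The text itself is unchanged; only the wrapper markup is stripped away.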

Step 3: Formatting the book

Now, I am obviously not going to give away all of my secrets, but I will pass along some pointers about the Kindle and which tags/formatting it will use and which it will ignore. The full list of supported tags can be found in my book.

Something important to note is that the Kindle does not support the use of margins in traditional ways. This makes some formats, like poetry and blockquotes, hard to achieve, but not impossible. See my Quick Tip: Margins blog post for more information on how to make margins work in your book.

You can use a stylesheet to format your book, and I do suggest that you use CSS as much as possible. There are some CSS rules, however, that are not supported at all or only supported minimally in the Kindle: margins, different fonts, font sizes, and colors are a few. The best control over font sizes can be achieved using heading tags (<h1>, <h2>, <h3>, etc.). The <big> and <small> tags are not as robust as they are in HTML, but you will find that they allow you to adjust the size up or down a couple of sizes. The font-size property in CSS is essentially useless.
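A minimal sketch of sizing text with tags instead of the font-size property might look like the following. The headings and sentences are placeholder text, and the exact sizes you get will depend on the device, so treat this as something to test in the Preview and not as a guaranteed result.

```html
<!-- Heading tags give the best control over size -->
<h1>Book Title</h1>
<h2>Chapter One</h2>

<!-- <big> and <small> nudge the size up or down from the default -->
<p><big>A slightly larger opening line.</big></p>
<p>Normal body text.</p>
<p><small>A slightly smaller note.</small></p>
```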

Use the custom <mbp:pagebreak /> tags to mark pagebreaks in the text. I suggest you include one in front of every chapter or section.

The Kindle has built-in bookmarks for the Table of Contents and the start of the book's content. Use the following anchor tags to mark those places in your book: <a name="TOC"/> and <a name="start"/>. Place the anchors right after the page break tag, before any headings or paragraphs.
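Putting the page break and anchor tags together, the top of a book could be laid out roughly like this. The headings and paragraph text are placeholders; only the <mbp:pagebreak /> tag and the two anchor names come from the description above.

```html
<!-- Table of Contents: page break first, then the anchor, then the heading -->
<mbp:pagebreak />
<a name="TOC"/>
<h2>Table of Contents</h2>
<p>Chapter One</p>
<p>Chapter Two</p>

<!-- Start of the book's content: same order -->
<mbp:pagebreak />
<a name="start"/>
<h2>Chapter One</h2>
<p>The first paragraph of the book goes here.</p>

<!-- Later chapters only need the page break -->
<mbp:pagebreak />
<h2>Chapter Two</h2>
<p>The second chapter begins here.</p>
```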

Images uploaded to the Kindle will be resized to fill the screen appropriately. On the Kindle 2, images larger or smaller than 520px by 622px are resized up or down.

When you are finished formatting your book you should upload it to your shelf on the DTP and look at the Preview, or, even better, upload it to a Kindle to look at the final result. Be aware that while the DTP Previewer gives a decent idea of what the book will look like on the Kindle, there are still quite a few places where it is not accurate.

If you have images in your book, the best way to get them on your Kindle with the HTML file is to create a Mobipocket file and copy that to the device. To do that:

  1. Install the Mobipocket Creator program on your computer.
  2. Open Mobipocket Creator.
  3. Select "HTML Document" from the section "Import from an Existing File".
  4. Browse to the HTML file and press "Import".
  5. This will generate a folder in your My Documents\My Publications folder that has the same name as your HTML file.
  6. Open that folder and copy into it any images that are in your book.
  7. In Creator, select "Build" from the Menu.
  8. On the Build page, press the "Build" button.
  9. Go back to your book folder. You will now see a .opf file and a .prc file in there.
  10. Plug in your Kindle and copy the .prc file to your "documents" folder, or e-mail the file to your kindle.com address.

This process is explained in detail in Chapter 7 of my book.

Common Problems

Here are a few common issues you may run into as you format and proof your final book:

1. Conversion failed, no reason given.

This is most likely due to some invalid code in your HTML document. Use an HTML validator (like this one at the W3C website) to find out where the errors are. Be aware that the custom <mbp:pagebreak /> and similar tags will generate an error in any validator. To get around that, place them in comments before running the validator (<!--mbp:pagebreak /-->) and remove the comments before uploading the book again.
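For example, the same chapter opening might look like this in the two states. The heading is placeholder text.

```html
<!-- While validating: the custom tag is hidden inside a comment -->
<!--mbp:pagebreak /-->
<h2>Chapter Two</h2>

<!-- Before uploading again: the comment markers are removed -->
<mbp:pagebreak />
<h2>Chapter Two</h2>
```

If your editor has find-and-replace, swapping one form for the other across the whole file is usually quicker than editing each page break by hand.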

2. Text not formatted in the Preview window.

This is probably due to a stylesheet not being uploaded with the HTML file. Make sure that your CSS file is linked to the HTML file with the link tag (<link rel="stylesheet" type="text/css" href="styles.css" />) or that the CSS is placed in the head of your HTML in <style> tags.
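The two options can be sketched as follows. You would normally use one or the other, not both. The title, the body text, and the single CSS rule are placeholders chosen for illustration; the rule is not a statement about which properties the Kindle supports.

```html
<html>
<head>
<title>My Book</title>

<!-- Option 1: external stylesheet; styles.css has to be uploaded along with the HTML file -->
<link rel="stylesheet" type="text/css" href="styles.css" />

<!-- Option 2: the same rules embedded directly in the head -->
<style type="text/css">
h2 { text-align: center; }
</style>

</head>
<body>
<h2>Chapter One</h2>
<p>The first paragraph of the book goes here.</p>
</body>
</html>
```

Embedding the CSS keeps everything in a single file, which removes the chance of the stylesheet being left behind during upload.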

Other solutions to common problems can be found on the DTP forums. If you run into something that is not answered well there, please feel free to drop me a line and I will try to answer your questions.

Converting an InDesign file to ePub and Kindle

For some information and resources on this process, see here.