Monday, February 16, 2009

Greek text on the Kindle

As I mentioned in my last post, the Amazon team recently released a firmware update (version 1.2) that adds some much-needed functionality to Kindle books. I was finally able to test the Greek support and figured out how to add Greek text to HTML files destined for the Kindle.

First, add the Greek characters to the file using Unicode character entities. For instance, the lowercase alpha is &#945; or &alpha;. You could also add the actual character (copied from Character Map or another source), but I do not suggest doing that, since using the entity is usually the better coding practice. It also makes inserting and editing the characters easier.
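If you already have the Greek text typed out, you do not have to look up each entity by hand. Here is a minimal Python sketch of one way to do the conversion. The function name `to_numeric_entities` is just something I made up for illustration. It relies on Python's built-in `xmlcharrefreplace` error handler, which turns every non-ASCII character into a decimal numeric entity and leaves the ASCII markup alone.

```python
def to_numeric_entities(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric entity.

    ASCII characters, including the HTML tags themselves, pass through
    unchanged.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# The word "logos" in Greek, written with escapes so this script is pure ASCII.
sample = "<p>\u03bb\u03cc\u03b3\u03bf\u03c2</p>"
print(to_numeric_entities(sample))
# prints <p>&#955;&#972;&#947;&#959;&#962;</p>
```

This produces numeric entities only. Named entities such as &alpha; exist for the basic Greek letters but not for the pre-composed characters with diacritics, so the numeric form is the safer default.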

After the characters are inserted, the file needs to be saved with a Unicode encoding. I suggest using UTF-8, a very common encoding that will be sufficient for these purposes. Just open the HTML file in your default text editor or in Notepad, go to the Save As dialog box, set the encoding to UTF-8, and save the file with the same name or a new one. That HTML file can now be used in Mobipocket Creator to create a PRC file for testing, or be sent to the Kindle through the automated conversion system.
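If you have a lot of files to convert, the Save As step can also be scripted. The following is only a sketch under a few assumptions: the function name `reencode_to_utf8` is hypothetical, and you need to know (or guess) the encoding the file is currently saved in. I use `cp1252` as the default here because it is a common Windows default, but your files may differ.

```python
from pathlib import Path


def reencode_to_utf8(src: str, dst: str, src_encoding: str = "cp1252") -> None:
    """Read an HTML file in its current encoding and write it back as UTF-8.

    src_encoding is an assumption about how the file was originally saved;
    pass the real encoding if you know it (for example "cp1253" for the
    Windows Greek code page).
    """
    text = Path(src).read_text(encoding=src_encoding)
    Path(dst).write_text(text, encoding="utf-8")
```

For example, `reencode_to_utf8("book.html", "book-utf8.html")` would write a UTF-8 copy next to the original (both file names are placeholders). Writing to a new file keeps the original around in case the source encoding guess turns out to be wrong.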

As always, I do not suggest you try uploading Microsoft Word or PDF files, with or without these characters in them. The Kindle format is HTML, and you are always better off formatting and tweaking in that code.

Overall, the Greek support is pretty good on the Kindle. The only characters which are not supported are the archaic koppa, sampi, digamma, and stigma in uppercase and lowercase. The Kindle does support all of the other Greek characters, including all of the pre-composed characters with diacritics... and I mean all of them. I was not able to find any that are not covered. I have included some screenshots below that will give you a sampling of what the Greek looks like on the device, including in the mono-spaced font.
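To check whether a text uses any of the unsupported letters before you send it to the device, something like the following Python sketch could help. It rests on an assumption of mine: that the unsupported letters correspond to the Unicode range U+03D8 through U+03E1, which holds archaic koppa, stigma, digamma, koppa, and sampi in uppercase and lowercase. I have not verified that range against the device character by character, and the names `UNSUPPORTED` and `find_unsupported` are made up for illustration.

```python
import unicodedata

# Assumption: U+03D8..U+03E1 covers the archaic letters named in the post
# (archaic koppa, stigma, digamma, koppa, sampi; upper and lower case).
UNSUPPORTED = {chr(cp) for cp in range(0x03D8, 0x03E2)}


def find_unsupported(text: str) -> list[tuple[int, str, str]]:
    """Return (position, character, Unicode name) for each flagged letter."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in UNSUPPORTED
    ]


# Lowercase alpha followed by lowercase digamma: only the digamma is flagged.
for pos, ch, name in find_unsupported("\u03b1\u03dd"):
    print(pos, f"U+{ord(ch):04X}", name)
```

Run this on the decoded text of the HTML file. If the file uses numeric entities, convert them back to characters first (for instance with `html.unescape` from the standard library).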

As always, if you have any questions or thoughts on this, please let me know!


[Images 1-9: screenshots of Greek text on the Kindle]


6 Comments:

Blogger Phil Frank said...

I was very glad to see the Greek, and I found it accepts other Unicode as well, like what we use for Greek transliteration (e.g., e[macron]: 0113; and o[macron]: 014D).

February 19, 2009 9:11 AM  

Blogger beowulf2k8 said...

I uploaded a file with Unicode Greek and it doesn't display any characters with accents in the preview on the DTP site. I did save the file with UTF-8 encoding, and also put the <meta http-equiv="Content-Type" content="text/html;charset=UTF-8" /> in the head of my HTML. Didn't work. So I tried specifying <font face="Palatino Linotype">. Still didn't work. Is their default font supposed to support Greek, or do you have to use a certain font? Or is it just that the preview on their site is junk?

March 8, 2009 1:59 PM  

Blogger Joshua Tallent said...

Yeah, the DTP Preview is not updated yet, apparently. Also, it is not the best example of what the Kindle screen looks like. It works in a pinch, but nothing beats the Kindle itself.

March 8, 2009 3:06 PM  

Blogger Theophilus said...

Here's a question.

I've been working on putting some Greek texts on my Kindle for my personal studies. One thing I noticed while working on homework with friends is that all my breathing marks on vowels (or consonants) without accents were backwards (i.e., rough are displayed as smooth and vice versa). They display properly when viewing the HTML I created in my browser and when viewing the .prc in Mobipocket Reader. They're only displayed backwards on my K2.

Any ideas what's up with that?

April 28, 2009 11:29 PM  

Blogger Joshua Tallent said...

It does not surprise me to hear that. I have not had time to test it out, so I am not positive what is going on. It is likely due to the way they implemented the Greek font, or possibly a problem with the font itself.

April 29, 2009 5:34 PM  

Blogger Joel Kalvesmaki said...

This is to confirm the observation of Theophilus: breathings alone on vowels and the rho are consistently backward on the Kindle 2. When combined with an accent (acute, grave, circumflex) the breathing is oriented correctly.

Having worked with many Greek fonts over the years, I think this was a mistake in building the character tables of the font. I have looked through some of the other characters (Greek or otherwise) and found other letters that are imprecisely constructed and executed. Look, for example, at how the accents for all capital letters are set awkwardly at center above cap height, rather than in the traditional place, left at cap height.

Congratulations, Joshua, on a fine website and sharing your expertise with us.

June 5, 2009 1:54 PM  
