Something about Language Pack File


#1

Hi, I’m new here. I’m not sure whether the Language Pack File has been discussed before, but I’d like to share some of my discoveries. Thanks!

  1. You can download Language Pack Files from http://lp.getpebble.com/v1/languages
    Currently there are files for German, Spanish, French, English and Chinese.
    There are separate files for specific hardware, but they appear to be identical apart from their filenames.

  2. A Language Pack File has the extension .pbl, and it is just a special Pebble Resource File
    You can unpack it with the pbpack tool in the Pebble SDK.
    This script may help you: https://github.com/xndcn/pebble-firmware-utils/blob/for-language-pack/pbpack_tool.py
    Unpacking a Language Pack File yields 19 raw files, named 000, 001, and so on up to 018.

  3. The first file, 000, is a GNU message catalog file, which usually has the extension .mo
    It can be converted to a .po file with msgunfmt 000 -o 000.po
    000.po is a GNU gettext message file that contains the original strings and their localized translations.
    You can edit it and convert it back to a .mo file with msgfmt 000.po -o 000

  4. The other files, 001 to 018, are either Pebble Font Files or empty files
    You can extract the codepoints from a font file using https://github.com/xndcn/pebble-firmware-utils/blob/for-language-pack/extract_codepoints.py
    You can then replace the font file with one generated from your favorite font by fontgen.py in the Pebble SDK.

  5. However, I didn’t find any font files in the Language Pack Files except the Chinese one
    The Chinese pbl contains 4 types of font file.
    001 and 002 are identical; they contain only 217 characters, with a maximum height of 14. Notably, they cover nearly all of the Chinese characters used in 000.po.
    003 and 004 are identical; they contain 9056 characters, with a maximum height of 18.
    005, 006, 007 and 008 are identical; they also contain 9056 characters, with a maximum height of 24.
    016 contains only 12 characters, for month names, with a height of 21.
    The other files are empty.
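
A quick way to repeat the "identical" and "empty" checks above on an unpacked directory is to group the files by content hash. This is only a minimal sketch: the function name `group_identical` and the idea of pointing it at the unpack directory are illustrative assumptions, not part of any Pebble tool.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(directory):
    """Group the files in `directory` by content.

    Returns (groups, empty): `groups` is a sorted list of lists of file
    names whose non-empty contents are byte-for-byte identical, and
    `empty` is the list of zero-length files.
    """
    by_digest = defaultdict(list)
    empty = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if not data:
            empty.append(path.name)
        else:
            # Files with the same SHA-256 digest are treated as identical.
            by_digest[hashlib.sha256(data).hexdigest()].append(path.name)
    return sorted(by_digest.values()), empty
```

Calling `group_identical("unpacked")` on the directory holding 000 to 018 would list which resources share one font file and which slots are empty.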

I think the order of the font files corresponds to their resource IDs:
001 GOTHIC_14
002 GOTHIC_14_BOLD
003 GOTHIC_18
004 GOTHIC_18_BOLD
005 GOTHIC_24
006 GOTHIC_24_BOLD
007 GOTHIC_28
008 GOTHIC_28_BOLD
009 BITHAM_30_BLACK
010 BITHAM_42_BOLD
011 BITHAM_42_LIGHT
012 BITHAM_42_MEDIUM_NUMBERS
013 BITHAM_34_MEDIUM_NUMBERS
014 BITHAM_34_LIGHT_SUBSET
015 BITHAM_18_LIGHT_SUBSET
016 ROBOTO_CONDENSED_21
017 ROBOTO_BOLD_SUBSET_49
018 DROID_SERIF_28_BOLD
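
For scripting, the guessed mapping above can be kept as a small lookup table. This is a sketch only: the table just restates the guess from this post, and the helper names `slot_filename` and `font_for_file` are made up for illustration.

```python
# Slot number -> system font name, restating the guess in the list above.
FONT_SLOTS = {
    1: "GOTHIC_14", 2: "GOTHIC_14_BOLD",
    3: "GOTHIC_18", 4: "GOTHIC_18_BOLD",
    5: "GOTHIC_24", 6: "GOTHIC_24_BOLD",
    7: "GOTHIC_28", 8: "GOTHIC_28_BOLD",
    9: "BITHAM_30_BLACK", 10: "BITHAM_42_BOLD",
    11: "BITHAM_42_LIGHT", 12: "BITHAM_42_MEDIUM_NUMBERS",
    13: "BITHAM_34_MEDIUM_NUMBERS", 14: "BITHAM_34_LIGHT_SUBSET",
    15: "BITHAM_18_LIGHT_SUBSET", 16: "ROBOTO_CONDENSED_21",
    17: "ROBOTO_BOLD_SUBSET_49", 18: "DROID_SERIF_28_BOLD",
}


def slot_filename(slot):
    """Return the three-digit file name used for a slot, e.g. 16 -> '016'."""
    return "%03d" % slot


def font_for_file(filename):
    """Return the guessed font name for an unpacked file name such as '016'."""
    return FONT_SLOTS.get(int(filename))
```

File 000 is the message catalog, so `font_for_file("000")` returns `None`.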

  6. Conclusion. What can we do by modifying a Language Pack File?
    You can localize or customize the system strings as you like.
    You can replace the system fonts with your favorite font.

Finally, after changing those resources, you can repack them into a new .pbl file, copy it to your mobile phone, and open it with the Pebble App to install it.


#2

I see new “unofficial” languages popping up as a result of this find! It’d be great if somebody were to bundle this into a web-app or application that allowed people to easily create their own language files.


#3

Thank you so much for this information. With your guide, I am now exploring the Language Pack file to improve its character fonts and messages.


#4

I have now gone through all the language packs I could get. The font files seem to be laid out as follows:

  • An 8-byte header:
  • the first byte is the version
  • the next byte is the maximum height
  • the next 2 bytes are the number of characters
  • the next 2 bytes seem to be the (Unicode) character code used for unknown characters (a.k.a. tofu)
  • currently this seems to be Unicode af25, and its glyph looks like a crossed box, something like “[X]”
  • the next byte seems to be the number of entries in the Lookup table
  • the last byte seems to be a byte width that depends on the language pack: 2 for oriental characters, 1 for European languages
  • 4*255 bytes for the Lookup table (if the header calls for one)
  • A Codepoints table listing the characters and their offset addresses (6-byte units: 2 bytes for the Unicode character code, 4 bytes for the offset address from the start of the Codepoints table)
  • 8 null bytes between the Codepoints table and the character bitmap glyphs
  • The bitmap glyphs, from there to the end of the file
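
The layout above can be tried out with Python's `struct` module. This is a sketch under assumptions, not a confirmed reader for the format: it assumes the fields are little-endian, that the lookup table occupies 4 bytes per entry as counted in the header, and the names `FontHeader`, `parse_font_header`, `codepoint_table_start` and `parse_codepoint_entries` are invented here. If the fields really are little-endian, the bytes `af 25` seen in the file would read as U+25AF.

```python
import struct
from collections import namedtuple

FontHeader = namedtuple(
    "FontHeader",
    "version max_height glyph_count wildcard_codepoint "
    "lookup_table_size codepoint_bytes",
)

# 1 + 1 + 2 + 2 + 1 + 1 = 8 bytes; little-endian is an assumption.
HEADER = struct.Struct("<BBHHBB")
# One Codepoints table unit: 2-byte character code + 4-byte offset = 6 bytes.
ENTRY = struct.Struct("<HI")


def parse_font_header(data):
    """Decode the 8-byte header described in the post above."""
    if len(data) < HEADER.size:
        raise ValueError("font file is shorter than the 8-byte header")
    return FontHeader(*HEADER.unpack_from(data, 0))


def codepoint_table_start(header):
    """Assumed start of the Codepoints table: header + 4 bytes per lookup entry."""
    return HEADER.size + 4 * header.lookup_table_size


def parse_codepoint_entries(data, start, count):
    """Read `count` (codepoint, offset) pairs beginning at byte `start`."""
    return [ENTRY.unpack_from(data, start + i * ENTRY.size) for i in range(count)]
```

A typical use would be `h = parse_font_header(data)` followed by `parse_codepoint_entries(data, codepoint_table_start(h), h.glyph_count)`, then comparing the result with what extract_codepoints.py reports for the same file.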
The language names seem to be taken from lp.getpebble.com, but the character encoding of a language pack seems to be determined from the header of the message catalog in resource 000.

The bitmap glyphs resemble the BDF font format:
  • A 20-byte header:
  • the first 8 bytes are the dimensions (width and height) of the glyph
  • the next 8 bytes seem to be the horizontal and vertical offsets
  • the last 4 bytes seem to be the maximum height of the glyph bitmap (apparently the number of bitmap lines to render)
  • The bitmap data continues to the end, with all of the bits combined into one stream and packed 8 bits per byte
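
The glyph header described above could be decoded along these lines. Everything about the field widths here is a guess made for illustration: the sketch assumes five little-endian 32-bit fields (two unsigned dimensions, two signed offsets, one unsigned trailing value) and one bit per pixel with no row padding. The names `GlyphHeader`, `parse_glyph_header` and `bitmap_byte_count` are hypothetical.

```python
import struct
from collections import namedtuple

GlyphHeader = namedtuple("GlyphHeader", "width height offset_x offset_y last_field")

# 4 + 4 + 4 + 4 + 4 = 20 bytes; field widths and byte order are assumptions.
GLYPH_HEADER = struct.Struct("<IIiiI")


def parse_glyph_header(data, offset=0):
    """Decode one 20-byte glyph header starting at `offset`."""
    return GlyphHeader(*GLYPH_HEADER.unpack_from(data, offset))


def bitmap_byte_count(width, height):
    """Bytes needed if all pixels form one bit stream packed 8 bits per byte."""
    return (width * height + 7) // 8
```

If the decoded widths and heights look implausible for a given file, the field sizes assumed here are the first thing to revisit.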
I cannot work out how the Lookup table is used in Chinese, Korean and Japanese.
The Chinese one seems to contain no information.
The Korean and Japanese ones contain some data, but I could not work out what it means.

I am also wondering why only the Codepoints table is shuffled.
The bitmap glyphs seem to be stored in order of Unicode character code,
but the Codepoints table (the index of Unicode character codes and offset addresses to the glyphs themselves) appears to be in a random order.
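
One possible explanation for the shuffled order, offered purely as a hypothesis that nothing in this thread confirms: if the Codepoints table were grouped by a hash bucket (for instance the character code modulo the 255 lookup entries), the entries would no longer be in character-code order. The sketch below only demonstrates that effect; `bucket_order` and the modulo hash are assumptions to check against the real files.

```python
def bucket_order(codepoints, table_size=255):
    """Sort character codes by (code % table_size), then by code.

    This mimics a table grouped by hash bucket; the modulo hash and the
    table size of 255 are assumptions made for this illustration.
    """
    return sorted(codepoints, key=lambda cp: (cp % table_size, cp))


if __name__ == "__main__":
    sample = [0x3042, 0x4E00, 0x4E01, 0x4EFF, 0x41]
    # Print the bucket-ordered codes in hex to compare with a real table.
    print([hex(cp) for cp in bucket_order(sample)])
```

If the order printed for the codes in a real font file matches the order in its Codepoints table, the Lookup table is probably an index into these buckets.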



#5

The European language packs contain only the gettext message catalog. There are no character glyphs in their resources.


#6
In reply to aoshimak (#4):

You can extract the codepoints using https://github.com/xndcn/pebble-firmware-utils/blob/for-language-pack/extract_codepoints.py
and generate a font file with `fontgen.py` from the Pebble SDK.

#7

Thanks, xndcn. Your font tools have served me well. I know only a little Python, so I cannot make full use of your tools or the tools in the SDK. I suppose there are bitmap fonts for Japanese that read well on bitmap displays, so I wish I could build a Language Pack not only from TrueType fonts but also from BDF fonts. (I have already tried to build my own Language Pack by hand, but without success.) For now I use SDK 2.9 and pbpack_tool.py, with the path from your script to fontgen.py changed to match the SDK. With the 3.0 beta SDK your tool fails with some errors. They seem to be Python-related, but I cannot fix them.


#8

The Python errors had this cause: Python could not load the FreeType dynamic library.
http://forums.getpebble.com/discussion/7211/solved-freetype-library-not-found-using-macports-on-os-x-10-8
With that fixed, fontgen.py works well with the extended option!
I will now try to build a Language Pack.


#9

To build the list of characters, or to limit the number of characters in the resources, would the .json file from your “extract_codepoints.py” work with fontgen.py?


#10

Thanks again, xndcn! I have built my own language pack with the TrueType fonts I wanted to use. Next I will refine it to reduce the number of characters and the size of the resources. Thank you for your utilities and comments.


#11
In reply to aoshimak (#10):
Hello, could you please give an example of using fontgen.py to replace the font? It would be much appreciated.

#12

Hi, I’d like to second @Kuro’s request for more detail…
Specifically, I’m interested in reusing the existing PebbleBit.com fonts and making a language pack out of them.
The fonts there are packed as part of the firmware, and it is not quite clear how to map the files you investigated above to the many resource files that come with the firmware. Any pointers would be appreciated.

BTW I am targeting Basalt v3.0, if that matters.


#13

I’m interested to know more too. It would be great if there were a tutorial :)


#14
In reply to Omer Agmon (#12):
Hi, I figured it out myself. The usage is below; I hope it helps.

python $PEBBLE_SDK_PATH/Pebble/common/tools/font/fontgen.py pfo --list codepoints.json --extended HEIGHT FONT_FILE OUTPUT_FILE
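
When several sizes have to be generated, the command line above can be assembled from a script. This is a minimal sketch: it only builds the argument list exactly as shown in this post and does not run anything. The function name `fontgen_command` and its parameter names are invented for illustration.

```python
import os


def fontgen_command(sdk_path, height, font_file, output_file,
                    codepoints_json="codepoints.json"):
    """Build the fontgen.py argument list in the order shown in the post."""
    script = os.path.join(sdk_path, "Pebble", "common", "tools", "font", "fontgen.py")
    return [
        "python", script, "pfo",
        "--list", codepoints_json,
        "--extended",
        str(height), font_file, output_file,
    ]


if __name__ == "__main__":
    # Example only: print the command for a 14-pixel font instead of running it.
    sdk = os.environ.get("PEBBLE_SDK_PATH", ".")
    print(" ".join(fontgen_command(sdk, 14, "MyFont.ttf", "001")))
```

The resulting list can be handed to `subprocess.check_call` once the paths have been checked against your own SDK layout.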

#15
In reply to Kuro (#14):
Thank you! I just found fontgen.py and read the comments in it.
It seems the entire Font Resource format is described there, which looks great!

One point I'm still not sure about: what maps the character to be displayed to the glyph that is actually drawn? I see no evidence that, say, UTF-8 encoding is part of the font format, so how does the firmware match the right glyph to the right character?

#16

If someone wants to use particular characters in an application, they can make a pfo file and add it to the application's resources. To extend the languages of the Pebble itself, you can make a language pack file and install it on your Pebble.

For languages with one-byte characters, such as the European languages, you only need to make a message file.
- Download a language pack from http://lp.getpebble.com/
- Unpack the language pack with pbpack_tool.py
- Decompile the message file 000 with msgunfmt from the GNU gettext utilities
- Edit 000.po (messages and language settings)
- Compile 000.po to 000.mo with msgfmt from the GNU gettext utilities, and rename it to 000
- Repack the language pack with the new 000 file
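
Before repacking, it can be useful to confirm that the rebuilt 000 file is still a readable catalog. As an alternative to msgunfmt for a quick check (not a replacement for the workflow above), Python's standard `gettext` module can load a .mo file directly. The function name `load_catalog` is just an illustrative choice.

```python
import gettext


def load_catalog(mo_path):
    """Load a compiled GNU message catalog, such as the unpacked file 000."""
    with open(mo_path, "rb") as fp:
        return gettext.GNUTranslations(fp)


def summarize(mo_path):
    """Return the header fields and declared charset of a catalog."""
    catalog = load_catalog(mo_path)
    # info() holds the header entries with lower-cased keys;
    # charset() is taken from the Content-Type header line.
    return catalog.info(), catalog.charset()
```

`load_catalog("000").gettext("some original string")` then returns the translation stored in the pack, or the original string if there is none.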

For languages with two-byte characters, you also need to install the two-byte character glyph data, so you have to make pfo files with fontgen.py and build them into your language pack.
- Unpack a language pack with pbpack_tool.py
- (If you need messages in your language) decompile the message file 000 and edit it as above
- Make pfo files from a TrueType font to install the character glyphs on your Pebble: 14-dot, 18-dot, 24-dot and 28-dot, each in regular and bold weight
- Rename the pfo file for 14-dot regular -> 001
- Rename the pfo file for 14-dot bold -> 002
- Rename the pfo file for 18-dot regular -> 003
- Rename the pfo file for 18-dot bold -> 004
- Rename the pfo file for 24-dot regular -> 005
- Rename the pfo file for 24-dot bold -> 006
- Rename the pfo file for 28-dot regular -> 007
- Rename the pfo file for 28-dot bold -> 008
- Repack all of the files with pbpack_tool.py using its pack option
The language pack is then complete.
However, not all of the characters for your language will fit in the language pack, because of the Pebble's storage size.
So the characters included in each pfo file need to be limited with a JSON list.
If you use the same pfo file for both weights of a size, for example 28-dot regular (filename 007) and bold (filename 008), the language pack becomes smaller. I suppose the pack links to the same glyphs and drops the duplicate object.
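
The renaming steps above follow a simple pattern, so they can be expressed as a small helper. This is only a sketch of the naming rule listed in this post; `slot_name` and `plan_slots` are hypothetical names, and the actual copying and repacking are left to your own script and pbpack_tool.py.

```python
SIZES = (14, 18, 24, 28)


def slot_name(height, bold):
    """Map a size and weight to its file name: 14 regular -> '001', 28 bold -> '008'."""
    slot = 2 * SIZES.index(height) + (2 if bold else 1)
    return "%03d" % slot


def plan_slots(pfo_for):
    """Turn {(height, bold): pfo_path} into {slot_file_name: pfo_path}."""
    return {slot_name(h, b): path for (h, b), path in sorted(pfo_for.items())}
```

Passing the same pfo path for both weights of a size, for example `plan_slots({(28, False): "g28.pfo", (28, True): "g28.pfo"})`, corresponds to the size-saving tip in the last paragraph.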


#17
In reply to Omer Agmon (#15):
Hi there,

I'm not sure, but I think the codepoints file does the mapping work here.

#18
In reply to aoshimak (#16):

Hi there,

Thanks so much for your helpful explanation, especially the last sentence! I had made a pack myself, but it is a little big because I used fonts of different weights for the regular and bold pfo files. I will try recreating the pack with the same font for both weights to see whether that trims the size.

Thanks again!

#19

In my environment, under Python 2.7, xndcn’s utilities fail with some errors on Pebble SDK 3.0 beta and 3.0.
So I use SDK 2.9 and have rewritten the path in xndcn’s pbpack_tool.py to match the SDK 2.9 directory structure.
(I only use the monochrome model, so this is an easy way for me to make language packs.)


#20
Omer Agmon said:

One point I'm still not sure about: what maps the character to be displayed to the glyph that is actually drawn? I see no evidence that, say, UTF-8 encoding is part of the font format, so how does the firmware match the right glyph to the right character?
A TrueType font contains pairs of codepoint and glyph information.
When you convert it with fontgen.py, those pairs are carried over into the pfo file.
I think the Pebble language pack uses UTF-16 codepoints in its code tables.

You can check the pairs of codepoints and characters with xndcn's extract_codepoints.py:
$ python extract_codepoints.py foo.pfo > characterlist.json
This outputs a list of the characters with their codepoints (UTF-16, in decimal) and the characters themselves.
If you want to limit or add characters for a new pfo file, edit the codepoint list and pass it to fontgen.py with the --list option.
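
To decide which characters to keep, one approach is to collect the codepoints that actually occur in your translated messages. The sketch below does only that part. The JSON shape written by `codepoint_list_json` (a single "codepoints" key holding decimal values) is an assumption, so compare it with the file that extract_codepoints.py really produces before passing it to fontgen.py. Both function names are made up for illustration.

```python
import json


def codepoints_in_text(text):
    """Return the sorted, de-duplicated codepoints of all non-space characters."""
    return sorted({ord(ch) for ch in text if not ch.isspace()})


def codepoint_list_json(codepoints):
    """Serialize codepoints as decimal numbers; the key name is an assumption."""
    return json.dumps({"codepoints": list(codepoints)})
```

For example, feeding the text of 000.po to `codepoints_in_text` gives the smallest character set that still covers every system message.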