Friday, November 25, 2005

Web Internationalization [I18N]: Part II

Today, I'd like to talk about character encodings.

Character encodings are as old as computer science itself. The famous ASCII table is one of the most popular character encodings.

So, what does character encoding mean? A character encoding is a scheme of numeric codes that represent the characters of a character set in memory.
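To make the idea of "numeric codes that represent characters" concrete, here is a minimal Python sketch. The helper names `code_point` and `encode` and the sample characters are my own illustrative choices; they only wrap Python's built-in `ord` and `str.encode`.

```python
# A minimal sketch: one character, several numeric representations.
# The helper names and sample characters are illustrative assumptions.

def code_point(ch):
    """Return the number the character set assigns to a character."""
    return ord(ch)

def encode(ch, encoding):
    """Return the bytes a given encoding uses to store the character."""
    return ch.encode(encoding)

print(code_point("A"))               # the number assigned to "A"
print(encode("A", "ascii"))          # the single byte ASCII stores for it
print(encode("\u00e9", "latin-1"))   # e-acute as one byte in Latin-1
print(encode("\u00e9", "utf-8"))     # the same character as two bytes in UTF-8

# Decoding bytes with the wrong encoding silently yields the wrong characters.
print(encode("\u00e9", "utf-8").decode("latin-1"))
```

The last line shows why the encoding has to be known by both sides: the same bytes read back as two unrelated characters when the wrong table is used.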

There are many character encodings in this world because a lot of people have tried to find ways to express their own languages or characters on computers.

Before we dig deep into character encoding, we need to understand some basic concepts.

  • Character: According to the glossary of the Unicode Standard [Unicode Standard 4.0], a character is the smallest component of a written language that has semantic value.
  • Phoneme: A phoneme is a minimally distinct sound in the context of a particular spoken language. We can also say that the phoneme is the unit of aural rendering. In some scripts, characters relate closely to phonemes, while in others they relate closely to meaning. There is no one-to-one correspondence between characters and phonemes.
  • Glyph: Glyphs are defined by ISO/IEC 9541-1 [ISO/IEC 9541-1] as "a recognizable abstract graphic symbol which is independent of a specific design". A glyph is usually also referred to as the unit of visual rendering. There is no one-to-one correspondence between characters and glyphs.
  • Unit of input: In keyboard input, it's NOT ALWAYS the case that keystrokes and input characters correspond one-to-one. Only a few languages, like English, can map keystrokes to characters one-to-one. Many other languages use far more complex writing systems. It's impossible to fit all of their characters onto a keyboard, so they must rely on some kind of input method that transforms a keystroke sequence into a character sequence.
  • Unit of collation: String comparison, as used in sorting and searching, is based on collation units rather than characters. Collation units do not have a one-to-one relation with characters. For example, in traditional Spanish sorting, the character sequences 'ch' and 'll' are each treated as an atomic collation unit.
  • Unit of storage: All information is ultimately kept in physical storage, which we usually think of in terms of bits and bytes, and this is the most complex part. A frequent error in specifications and implementations is the equating of characters with units of physical storage. The mapping between the two is our subject, usually called the character encoding.
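The "unit of storage" point can be checked with a short Python sketch. The helper name `storage_sizes` and the four sample characters are assumptions chosen for illustration; the encodings are the standard UTF-8, UTF-16 and UTF-32 codecs that ship with Python.

```python
# A minimal sketch: one character does not mean one byte.
# Sample characters (illustrative): Latin "A", e-acute, a CJK ideograph, an emoji.
samples = ["A", "\u00e9", "\u4e16", "\U0001F600"]

def storage_sizes(ch):
    """Return how many bytes one character occupies in several encodings."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(ch.encode(enc)) for enc in encodings}

for ch in samples:
    # len(ch) counts characters (code points); storage_sizes counts bytes.
    print(f"U+{ord(ch):04X}", len(ch), storage_sizes(ch))
```

Each sample is a single character as far as `len` is concerned, yet the byte counts differ from one encoding to the next, which is exactly why equating characters with bytes goes wrong.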

The above terms are the basic concepts for understanding character encoding.

This is the end of Part II.
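The "unit of collation" idea from the list above can also be sketched in Python. This is a simplified illustration, not a real collation implementation: the alphabet table, the helper names `collation_units` and `traditional_key`, and the word list are all my own assumptions, and accented vowels are ignored.

```python
# A simplified sketch of traditional Spanish sorting, where "ch" and "ll"
# each count as one collation unit. Accented letters are not handled.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
            "l", "ll", "m", "n", "\u00f1", "o", "p", "q", "r", "s", "t",
            "u", "v", "w", "x", "y", "z"]
RANK = {unit: position for position, unit in enumerate(ALPHABET)}

def collation_units(word):
    """Split a word into collation units, keeping 'ch' and 'll' together."""
    word = word.lower()
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            units.append(pair)
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units

def traditional_key(word):
    """Build a sort key from unit ranks; unknown units sort last."""
    return [RANK.get(unit, len(ALPHABET)) for unit in collation_units(word)]

words = ["cuna", "chico", "dama", "luz", "llama", "mano"]
plain = sorted(words)                             # compares character by character
traditional = sorted(words, key=traditional_key)  # compares unit by unit
print(plain)
print(traditional)
```

With plain character comparison, "chico" comes before "cuna" and "llama" before "luz"; with the unit-based key, each pair swaps, because "ch" sorts after every "c..." word and "ll" after every "l..." word.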

Tuesday, November 22, 2005

Web Internationalization [I18N]: Part I

These days, I'm preparing for the first paper contest at my company. Some people may consider this kind of contest too naive to attend, but you know what? I think this kind of contest will drive most of the new employees to improve their skills in a particular field.

As a member of the company's Technical Center, I'd like to write something about internationalization (aka I18N) on the Web. Web I18N has been discussed for a long time, and a lot of people have already drawn conclusions or written guides to it. I'd like to look through them and build a similar set of conclusions, ones that many experts have already recommended. For me, it's a chance to show everybody how I do research and how I build a nice presentation.

Some time ago, I considered writing something about AJAX, which is becoming more and more popular on the Internet, but two other guys had already decided to write on that topic. So I chose to give it up and find myself another topic, which is this one: Web I18N.

Within my paper, I'd like to cover these areas:

  • Character Encodings
  • Character Escaping
  • Unicode
  • Normalization
  • How to build a Web I18N site based on J2EE technologies
  • How to build a Web I18N site based on ASP.NET technologies
  • How to build a Web I18N site based on AJAX technologies
  • Finally, some demos

Ladies and Gentlemen, it's Show Time!

Monday, November 21, 2005

About the friendship

As a friend, I usually consider that friendship is built on something we both accept, and I also think that we usually have some basic principles by which to measure our friendship. Those principles are our baseline: if someone breaks any one of them, I would let the friendship die, even if the friendship has grown into something we call love.

Those basic principles are made of our values and our philosophy. In order to build some kind of wonderful life, I'd like to list some basic principles of mine.

  • Integrity
  • Respect other people, including their privacy
  • To be added

Friday, November 18, 2005

Is it a bug in IE 6.0?

[Keywords: IE, Internet Explorer, 6.0, bug, onmouseover, event, not fired, not work]

Today, when I was trying to do something like drag and drop in IE, I found a problem.

My purpose: I'd like to build a table in which the user can click on a cell and drag, and it will work the way selecting words works in MS Word. I added onMouseDown, onMouseOver, and onMouseUp handlers to the cells that can be "selected".
The essential mechanism is to use onMouseOver to change the background color of a cell once onMouseDown has fired on some cell.
One more thing: I used an Array as a stack to store the cells that are already selected.

My problem: When I traced the onMouseOver event, I found that when dragging over some cells, the corresponding onMouseOver event is not fired, which makes the selection work incorrectly!

Here is the demo file: [Download the demo].

Er... it really doesn't make sense! Can anyone tell me why this happens?

BTW: My system environment is listed below.

CPU: Intel P4 2.4G
Memory: 512MB
OS: WinXP +SP2
IE Version: 6.0.2900.2180.xpsp-sp2-gdr.050301-1519

Wednesday, November 16, 2005

New Template

Today, I finished my new template for LifeType (pLog), and I called it "toto's gray". The following are screenshots of my templates.

old template snapshot

new Template


Just finished a Character Test from

And the result is almost what I expected :)

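Your personality type is: ENFJ (Extraverted, Intuitive, Feeling, Judging)
  ◆ Excellent communication and expression skills
  ◆ Natural leadership and the ability to bring people together
  ◆ Warm and enthusiastic, with a strong ability to seek cooperation
  ◆ Resolute and decisive, with organizational ability
  ◆ Eager to innovate
  ◆ Connects emotionally with others, anticipates their needs, and cares about them sincerely
  ◆ Wide-ranging interests and a quick mind
  ◆ Sees the big picture and perceives the relationship between behavior and awareness
  ◆ Pushes oneself to achieve results and reach goals
  ◆ Dedicated and conscientious in a career one believes in

  ◆ Unwilling to do things that conflict with one's own values
  ◆ Tends to idealize interpersonal relationships
  ◆ Finds it hard to work in a highly competitive, tense environment
  ◆ Has no patience for people who are inefficient or inflexible
  ◆ Avoids conflict and tends to overlook unpleasant matters
  ◆ Tends to make hasty decisions before gathering enough evidence
  ◆ Reluctant to reprimand subordinates
  ◆ Prone to mistakes through rashness
  ◆ Tends to settle for small-scale management and never gives up control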

Monday, November 14, 2005

Something you might not know about the Apache Group

Something that you might not know:

  • Apache originally started as a set of patches to the NCSA web server, one of the earliest web servers, which Rob McCool developed while he was working at NCSA (the National Center for Supercomputing Applications). Its name is a sort of pun on "A PAtCHy Web Server".
  • In 1999, the same folks who wrote the Apache server formed the Apache Software Foundation (ASF). The ASF is a non-profit organization created to facilitate the development of open source software projects.
  • The ASF license is much more permissive than the GPL or LGPL: products under the ASF license may be freely redistributed.
  • If you want to integrate Apache and Tomcat, there are two protocols: AJP and WARP. AJP stands for Apache JServ Protocol. It first appeared with Tomcat 3.x, which integrated with the Apache server through the mod_jserv module. The later AJP connector modules are named mod_jk and, with Tomcat 4.x, mod_jk2. WARP is another connector protocol, only for the Tomcat 4.x series, and it was intended to provide greater flexibility and performance than AJP. WARP uses a connector module named mod_webapp, which is currently the only connector that implements the WARP protocol.