Wednesday, 26 October 2016

Family of Korean IDNs

The following is a list of functioning Korean IDNs (Internationalized Domain Names). They all belong to the same computer repair company. The TLD (Top Level Domain) used is 닷컴, which is Verisign's new Korean-language equivalent of their com TLD. Each IDN contains 컴퓨터수리, which means Computer Repair. The only difference between these IDNs is the first two characters, which are the names of South Korean cities. I think this is a clever and creative use of IDNs!

The last two IDNs in the first list below are structured differently. The first two characters are, I think, a neighbourhood, and the first two characters after the hyphen are the city.

The cities are: 시흥 Siheung, 부천 Bucheon, 창원 Changwon, 마산 Masan, 평택 Pyeongtaek, 오산 Osan, 진해 Jinhae, 김해 Gimhae, 부산 Busan.

  1. 시흥컴퓨터수리.닷컴
  2. 부천컴퓨터수리.닷컴
  3. 창원컴퓨터수리.닷컴
  4. 마산컴퓨터수리.닷컴
  5. 평택컴퓨터수리.닷컴
  6. 오산컴퓨터수리.닷컴
  7. 진해컴퓨터수리.닷컴
  8. 김해컴퓨터수리.닷컴
  9. 북동컴퓨터수리-창원컴퓨터수리.닷컴
  10. 우동컴퓨터수리-부산컴퓨터수리.닷컴
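The naming pattern above (city + 컴퓨터수리 + 닷컴) can be sketched in a few lines of Python. This is only an illustration: the helper names are my own invention, and I am assuming that Python's built-in "idna" codec (which implements the older IDNA 2003 rules) is adequate for these Hangul labels. It also shows the ASCII form, prefixed with xn--, that each IDN takes in the DNS.

```python
# Hypothetical helper names; the cities, the shared label and the TLD
# are taken from the list above.
CITIES = ["시흥", "부천", "창원", "마산", "평택", "오산", "진해", "김해"]
SERVICE = "컴퓨터수리"  # "Computer Repair"
TLD = "닷컴"            # Korean-language TLD


def build_family(cities, service=SERVICE, tld=TLD):
    """Return one IDN per city, following the city + service + TLD pattern."""
    return [f"{city}{service}.{tld}" for city in cities]


def to_ascii(domain):
    """Convert an IDN to its ASCII-compatible (Punycode, 'xn--') form.

    Assumption: Python's built-in 'idna' codec (IDNA 2003) is good enough
    for these labels.
    """
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain):
    """Convert an ASCII-compatible domain back to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")


family = build_family(CITIES)
for name in family:
    print(name, "->", to_ascii(name))
```

The two hyphenated IDNs at the end of the list do not follow this simple pattern, so I have left them out of the sketch.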

Update 9th March 2017: Here is another family of Computer Repair 컴퓨터수리 IDNs with a different registrant.

  1. 김포컴퓨터수리.닷컴
  2. 안양컴퓨터수리.닷컴
  3. 용인컴퓨터수리.닷컴
  4. 용산컴퓨터수리.닷컴
  5. 대구컴퓨터수리.닷컴
  6. 종로컴퓨터수리.닷컴
  7. 강남컴퓨터수리.닷컴
  8. 파주컴퓨터수리.닷컴
  9. 일산컴퓨터수리.닷컴
  10. 성남컴퓨터수리.닷컴

Friday, 7 October 2016

Computer Science Internationalization — Bidi

Scripts such as Latin are written from Left to Right (L➡︎R). Scripts such as Arabic and Hebrew are written Right to Left (L⬅︎R). What happens when we mix L➡︎R and L⬅︎R scripts within a document? Here is an exercise in mixing scripts.
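Every Unicode character carries a bidi class that tells the computer which direction it belongs to. As a small sketch, Python's standard unicodedata module can report that class. The sample characters and the helper name are my own choice, not part of the exercise.

```python
import unicodedata


def bidi_class(ch):
    """Return the Unicode bidi class of a single character."""
    return unicodedata.bidirectional(ch)


# Sample characters, written as escapes so the source stays easy to read.
samples = {
    "Latin a": "a",
    "Hebrew alef": "\u05d0",
    "Arabic alef": "\u0627",
}

for label, ch in samples.items():
    # "L" = strong left-to-right, "R" = strong right-to-left,
    # "AL" = right-to-left Arabic letter.
    print(label, bidi_class(ch))
```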

Take a mixed bidi (bidirectional) string consisting of Latin and Hebrew characters in a L➡︎R paragraph.

abcאבגdef

...and here is the same string in a L⬅︎R paragraph.

abcאבגdef

Now to the actual exercise. Copy the above strings to your text editor or word processor. You will need to set up the 2nd occurrence of the string as a L⬅︎R paragraph. I am assuming that your directionality is L➡︎R by default. Each string has two boundaries where the text changes direction. At each boundary you are going to insert a character, either a L➡︎R character, such as x, or a L⬅︎R character, such as ד. For each insertion operation, start again from the initial mixed bidi string. There are two mixed strings above, each with two boundaries and two possible characters to insert, and so there are a total of 8 insertion operations. The challenge is to predict where in the string the inserted character will appear before you actually insert it. Give it a go! Good luck😀

If I had done this exercise before I ever studied bidi, I would probably have scored 4/8. Now I understand how the computer processes this bidi text, and so I usually score full marks for such exercises. It is not, though, an intuitive process for me, as I have spent most of my life reading and writing L➡︎R scripts only. I have to think very carefully about how the computer does it in order to determine the correct answers.

The main purpose of this exercise is to think about the ordering of the characters in the strings. There are two orderings to consider: memory order and display order. Memory order is how the string is logically saved in memory, which in this case is the order in which I typed it. The memory order of the string I have used above is "abcגבאdef", that is: a, b, c, alef (א), bet (ב), gimel (ג), d, e, f. Display order is how the string is presented to the viewer. You have already seen, above, the two possible display orders for the single string "abcגבאdef": one in a L➡︎R paragraph and one in a L⬅︎R paragraph.
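To make the difference between memory order and display order concrete, here is a minimal sketch in Python. It is not the full Unicode Bidirectional Algorithm: I am assuming the string contains only strong L➡︎R and L⬅︎R letters (no digits, spaces or punctuation), and the function names are my own. It assigns each character an embedding level and then reverses runs from the highest level down, which is the reordering step the full algorithm also uses.

```python
import unicodedata


def embedding_levels(text, paragraph_rtl):
    """Assign a level to each character: even = L-to-R, odd = R-to-L.

    Simplification: only strong characters (bidi classes L, R, AL)
    are supported.
    """
    base = 1 if paragraph_rtl else 0
    levels = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            # Lowest even level that is not below the paragraph level.
            levels.append(base if base % 2 == 0 else base + 1)
        elif cls in ("R", "AL"):
            # Lowest odd level that is not below the paragraph level.
            levels.append(base if base % 2 == 1 else base + 1)
        else:
            raise ValueError(f"unsupported bidi class {cls!r} in this sketch")
    return levels


def display_order(text, paragraph_rtl=False):
    """Return the characters as they appear on screen, read left to right."""
    levels = embedding_levels(text, paragraph_rtl)
    chars = list(text)
    if not chars:
        return ""
    # From the highest level down to 1, reverse every run at that level or above.
    for level in range(max(levels), 0, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= level:
                j = i
                while j < len(chars) and levels[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


# Memory order: a, b, c, alef, bet, gimel, d, e, f
memory = "abc\u05d0\u05d1\u05d2def"
ltr_display = display_order(memory, paragraph_rtl=False)
rtl_display = display_order(memory, paragraph_rtl=True)
```

The results are sequences in visual left-to-right order, so it is safer to inspect them with repr() or ascii() than to print them, because a bidi-aware terminal would reorder them a second time.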

I have used TextEdit for this exercise. To set the paragraph text direction in TextEdit, follow the menu path "TextEdit➜ Format➜ Text➜ Writing Direction" and set the paragraph text direction to Right to Left. TextEdit handles bidi text correctly, but that is not the case for all word processors or text editors.

There are several permutations of this exercise, including:

  1. What happens at the boundaries with forward delete and back delete?
  2. What happens if the initial memory order character(s) are L⬅︎R instead of L➡︎R?
  3. Use Arabic instead of Hebrew as this introduces the additional challenge of letters changing shape according to preceding and following characters.

This article is aimed at L➡︎R reading/writing people. If you are a L⬅︎R person then you will need to invert some of my instructions. Actually, if you are a L⬅︎R person you will be totally familiar with mixing bidi text and so will fully understand this exercise.

Environment: macOS v10.12 (Sierra), TextEdit v1.12