Monday, May 28, 2007

Things to do...

29 May 12:30 - ECON115 final exam (Sports Hall)
31 May 09:30 - HUMA099G final exam (LTA)
1 June 12:45 - CPEG appreciation lunch (LG7)
1 June 15:00 - Meeting at TnC Ltd. Office
4 June 17:00 - Deadline for submitting Pan-PRD competition materials
8 June 12:30 - Best FYP Award Presentation
1H June - Submit industrial training logbook (long overdue)

Sunday, May 20, 2007

Final examination


Right... after the FYP presentation comes the final examination of the final semester of my UG life.

Thursday, May 17, 2007

WT Toolkit presentation at HKUST

There will be a presentation of WT Toolkit at the Hong Kong University of Science and Technology this Saturday. In addition to demonstrating WT Toolkit, we (i.e. Marco and I) will also discuss the difficulties and pitfalls facing AJAX developers, and we will compare WT Toolkit with other popular AJAX toolkits like Prototype and Dojo.

Date: 19th May, 2007
Time: 17:20 - 18:00 HKT (+0800)
Venue: Hong Kong University of Science and Technology, Room 4480

Wednesday, May 16, 2007

WT Toolkit broke into SourceForge top 500


Just in time for me to present my FYP on Saturday.

Tuesday, May 15, 2007

HKUST research on web technologies

While I was skimming the proceedings of the WWW2007 conference for interesting ideas (e.g. this one, a simple method of adding security to AJAX mashups) this morning, I saw a paper from HKUST:

Exploring in the Weblog Space by Detecting Informative and Affective Articles


The paper describes a method that classifies blogs into various degrees between "informative" and "affective". Informative blogs, like Alex Russell's, dispense useful information that interests the readers. Affective blogs are diaries describing things that mostly interest the author only. High quality blogs (i.e. those that people want to read) are usually informative.

The method shouldn't be treated as an absolute measure of blog quality, however. Let's take a look at a random paragraph from Joel on Software, a popular informative blog:

That's why I'm incredibly honored that they invited me to write a guest editorial about recruiting and internships in this month's issue. Thanks to professional editing, it feels a little bit polished compared to my usual style. I don't think I would write, "Ah, college." I do remember writing, "Get me a frosty cold orange juice, hand-squeezed, and make it snappy!"

Highlighted in red is one of the top feature phrases indicating affective blogs, as described by the paper. Guess how the algorithm would classify the above paragraph, and Joel's blog entries in general? I don't have the software on hand so I can't test it and get the data, but the above paragraph has a word that is ranked as a top representative feature in the affective category, and none in the informative category. And that's not just an isolated example:

Microsoft finally put Lookout back up for download, but they sure weren't happy about it. ... The story has a happy ending.

A number of years ago a programmer friend of mine worked for a company...

... it wouldn't be such a bad thing to take Air France and change planes at CDG.

Among other things, this week I've been working on the new office design with our architect, Roy Leone [flash site].

Microsoft did the only thing that made sense...

I've been nattering on about this topic for well over 5000 words and I don't really feel like we're getting anywhere.

Thanks to professional editing, it feels a little bit polished compared to my usual style.

I had a chance to visit 7 World Trade Center today...

The "like" example above assumes that there's no word sense disambiguation (or something similar) in their algorithm, since the "like" in the paper and the "like" in my example have different meanings. But hey, the paper didn't mention WSD at all.

On the other hand, the only informative features I could find in Joel's blog today are "project" and "report". They appear much less frequently than affective features in Joel's blog.

Joel's blog, however, is widely regarded as highly informative by software engineers. It's just that Joel prefers to write his blog entries in an informal and personal style. But anyway, the paper is still an interesting read on how computers can attempt to "understand" and filter information these days.
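As a footnote to the "like" problem above, here is a toy sketch of what matching feature words by spelling alone looks like. This is NOT the paper's algorithm. The function name countFeatures, the sample sentence and the tiny feature lists (just the three words named in this post) are my own assumptions for illustration.

```javascript
// Toy illustration only: NOT the classifier from the paper.
// The feature lists contain just the words mentioned in this post.
var AFFECTIVE = ["like"];
var INFORMATIVE = ["project", "report"];

// Count how many tokens of `text` appear in each feature list.
// Tokens are matched by spelling alone, so every sense of "like" counts.
function countFeatures(text) {
  var tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  var counts = { affective: 0, informative: 0 };
  for (var i = 0; i < tokens.length; i++) {
    if (AFFECTIVE.indexOf(tokens[i]) !== -1) counts.affective++;
    if (INFORMATIVE.indexOf(tokens[i]) !== -1) counts.informative++;
  }
  return counts;
}

// A made-up sentence where "like" means "such as", not "enjoy".
var sample = "Toolkits like Prototype and Dojo are covered in the project report.";
var result = countFeatures(sample);
// result is { affective: 1, informative: 2 }
```

The "like" here gets counted as an affective hit even though it expresses no feeling at all, which is exactly the kind of false positive I suspect happens without WSD.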

Monday, May 14, 2007

WT Toolkit broke into SourceForge top 1000


Well, a day of top 1000 isn't too hard to do. BitAnarch broke into top 10 for a few days in 2003. But still, this is good. Considering the 190,000 ranked projects in SourceForge, we're firmly in the top 1% of all projects.

Sunday, May 13, 2007

AJAX frameworks are NOT pointless

This was a response I posted to Slashdot a week ago. Why am I reposting it here? Because I came across my own post while searching on Google today. In particular, I found other bloggers and websites bookmarking or discussing the stuff I wrote. So I guess the stuff I wrote was useful? Then maybe you want to know about it too. So, here it is:



There are many funny little things that just happen when you're coding a web application in JavaScript without a framework/library/toolkit helping you. Unless you're really an AJAX/JavaScript wizard, coding an AJAX-enabled web application on your own and mixing online code recipes is a very dangerous thing to do.



Browser inconsistencies

This is the most obvious one, but it's only the entry to the rabbit hole. If you are not familiar with the example I give below (maybe not exactly the same, but any AJAX web developer worth his salt should have seen one like it), then please, PLEASE, do yourself, your fellow developers and your users a favor: resist the urge to hack things together for once, and use a mature AJAX framework.

An important part of AJAX is that you need to update what is displayed on the web browser in the client side (by JavaScript), without refreshing the page. This implies that you're very likely to have to create and destroy DOM nodes on the fly. Now, how do you create a radio button in JavaScript?

How about...

var node = document.createElement("input");
node.type = "radio";
node.name = ...
node.value = ...
That's what you would do if you followed the DOM standard. But sorry, this does not work. Try creating a radio button with the above code segment in Internet Explorer 6 and you'll get a broken radio button - you can't select it. The way that actually works in IE is described in this MSDN article [microsoft.com]:

newRadioButton = document.createElement("<INPUT TYPE='RADIO' NAME='RADIOTEST' VALUE='Second Choice'>")
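Roughly, a toolkit has to hide this difference behind one function. Here is a minimal sketch of how that could look. The name createRadioButton and the doc parameter (normally just document) are my own, and I'm assuming the name contains no quote characters. It is not taken from any particular framework.

```javascript
// Sketch: create a radio button that is selectable in both IE6 and
// standards-compliant browsers. `doc` is normally `document`.
// Assumes `name` contains no quote characters.
function createRadioButton(doc, name, value) {
  var node = null;
  try {
    // IE-only form: the element's HTML is passed to createElement.
    // Other browsers throw here because "<input ...>" is not a valid tag name.
    node = doc.createElement('<input type="radio" name="' + name + '">');
  } catch (e) {
    node = null;
  }
  if (!node || String(node.nodeName).toUpperCase() !== "INPUT") {
    // Standard DOM path.
    node = doc.createElement("input");
    node.type = "radio";
    node.name = name;
  }
  node.value = value;
  return node;
}
```

In a page you would call createRadioButton(document, "RADIOTEST", "Second Choice") and append the result as usual.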



Memory leaks

The last one was easy. Do you know you can make a web application that leaks memory like a sieve in Internet Explorer 6 by making a simple circular reference like the following one?

var node = document.createElement("div");
node.someAttr = node;
If you're a good programmer, I might have sounded an alarm in your head right now - any circular reference involving DOM nodes in IE6 results in a memory leak that persists after URL changes or page refreshes - unless you use an AJAX toolkit that takes care of the issue for you. Have you assigned a DOM node as an attribute value under another DOM node in the past? Yes? Then you'd better check your web application for memory leaks with Drip [outofhanwell.com], now.

What's more, it's not just assigning DOM nodes as attributes that results in memory leaks. Closures in JavaScript can also form circular references and cause memory leaks. What makes closures particularly dangerous is that circular references with closures are not easy to spot. For example, the following code segment leaks:

var node = document.createElement("div");
var clickHandler = function(){};
node.onclick = clickHandler;
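Looks innocent enough, but you've already formed a leaky circular reference here: node->clickHandler->node. The node holds the handler through its onclick attribute, and the handler's closure keeps alive the scope in which node was declared.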
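One common workaround is to break the cycle yourself by clearing the handler attributes before the nodes are thrown away (or when the page unloads). Below is a minimal sketch of that idea. The names setHandler, releaseHandlers and trackedHandlers are made up for this post, and real toolkits do considerably more than this.

```javascript
// Sketch: remember which handler attributes were set so that the
// node <-> closure cycle can be broken later. Names are hypothetical.
var trackedHandlers = [];

function setHandler(node, prop, handler) {
  node[prop] = handler;
  trackedHandlers.push({ node: node, prop: prop });
}

// Call this when the nodes are discarded, or from window.onunload.
function releaseHandlers() {
  while (trackedHandlers.length > 0) {
    var entry = trackedHandlers.pop();
    entry.node[entry.prop] = null; // break the node -> handler reference
    entry.node = null;             // drop our own reference to the node
  }
}
```

Instead of node.onclick = clickHandler you would write setHandler(node, "onclick", clickHandler), and call releaseHandlers() once at unload.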

For more information about memory leaks under IE6, read these:

Mihai Bazon's blog entry [bazon.net]
MSDN's lengthy and confusing description of the problem [microsoft.com]



The XMLHttpRequest object is not as simple as you think

Much of the magic of AJAX comes from the XMLHttpRequest object (or its ActiveX equivalent, or an iframe, etc.), right? Sure. If you're only doing something simple via AJAX (like, updating the server time), then you can just copy an XMLHttpRequest code snippet from sites like this [apple.com] and hack away, right?

Wrong! Those XMLHttpRequest code snippets are one of the very reasons why people think of AJAX as a hack - it sometimes doesn't work! The XMLHttpRequest code snippet given on Apple's site can be broken in commonly encountered situations, and you can simulate that yourself:

  1. Write a simple AJAX web application that retrieves and displays the current server time on a web browser using Apple's code snippet.

  2. Test it yourself under normal conditions. So it works and it's safe to use, right? Let's see...

  3. Change your computer's routing table so that you have no route to the web server.

  4. Now test your application again in Firefox. Your application should fail. But does it fail gracefully? No. You see an error message in Firefox's error console stating that the XMLHttpRequest object's status attribute cannot be read. If you have coded something to handle AJAX request failures, your handler won't be called.

Why is that happening? Because under Firefox, any socket error during an AJAX request causes the onreadystatechange handler to be called, yet the status attribute cannot be read. Reading it throws a JavaScript error and stops the handler right there (unless you wrap it in a try...catch block, but that assumes you already know about the problem, so it's moot)! Under Internet Explorer, reading the status attribute in the same situation gives you the socket error code instead. Don't know about this stuff? Please, use a mature AJAX framework.
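For reference, the guard I'm talking about is only a few lines. This is a minimal sketch, with readStatus and handleStateChange being names I made up. Treating an unreadable status as 0 is my own convention here, not something the browsers define.

```javascript
// Sketch: read xhr.status without letting a thrown exception
// abort the onreadystatechange handler. Names are hypothetical.
function readStatus(xhr) {
  try {
    return xhr.status;
  } catch (e) {
    return 0; // treat "status unreadable" as a network failure
  }
}

function handleStateChange(xhr, onSuccess, onFailure) {
  if (xhr.readyState !== 4) return; // request not finished yet
  var status = readStatus(xhr);
  if (status >= 200 && status < 300) {
    onSuccess(xhr.responseText);
  } else {
    // Covers HTTP errors, the unreadable case, and IE's socket error codes.
    onFailure(status);
  }
}
```

You would hook it up with xhr.onreadystatechange = function() { handleStateChange(xhr, showTime, showError); }, where showTime and showError are your own callbacks.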



Performance problems

Coding AJAX applications is just like writing things in C++ or Java - so long as you're using efficient algorithms, your application should run fast, right?

Of course, you are wrong again. Let's say... in some part of your application, you want to concatenate a lot of string fragments together to form a long string in a for loop, how do you do it? How about...

var targetString = "";
for(var i=0;i<someArray.length;i++)
    targetString += someArray[i];
That's what most programmers would think of, intuitively. But the performance of that sucks under Internet Explorer. The fast way to combine many strings in JavaScript is to collect the fragments in an array and use the Array.join() operation. You can read more about this here [comet.co.il]. The optimization I talked about is also implemented in Dojo Toolkit (kudos to Alex Russell), and I believe any reasonably robust AJAX framework should have it too. Not knowing about such problems, had you hacked together a fairly sophisticated AJAX web application yourself, you would be running into performance hell sooner or later.
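In case the join() trick isn't obvious, here is a minimal sketch. The function name joinFragments is mine. If the fragments are already sitting in an array, a plain someArray.join("") does the whole job.

```javascript
// Sketch: build a long string by collecting the fragments in an array
// and joining once, instead of using += inside the loop.
function joinFragments(fragments) {
  var buffer = [];
  for (var i = 0; i < fragments.length; i++) {
    buffer.push(fragments[i]);
  }
  return buffer.join(""); // one concatenation at the end
}

var targetString = joinFragments(["Hello", ", ", "World", "!"]);
// targetString is "Hello, World!"
```

The loop is only there to show the pattern for fragments that are produced one at a time. How much it helps depends on the browser, so measure before you rewrite anything.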

Taking 646ms to combine strings still doesn't sound very slow to you, right? There are many more performance traps in JavaScript. Do you know there's a very significant performance difference between the following two code snippets?

First code snippet:

// placing 5000 "Hello World" messages in random positions
for(var i=0;i<5000;i++)
{
    var node = document.createElement("div");
    node.appendChild(document.createTextNode("Hello World!"));
    document.body.appendChild(node);
    node.style.position = "absolute";
    node.style.left = parseInt(Math.random() * 800) + "px";
    node.style.top = parseInt(Math.random() * 800) + "px";
}
Second code snippet:

// placing 5000 "Hello World" messages in random positions
for(var i=0;i<5000;i++)
{
    var node = document.createElement("div");
    node.appendChild(document.createTextNode("Hello World!"));
    node.style.position = "absolute";
    node.style.left = parseInt(Math.random() * 800) + "px";
    node.style.top = parseInt(Math.random() * 800) + "px";
    document.body.appendChild(node);
}
The only difference between the two code snippets is the placement of the document.body.appendChild() line. But if you actually test them out, the second code snippet is much faster, under both IE and Firefox. The performance difference has nothing to do with your algorithms - you just shuffled one line of code around; it has to do with how the browser renders the randomly placed DIV nodes. Ever wondered why your hacked together web application is taking half a minute running JavaScript after all the files are loaded?

So, unless you're already a programming god or don't mind spending lots of time solving bugs that you shouldn't have to solve, you really, really should use some of these AJAX frameworks if you're making anything fairly sophisticated with AJAX.

Friday, May 11, 2007

Feeling sick today...

My throat felt a little dry after eating some Bolognese spaghetti (it's just a $25 dish at fast food restaurants, not expensive stuff) for lunch yesterday. I thought that was normal, coz the spaghetti was somewhat spicy, and the soup was spicy too. I went on to attend lessons and meetings as usual. I slept at 6pm that day (yes, you read that correctly, 6pm, I have crazy sleeping times).

I got up at 2am today, and that little dryness I had in my throat had turned into pain. Oops? Just what had I done wrong? I certainly ain't overworking myself these days. But I have a presentation to do today, that sucks. :(

Thursday, May 10, 2007

hkpcug.homeftp.net goes down for a day

There's no electricity to my home today due to a routine checkup. Server will get back online tomorrow.

Sunday, May 6, 2007

Delayed execution idea scrapped, but the optimizations stayed

After some more thought, the delayed execution idea turned out to be stupid. It requires web developers to change their code to get the benefits, and it sometimes breaks your application.

What's this delayed execution stuff about originally? It's actually a trick to get around browser inefficiencies in rendering DOM nodes with CSS attributes.

Consider the following two code snippets:

// placing 5000 "Hello World" messages in random positions
for(var i=0;i<5000;i++)
{
    var node = document.createElement("div");
    node.appendChild(document.createTextNode("Hello World!"));
    document.body.appendChild(node);
    node.style.position = "absolute";
    node.style.left = parseInt(Math.random() * 800) + "px";
    node.style.top = parseInt(Math.random() * 800) + "px";
}

// placing 5000 "Hello World" messages in random positions
for(var i=0;i<5000;i++)
{
    var node = document.createElement("div");
    node.appendChild(document.createTextNode("Hello World!"));
    node.style.position = "absolute";
    node.style.left = parseInt(Math.random() * 800) + "px";
    node.style.top = parseInt(Math.random() * 800) + "px";
    document.body.appendChild(node);
}

Both code snippets place 5000 randomly positioned "Hello World!" messages in the browser window. The two code snippets differ only in the placement of the document.body.appendChild() line. Running the first code snippet in Firefox can take 1 minute or more, but running the second one takes only a few seconds. The second code snippet provides a more than 10x speedup compared to the first code snippet.

A similar phenomenon can be observed in Internet Explorer too, but only with much more complicated logic, so we'll not go over that. But anyway, the moral of the story is that modifying some CSS attributes (especially positioning attributes) is expensive once the DOM node is already visible.

So what did the scrapped delayed execution idea have to do with these browser weirdnesses? The delayed execution idea was meant to help in batch widget creation and batch CSS style manipulations, e.g. when you're creating 100 widgets in a single pass. It speeds up widget creation or CSS style manipulation by making a common ancestor DOM node of the widgets being manipulated/created invisible before executing the performance sensitive code, and making the ancestor node visible again after execution.

Sounds like a hack - yes it is a hack. It sometimes breaks your application code, it requires you to change your application code to use it. But as shown in the videos, it worked.
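For the curious, the scrapped idea boils down to something like the sketch below. This is NOT the code that was in WT Toolkit. The function name runWhileHidden and the use of style.display to hide the ancestor are my assumptions for illustration.

```javascript
// Sketch of the scrapped hack: hide a common ancestor, run the
// performance sensitive batch, then restore the ancestor's visibility.
function runWhileHidden(ancestor, batch) {
  var previous = ancestor.style.display;
  ancestor.style.display = "none";
  try {
    return batch();
  } finally {
    // Restore even if the batch throws, so the page is not left blank.
    ancestor.style.display = previous;
  }
}
```

You would wrap the batch like runWhileHidden(container, function() { /* create 100 widgets here */ }). This also shows why it sometimes breaks applications: anything in the batch that needs the ancestor to be laid out, such as measuring sizes, sees a hidden node.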

Now, the hack is scrapped, before it is even released. And that's because we've got a more consistent way of implementing the same optimization in WT Toolkit, without the need of using hacks.

So what do we have for 0.3.3 now:

  1. Massively increased widget creation performance in Internet Explorer, without needing the developer to change a single line of code.
  2. No performance improvement in Firefox if you don't change your application code... Oops?! But hey, that would be the same if we implemented the delayed execution hack.

But what if you want to make your WT Toolkit application run faster in Firefox? Just pass null as the parentWidget argument to the widget constructor as much as possible, and add the widget to the document tree only after you've done all the CSS manipulations.

Say, if you have

var n = new wtButton(myParent, "Yes!");
n.setAbsolutePosition(x, y);

Then, the optimized version would be

var n = new wtButton(null, "Yes!");
n.setAbsolutePosition(x, y);
myParent.addWidget(n);

Actually, you can perform the manual optimization with WT Toolkit 0.3.2 too.

Saturday, May 5, 2007

The art of presentations

I watched a total of four presentations and did one myself last Friday. Out of the four presentations that I saw, three were group presentations done for course projects, and the other one was a solo presentation done by an engineer in IELM311.

In the group presentations I watched, there was one presenter that was extremely remarkable - remarkably bad and unnatural. Good presentations feel like an old friend talking to you, even though you've never met the presenter beforehand. This guy... he spoke "perfect" English during the whole presentation, more perfect than native speakers - there was not even the slightest pause in his presentation. He just kept talking, talking and talking, jumping around mechanically as if those were gestures, with a smile always so wide on his face that he looked schizophrenic.

But isn't this stuff what our English teachers taught? Of course, nobody taught you to deliver your gestures mechanically, yet there's always somebody who goes too far in following those lessons.

The presenter on Friday reminded me of another presenter I saw when I was in a public speaking competition in form 6 - a presenter from another top secondary school who acted exactly like him. That presenter also spoke perfect English - with appropriate pauses this time, even. But there was something very unnatural about him - his body was swinging like a pendulum the whole time during his presentation. Looking at him made you feel like you were at a rave party. The judge (who was a foreigner) gave him a very low grade as a result.

What is a good presentation? I've seen good and exciting presentations where the presenter didn't even speak good English (e.g. Tam Wai Ho's presentation in IELM311). The differentiating quality between good presenters and mediocre presenters is their ability to make the audience feel comfortable and keep them thinking instead of falling asleep. When you're seeing a product presentation and you're thinking, "Hey, this product seems amazing, what uses do I have for it? How did they do it? Are there any modifications that I'll need if I were to buy it?", then you're looking at a good presenter. This unique quality cannot be emulated by simply speaking good English (you can even do without that) or having tons of gestures in your presentation, as your English teacher would have taught you. But how do the good presenters do that? I wish I knew. But understanding the audience should be the first step, since a good presentation directs the thoughts of the audience.

Where are the good presentations? Apple has them.
http://www.youtube.com/watch?v=HGTX_f3Piko

Thursday, May 3, 2007

WT Toolkit FYP Poster on display in HKUST Academic Concourse

If you come to HKUST, you can find our poster at the "Software Technologies" section of the CS FYP poster displays, in front of Lecture Theatres A and B.

Wednesday, May 2, 2007

Information wants to be free: 09-f9-11-02-9d-74-e3-5b-d8-41-56-c5-63-56-88-c0

░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
░09░░░░░░░░░░░░▒▒▒▒▒▒░░░░░░░░░░▒▒▒▒▒▒░░░░░░░░░░░░
░░░░░░░░░░░░░░░▒▒▒▒▒▒▒▒░░░░░░░░▒▒▒▒▒▒░░░░░░░░░░░░
░░░░░░░F9░░░░░░▒▒▒▒▒▒▒▒░░░11░░░▒▒▒▒▒▒░░░░░02░░░░░
░░░░░░░░░░░░░░░▒▒▒▒▒▒▒▒░░▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░
░░9D░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░74░░░░░░░░░
░░░░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░▐▌░░
░░E3░░░░░▒▒▒▒▒▒▒▒▒▒██▒▒▒▒▒▒▒██▒▒▒▒▒▒▒░░░5B░░░▐▌░░
░░░░░░░░░▒▒▒▒▒▒▒▒██▒▒▒▒▒▒▒▒▒▒▒██▒▒▒▒▒░░░░░░░████░
░░░D8░░░░░▒▒▒▒▒██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██▒▒▒░░░41░░░██░░
░░░░░░░░░░▒▒▒██▒▒▒████▒▒▒▒▒████▒▒▒██▒░░░░░░░░██░░
░░░56░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░██░░
░░░░░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒████████░░
░░░░░░██▒▒▒▒▒▒▒▒▒▒▒██████████▒▒▒▒▒▒▒▒▒▒░░░C5░░░░░
░░░░██░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░
░░██░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░
░░██░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░
░░░░░63░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░56░░░░
░░░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░
░░░░░░░░░░░░░░░░░░░░██░░░░██░░░░░░░░░░░░░░░░░░░░░
░░░░░88░░░░░░░░░██████░░░░██████░░░░░░░░░░C0░░░░░
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░

So, the DeCSS debacle all over again. Somebody print that on a t-shirt.

http://blog.digg.com/?p=74
http://yro.slashdot.org/yro/07/05/02/0235228.shtml
http://www.chillingeffects.org/anticircumvention/notice.cgi?NoticeID=7180