Tuesday, February 24, 2009

Encode Unicode characters as UTF-8 in a URL

URLs should only contain ASCII characters. That character set is quite restrictive if you want to use Gujarati characters, for instance, so some encoding is needed. If you've got a string with Gujarati characters and you want to link to it, each character has to be written as its UTF-8 bytes in percent-encoded form, like this:

"કળા" -> "%E0%AA%95%E0%AA%B3%E0%AA%BE"

Thankfully this can be done with a bit of Java:
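A minimal sketch of such an encoding step, assuming the standard java.net.URLEncoder with UTF-8 is what is wanted here. The class and method names (UrlEncodeExample, encode) are illustrative and not taken from the original post, and the Gujarati text is written with Unicode escapes so the file compiles regardless of source encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class; the name is an assumption made for this example.
final class UrlEncodeExample {

    // Percent-encodes a value as UTF-8 bytes using the JDK's URLEncoder.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "\u0A95\u0AB3\u0ABE" is the Gujarati string from the example above.
        System.out.println(encode("\u0A95\u0AB3\u0ABE")); // %E0%AA%95%E0%AA%B3%E0%AA%BE
        // Plain ASCII letters pass through unchanged.
        System.out.println(encode("plain")); // plain
    }
}
```

Note that URLEncoder produces form-style encoding: a space becomes "+", and reserved characters such as "/" and ":" are escaped too, so it is best applied to individual values (a path segment or a query parameter) rather than to a complete URL.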


So, whenever you need to generate something for the address bar, a redirect, or something like that, you must URL-encode the data. You don't have to detect whether encoding is needed: it doesn't hurt to do this for values that are just plain ASCII letters and digits, as they don't get changed, as you can see with the string ending