My most recent project has forced me to be "Unicode aware" at times (something I've never had to do before), so I am learning a lot about the encoding and display of special characters as I go along. My latest challenge on this topic involved a User Manager section I created, in which the users could very well have names that contain special characters (foreign names). This particular section performs its updates, deletes, and inserts via Ajax calls and client-side JS manipulation of a JSON data set. My Ajax is performed via the Prototype library, my code is all ColdFusion living within the Coldbox framework, I'm using Coldspring to manage my object relationships, Transfer is my ORM, and my backend database is MSSQL 2005.
The Challenge: Data that contained special characters was being successfully inserted/updated via my Ajax calls, but the JSON data set returned via those calls did NOT contain those special characters (or contained an incorrect interpretation of them, like numbers, question marks, etc.). A quick check of the database verified that the data was indeed stored in the tables properly.
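To make that symptom concrete, here is a minimal Node.js sketch (not part of my application) of how a name can come back as garbage when the bytes are written in one encoding and read in another. The sample name and the choice of UTF-8 versus Latin-1 are assumptions for illustration; I'm not claiming this is exactly the mismatch that happened in my stack.

```javascript
// Node.js sketch: the same name pushed through two mismatched encodings.
// The name is written with an escape so this file's own encoding cannot interfere.
var name = "Jos\u00e9"; // "José"

// Written as UTF-8 bytes, then read back as Latin-1: the single accented
// character turns into two unrelated characters.
var utf8ReadAsLatin1 = Buffer.from(name, "utf8").toString("latin1");

// Written as Latin-1 bytes, then read back as UTF-8: the lone byte is not
// valid UTF-8, so the decoder substitutes U+FFFD, the replacement character
// that many browsers draw as a question mark in a diamond.
var latin1ReadAsUtf8 = Buffer.from(name, "latin1").toString("utf8");

console.log(utf8ReadAsLatin1);
console.log(latin1ReadAsUtf8);
```

Either direction of mismatch leaves the stored data untouched, which fits with the database looking fine while the returned JSON did not.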
Setting the Stage
At this stage in the game, the smorgasbord of terms, acronyms, and concepts revolving around properly handling Unicode is a bit foggy for me. (On a side note, I WISH someone who has the full understanding would put together a simple "checklist" of "Things you need to do in order to handle special characters in ColdFusion"!) From what I currently understand, the physical template you write has to be "encoded" properly (set within the IDE you are using); the database you are using has to have the proper encoding (which, as I understand it, the Collation setting controls in MSSQL 2005); the fields in your table have to be of the proper type to store Unicode text (i.e., 'nVarchar' instead of 'Varchar', etc.); your browser has to have the proper languages associated in order to display certain sets of special characters (Tools/Internet Options/Languages in IE); your JS functions, if living in a separate file, must have that file encoded properly (again, via your IDE); your ColdFusion datasource has to have the checkbox for "Enable High ASCII characters and Unicode for data sources configured for non-Latin characters" checked; and to top it all off, after having handled all of that, your JS functionality still needs to have ITS encoding types set in the proper place.
That sounds like a LOT of fiery hoops just to be able to deal with special characters, right? Well I'm with ya...it's on the verge of being a nightmare for someone who's never had to deal with it. And I do realize that for some of you reading this, the first time YOU tried to deal with special characters, everything just frickin "worked right out of the box" and you probably didn't have to do but one or two of those things, at the most. I say that you got lucky that things were configured just so for the particular character set you were dealing with, and even though from your perspective it didn't seem like that big of a deal, the fact is having an understanding of what's going on behind the scenes can be pretty doggone important anyway, just in case you suddenly get the directive to start storing characters from some other encoding scheme that you AREN'T prepared for out of the box.
Okay, so back to my challenge. Here's the nutshell of how my process flows:
My initial page load is provided with a query of all of the users in the system. That query is then translated to a JS object using a line in my template like the following:
<script>
//make our initial data set available to JS...
var objUsers = <cfoutput>#serializeJSON(qryUsers,true)#</cfoutput>;....
When a user is chosen for edit, I load up the values for that user from the client-side data set into form fields, allow them to be edited, then submit the form values back to Coldbox via an Ajax call where the record gets updated. After the update occurs, my event grabs a fresh copy of the user query (which now contains the updated record), serializes it, and returns it as a JSON string to the Ajax call. Here's the line of JS that performs the Ajax call:
new Ajax.Request(saveURL,{parameters: myparams, method:'post',onCreate:showWorking,onComplete:postSave});
Here is the line in my handler (controller) that returns the data to the call:
<cfset arguments.event.renderData(type="plain",data=serializeJSON(variables.userService.getAllUsers(activeOnly=false),true),contentType="application/json") />
Bear in mind that the "getAllUsers" method call you see is the exact same method call being used during the initial page load to retrieve the data, which DOES contain the special characters as it should.
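For anyone following along, here is a rough sketch of what the client side can do with that returned string. It assumes the column-oriented shape that serializeJSON(query, true) gives me: an object with ROWCOUNT, COLUMNS, and DATA keys. The helper name queryToRows, the sample payload, and the USERID and FULLNAME columns are all made up for illustration, and because I'm not sure the casing of the DATA keys always matches COLUMNS, the lookup is case-insensitive.

```javascript
// Turn a column-oriented query object into an array of row objects.
// queryToRows is a hypothetical helper, not part of Prototype or ColdFusion.
function queryToRows(q) {
  // Index the DATA columns by lower-cased name so casing differences
  // between COLUMNS and DATA do not matter.
  var dataByLower = {};
  Object.keys(q.DATA).forEach(function (key) {
    dataByLower[key.toLowerCase()] = q.DATA[key];
  });

  var rows = [];
  for (var i = 0; i < q.ROWCOUNT; i++) {
    var row = {};
    q.COLUMNS.forEach(function (col) {
      row[col] = dataByLower[col.toLowerCase()][i];
    });
    rows.push(row);
  }
  return rows;
}

// Made-up response text standing in for what the Ajax call returns.
var responseText = '{"ROWCOUNT":2,"COLUMNS":["USERID","FULLNAME"],' +
  '"DATA":{"USERID":[1,2],"FULLNAME":["José Núñez","Zoë Brontë"]}}';

var objUsers = JSON.parse(responseText);
var rows = queryToRows(objUsers);
console.log(rows[0].FULLNAME);
```

The point of the sketch is only that the special characters have to survive inside responseText itself; once the string is decoded wrongly, no amount of client-side parsing brings them back.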
So here is where the problem manifests itself. The JSON string that the "postSave" method is provided with has the special characters stripped out! Poof, they are just gone. Okay, so let me go and investigate some of the optional parameters that Prototype provides for its Ajax.Request method and see if any of them might apply in this situation.... Ah, here are a few! 'encoding', 'evalJSON', 'sanitizeJSON'. Well, playing with all three of these resulted in zero changes to the symptoms. (As far as I can tell, 'encoding' only describes the request being sent, not the response coming back, which would explain why it made no difference.) Sheesh, I've encoded everything I can possibly think to encode...what else is there? After a lot of Google time, skimming page after page of semi-related (but not directly relevant) info, I came across a tiny little tweak to the contentType being returned that I tried, and lo and behold it frickin worked! Here is the new line that returns a CORRECT data set to my Ajax call:
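<cfset arguments.event.renderData(type="plain",data=serializeJSON(variables.userService.getAllUsers(activeOnly=false),true),contentType="application/json; charset=UTF-8") />
The difference: adding in "charset=UTF-8" to the contentType of the data being returned. Apparently THAT'S what JS was looking for all along. My best guess is that, without an explicit charset, something between the server and the browser fell back to a different default encoding for the response body.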
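Here is a small Node.js sketch of why that one parameter can matter so much. It is not how Prototype or any browser is actually implemented; charsetFromContentType and decodeBody are hypothetical helpers, and the "latin1" fallback is purely an assumption I picked to show the effect of a wrong guess.

```javascript
// Pull the charset parameter out of a Content-Type header value.
// Returns the fallback when the header does not name one.
function charsetFromContentType(contentType, fallback) {
  var match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : fallback;
}

// Decode a response body the way a client might: trust the header's
// charset if present, otherwise guess. The "latin1" guess is an assumption
// for illustration, not a statement about any particular browser.
function decodeBody(bytes, contentType) {
  return bytes.toString(charsetFromContentType(contentType, "latin1"));
}

// The server sends the same UTF-8 bytes in both cases.
var body = Buffer.from('{"NAME":"Jos\u00e9"}', "utf8");

var withoutCharset = decodeBody(body, "application/json");
var withCharset = decodeBody(body, "application/json; charset=UTF-8");

console.log(JSON.parse(withoutCharset).NAME); // mangled name
console.log(JSON.parse(withCharset).NAME);    // intact name
```

The bytes on the wire are identical in both calls. Only the label changes, and the label is what decides whether the reader interprets them correctly.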
I hope this helps someone else avoid a huge loss of time. And again, for those of you out there who know this stuff inside and out and can actually visualize how it all works in your head, it sure would be an asset to the community if you could put that info into a kind of "checklist" a person could use to make sure they have all of their Unicode ducks in a row when trying to deal with special characters! Pretty please?
Doug out.
