Wednesday, May 21, 2008

Merging 2+ XML Documents with Different Encoding Types

Problem:
I have the task of merging 2 XML documents that differ in their encoding types. One is declared as UTF-8 (8-bit UCS Transformation Format), the other as ISO-8859-1 (Latin alphabet No. 1). I also wanted to do it without parsing the XML, as that's an expensive operation and can be problematic with large documents. Well, I figured, this is easy! I'll do the following:
  • Create a String Buffer to hold the new Large XML
  • Strip the Root Nodes and any xml header (doctype/?xml etc)
  • Write each xml doc to the String Buffer
  • Append the Root Node Back to the Main String
  • Close the String Buffer
  • Have a snack
XML Doc 1:
<?xml version="1.0" encoding="UTF-8"?>
<jobs>
<job>
<jobtitle>Job 1</jobtitle> ...
</job>
</jobs>
XML Doc 2:
<?xml version="1.0" encoding="iso-8859-1"?>
<jobs>
<job>
<jobtitle>Job 2</jobtitle> ...
</job>
</jobs>
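
The steps above can be sketched roughly as follows in Java. This is only an illustration under a few assumptions: the class and method names (XmlStringMerge, stripWrapper, merge) are made up for this example, the root element name is known up front, and any DOCTYPE is a simple one with no internal subset. It works purely on strings, so it says nothing yet about how the bytes were decoded.

```java
import java.util.List;

// Hypothetical helper for illustration; not from any library.
class XmlStringMerge {

    // Removes the <?xml ...?> declaration, a simple DOCTYPE, and the outer
    // <root>...</root> tags, returning only the inner content.
    // Assumes the document really does start and end with the given root element.
    static String stripWrapper(String xml, String root) {
        String body = xml
                .replaceFirst("^\\s*<\\?xml[^>]*\\?>", "")   // drop the XML declaration
                .replaceFirst("^\\s*<!DOCTYPE[^>]*>", "");    // drop a simple DOCTYPE
        int open = body.indexOf("<" + root);                  // start of the opening root tag
        int openEnd = body.indexOf('>', open);                // end of the opening root tag
        int close = body.lastIndexOf("</" + root + ">");      // start of the closing root tag
        return body.substring(openEnd + 1, close).trim();
    }

    // Writes every stripped document into one buffer, then puts the root node back.
    // The UTF-8 declaration is an assumption about how the result will be written out.
    static String merge(List<String> docs, String root) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append('<').append(root).append(">\n");
        for (String doc : docs) {
            sb.append(stripWrapper(doc, root)).append('\n');
        }
        sb.append("</").append(root).append(">\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        String doc1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<jobs>\n<job><jobtitle>Job 1</jobtitle></job>\n</jobs>";
        String doc2 = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n"
                + "<jobs>\n<job><jobtitle>Job 2</jobtitle></job>\n</jobs>";
        System.out.print(merge(List.of(doc1, doc2), "jobs"));
    }
}
```

A StringBuilder needs no explicit close, so the "close" step from the list has no counterpart here.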


Turns out that only works well if the XML documents you're attempting to merge are of the same encoding type. Any ideas on a workaround?

Solution:
What I did was specify an encoding type for a FileWriter object and follow the same process. The catch is that I had to write the file to disk with a single, unified encoding type and then read the file back.

This worked OK. I am looking at alternatives, like going to binary and back to a string again, but for now this is my best available option.
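
For what it's worth, the binary-and-back idea might look something like the Java sketch below. It is not the code I ran; the class and method names (EncodingNormalizer, decode, encodeUtf8) are hypothetical, and it assumes you already know each document's declared charset. Each document's raw bytes are decoded with their own charset into a String, and the merged text is then encoded once in a single target charset, with no temporary file in between.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not from any library.
class EncodingNormalizer {

    // Decodes raw bytes using the charset the document declares.
    // The resulting String no longer depends on the source encoding.
    static String decode(byte[] raw, String declaredCharset) {
        return new String(raw, Charset.forName(declaredCharset));
    }

    // Encodes the merged text once, in one unified charset (UTF-8 here).
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two fragments holding the same accented character in different byte encodings.
        byte[] utf8Doc = "<job><jobtitle>Caf\u00e9 Manager</jobtitle></job>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] latin1Doc = "<job><jobtitle>Caf\u00e9 Chef</jobtitle></job>"
                .getBytes(StandardCharsets.ISO_8859_1);

        // Decode each with its own charset, merge as text, then encode once.
        String merged = "<jobs>"
                + decode(utf8Doc, "UTF-8")
                + decode(latin1Doc, "ISO-8859-1")
                + "</jobs>";
        byte[] out = encodeUtf8(merged);
        System.out.println(out.length + " bytes of UTF-8 output");
    }
}
```

The point of the design is that the charset matters only at the two boundaries, when bytes come in and when bytes go out; everything in between is plain string handling.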

ToScript or not ToScript

Nice little function in CF 7 that I totally overlooked: ToScript

Creates a JavaScript or ActionScript expression that assigns the value of a ColdFusion variable to a JavaScript or ActionScript variable. This function can convert ColdFusion strings, numbers, arrays, structures, and queries to JavaScript or ActionScript syntax that defines equivalent variables and values.

Returns

A string that contains a JavaScript or ActionScript variable definition corresponding to the specified ColdFusion variable value.

The ColdFusion variable can be one of the following types:
  • String
  • Number
  • Array
  • Structure
  • Query
Very useful when assigning CF data to JavaScript. Nice to know now!

CF APIC

Why the name? Well, it's a bit of a play on the BASIC acronym, and I thought it fit fairly well with how I feel about what ColdFusion technology allows us to accomplish.

I'm a Canadian citizen and US permanent resident alien from Swift Current, SK, living in Mpls. I'm an application developer of 10 years, currently employed at Jobs2Web, Inc., a great small startup out of Minnetonka, MN. I've been with them from the start and have developed applications such as hotgigs.com and the jobs2web.com platform.

I dabble in ColdSpring, Fusebox, XML and databases while listening to the best and latest in trance and progressive music by the likes of Armin van Buuren, George Acosta, Markus Schulz and Paul Oakenfold.

I'm going to use this platform to divulge all sorts of strange encounters and endeavors from my working days with the latest technologies.

Blog ya later