Following a conversation with a friend regarding how ColdFusion handles arrays and structures in ‘the background’, I was interested to find out what Java classes each were mapped to. This was a relatively simple case of using the functions getClass(), getSuperClass() and getName() to parse out the name of the Java classes.

<!--- Arrays (repeat up the hierarchy) --->
<cfdump var="#arrayNew(1).getClass().getName()#">
<cfdump var="#arrayNew(1).getClass().getSuperClass().getName()#">
<!--- Structures (repeat up the hierarchy) --->
<cfdump var="#structNew().getClass().getName()#">
<cfdump var="#structNew().getClass().getSuperClass().getName()#">

What became immediately apparent was that both were handled differently. In the case of a ColdFusion array, the parent class was the java.util.Vector, which seemed obvious. The Vector class implements a growable array of objects. Like an array, it contains components that can be accessed using an integer index. However, the size of a Vector can grow or shrink as needed to accommodate adding and removing items after the Vector has been created.

ColdFusion structures, however, are handled entirely differently, with the parent class being a coldfusion.runtime class. In fact, as we can see below, the code to handle a ColdFusion structure is entirely written for ColdFusion until the root class of java.lang.Object. Clearly ColdFusion does some complex wizardry to handle the complex nature of structues (associative arrays).

The class hierarchy for an Adobe ColdFusion array:

java.lang.Object
	java.util.AbstractCollection
		java.util.AbstractList
			java.util.Vector
				coldfusion.runtime.Array

The class hierarchy for an Adobe ColdFusion structure:

java.lang.Object
	coldfusion.util.CaseInsensitiveMap
		coldfusion.util.FastHashtable
			coldfusion.runtime.Struct

Being an inquisitive person, I applied the same code to Railo and BlueDragon, with equally interesting results.

The Java class hierarchy behind a Railo array:

java.lang.Object
	railo.runtime.type.ArrayImpl

The Java class hierarchy behind a Railo structure:

java.lang.Object
	railo.runtime.type.StructImpl

The Java class hierarchy behind a BlueDragon array:

java.lang.Object
	com.naryx.tagfusion.cfm.engine.cfData
		com.naryx.tagfusion.cfm.engine.cfJavaObjectData
			com.naryx.tagfusion.cfm.engine.cfArrayData
				com.naryx.tagfusion.cfm.engine.cfArrayListData

The Java class hierarchy behind a BlueDragon structure:

java.lang.Object
	com.naryx.tagfusion.cfm.engine.cfData
		com.naryx.tagfusion.cfm.engine.cfJavaObjectData
			com.naryx.tagfusion.cfm.engine.cfStructData

Adendum: Ben Nadel has a good little post on ColdFusion Data Types from Different Sources and How ColdFusion Sees Them.

Arrays and Structures are considered to be complex data types in ColdFusion. In contrast, simple data types are ones that contain a single piece of data, such as an Integer, String, or Boolean value. A complex data type can contain multiple pieces of data, which, in the case of arrays, are usually related. All the data are referenced under a single variable name. You can think of a complex variable as a variable that contains a collection of other variables inside it. An array maps Integers to arbitrarily typed objects (Integers, Strings, Booleans and Objects) while a structure, or associative array, maps arbitrarily typed objects to arbitrarily typed objects.

Arrays and structures are also known as reference types. A reference is an object containing information which refers to data stored elsewhere, as opposed to containing the data itself. References are fundamental to constructing many data structures and in exchanging information between different parts of a program. References increase flexibility in where objects can be stored, how they are allocated, and how they are passed between areas of code. As long as we can access a reference to the data, we can access the data through it, and the data itself need not be moved. They also make sharing of data between different code areas easier; each keeps a reference to it.

Array and structure reference types are pointers to a memory space. Pointers are the most primitive and error-prone but also one of the most powerful and efficient types of references, storing only the address of an object in memory. Because arrays and structures are pointers to a space in memory, it allows for easy copying of that data. However, when copying an array or structure, there is a distinct difference between how the copy is performed. This difference relates to the concepts of shallow and deep copying.

When copying an array, a deep copy is performed by ColdFusion. This means that the data is copied and a reference to the new copy is created in memory. Therefore, following the example below, a change in the value held at position 1 of array b will not over-write the value held in position 1 of array a as they have different memory references:

<cfscript>
	a = arrayNew(1);
	a[1] = 100;
	b = a; //deep copy for Array
	b[1] = 500;
	writeOutput(a[1]); // the returned result is 100
</cfscript>

We can also use the little-known Java function equals() to demonstrate that the two arrays are not equal (the references are different).

<!--- 
the following returns false
a and b do not refer to the same object
 --->
<cfdump var="#a.equals(b)#">

This is not the case for a structure. When copying a structure, as in the example below, a shallow copy is performed by ColdFusion. This means that a change in b.name will also result in a change of a.name:

<cfscript>
	a = structNew();
	a.name = "Simon Whatley";
	b = a; //shallow copy for Struct
	b.name = "John Doe";
	writeOutput(a.name); // the returned result is John Doe
</cfscript>

Using the equals() method, we can show that the two structures refer to the same object:

<!--- 
the following returns true
a and b refer to the same object
 --->
<cfdump var="#a.equals(b)#">

However, by using the duplicate() function, a deep copy of a structure can be performed:

<cfscript>
	a = structNew();
	a.name = "Wu Mingshi";
	b = duplicate(a); //deep copy for Struct
	b.name = "Ivan Ivanovic";
	writeOutput(a.name); // the returned result is Wu Mingshi
</cfscript>

The above code results in two structures being created.

Arrays and structures although similar, behave very differently. Both are known as reference types which refer to a pointer in memory. Both have the ability to contain multiple values underneath a single variable name. Both use an index to access an individual value, but the index is numeric for arrays and a arbitrary, text value for structures. Arrays are extremely useful for numeric calculations, tabular data, and data sorting. Structures, by their nature, cannot be sorted by value, only by key name. They are commonly for related data, where order is not important and direct access to an individual element is important. Many of ColdFusion’s variable scopes (such as server, application, variables, session, form etc) can be accessed as structures.

In an earlier post I eluded to the implicit creation of arrays in ColdFusion 8. Well, the same can be said of structures.

A structure, also known as an associative array, is a complex data type composed of a collection of keys and a collection of values, where each key is associated with one value (a key-value pair). The operation of finding the value associated with a key is called a lookup or indexing, and this is the most important operation supported by a structure. The relationship between a key and its value is sometimes called a mapping or binding. For example, if the value associated with the key “Age” is 29 and “City” is “London”, we say that our structure maps “Age” to 29 and “City” to “London”.

Using structures, you can call the array element you need using a string rather than a number, which is often easier to remember. The downside is that these aren’t as useful in a loop because they do not use numbers as the index value.

We can think of an address book as a good example of a structure. The classic way of creating and assigning key-values pairs to a structure, in earlier versions of ColdFusion, would be as follows:

<cfscript>
strPerson = structNew();
strPerson.firstName = "Jean";
strPerson.lastName = "Dupont";
strPerson.city = "Paris";
</cfscript>

Or an alternative method uses array-notation to create the necessary key-value pairs:

<cfscript>
strPerson = structNew();
strPerson["Firstname"] = "Hans";
strPerson["Lastname"] = "Mustermann";
strPerson["Country"] = "Germany";
</cfscript>

NB. When using the array-notation, the key names keep their case. However, running the following code results in the value “France” being overwritten with “Germany”, even though the key name is a different case. This serves to highlight that ColdFusion is not case-sensitive.

<cfscript>
strPerson = structNew();
strPerson["Country"] = "France";
strPerson["COUNTRY"] = "Germany";
</cfscript>

Implicit Structures

With the introduction of implicit structures in ColdFusion 8, the creation of structures is greatly simplified. For example, rather than having to use the structNew() function, we can now simply do the following:

Using strings for values:

<cfscript>
myStruct = {firstname="Simon", lastname="Whatley", city="London"};
</cfscript>

Implicit Structures - Strings as Keys and Values

Using integers for values:

<cfscript>
myStruct = {account_no=12345678, sort_code=123456};
</cfscript>

Implicit Structures - Integers as Values

Using integers as keys:

This example most closely represents an array since arrays have numeric keys.

<cfscript>
myStruct = {10001="John", 10002="Doe", 10003="New York"};
</cfscript>

Implicit Structures - Integers as Keys

The integer could represent the unique identifier of an object, for example, user ID or order ID. Therefore, if we had nested structures like below, 10001 would be the ID of Simon Whatley, whilst 10002 would be the ID of John Doe.

<cfscript>
myStruct1 = {firstname="Simon", lastname="Whatley", city="London"};
myStruct2 = {firstname="John", lastname="Doe", city="New York"};
myStruct3 = {10001=myStruct1, 10002=myStruct2};
</cfscript>

Implicit Structures - Nested Structures

Mixed data types:

It is possible to mix the data types in an structure. For example, we can use an Integer, String and Array as elements within an array, with no problems. Since we need to know the key name before accessing the value, it is also likely we will know the type of the value and will be able to handle it accordingly. However, never assume this is always the case, so type checking is necessary when retrieving the data.

The example below demonstrates the ability to add arrays to structures.

<cfscript>
myArray1 = [1,2,3];
myArray2 = ["One","Two","Three"];
myStruct = {array1=myArray1, array2=myArray2};
</cfscript>

Implicit Structures - Nested Arrays

Structures, by their nature, cannot be sorted by value, only by the key name. They are best for related data, where order is not important and direct access to an individual element is important. Many of ColdFusion’s variable scopes can be accessed as structures, for example, Server, Application, Session and Variables etc.

Words of Caution

Implicit structures do have their limitations. For example, you cannot nest implicit struct, or indeed array, creation.

<cfscript>
myStruct1 = {
	myStruct2 = {
		firstName = "Jean",
		lastName = "Dupont",
		country = "France"
	},
	myStruct3 = {
		firstName = "Juan",
		lastName = "Pablo",
		country = "Spain"
	}
}
</cfscript>

The above will throw the following parsing error:

coldfusion.compiler.ParseException: 
Invalid CFML construct found on line 3 at column 10.

UPDATE: The recent ColdFusion Update now includes the ability to nest implicit structures.

To get around this problem, you can create each structure individually and then use the structure as the value in a key-value pair (as seen in the nested structure example above).

A (possible) strength of ColdFusion is that you can add key-value pairs as many times as is necessary. This is the same for explicit and implicit structure creation. However, the following code and screenshot serves to demonstrate that whether you explicitly or implicitly create a structure, if you duplicate a key, the last key-value pair in the sequence is the one that is represented in the structure:

<cfscript>
myStruct = {
	firstName = "Jean",
	lastName = "Dupont",
	country = "France",
	country = "Germany"
};
</cfscript>

Implicit Structures - Duplicate Keys