Almost a year ago, I posted an article called Parsing Twitter Usernames, Hashtags and URLs with JavaScript. It quickly became apparent that many people were facing the same problem and needed an answer. Now, belatedly, it is the turn of ColdFusion to get the Twitter love.

Parsing the URLs, usernames and hashtags in a tweet is far easier in ColdFusion than in JavaScript, and needs only minor amendments to the regular expressions used in the JavaScript code.

Below is an example tweet that I’ll use for this post.

<cfset myTweet = "Woot! I've just taken receipt of my Holux M-241 GPS logger. Good call @fordie. http://bit.ly/2RsAu ##holux ##ipslogger" />

NB. For the purpose of this test, the hashes in the hashtags are doubled (##) because ColdFusion treats a single # inside a string as the start of an expression and would otherwise throw an error.

Parsing URLs as Links to the resource

We can simply demonstrate the parsing of the link with the following code in the body of the page:

<cfset myTweet = REReplace(myTweet,'([A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+)','<a href="\1">\1</a>','ALL') />

NB. The \1 is a backreference to part of the regular expression match. A backreference stores the part of the string matched by the part of the regular expression inside the parentheses. This means you can reuse it inside the regular expression, or afterwards in the replacement string, as I am doing in each of these examples.
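For readers coming from the JavaScript article, here is a minimal sketch of the same capture-and-reuse idea in JavaScript, where the replacement string refers to the captured group as $1 rather than \1. The function name linkUrls is a hypothetical one chosen for illustration, and the hyphen has been moved to the end of each character class so that it is unambiguously literal.

```javascript
// Same URL pattern as the ColdFusion example above.
// The parentheses capture the whole URL as group 1; the g flag
// plays the role of ColdFusion's 'ALL' scope.
var urlPattern = /([A-Za-z]+:\/\/[A-Za-z0-9_-]+\.[A-Za-z0-9_:%&?\/.=-]+)/g;

// linkUrls is a hypothetical helper name, not part of the original post.
function linkUrls(text) {
  // $1 is JavaScript's way of writing the \1 backreference
  // inside a replacement string.
  return text.replace(urlPattern, '<a href="$1">$1</a>');
}

var linked = linkUrls("Good call @fordie. http://bit.ly/2RsAu #holux");
```

The captured URL is used twice in the replacement: once as the href value and once as the link text.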

The resultant HTML generated is the following:

Woot! I've just taken receipt of my Holux M-241 GPS logger. Good call @fordie. <a href="http://bit.ly/2RsAu">http://bit.ly/2RsAu</a> #holux #ipslogger

Parsing Usernames as Links to Twitter

Following on from the URL example above, we can apply a similar methodology to Twitter usernames, since each one can be linked to its associated Twitter page.

We can simply demonstrate this with the following code:

<cfset myTweet = REReplace(myTweet,'[@]+([A-Za-z0-9-_]+)','<a href="http://twitter.com/\1" rel="nofollow">@\1</a>','ALL') />

The regular expression in this case finds all instances of @username. The Twitter URL is then applied to the username.

The resultant HTML generated is the following:

Woot! I've just taken receipt of my Holux M-241 GPS logger. Good call <a href="http://twitter.com/fordie" rel="nofollow">@fordie</a>. http://bit.ly/2RsAu #holux #ipslogger

Parsing Hashtags as Links to Twitter’s Search

Finally, Twitter also allows users to create hashtags within their posts. Hashtags are a community-driven convention for adding additional context and metadata to your tweets. Like regular URLs and usernames, hashtags can be parsed as a URL to an online resource, in this case, Twitter’s search.

We can simply demonstrate this with the following code:

<cfset myTweet = REReplace(myTweet,'[##]+([A-Za-z0-9-_]+)','<a href="http://search.twitter.com/search?q=%23\1" rel="nofollow">##\1</a>','ALL') />

The regular expression in this case finds all instances of #hashtag. The Twitter Search URL is then applied to the hashtag.

The resultant HTML generated is the following:

Woot! I've just taken receipt of my Holux M-241 GPS logger. Good call @fordie. http://bit.ly/2RsAu <a href="http://search.twitter.com/search?q=%23holux" rel="nofollow">#holux</a> <a href="http://search.twitter.com/search?q=%23ipslogger" rel="nofollow">#ipslogger</a>

All in one

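So, putting all the regular expressions together, you would end up with the following. Apply the URL expression first: if it runs after the other two, it will also match the twitter.com addresses they have just inserted into the href attributes.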

Woot! I've just taken receipt of my Holux M-241 GPS logger. Good call <a href="http://twitter.com/fordie" rel="nofollow">@fordie</a>. <a href="http://bit.ly/2RsAu">http://bit.ly/2RsAu</a> <a href="http://search.twitter.com/search?q=%23holux" rel="nofollow">#holux</a> <a href="http://search.twitter.com/search?q=%23ipslogger" rel="nofollow">#ipslogger</a>

Which renders in the browser as the more useful tweet:

Woot! I’ve just taken receipt of my Holux M-241 GPS logger. Good call @fordie. http://bit.ly/2RsAu #holux #ipslogger

Where to take it next

Wrapping these code snippets up into a simple twitterise function could be a good starter for ten. Following that, we could also create a simple Twitter feed reader, but I’ll leave that up to you to develop.

As part of an AIR project that I have been working on with my good friend Rob, we came across the need to parse a number of URLs within the text of a Twitter post. This may not sound easy at first, but thanks to the prototype property available on JavaScript objects, our task was a relatively simple one.

The prototype object of JavaScript is a prebuilt object that simplifies the process of adding custom properties or methods to all instances of an object. For example, older JavaScript engines (those predating ECMAScript 5) have no trim() method on the String class, so through the wizardry of regular expressions and the prototype property, I can add one.

You simply need to specify String.prototype before your method definition, e.g.:

String.prototype.trim = function() { 
	return this.replace(/^\s+|\s+$/g,"");
};

With this in mind, we can add methods to our String class, at runtime, that will allow us to manipulate the text string that is passed back in a Twitter JSON packet.
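Because a prototype method is shared by every string in the page, it can be worth checking that you are not overwriting something the runtime already provides. The following is a minimal sketch of that guard, using the trim() example from above; the sample string is made up for illustration.

```javascript
// Only add the method when the runtime does not already provide one,
// so a native implementation is never replaced.
if (typeof String.prototype.trim !== "function") {
  String.prototype.trim = function () {
    // Strip leading and trailing whitespace.
    return this.replace(/^\s+|\s+$/g, "");
  };
}

// Sample input, made up for this sketch.
var padded = "   a tweet with stray whitespace \n";
var trimmed = padded.trim();
```

The same guard can be placed around any of the custom methods in this article if you are worried about clashing with another script.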

The Goal

To auto-magically parse different types of links within a text string. We will look at standard URL links, links applied to Twitter usernames and those applied to Hashtags.

Demo

The demonstration simply takes a test string and outputs it to the screen using JavaScript.

See the demo in action.

Parsing URLs as Links to the resource

First we create a custom method of the String.prototype property called parseURL. When invoked on a string, the regular expression finds any instance of a URL and will wrap the URL with an HTML anchor, with the correct href attribute and value applied.

String.prototype.parseURL = function() {
	// The g flag makes replace() act on every URL, not just the first one
	return this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/g, function(url) {
		return url.link(url);
	});
};

Demo 1.

We can simply demonstrate the parsing of the link with the following code in the body of the page:

<script type="text/javascript">
var test = "Simon Whatley's online musings can be found at: http://www.simonwhatley.co.uk";
document.write(test.parseURL());
</script>

In the above example, a simple string variable is created called test, which contains a URL. The text does not contain any HTML at this stage. We then write out the test variable applying the parseURL() method to it.

The resultant HTML generated is the following:

Simon Whatley's online musings can be found at: <a href="http://www.simonwhatley.co.uk">http://www.simonwhatley.co.uk</a>

When rendered in a browser, the code becomes a hyperlink.

Parsing Usernames as Links to Twitter

Following on from the URL example above, we can apply a similar methodology to Twitter usernames, since each one can be linked to its associated Twitter page.

Again we create a custom method of the String.prototype property, this time we’ll call it parseUsername. The regular expression in this case finds all instances of @username. We then simply strip the @, as this is not part of the actual username. The Twitter URL is then applied to the username.

String.prototype.parseUsername = function() {
	// The g flag makes replace() act on every @username, not just the first one
	return this.replace(/[@]+[A-Za-z0-9-_]+/g, function(u) {
		var username = u.replace("@","");
		return u.link("http://twitter.com/"+username);
	});
};

Demo 2.

We can simply demonstrate this with the following code:

<script type="text/javascript">
var test = "@whatterz is writing a post about JavaScript.";
document.writeln(test.parseUsername());
</script>

The resultant HTML generated is the following:

<a href="http://twitter.com/whatterz">@whatterz</a> is writing a post about JavaScript.
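One edge case worth knowing about: the pattern above matches an @ wherever it appears, including in the middle of an email address. Below is a minimal sketch of a stricter variant. It assumes that usernames contain only letters, digits and underscores, and that a mention must sit at the start of the text or follow a character that could not be part of a username. The function name linkMentions is hypothetical, and it is written as a plain function rather than a prototype method purely to keep the example standalone.

```javascript
// linkMentions is a hypothetical helper name used for this sketch.
function linkMentions(text) {
  // Group 1: start of string or a character that cannot be part of a username.
  // Group 2: the username itself, without the @.
  var mention = /(^|[^A-Za-z0-9_])@([A-Za-z0-9_]+)/g;
  return text.replace(mention, function (match, prefix, username) {
    // Put the prefix character back untouched and link only the mention.
    return prefix + '<a href="http://twitter.com/' + username + '">@' + username + '</a>';
  });
}

var mentionTest = linkMentions("mail bob@example.com or ask @whatterz");
```

Here the address bob@example.com is left alone because its @ follows a letter, while @whatterz is still linked.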

Parsing Hashtags as Links to Twitter’s Search

Finally, Twitter also allows users to create hashtags within their posts. Hashtags are a community-driven convention for adding additional context and metadata to your tweets. Like regular URLs and usernames, hashtags can be parsed as a URL to an online resource, in this case, Twitter’s search.

Again we create a custom method of the String.prototype property, this time we’ll call it parseHashtag. The regular expression in this case finds all instances of #hashtag. The Twitter Search URL is then applied to the hashtag.

String.prototype.parseHashtag = function() {
	// The g flag makes replace() act on every #hashtag, not just the first one
	return this.replace(/[#]+[A-Za-z0-9-_]+/g, function(t) {
		var tag = t.replace("#","%23");
		return t.link("http://search.twitter.com/search?q="+tag);
	});
};

Demo 3.

We can simply demonstrate this with the following code:

<script type="text/javascript">
var test = "Simon is writing a post about #twitter and parsing hashtags as URLs";
document.writeln(test.parseHashtag());
</script>

The resultant HTML generated is the following:

Simon is writing a post about <a href="http://search.twitter.com/search?q=%23twitter">#twitter</a> and parsing hashtags as URLs

NB. Twitter’s search was originally provided by Summize. However, as of July 2008, Summize has been bought by Twitter and the search can be found at http://search.twitter.com.
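The hashtag method builds its search URL by swapping # for %23 by hand. As an alternative, here is a minimal sketch that lets the built-in encodeURIComponent() do the encoding, which would also cope with any other character that needs escaping in a query string. The function name hashtagSearchUrl is hypothetical.

```javascript
// hashtagSearchUrl is a hypothetical helper name used for this sketch.
function hashtagSearchUrl(hashtag) {
  // encodeURIComponent turns "#" into "%23" and escapes any other
  // character that is not safe inside a query string value.
  return "http://search.twitter.com/search?q=" + encodeURIComponent(hashtag);
}

var searchUrl = hashtagSearchUrl("#twitter");
```

For a plain hashtag such as #twitter the result is the same URL that parseHashtag produces.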

Where to take it next

Using the above code, we can now create a simple Twitter feed reader. Using jQuery, for example, to get and parse the Twitter JSON packet, we can then apply the prototype methods to the text entries.

It is also worth noting that it is possible to cascade the methods, so we can do the following:

<script type="text/javascript">
var test = "@whatterz is writing a blog post about #twitter, which can be found at http://www.simonwhatley.co.uk";
document.writeln(test.parseURL().parseUsername().parseHashtag());
</script>
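If you would rather not extend String.prototype, the same three steps can be wrapped in ordinary functions. The following is a minimal sketch under a few assumptions: the function names are hypothetical, the hyphen has been moved to the end of each character class so that it is unambiguously literal, and the tweets array stands in for a parsed JSON packet in which each entry has a text property (the real packet may be shaped differently). Each pattern carries the g flag so that every match in a tweet is linked.

```javascript
// All function names below are hypothetical, chosen for this sketch.
function linkUrls(text) {
  return text.replace(/[A-Za-z]+:\/\/[A-Za-z0-9_-]+\.[A-Za-z0-9_:%&?\/.=-]+/g, function (url) {
    return '<a href="' + url + '">' + url + '</a>';
  });
}

function linkUsernames(text) {
  return text.replace(/@+([A-Za-z0-9_-]+)/g, function (match, username) {
    return '<a href="http://twitter.com/' + username + '">' + match + '</a>';
  });
}

function linkHashtags(text) {
  return text.replace(/#+([A-Za-z0-9_-]+)/g, function (match, tag) {
    return '<a href="http://search.twitter.com/search?q=%23' + tag + '">' + match + '</a>';
  });
}

function twitterise(text) {
  // URLs go first: run later, the URL pattern would also match the
  // twitter.com addresses that the other two steps insert.
  return linkHashtags(linkUsernames(linkUrls(text)));
}

// Assumed shape of a parsed feed: an array of objects with a text property.
var tweets = [
  { text: "@whatterz wrote about #twitter at http://www.simonwhatley.co.uk" }
];

var html = tweets.map(function (tweet) {
  return twitterise(tweet.text);
});
```

Each entry in html can then be written into the page in whatever way your feed reader prefers.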

Download the code

The example code can be downloaded from the demo page.