RPGExplorer Updates

Aug 2, 2015

I finally made time yesterday to make a few updates to the RPGExplorer process. The work resulted in three key improvements: 1) fewer HTML fragments and HTML keywords showing up as tags; 2) some additional intelligence applied to the Twitter output stream; and 3) updates to the output on the RPGExplorer site.

The HTML handling was primarily grunt work: cleaning up the text extraction method and the resulting keywords. The text extraction engine often receives malformed HTML fragments, which cause keywords from the HTML standard to be flagged as important tags. I improved the text extraction process and patched the areas where it failed. Additionally, I added analysis of the resulting tags to remove any HTML keywords. Not sexy, but badly needed.
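To illustrate the idea, here is a minimal sketch in Python. It is not the actual RPGExplorer code: the names `TextExtractor`, `extract_text`, `remove_html_keywords`, and the `HTML_KEYWORDS` list are all made up for the example, and the keyword list is deliberately partial. It leans on the standard library's `html.parser`, which tolerates unclosed tags, to pull out the text, then drops any tag that is really just an HTML keyword.

```python
from html.parser import HTMLParser

# Hypothetical, partial list of HTML element/attribute names that should
# never survive as content tags. A real list would be longer.
HTML_KEYWORDS = {"div", "span", "href", "img", "src", "class",
                 "style", "nbsp", "br", "table", "tr", "td"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of script/style."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def extract_text(fragment):
    """Return the text of an HTML fragment with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def remove_html_keywords(tags):
    """Drop candidate tags that match an HTML keyword (case-insensitive)."""
    return [t for t in tags if t.lower() not in HTML_KEYWORDS]
```

For example, `remove_html_keywords(["dragons", "div", "HREF", "old west"])` keeps only `"dragons"` and `"old west"`. The second pass is a safety net for whatever the extraction step lets through.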

Once the tag information was cleaner, I added some pseudo-intelligent hashtag output when writing tweets. The code simply checks for the presence of a subset of the overall tags and applies the appropriate hashtags if space is available in the tweet. To simplify the output, I also used the TinyURL shortener service, which is very meta since Twitter automatically shortens URLs as well but doesn’t make the shortened URL available prior to posting the tweet. The shorter URL makes the resulting tweet far easier to read.
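The hashtag logic can be sketched roughly like this in Python. Again, this is an illustration rather than the real code: `HASHTAG_MAP`, `build_tweet`, and the example URL are assumptions, and the length check is a plain character count that ignores how Twitter itself counts links.

```python
TWEET_LIMIT = 140  # Twitter's character limit as of 2015

# Hypothetical mapping from a small subset of tags to hashtags.
HASHTAG_MAP = {
    "pathfinder": "#Pathfinder",
    "d&d": "#DnD",
    "osr": "#OSR",
}


def build_tweet(title, short_url, tags, limit=TWEET_LIMIT):
    """Append known hashtags to "title url" while they still fit."""
    tweet = f"{title} {short_url}"
    seen = set()
    for tag in tags:
        hashtag = HASHTAG_MAP.get(tag.lower())
        if hashtag is None or hashtag in seen:
            continue  # not in the recognized subset, or already added
        candidate = f"{tweet} {hashtag}"
        if len(candidate) <= limit:
            tweet = candidate
            seen.add(hashtag)
    return tweet
```

Calling `build_tweet("New adventure", "http://tinyurl.com/example", ["Pathfinder", "maps", "OSR"])` yields `New adventure http://tinyurl.com/example #Pathfinder #OSR`. Tags with no mapping are ignored, and a hashtag that would push the tweet past the limit is skipped so that a shorter one later in the list can still be used.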

Finally, I removed the author index pages from the RPGExplorer site. The author information did not distinguish between different authors with the same name, so the output intermixed them. As such, it wasn’t useful information. Removing the pages also reduces the size of the overall site, keeping it from outgrowing the limited disk space allocation on the hosting site.

I still have a significant TODO list for the overall process, including building a useful tag taxonomy and building an application that applies periodic online training to the tagging process to coalesce tags into a smaller, more useful set. Those improvements will wait until another day, as I want to do some creative stuff for an Old West campaign setting.

