Added in some establishments from Chicago, Denver, Larimer County (Colorado), Portland and El Paso. Unique name count is now up to nearly 70000 entries. I still need to purge the DB of what I find less than useful, but it’s far better than what it once was.
I’ve been trolling around for real business names for some time. Databases of actual businesses are available at crazy prices, but I work at one price level for this hobby: zero cost. Thanks to data.gov I stumbled across a list of NY establishments, specifically those inspected by the state. The data was a mess: upper and lower case words, oddball combinations of words, strange abbreviations, and other curious combinations, so I spent a fair amount of time cleaning it up. I also stripped out nearly 15000 entries related to schools or repetitious entries. I could spend more time fixing the data, but I wanted to roll out the full system before tweaking and tuning even further.
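For the curious, the kind of cleanup described above might look roughly like the sketch below. It is only an illustration under assumptions, not the actual process used on the NY data: the `SCHOOL_WORDS` and `SMALL_WORDS` lists and both function names are made up for the example.

```python
import re

# Hypothetical markers for entries to drop; the real list was surely longer.
SCHOOL_WORDS = {"school", "elementary", "academy"}
# Words kept lowercase inside a name (an assumed list).
SMALL_WORDS = {"and", "of", "the", "at", "in"}


def clean_name(raw):
    """Collapse whitespace and normalize mixed-case text to title case."""
    words = re.sub(r"\s+", " ", raw).strip().lower().split(" ")
    fixed = []
    for i, word in enumerate(words):
        # Keep small words lowercase unless they start the name.
        fixed.append(word if (i > 0 and word in SMALL_WORDS) else word.capitalize())
    return " ".join(fixed)


def clean_names(raw_names):
    """Normalize names, drop school-related entries, and remove duplicates."""
    seen = set()
    result = []
    for raw in raw_names:
        name = clean_name(raw)
        if not name:
            continue  # skip blank rows
        if any(word in SCHOOL_WORDS for word in name.lower().split()):
            continue  # skip school-related entries
        if name in seen:
            continue  # skip repeats
        seen.add(name)
        result.append(name)
    return result
```

Strange abbreviations and oddball word combinations are not handled here; those took hand editing.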
The resulting generator is the Random Restaurant/Eatery Generator.
I’d really like to merge some of the results with the Street Name and City Name Generators to expand the field exponentially. Until then, it will remain a whole lot of NY style names (both state and city). I like it thus far even if the data cleanup is incomplete.
I reduced the overall data set for the . Frankly, portions of the database produced an overwhelming number of results that made little sense given the purpose of the generation system. I purged everything related to Cause and Supplemental Classification since that is perhaps a different form of the generator. Additionally, I removed all the pregnancy/childbirth results. I feel they are far too limiting in nature for the general purpose.
Mostly, I just want wounds or diseases. The eliminated elements are not forsaken; they are just off limits for the current approach. That data set has so many possibilities it may spawn independent facets down the road. I must stare at it some more to conjure up ideas.
Afflictions, diseases and injuries. I’ve long wanted to produce a generator for those ideas and now I have. The results are based off the codes your doctor uses to categorize the problem. Not all are interesting. I’m finding that I like producing at least 5-10 results to down select what afflicts my person of interest.
Sometimes you grab the wrong data set. Well, I do when exploring odd things available from government agencies. I thought I had a full set of diagnostic codes for injuries and diseases. Turns out, it’s a subset of the ICD focused on diseases only. I quit parsing the data. It would make a nice random table of some sort but isn’t quite what I need.
Here’s what I managed. The full data is available from the U.S. CDC.
I love reading RPG blog posts on occasion if they fit within my pondering about RPG topics at that particular moment. Those thoughts shift quickly; I have an ephemeral attention span. An aggregation site is awesome if I a) want to peruse something topical now or b) just want to leaf through general topics.
It fails when I want to drill down into a topic and see what has been said not just currently but over a period of time. It also fails because people are apt to under- or over-tag or categorize their own posts. Both aspects are prevalent. I’ve seen “I fail to blog” posts with a dozen tags or categories. Just as many explore the intricacies of a mega dungeon with no tags, or only the one needed to be picked up by the aggregation site.
To be truly useful, the consolidator needs to keep and retain history along with independently selecting keywords and concepts based on the full post (or other content). Then the information needs to be exposed so topical browsing is available but also searchable. How awesome would it be if you could start topical browsing about goblins and subsequently find alternate goblin settlements pertaining to brewing? That’s just one strange example I suspect the gaming community would love.
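To make the goblin example concrete, here is a minimal sketch of the topical browsing idea: an index from keyword to posts, narrowed by adding a second keyword. The post ids and keywords are invented for illustration, and a real consolidator would keep this in a database rather than in memory.

```python
from collections import defaultdict


def build_index(posts):
    """Map each keyword to the set of post ids that mention it."""
    index = defaultdict(set)
    for post_id, keywords in posts.items():
        for keyword in keywords:
            index[keyword.lower()].add(post_id)
    return index


def find_posts(index, *keywords):
    """Return the ids of posts that carry every given keyword, sorted."""
    matches = [index.get(keyword.lower(), set()) for keyword in keywords]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Hypothetical posts; ids start with a date so sorting gives the history in order.
posts = {
    "2013-01-goblin-lairs": ["goblins", "settlements"],
    "2013-06-goblin-brewers": ["goblins", "brewing", "settlements"],
    "2014-02-dwarf-ale": ["dwarves", "brewing"],
}
index = build_index(posts)
print(find_posts(index, "goblins"))             # browse the broad topic
print(find_posts(index, "goblins", "brewing"))  # then narrow it down
```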
Accomplishing this requires natural language processing. The field has come quite a long way in the last few years. Tool kits are available to roll your own solution as well as specialized engines to extract concepts and keywords. In a niche area such as RPGs, both are going to be problematic if applied generically and both can be powerful tools if applied with some level of human intelligence.
I’ve explored one of the specialized engines and it’s far better than self categorization but fails in some edge cases. I’ve also discussed the idea with a few other individuals; most but not all think the service would be useful. One has progressed further down the path than I have explored thus far.
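As a rough idea of the roll-your-own end of the spectrum, the sketch below picks keywords by counting words after dropping common filler. It is far cruder than any real tool kit or engine, and the `STOPWORDS` list and the `extract_keywords` name are assumptions made up for the example.

```python
import re
from collections import Counter

# A tiny, assumed stopword list; real tool kits ship much larger ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for",
             "with", "that", "their", "on", "it"}


def extract_keywords(text, top_n=3):
    """Return the most frequent non-stopword terms in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]
```

A generic counter like this is exactly where a niche such as RPGs causes trouble: it has no idea that "mega dungeon" is one concept or that "DCC" is a game system, which is where the human intelligence comes in.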
In the end, I’m not convinced enough people use aggregation sites, beyond extra promotion, for such an undertaking to be useful. Likewise, it requires something beyond the common “shared hosting” site to be done: periodic processes, a significantly sized database, etc. Most consumers of information, in my experience, arrive not from other sites but from search engines.
Back in August of 2010, I posted about a variety of gaming terms based on Google Trend data. While it’s not quite 4 years later (closer to 3.5), let’s see what those terms are doing today.
These are the same terms, with the addition of some new terms (noted below), which I suspected would be interesting. Nothing has really changed over the course of a few years. Pretty much anything gaming related is sloping down. Of course, this is trend data based on search terms and there are many, many alternate ways people find materials today; even more than there were four years ago with the explosion of social media.
I also added a few terms based on systems that I hear about on a regular basis: Labyrinth Lord, Pathfinder, and Dungeon Crawl Classics. Virtual Table Top was a new entry as well, since it has been popular the last couple of years. (I tried LOTFP but no trend data was available.) There are many others but I chose that small set. Pathfinder and Labyrinth Lord bucked the down-and-to-the-right trend. PF is gaining audience based on search popularity and LL is holding near flat. DCC appears to have the same trend as gaming in general. VTT has a curious cliff at the end of the chart. People appear to be searching for a specific client rather than the generic term, namely Roll20. The Roll20 trend line is awesomely upwards.
The Original Terms
Overall, the original terms are pretty much what one would expect. Almost everything gaming is down and to the right. Porn is holding strong but seems to have hit a wall of late. Farmville had its moment in the spotlight but lost its luster quickly. It will be interesting to track the downfall along with the “D&D” and “Dungeons and Dragons” terms. I suspect it will crater at some point in the future; a fate I do not foresee with the D&D brand. Cat videos are still amazingly popular. I have no idea why.
I am a bit sad that the donkey videos are flat-lining. Cats are more popular than ever but donkeys just don’t get enough love. Perhaps they are too closely associated with jackasses for the modern politically correct aesthetic.