The Twitter Archive of Babel

I took a break from the Traffic Department 2192 remake to do something more quirky and experimental.

The Twitter Archive of Babel is a massive virtual world that contains (within the limitation of 26 lowercase letters, @ and #) every single possible tweet, with no exact duplicates.

So while the overwhelming majority of tweets in the world are utter gibberish, it contains, amongst other things:

  • Every possible tweet that has ever been made.
  • Every possible tweet that will ever be made.
  • Perfectly accurate predictions of every future event.
  • Every secret that has been lost to time.
  • The complete works of William Shakespeare, divided up into individual tweets.
  • Edit: And of course all of the above in every possible language that can be rendered with the Latin character set.

(In other words, Twitter is now redundant ;) )

The world is a series of almost identical chambers, connected to each other in the four cardinal directions plus staircases running up and down. It is cuboid in shape, 29^46 chambers tall and wide with 29^47 chambers across its breadth. To put things in perspective, if each chamber were only 1 meter across, it'd still be 4.3 x 10^40 times wider than the radius of the observable universe.
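As a rough sanity check on those numbers, here is a small JavaScript sketch that reads the dimensions as powers of 29 (29^46 and 29^47). It assumes 140-character tweets and a 29-symbol alphabet (the 26 letters, @, # and presumably a space), neither of which is spelled out above, and it uses an approximate figure for the radius of the observable universe.

```javascript
// Assumptions (not stated in the post): tweets are 140 characters long and
// the alphabet has 29 symbols (26 letters, '@', '#', plus a space).
const SYMBOLS = 29n;
const TWEET_LENGTH = 140n;
const TWEETS_PER_CHAMBER = 29n; // each chamber holds a set of 29 tweets

const totalTweets = SYMBOLS ** TWEET_LENGTH;
const totalChambers = totalTweets / TWEETS_PER_CHAMBER;

const side = SYMBOLS ** 46n;    // chambers tall and wide
const breadth = SYMBOLS ** 47n; // chambers across the breadth

// The three dimensions should multiply out to the total chamber count.
const dimensionsFit = side * side * breadth === totalChambers;

// Approximate radius of the observable universe in meters (rough figure).
const UNIVERSE_RADIUS_M = 4.4e26;

// Width of the world in meters if each chamber were 1 meter across,
// divided by that radius.
const ratio = Number(side) / UNIVERSE_RADIUS_M;

console.log(dimensionsFit); // true
console.log(ratio.toExponential(1));
```

Under those assumptions the ratio comes out on the order of 10^40; the exact leading digits depend on which radius figure you plug in.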

I was inspired by the short story "The Library of Babel" by Argentinian author Jorge Luis Borges, which is set in a gargantuan library that contains every possible 410-page book. As much as I wanted to bring the world to life in virtual form, I figured it was just going to be too difficult to do it faithfully, so I ended up using the much smaller scope of tweets.

The biggest frustration in developing it was probably giving the world the appearance of being sorted randomly. Each tweet is derived from the X, Y and Z index of the chamber plus the wall position, so I needed to make it so that the smallest change in any of those four coordinates would produce a vastly different tweet. Essentially what I needed was something like MD5, but completely reversible, so you could derive a tweet from coordinates or coordinates from a tweet. In the end I used an array of different ciphers to achieve the desired effect.
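To give a feel for what a reversible scramble can look like, here is a minimal sketch. It is not the cipher stack used in the game: it treats a piece of text as a base-29 number and applies a single affine step (multiply, add, wrap around), which can be undone with a modular inverse. The alphabet order, the inclusion of a space, the short demo length and the constants are all assumptions picked for illustration.

```javascript
// Assumed alphabet: 26 letters, '@', '#', and a space (29 symbols).
const ALPHABET = "abcdefghijklmnopqrstuvwxyz@# ";
const BASE = BigInt(ALPHABET.length);

// Interpret a string as a base-29 number.
function textToIndex(text) {
  let n = 0n;
  for (const ch of text) {
    const digit = ALPHABET.indexOf(ch);
    if (digit < 0) throw new Error(`Unsupported character: ${ch}`);
    n = n * BASE + BigInt(digit);
  }
  return n;
}

// Write a number as a fixed-length base-29 string.
function indexToText(n, length) {
  let out = "";
  for (let i = 0; i < length; i++) {
    out = ALPHABET[Number(n % BASE)] + out;
    n /= BASE;
  }
  return out;
}

// Modular inverse via the extended Euclidean algorithm.
function modInverse(a, m) {
  let [r0, r1] = [m, ((a % m) + m) % m];
  let [t0, t1] = [0n, 1n];
  while (r1 !== 0n) {
    const q = r0 / r1;
    [r0, r1] = [r1, r0 - q * r1];
    [t0, t1] = [t1, t0 - q * t1];
  }
  if (r0 !== 1n) throw new Error("multiplier is not invertible");
  return ((t0 % m) + m) % m;
}

// Affine step: n -> (n * multiplier + offset) mod 29^length.
// The multiplier must not be divisible by 29, or the step cannot be undone.
function makeScrambler(length, multiplier, offset) {
  const modulus = BASE ** BigInt(length);
  const inverse = modInverse(multiplier, modulus);
  return {
    scramble: (n) => (n * multiplier + offset) % modulus,
    unscramble: (m) => ((((m - offset) * inverse) % modulus) + modulus) % modulus,
  };
}

// Demo on 12-character strings instead of full tweets.
const LENGTH = 12;
const { scramble, unscramble } = makeScrambler(
  LENGTH,
  1000000007n * 998244353n, // product of two primes, so coprime to 29
  123456789n
);

const plainIndex = textToIndex("hello world ");
console.log(indexToText(scramble(plainIndex), LENGTH));
console.log(indexToText(scramble(plainIndex + 1n), LENGTH));
console.log(indexToText(unscramble(scramble(plainIndex)), LENGTH)); // "hello world "
```

A single affine step is easy to see through, so a real world would probably chain several different reversible steps, much as the post describes using an array of ciphers.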

The raycaster engine I developed for it was based on some early work I did towards an HTML5 reboot of Derelict. Performance isn't fantastic, but it's still in a highly unoptimised state.

I'm not sure what to do with it next. Might do some form of online multiplayer mode where the chambers could be explored with a group. Definitely not happy with the clarity of the text, so I'll either need to double the texture size or double the text size. It may also be worth looking into some sort of API integration with Twitter itself.

The current version is up on a temporary page at http://earok.net/games/tab/index.html

Comments

arran4
Joined: 05/05/2009

Hah, I was wondering a couple of weeks ago what you were doing with your development time these days. So this is it. ;)

Personally I would have written it using a genetic algorithm, and used "englishishness" as a heuristic, then cached the rooms in a similar way to Spore. :)

Earok
Joined: 02/06/2009

Ah, but "englishness" by definition would filter out all non-English languages ;)

If you can come up with something in JavaScript that'd output a tweet with a certain amount of englishness based on a seeded input, and have it be two-way reversible (so you could give it a tweet and get the seed back), I'd be happy to try it out in the game.

arran4
Joined: 05/05/2009

I'm not saying filter. I am saying sort. :)

arran4
Joined: 05/05/2009

What structure is the data in? Being JS is annoying, as I can't source stuff off disk.

Earok
Joined: 02/06/2009

Hmm, I never really thought about sorting the tweets per their "englishness", that's an intriguing idea but I can't imagine how I could get it to work.

There's no "data" per se. There's a function that turns a tweet into world X-Y-Z-Room Position coordinates using common cipher methods, and vice versa.
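To illustrate the coordinate half of that idea, here is a sketch of how one big tweet number could be split into X, Y, Z and wall position and put back together. This is not lifted from main.js; the axis assignment, the packing order and the function names are all assumptions, and the cipher step that scrambles the number is left out.

```javascript
// Assumed layout: 29 wall positions per chamber, 29^46 chambers along
// X and Y, and 29^47 along Z. The axis assignment is a guess.
const WALLS = 29n;
const WIDTH = 29n ** 46n;   // X extent
const HEIGHT = 29n ** 46n;  // Y extent
const BREADTH = 29n ** 47n; // Z extent

// Split a tweet number (0 .. 29^140 - 1) into world coordinates.
function indexToPosition(index) {
  const wall = index % WALLS;
  let rest = index / WALLS;
  const x = rest % WIDTH;
  rest /= WIDTH;
  const y = rest % HEIGHT;
  const z = rest / HEIGHT;
  return { x, y, z, wall };
}

// Recombine world coordinates into the tweet number.
function positionToIndex({ x, y, z, wall }) {
  return ((z * HEIGHT + y) * WIDTH + x) * WALLS + wall;
}

const lastIndex = 29n ** 140n - 1n;
const corner = indexToPosition(lastIndex);
console.log(corner.z === BREADTH - 1n);              // true
console.log(positionToIndex(corner) === lastIndex);  // true
```

Because every step is plain integer division and remainder, the mapping is exact in both directions, which is what lets the game go from a tweet to a location as well as from a location to a tweet.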

If you wanted to see it, the source is not technically obfuscated, though it may be difficult to follow as it's translated from Monkey. It's at http://earok.net/games/tab/main.js

With JS you can source stuff off disk (although you need to be running on a web server). The chamber layout is actually loaded off the server from this file: http://earok.net/games/tab/data/layout.txt

Joshua Smyth (not verified)

If you split the tweet into 'words' and then use a Bloom filter, which is a data structure spell checkers use (http://en.wikipedia.org/wiki/Bloom_filter), you could then rank words (and thus the tweet) by how close they are to English.
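A minimal sketch of that suggestion in JavaScript might look like the following. The class, the hash scheme (an FNV-1a-style hash with double hashing) and the tiny word list are illustrative assumptions, not an existing library's API; a real dictionary would need a much larger bit array.

```javascript
class BloomFilter {
  constructor(bitCount, hashCount) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // FNV-1a-style 32-bit hash, varied by XORing a seed into the start value.
  static hash(text, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < text.length; i++) {
      h ^= text.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }

  // Double hashing: position i is (h1 + i * h2) mod bitCount.
  *positions(word) {
    const h1 = BloomFilter.hash(word, 0);
    const h2 = (BloomFilter.hash(word, 0x9e3779b9) | 1) >>> 0;
    for (let i = 0; i < this.hashCount; i++) {
      yield (h1 + i * h2) % this.bitCount;
    }
  }

  add(word) {
    for (const p of this.positions(word)) this.bits[p >> 3] |= 1 << (p & 7);
  }

  // False means "definitely not added"; true means "probably added".
  mightContain(word) {
    for (const p of this.positions(word)) {
      if ((this.bits[p >> 3] & (1 << (p & 7))) === 0) return false;
    }
    return true;
  }
}

// Fraction of a tweet's words that the filter thinks are dictionary words.
function englishness(tweet, filter) {
  const words = tweet.split(" ").filter((w) => w.length > 0);
  if (words.length === 0) return 0;
  const hits = words.filter((w) => filter.mightContain(w)).length;
  return hits / words.length;
}

const dictionary = new BloomFilter(1024, 3);
for (const word of ["the", "quick", "brown", "fox"]) dictionary.add(word);

console.log(englishness("the quick brown fox", dictionary)); // 1
console.log(englishness("xq zzkj the", dictionary));
```

A Bloom filter never misses a word that was added, but it can occasionally report a word that was not, so the score for gibberish is usually low rather than guaranteed to be zero.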

Earok
Joined: 02/06/2009

Very interesting, I had never heard of the concept before. There's a JavaScript Bloom filter library that I might be able to use.

Theoretically I could leave it examining chambers (sets of 29 tweets) at random for an extended period of time, recording the coordinates of the chamber it has found with the most English-language hits, which might make an interesting 'Tourist spot' in the world. Arran, I take it that's loosely what you had in mind when you mentioned caching?
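Roughly, that search could be sketched like this. The function and its three callbacks are hypothetical placeholders (nothing here comes from the game's source), and the toy world stands in for the real chamber generator so the example can run on its own.

```javascript
// Hypothetical helper: `randomCoords` returns chamber coordinates,
// `tweetsAt` returns that chamber's tweets, `score` rates one tweet.
function findTouristSpot(randomCoords, tweetsAt, score, samples) {
  let best = { coords: null, hits: -1 };
  for (let i = 0; i < samples; i++) {
    const coords = randomCoords();
    const hits = tweetsAt(coords).reduce((sum, tweet) => sum + score(tweet), 0);
    if (hits > best.hits) best = { coords, hits };
  }
  return best;
}

// Toy stand-in for the world: three chambers with two tweets each.
const toyWorld = {
  "0,0,0": ["qzx jkw", "the cat"],
  "1,0,0": ["the cat sat", "on the mat"],
  "0,1,0": ["zzzz", "qqqq"],
};
const knownWords = ["the", "cat", "sat", "on", "mat"];
const keys = Object.keys(toyWorld);
let next = 0;

const spot = findTouristSpot(
  () => keys[next++ % keys.length], // cycles instead of sampling randomly
  (coords) => toyWorld[coords],
  (tweet) => tweet.split(" ").filter((w) => knownWords.includes(w)).length,
  6
);

console.log(spot); // { coords: '1,0,0', hits: 6 }
```

In the real thing the word check would be the Bloom filter lookup, and the coordinate source would pick chambers at random rather than cycling through a list.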

arran4
Joined: 05/05/2009

Well, you managed to get the idea. However, my caching component was misleading. Somehow it didn't. :) I suspect the issue is getting things too close to pure English. You want some variance?