Just another blog about technology
Wiki tab sweep
Tab sweep of items I reviewed regarding how social policies support or enable the success of wikis:
General Wiki Statistics
- List of the largest installations of MediaWiki
- Information and stats about Wikimedia project
- General Statistics of Wikipedia
- Raw, public server-log style page statistics on Wikipedia
- Example of user community mining the public statistics published by Wikipedia
- Public charts showing Wikimedia usage stats
Examples of Wikipedia policies / social roles
- Welcoming Committee
- CheckUser, an example tool provided to select users on Wikimedia sites, along with the social policies that govern it
  http://meta.wikimedia.org/wiki/Checkuser
- Wikipedia arbitration policy
The study of wikis and why and how they work
- How and Why Wikipedia Works: An Interview with Angela Beesley, Elisabeth Bauer, and Kizu Naoko
  http://dirkriehle.com/computer-science/research/2006/wikisym-2006-interview.pdf
  http://dirkriehle.com/computer-science/research/2006/wikisym-2006-interview.html
- International Symposium on Wikis
- From Little Things, Big Things Grow
- Wiki Patterns
  http://www.wikipatterns.com/display/wikipatterns/Wikipatterns
- Measuring wiki viability
- What makes wikis work
- What makes wikis work well
  http://www.deltaknowledge.net/2008/08/what-makes-wiki-work-well.html
- Viable wikis
- Design principles of Wiki: How can so little do so much?
Specific Papers I found enlightening
- Cooperation and Quality in Wikipedia
  http://www.hpl.hp.com/research/idl/papers/wikipedia/wikipedia07.pdf
- Authorial Leadership in Wikipedia
- Quantitative Analysis of the Wikipedia Community of Users
- Structuring Wiki Revision History
- Exposing Wikipedia revision activity
  http://www.wikisym.org/ws2008/proceedings/research%20papers/18500102.pdf
- Wiki Trust Metrics
  http://www.wikisym.org/ws2008/proceedings/research%20papers/18500017.pdf
- Measuring author contributions to the Wikipedia
  http://users.soe.ucsc.edu/~luca/papers/08/wikisym08-users.pdf
- A method for measuring co-authorship relationships in MediaWiki
  http://www.wikisym.org/ws2008/proceedings/research%20papers/18500125.pdf
- Are wikis usable?
Graph processing
I’ve been wrestling for a while trying to find a good technique for large scale graph processing. Some wonderful folks here at work have come up with some cool solutions based on map-reduce.
Today, however, Google sent out a tease about their in-house solution — Pregel. It will be interesting to see what the details are when they discuss it.
Tab Sweep: Search
Current tabs
- Blog entry from Matthew Hurst exploring Google^2
- Blog entry from William Cohen, discussing the above entry and SEAL
- Google^2
- SEAL
Identifier Tab Sweep
I’m swapping into working memory the history and relationships of URLs, URNs, and URIs….
When two people know less than one
I can’t begin to count the number of times that I’ve seen it (and, sadly, participated sometimes — I like to believe I’m wiser now)….
Posit the question: Do two people who don’t know what they are talking about know more or less than one person who doesn’t know what he’s talking about?
One person will only go so far out on a limb in his construction of deeply hypothetical structures, and will often end with a shrug or a raising of hands to indicate the dismissability of his particular take on a subject. With two people, the intricacies, the gives and takes, the wherefores and why-nots, can become a veritable pas-de-deux of breathtaking speculation, interwoven in such a way that apologies or gestures of doubt are rendered unnecessary.
Typical scaling progression for a large website
John Engates, the CTO of Rackspace, has written a presentation about the typical stages of scalability that a website goes through.
Scalr
By its own description, “Scalr is a fully redundant, self-curing and self-scaling hosting environment utilizing Amazon’s EC2”. While I haven’t yet tried it, there certainly seems to be a market for management tools riding on top of EC2. Rightscale.com, a commercial alternative to Scalr, can attest to that.
I find the type of automation offered fairly compelling. We are approaching the day when deploying your application at internet scale will be push-button simple. Perhaps Time’s Person of the Year this time ’round should be a cloud.
Standalone, Java implementation of Bloom Filters
I wrote an implementation of Bloom Filters, which you can download from here. This implementation offers some advantages:
- Pluggable hash functions
- Pluggable “bit stores” (an array of bits)
- An option to store the Bloom filter in a file on disk. Using memory-mapped files, this allows for Bloom filters bigger than your JVM heap.
- An option to store the Bloom filter in RAM.
Instructions for using the bloom filter are in the javadocs.
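To make the "pluggable" idea concrete, here is a minimal sketch of how hash functions and bit stores could be swapped independently. It is not the code from the download: every name in it (`BloomFilterSketch`, `BitStore`, `RamBitStore`, `BufferBitStore`, `fnv1a`, `arrayHash`) is hypothetical, and it uses `int` bit indexes for brevity, which caps a filter at 2^31 bits.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.function.ToIntFunction;

// Illustrative sketch only: all names are hypothetical, not the API of the download.
final class BloomFilterSketch {

    /** Pluggable bit store: a fixed-size array of bits. */
    interface BitStore {
        int size();
        void set(int index);
        boolean get(int index);
    }

    /** Keeps the bits in RAM using java.util.BitSet. */
    static final class RamBitStore implements BitStore {
        private final BitSet bits;
        private final int size;
        RamBitStore(int size) { this.size = size; this.bits = new BitSet(size); }
        public int size() { return size; }
        public void set(int index) { bits.set(index); }
        public boolean get(int index) { return bits.get(index); }
    }

    /** Keeps the bits in any ByteBuffer, which may be a MappedByteBuffer over a file. */
    static final class BufferBitStore implements BitStore {
        private final ByteBuffer buffer;
        BufferBitStore(ByteBuffer buffer) { this.buffer = buffer; }
        public int size() {
            // Clamp to int range because this sketch uses int bit indexes.
            return (int) Math.min((long) buffer.capacity() * 8, Integer.MAX_VALUE);
        }
        public void set(int index) {
            int at = index >>> 3;
            buffer.put(at, (byte) (buffer.get(at) | (1 << (index & 7))));
        }
        public boolean get(int index) {
            return (buffer.get(index >>> 3) & (1 << (index & 7))) != 0;
        }
    }

    private final BitStore store;
    private final List<ToIntFunction<byte[]>> hashes;

    BloomFilterSketch(BitStore store, List<ToIntFunction<byte[]>> hashes) {
        if (store.size() < 1 || hashes.isEmpty()) {
            throw new IllegalArgumentException("need at least one bit and one hash function");
        }
        this.store = store;
        this.hashes = hashes;
    }

    /** Sets one bit per hash function. */
    void add(String item) {
        byte[] bytes = item.getBytes(StandardCharsets.UTF_8);
        for (ToIntFunction<byte[]> hash : hashes) {
            store.set(indexFor(hash, bytes));
        }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(String item) {
        byte[] bytes = item.getBytes(StandardCharsets.UTF_8);
        for (ToIntFunction<byte[]> hash : hashes) {
            if (!store.get(indexFor(hash, bytes))) {
                return false;
            }
        }
        return true;
    }

    private int indexFor(ToIntFunction<byte[]> hash, byte[] bytes) {
        return Math.floorMod(hash.applyAsInt(bytes), store.size());
    }

    /** 32-bit FNV-1a, used here only as an example hash function. */
    static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }

    /** A second example hash function, based on Arrays.hashCode. */
    static int arrayHash(byte[] data) {
        return Arrays.hashCode(data);
    }
}
```

Because `BufferBitStore` only needs a `ByteBuffer`, the same filter logic could run over a `MappedByteBuffer` returned by `FileChannel.map(FileChannel.MapMode.READ_WRITE, 0, size)`, which is one plausible way to get the on-disk option described above.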
Standalone, Java implementation of Cuckoo Hashing
While spending some time with Bloom Filters, I came across an interesting hashing technique called Cuckoo Hashing. In short, Cuckoo Hashing is a technique for building a hashtable with guaranteed worst-case O(1) lookup time (inserts are O(1) only in expectation). Very useful. Unfortunately, after poking around the net a bit, I wasn’t able to find any standalone implementations in Java. So I wrote one. If you find it useful, please drop me a line.
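For readers new to the technique, the following is a small sketch of a cuckoo hash set for strings. It is not the implementation mentioned above: the class name `CuckooSetSketch`, the kick limit, the two polynomial hash multipliers, and the doubling rehash policy are all assumptions chosen to keep the example short.

```java
import java.util.Objects;

// Illustrative sketch only: names and tuning constants are hypothetical.
final class CuckooSetSketch {
    private static final int MAX_KICKS = 32; // evictions allowed before rebuilding

    private String[] left;
    private String[] right;
    private int seed = 0x9E3779B9; // changed on every rehash to get new hash functions
    private int size;

    CuckooSetSketch(int capacityPerTable) {
        if (capacityPerTable < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        left = new String[capacityPerTable];
        right = new String[capacityPerTable];
    }

    int size() { return size; }

    /** Lookup inspects at most two slots, so it is O(1) in the worst case. */
    boolean contains(String key) {
        return key.equals(left[slot1(key)]) || key.equals(right[slot2(key)]);
    }

    void add(String key) {
        Objects.requireNonNull(key);
        if (contains(key)) {
            return;
        }
        String current = key;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            // Place the key in the left table, evicting any occupant.
            int i = slot1(current);
            String evicted = left[i];
            left[i] = current;
            if (evicted == null) { size++; return; }
            // Move the evicted key to its alternative slot in the right table.
            int j = slot2(evicted);
            current = right[j];
            right[j] = evicted;
            if (current == null) { size++; return; }
        }
        // Probably a cycle: rebuild with new hash functions, then place the homeless key.
        rehash();
        add(current);
    }

    private void rehash() {
        String[] oldLeft = left;
        String[] oldRight = right;
        left = new String[oldLeft.length * 2];
        right = new String[oldRight.length * 2];
        seed = mix(seed + 1);
        size = 0;
        for (String k : oldLeft) if (k != null) add(k);
        for (String k : oldRight) if (k != null) add(k);
    }

    private int slot1(String key) {
        return Math.floorMod(mix(poly(key, 31) ^ seed), left.length);
    }

    private int slot2(String key) {
        return Math.floorMod(mix(poly(key, 16777619) ^ seed), right.length);
    }

    /** Polynomial string hash; two multipliers give two different base hashes. */
    private static int poly(String key, int multiplier) {
        int h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = h * multiplier + key.charAt(i);
        }
        return h;
    }

    /** Integer bit mixer (the 32-bit finalizer from MurmurHash3). */
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

The constant-time guarantee comes from `contains`: each key can live in only one of two slots. The cost moves to `add`, which may evict a chain of keys and occasionally rebuild both tables.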
Scrum Master
I’m now a certified Scrum Master.