

One of the advantages of a wiki is being able to watch Recent Changes. It’s a nice way to see what is happening, whether you want to watch for vandalism, help collaborate on articles, or see who is active.

The problem is that Wikipedia’s recent changes list is crazy busy. There’s no way for one person to watch it. Wikipedia has a project devoted to tracking vandalism through Recent Changes, and there are even software tools written for this.

Some members of WikiProject Oregon watch for changes on Oregon-related pages. I use a large watchlist, but a more authoritative way to do it is to watch all 9135 articles in the project through the RecentChangesLinked function. It’s even on this blog: look at the upper right part of the page.
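For anyone who hasn't used RecentChangesLinked: you hand the special page the title of a page, and it shows recent changes to everything that page links to. Here is a minimal Python sketch of building such a URL. The list page title is a made-up placeholder (the real one isn't named in this post), and the `days` and `limit` query parameters are my assumption about what the special page accepts.

```python
from urllib.parse import quote, urlencode

WIKI_BASE = "https://en.wikipedia.org/wiki/"


def recent_changes_linked_url(list_page, days=None, limit=None):
    """Build a Special:RecentChangesLinked URL for the page holding the article list."""
    # MediaWiki titles use underscores instead of spaces; keep ":" and "/" readable.
    title = quote(list_page.replace(" ", "_"), safe=":/")
    url = WIKI_BASE + "Special:RecentChangesLinked/" + title
    # Optional filters (assumed parameter names, not taken from the post).
    options = {}
    if days is not None:
        options["days"] = days
    if limit is not None:
        options["limit"] = limit
    if options:
        url += "?" + urlencode(options)
    return url


# Hypothetical list page, used only for illustration.
print(recent_changes_linked_url("User:Example/Oregon articles", days=3, limit=500))
```

The only real requirement is that one page links to every article in the project; the special page does the rest.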

This depends on keeping a list of every article in the project. WikiProject Oregon member EncMstr has maintained this list by hand (with the help of a hand-run vim script). I realized this would be a great use of the MediaWiki API.

A long story later, the code is done, released under the Berkeley (BSD) license, and available on GitHub. It runs on my personal server daily; EncMstr used to run the manual update every few months.
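The released code isn't reproduced here. As a rough sketch of the general idea only, this is one way to pull a project's article list from the MediaWiki API with `list=categorymembers`, following `cmcontinue` until the results run out. The category name in the usage note, the User-Agent string, and the assumption that the category holds talk pages are all mine; the real tool may find its articles quite differently.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_params(category, cont=None):
    """Query parameters for one page of list=categorymembers results."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": "500",
        "format": "json",
    }
    if cont:
        params["cmcontinue"] = cont  # resume where the previous page stopped
    return params


def fetch_json(params):
    """Send one GET request to the API and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "project-list-sketch/0.1 (illustration only)"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def article_title(title):
    """Assumes banners sit on talk pages, so drop a leading 'Talk:' if present."""
    prefix = "Talk:"
    return title[len(prefix):] if title.startswith(prefix) else title


def list_project_articles(category, fetch=fetch_json):
    """Collect every member title, following cmcontinue until the API omits it."""
    titles = []
    cont = None
    while True:
        data = fetch(build_params(category, cont))
        members = data.get("query", {}).get("categorymembers", [])
        titles.extend(article_title(m["title"]) for m in members)
        cont = data.get("continue", {}).get("cmcontinue")
        if not cont:
            return titles
```

Calling `list_project_articles("Category:Example project articles")` (a hypothetical category) would return plain article titles, ready to be written out as a page of links. The `fetch` argument exists so the paging logic can be exercised without touching the network.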

Refreshing the list more frequently lets us watch the newest articles. Another bot usually finds 1-5 new articles per day that are related to Oregon, and these new articles can result in a lot of collaboration between us.

So, to echo a fellow Oregonian reporter, “I, for one, welcome our robot overlords!”

-tedder


I’m a bit of a geek and a motorcycle junkie. Combining them was natural, though it’s taken some time.

I uploaded my first photo on July 6, 2003, when I realized there was no article on redcedar bolts, which are blocks of cedar used to make shingles. Yes, I was out riding. In fact, I was dirt biking in far western Washington on July 4th.

Some of the following pictures were taken while we were motorcycling through Latin America. Here’s a replenishment ship from the Royal Netherlands Navy, just after leaving the harbor in Cartagena, Colombia. We saw it because we were getting our motorcycle around the Darien Gap using the services of a drunken pilot with a scary-small sailboat.

Now that we’re back in the States, I’ve really been enjoying contributing to WikiProject Oregon. I’ve recently been tackling the article for every high school in the state, getting each one up to a minimum standard (infobox, refs, location, coords, photo). The photo is the difficult part, as many of the schools are a long distance away.

I started playing with Category:Wikipedia requested photographs in Oregon, then User:Para pointed out the recursive category export tool, which outputs KML. Nice!

I then used gpsbabel to convert the KML to GDB, Garmin’s format. Score! I can now get the requested photo locations into my GPS. A recent trip to Tri-Cities, Washington is a good example of how I integrate photos into an existing motorcycle trip: plan the basic route, then add in locations with requested photos.
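For the curious, the conversion step can be sketched roughly like this. It wraps the gpsbabel command line (`-i`/`-f` for the input format and file, `-o`/`-F` for the output format and file) in a few lines of Python. The file names are placeholders, and this is not necessarily the exact invocation I ran.

```python
import subprocess


def gpsbabel_command(kml_path, gdb_path):
    """Return the argv list for converting a KML file to Garmin's GDB format."""
    return [
        "gpsbabel",
        "-i", "kml", "-f", kml_path,  # input format and input file
        "-o", "gdb", "-F", gdb_path,  # output format and output file
    ]


def convert(kml_path, gdb_path):
    """Run gpsbabel; requires the gpsbabel binary to be installed and on PATH."""
    # check=True raises CalledProcessError if gpsbabel exits with an error.
    subprocess.run(gpsbabel_command(kml_path, gdb_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names; this only prints the command, it does not run it.
    print(" ".join(gpsbabel_command("requested_photos.kml", "requested_photos.gdb")))
```

The resulting GDB file can then be loaded through Garmin's desktop software and sent to the GPS alongside the planned route.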

I added 99 miles and 4.2 hours to my trip over, and 34 miles and 2.3 hours to my trip back by taking the photos. In total, I collected 29 photos for Wikipedia. Here are some of my favorites:


I use a Garmin 60-series GPS for routing and storing tracks, and a SPOT Messenger so people can keep track of me in case something bad happens. My “big” camera is a Canon T1i (EOS 500D), and the motorcycle is a Suzuki V-Strom 650:

I hope this encourages you to get out and help take photos for WikiProject Oregon, or for any other part of Wikipedia! It can be done on foot, on bicycle, or via motorcycle/car/airplane/rocket.

It’s with regret that I direct your attention to this blog post from ReadWriteWeb (RWW). To sum it up: RWW, one of the 20 most visited blogs on the planet, has been on Wikipedia’s spam blacklist for something approaching a year.

Naturally, RWW founder and editor Richard MacManus was a bit miffed to learn of this. And like any netizen passionate about his work, he took steps to get the error corrected.

But the approach he took went horribly awry.

Apparently, Richard didn’t put much effort into determining what issues were at play. As a result, he began from a fundamentally flawed premise, which any regular Wikipedia editor could have pointed out to him: he confused the blacklist, a technical tool intended to combat the massive quantities of spam that get posted to Wikipedia articles, with Wikipedia’s general policy and guideline relating to verifiability and reliable sources. It’s true that citations to blogs are often discouraged, but that’s not because they’re blogs; it’s because most blogs don’t have a sufficient claim to being accurate and reliable. (Case in point, Richard’s post, which was apparently not run by anyone knowledgeable about Wikipedia.)

In short: there is no Wikipedia policy or guideline that rules out blogs or user-generated content from being cited on Wikipedia. The relevant policy and guideline outline some general considerations, but they make no outright prohibition on blogs.

What’s more, like all of Wikipedia, the guideline is open to influence. It’s ironic that someone who chooses to pontificate about the norms of a Web 2.0 world should fail so spectacularly to understand that constructive suggestions are the best (and often only) way to accomplish change in a community like Wikipedia.

I’m disappointed that the initial post set the stage for a bunch of ill-informed and non-constructive blog comments. I support Richard’s central contention that RWW should be removed from the blacklist, but his form of advocacy is damaging the public’s understanding of Wikipedia, and in my view reflects very poorly on ReadWriteWeb (a site that I generally admire).

Below is a comment I attempted to post in the thread, which hasn’t yet made it through moderation:


Over the years, I have seen numerous individuals and organizations in the SEO and SEM industry wringing their hands over what to do about Wikipedia. Some have simply ranted about the massive SEO success of the free encyclopedia. Others try some rather underhanded tricks to get their own “Wikipedia page.” (Hint: Want an article to stick around? Ask for one.) Here are three personal observations from a Wikipedian that I think my friends working in this field need to hear…

1. Wikipedia is not a marketing tool. Period. Anything you might do outside that mindset is an unproductive way to approach interaction with the site and its community. The benefits of an article or links can be substantial. But you’re not going to get either if you don’t think of how your actions benefit Wikipedia as an encyclopedia first and foremost. When you edit out of self-interest instead of altruism, you are not only being unethical; you’re being dense by trying to force Wikipedia to become something it’s not. If you can’t think of a way to link to your client or write an article that helps readers a lot more than it helps you, then don’t do either.

2. Getting angry at Wikipedia is counterproductive. Ranting and raving may feel cathartic, but it’s not going to help your business. Apologies if that seemed completely obvious to the smart people that I know are in this line of work. But you’d be surprised at whom I’ve heard blame their failure on Wikipedia’s success (not impressive to peers or clients), or get miffed when Wikipedia doesn’t respond well to their marketing efforts (see point one).

3. The best way to capitalize on Wikipedia is not to get in Wikipedia. It’s to learn from our successes (and failures), and to use these strategies for your own purposes. Understanding what makes Wikipedia successful and imitating those practices is not hard. Even on an infinitely smaller scale, valuable original content with a sensible internal linking structure will provoke the genuine inbound links you desire. Gleaning the best practices that Wikipedia has (almost entirely by accident) learned, and implementing them in an environment that you control saves you much time and effort, as well as avoiding the potential blow to your reputation if there’s a backlash.

What will not succeed in the long run is trying to leech off us. No amount of manipulating Wikipedia will make up for having a client no one cares about. Our community didn’t set out to dominate search engine results. We set out to write something worth reading. We don’t always fulfill that mission, but we try our damnedest. Do the same, and you’ll probably engender a similar result.

Conclusion? The real shortcoming of these two industries is not that they are filled with nefarious or lazy people. It’s that the laundry list of “tricks” for gaming Wikipedia has obfuscated the fact that a little honest work is the easiest way to get the results you want, both inside and outside Wikipedia. Perhaps it’s time you did some.

With 2009 underway, Brion Vibber and the rest of the great staff developers of MediaWiki at the Wikimedia Foundation have put their noses to the grindstone once again, rolling out one minor but distinctive feature on Wikipedia and testing another very significant one.

It’s the little things that make a difference

The first is the addition of friendlyclock. Once only a part of Friendly, an optional collection of JavaScripts that many Wikipedians use to automate common editing tasks, friendlyclock is a simple feature that adds an updating UTC clock in the top right-hand corner of the screen, next to the usual links for logged-in editors. Clicking it also purges the cache for the page you are viewing.

on the far right


Friendlyclock may not sound so exciting, but once you spend even an hour or two editing, you’ll come to appreciate having both of those features handy. In fact, I’ve had a JS gadget enabled that does the exact same thing for months now. It’s just this kind of incremental but gratifying change to the software that shows how sensitive Brion and the Wikimedia Foundation have been to the needs of the core community over the years.

Important new functionality

The second MediaWiki addition is a full extension that Brion announced Friday. Currently in beta on test.wikipedia.org, the aptly named Drafts extension is a serious advance in MediaWiki and wiki software in general.

It doesn't get much easier than this.


At least once, everyone has written an extensive draft only to see it disappear when human error, a browser crash, or a saving problem wiped out all that hard work. In fact, several months ago I saw exactly this happen to wiki inventor Ward Cunningham while he was using a MediaWiki installation.

Needless to say, an inability to save drafts in a wiki without a live version being saved as well has been frustrating at times. A lot of other great platforms (such as WordPress) have drafts capability built in already. But as far as I know, there is no wiki engine with native drafts functionality.

From some test edits I made to the Sandbox, I can tell that Drafts is a delightfully AJAXy addition to the ecosystem of MediaWiki extensions. Drafts can be saved via an easily accessible button, and they are also saved automatically every 120 seconds. Not only can you save and view drafts from a particular page, but you also get a special list of all your drafts.

I found the interface for viewing saved drafts extremely intuitive.


When it comes to wiki software, and MediaWiki in particular, I tend to be something of a stick in the mud. I’m used to MediaWiki and I like it just fine the way it is, thanks. But Drafts is one extension that I think is inarguably useful, and makes up for a key weakness in wiki software to date.