Aug 31 (2009)
converting mediawiki mysql database from latin1 to utf8
Filed under: work. | 2 Comments
Sometime after upgrading our copy of MediaWiki from the antique version I’d had to run, to the shiny latest version, I noticed (well, some of the wiki users noticed first…) that there were some borked characters – accented French characters, Mandarin characters, and fancy schmancy “smart quotes” were displaying as gobbledygook. Smelled like a UTF8-related issue – IIRC, MediaWiki switched the front end of the web app to be UTF8, but my database was languishing behind in krufty latin1 encoding. Oops.
So, how to convert a database that’s approaching a gigabyte of data? I started googling. Found some hints. But none of the solutions I found actually worked for me. So I googled some more, and duct-tape-and-bubblegummed a script together that seems to have successfully converted the database from latin1 to UTF8.
Here’s what I did, in case I need to do it again…
#!/bin/bash
# general config stuff
mysql_path="/usr/local/mysql/bin"
# source database config
# change this to point to the database server that is currently hosting the database.
source_host="localhost"
source_db="source_database_name" # mine was mediawiki
source_user="source_database_username"
source_pw="password" #change this. duh.
temp_sql_dir="/path/to/a/directory"
# destination database config
# change this to point to the database server that will host the converted database
# this could be the same server as above, but use a different database on it.
dest_host="localhost"
dest_db="dest_database_name" # mine was mediawiki-utf8
dest_user="dest_database_username"
dest_pw="password"
# magic happens
clear
for table in $("$mysql_path/mysql" --host="$source_host" --user="$source_user" --password="$source_pw" "$source_db" -e 'show tables' | grep -Ev 'Tables_in_'); do
    echo "Dumping $table"
    # dump the table from the first database
    "$mysql_path/mysqldump" --host="$source_host" --user="$source_user" --password="$source_pw" --extended-insert=false --quote-names --default-character-set=latin1 "$source_db" "$table" > "$temp_sql_dir/$table.sql"
    # convert the charset declarations from latin1 to utf8
    # sed seems to bork mysteriously, and the mysql 'replace' command borks the files, so I settled on perl...
    # you could add other transformations here, too...
    perl -pi -w -e 's/latin1/utf8/g;' "$temp_sql_dir/$table.sql"
    # import the converted table data into a fresh database table
    "$mysql_path/mysql" --host="$dest_host" --user="$dest_user" --password="$dest_pw" --default-character-set=utf8 "$dest_db" < "$temp_sql_dir/$table.sql"
done
# or not.
if [ "$table" = "" ]; then
    echo "No tables found in db: $source_db"
fi
It seems to have worked for me. It took maybe 5 minutes to convert almost a gig of data. It borked on one table – categorylinks failed because of some problem with the key, so I just manually copied it over myself after the script was finished.
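A failure like that is easy to miss when a few dozen tables scroll past. A minimal sketch of keeping score, assuming nothing beyond plain shell: run_step, report_failures, and failed_tables are hypothetical names, not part of the original script.

```shell
# Hypothetical sketch: remember which tables failed so they can be fixed by hand afterwards.
failed_tables=""

# run_step TABLE COMMAND [ARGS...]
# Runs COMMAND; on a non-zero exit status, records TABLE and returns 1.
run_step() {
    local table="$1"
    shift
    if ! "$@"; then
        echo "FAILED: $table" >&2
        failed_tables="$failed_tables $table"
        return 1
    fi
}

# Print a one-line summary once the loop has finished.
report_failures() {
    if [ -n "$failed_tables" ]; then
        echo "Tables needing manual attention:$failed_tables"
    else
        echo "All tables converted"
    fi
}
```

To use it, each step in the loop (dump, convert, import) would be wrapped in a small function and called through run_step "$table" ..., with report_failures called after the loop. It has to run in the main shell rather than a pipeline, or the list of failures is lost.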
It is highly probable that there is a better way to do this. Perhaps even some magic bit that could have been twiddled to do this automagically and/or instantly on the server. I couldn’t find the proper lever to throw, so wound up trying brute force. When in doubt, try brute force. It worked for me. It might not work for you. You’ve been warned.
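This brute-force approach leans on the bytes sitting in those latin1 tables already being UTF-8: the dump comes out untranslated and is then imported as utf8. A cheap sanity check on each dump file before importing, sketched here on the assumption that the iconv command is installed (is_valid_utf8 is a hypothetical helper name):

```shell
# Hypothetical helper: succeed only if the whole file decodes as valid UTF-8.
# Assumes the iconv command-line tool is available.
# Usage: is_valid_utf8 /path/to/table.sql || echo "not UTF-8, do not import as-is"
is_valid_utf8() {
    # Converting UTF-8 to UTF-8 changes nothing, but iconv exits non-zero
    # as soon as it hits a byte sequence that is not valid UTF-8.
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

If a dump fails the check, that table probably holds genuine latin1 bytes and needs a different conversion than the one in the script.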
Aug 28 (2009)
not my blog anymore
Filed under: general. | 6 Comments
A couple of weeks ago, the number of words in comments (i.e., stuff you wrote) passed the number of words in posts (i.e., stuff I wrote) on this blog. And now, the comment word count just pushed over half a million words.
This is no longer my blog. I’m not sure what it is, but it ain’t (just) mine, for sure.
Aug 20 (2009)
on the open education experience
Filed under: general. Tags: conferences, opened09, openeducation, openness. | 35 Comments
The Open Education conference last week was easily one of the best conferences I’ve ever participated in. It was intense, incredibly well run, thoughtfully planned, and brought together an extremely diverse and intelligent group of people. I can’t remember the last time I’ve been so intimidated by the sheer number of scary-smart people in the same room.
The conference was awesome. Lots of people have already recapped the conference itself – I’m not going to even try to add to that. I’m also not going to write a post about how fracking awesome everyone is, listing them all by name. I had a blast talking to everyone. They all rock. I am honoured to have had the chance to meet so many great new people, and to hang out with so many old friends. Blah blah blah…
What I was struck by was the ways I found the conference changing how I was thinking about education, openness, and inclusion. I felt a similar shift at the first Open Education conference I attended back in 2007, but this was a much deeper, more pervasive feeling.
Open Education is not about Resources
Although many of the sessions touched on Open Education Resources (OER, Learning Objects, content, etc…) there was a strong consensus that education is about so much more than content, and is also so much more than the tools and technologies used to present the content and connect the learners. This was a refreshing stance, as we seem to be highly content- and technology-centric when thinking about education (and Open Education, specifically). How do we shift the focus from content to interaction? From publishing and/or consuming to interaction and engagement? There were some interesting conversations about this, and although I don’t think there can be any solid answers, the fact that we’re looking at this stuff as more than just content, at education as more than just broadcast/receive, is a good sign.
Openness
Scott Leslie talks about “planning to share” vs. “just going ahead and sharing” – and the most interesting projects (and non-projects) all shared this theme. There were no RFPs, no committees, no Advisory Boards. People just started sharing. And that’s the only part of Openness that matters. It’s not about licenses, copyright, or anything other than just sharing what you’re doing.
And, there is also some hypocrisy in “open” projects – for example, the showing of a very short clip of RIP: A Remix Manifesto, at an education conference, in an art gallery, apparently cost over $100. And the distributors wanted over $300 to let us watch the entire movie. A movie that ends by saying “Download this movie” – and is not legally downloadable within Canada, even though it was produced by the National Film Board of Canada. Openness is not about licensing, it’s about sharing. And locking a movie that is inherently about sharing behind a paywall is breaking the spirit of openness. Hypocrisy.
Tribalism
At an evening session on copyright, Sonny Assu presented some of his work – where he appropriated many of the commercial symbols that have been pushed on us and have become part of our cultural heritage. He talked about how we now use these symbols as parts of our selected tribal identities. The tribe of the $5 coffee cup. The tribe of the white earbuds. This got me thinking about everything I saw in terms of tribalism and identity – which tribes or shared cultural groups do I broadcast membership in? What does that mean, for how other people perceive me? Do they see the symbols of the group identity? How does my perception of others’ group identities affect my interactions with them? How does this affect the relationships that are crucial in education? Lots of stuff to think about, and no answers to come.
Inclusion
Following on the thoughts of inclusion, and on the strong sense of male dominance at the conference (which was a veritable sausage party), I started thinking much more about inclusion. If the open education conference was so strongly over-represented by white males who shared similar backgrounds, why is that? If it’s not through active exclusion (there is no club to join, no registry to sign, no approval process), it may be through a sense of inclusion or non-inclusion. Why are women, people of colour, people of various other backgrounds, not as strongly represented here? Are they missing because they don’t feel welcome? Do they perceive a risk in joining the community? Do they see a barrier to entry? The middle-aged white dudes may not see barriers and risks, but are they tangible for others?
If so, what can be done to encourage others to actively participate in the community? Is that even something that is desirable for everyone? Does everyone’s participation need to be visible to be valid?
But… I said at the top of this post that the participants were extremely diverse. WTF? Well, they were, compared to some other edu- and tech- conferences. But they were hardly diverse when put into a global perspective. Yes, people were there from a long list of countries, and from a long list of institutions, but almost all shared a similar privileged western background.
Aug 20 (2009)
ucalgaryblogs spam removal
Filed under: aside.
I just removed 49 splogs, and at least as many spam accounts, from UCalgaryBlogs.ca – the funny/annoying part is that the scripts used by these cretins don’t grok the private/public checkbox on the registration form, so they were all marked as Private and Google hadn’t crawled a single one of them.
UCalgaryBlogs.ca now requires a valid ucalgary.ca email address to create a blog site. Once a site is created, you can add anyone to it, with any email address. But the spam blogs should hopefully be gone.
Aug 14 (2009)
Shortwave – extensible quicksearch bookmarklet
Filed under: general. | 2 Comments
Jim just asked me what I use to do a quick Wikipedia search in Safari – I just hit a key combo and a text entry window pops up, where I enter “w trans lux” and get to the Wikipedia page for the company behind the awesome Mighty Hercules cartoon.
I’m using Shortwave to search all kinds of crap – it’s like the Firefox search shortcut dealie, except that you don’t have to inflict Firefox on yourself in order to use it. It’s just a javascript bookmarklet that is configured to search a bunch of services, and you can extend it with your own services by editing a text file. I have it set to search my own flickr account for stuff, search my blog, my delicious.com account, etc… Very cool.
And if I put the bookmarklet on my bookmarks bar, it can be triggered by a keyboard shortcut – on mine, it’s the 9th bookmark in the bookmarks bar, so ⌘+9 brings up the box, and I’m off and running.
Easy. Free. Extensible. Shiny. Now you know, and knowing’s half the battle.
Aug 4 (2009)
on censorship in the Apple app store
Filed under: general. Tags: apple, censorship, dictionary, iphone, rants, software. | 6 Comments
I’ve been trying to be a voice of reason when it comes to how Apple operates. I’d rather see them as generally trying to do the right thing, but struggling sometimes with some of the nitty gritty things. Like letting individuals interpret blanket policies for what is and is not acceptable in the app store.
I’m fine with Apple deciding that an app is unacceptable if it crashes the iPhone. If it hijacks the cellular network. If it leaks memory, data, or something.
I’m not fine with Apple censoring apps. They hold the exclusive entry for software to get installed on an iPhone or iPod Touch. There is no other authorized way to install apps, without going through the Apple app store. And that means Apple has a very serious responsibility to act honourably, and in the best interests of its customers.
The latest app store controversy is swirling around Ninjawords, an application that provides a slick UI on top of the online Wiktionary dictionary database.
Someone at Apple decided to test the app by explicitly and manually searching for “fuck” “shit” and a few other stopwords. The software was designed to disable text autocompletion for questionable terms, so the only way to find them is to type them in yourself. But the developers missed “cunt” in their autocomplete filter in the last version. So Apple responds by slapping the app with a restricted 17+ rating – meaning kids don’t have access to a good dictionary on their Apple mobile devices.
Apple, this is not cool. You don’t get to censor content, especially content in a FUCKING DICTIONARY. Jesus fucking h. christ.
ps. this screenshot was taken of the Dictionary.app that came pre-installed on my Mac – the same Dictionary.app that my 6 year old son has unrestricted access to.
Update: Phil Schiller responded to John Gruber as a result of his post on DaringFireball.net – the response is a good one, but John’s take is pretty much the same as mine – even if Apple doesn’t censor the app themselves, there is pressure put on developers to censor themselves to avoid age-restrictive ratings. The inconsistent application of these ratings means writing an app can be a bit of a crap shoot. But Schiller’s email is a very good sign.



