Archive for the 'internet' Category

The problem with the “cloud model”

During the panel that I participated in at OSBC, the panelists were asked what they thought of the “Google model” of computing. I was rather negative about this. Actually, according to Yahoo News, I said “The Google model really scares me”. Yep, that sounds about right. I talked about this before, in “Where’s my data” and “Google’s subpoena is a sign of some of the downside of their server centric approach…”. So consider this part three of an ongoing series…

Here’s a new example of why the cloud model is flawed - and just for kicks, it’s not picking on Google but on Adobe this time. Adobe recently released Photoshop Express, a Web version of Photoshop. If you ignore the cute “the lawyers make us do this” tool-tips and actually click through the terms of use, you’ll find language like this:

Adobe does not claim ownership of Your Content. However, with respect to Your Content that you submit or make available for inclusion on publicly accessible areas of the Services, you grant Adobe a worldwide, royalty-free, nonexclusive, perpetual, irrevocable, and fully sublicensable license to use, distribute, derive revenue or other remuneration from, reproduce, modify, adapt, publish, translate, publicly perform and publicly display such Content (in whole or in part) and to incorporate such Content into other Materials or works in any format or medium now known or later developed.

So when you use their web software you grant Adobe worldwide, royalty-free, perpetual, irrevocable and sublicensable rights to use, reproduce, modify(!), publish(!) and derive revenue(!) from your images.

Wow. One more reason why I don’t like these “cloud services”.

Google’s power

Lately traffic on this blog has gone up quite a bit (more than quadrupled over the last couple of months). And the main driver of all these new visitors? Google.

More than 90% of the readers of this blog come from search now (up from less than 40% in December) and again more than 90% of the search hits come from Google.

Looking at the traffic and at the actual searches that got people to my blog brings some surprising results:

Which brings me back to the title of this post - even though this is clearly my blog (the posts are authored by “Dirk” and the domain is “Hohndel.org”) and even though I write about lots of things here, most of the people who get here through Google are not looking for me but for information on a small subset of the topics that I post about. And a change in Google’s ranking (which I can’t really influence, can I?) will completely change the number of people coming to this site and the topics that they might be interested in. Something to think about.

So besides silly search engine optimization tools - is there an easy way to fix the fact that searching for my name doesn’t get you here? Here’s one idea: if you happen to have a blog or site yourself… would you mind adding a link named “Dirk Hohndel’s blog” which points to the entry page of this blog (so something like <a href="http://www.hohndel.org/communitymatters">Dirk Hohndel’s blog</a>)? And drop me a line if you did (I’ll happily add a link here in this post back to you). I want to try these searches again in a few weeks and see if the rankings respond…

How to get from “here” to “there”

Sorry, this is goofy, but I think you’ll get a good laugh. Go to Google Maps and ask for directions from here to there. I’ll help you, just click this Google Maps link.

In case you’re wondering, it’s about 683 km (about 426 miles) and will take you almost 8 hours…

PHP limitations

Here’s what I want to do. I want to be able to open a file select box from PHP and select multiple files in that box and have the names returned to my PHP program in an array. Sounds very simple. And doesn’t seem to be possible.

Searching the web for various ways to describe this feature returns many pages that talk about ways to have multiple ‘Browse’ buttons on one page or point you to commercial solutions using Flash or Java applets to do the trick. But PHP itself doesn’t seem to be able to do this. That seems like a seriously missing feature to me.

Why do I want this? I have another WordPress blog where I post many pictures in my articles (not as galleries, just as part of the flow of my posts) and it would be much easier to simply select all the pictures that I’ve exported from Adobe Lightroom, upload them to my server and then add the references to the blog post. But it’s the “simply select all the pictures” part that isn’t possible in PHP.

A global EDGE plan?

Here’s something I think is really missing. Some vendors (like RIM with their BlackBerry service) manage to offer you a basically global data rate that allows you to access your email in any country with a GPRS or EDGE network (which means all but Japan, Korea and a couple more countries) at an affordable price.

But no one seems to be able to offer something similar for EDGE modem service (either through a USB modem, or through a phone like the Motorola Ming or others that can offer data service via Bluetooth) so that you could use your computer over this service. The international data rates that AT&T or T-Mobile will offer you are just simply outrageous - checking email can cost you ten dollars if you are in a different country.

I’ll keep searching…

Playing with AJAX

I’ve been playing with a cool WordPress plugin. AJAX’d WordPress allows you to add a number of features to your site - what I really like is the ability to show comments and post new ones in-line.

Pretty slick…

Is your child a computer hacker?

I found this hilarious article today and thought I should share it. Here are some gem quotes:

If your son has requested a new “processor” from a company called “AMD”, this is genuine cause for alarm. AMD is a third-world based company who make inferior, “knock-off” copies of American processor chips. They use child labor extensively in their third world sweatshops, and they deliberately disable the security features that American processor makers, such as Intel, use to prevent hacking.

Umm, maybe I shouldn’t link to this from my work blog :-)

Or how about this important piece of information:

BSD, Lunix, Debian and Mandrake are all versions of an illegal hacker operation system, invented by a Soviet computer hacker named Linyos Torovoltos, before the Russians lost the Cold War. It is based on a program called “xenix”, which was written by Microsoft for the US government. These programs are used by hackers to break into other people’s computer systems to steal credit card numbers. They may also be used to break into people’s stereos to steal their music, using the “mp3” program. Torovoltos is a notorious hacker, responsible for writing many hacker programs, such as “telnet”, which is used by hackers to connect to machines on the internet without using a telephone.

I love the Internet. Tons of amusing stuff coming through the tubes this morning.

Bloglines and server errors

Like many other people, I just love Bloglines. I can’t imagine how else I’d keep track of the blogs I read (and I read them on three different computers, so browser internal solutions or client programs just wouldn’t work for me).

Today I encountered an annoying problem. Both of my blogs were down for a couple of hours due to a database issue (MySQL decided not to allow connections from the local machine because there were too many connection failures (hu???)) and Bloglines got an invalid page when trying to open my feeds (basically, the WordPress error page). After that Bloglines immediately stopped crawling my sites and now shows the little red “[!]” behind the feeds in its display. And it appears that there is no way to make Bloglines aware that the issue has been fixed and to get it to re-crawl the feeds.

That seems like a bug to me. In case anyone has figured out a solution, please let me know (I sent them email, but they promise a response “within about two business days” - ahh, the problem with free services).

Postfix and SpamAssassin on OS X Tiger

I wrote about setting up Postfix on Tiger before. But after quite a while of procrastination I decided I also wanted to do something about the flood of spam that was sent to hohndel.org. SpamAssassin seems to be the preferred way to go (if you are in the open source camp). It’s bundled with Mac OS X Server - but why spend that money… it’s easy enough to set up from scratch.

These instructions are based on a posting by Kalinga Athulathmudali where he describes a similar setup, but not for OS X.

First make sure you have your CPAN setup straight. Some hints to make sure all is well are here.

Next, install SpamAssassin.
$ sudo perl -MCPAN -e shell
cpan[1]> install Mail::SpamAssassin
quit

Use the System Preferences of OS X to create a user named spamfilter. Give it a random password and make sure the user isn’t allowed to administer the system.

To work around a couple of issues with the way postfix deals with the return values of filters we’ll create a little script that will do the filtering. Save it as /usr/local/bin/spamfilter (the path that master.cf refers to below) and make it executable.

#!/bin/sh
# Clean up when done or when aborting.
trap "rm -f /tmp/out.$$" 0 1 2 3 15
# Pipe message to spamc
cat | /usr/bin/spamc -u spamfilter > /tmp/out.$$
# Hand the filtered message back to Postfix for delivery.
/usr/sbin/sendmail -i "$@" < /tmp/out.$$
# Postfix returns the exit status of the Postfix sendmail command.
exit $?

Next we make sure that spamd is started whenever the system boots. For this we simply create an entry in the StartupItems - a tar file with the necessary instructions can be found on the SpamAssassin Wiki.

Now you need to make sure that SpamAssassin is called from postfix; edit /etc/postfix/master.cf with the following two changes. First make sure that smtp over inet looks like this

smtp inet n - n - - smtpd
    -o content_filter=spamfilter:dummy


and then add an entry for the spamfilter:

spamfilter unix - n n - - pipe
    flags=Rq user=spamfilter argv=/usr/local/bin/spamfilter -f ${sender} -- ${recipient}


The last line, starting with flags=Rq, has to be entered as a single line; this WordPress theme makes it a little hard to render this correctly.

Run postfix reload to force postfix to read the new configuration and watch your logfiles to make sure that spamd is called correctly (remember that you need to start spamd as root in the background - or you can just reboot which will take care of that as well). Your /var/log/mail.log file should contain entries like this:

spamd[???]: spamd: connection from localhost [127.0.0.1] at port 49671
spamd[???]: spamd: setuid to spamfilter succeeded
spamd[???]: spamd: processing message for spamfilter:???

The final step is now to filter the spam mail from your normal mail flow. I prefer to use procmail for that. A simple entry like this in your .procmailrc file should do the trick, but that depends on the folder layout of your preferred mail client. As I mentioned before, I prefer mutt; in that case this should work:

PATH=/usr/bin:/bin:/usr/local/bin:.
SPAMMAIL=$HOME/Mail/spam

:0:
* ^X-Spam-Level: \*\*\*\*\*
$SPAMMAIL

With this, all mail with a Spam-Level of five or more will not land in your normal inbox but in a mailbox named spam in your mail folder.
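
To see why five escaped stars mean “five or more”, you can try the same pattern with grep. This is a rough sketch and not how procmail works internally: the function name has_five_stars is invented for this example, and unlike procmail it is case sensitive and looks at the whole input rather than only the headers.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the message on stdin has an
# X-Spam-Level header with at least five stars. The pattern is the same
# one used in the procmail condition above; it is not anchored at the
# end, so six or more stars match as well.
has_five_stars() {
    grep -q '^X-Spam-Level: \*\*\*\*\*'
}
```

For example, has_five_stars < some-message && echo "would be filed as spam" (with some-message being any saved mail) tells you where that mail would end up.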

As always, corrections and suggestions for improvement are welcome.

So Many Bots, So Little Time (part 2)

It was just a couple of weeks ago that I posted about the number of undesirable bots crawling my site. Over the last few days suddenly the number of genuine bots has just exploded. I’m back to more than two thirds of the requests coming from bots (actually closer to 80% at this point). I’m wondering if this is normal - and if I am spending too much time watching my log files :-)

I can identify at least eight different blog aggregators scanning my feeds - Technorati takes the prize for most impatient with an access every ten minutes. I see a similar number of search engines, at least two of them getting thoroughly confused by my old Blosxom sites (that I mostly leave around so that they are available for people who have linked to them). As a result these bots are requesting very odd URLs that are valid but redundant. I wonder if this helps or hurts my page ranking…

738 requests from Google in a day seems just a wee bit overkill. I am not posting THAT much. And I have a sitemap that theoretically tells Google which URLs to crawl and how frequently they are likely to change. Heck, it’s a protocol that they invented!

People are writing a lot about the fact that a large part of the internet traffic is spam. I am beginning to wonder how much of the web traffic is actual end users compared to bots crawling.
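
For anyone who wants to do the same kind of counting on their own logs, here is a rough sketch. It assumes an access log in Apache’s “combined” format (where the user agent is the last quoted field) and treats any user agent containing “bot”, “crawl” or “spider” as a bot, which is a crude guess that will miss some crawlers. The function names top_agents and bot_share are made up for this example.

```shell
#!/bin/sh
# Assumption: combined log format, so splitting on double quotes
# puts the user agent into field 6.

# Hypothetical helper: list the most frequent user agents
# (default: top 10) with their request counts.
top_agents() {
    awk -F'"' '{ print $6 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Hypothetical helper: print the percentage of requests (rounded down)
# whose user agent looks like a bot.
bot_share() {
    awk -F'"' '
        { total++ }
        tolower($6) ~ /bot|crawl|spider/ { bots++ }
        END { if (total > 0) printf "%d\n", (100 * bots) / total }
    ' "$1"
}
```

Running bot_share on one day’s access log gives a quick number to compare against, and top_agents shows which crawlers are the most impatient.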
