drupal

Server moved, PHP rebuilt

Yesterday, the entire Ermarian Network was migrated to a different server. After spending the past four years on millhouse, an AMD Opteron machine with two 2GHz cores and 2GB RAM, it is now hosted on carvin, a brand-new Intel Xeon with four 2.5GHz cores (hyperthreaded to eight virtual cores) and 16GB RAM. These impressive numbers are of course slightly misleading, since I'm one of a great number of users sharing the server.

Naturally, the very first thing that happened after the server move was that all my sites stopped functioning. This was expected (and had been warned about), since I was using a PHP interpreter I had compiled myself, dynamically linked against multiple system libraries. With a new system and different version numbers, there was bound to be a mismatch (in this case, libssl.so.0.9.7 being replaced by a newer version). I had to recompile PHP. While doing so, I also decided to switch to the new PHP 5.3.x release (I had been using 5.2.x before).

If you have ever installed software without a package manager, using only ./configure --prefix=$HOME && make && make install, you know what a royal pain this can be. But if you have never done this with PHP, you have no idea. Compiling only the barest essential extensions, those that are packaged with the PHP source, is not that bad. Even linking to mysql is easy. But I use a fairly wide mix of extensions that include cURL, libtidy, mcrypt and PEAR.

Installing this software and then linking with it is an intricate arcane ritual. For example, mcrypt requires both libmcrypt and mhash to be installed, but its installer offers no easy way of looking for those libraries in a custom folder. Since I have to do everything in my $HOME, and have no access to /usr/local, that's tricky.

I solved this using the following command:

env LD_LIBRARY_PATH=$HOME/lib LDFLAGS="-L$HOME/lib" CPPFLAGS="-I$HOME/include" ./configure --prefix=$HOME

The -L flag tells the linker to look for libraries in the custom directory, and the -I flag tells the compiler to look for header files there.

The second issue was far harder to identify: The PHP ./configure script aborted mysteriously, warning that there was an error with the mysql library. Confusingly, it did find the mysql library (changing to another path resulted in a different error), but the sample C++ code it tried to use to test the library failed.

Eventually, I found out this was not because of MySQL at all, but because the libltdl library it was trying to include was missing. This was an extra component of libmcrypt, and I had to install it separately using:

cd ~/files/installs/libmcrypt/
./configure --prefix=$HOME --enable-ltdl-install && make && make install

After doing this, the configure script ran through, as did the compilation and installation.

I then had to update a few symbolic links, rename the php-cgi file to php.cgi (to this day I have no idea why this is required), and add the old php.ini, and the Ermarian Network was once again functioning, now on PHP 5.3.3!

Note that Drupal 5 uses many deprecated functions, and its error handler does not recognize the new DEPRECATED bitmasks (8192 and 16384) of PHP 5.3.3, leading to a lot of "undefined offset" notices. That was also easy enough to fix, however.
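
The "undefined offset" notices come from looking up an error number in a table that predates the new levels. Here is a minimal sketch of that kind of fix, assuming an error handler that maps error numbers to labels through an array. The function name error_type_label() and the label strings are hypothetical and are not Drupal's actual code; only the numeric values are PHP's own.

```php
<?php
/**
 * Hypothetical helper: returns a readable label for a PHP error level.
 * Numeric literals are used so the table also works on PHP versions
 * that do not define the newer constants.
 */
function error_type_label($errno) {
  $types = array(
    2 => 'warning',
    8 => 'notice',
    256 => 'user error',
    512 => 'user warning',
    1024 => 'user notice',
    2048 => 'strict warning',
    // The two levels introduced with PHP 5.3:
    8192 => 'deprecated',        // E_DEPRECATED
    16384 => 'user deprecated',  // E_USER_DEPRECATED
  );
  // Guard the lookup so an unknown level never triggers a notice of its own.
  return isset($types[$errno]) ? $types[$errno] : 'unknown error';
}

echo error_type_label(8192), "\n";
```

The isset() guard is the part that matters in the long run: it keeps the handler quiet the next time PHP adds an error level.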

Taking leave from blog.module

This is my first post with the new, custom-created article content type. There is very little to outwardly distinguish it from the blog content type provided by Drupal's blog.module, except that the new type does not have a link to "Arancaytar's blog" below each entry (which is superfluous seeing as I am the only author on this site, and my blog is equivalent to the front page).

I am exploring a way out of blog.module because I do follow Drupal's core development very closely, and I have the feeling that this module is becoming an unloved stepchild (and not without reason, since most of its features are hard-coded versions of what could be done with contributed modules like Views). Ultimately, since this is a single-user blog, the recommended content type for me is the article type (what used to be story in Drupal 5).

This post serves as a kind of prototype. If I can make sure article nodes have the same settings as blog nodes, I will begin experimenting with switching my old nodes to the new type all at once.

Blog redesign

Half a year has passed since the last theme change, so tradition dictates I pick a new one again. Actually, this time I wanted to change because Agregado lacks fluid-width support. Seeing your website on a 2000px monitor, all alone in a narrow vertical beam in the center, really makes you question your web skills.

This new one is Colourised, and it is lovely. Lovely I say. I particularly love the prismatic spectrum in the background.

Naturally, I had to tweak quite heavily, as with all themes. The default is for the content to start well in the bottom half of the screen, for no other clear reason than to offer a clear view of the background image. Which is admittedly lovely, but it can be admired just as well behind the text. My tweak shifts the header, slogan and content all up a few notches.

After setting up the theme, my next priority was getting my page to validate again. This is something you need to retest regularly, because there will always be something around to break it again. In this case, I had to rewrite Drupal's entire linebreak generator, which wasn't able to work around in-line divs and headings in XHTML-valid ways. Pleased to say that it works very well now.

You can also see I have added a Latest Tweets block to my sidebar, since I'm using Twitter quite heavily these days.


I think I've been coding on one thing or another (proxydb, dhtml_menu, shell scripts, xbbcode) for well over twelve hours now. Mostly to avoid having to follow the news. I don't think I've felt physically ill like this since March 2003. Why do politics need to turn wonderful human beings into martyrs?

Proxy DB 0.3

I've toiled on a few more features for the Proxy DB module (described in the past few posts).

Changes include:

  • Password protection is optional (if no password is set, the list of random proxies is freely displayed).
  • If the PHP installation supports GeoIP, and a list of flag icons is independently downloaded to flag-icons/ in the file directory, then the location of every proxy will be marked with a flag icon.
  • Whole lists of proxies can now be submitted in batch (by uploading a file or pasting its contents directly). These lists can include proxies with multiple ports open (such as 3128,8080-8090,55555). The lot will be tested using Drupal's Batch API, with a neat progress bar, and only functional proxies will be saved.
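
A port list such as 3128,8080-8090,55555 has to be expanded into single ports before each one can be tested. The following is a minimal sketch of that expansion, not the module's actual code; the function name example_parse_ports() is hypothetical, and silently skipping malformed entries is an assumption.

```php
<?php
/**
 * Hypothetical helper: expands "3128,8080-8090,55555" into a sorted
 * list of unique port numbers.
 */
function example_parse_ports($spec) {
  $ports = array();
  foreach (explode(',', $spec) as $part) {
    $part = trim($part);
    if (preg_match('/^(\d+)-(\d+)$/', $part, $m)) {
      $first = (int) $m[1];
      $last = (int) $m[2];
    }
    elseif (ctype_digit($part)) {
      $first = $last = (int) $part;
    }
    else {
      // Skip anything that is neither a port nor a range.
      continue;
    }
    // Ignore reversed ranges and ports outside 1-65535.
    if ($first < 1 || $last > 65535 || $first > $last) {
      continue;
    }
    for ($port = $first; $port <= $last; $port++) {
      // Using the port as array key removes duplicates.
      $ports[$port] = TRUE;
    }
  }
  $ports = array_keys($ports);
  sort($ports);
  return $ports;
}

$ports = example_parse_ports('3128,8080-8090,55555');
echo count($ports), " ports to test\n";
```

Each resulting port would then become one operation in the batch.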

The Proxy DB module is available for download from the Ermarian Network, and there is a live production site at barred.ermarian.net.

* * *

In other news, it appears a young woman was killed in the protests (among many others) yesterday. RIP Neda.

Proxy database live

My last post on a Drupal-based database of proxy servers provided a link to the proxydb module I wrote. However, realistically the only potential user of the module, right now, is myself, since it is a very buggy unfinished version. So I set up a site ready for production use (after much further debugging).

The site runs on Drupal 7, which is extremely sleek. I still get a memory peak of almost 9 MB for bootstrap, unfortunately - but premature performance-tweaking is the root of all evil.

The newest code of the module can be downloaded at proxydb-7.x-0.2-r355.tar.gz.

The production site is at barred.ermarian.net. (I had the barred subdomain left over, and it seemed close enough in meaning to be repurposed for this).

---

Note that I am a newcomer to all this: Austin Heap already has a very good proxy list running (the development of which I'm following, and which I might contribute to as well). However, just like the proxies themselves, these resources are all at risk of filtering, so you could say "the more the merrier".

Proxy DB

After my last post on testing proxy servers, I spent the past 10 or so hours on what inevitably happens whenever I develop even the simplest web application: A Drupal module.

This particular module incorporates your basic "database table, create record, administer records" functionality for web proxies, but also uses a testing function similar to the one in the previous post, as well as a "continuous retesting" feature that rechecks the 15 oldest servers on each cron run (and eventually gives up on a server after a test repeatedly fails).
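
The retesting logic can be sketched roughly as follows. This is a minimal illustration on plain arrays, not the module's code: the function name example_retest_oldest(), the record fields (address, checked, failures) and the limit of three consecutive failures are all assumptions made for the example.

```php
<?php
/**
 * Hypothetical sketch: retest the $batch least recently checked proxies,
 * and drop any proxy that has failed $max_failures times in a row.
 * $test is a callback that takes an address and returns TRUE on success.
 */
function example_retest_oldest(array $proxies, $test, $now, $batch = 15, $max_failures = 3) {
  // Order the record ids by last check time, oldest first.
  $ids = array_keys($proxies);
  usort($ids, function ($a, $b) use ($proxies) {
    return $proxies[$a]['checked'] - $proxies[$b]['checked'];
  });
  foreach (array_slice($ids, 0, $batch) as $id) {
    if (call_user_func($test, $proxies[$id]['address'])) {
      // A success resets the failure counter.
      $proxies[$id]['failures'] = 0;
    }
    else {
      $proxies[$id]['failures']++;
    }
    $proxies[$id]['checked'] = $now;
    if ($proxies[$id]['failures'] >= $max_failures) {
      // Give up on this server.
      unset($proxies[$id]);
    }
  }
  return $proxies;
}

$proxies = array(
  'a' => array('address' => '192.0.2.1:3128', 'checked' => 100, 'failures' => 2),
  'b' => array('address' => '192.0.2.2:8080', 'checked' => 200, 'failures' => 0),
  'c' => array('address' => '192.0.2.3:8080', 'checked' => 300, 'failures' => 0),
);
$always_fail = function ($address) { return FALSE; };
$result = example_retest_oldest($proxies, $always_fail, 400, 2);
```

With a batch size of 2 and a test that always fails, proxy a is dropped, b records its first failure, and c is left for a later cron run.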

It's still full of bugs, since I don't actually have a database with dozens of proxies, and some parts are still untested. But I've managed to run it locally, add my own local proxy server to it, and then see it in the list.

The code is available over at the Ermarian Network, like my other Drupal modules that are not fit for d.o contrib (yet).

(Disclaimer: The above does not endorse any political view beyond that of defeating censorship in all its forms. As the sadly-fictional UN commissioner Pravin Lal said in Alpha Centauri: "Beware of anyone who would deny you access to information." I'll interfere in that any day. :) )

Synching with Twitter

I'm trying to get my Drupal blog to submit every update to my Twitter account. Let's see if it works - I've definitely seen this in action among some Drupalers, so it shouldn't require any tweaking like the Drivel/Taxonomy thing did.

... update, some hours later... well, the Twitter part was easy; it was, predictably, the Drivel/Twitter part that was such horror.

Learning point: Drupal's Blog API has the problem that it implements the API of less powerful tools. MovableType doesn't even seem to have a free-tagging mechanism, let alone multiple vocabularies. The solution, really, is to make Drupal-specific XMLRPC functions. Drupal offers features that leave other blogging engines in the dust, and the only way to make sure all these features can be used by client applications is to set its own standard rather than only implementing other APIs.

Inline Tags with Drivel

Whoops, a bit of a bug in my code. Let's test it now...

Update: Apparently, not only do I not know how to use str_replace, I'm also clueless about escaping characters in regular expressions.

Update 2: Unfortunately, I've had to hack the Blog API in about three different places to even allow multiple vocabularies and leave my Free Tag vocabulary alone. Then I needed to hack in a new 'blogapi prepare' op for hook_nodeapi to allow loading the tags into the body on editing - and rewrite half of the somewhat dusty Inline Tags module for better handling of the markup.

But at least it should work this time. Let's hope.

Update 3: Needed to hack out taxonomy validation entirely. But I think that's all.

Multisite environments for EVERY web application

As a webmaster who uses open source web applications, you frequently have to download software updates. Web software is susceptible to security flaws that must be patched quickly, and even when bug-free it is improved constantly.

If you operate a lot of sites on the same software, you run into a very annoying problem: Every site needs its own codebase. For five MediaWiki sites, you will unzip the same MediaWiki package five times, update it five times, possibly hack your own modifications into it five times, and need five times the disk space.

Practically no CMS software (apart from Drupal, and arguably Typo3) is tailored for power users. They all assume that the codebase is equivalent to the site, and do not allow multiple sites using the same code.

Drupal shows that multisite support is extremely simple: Just use a sensible algorithm to convert URLs into folder names of varying specificity, then include the most specific settings file that matches the current URL. In fact, this idea is so beautiful that I decided to steal it.

To generalize the concept of multisites, you need the following things:

  • a sites/ subfolder in the software installation, where all site settings folders are contained.
  • a standard configuration file where the software expects it, which determines the active site and includes the appropriate settings file.

It's basically the concept of Apache's "virtual hosts" applied to PHP software.

First, I have the multisites.inc file, which contains the algorithm that finds site folders by the URL. This is basically cribbed from Drupal (with some modification), which means it uses the following order of priority:

http://my.site.com/folder/sub will be looked for, in this order, in:

  • my.site.com.folder.sub
  • site.com.folder.sub
  • com.folder.sub
  • my.site.com.folder
  • site.com.folder
  • com.folder
  • my.site.com
  • site.com
  • com
  • default

Here is the code:

<?php
/**
 * Determines the active site by URL and returns
 * the folder the settings file is in.
 * Settings file and site directory can be customized;
 * the defaults are "settings.php" and "sites".
 * If $require_settings is FALSE, an existing site folder
 * matches even when it contains no settings file yet.
 */
function multisite_load_conf($conffile = 'settings.php', $confdir = 'sites', $require_settings = TRUE) {
  static $conf = '';
  // If for some reason the conf file is loaded again
  // in the same request, save some time:
  if ($conf) return $conf;

  // When running an update script from the command line, allow an alternative:
  if (empty($_SERVER['HTTP_HOST'])) {
    foreach ($_SERVER['argv'] as $i => $arg) {
      if ($arg == '-site' && $i + 1 < count($_SERVER['argv'])) {
        $conf = $confdir . '/' . $_SERVER['argv'][$i + 1];
      }
    }
    if (!$conf) exit("You are running PHP in a command line. "
            . "Multisite requires '-site <confdir>' as an argument.\n");
    return $conf;
  }

  // Start with the path and the domain split into tokens.
  $uri = explode('/', $_SERVER['SCRIPT_NAME'] ?
              $_SERVER['SCRIPT_NAME']
              : $_SERVER['SCRIPT_FILENAME']);
  // The host is split on dots; a port, if present, is moved to the
  // front ("my.site.com:8080" becomes 8080, my, site, com).
  $server = explode('.', implode('.', array_reverse(
              explode(':', rtrim($_SERVER['HTTP_HOST'], '.')))));

  // Strip sub-domains from the left, and folders from the right.
  for ($i = count($uri) - 1; $i > 0; $i--) {
    for ($j = count($server); $j > 0; $j--) {
      $dir = implode('.', array_slice($server, -$j))
                 . implode('.', array_slice($uri, 0, $i));
      if (   file_exists("$confdir/$dir/$conffile")
          || (!$require_settings && file_exists("$confdir/$dir"))) {
        $conf = "$confdir/$dir";
        return $conf;
      }
    }
  }
  $conf = "$confdir/default";
  return $conf;
}
?>
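
The lookup order can be checked without touching the filesystem. The following is a minimal sketch, assuming a hypothetical helper multisite_candidates() (not part of multisites.inc) that runs the same two loops but collects the folder names instead of testing whether they exist:

```php
<?php
/**
 * Hypothetical helper: lists the site folder names that would be tried
 * for a given host and script path, most specific first.
 */
function multisite_candidates($host, $script_name) {
  // "/folder/sub/index.php" becomes '', 'folder', 'sub', 'index.php'.
  $uri = explode('/', $script_name);
  // "my.site.com" becomes 'my', 'site', 'com'; a port would move to the front.
  $server = explode('.', implode('.', array_reverse(explode(':', rtrim($host, '.')))));

  $candidates = array();
  // Outer loop: drop path segments from the right.
  for ($i = count($uri) - 1; $i > 0; $i--) {
    // Inner loop: drop sub-domains from the left.
    for ($j = count($server); $j > 0; $j--) {
      $candidates[] = implode('.', array_slice($server, -$j))
        . implode('.', array_slice($uri, 0, $i));
    }
  }
  $candidates[] = 'default';
  return $candidates;
}

$candidates = multisite_candidates('my.site.com', '/folder/sub/index.php');
echo implode("\n", $candidates), "\n";
```

For this host and path, the helper produces the ten folder names from the priority list, starting with my.site.com.folder.sub and ending with default. Because the first element of the split path is an empty string, joining the path with dots already supplies the dot between the domain part and the folder part.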

Then, in the normal config file ("LocalSettings.php" or "config.php" or whatever) you just need the following code:

<?php
require_once 'multisites.inc'; // or wherever you put this file.
$settings = multisite_load_conf(); // adjust default file and folder names as needed.
include_once "$settings/settings.php"; // but first make sure it exists to avoid errors.
?>
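
The existence check mentioned in the last comment could look roughly like this. It is a sketch only; the helper name example_settings_path() is hypothetical, and 'sites/default' stands in for whatever multisite_load_conf() returned.

```php
<?php
/**
 * Hypothetical helper: returns the path of the settings file inside a
 * site folder, or NULL if the file is missing or unreadable.
 */
function example_settings_path($site_dir, $conffile = 'settings.php') {
  $path = "$site_dir/$conffile";
  return is_readable($path) ? $path : NULL;
}

// The include itself stays at the top level of the config file: a file
// included from inside a function defines its variables only in that
// function's scope, so the application would never see them.
$path = example_settings_path('sites/default');
if ($path !== NULL) {
  include_once $path;
}
```

What to do when no settings file is found (stop with an error, or fall back to an installer) depends on the application.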

The actual config file for the site has to be moved to the proper site folder. Needless to say, any software that generates this file automatically may cause you some trouble when installing. Yet compared to the strain of keeping several different codebases up to date, this trickery is very much worth it.

I use it with MediaWiki and phpBB successfully, and it should work with any software as long as it does not try to place static files into a hard-coded location within its own installation directory. If it does, you will need to hack this hard-coded value to allow it to be configured.
