Wednesday, November 23, 2011

HDFS namenode won't start after disk full error

If the NameNode in a Hadoop cluster won't restart after a disk full error, and you don't mind losing some recent edits, you can do the following to get it back up.

Find the 'edits' file under dfs.name.dir/current and overwrite it with this byte sequence:

printf "\xff\xff\xff\xee\xff" > edits

After that, you should be able to start Hadoop. Credit here.

Friday, November 04, 2011

Bash - prevent multiple copies of script from running

Since each command in a bash script runs in its own process, we can't rely on file locks to get single-copy semantics. Why? File locks are per process and are released automatically when the process dies. So if a command in the script locks a file, that lock is dropped as soon as the command returns, defeating the purpose of the lock completely.

One easy way to prevent multiple copies from running is to find a Linux command that atomically performs an operation and reports whether it succeeded, and that fails the second time it is run. The command to make a directory - mkdir - is one such command.

So the script tries to mkdir a particular directory - call it the lock directory. If that fails, another copy is running and we don't start. If it succeeds, we must remove the lock directory when the script ends so that the script can run again later. We do this with the trap command - trap ensures a given command is executed whenever the script exits, at any point.

Here is the code:

#!/bin/bash
# Take the lock: mkdir is atomic and fails if the directory already exists.
mkdir /tmp/locka 2>/dev/null || {
    exit 1
}
# Remove the lock directory when the script exits, however it exits.
trap "rmdir /tmp/locka" EXIT
# Script work goes here; the sleep 10 below is just to test this
# without having a real script.
sleep 10

Thursday, July 28, 2011

Java : write binary data to a mysql out file

I had the need to generate - within Java code - a MySQL outfile containing both text and binary data. The binary data is content that has been gzipped and stored as a blob in a MySQL table. While it is trivial to write binary data to a blob field directly using JDO, for performance reasons we had to use the LOAD DATA INFILE approach. Thus the first step was to create an outfile.

Here is the function that converts binary data to a form that can be written to an outfile. It follows the escaping algorithm implemented by MySQL for its "SELECT ... INTO OUTFILE" functionality, as described here.

    public static byte[] getEscapedBlob(byte[] blob) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        for (int i=0; i<blob.length; i++) {
            if (blob[i]=='\t' || blob[i]=='\n' || blob[i]=='\\') {
                bos.write('\\');
                bos.write(blob[i]);
            } else if (blob[i] == 0) {
                bos.write('\\');
                bos.write('0');
            } else {
                bos.write(blob[i]);
            }
        }
        return bos.toByteArray();
    }

This is how you would use this function to generate a mysql outfile.

//gen outfile for mysql
byte[] out = getEscapedBlob(data);
BufferedOutputStream f = new BufferedOutputStream(new FileOutputStream("/path/to/data.csv"));
String nonBlobFields = "\\N\t10\t20100301\t18\t1102010\t2010-03-01 00:00:00\t";
byte[] nonBlobData = nonBlobFields.getBytes("UTF-8");
f.write(nonBlobData, 0, nonBlobData.length);
f.write(out, 0, out.length);
f.write('\n');
f.close();


This writes some integer fields followed by the blob data to the outfile, which can then be loaded back using LOAD DATA INFILE.
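
For completeness, here is a minimal JDBC sketch of loading the generated file back; the connection URL, credentials, and table name (my_table) are made up, and the statement relies on the default tab/newline field and line terminators, which match the format written above.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LoadInfileExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection settings; adjust to your environment.
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/mydb?allowLoadLocalInfile=true",
                "user", "password");
        Statement stmt = conn.createStatement();
        // The default FIELDS TERMINATED BY '\t' and LINES TERMINATED BY '\n'
        // match the tab-separated, newline-terminated rows written above.
        stmt.execute("LOAD DATA LOCAL INFILE '/path/to/data.csv' INTO TABLE my_table");
        stmt.close();
        conn.close();
    }
}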

Thursday, July 21, 2011

Ubuntu : Install packages on a cluster of machines

Sometimes you have a cluster of machines on which some packages need to be installed. It would be nice to automate this so that you can do everything from a single terminal. We have seen before how a command can be run on multiple machines from a single terminal. This only works if you have password-less ssh set up between the machine you are running the command from and the cluster on which the command should actually run. The one aspect that makes this a little harder for installing software is that you need to be root to install packages, and ssh keys are generally not set up for root.

However, sudo has an option, -S, that makes it read the password from stdin. We can combine this with a bash loop to come up with a one-liner that installs a package across a cluster of machines.

for m in m1 m2 m3 m4 ; do echo $m; ssh $m "echo password | sudo -S apt-get -y install curl" ; done

The -S option makes sure the command will not prompt you for a password or complain about a missing tty. The -y option makes apt-get proceed without prompting before the install. Note that the password appears in your shell history and in the process list, so only use this where that is acceptable.

Friday, July 15, 2011

Mac / Microsoft Excel / newlines (\r \n)

It is frequently the case that the business department hands over Excel files to the engineering department for some type of data processing. The first step is to convert these to proper comma-separated text files (csv).

If you do this conversion using Microsoft Excel on a Mac, you'll notice that the resulting file does not have Unix-style newlines. A Unix newline is the 0x0a character, also written as \n. What Excel produces is the 0x0d character, also written as \r.

Most Linux commands do not recognize \r as a line ending. There are several ways to convert the \r characters to proper Unix-style line endings; using the vi editor is a common one. However, if the Excel spreadsheet has blank columns, Excel sometimes insists on writing a possibly large number of \r characters at the end of the csv file. The vi method would write a newline for each of these \r characters, which is not ideal.

Instead, you could use this perl one-liner to accomplish both: turn every \r into \n except for the trailing \r characters, which are dropped:

perl -ne 's/([^\r])\r/$1\n/g; s/\r//g; print;' imported.csv

The first regular expression replaces any non-\r character followed by \r with that character followed by \n. The trailing \r characters do not match this pattern and are left alone. The second regexp then removes these remaining \r characters.

Wednesday, July 06, 2011

Linux Shell, HUP and process status on logout

It used to be the case that all processes a user starts are killed by the shell upon logout. Not any more, as recent experiments with Ubuntu 10.04 show.

The shell can be configured to send a HUP signal to its children when the shell exits. This is controlled by the huponexit shell option as explained in the bash man page:

If the huponexit shell option has been set with shopt, bash sends a SIGHUP to all jobs when an interactive login shell exits.

Determine the setting of huponexit with:

shopt huponexit

If it is "off", then processes started by the user will remain running after logout. This setting makes it easier to start a long running process simply from within the shell, without invoking a screen and without having to wrap the process in nohup.

Here is a discussion on the issue.

However, this setting can cause problems for interactive sessions, where a new user may end up referring to an old user's now-invalid processes.

Thursday, June 23, 2011

Asynchronous UDP server using Java NIO

UDP is a light-weight protocol as compared to TCP. When the data transmitted is small (in hundreds of bytes), and an occasional loss of data is not critical, UDP can be used to improve throughput of the program.

The native (C) sockets library provides the epoll facility - available on Linux 2.6.x kernels - which can be used for both TCP and UDP sockets. In an earlier post, I described a framework for implementing an asynchronous client that connects to multiple servers over TCP. I found several code examples describing how Java NIO can be used for that purpose. It turns out it is even simpler to write an NIO server for UDP.

I would not recommend writing a UDP server if the request/response cannot be transmitted in a single UDP packet, or if a packet depends on an earlier packet. UDP packets can arrive out of order, and the headers carry no sequence numbers to enable re-ordering. If you want to handle reordering, you will end up re-implementing what TCP already provides, and it is probably a better idea to stick with TCP.

The following program does well when the request/response fits in a single UDP packet. 512 bytes is generally considered the safe maximum size, and the DNS protocol, for example, limits messages to 512 bytes when it uses UDP.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.Charset;
import java.util.Iterator;

public class ASyncUDPSvr {
    static int BUF_SZ = 1024;

    class Con {
        ByteBuffer req;
        ByteBuffer resp;
        SocketAddress sa;

        public Con() {
            req = ByteBuffer.allocate(BUF_SZ);
        }
    }

    static int port = 8340;
    private void process() {
        try {
            Selector selector = Selector.open();
            DatagramChannel channel = DatagramChannel.open();
            InetSocketAddress isa = new InetSocketAddress(port);
            channel.socket().bind(isa);
            channel.configureBlocking(false);
            SelectionKey clientKey = channel.register(selector, SelectionKey.OP_READ);
            clientKey.attach(new Con());
            while (true) {
                try {
                    selector.select();
                    Iterator selectedKeys = selector.selectedKeys().iterator();
                    while (selectedKeys.hasNext()) {
                        try {
                            SelectionKey key = (SelectionKey) selectedKeys.next();
                            selectedKeys.remove();

                            if (!key.isValid()) {
                              continue;
                            }

                            if (key.isReadable()) {
                                read(key);
                                key.interestOps(SelectionKey.OP_WRITE);
                            } else if (key.isWritable()) {
                                write(key);
                                key.interestOps(SelectionKey.OP_READ);
                            }
                        } catch (IOException e) {
                            System.err.println("glitch, continuing... " +(e.getMessage()!=null?e.getMessage():""));
                        }
                    }
                } catch (IOException e) {
                    System.err.println("glitch, continuing... " +(e.getMessage()!=null?e.getMessage():""));
                }
            }
        } catch (IOException e) {
            System.err.println("network error: " + (e.getMessage()!=null?e.getMessage():""));
        }
    }

    private void read(SelectionKey key) throws IOException {
        DatagramChannel chan = (DatagramChannel)key.channel();
        Con con = (Con)key.attachment();
        con.req.clear();                  // reuse the request buffer for each incoming datagram
        con.sa = chan.receive(con.req);   // remember the sender's address for the reply
        System.out.println(new String(con.req.array(), 0, con.req.position(), "UTF-8"));
        con.resp = Charset.forName("UTF-8").newEncoder().encode(CharBuffer.wrap("send the same string"));
    }

    private void write(SelectionKey key) throws IOException {
        DatagramChannel chan = (DatagramChannel)key.channel();
        Con con = (Con)key.attachment();
        chan.send(con.resp, con.sa);
    }

    static public void main(String[] args) {
        ASyncUDPSvr svr = new ASyncUDPSvr();
        svr.process();
    }
}

When dealing with small payloads that fit in one packet, if NIO signals that data is available to be read, then all of the data must be available. Thus the protocol does not need to worry about accumulating network data in buffers. We still need an object tied to each client "connection", since the reading and writing happen in two distinct parts of the code.

First, after binding our UDP socket on the server, we register read interest with NIO. When NIO wakes us up - via the select() call - we can immediately read the full request made by the client. At this point we form our response, but we do not want to write it back to the network right away, as the kernel buffers may be full and the write may not go through. So we store the response on the object attached to the client connection (via the SelectionKey attachment), tell NIO we are now interested in writes, and go back to the select() loop.

Next, when NIO wakes us up from the select() call with the socket writable, we proceed to write. Again, since the data fits in one packet, we know the send() call need not be retried and all the data will be sent.

However, the nature of UDP means we do not get the advantages TCP offers under epoll(). A UDP server does not get a separate socket for each new client, so the selector is always watching just the single socket, and every client sends its datagrams to that socket's single receive buffer.

A threaded server that does not use epoll() might work better here. Each thread could wait on the single server socket in a receive() call; the kernel ensures that exactly one thread wakes up per datagram. I hope to build such an implementation and measure both designs.
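
Here is a minimal sketch of that alternative, blocking design (not yet measured); the class name, port, buffer size, and thread count are arbitrary choices for illustration.

import java.net.DatagramPacket;
import java.net.DatagramSocket;

public class ThreadedUDPSvr {
    static final int PORT = 8340;       // same port as above
    static final int BUF_SZ = 1024;
    static final int THREADS = 4;       // arbitrary thread count

    public static void main(String[] args) throws Exception {
        final DatagramSocket socket = new DatagramSocket(PORT);
        for (int i = 0; i < THREADS; i++) {
            new Thread(new Runnable() {
                public void run() {
                    byte[] buf = new byte[BUF_SZ];
                    try {
                        while (true) {
                            DatagramPacket packet = new DatagramPacket(buf, buf.length);
                            // Each thread blocks here; a datagram is delivered to exactly one of them.
                            socket.receive(packet);
                            byte[] resp = "send the same string".getBytes("UTF-8");
                            socket.send(new DatagramPacket(resp, resp.length, packet.getSocketAddress()));
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }
}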

Friday, June 10, 2011

Java splitting an empty string

In Java, splitting an empty string results in an array whose single element is an empty string - not intuitive. The expected result is either a null array or a zero-length array, which is what Perl and Python give you.
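
For reference, here is a small standalone snippet (the class name is arbitrary) demonstrating the Java behavior:

public class SplitEmpty {
    public static void main(String[] args) {
        String[] parts = "".split(" ");
        // Prints 1: the result is a one-element array containing the empty string.
        System.out.println(parts.length);
        System.out.println("[" + parts[0] + "]");   // prints []
    }
}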

Perl:

$$$:~$ perl -e '@x=split(/ /, ""); $s=@x;print "$s\n"'
0

Python:

$$$:~$ python -c 'list="".split();l=len(list);print(l)'
0

Thursday, June 09, 2011

/dev/urandom does not generate correct multi-byte sequences

If you use /dev/urandom with tr to generate random strings, you may have a problem if your shell uses a multi-byte locale. Upon encountering byte sequences that are invalid in that locale, tr complains with "tr: Illegal byte sequence".

Setting LC_CTYPE=C before tr does the trick:

cat /dev/urandom | LC_CTYPE=C tr -dc 'a-zA-Z0-9'

Friday, May 20, 2011

Some web servers are in love with 30X redirects

Here is a URL that redirects via 30X response headers no fewer than nine times before finally returning a 200:

http://join.scoreondemand.com/strack/MTAwNC45LjQ0LjQ0LjI5LjAuMC4wLjA/scoreondemand/64/0/Default.aspx

The evidence:

mpire@seaxoaff01:~$ curl -I http://join.scoreondemand.com/strack/MTAwNC45LjQ0LjQ0LjI5LjAuMC4wLjA/scoreondemand/64/0/Default.aspx
HTTP/1.1 302 Found
Date: Fri, 20 May 2011 19:56:12 GMT
Cneonction: close
Location: http://join.scoreondemand.com/track/MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w/Default.aspx?switched=1&strack=0
ScoreTracker: scash04
Content-Type: text/html
Set-Cookie: NSC_tdpsfdbti-obut-80=ffffffff090a1f1e45525d5f4f58455e445a4a423660;Version=1;Max-Age=3600;path=/;httponly

mpire@seaxoaff01:~$ curl -I "http://join.scoreondemand.com/track/MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w/Default.aspx?switched=1&strack=0"
HTTP/1.1 302 Found
Date: Fri, 20 May 2011 19:56:31 GMT
Set-Cookie: PHPSESSID=rbra31g59vc4me30bbofgei7d1; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
nnCoection: close
Set-Cookie: nats=MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w; expires=Mon, 30-May-2011 19:56:31 GMT; path=/; domain=scoreondemand.com
Set-Cookie: nats_cookie=No%2BReferring%2BURL; expires=Mon, 30-May-2011 19:56:31 GMT; path=/; domain=scoreondemand.com
Set-Cookie: nats_unique=MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w; expires=Sat, 21-May-2011 19:56:31 GMT; path=/; domain=scoreondemand.com
Set-Cookie: nats_sess=726064a93aca6b9d49d72dd57f477c57; expires=Sun, 28-Aug-2011 19:56:31 GMT; path=/; domain=scoreondemand.com
Location: http://www.scoreondemand.com/Default.aspx?nats=MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w&switched=1&strack=0
ScoreTracker: scash01
Content-Type: text/html
Set-Cookie: NSC_tdpsfdbti-obut-80=ffffffff090a1f1d45525d5f4f58455e445a4a423660;Version=1;Max-Age=3600;path=/;httponly

mpire@seaxoaff01:~$ curl -I "http://www.scoreondemand.com/Default.aspx?nats=MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w&switched=1&strack=0"
HTTP/1.1 302 Found
Date: Fri, 20 May 2011 19:56:47 GMT
X-AspNet-Version: 2.0.50727
Location: http://join.eboobstore.com/strack/MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w/eboobstore/64/0/apple/
Set-Cookie: ASP.NET_SessionId=xhivgbmef0bqm1uhihz1r455; path=/; HttpOnly
Set-Cookie: SVOD1=UserID=11133628&SessionID=1bM0125818hnmy5CAG9P; expires=Thu, 18-Aug-2011 19:56:47 GMT; path=/
Set-Cookie: NATS=MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w; expires=Thu, 18-Aug-2011 19:56:47 GMT; path=/
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 206

mpire@seaxoaff01:~$ curl -I "http://join.eboobstore.com/strack/MTAwNC42NC40Ny40Ny4yOS4wLjAuMC4w/eboobstore/64/0/apple/"
HTTP/1.1 302 Found
Date: Fri, 20 May 2011 19:57:13 GMT
Cneonction: close
Location: http://join.eboobstore.com/track/MTAwNC42NC41MC41MC4yOS4wLjAuMC4w/apple/?switched=1&strack=0
ScoreTracker: scash04
Content-Type: text/html; charset=UTF-8
Set-Cookie: NSC_tdpsfdbti-obut-80=ffffffff090a1f1e45525d5f4f58455e445a4a423660;Version=1;Max-Age=3600;path=/;httponly

mpire@seaxoaff01:~$ curl -I "http://join.eboobstore.com/track/MTAwNC42NC41MC41MC4yOS4wLjAuMC4w/apple/?switched=1&strack=0"
HTTP/1.1 302 Found
Date: Fri, 20 May 2011 19:57:39 GMT
Set-Cookie: PHPSESSID=utrmur64derb0n2onaotf08640; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
nnCoection: close
Set-Cookie: nats=MTAwNC42NC41MC41MC4yOS4wLjAuMC4w; expires=Mon, 30-May-2011 19:57:39 GMT; path=/; domain=eboobstore.com
Set-Cookie: nats_cookie=No%2BReferring%2BURL; expires=Mon, 30-May-2011 19:57:39 GMT; path=/; domain=eboobstore.com
Set-Cookie: nats_unique=MTAwNC42NC41MC41MC4yOS4wLjAuMC4w; expires=Sat, 21-May-2011 19:57:39 GMT; path=/; domain=eboobstore.com
Set-Cookie: nats_sess=a7568ec181d54a6316d6452656565dba; expires=Sun, 28-Aug-2011 19:57:39 GMT; path=/; domain=eboobstore.com
Location: http://www.eboobstore.com/apple/?nats=MTAwNC42NC41MC41MC4yOS4wLjAuMC4w&switched=1&strack=0
ScoreTracker: scash01
Content-Type: text/html; charset=UTF-8
Set-Cookie: NSC_tdpsfdbti-obut-80=ffffffff090a1f1d45525d5f4f58455e445a4a423660;Version=1;Max-Age=3600;path=/;httponly

mpire@seaxoaff01:~$ curl -I "http://www.eboobstore.com/apple/?nats=MTAwNC42NC41MC41MC4yOS4wLjAuMC4w&switched=1&strack=0"
HTTP/1.1 302 Found
Date: Fri, 20 May 2011 19:58:46 GMT
Location: http://www.eboobstore.com/urlmunge/munger/nats=MTAwNC42NC41MC41MC4yOS4wLjAuMC4w&switched=1&strack=0_URL_apple/
Content-Type: text/html; charset=UTF-8

mpire@seaxoaff01:~$ curl -I "http://www.eboobstore.com/urlmunge/munger/nats=MTAwNC42NC41MC41MC4yOS4wLjAuMC4w&switched=1&strack=0_URL_apple/"
HTTP/1.1 302 Found
Date: Fri, 20 May 2011 19:58:58 GMT
Set-Cookie: PHPSESSID=33hcivt9bgl7o6qv5bpfv7aun4; path=/; domain=eboobstore.com
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: http://www.eboobstore.com/apple
ScoreTracker: web04
Content-Type: text/html; charset=UTF-8

mpire@seaxoaff01:~$ curl -I "http://www.eboobstore.com/apple"
HTTP/1.1 301 Moved Permanently
Date: Fri, 20 May 2011 19:59:10 GMT
Location: http://eboobstore.com/apple/
Content-Type: text/html; charset=UTF-8

mpire@seaxoaff01:~$ curl -I "http://eboobstore.com/apple/"
HTTP/1.1 301 Moved Permanently
Date: Fri, 20 May 2011 19:59:48 GMT
Location: http://www.eboobstore.com/apple/
Content-Type: text/html; charset=UTF-8

mpire@seaxoaff01:~$ curl -I "http://www.eboobstore.com/apple/"
HTTP/1.1 200 OK
Date: Fri, 20 May 2011 20:00:30 GMT
ScoreTracker: web06
Content-Type: text/html; charset=UTF-8

Wednesday, March 23, 2011

Insanely compressed html files

Today, I discovered a URL that sent some insanely compressed content. The server sent the content with Content-Encoding: gzip and Transfer-Encoding: chunked. The compressed size was 2,921,925 bytes and it decompressed to 1,004,263,982 bytes - roughly 344 times the compressed size.

This caused certain things to go wrong in our production process. I had set a limit of a few megabytes on all fetches, on the assumption that a single fetch could not amount to much more than that. This was the first time I had seen such a huge decompression ratio, and it caused a subsequent file mapping to fail for lack of memory.
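
One way to guard against this - a minimal sketch, not the code from our pipeline, and the 10 MB cap and class name are arbitrary - is to limit the number of decompressed bytes read, rather than trusting the compressed size:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class BoundedGunzip {
    // Hypothetical cap: give up after 10 MB of decompressed data.
    static final int MAX_DECOMPRESSED = 10 * 1024 * 1024;

    public static byte[] decompress(InputStream compressed) throws IOException {
        GZIPInputStream gz = new GZIPInputStream(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int total = 0;
        int n;
        while ((n = gz.read(buf)) != -1) {
            total += n;
            if (total > MAX_DECOMPRESSED) {
                throw new IOException("decompressed content exceeds " + MAX_DECOMPRESSED + " bytes");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}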

The downloaded content suggests why this compresses so well. The URL was http://www.jeltel.com.au/news.php. There seems to be a dynamically generated part on this page. If you examine its source, you will see a marker like this:

<!-- JELTEL_CONTENT_BEGIN -->

Content after that seems dynamically generated. You will find markup like this:

<h2></h2> - <br/><h4>... <a href="">read more</a></h4>

On this particular fetch, an unusually large amount of fake content had been generated. The downloaded file had just 33 lines, but the last, very long line was a huge repeating pattern of:

<a href="">read more</a></h4><br/><br/><h2></h2> - <br/><h4>... 

This would of course compress well.

Wednesday, March 16, 2011

Linux sort : bug with , separator and confusing period?

user@host:~/$ echo -e "alan,20,3,0\ngeorge,5,0,0\nalice,3,5,0\ndora,4,0.9,5" | sort -n -k 2 -t ,
dora,4,0.9,5
alice,3,5,0
george,5,0,0
alan,20,3,0
user@host:~/$

The line with "dora" as the first term should be printed after "alice" and before "george", as we are asking sort to sort on the second column. The 3rd column value of "0.9" seems to confuse sort on this.

This is not a bug in sort; it is due to the locale setting on different operating systems.

On the linked page, look for "Sort does not sort in normal order".

Setting LC_ALL=C sorts as expected:

user@host:~/$ echo -e "alan,20,3,0\ngeorge,5,0,0\nalice,3,5,0\ndora,4,0.9,5" | LC_ALL=C sort -n -k 2 -t ,
alice,3,5,0
dora,4,0.9,5
george,5,0,0
alan,20,3,0
user@host:~/$ 

Friday, March 11, 2011

A useful, scriptable way to remove offending known_hosts keys

You can use ssh-keygen -R <hostname> to remove invalid keys from the known_hosts file. This becomes especially useful when the host names are hashed in the file, since hashed entries are hard to find and remove by hand. The default in Ubuntu/Lucid is to hash the host names.

Tuesday, February 22, 2011

Java code to tail -N a text file

This code lets you go over the last N lines of a specified file. It also has a "head" method, which lets you go over the file from the beginning.

import java.io.*;
import java.nio.channels.FileChannel;
import java.nio.CharBuffer;
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class MMapFile {

    public class MMapIterator implements Iterator<String> {
        private int offset;

        public MMapIterator(int offset) {
            this.offset = offset;
        }
        
        public boolean hasNext() {
            return offset < cb.limit();
        }

        public String next() {
            ByteArrayOutputStream sb = new ByteArrayOutputStream();
            if (offset >= cb.limit())
                throw new NoSuchElementException();
            for (; offset < cb.limit(); offset++) {
                byte c = (cb.get(offset));
                if (c == '\n') {
                    offset++;
                    break;
                }
                if (c != '\r') {
                    sb.write(c);
                }

            }
            try {
                return sb.toString("UTF-8");
            } catch (UnsupportedEncodingException e) {}
            return sb.toString();
        }

        public void remove() {

        }
    }


    private ByteBuffer cb;
    long size;
    private long numLines = -1;
    public MMapFile(String file) throws FileNotFoundException, IOException {
        FileChannel fc = new FileInputStream(new File(file)).getChannel();
        size = fc.size();
        cb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
    }

    public long getNumLines() {
        if (numLines != -1) return numLines;  //cache number of lines
        long cnt = 0;
        for (int i=0; i <size; i++) {
            if (cb.get(i) == '\n')
                cnt++;
        }
        numLines = cnt;
        return cnt;
    }

    public Iterator<String> tail(long lines) {
        long cnt=0;
        long i=0;
        for (i=size-1; i>=0; i--) {
            if (cb.get((int)i) == '\n') {
                cnt++;
                if (cnt == lines+1)
                    break;
            }
        }
        return new MMapIterator((int)i+1);
    }

    public Iterator<String> head() {
        return new MMapIterator(0);
    }

    static public void main(String[] args) {
        try {
            Iterator<String> it = new MMapFile("/test.txt").head();
            while (it.hasNext()) {
                System.out.println(it.next());
            }
        } catch (Exception e) {
            
        }

        System.out.println();

        try {
            Iterator<String> it = new MMapFile("/test.txt").tail(2);
            while (it.hasNext()) {
                System.out.println(it.next());
            }
        } catch (Exception e) {

        }

        System.out.println();

        try {
            System.out.println("lines: "+new MMapFile("/test.txt").getNumLines());
        } catch (Exception e) {

        }

    }

}

The technique is simply to map the file into memory using java.nio.channels.FileChannel.map from the Java NIO library and then manipulate the file data with in-memory techniques.

For the "tail" function, we walk back the mapped bytes counting newlines. The MMapIterator class conveniently provides a way to iterate over lines once we find the starting line.

One point in the MMapIterator.next() implementation needs care: the bytes must be converted to a String using the appropriate encoding. We use "UTF-8" here; if your input file uses a different encoding, change this accordingly.

Monday, February 21, 2011

Java : all 8-bit data cannot be cast to char type

If you have a byte and want to make a String from it, a simple cast to char only works if you are dealing with 7-bit ASCII. If the byte could be extended ASCII, the cast will not produce the correct character code.

This is the sure way to get all 8-bit characters represented in Strings:

new String(new byte[]{c}, "ISO-8859-1")

This also matters because of the default encoding the JVM uses when you build a String without specifying a charset, which is likely not "ISO-8859-1": on Linux it is likely to be "UTF-8", and it is "MacRoman" on Snow Leopard (Mac).
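
A quick demonstration of the difference (the class name is arbitrary):

public class ByteToString {
    public static void main(String[] args) throws Exception {
        byte c = (byte) 0xE9;   // 'é' in ISO-8859-1 (extended ASCII)

        // Naive cast: the byte is sign-extended, yielding U+FFE9 instead of U+00E9.
        char wrong = (char) c;
        System.out.println(Integer.toHexString(wrong));   // ffe9

        // Decoding with an explicit charset yields the intended character.
        String right = new String(new byte[]{c}, "ISO-8859-1");
        System.out.println(Integer.toHexString(right.charAt(0)));   // e9
    }
}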

Friday, February 18, 2011

Running HBase - some issues to be aware of

I want to take a moment to note down a few issues I ran into while setting up a distributed HBase environment, in case it helps someone else.

First, I set up Hadoop from the 0.20-append branch as described here. I used two machines: the first was the master, and both machines were used as slaves. This is the guide I used.

mkdir ~/hadoop-0.20-append
cd ~/hadoop-0.20-append
svn co https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ .
ant jar

At the end of this, you will have the Hadoop jar file under ~/hadoop-0.20-append/build.

The first mistake I made was to use the IP address of the name node for fs.default.name in the conf/core-site.xml file. There is a bug in the Hadoop 0.20 release that prevents the use of an IP address in this context.

Interestingly, the basic HDFS shell commands (e.g. get, ls) worked with the IP address in fs.default.name. The problem only cropped up after I set up HBase and tried to use the HBase shell.

To set up HBase, I followed the steps outlined here.

Before I discovered the IP-related issue, I hit an error that showed I was not following the steps faithfully enough. HBase ships with a Hadoop jar built from, presumably, the 0.20-append branch, but it was not identical to the version I had built from that branch. As the documentation states, I copied the Hadoop jar I built over the jar shipped with HBase.

Next, I ran into the IP issue. Generally, changing fs.default.name and restarting the Hadoop cluster is not enough in such cases, because data has already been written to the HDFS name and data directories, and any mismatch leads to further "namespace mismatch" errors. So, before changing fs.default.name, I removed the directories specified by dfs.name.dir and dfs.data.dir (the latter on both slaves). Then I changed the IP over to the machine name, formatted the name node, and restarted the Hadoop cluster.

It still was not over. This time the problem was that the machine names were not in DNS: they were simply names assigned to these machines, not resolvable by the name service the machines use to talk to one another. So I added appropriate entries to /etc/hosts on both machines, allowing each box to resolve the names to IPs.

After that, I could create a table and insert rows into it as explained here.

The next step was to programmatically create a table and add rows to it. I adapted the example from here. By default, the client API finds the HBase configuration files on the classpath, so I added the path to the hbase/conf directory to the classpath to get the program to work. Alternatively, you could use
org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(org.apache.hadoop.conf.Configuration)
which would have you use
org.apache.hadoop.conf.Configuration.addResource(org.apache.hadoop.fs.Path)
to add the individual configuration files.
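
For reference, here is a minimal sketch of the kind of client code involved, written against the 0.90-era HBase API; the class, table, and column family names are made up, and it assumes the configuration files are reachable as described above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseExample {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml from the classpath (e.g. the hbase/conf directory).
        Configuration conf = HBaseConfiguration.create();

        // Create a table with a single column family (names are hypothetical).
        HBaseAdmin admin = new HBaseAdmin(conf);
        if (!admin.tableExists("test_table")) {
            HTableDescriptor desc = new HTableDescriptor("test_table");
            desc.addFamily(new HColumnDescriptor("cf"));
            admin.createTable(desc);
        }

        // Insert a single row.
        HTable table = new HTable(conf, "test_table");
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes("value1"));
        table.put(put);
        table.close();
    }
}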

Friday, February 11, 2011

Debugging a most curious hang / spin

Here is a thread dump from a JVM on a machine where one of the four cores is 100% utilized.

2011-02-11 14:48:05
Full thread dump Java HotSpot(TM) Server VM (19.0-b09 mixed mode):

"pool-23-thread-4" prio=10 tid=0x29e81800 nid=0x50dd waiting for monitor entry [0x2b83f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at net.htmlparser.jericho.TagType.getTagAt(TagType.java:667)
    at net.htmlparser.jericho.Tag.parseAllgetNextTag(Tag.java:631)
    at net.htmlparser.jericho.Tag.parseAll(Tag.java:607)
    at net.htmlparser.jericho.Source.fullSequentialParse(Source.java:609)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:217)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"pool-23-thread-3" prio=10 tid=0x29e7f400 nid=0x50dc waiting on condition [0x2c355000]
   java.lang.Thread.State: RUNNABLE
    at java.util.Arrays.copyOfRange(Arrays.java:3209)
    at java.lang.String.<init>(String.java:215)
    at java.lang.StringBuilder.toString(StringBuilder.java:430)
    at net.htmlparser.jericho.StartTag.getStartDelimiter(StartTag.java:600)
    at net.htmlparser.jericho.StartTag.getNext(StartTag.java:660)
    at net.htmlparser.jericho.StartTag.getEndTag(StartTag.java:777)
    at net.htmlparser.jericho.StartTag.getEndTagInternal(StartTag.java:566)
    at net.htmlparser.jericho.StartTag.getElement(StartTag.java:167)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:327)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Source.getChildElements(Source.java:721)
    at net.htmlparser.jericho.Element.getParentElement(Element.java:282)
    at HTMLTagExtractorUsingJerichoParser.getRedirectURL(HTMLTagExtractorUsingJerichoParser.java:232)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:219)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"pool-23-thread-2" prio=10 tid=0x29e82400 nid=0x50db waiting for monitor entry [0x28cf6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.ArrayList.<init>(ArrayList.java:112)
    at java.util.ArrayList.<init>(ArrayList.java:119)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:309)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Source.getChildElements(Source.java:721)
    at net.htmlparser.jericho.Element.getParentElement(Element.java:282)
    at HTMLTagExtractorUsingJerichoParser.getRedirectURL(HTMLTagExtractorUsingJerichoParser.java:232)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:219)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"pool-23-thread-1" prio=10 tid=0x29e7cc00 nid=0x50da waiting for monitor entry [0x2a8e8000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.LinkedList.<init>(LinkedList.java:78)
    at net.htmlparser.jericho.Attributes.construct(Attributes.java:109)
    at net.htmlparser.jericho.Attributes.construct(Attributes.java:78)
    at net.htmlparser.jericho.StartTagType.parseAttributes(StartTagType.java:672)
    at net.htmlparser.jericho.StartTagTypeGenericImplementation.constructTagAt(StartTagTypeGenericImplementation.java:132)
    at net.htmlparser.jericho.TagType.getTagAt(TagType.java:674)
    at net.htmlparser.jericho.Tag.parseAllgetNextTag(Tag.java:631)
    at net.htmlparser.jericho.Tag.parseAll(Tag.java:607)
    at net.htmlparser.jericho.Source.fullSequentialParse(Source.java:609)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:217)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"Low Memory Detector" daemon prio=10 tid=0x2efac400 nid=0x47dd runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x2efaa000 nid=0x47dc waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x2efa8000 nid=0x47db waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x2efa6800 nid=0x47da waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x2ef96400 nid=0x47d9 in Object.wait() [0x2ec65000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x35c53470> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
    - locked <0x35c53470> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x2ef94c00 nid=0x47d8 in Object.wait() [0x2ece6000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x35c53448> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:485)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    - locked <0x35c53448> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x091c2800 nid=0x47d2 waiting for monitor entry [0xb6b22000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
    at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:169)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:233)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
    at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:57)
    at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:1002)
    - locked <0x35c98398> (a org.apache.lucene.index.DocumentsWriter)
    at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:958)
    - locked <0x35c98398> (a org.apache.lucene.index.DocumentsWriter)
    at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java:5207)
    - locked <0x35c98198> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:4370)
    - locked <0x35c98198> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:4209)
    - locked <0x35c98198> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:4200)
    at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:4078)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:4151)
    - locked <0x35c99188> (a java.lang.Object)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:4124)
    at IndexerHelper.indexLinks(IndexerHelper.java:137)
    at indexLinks(Gauntlet.java:330)
    at battle(Gauntlet.java:353)
    at main(Gauntlet.java:435)

"VM Thread" prio=10 tid=0x2ef92400 nid=0x47d7 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x091c9c00 nid=0x47d3 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x091cb400 nid=0x47d4 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x091cc800 nid=0x47d5 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x091ce000 nid=0x47d6 runnable

"VM Periodic Task Thread" prio=10 tid=0x091dd000 nid=0x47de waiting on condition

The main thread has issued a cancellation to its children and proceeded to commit the Lucene buffers the child threads have been filling. However, none of the child threads appear to be in code that modifies Lucene buffers, and the Lucene commit() does not finish. It is making progress - I can see the index being updated in the file system - but for some reason it keeps spinning.


The four child threads also seem to be spinning, yet all four cores are not being used; just one core is 100% utilized.


top - 15:00:19 up 22 days, 23:49,  2 users,  load average: 1.00, 1.00, 0.94
Tasks: 121 total,   1 running, 120 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3872764k total,  3791916k used,    80848k free,   121416k buffers
Swap:  3906552k total,     9192k used,  3897360k free,  1228492k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                      
18385 user     20   0 2388m 2.2g  10m S  100 58.8   2225:32 java -Xmx2G -Xss512k ...
26819 user     20   0  2548 1196  904 R    0  0.0   0:00.09 top                                                              

Tuesday, February 01, 2011

beware of using Lucene NIOFSDirectory from a thread pool

NIOFSDirectory does not handle Thread.interrupt() well. If a thread is interrupted this way during I/O, it is known to throw a ClosedByInterruptException.

I was using a thread pool (from the java.util.concurrent package) and noticed that terminating the thread pool could result in this ClosedByInterruptException being thrown.

The warning is provided in the new documentation.

You are still OK if you can wait for the thread pool to finish its tasks: use ExecutorService.shutdown() followed by ExecutorService.awaitTermination(), and the concurrency package will not interrupt the threads running the NIOFSDirectory code.

The problem crops up if you have to resort to ExecutorService.shutdownNow(), which attempts to stop the actively executing tasks by interrupting their threads - and that ends up interrupting the NIOFSDirectory code.
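
Here is a minimal sketch of the graceful shutdown pattern; the class name, timeout, and pool size are arbitrary.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class IndexerShutdown {
    public static void shutdownGracefully(ExecutorService pool) throws InterruptedException {
        // Stop accepting new tasks, but let queued and running tasks complete.
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
            // Last resort: this interrupts running tasks and can trigger
            // ClosedByInterruptException inside NIOFSDirectory.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // ... submit indexing tasks that write through an IndexWriter here ...
        shutdownGracefully(pool);
    }
}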

Wednesday, January 12, 2011