I wrote earlier about a performance improvement I made to Hadoop. After I discussed it with the Hadoop devs, notably Chris Douglas, the change was made in the core org.apache.hadoop.io.Text class. This has the added benefit of improving a core text-handling class that is commonly used in Hadoop, and it avoids the extra memory footprint of an additional OutputStream instance.
This improvement will be available in Hadoop 0.21.0.
Note the difference in YourKit profiling data with the new Text class: