Friday, February 11, 2011

Debugging a most curious hang / spin

Here is the stack trace from a JVM that shows 1/4 core of the machine being 100% utilized.

2011-02-11 14:48:05
Full thread dump Java HotSpot(TM) Server VM (19.0-b09 mixed mode):

"pool-23-thread-4" prio=10 tid=0x29e81800 nid=0x50dd waiting for monitor entry [0x2b83f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at net.htmlparser.jericho.TagType.getTagAt(TagType.java:667)
    at net.htmlparser.jericho.Tag.parseAllgetNextTag(Tag.java:631)
    at net.htmlparser.jericho.Tag.parseAll(Tag.java:607)
    at net.htmlparser.jericho.Source.fullSequentialParse(Source.java:609)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:217)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"pool-23-thread-3" prio=10 tid=0x29e7f400 nid=0x50dc waiting on condition [0x2c355000]
   java.lang.Thread.State: RUNNABLE
    at java.util.Arrays.copyOfRange(Arrays.java:3209)
    at java.lang.String.(String.java:215)
    at java.lang.StringBuilder.toString(StringBuilder.java:430)
    at net.htmlparser.jericho.StartTag.getStartDelimiter(StartTag.java:600)
    at net.htmlparser.jericho.StartTag.getNext(StartTag.java:660)
    at net.htmlparser.jericho.StartTag.getEndTag(StartTag.java:777)
    at net.htmlparser.jericho.StartTag.getEndTagInternal(StartTag.java:566)
    at net.htmlparser.jericho.StartTag.getElement(StartTag.java:167)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:327)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Source.getChildElements(Source.java:721)
    at net.htmlparser.jericho.Element.getParentElement(Element.java:282)
    at HTMLTagExtractorUsingJerichoParser.getRedirectURL(HTMLTagExtractorUsingJerichoParser.java:232)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:219)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"pool-23-thread-2" prio=10 tid=0x29e82400 nid=0x50db waiting for monitor entry [0x28cf6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.ArrayList.(ArrayList.java:112)
    at java.util.ArrayList.(ArrayList.java:119)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:309)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Element.getChildElements(Element.java:335)
    at net.htmlparser.jericho.Source.getChildElements(Source.java:721)
    at net.htmlparser.jericho.Element.getParentElement(Element.java:282)
    at HTMLTagExtractorUsingJerichoParser.getRedirectURL(HTMLTagExtractorUsingJerichoParser.java:232)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:219)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"pool-23-thread-1" prio=10 tid=0x29e7cc00 nid=0x50da waiting for monitor entry [0x2a8e8000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.LinkedList.(LinkedList.java:78)
    at net.htmlparser.jericho.Attributes.construct(Attributes.java:109)
    at net.htmlparser.jericho.Attributes.construct(Attributes.java:78)
    at net.htmlparser.jericho.StartTagType.parseAttributes(StartTagType.java:672)
    at net.htmlparser.jericho.StartTagTypeGenericImplementation.constructTagAt(StartTagTypeGenericImplementation.java:132)
    at net.htmlparser.jericho.TagType.getTagAt(TagType.java:674)
    at net.htmlparser.jericho.Tag.parseAllgetNextTag(Tag.java:631)
    at net.htmlparser.jericho.Tag.parseAll(Tag.java:607)
    at net.htmlparser.jericho.Source.fullSequentialParse(Source.java:609)
    at HTMLTagExtractorUsingJerichoParser.handleRedirects(HTMLTagExtractorUsingJerichoParser.java:217)
    at HTMLTagExtractorUsingJerichoParser.parse(HTMLTagExtractorUsingJerichoParser.java:51)
    at LinksProcessor.add(LinksProcessor.java:222)
    at LinksProcessor.run(LinksProcessor.java:77)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

"Low Memory Detector" daemon prio=10 tid=0x2efac400 nid=0x47dd runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x2efaa000 nid=0x47dc waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x2efa8000 nid=0x47db waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x2efa6800 nid=0x47da waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x2ef96400 nid=0x47d9 in Object.wait() [0x2ec65000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x35c53470> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
    - locked <0x35c53470> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x2ef94c00 nid=0x47d8 in Object.wait() [0x2ece6000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x35c53448> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:485)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    - locked <0x35c53448> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x091c2800 nid=0x47d2 waiting for monitor entry [0xb6b22000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
    at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:169)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:233)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
    at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:57)
    at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:1002)
    - locked <0x35c98398> (a org.apache.lucene.index.DocumentsWriter)
    at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:958)
    - locked <0x35c98398> (a org.apache.lucene.index.DocumentsWriter)
    at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java:5207)
    - locked <0x35c98198> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:4370)
    - locked <0x35c98198> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:4209)
    - locked <0x35c98198> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:4200)
    at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:4078)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:4151)
    - locked <0x35c99188> (a java.lang.Object)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:4124)
    at IndexerHelper.indexLinks(IndexerHelper.java:137)
    at indexLinks(Gauntlet.java:330)
    at battle(Gauntlet.java:353)
    at main(Gauntlet.java:435)

"VM Thread" prio=10 tid=0x2ef92400 nid=0x47d7 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x091c9c00 nid=0x47d3 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x091cb400 nid=0x47d4 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x091cc800 nid=0x47d5 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x091ce000 nid=0x47d6 runnable

"VM Periodic Task Thread" prio=10 tid=0x091dd000 nid=0x47de waiting on condition

The main thread has issued a cancellation to its children and proceeded to commit the Lucene buffers the child threads have been filling. However, none of the child threads seem to be in the part of the code where Lucene buffers are being modified. And the Lucene commit() does not finish. It is moving along as I can see the index being updated in the file system, but for some reason it is spinning.


The other 4 child threads seem to be spinning. Somehow, all 4 cores are not being used. Just one core is 100% utilized.


top - 15:00:19 up 22 days, 23:49,  2 users,  load average: 1.00, 1.00, 0.94
Tasks: 121 total,   1 running, 120 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3872764k total,  3791916k used,    80848k free,   121416k buffers
Swap:  3906552k total,     9192k used,  3897360k free,  1228492k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                      
18385 user     20   0 2388m 2.2g  10m S  100 58.8   2225:32 java -Xmx2G -Xss512k ...
26819 user     20   0  2548 1196  904 R    0  0.0   0:00.09 top                                                              

No comments: