jSoup Error: Index Out Of Bounds For Length

By admin

Over on my Feature Flags Book site, I’m starting to move some of the content behind a pay-wall; and, to do this, I’m using jSoup to replace multiple content paragraphs with a single purchase notice paragraph within designated chapters. However, in my first approach to this algorithm, I was getting the following jSoup error:

Index 1 out of bounds for length 0

The error isn’t terribly helpful; but, I believe what’s happening here is that when I remove an element from the jSoup DOM (Document Object Model) using an .empty() call, jSoup is not breaking the parent-child relationship to the removed elements, which then causes an issue when I go to re-append the removed elements back into the same parent.

I can reproduce this error with a simple jSoup demo using this HTML document:

<body>
    <p>jSoup + ColdFusion = Noice!</p>
</body>

To reproduce the error with ColdFusion (Lucee CFML), I’m going to .empty() the body and then re-append the single p element:

<cfscript>

    body = javaNew( "org.jsoup.Jsoup" )
        .parseBodyFragment( fileRead( "./content.htm" ) )
        .body()
    ;
    paragraph = body.firstElementChild();

    // Remove all the children from the BODY and then try to re-add the paragraph.
    body
        .empty()
        .appendChild( paragraph )
    ;

    // Output resultant HTML to the page.
    echo( body.outerHtml() );

    // ------------------------------------------------------------------------------- //
    // ------------------------------------------------------------------------------- //

    /**
    * I create a new Java class wrapper using the jSoup JAR files.
    */
    public any function javaNew( required string className ) {

        var jarPaths = [
            expandPath( "./jsoup-1.16.1.jar" )
        ];

        return( createObject( "java", className, jarPaths ) );

    }

</cfscript>

And, when we run this ColdFusion code, we get the following error:

Index 1 out of bounds for length 0

For anyone Googling to get here, this is the stacktrace that I get:

lucee.runtime.exp.NativeException: Index 1 out of bounds for length 0
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
    at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
    at java.base/java.util.Objects.checkIndex(Objects.java:372)
    at java.base/java.util.ArrayList.remove(ArrayList.java:536)
    at org.jsoup.helper.ChangeNotifyingArrayList.remove(ChangeNotifyingArrayList.java:37)
    at org.jsoup.nodes.Node.removeChild(Node.java:504)
    at org.jsoup.nodes.Node.setParentNode(Node.java:482)
    at org.jsoup.nodes.Node.reparentChild(Node.java:563)
    at org.jsoup.nodes.Element.appendChild(Element.java:577)
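Reading that trace from the bottom up, appendChild() re-parents the node, and the re-parenting asks the node's current parent to remove it by index. The following is a minimal Java sketch of that idea, not jSoup's actual source. ToyNode, its fields, and StaleParentDemo are hypothetical names. The sketch assumes that empty() clears the child list while leaving each child's parent pointer and cached index in place, which is one plausible reading of the trace. The leading "#text" child is also an assumption: it stands in for whitespace before the p, which would explain an index of 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy node for illustration only; NOT jSoup's real Node class.
class ToyNode {
    final String name;
    ToyNode parent;       // back-reference to the owning parent
    int siblingIndex;     // cached position in the parent's child list
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String name) { this.name = name; }

    // Assumed behavior: clears the list but leaves every child's parent
    // pointer and sibling index untouched (now stale).
    ToyNode empty() {
        children.clear();
        return this;
    }

    // Removes a child by its cached index; throws if that index is stale.
    void removeChild(ToyNode child) {
        children.remove(child.siblingIndex);
        reindex();
        child.parent = null;
        child.siblingIndex = 0;
    }

    ToyNode appendChild(ToyNode child) {
        // Re-parenting first asks the old parent to let go of the child.
        if (child.parent != null) child.parent.removeChild(child);
        child.parent = this;
        child.siblingIndex = children.size();
        children.add(child);
        return this;
    }

    private void reindex() {
        for (int i = 0; i < children.size(); i++) {
            children.get(i).siblingIndex = i;
        }
    }
}

class StaleParentDemo {
    // Returns the exception message from re-appending after empty(),
    // or null if no exception was thrown.
    static String reappendAfterEmpty() {
        ToyNode body = new ToyNode("body");
        body.appendChild(new ToyNode("#text")); // assumed whitespace node
        ToyNode paragraph = new ToyNode("p");
        body.appendChild(paragraph);            // paragraph gets index 1
        try {
            body.empty().appendChild(paragraph);
            return null;
        } catch (IndexOutOfBoundsException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(reappendAfterEmpty());
    }
}
```

In this toy model, the paragraph still points at body after empty(), so the append tries to remove index 1 from a list that is already empty, and the list throws an out-of-bounds exception.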

To fix this error, we need to call .remove() on the p element before we try to re-append it to the body:

<cfscript>

    body = javaNew( "org.jsoup.Jsoup" )
        .parseBodyFragment( fileRead( "./content.htm" ) )
        .body()
    ;
    paragraph = body.firstElementChild();

    // In order to re-append the paragraph back into the document, we have to first BREAK
    // THE PARENT RELATIONSHIP to the body. We can do that by calling remove() on the
    // paragraph itself.
    paragraph.remove();

    // Remove all the children from the BODY and then try to re-add the paragraph.
    body
        .empty() // Remove any remaining non-element nodes (ex, comments).
        .appendChild( paragraph )
    ;

    // Output resultant HTML to the page.
    echo( body.outerHtml() );

    // ------------------------------------------------------------------------------- //
    // ------------------------------------------------------------------------------- //

    /**
    * I create a new Java class wrapper using the jSoup JAR files.
    */
    public any function javaNew( required string className ) {

        var jarPaths = [
            expandPath( "./jsoup-1.16.1.jar" )
        ];

        return( createObject( "java", className, jarPaths ) );

    }

</cfscript>

The only difference in this version of the code is that I’m calling paragraph.remove() before adding the node back into the DOM. Whatever this is doing behind the scenes, it is properly breaking the parent-child relationship in a way that calling .empty() does not.

ASIDE: Some jSoup methods, like .children(), return a list-like collection of Element nodes called Elements. This collection has its own .remove() method that will call .remove() on all of the nodes in the collection.

I don’t know enough about jSoup — or the intention of these methods — in order to call this a "bug"; but, I will say that it seems unexpected to me. In fact, I would expect an .empty() method to be little more than a short-hand implementation for looping over all the child-nodes and calling .remove() on them in turn.
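That expectation can be sketched as a loop. This is a hypothetical Java toy model, not jSoup's API: SketchNode, emptyByRemoving(), and DetachingEmptyDemo are illustrative names, and the "#text" child is an assumed stand-in for a whitespace node. It only shows what "empty as a loop of remove() calls" would look like.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy node for illustration only; NOT jSoup's real API.
class SketchNode {
    final String name;
    SketchNode parent;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name) { this.name = name; }

    // Detach this node from its parent and clear the back-reference.
    void remove() {
        if (parent != null) {
            parent.children.remove(this);
            parent = null;
        }
    }

    // The behavior the post expected from empty(): remove() every child.
    SketchNode emptyByRemoving() {
        // Iterate over a copy, since remove() mutates the live list.
        for (SketchNode child : new ArrayList<>(children)) {
            child.remove();
        }
        return this;
    }

    SketchNode appendChild(SketchNode child) {
        child.remove(); // no-op when the child is already detached
        child.parent = this;
        children.add(child);
        return this;
    }
}

class DetachingEmptyDemo {
    static SketchNode run() {
        SketchNode body = new SketchNode("body");
        SketchNode paragraph = new SketchNode("p");
        body.appendChild(new SketchNode("#text")).appendChild(paragraph);
        // Every child is fully detached, so re-appending is safe.
        return body.emptyByRemoving().appendChild(paragraph);
    }

    public static void main(String[] args) {
        SketchNode body = run();
        System.out.println(body.children.size() + " child: " + body.children.get(0).name);
    }
}
```

Because each child's parent pointer is cleared as part of emptying, the re-append has no stale relationship to unwind, and body ends up with the single p child.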
