INSTR and SUBSTR performance on Oracle database systems

The largest string PL/SQL can handle is NVARCHAR2(32767).

If you need larger strings, you need to use the NCLOB datatype.

But although SUBSTR and INSTR work on this datatype, too, performance is abysmal, your CPU and RAM will burn and your disks will get rills on the TEMP tablespace.

This is when you turn over to the DBMS_LOB package that provides specialized support for accessing large objects, including INSTR and SUBSTR on NCLOB types.

However, INSTR is still slow.

In fact, it is so slow, it’s useful to SUBSTR the NCLOB into NVARCHAR(32767/n) chunks and INSTR those chunks.

I went from 23 seconds to 0.15 seconds splitting a 230 KiB NCLOB on a ML370.

I probably should go and double-check the result. Everything can be fast if it doesn’t have to be correct.

Update: DBMS_LOB.SUBSTR will return 32767/n characters, where n is the maximum byte-width of the target character set, e.g. 3 for UTF-8.

Advertisements
This entry was posted in Software Development. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s