#SQL Parsing data out of Note Text Fields

David Meego - Click for blog homepageToday, I was asked by a consultant how to parse specific data out of the record note associated with a transaction or master record in Microsoft Dynamics GP using a SQL Server query.

GP uses the SY_Record_Notes_MSTR (SY03900) Record Notes Master table to store a text field of up to 32767 characters linked to master or transaction record using the Note Index (NOTEINDX) field. The next available Note Index is stored in the SY_Company_MSTR (SY01500) Company Master in the system (DYNAMICS) database and is assigned to new records when they are created.

The request was to parse the text field to get the contents of lines stored with a prefixed label. The example I used for testing was a record note attached to everyone’s favourite sample company customer, Aaron Fitz Electrical:


Note: There is no blank line after the second line.

The consultant had sent me what they had completed so far and it was heading in the right direction but needed some fine tuning.

Below are the steps I went through to create the final working query:

Start by getting the raw data displayed using a select statement with a simple join:

select N.TXTFIELD
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

Then we can use the SUBSTR() function to extract characters from the text field and use then CHARINDEX() function to find the location of the data we are interested in, with a constant value (for now) for the length:


select N.TXTFIELD,
SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD),
  20)
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

We then need to skip past the label text and remove the leading space character after the colon, which we do by adding the length of the label with the LEN() function and removing leading and trailing spaces with the TRIM() function:


select N.TXTFIELD,
TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'),
  20))
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

Our problem now is knowing when to stop the SUBSTRING() function. The consultant’s script used the next label as point to stop, however they found that these was not reliable as the labels in the data were not always in the same sequence.

So, the method I used is to find the carriage return at the end of the line. Dexterity uses the <CR> ASCII value 13 as line break character, so I could look for CHAR(13) to find the end of the line.

However, it is possible (as it my test data) that there is no final carriage return at the end of the text. To avoid having to use complex code to handle this, the easiest method is to manually add a carriage return to the end of the text field when searching for carriage returns. Of course, nothing is as simple as just adding + CHAR(13) to the end because the text datatype and char datatypes cannot be concatenated with a simple + symbol. This issue was solved by using the CONCAT() function:


select N.TXTFIELD,
TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)) ) ))
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

The code above is getting closer, but the third parameter to the SUBSTRING() is not the position where to stop the substring, but is the actual length needed. To convert the position to a length we need to remove the starting position (which is the start of the prompt plus the length of the prompt):


select N.TXTFIELD,
TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)) ) -
  (CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:')) ))
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

This appears to work well so far, but actually has a fatal flaw. If used this code to try and retrieve the data at the second label or if there were any lines above the first label, the query would fail with the following error:

Invalid length parameter passed to the LEFT or SUBSTRING function.

This is because the length value passed to the SUBSTRING() function is actually a negative value because the position of the carriage return character found by CHARINDEX() is before the position of the start of the characters we want (found by the start of the label text plus length of the label text).

The missing piece of the puzzle is that we must ask CHARINDEX() to only look for the carriage return character starting after the position of the label text. Using the third optional parameter we can define where to start the search:


select N.TXTFIELD,
TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO Name:', N.TXTFIELD)) -
  (CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:')) ))
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

The final changes are to name the column returned and add the column for the second label:


select N.TXTFIELD,
TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'), 
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO Name:', N.TXTFIELD)) - 
  (CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:')) )) as [IO Name],
TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:'), 
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO#:', N.TXTFIELD)) - 
  (CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:')) )) as [IO No]
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

[EDIT] Just after publishing this article, I thought “What would happen if the label prompt was not found in the note text?”. A quick test found that the code failed to work correctly in that situation, so here is one more update with a check to ensure that the label prompt is found:


select N.TXTFIELD,
CASE WHEN CHARINDEX('IO Name:', N.TXTFIELD) &amp;amp;gt; 0 THEN
  TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO Name:', N.TXTFIELD)) -
  (CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:')) )) ELSE '' END as [IO Name],
CASE WHEN CHARINDEX('IO#:', N.TXTFIELD) &amp;amp;gt; 0 THEN
  TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO#:', N.TXTFIELD)) -
  (CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:')) )) ELSE '' END as [IO No]
from SY03900 N
left join RM00101 C on C.NOTEINDX = N.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

[EDIT 2] Then I thought, the chances are that most people would use this query with the Note table as the joined table and it is possible that there is no record in the Note table. So this another version with the Note table as the joined table and the ISNULL() function used to avoid the NULL value when the Record Note is missing:


select N.TXTFIELD,
ISNULL(CASE WHEN CHARINDEX('IO Name:', N.TXTFIELD) &amp;amp;gt; 0 THEN
  TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO Name:', N.TXTFIELD)) -
  (CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:')) )) ELSE '' END, '') as [IO Name],
ISNULL(CASE WHEN CHARINDEX('IO#:', N.TXTFIELD) &amp;amp;gt; 0 THEN
  TRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO#:', N.TXTFIELD)) -
  (CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:')) )) ELSE '' END, '') as [IO No]
from RM00101 C
left join SY03900 N on N.NOTEINDX = C.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

[EDIT 3] The consultant I wrote this example code for has just reminded me that the TRIM() function is not available on the older SQL versions. So below is the updated code using LTRIM() and RTRIM(). Another solution would be to create a User Defined Function (UDF) to add TRIM() as per the article below:


select N.TXTFIELD,
ISNULL(CASE WHEN CHARINDEX('IO Name:', N.TXTFIELD) &amp;amp;gt; 0 THEN
  LTRIM(RTRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO Name:', N.TXTFIELD)) -
  (CHARINDEX('IO Name:', N.TXTFIELD)+LEN('IO Name:')) ))) ELSE '' END, '') as [IO Name],
ISNULL(CASE WHEN CHARINDEX('IO#:', N.TXTFIELD) &amp;amp;gt; 0 THEN
  LTRIM(RTRIM(SUBSTRING(N.TXTFIELD, CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:'),
  CHARINDEX(CHAR(13), CONCAT(N.TXTFIELD, CHAR(13)), CHARINDEX('IO#:', N.TXTFIELD)) -
  (CHARINDEX('IO#:', N.TXTFIELD)+LEN('IO#:')) ))) ELSE '' END, '') as [IO No]
from RM00101 C
left join SY03900 N on N.NOTEINDX = C.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

[EDIT 4] The previous edit then made me think. Why not use a User Defined Function (UDF) to wrap up the parsing code to make it easier to use? So, this (hopefully) final version creates a PARSETEXT() function which can be used in the query:


IF EXISTS(SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[PARSETEXT]') AND type in (N'IF',N'FN',N'TF',N'FS',N'FT'))
  DROP FUNCTION dbo.PARSETEXT
GO
CREATE FUNCTION dbo.PARSETEXT(@string VARCHAR(MAX), @prompt VARCHAR(MAX))
RETURNS VARCHAR(MAX)
BEGIN
RETURN CASE WHEN CHARINDEX(@prompt, @string) > 0 THEN
  LTRIM(RTRIM(SUBSTRING(@string, CHARINDEX(@prompt, @string)+LEN(@prompt),
  CHARINDEX(CHAR(13), CONCAT(@string, CHAR(13)), CHARINDEX(@prompt, @string)) -
  (CHARINDEX(@prompt, @string)+LEN(@prompt)) ))) ELSE '' END
END
GO

select N.TXTFIELD,
ISNULL(dbo.PARSETEXT(N.TXTFIELD, 'IO Name:'), '') as [IO Name], 
ISNULL(dbo.PARSETEXT(N.TXTFIELD, 'IO#:'), '') as [IO No] 
from RM00101 C
left join SY03900 N on N.NOTEINDX = C.NOTEINDX
where C.CUSTNMBR = 'AARONFIT0001'

Note: Once the PARSETEXT() User Defined Function has been created, you only need to execute the select statement portion of the code above.

Writing this article hopefully will provide a reference anyone looking for these techniques in the future.

I hope you found the process used to build the query helpful.

David

26-May-2023: Edits for handling of edge cases.
01-Jun-2023: Edits for older SQL version that does not have TRIM() function, and to create a User Defined Function (UDF).

This article was originally posted on http://www.winthropdc.com/blog.

Please post feedback or comments

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.