ISSUE 197

Number 197
Category errata
Synopsis 17.2.4.3: sscanf/"string" incompatibility
State lrmdraft
Class errata-discuss
Arrival-Date Nov 18 2002
Originator Gordon Vreugdenhil <gvreugde@synopsys.com>
Release 2001b: 17.2.4.3
Environment
Description
Normal "string" assignment and other string input ($fscanf, etc)
assign to the target's right-most portion if the target is
wider than the source. This is unfortunate in the context of
$sscanf since it does *not* skip leading nulls. Thus if one has
the following:

module top;
  reg [5*8:1] r1;
  integer i;
  integer code;
  initial begin
    r1 = "10";
    code = $sscanf(r1, "%d",i);
    $display("%d %d",code,i);
  end
endmodule

the $sscanf finds no matches (code is 0) and i remains at x.
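
To see where the leading nulls come from, the byte layout of r1 can be
printed directly. The following is a minimal sketch and not part of the
original report; the module name show_padding and the loop variable k
are hypothetical, and the code assumes a simulator that supports the
Verilog-2001 indexed part-select (-:).

```verilog
module show_padding;
  reg [5*8:1] r1;
  integer k;
  initial begin
    // A 2-character string assigned to a 5-byte reg is right-justified;
    // the unused upper bytes are filled with zeros (nulls).
    r1 = "10";
    // Print each byte, starting from the left-most (most significant) one.
    for (k = 5; k >= 1; k = k - 1)
      $display("byte %0d: %h", k, r1[k*8 -: 8]);
  end
endmodule
```

Since string assignment right-justifies and zero-fills, bytes 5 through 3
should print as 00, followed by 31 and 30 for the characters "1" and "0".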

Two alternatives:
1) extend 17.2.4.3 (a) to include "null" as a white space character
2) add a new format specifier to allow one to read a sequence
of nulls

I think that (1) is a better choice and would like to suggest that
we make that change.

Gord.
--
----------------------------------------------------------------------
Gord Vreugdenhil gvreugde@synopsys.com
Staff Engineer, VCS (Verification Tech. Group) (503) 547-6054
Synopsys Inc., Beaverton OR
Fix

This is Charles Dawson's proposal from Jan. 12, 2004.

In 17.2.4.3, following "The control string can contain"
(end of 4th paragraph following the syntax).

CHANGE

"a) White-space characters (blanks, tabs, new-lines, or
form-feeds) that, except in one case described below, cause
input to be read up to the next non-white-space character."

TO

"a) White-space characters (blanks, tabs, new-lines, or
form-feeds) that, except in one case described below, cause
input to be read up to the next non-white-space character.
For $sscanf, null characters shall also be considered
white-space."


(Note that this change also affects null characters that
are embedded within the string.)
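
The proposed white-space rule can be sketched in plain Verilog. This is
only an illustration of the character classification in the revised
item (a), under the assumption that "blanks" means the ASCII space
character; it is not how any simulator implements $sscanf. The names
ws_rule, is_ws, skipped and scanning are hypothetical.

```verilog
module ws_rule;
  reg [5*8:1] r1;
  integer k, skipped;
  reg scanning;

  // White-space test per the proposed item (a): blank, tab, new-line,
  // form-feed, plus the null character (for $sscanf only).
  function is_ws;
    input [7:0] c;
    begin
      is_ws = (c == 8'h20) || (c == 8'h09) || (c == 8'h0a) ||
              (c == 8'h0c) || (c == 8'h00);
    end
  endfunction

  initial begin
    r1 = "10";          // stored as 00 00 00 31 30
    skipped = 0;
    scanning = 1;
    // Walk the bytes left to right, counting leading white-space
    // until the first non-white-space byte is seen.
    for (k = 5; k >= 1; k = k - 1)
      if (scanning && is_ws(r1[k*8 -: 8]))
        skipped = skipped + 1;
      else
        scanning = 0;
    $display("leading white-space bytes skipped: %0d", skipped);
  end
endmodule
```

For r1 = "10" this should report 3 skipped bytes, so that under the
proposed rule a %d conversion would begin at the byte holding "1".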

Audit-Trail

From: Michael McNamara <mac@verisity.com>
To: Gordon Vreugdenhil <gvreugde@synopsys.com>
Cc: etf-bugs@boyd.com
Subject: RE: errata/197: sscanf/"string" incompatibility
Date: Mon, 18 Nov 2002 14:13:26 -0800

>Category: errata
>Confidential: no
>Originator: Michael McNamara <mac@verisity.com>
>Release: 2001b
>Class: TBD
>Description:

Gordon Vreugdenhil writes:
> Precedence: bulk
>
>
> >Number: 197
> >Category: errata
> >Originator: Gordon Vreugdenhil <gvreugde@synopsys.com>
> >Environment:
> >Description:
>
> Normal "string" assignment and other string input ($fscanf, etc)
> assign to the target's right-most portion if the target is
> wider than the source. This is unfortunate in the context of
> $sscanf since it does *not* skip leading nulls. Thus if one has
> the following:
>
> module top;
> reg [5*8:1] r1;
> integer i;
> integer code;
> initial begin
> r1 = "10";
> code = $sscanf(r1, "%d",i);
> $display("%d %d",code,i);
> end
> endmodule
>
> the $sscanf finds no matches (code is 0) and i remains at x.
>
> Two alternatives:
> 1) extend 17.2.4.3 (a) to include "null" as a white space character
> 2) add a new format specifier to allow one to read a sequence
> of nulls
>
> I think that (1) is a better choice and would like to suggest that
> we make that change.
>
> Gord.
> --
> ----------------------------------------------------------------------
> Gord Vreugdenhil gvreugde@synopsys.com
> Staff Engineer, VCS (Verification Tech. Group) (503) 547-6054
> Synopsys Inc., Beaverton OR

My intention in writing 17.2.4.3, and my reading of the actual
implementation in 17.2.4.3 in 1364-2001, was that sscanf would indeed
skip leading nulls. Basically, the string in a register starts with
the first non null character, as far as the user is concerned. (there
is a special case described in section 4.2.3.2, that arises from using
the concatenate operator {} on two strings that are in registers).

So I don't believe there is an error in 1364-2001; rather the error is
in the implementation you are examining. That said, there are many
miles between intent and reality; and so if you could point out text
in 1364-2001 that led you to believe otherwise, please let us know.

Specifically, where does 1364-2001 state that $sscanf shall not skip
leading nulls?

-mac


From: Gordon Vreugdenhil <gvreugde@synopsys.com>
To: mac@verisity.com
Cc: Gordon Vreugdenhil <Gordon.Vreugdenhil@synopsys.com>, etf-bugs@boyd.com
Subject: Re: errata/197: sscanf/"string" incompatibility
Date: Mon, 18 Nov 2002 14:27:57 -0800

>Category: errata
>Confidential: no
>Originator: Gordon Vreugdenhil <gvreugde@synopsys.com>
>Release: 2001b
>Class: TBD
>Description:
Michael McNamara wrote:
>
> My intention in writing 17.2.4.3, and my reading of the actual
> implementation in 17.2.4.3 in 1364-2001, was that sscanf would indeed
> skip leading nulls. Basically, the string in a register starts with
> the first non null character, as far as the user is concerned. (there
> is a special case described in section 4.2.3.2, that arises from using
> the concatenate operator {} on two strings that are in registers).
>
> So I don't believe there is an error in 1364-2001; rather the error is
> in the implementation you are examining. That said, there are many
> miles between intent and reality; and so if you could point out text
> in 1364-2001 that led you to believe otherwise, please let us know.
>
> Specifically, where does 1364-2001 state that $sscanf shall not skip
> leading nulls?


This may be a simple omission. In 17.2.4.3 (a) the text is as
follows:

a) White-space characters (blanks, tabs, new-lines, or form-feeds)
that, except in one case described below, cause input to be read
up to the next non-white-space character.

I would suggest changing this to:

a) White-space characters (ASCII nulls, spaces, tabs, new-lines, or
form-feeds) that, except in one case described below, cause input to
be read up to the next non-white-space character.

unless the intent of the word "blanks" is to cover both null characters
and space characters.

Or are there other characters that are meant to be included in the
term "blanks"?

Gord.

--
----------------------------------------------------------------------
Gord Vreugdenhil gvreugde@synopsys.com
Staff Engineer, VCS (Verification Tech. Group) (503) 547-6054
Synopsys Inc., Beaverton OR

From: Steven Sharp <sharp@cadence.com>
To: etf-bugs@boyd.com
Cc:
Subject: Re: errata/197: sscanf/"string" incompatibility
Date: Wed, 20 Nov 2002 15:22:55 -0500 (EST)

>Category: errata
>Confidential: no
>Originator: Steven Sharp <sharp@cadence.com>
>Release: 2001b
>Class: TBD
>Description:

> I would suggest changing this to:
>
> a) White-space characters (ASCII nulls, spaces, tabs, new-lines, or
> form-feeds)
> that, except in one case described below, cause input to be read
> up to the next non-white-space character.
>
> unless the intent of the word "blanks" is to cover both null characters
> and space characters.
>
> Or are there other characters that are meant to be included in the
> term "blanks"?

I think Mac's point is that the leading nulls are skipped when the value
gets treated as a string. $sscanf does not consider them as white-space,
because it doesn't consider them at all. I don't know if that is a
proper interpretation.

Whatever the reason for it, I can confirm that NC-Verilog skips the leading
nulls and produces an output of 1 and 10 for your test program.

Steven Sharp
sharp@cadence.com


From: Steven Sharp <sharp@cadence.com>
To: etf-bugs@boyd.com
Cc:
Subject: Re: errata/197: sscanf/"string" incompatibility
Date: Wed, 20 Nov 2002 15:25:46 -0500 (EST)

>Category: errata
>Confidential: no
>Originator: Steven Sharp <sharp@cadence.com>
>Release: 2001b
>Class: TBD
>Description:
Further testing indicates that NC-Verilog ignores embedded nulls in the
$sscanf format string completely, rather than treating them as white space.

Steven Sharp
sharp@cadence.com


From: Gordon Vreugdenhil <gvreugde@synopsys.com>
To: Steven Sharp <sharp@cadence.com>
Cc: etf-bugs@boyd.com
Subject: Re: errata/197: sscanf/"string" incompatibility
Date: Wed, 20 Nov 2002 15:26:57 -0800

>Category: errata
>Confidential: no
>Originator: Gordon Vreugdenhil <gvreugde@synopsys.com>
>Release: 2001b
>Class: TBD
>Description:
Steven Sharp wrote:
>
> Precedence: bulk
>
> The following reply was made to PR errata/197; it has been noted by GNATS.
>
> From: Steven Sharp <sharp@cadence.com>
> To: etf-bugs@boyd.com
> Cc:
> Subject: Re: errata/197: sscanf/"string" incompatibility
> Date: Wed, 20 Nov 2002 15:25:46 -0500 (EST)
>
> >Category: errata
> >Confidential: no
> >Originator: Steven Sharp <sharp@cadence.com>
> >Release: 2001b
> >Class: TBD
> >Description:
> Further testing indicates that NC-Verilog ignores embedded nulls in the
> $sscanf format string completely, rather than treating them as white space.
>
> Steven Sharp
> sharp@cadence.com
>

So does this imply that if you have a register containing
'1' NULL '0' in three bytes and read it with %d, you will get
the value 10? If so, that is clearly a different behavior
that isn't covered in the LRM and we should figure out which
behavior to adopt.
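
The three-byte case in question can be reproduced with a short test.
This is a minimal sketch; the module name embedded_null and the register
name r are hypothetical, and no particular result is claimed here since
the outcome is exactly what is under discussion.

```verilog
module embedded_null;
  reg [3*8:1] r;
  integer i, code;
  initial begin
    // Three bytes: '1', a null byte, '0'.
    r = {"1", 8'h00, "0"};
    i = 0;
    code = $sscanf(r, "%d", i);
    // The values printed depend on how the simulator treats the
    // embedded null: ignored, treated as white space, or neither.
    $display("code=%0d i=%0d", code, i);
  end
endmodule
```

A simulator that drops embedded nulls would be expected to read 10,
while one that treats them as white space would stop after the 1.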

A couple of additional questions:
1) does this also apply to $fscanf?
2) does this apply if you are reading using %c, %s, etc?
If so, is there any way for a user to determine that
there is a null byte in the input?

Gord.

--
----------------------------------------------------------------------
Gord Vreugdenhil gvreugde@synopsys.com
Staff Engineer, VCS (Verification Tech. Group) (503) 547-6054
Synopsys Inc., Beaverton OR

From: Steven Sharp <sharp@cadence.com>
To: sharp@cadence.com, gvreugde@synopsys.com, chas@cadence.com
Cc: etf-bugs@boyd.com
Subject: Re: errata/197: sscanf/"string" incompatibility
Date: Wed, 20 Nov 2002 18:46:39 -0500 (EST)

>Category: errata
>Confidential: no
>Originator: Steven Sharp <sharp@cadence.com>
>Release: 2001b
>Class: TBD
>Description:

>So does this imply that if you have a register containing
>'1'NULL'0' in three bytes and read a %d that you will get
>the value 10 ? If so, that is clearly a different behavior
>that isn't covered in the LRM and we should figure out which
>behavior to adopt.

Yes, that was the behavior that I saw when testing with NC-Verilog.
I am not suggesting that this is what the LRM says to do, or that
we should necessarily adopt it as standard behavior. Such a string
$displayed with %s appears to print out the leading and embedded NULLs
as spaces. That tends to imply that $sscanf should treat them as white
space.

>A couple of additional questions:
> 1) does this also apply to $fscanf?
> 2) does this apply if you are reading using %c, %s, etc?
> If so, is there any way for a user to determine that
> there is a null byte in the input?

All very good questions.

Steven Sharp
sharp@cadence.com

From: Charles Dawson <chas@cadence.com>
To: ETF <etf-bugs@boyd.com>
Cc:
Subject: errata/197: example
Date: Mon, 12 Jan 2004 13:10:33 -0500

This is a multi-part message in MIME format.
--------------000900020201000308050803
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Hi All,

Please find attached the example.

-Chas

--
Charles Dawson
Senior Member of Consulting Staff - Project Lead
NC-Verilog Team
Cadence Design Systems, Inc.
270 Billerica Road
Chelmsford, MA 01824
(978) 262 - 6273
chas@cadence.com


--------------000900020201000308050803
Content-Type: application/octet-stream;
name="file.input"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="file.input"

AAAxADA=
--------------000900020201000308050803
Content-Type: text/plain;
name="test.v"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="test.v"

module top;
  reg [5*8:1] r1;
  reg [5*8:1] r2;
  integer i;
  integer code;
  integer file;
  reg [8:1] tmp;

  initial
  begin
    r1[40:33] = 0;
    r1[32:25] = 0;
    r1[24:17] = "1";
    r1[16:9] = 0;
    r1[8:1] = "0";

    // display %s
    $display("display - s r1:[%s]", r1);

    // display %c
    $display("display - c r1[40:33]:[%c]", r1[40:33]);
    $display("display - c r1[32:25]:[%c]", r1[32:25]);
    $display("display - c r1[24:17]:[%c]", r1[24:17]);
    $display("display - c r1[16:9]:[%c]", r1[16:9]);
    $display("display - c r1[8:1]:[%c]", r1[8:1]);

    // display %h
    $display("display - h r1[40:33]:[%h]", r1[40:33]);
    $display("display - h r1[32:25]:[%h]", r1[32:25]);
    $display("display - h r1[24:17]:[%h]", r1[24:17]);
    $display("display - h r1[16:9]:[%h]", r1[16:9]);
    $display("display - h r1[8:1]:[%h]", r1[8:1]);

    i = 1;
    $display("display - d i:[%d]", i);

    // sscanf %s
    code = $sscanf(r1, "%s", r2);
    $display("sscanf - s: code: %d r2:[%s]", code, r2);

    // sscanf %c
    code = $sscanf(r1[40:33], "%c", tmp);
    $display("sscanf - c: code: %d tmp 40:33:[%c]", code, tmp);
    $display("sscanf - c: code: %d tmp 40:33:[%h]", code, tmp);
    code = $sscanf(r1[32:25], "%c", tmp);
    $display("sscanf - c: code: %d tmp 32:25:[%c]", code, tmp);
    $display("sscanf - c: code: %d tmp 32:25:[%h]", code, tmp);
    code = $sscanf(r1[24:17], "%c", tmp);
    $display("sscanf - c: code: %d tmp 24:17:[%c]", code, tmp);
    $display("sscanf - c: code: %d tmp 24:17:[%h]", code, tmp);
    code = $sscanf(r1[16:9], "%c", tmp);
    $display("sscanf - c: code: %d tmp 16:9:[%c]", code, tmp);
    $display("sscanf - c: code: %d tmp 16:9:[%h]", code, tmp);
    code = $sscanf(r1[8:1], "%c", tmp);
    $display("sscanf - c: code: %d tmp 8:1:[%c]", code, tmp);
    $display("sscanf - c: code: %d tmp 8:1:[%h]", code, tmp);

    // sscanf %d
    code = $sscanf(r1, "%d", i);
    $display("sscanf - d: code: %d i:[%d]", code, i);

    // sscanf %h
    code = $sscanf(r1, "%h", r2);
    $display("sscanf - h: code: %d r2:[%h]", code, r2);

    // fscanf %s
    file = $fopen("file.input", "r");
    code = $fscanf(file, "%s", r2);
    $display("fscanf - s: code: %d, r2:[%s]", code, r2);
    $fclose(file);

    // fscanf %d
    file = $fopen("file.input", "r");
    code = $fscanf(file, "%d", i);
    $display("fscanf - d: code: %d, i:[%d]", code, i);
    $fclose(file);

    // fscanf %c
    file = $fopen("file.input", "r");
    code = $fscanf(file, "%c", tmp);
    $display("fscanf - c: code: %d, tmp:[%c]", code, tmp);
    $display("fscanf - h: code: %d, tmp:[%h]", code, tmp);
    code = $fscanf(file, "%c", tmp);
    $display("fscanf - c: code: %d, tmp:[%c]", code, tmp);
    $display("fscanf - h: code: %d, tmp:[%h]", code, tmp);
    code = $fscanf(file, "%c", tmp);
    $display("fscanf - c: code: %d, tmp:[%c]", code, tmp);
    $display("fscanf - h: code: %d, tmp:[%h]", code, tmp);
    code = $fscanf(file, "%c", tmp);
    $display("fscanf - c: code: %d, tmp:[%c]", code, tmp);
    $display("fscanf - h: code: %d, tmp:[%h]", code, tmp);
    code = $fscanf(file, "%c", tmp);
    $display("fscanf - c: code: %d, tmp:[%c]", code, tmp);
    $display("fscanf - h: code: %d, tmp:[%h]", code, tmp);
    $fclose(file);

  end
endmodule

--------------000900020201000308050803
Content-Type: text/plain;
name="results.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="results.log"

display - s r1:[ 1 0]
display - c r1[40:33]:[]
display - c r1[32:25]:[]
display - c r1[24:17]:[1]
display - c r1[16:9]:[]
display - c r1[8:1]:[0]
display - h r1[40:33]:[00]
display - h r1[32:25]:[00]
display - h r1[24:17]:[31]
display - h r1[16:9]:[00]
display - h r1[8:1]:[30]
display - d i:[ 1]
sscanf - s: code: 1 r2:[ 1]
sscanf - c: code: 1 tmp 40:33:[]
sscanf - c: code: 1 tmp 40:33:[00]
sscanf - c: code: 1 tmp 32:25:[]
sscanf - c: code: 1 tmp 32:25:[00]
sscanf - c: code: 1 tmp 24:17:[1]
sscanf - c: code: 1 tmp 24:17:[31]
sscanf - c: code: 1 tmp 16:9:[]
sscanf - c: code: 1 tmp 16:9:[00]
sscanf - c: code: 1 tmp 8:1:[0]
sscanf - c: code: 1 tmp 8:1:[30]
sscanf - d: code: 1 i:[ 1]
sscanf - h: code: 1 r2:[0000000001]
fscanf - s: code: 1, r2:[ 1]
fscanf - d: code: 1, i:[ 1]
fscanf - c: code: 1, tmp:[]
fscanf - h: code: 1, tmp:[00]
fscanf - c: code: 1, tmp:[]
fscanf - h: code: 1, tmp:[00]
fscanf - c: code: 1, tmp:[1]
fscanf - h: code: 1, tmp:[31]
fscanf - c: code: 1, tmp:[]
fscanf - h: code: 1, tmp:[00]
fscanf - c: code: 1, tmp:[0]
fscanf - h: code: 1, tmp:[30]

--------------000900020201000308050803--

From: Charles Dawson <chas@cadence.com>
To: <etf-bugs@boyd.com>
Cc:
Subject: errata/197: whitespace issue
Date: Mon, 12 Jan 2004 08:50:00 -0800

We think the solution for this issue is to include nulls in the
white-space characters specification in 17.2.4.3 (a) (as previously suggested
by Gord).

Specifically, the text should read:

a) White-space characters (blanks, tabs, new-lines, or form-feeds) that,
except in one case described below, cause input to be read up to the
next non-white-space character. For $sscanf, null characters shall
also be considered white-space.

Note that this change also affects null characters that are embedded within
the string.

I should have an example to illustrate this shortly.


Fix replaced by Shalom.Bresticker@motorola.com on Thu Feb 12 03:59:44 2004




Unformatted

