Author Requesting critique of a C unit test environment
Ark Khasin

2007-08-26, 10:17 pm

[First, I apologize for cross-posting. I just think a wider audience can
critique from different vantage points.]

Unit testing is an integral component of both "formal" and "agile"
models of development. Alas, it involves a significant amount of tedious
labor.

There are test automation tools out there but from what limited exposure
I've had, they are pricey, reasonably buggy, and require compiler/target
adaptation.

Out of my frustration with two out of two of them came my own. Its
instrumentation approach is based solely on profound abuse of the C
preprocessor (and in this respect it is equally applicable to C++).
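[Illustrative sketch, not taken from the linked paper or zip: one generic way a preprocessor macro can leave an execution trace behind. The names TRACE_POINT, NO_TRACE, probe_hits, clamp and probes_missed are hypothetical and chosen only for this example.]

```c
#include <assert.h>

/* Hypothetical probe table: one hit counter per instrumented point. */
#define PROBE_COUNT 3
unsigned probe_hits[PROBE_COUNT];

/* Test builds count each probe hit; defining NO_TRACE makes the probe
   expand to nothing, so production code carries no overhead. */
#ifndef NO_TRACE
#define TRACE_POINT(id) ((void)probe_hits[(id)]++)
#else
#define TRACE_POINT(id) ((void)0)
#endif

/* Function under test, with one probe per path. */
int clamp(int x, int lo, int hi)
{
    if (x < lo) { TRACE_POINT(0); return lo; }
    if (x > hi) { TRACE_POINT(1); return hi; }
    TRACE_POINT(2);
    return x;
}

/* Number of probes never reached: zero means every path was executed. */
int probes_missed(void)
{
    int i, missed = 0;
    for (i = 0; i < PROBE_COUNT; i++)
        if (probe_hits[i] == 0)
            missed++;
    return missed;
}

/* Drives clamp() and checks both the results and the recorded trace. */
void clamp_coverage_demo(void)
{
    assert(clamp(5, 0, 10) == 5);
    assert(probes_missed() == 2);   /* only the pass-through path so far */
    assert(clamp(-1, 0, 10) == 0);
    assert(clamp(11, 0, 10) == 10);
    assert(probes_missed() == 0);   /* all three paths now covered */
}
```

Calling clamp_coverage_demo() from a test main() exercises the three paths; the probe table is then the evidence that each one ran. Only the compiler is needed, which is the property the post is after.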

I would like to ask you to evaluate the approach:
- whether it has gaping holes in ideology or implementation
- whether in your opinion it has merits

A preliminary draft description is at
http://www.macroexpressions.com/dl/... /> estring.pdf

A reference implementation (with a C99 accent) with a runnable example is at
http://www.macroexpressions.com/dl/maestra.zip

Please reply to a newsgroup or via email as you find convenient.
Thank you for your anticipated feedback,

-- Ark

Ian Collins

2007-08-26, 10:17 pm

Ark Khasin wrote:
> [First, I apologize for cross-posting. I just think a wider audience can
> critique from different vantage points.]
>
> Unit testing is an integral component of both "formal" and "agile"
> models of development. Alas, it involves a significant amount of tedious
> labor.
>
> There are test automation tools out there but from what limited exposure
> I've had, they are pricey, reasonably buggy, and require compiler/target
> adaptation.
>
> Out of my frustration with two out of two of them came my own. Its
> instrumentation approach is based solely on profound abuse of the C
> preprocessor (and in this respect it is equally applicable to C++).
>
> I would like to ask you to evaluate the approach:
> - whether it has gaping holes in ideology or implementation
> - whether in your opinion it has merits
>

Why not just use one of the free frameworks such as CppUnit?

It works well with both C (with a little fiddling like you do in your
paper for "static") and C++. I'm sure the same applies for other
frameworks.

--
Ian Collins.
Ark Khasin

2007-08-26, 10:17 pm

Ian Collins wrote:
<snip>
> Why not just use one of the free frameworks such as CppUnit?
>
> It works well with both C (with a little fiddling like you do in your
> paper for "static") and C++. I'm sure the same applies for other
> frameworks.
>

Ian,
Thank you for your response.

Please correct me if I am wrong, but AFAIK CppUnit doesn't provide a
code execution trace, so it's pretty darn hard to prove code coverage.
[There must be reasons why testing tools vendors command big money.]

Also, if I use C in non-C++ compatible way (e.g. tentative definitions),
my source won't even compile for CppUnit.
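[Illustrative sketch of what "tentative definitions" means here; the variable names are hypothetical. The file below is valid C, while a C++ compiler rejects the repeated declarations as redefinitions.]

```c
#include <assert.h>

/* Two tentative definitions of the same object: file scope, no
   initializer, no 'extern'. C accepts the repetition. */
int sample_count;
int sample_count;

/* One real definition with an initializer may join them. */
int sample_count = 3;

/* With no initializer anywhere in the file, the tentative definitions
   act as a single definition initialized to zero. */
long error_total;
long error_total;

void tentative_demo(void)
{
    assert(sample_count == 3);
    assert(error_total == 0);
}
```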

And finally there is a port issue (it's an embedded type talking :)). I
am proposing something that requires only the compiler.

Regards,
Ark
Ian Collins

2007-08-27, 4:44 am

Ark Khasin wrote:
> Ian Collins wrote:
> <snip>
> Ian,
> Thank you for your response.
>
> Please correct me if I am wrong, but AFAIK CppUnit doesn't provide a
> code execution trace, so it's pretty darn hard to prove code coverage.
> [There must be reasons why testing tools vendors command big money.]
>

If you develop your software test first, you get all the code coverage
you need.

> Also, if I use C in non-C++ compatible way (e.g. tentative definitions),
> my source won't even compile for CppUnit.
>

If you mean K&R style prototypes, don't use them. Write and compile
your tests in C++ and your code in C. Don't attempt to compile your C
with a C++ compiler.
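[Illustrative sketch of the usual mechanism behind "tests in C++, code in C": the header wraps its declarations in a linkage block that only a C++ compiler sees. The function checksum8 and the file names in the comments are hypothetical; both parts are shown in one file for brevity.]

```c
#include <assert.h>
#include <stddef.h>

/* ---- would live in a header, e.g. checksum.h ---- */
#ifdef __cplusplus
extern "C" {            /* seen only by a C++ compiler: use C linkage */
#endif

unsigned char checksum8(const unsigned char *data, size_t len);

#ifdef __cplusplus
}
#endif

/* ---- would live in checksum.c, always compiled as C ---- */
unsigned char checksum8(const unsigned char *data, size_t len)
{
    unsigned char sum = 0;
    size_t i;
    for (i = 0; i < len; i++)
        sum = (unsigned char)(sum + data[i]);   /* wraps modulo 256 */
    return sum;
}

/* A check of the kind a test file would make after including the header. */
void checksum8_selftest(void)
{
    const unsigned char bytes[] = { 1, 2, 3, 250 };
    assert(checksum8(bytes, 3) == 6);
    assert(checksum8(bytes, 4) == 0);   /* 256 wraps to 0 */
}
```

The C source is compiled by the C compiler as usual; a C++ test file includes the header and links against the resulting object file.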

> And finally there is a port issue (it's an embedded type talking :)). I
> am proposing something that requires only the compiler.
>

Shouldn't matter for unit testing, develop and test on a hosted system.
If you require bits of the target environment, mock (simulate) them.
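[Illustrative sketch of mocking a piece of the target environment on a hosted system. Everything here is hypothetical: the 10-bit ADC, the 3300 mV reference, and the names adc_read_fn, adc_to_millivolts and fake_adc.]

```c
#include <assert.h>

/* Hardware access goes through a function pointer so a test can
   substitute a fake; on the target it would point at the real driver. */
typedef unsigned (*adc_read_fn)(void);

/* Code under test: scale a raw 10-bit reading to millivolts,
   assuming a 3300 mV full-scale reference. */
unsigned long adc_to_millivolts(adc_read_fn read_adc)
{
    unsigned long raw = read_adc() & 0x3FFu;
    return (raw * 3300UL) / 1023UL;
}

/* The mock: returns whatever the test planted and counts its calls. */
unsigned fake_raw;
unsigned fake_calls;

unsigned fake_adc(void)
{
    fake_calls++;
    return fake_raw;
}

/* Drives the code under test through the mock instead of the hardware. */
void mock_demo(void)
{
    fake_calls = 0;
    fake_raw = 0;     assert(adc_to_millivolts(fake_adc) == 0UL);
    fake_raw = 1023;  assert(adc_to_millivolts(fake_adc) == 3300UL);
    fake_raw = 512;   assert(adc_to_millivolts(fake_adc) == 1651UL);
    assert(fake_calls == 3);
}
```

A function pointer is only one way to do the substitution; linking the test build against a fake driver object file is another and leaves the production signature untouched.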

--
Ian Collins.
Ark Khasin

2007-08-27, 4:44 am

Ian Collins wrote:
<snip>
> If you develop your software test first, you get all the code coverage
> you need.
>

Test first is a nice model but not of a universal applicability.
Besides, I need to demonstrate test coverage to the certifying/auditing
entity.
OTOH, I wonder if the proposed instrumentation can be made a part of
CppUnit. I think, there is nothing in either that would prohibit it.
<snip>
> Write and compile
> your tests in C++ and your code in C. Don't attempt to compile your C
> with a C++ compiler.

Right. It just didn't occur to me :(
>
> Shouldn't matter for unit testing, develop and test on a hosted system.
> If you require bits of the target environment, mock (simulate) them.
>

The farthest I can go away from the target is a software simulator of
the instruction set. Same compiler, same version, perhaps, more
"memory". I think I am not alone in this...

-- Ark
Ian Collins

2007-08-27, 4:44 am

Ark Khasin wrote:
> Ian Collins wrote:
> <snip>
> Test first is a nice model but not of a universal applicability.
> Besides, I need to demonstrate test coverage to the certifying/auditing
> entity.


You are not alone in that, I'd suggest you take this to a TDD list for
advice.

> The farthest I can go away from the target is a software simulator of
> the instruction set. Same compiler, same version, perhaps, more
> "memory". I think I am not alone in this...
>

Why?

--
Ian Collins.
Colin Paul Gloster

2007-08-27, 11:52 am

On 2007-08-27, Ark Khasin <akhasin@macroexpressions.com> wrote:

|--------------------------------------------------------------------------------------|
|"[..] |
| |
|Unit testing is an integral component of [..] "formal" [..] |
|models of development. [..] |
| |
|[..]" |
|--------------------------------------------------------------------------------------|

Testing is not an integral component of formal methods intended to
reduce testing.

Regards,
Colin Paul Gloster
Phlip

2007-08-27, 11:52 am

Ark Khasin wrote:

> Besides, I need to demonstrate test coverage to the certifying/auditing
> entity.


It sounds like learning what their requirements are and meeting them is more
important than guessing if TDD will incidentally meet their requirements.

> Test first is a nice model but not of a universal applicability.


You can keep the TDD thing a secret...

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
^ assert_xpath
http://tinyurl.com/23tlu5 <-- assert_raise_message
Phlip

2007-08-27, 11:52 am

Colin Paul Gloster wrote:

> Testing is not an integral component of formal methods intended to
> reduce testing.
>
> Colin Paul Gloster


Why would a formal method intend to reduce a Good Thing??

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
^ assert_xpath
http://tinyurl.com/23tlu5 <-- assert_raise_message
Erik Wikström

2007-08-27, 11:52 am

On 2007-08-27 09:08, Phlip wrote:
> Colin Paul Gloster wrote:
>
>
> Why would a formal method intend to reduce a Good Thing??


Testing is used to find errors, while formal methods are used to prove
that there are no errors, at least that's the goal. So if you can prove
that there are no errors why test for them?

--
Erik Wikström
Richard Heathfield

2007-08-27, 11:52 am

Erik Wikström said:

<snip>

> Testing is used to find errors, while formal methods are used to prove
> that there are no errors, at least that's the goal. So if you can
> prove that there are no errors why test for them?


"Beware of bugs in the above code; I have only proved it correct, not
tried it." - Donald E Knuth.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Phlip

2007-08-27, 11:52 am

Erik Wikström wrote:

> Testing is used to find errors, while formal methods are used to prove
> that there are no errors, at least that's the goal. So if you can prove
> that there are no errors why test for them?


I use testing as a formal method, to prevent errors. I have not tried
the "proof" systems (and please don't try to tell the mathematicians I used
to hang out with that they are really "proofs").

You are describing writing and running tests in isolation from the
development process. Don't do that.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
^ assert_xpath
http://tinyurl.com/23tlu5 <-- assert_raise_message
Richard

2007-08-27, 11:52 am

Richard Heathfield <rjh@see.sig.invalid> writes:

> Erik Wikström said:
>
> <snip>
>
>
> "Beware of bugs in the above code; I have only proved it correct, not
> tried it." - Donald E Knuth.


Proof of Correctness depends very much on how and when "correct" is
used. It is a crock of shit in most day to day SW development.
S Perryman

2007-08-27, 9:09 pm

Phlip wrote:

> Erik Wikström wrote:


> I use testing as a formal method, to prevent errors.


1. As Dijkstra said : a test can only prove the presence of an error, not
their absence.

2. As Myers said : a test is an input with the intent of finding a defect.


Feel free to show how whatever "testing" method you use can ever equate to
a formal method (or actually for that matter address 1 or 2) ...


> You are describing writing and running tests in isolation from the
> development process. Don't do that.


1. There is *nothing* in the text quoted ">>" that states anything about
"writing and running tests" . Therefore you have mis-represented the
posters' words.

2. The act of "writing and running tests in isolation from the development
process" is something that has been done before. Now what was that approach
called ... ah yes ... the test method for the Cleanroom process.

How does whatever "testing" method you use (which we can only assume does
*not* involve "writing and running tests in isolation from the development
process" ) compare to the Cleanroom defect detection/prevention rate ...??


Steven Perryman
Flash Gordon

2007-08-27, 9:09 pm

Ian Collins wrote, On 27/08/07 08:14:
> Ark Khasin wrote:

<snip>
> Why?


There are many possible *valid* reasons for this. One is that if you are
not using the same version of the same compiler with the same switches
then the code you are testing is not the same as the code that will be
run. Since compilers *do* have bugs it is possible that the bug will be
triggered in the real environment but not in the test environment unless
you ensure that they are the same.

If I was doing QA for a product I would insist that you either use the
same version of the same compiler or you provide testing to the same
level that the deliverable SW requires of *both* the compiler used for
test *and* the compiler used for the final SW. The more critical the SW,
the more insistent I would be on this, and the more testing you would
have to do on the compilers, for safety critical SW this would probably
kill the project dead if you did not use identical SW to build for test
and build for delivery. BTW, I *have* rejected SW and documentation at
review, and even told developers that there was no point in putting it
in for review because I would fail it.
--
Flash Gordon
Ben Bacarisse

2007-08-27, 9:09 pm

Richard Heathfield <rjh@see.sig.invalid> writes:

> Erik Wikström said:
>
> <snip>
>
>
> "Beware of bugs in the above code; I have only proved it correct, not
> tried it." - Donald E Knuth.


But this was a "by hand" proof in 1977. A machine assisted proof of
the actual code could be expected to inspire a little more confidence.

--
Ben.
Richard Heathfield

2007-08-27, 9:09 pm

Ben Bacarisse said:

> Richard Heathfield <rjh@see.sig.invalid> writes:
>
>
> But this was a "by hand" proof in 1977. A machine assisted proof of
> the actual code could be expected to inspire a little more confidence.


Why? Presumably the machine that is doing the assisting is itself a
computer program. What makes you think the assistance program is
correct?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Ian Collins

2007-08-27, 9:09 pm

Flash Gordon wrote:
> Ian Collins wrote, On 27/08/07 08:14:
>
> <snip>
>
>
> There are many possible *valid* reasons for this. One is that if you are
> not using the same version of the same compiler with the same switches
> then the code you are testing is not the same as the code that will be
> run. Since compilers *do* have bugs it is possible that the bug will be
> triggered in the real environment but not in the test environment unless
> you ensure that they are the same.
>

But one has to differentiate between developer unit testing (the subject
of this post) and QA (customer) acceptance testing. The former can be
performed on any environment the developer chooses, the latter must be
run on the target.

> If I was doing QA for a product I would insist that you either use the
> same version of the same compiler or you provide testing to the same
> level that the deliverable SW requires of *both* the compiler used for
> test *and* the compiler used for the final SW. The more critical the SW,
> the more insistent I would be on this, and the more testing you would
> have to do on the compilers, for safety critical SW this would probably
> kill the project dead if you did not use identical SW to build for test
> and build for delivery. BTW, I *have* rejected SW and documentation at
> review, and even told developers that there was no point in putting it
> in for review because I would fail it.


Again, this is different from developer unit testing, I don't think
anyone would be daft enough to release a product that hadn't been
through acceptance testing on the target platform.

--
Ian Collins.
Ben Bacarisse

2007-08-27, 9:09 pm

Richard Heathfield <rjh@see.sig.invalid> writes:

> Ben Bacarisse said:
>
>
> Why? Presumably the machine that is doing the assisting is itself a
> computer program. What makes you think the assistance program is
> correct?


What do you test your software with if not more software?

If you think that a machine assisted proof would not inspire "a little
more" confidence than a hand proof, then I won't try to persuade you
(it was a modest enough claim) but the fact that a proof system is
software does not invalidate the method any more than testing is
invalidated by being done in software.

--
Ben.
Richard Heathfield

2007-08-27, 9:09 pm

Ben Bacarisse said:

> Richard Heathfield <rjh@see.sig.invalid> writes:
>
<snip>
>
> What do you test your software with if not more software?


A rolling pin. Any software that can withstand the pastry test is likely
to be able to withstand anything else too.

> If you think that a machine assisted proof would not inspire "a little
> more" confidence than a hand proof, then I won't try to persuade you
> (it was a modest enough claim)


Yes, on reflection I see that I'm guilty of (accidentally) extending
your claim, which was indeed modest enough.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Ben Pfaff

2007-08-27, 9:09 pm

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> Richard Heathfield <rjh@see.sig.invalid> writes:
>
>
> But this was a "by hand" proof in 1977. A machine assisted proof of
> the actual code could be expected to inspire a little more confidence.


YMMV of course, but if I could get Donald Knuth to prove my
programs correct "by hand", I'd feel no need for additional
confidence.
--
Ben Pfaff
http://benpfaff.org
Flash Gordon

2007-08-27, 9:09 pm

Ian Collins wrote, On 27/08/07 21:51:
> Flash Gordon wrote:
> But one has to differentiate between developer unit testing (the subject
> of this post) and QA (customer) acceptance testing.


You have missed out internal formal testing which in many environments
is far more complete than acceptance testing. For example, I've worked
on projects where a formal test literally takes a week to complete but
the customer acceptance testing takes only a few hours.

Unit test can also be formal, and in a lot of environments, including
the aforementioned safety critical projects, you are *required* to
perform formal unit tests.

> The former can be
> performed on any environment the developer chooses,


Informal testing can be run in any environment the developer has
available. Formal testing, which is the only sort of testing that you
can guarantee will be available and working for those maintaining later,
is another matter.

> the latter must be
> run on the target.


All formal testing, whether unit testing or testing at a higher level
has to be run on code compiled with the correct compiler, although not
always on an identical target.

>
> Again, this is different from developer unit testing, I don't think
> anyone would be daft enough to release a product that hadn't been
> through acceptance testing on the target platform.


Acceptance testing has very little to do with proving whether the system
works, it is just to give the customer some confidence. The real
worthwhile formal testing has to be completed *before* doing customer
acceptance testing and done with the correct compiler. At least, this is
the case in many environments, including all the projects where I have
been involved in QA, and on the safety critical project I was involved in.

If your customer acceptance testing is sufficient to prove the SW is
sufficiently correct then your customer has either very little trust in
your company or a lot of time to waste. If your customer acceptance
testing is the only testing done with the correct compiler and it is not
sufficient to prove your SW is sufficiently correct then your SW is not
tested properly. At least, not according to any standard of testing I
have come across.
--
Flash Gordon
Phlip

2007-08-27, 9:09 pm

Flash Gordon wrote:

> You have missed out internal formal testing which in many environments
> is far more complete than acceptance testing. For example, I've worked
> on projects where a formal test literally takes a week to complete but
> the customer acceptance testing takes only a few hours.


What did y'all do if the "formal" test failed?

What I look for is this: Replicate the failure as a short unit test. Not a
proof - just a stupid test that fails because the code change needed to fix
that formal test isn't there.

The point is to make the fast tests higher value as you go...

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
Ben Bacarisse

2007-08-27, 9:09 pm

Ben Pfaff <blp@cs.stanford.edu> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>
> YMMV of course, but if I could get Donald Knuth to prove my
> programs correct "by hand", I'd feel no need for additional
> confidence.


No, MMIS[1]. I did not intend to disparage Prof. Knuth's "hand
proofs" (what a thought!) but rather to say that the problem he is
referring to is as likely to be that one proves something other than
the program one has written (or later writes) as it is to be that one's
proof is (internally) flawed.

I suspect that he is not entirely happy with the way that quip is used
so often to suggest the pointlessness of proofs[2] (after all, what
did he choose to do with his "Notes on van Emde Boas construction of
priority deques" -- a proof rather than a test implementation!).

[1] "My mileage is similar".
[2] This is not one of those times -- RH was just countering the much
stronger assertion that proof => no need to test.

--
Ben.
user923005

2007-08-27, 9:09 pm

On Aug 27, 4:17 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
[snip]
> I suspect that he is not entirely happy with the way that quip is used
> so often to suggest the pointlessness of proofs[2] (after all, what
> did he choose to do with his "Notes on van Emde Boas construction of
> priority deques" -- a proof rather than a test implementation!).


Aside:
Working vEB tree implementation here:
http://www.itu.dk/people/kokholm/veb/

Ian Collins

2007-08-27, 11:44 pm

Flash Gordon wrote:
> Ian Collins wrote, On 27/08/07 21:51:
>
> You have missed out internal formal testing which in many environments
> is far more complete than acceptance testing. For example, I've worked
> on projects where a formal test literally takes a week to complete but
> the customer acceptance testing takes only a few hours.
>

If performed, internal formal testing is still a step away from
developer testing.

>
> Informal testing can be run in any environment the developer has
> available. Formal testing, which is the only sort of testing that you
> can guarantee will be available and working for those maintaining later,
> is another matter.
>

How so? A unit test suite doesn't just vanish when the code is
released, it is an essential part of the code base.
>
> Acceptance testing has very little to do with proving whether the system
> works, it is just to give the customer some confidence.


That depends on your definition of Acceptance tests. In our case, they
are the automated suite of tests that have to pass before the product is
released to customers.

> The real
> worthwhile formal testing has to be completed *before* doing customer
> acceptance testing and done with the correct compiler. At least, this is
> the case in many environments, including all the projects where I have
> been involved in QA, and on the safety critical project I was involved in.
>

Again, that depends on your process.

> If your customer acceptance testing is sufficient to prove the SW is
> sufficiently correct then your customer has either very little trust in
> your company or a lot of time to waste. If your customer acceptance
> testing is the only testing done with the correct compiler and it is not
> sufficient to prove your SW is sufficiently correct then your SW is not
> tested properly. At least, not according to any standard of testing I
> have come across.


Why? Our acceptance tests are very comprehensive, written by
professional testers working with a product manager (the customer).

It sounds like you don't have fully automated acceptance tests.
Wherever possible, all tests should be fully automated.

--
Ian Collins.
Richard Heathfield

2007-08-28, 6:19 am

Ben Bacarisse said:

<snip>
>
> I suspect that [DEK] is not entirely happy with the way that quip
> is used so often to suggest the pointlessness of proofs[2]


<snip>

> [2] This is not one of those times -- RH was just countering the much
> stronger assertion that proof => no need to test.


Right. One problem is that they don't always prove what you asked them
to prove. What you actually want to know is "does this program properly
do what I need it to do?", but what a prover actually tells you is
whether program X conforms to a particular expression of specification
Y. It makes no comment whatsoever on whether specification Y
corresponds to wishlist Z. And, very often, such correspondence is far
from perfect.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Jonathan Kirwan

2007-08-28, 6:19 am

On Mon, 27 Aug 2007 19:14:04 +1200, Ian Collins <ian-news@hotmail.com>
wrote:

><snip>
>Why?


Perhaps because this is an example of a medical product where that is
felt to be required. There is a list of various kinds of "structural
coverages," those selected commensurate with the level of risk posed
by the software. (Saying something has "coverage," I think, always
implies 100% coverage, too. Not partial. So you either have coverage
or you don't.)

Borrowing from one of the US CDRH PDFs I have laying about :

· Statement Coverage – This criteria requires sufficient test cases
for each program statement to be executed at least once; however,
its achievement is insufficient to provide confidence in a
software product's behavior.

· Decision (Branch) Coverage – This criteria requires sufficient test
cases for each program decision or branch to be executed so that
each possible outcome occurs at least once. It is considered to be
a minimum level of coverage for most software products, but
decision coverage alone is insufficient for high-integrity
applications.

· Condition Coverage – This criteria requires sufficient test cases
for each condition in a program decision to take on all possible
outcomes at least once. It differs from branch coverage only when
multiple conditions must be evaluated to reach a decision.

· Multi-Condition Coverage – This criteria requires sufficient test
cases to exercise all possible combinations of conditions in a
program decision.

· Loop Coverage – This criteria requires sufficient test cases for
all program loops to be executed for zero, one, two, and many
iterations covering initialization, typical running and termination
(boundary) conditions.

· Path Coverage – This criteria requires sufficient test cases for
each feasible path, basis path, etc., from start to exit of a
defined program segment, to be executed at least once. Because of
the very large number of possible paths through a software program,
path coverage is generally not achievable. The amount of path
coverage is normally established based on the risk or criticality
of the software under test.

· Data Flow Coverage – This criteria requires sufficient test cases
for each feasible data flow to be executed at least once. A number
of data flow testing strategies are available.
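[Illustrative sketch of how the decision, condition and multi-condition criteria above differ for a single decision of the form (a || b). It is not from the CDRH document; the names vec, coverage and measure are hypothetical. Conditions are taken to be the input values themselves, so short-circuit evaluation, which can keep a condition from being evaluated at all, is ignored here.]

```c
#include <assert.h>
#include <stddef.h>

struct vec { int a, b; };          /* one test case for (a || b) */

struct coverage {
    int decision;         /* decision seen both true and false */
    int condition;        /* a and b each seen both true and false */
    int multi_condition;  /* all four (a, b) combinations seen */
};

/* Reports which criteria a set of test cases satisfies. */
struct coverage measure(const struct vec *tests, size_t n)
{
    unsigned dec_seen = 0;    /* bit 0: outcome false, bit 1: outcome true */
    unsigned a_seen = 0, b_seen = 0;
    unsigned combo_seen = 0;  /* one bit per (a, b) combination */
    struct coverage c;
    size_t i;

    for (i = 0; i < n; i++) {
        int a = tests[i].a != 0;
        int b = tests[i].b != 0;
        dec_seen   |= 1u << (a || b);
        a_seen     |= 1u << a;
        b_seen     |= 1u << b;
        combo_seen |= 1u << (a * 2 + b);
    }
    c.decision        = (dec_seen == 3u);
    c.condition       = (a_seen == 3u && b_seen == 3u);
    c.multi_condition = (combo_seen == 15u);
    return c;
}

void coverage_demo(void)
{
    const struct vec cond_only[] = { {1, 0}, {0, 1} };
    const struct vec dec_only[]  = { {1, 0}, {0, 0} };
    const struct vec all_four[]  = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
    struct coverage c;

    c = measure(cond_only, 2);   /* decision is true in both cases */
    assert(!c.decision && c.condition && !c.multi_condition);

    c = measure(dec_only, 2);    /* b is never true */
    assert(c.decision && !c.condition && !c.multi_condition);

    c = measure(all_four, 4);
    assert(c.decision && c.condition && c.multi_condition);
}
```

The first test set shows why condition coverage alone is weak: both conditions take both values, yet the decision never comes out false.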

For potentially high risk software, you may not just use a different
compiler or a different operating system environment or change even
the optimization options. As the OP mentioned, it's probably going to
be hard enough just justifying an instruction simulator.

I can easily see a desire for an automated way of demonstrating that
structural testing has achieved one or more of these cases. If I read
the OP right about this, anyway.

Jon
Richard Bos

2007-08-28, 6:19 am

Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:

> Richard Heathfield <rjh@see.sig.invalid> writes:
>
>
> What do you test your software with if not more software?


Sed quis custodiet ipsos custodes?

Richard
Ian Collins

2007-08-28, 6:19 am

Jonathan Kirwan wrote:
> On Mon, 27 Aug 2007 19:14:04 +1200, Ian Collins <ian-news@hotmail.com>
> wrote:
>
>
> Perhaps because this is an example of a medical product where that is
> felt to be required. There is a list of various kinds of "structural
> coverages," those selected commensurate with the level of risk posed
> by the software. (Saying something has "coverage," I think, always
> implies 100% coverage, too. Not partial. So you either have coverage
> or you don't.)
>

The why was prompted by the posting subject "critique of a C unit test
environment". To my way of thinking (TDD), unit tests are a developer
tool, not formal product tests.


<interesting stuff snipped>


--
Ian Collins.
Colin Paul Gloster

2007-08-28, 6:19 am

On 2007-08-27, Ben Pfaff <blp@cs.stanford.edu> wrote:

|--------------------------------------------------------------------------|
|"[..] |
| |
|YMMV of course, but if I could get Donald Knuth to prove my |
|programs correct "by hand", I'd feel no need for additional |
|confidence." |
|--------------------------------------------------------------------------|

Such as the way Donald E. Knuth told Leslie Lamport that TeX would
hardly change at all? From
HTTP://research.Microsoft.com/users...x-interview.pdf
:"[..]
[..] When Don
was writing TEX80, he announced that it would be a
reimplementation of TEX78, but he was not going to
add new features. I took him seriously and asked for
almost no changes to TEX itself. [..] However, there were many other
improvements that I could have suggested but didn't. In
the end, Don wound up making very big changes to
TEX78. But they were all incremental, and there was
never a point where he admitted that he was willing
to make major changes. Had I known at the beginning
how many changes he would be making, I would
have tried to participate in the redesign. [..]
[..]"

Regards,
Colin Paul Gloster
Colin Paul Gloster

2007-08-28, 6:19 am

On 2007-08-27, Richard Heathfield <rjh@see.sig.invalid> wrote:

|------------------------------------------------------------------------|
|"Ben Bacarisse said: |
| |
|> Richard Heathfield <rjh@see.sig.invalid> writes: |
|> |
|>> Erik Wikström said: |
|>> |
|>> <snip> |
|>> |
|>>> Testing is used to find errors, while formal methods are used to |
|>>> prove that there are no errors, at least that's the goal. So if you |
|>>> can prove that there are no errors why test for them? |
|>> |
|>> "Beware of bugs in the above code; I have only proved it correct, not|
|>> tried it." - Donald E Knuth. |
|> |
|> But this was a "by hand" proof in 1977. A machine assisted proof of |
|> the actual code could be expected to inspire a little more confidence.|
| |
|Why? Presumably the machine that is doing the assisting is itself a |
|computer program. What makes you think the assistance program is |
|correct?" |
|------------------------------------------------------------------------|

Full points to Mister Heathfield.
Phlip

2007-08-28, 7:22 pm

Colin Paul Gloster wrote:

> Such as the way Donald E. Knuth told Leslie Lamport that TeX would
> hardly change at all? From
>

HTTP://research.Microsoft.com/users...x-interview.pdf
> :"[..]
> [..] When Don
> was writing TEX80, he announced that it would be a
> reimplementation of TEX78, but he was not going to
> add new features. I took him seriously and asked for
> almost no changes to TEX itself. [..] However, there were many other
> improvements that I could have suggested but didn't. In
> the end, Don wound up making very big changes to
> TEX78. But they were all incremental, and there was
> never a point where he admitted that he was willing
> to make major changes. Had I known at the beginning
> how many changes he would be making, I would
> have tried to participate in the redesign. [..]


"Principle 4

" * Level out the workload (heijunka). (Work like the tortoise, not the
hare).

"This helps achieve the goal of minimizing waste (muda), not overburdening
people or the equipment (muri), and not creating uneven production levels
(mura)."

http://en.wikipedia.org/wiki/The_Toyota_Way

--
Phlip
Paul E. Black

2007-08-28, 7:22 pm

Erik Wikström wrote:

>Testing is used to find errors, while formal methods are used to prove
>that there are no errors, at least that's the goal. So if you can prove
>that there are no errors why test for them?


Formal methods answer the questions you ask. Testing may answer a
question you didn't think to ask.

Just as experiment and "theory" (mathematical modeling) complement
each other in science, so testing helps validate that the formal
models, logic, assumptions, reasoning, etc. were correct.

-paul-
p.black@acm.org

Walter Banks

2007-08-28, 7:22 pm

I will second that, well put.

The Achilles heel for either Testing or formal methods is
contaminating the evaluation process with information
from the implementation.

I have seen unit tests contaminated from just knowing
the application area the code was going to be used in.

w..


Colin Paul Gloster wrote:

> On 2007-08-27, Richard Heathfield <rjh@see.sig.invalid> wrote:
>
> |------------------------------------------------------------------------|
> |"Ben Bacarisse said: |
> | |
> |> Richard Heathfield <rjh@see.sig.invalid> writes: |
> |> |
> |>> Erik Wikström said: |
> |>> |
> |>> <snip> |
> |>> |
> |>>> Testing is used to find errors, while formal methods are used to |
> |>>> prove that there are no errors, at least that's the goal. So if you |
> |>>> can prove that there are no errors why test for them? |
> |>> |
> |>> "Beware of bugs in the above code; I have only proved it correct, not|
> |>> tried it." - Donald E Knuth. |
> |> |
> |> But this was a "by hand" proof in 1977. A machine assisted proof of |
> |> the actual code could be expected to inspire a little more confidence.|
> | |
> |Why? Presumably the machine that is doing the assisting is itself a |
> |computer program. What makes you think the assistance program is |
> |correct?" |
> |------------------------------------------------------------------------|
>
> Full points to Mister Heathfield.


Flash Gordon

2007-08-28, 7:22 pm

Ian Collins wrote, On 28/08/07 04:46:
> Flash Gordon wrote:
> If performed, internal formal testing is still a step away from
> developer testing.


Yes. However, above you said that it should not matter for unit testing
whether you use the same compiler or not. Since unit testing can and
often *is* formal such a statement is at least misleading. Had you said
that it did not matter for informal testing and had the OP been asking
about informal testing you might have a point, but it was never stated
that the unit testing was informal.

> How so? A unit test suite doesn't just vanish when the code is
> released, it is an essential part of the code base.


Simple. If it is not formal then you (the next developer) have no
guarantee that it is in a usable state. So you, the next developer, have
to fully validate any tests you will rely on during your development.

>
> That depends on your definition of Acceptance tests. In our case, they
> are the automated suite of tests that have to pass before the product is
> released to customers.


Yes, this could be a matter of definition. To me an acceptance test is
the customer coming in and witnessing some pre-agreed tests where if
they pass the customer will accept the SW and/or HW (and pay for it). It
has nothing to do with whether the company is prepared to give the SW to
the customer.

> Again, that depends on your process.


I've not worked for a company where they would be prepared to try and
get a customer to accept SW before having a decent level of confidence
that it is correct *and* acceptable to the customer.

>
> Why? Our acceptance test are very comprehensive, written by
> professional testers working with a product manager (the customer).
>
> It sounds like you don't have fully automated acceptance tests. Where
> ever possible, all tests should be fully automated.


It is not possible for a reasonable cost to fully automate all testing.
On a number of projects I have worked on the formal testing included
deliberately connecting up the system incorrectly (and changing the
physical wiring whilst the SW is running), inducing faults in the HW
that the SW was intended to test, responding either correctly or
incorrectly to operator prompts, putting a plate in front of a camera so
that it could not see the correct image whilst the SW is looking at it,
swapping a card in the system for a card from a system with a different
specification etc. It would literally require a robot to automate some
of this testing, and some of the rest of it would require considerable
investment to automate. Compared to the cost of the odd few man-weeks to
manually run through the formal testing with a competent witness the
cost of automation would be stupid.

BTW, on the SW I am mainly thinking of there were so few bug reports
that on one occasion when the customer representative came to us for
acceptance testing, a few years after the previous version, both the
customer representative and I could remember all of the fault reports
and discuss why I knew none of them were present in the new version. The
customer representative was *not* a user (he worked for a "Procurement
Executive" and not for the organisation that used the kit), so he would
not have seen it for several years.

If you doubt the quality of the manual testing, then look at how many
50000 line pieces of SW have as few as 10 fault reports from customers
over a 15 year period. Most of those fault reports were in the early
years, and *none* were after the last few deliveries I was involved in.

BTW, if they are still using the SW at the start of 2028 we have a
problem, but that is documented and could easily be worked around.
--
Flash Gordon
Ian Collins

2007-08-28, 7:22 pm

Flash Gordon wrote:
> Ian Collins wrote, On 28/08/07 04:46:
>
> Yes. However, above you said that it should not matter for unit testing
> whether you use the same compiler or not. Since unit testing can and
> often *is* formal such a statement is at least misleading. Had you said
> that it did not matter for informal testing and had the OP been asking
> about informal testing you might have a point, but it was never stated
> that the unit testing was informal.
>

We all work from our own point of reference; in mine, unit tests are a
developer tool, so that's why I answered as I did.

>
> Simple. If it is not formal then you (the next developer) have no
> guarantee that it is in a usable state. So you, the next developer, have
> to fully validate any tests you will rely on during your development.
>

Again, as one who uses TDD, the tests are always up to date as they
document the workings of the code. All down to process.

>
> Yes, this could be a matter of definition. To me an acceptance test is
> the customer coming in and witnessing some pre-agreed tests where if
> they pass the customer will accept the SW and/or HW (and pay for it). It
> has nothing to do with whether the company is prepared to give the SW to
> the customer.
>

Ah, that explains a lot!

>
> I've not worked for a company where they would be prepared to try and
> get a customer to accept SW before having a decent level of confidence
> that it is correct *and* acceptable to the customer.
>

Neither have I.

>
> It is not possible for a reasonable cost to fully automate all testing.


True, but with care you can automate the majority of them. The beauty
of automated tests is they cost next to nothing to run, so they can be
continuously run against your code repository.

> On a number of projects I have worked on the formal testing included
> deliberately connecting up the system incorrectly (and changing the
> physical wiring whilst the SW is running), inducing faults in the HW
> that the SW was intended to test, responding either correctly or
> incorrectly to operator prompts, putting a plate in front of a camera so
> that it could not see the correct image whilst the SW is looking at it,
> swapping a card in the system for a card from a system with a different
> specification etc. It would literally require a robot to automate some
> of this testing, and some of the rest of it would require considerable
> investment to automate. Compared to the cost of the odd few man-weeks to
> manually run through the formal testing with a competent witness the
> cost of automation would be stupid.
>

There you have what I'd call integration testing, something we also do
with any software that interacts with other equipment.

>
> If you doubt the quality of the manual testing, then look at how many
> 50000 line pieces of SW have as few as 10 fault reports from customers
> over a 15 year period. Most of those fault reports were in the early
> years, and *none* were after the last few deliveries I was involved in.
>

I don't doubt it, I just prefer to spend my resources elsewhere. We go
through the full manual integration tests for major software releases
(adding acceptance and unit tests to reproduce any bugs found). This
process of feeding back tests into the automated suites makes them
progressively more thorough, to the extent that minor updates can be
released without manual testing and the testing of major releases finds
few, if any, bugs. Most of the bugs found by the manual testing are
differing interpretations of the specification.


--
Ian Collins.
Flash Gordon

2007-08-29, 7:16 pm

Ian Collins wrote, On 28/08/07 22:03:
> Flash Gordon wrote:
> We all work from our own point of reference; in mine, unit tests are a
> developer tool, so that's why I answered as I did.


You should try to avoid assuming everyone works the same way. In the
defence industry at least it is very common for there to be a lot of
formal unit tests.

> Again, as one who uses TDD, the tests are always up to date as they
> document the workings of the code. All down to process.


If the process is enforced then the testing is formal and, I would
expect, the results are recorded somewhere the 10th developer after you
will be able to find them.

> Ah, that explains a lot!


Acceptance tests are used to accept, simple :-)

> Neither have I.


So you do your acceptance tests before the customer sees the kit?

>
> True, but with care you can automate the majority of them. The beauty
> of automated tests is they cost next to nothing to run, so they can be
> continuously run against your code repository.


I fully understand the use of them. However, it is not always either
practical or cost effective. In this case there was no automated test
system available, so if we wanted one we would have had to design,
implement and test it, then write all the test harnesses...

Almost forgot, we would have had to generate and validate a *lot* of
test data instead of just using real kit either with or without faults.

At the end of the day we would also have had to do thorough integration
testing as well. So I still believe doing automated testing would have
been more expensive overall, and certainly would have been a significant
up-front cost.

Note that this SW does a *lot* of HW interaction, since it is actually
the main SW of a piece of 2nd line test equipment.

> There you have what I'd call integration testing,


Yes and no. Each set of tests was focused on exercising a specific unit,
it was just using the rest of the SW as a test harness.

> something we also do
> with any software that interacts with other equipment.


Obviously. We just killed multiple birds with the same high-tech
missile^W^W^Wstone.

> I don't doubt it, I just prefer to spend my resources elsewhere.


I still don't believe it cost more time overall.

> We go
> through the full manual integration tests for major software releases
> (adding acceptance and unit tests to reproduce any bugs found).


We also added tests to trap the few bugs that were found.

> This
> process of feeding back tests into the automated suites makes them
> progressively more thorough, to the extent that minor updates can be
> released without manual testing and the testing of major releases finds
> few, if any, bugs.


We started off by making the tests thorough which is why the testing
takes so long. Due to this and the low bug count almost all releases
whilst I worked at the company were major releases (adding support for
major variants of the kit it tested, testing major new features in new
versions of the kit it tested etc) with only a small number of bug-fix
releases.

> Most of the bugs found by the manual testing are
> differing interpretations of the specification.


Not on this SW. Reviews of requirements caught most of them and reviews
of design most of the remainder. I can only think of one interpretation
issue on the SW that was not caught before coding started on this SW.
--
Flash Gordon
Ian Collins

2007-08-29, 10:14 pm

Flash Gordon wrote:
> Ian Collins wrote, On 28/08/07 22:03:
>
> If the process is enforced then the testing is formal and, I would
> expect, the results are recorded somewhere the 10th developer after you
> will be able to find them.
>

The results are recorded every time the tests run - either "OK" or
failure messages :)

>
> So you do your acceptance tests before the customer sees the kit?
>

They are run as soon as the feature they test is complete.

>
> I fully understand the use of them. However, it is not always either
> practical or cost effective. In this case there was no automated test
> system available, so if we wanted one we would have had to design,
> implement and test it, then write all the test harnesses...
>

If the project is long running, or a family of products is to be
maintained, it can be worth the effort. I preferred to have my test
engineers developing innovative ways to build automatic tests than have
them running manual tests. Provided they can produce the tests at
least as fast as the developers code the features, everyone is happy.

> Almost forgot, we would have had to generate and validate a *lot* of
> test data instead of just using real kit either with or without faults.
>

I like to capture all of the data generated during manual tests and feed
it back through as part of the automated tests.

> Note that this SW does a *lot* of HW interaction, since it is actually
> the main SW of a piece of 2nd line test equipment.
>

The examples I'm referring to were power system controllers.

>
> I still don't believe it cost more time overall.
>

This project has been running (the product has to continuously evolve to
meet the changing market) for 5 years, so the up front cost has paid for
itself many times over.

--
Ian Collins.
Phlip

2007-08-29, 10:14 pm

Flash Gordon wrote:

>
> You should try to avoid assuming everyone works the same way. In the
> defence industry at least it is very common for there to be a lot of
> formal unit tests.


I think Ian refers to "developer tests". Giving them different definitions
helps. They have overlapping effects but distinct motivations.

The failure of a unit test implicates only one unit in the system, so the
search for a bug should be very easy. The failure of a developer test
implicates the last edit - not that it inserted a bug, but only that it
failed the test suite! Finding and reverting that edit is easier than
debugging.

> I fully understand the use of them. However, it is not always either
> practical or cost effective.


They are more cost effective than endless debugging!!!

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Flash Gordon

2007-08-31, 9:07 pm


Ian Collins wrote, On 30/08/07 02:42:
> Flash Gordon wrote:
> The results are recorded every time the tests run - either "OK" or
> failure messages :)


They are only recorded if they are put somewhere that someone can see
them after you have left the company. Otherwise they are only reported.

> They are run as soon as the feature they test is complete.


And all re-run after the final line of code is cut, I trust.

> If the project is long running,


I started on it in the late 80's and the last I heard was a contract
signed giving an option of support until 2020, that long enough for you?

> or a family of products is to be
> maintained, it can be worth the effort.


Only half a dozen or so variants, all using over 90% common code.

> I preferred to have my test
> engineers developing innovative ways to build automatic tests than have
> them running manual tests.


Ah, but we did not spend vast amounts of time running the tests, not
compared to the time/effort involved in generating the required test
data, automating the tests, and then writing the integration tests
needed to prove it works as an entire system.

> Provided they can produce the tests at
> least as fast as the developers code the features, everyone is happy.


We did not have the luxury of dedicated test developers. Those
developing the tests were those analysing the requirements, designing
the SW and implementing it.

> I like to capture all of the data generated during manual tests and feed
> it back through as part of the automated tests.


That would require writing a lot of SW to capture the data. All of which
would have to be tested.

> The examples I'm referring to were power system controllers.


I'm talking about 2nd line test equipment for *very* high end camera and
image processing systems. 2nd line is the kit the customer puts it on
when it has come back from operation broken.

> This project has been running (the product has to continuously evolve to
> meet the changing market) for 5 years, so the up front cost has paid for
> itself many times over.


Ah well, the SW I'm referring to changes only every few years due to new
customers or existing customers wanting enhancements to the kit it is to
test. The last set of updates I'm aware of will have started probably in
2001 (maybe 2000) but I had left the company by then. I know we had won
the contract. So definitely over twice as long a period. Requirements
changes also had minimal code impact because we had designed the system
to allow for changes.
--
Flash Gordon
Ian Collins

2007-08-31, 9:07 pm

Flash Gordon wrote:
> Ian Collins wrote, On 30/08/07 02:42:
>
> They are only recorded if they are put somewhere that someone can see
> them after you have left the company. Otherwise they are only reported.
>

The tests are part of the project, in the same source control. Without
the tests, the project can not build. Building and running the tests is
an integral part of the build process.

The last sentence is important, so I'll repeat it - the unit tests are
built and run each time the module is compiled.

>
> And all re-run after the final line of code is cut, I trust.
>

Rerun every build, dozens of times a day for each developer or pair.

--
Ian Collins.
Ark Khasin

2007-08-31, 9:07 pm

Thank you folks for the fruitful discussion :)
Ian Collins

2007-08-31, 9:07 pm

Ark Khasin wrote:
> Thank you folks for the fruitful discussion :)


What was your conclusion?

--
Ian Collins.
Ark Khasin

2007-09-01, 12:10 am

Ian Collins wrote:
> Ark Khasin wrote:
>
> What was your conclusion?
>

Since you asked... I expected a more substantial feedback (perhaps,
naively).

I've been heavily involved in a SIL 3 project with ARM and IAR EWARM 4.x.
Our consultant-on-safety-stuff company spelled out requirements for unit
testing /documentation/ so that it could be accepted by TÜV - a
certifying agency. So we started looking for test automation tools that
would help /demonstrate/ test results including code/branch coverage.
IPL Cantata++ didn't (at least at the time) support IAR. Period.
LDRA Testbed was (at least at the time) outright buggy; we had to
abandon it.
We ended up doing unit tests ad hoc with manual documentation, asking
for code coverage proof from the debugger. In the process, I found a way
of instrumenting the code by abusing the C preprocessor. It was not what
I asked to critique but the ideas were similar. I used it for regression
testing to make sure the execution trace was the same - or it changed
for a purpose and I had to set a new regression base.
It occurred to me later that the same instrumentation ideas could be
used to demonstrate code coverage, and that the framework could be
commonized with a very small footprint. That's what I asked to critique.
Personally, I love this thingie. It can get me through SIL business for
free (as opposed to 10K per license or so), and additional coding policy
items should not be an issue for those already restricted by (a subset
of) MISRA.
Note that this stuff is applicable whether you do TDD or a formal
testing campaign.
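
As an illustration of the general idea only (this is not the framework
offered for critique above, and every macro and function name below is
hypothetical), coverage points can be planted with a macro that records a
hit in test builds and expands to nothing in production builds:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a test build defines UNIT_TEST (e.g. -DUNIT_TEST).
 * It is defined here only to keep the sketch self-contained. */
#define UNIT_TEST 1

/* One identifier per branch we want evidence for (hypothetical names). */
enum { COV_CLAMP_LOW, COV_CLAMP_HIGH, COV_CLAMP_PASS, COV_COUNT };

#ifdef UNIT_TEST
static unsigned char cov_hit[COV_COUNT];
#define COV_POINT(id) ((void)(cov_hit[(id)] = 1))
#else
#define COV_POINT(id) ((void)0)   /* no footprint in production builds */
#endif

/* Unit under test: clamp x into [lo, hi]. */
int clamp(int x, int lo, int hi)
{
    if (x < lo) { COV_POINT(COV_CLAMP_LOW);  return lo; }
    if (x > hi) { COV_POINT(COV_CLAMP_HIGH); return hi; }
    COV_POINT(COV_CLAMP_PASS);
    return x;
}

#ifdef UNIT_TEST
/* Number of coverage points that no test has reached yet. */
size_t cov_missing(void)
{
    size_t missing = 0;
    for (size_t i = 0; i < (size_t)COV_COUNT; ++i)
        if (!cov_hit[i])
            ++missing;
    return missing;
}

/* Exercise every branch, then report what is still uncovered. */
size_t run_clamp_tests(void)
{
    assert(clamp(5, 0, 10) == 5);
    assert(clamp(-1, 0, 10) == 0);
    assert(clamp(11, 0, 10) == 10);
    return cov_missing();
}
#endif
```

A nonzero return from `run_clamp_tests()` would mean some branch was never
exercised, which is the kind of evidence a coverage report has to show.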

--
Ark



Ian Collins

2007-09-01, 6:20 am

Ark Khasin wrote:
> Ian Collins wrote:
> Since you asked... I expected a more substantial feedback (perhaps,
> naively).
>
> I've been heavily involved in a SIL 3 project with ARM and IAR EWARM 4.x.
> Our consultant-on-safety-stuff company spelled out requirements for unit
> testing /documentation/ so that it could be accepted by TÜV - a
> certifying agency. So we started looking for test automation tools that
> would help /demonstrate/ test results including code/branch coverage.


Well TDD done correctly will give you the required coverage, might be
interesting proving it to the certifying agency though (a branch won't
be there unless a test requires it).

--
Ian Collins.
Ark Khasin

2007-09-01, 6:20 am

Ian Collins wrote:
>
> Well TDD done correctly will give you the required coverage, might be
> interesting proving it to the certifying agency though (a branch won't
> be there unless a test requires it).
>

The mantra is, if a branch is to never be executed, it shall not be in
the production code. If it is executable, I must demonstrate how I
exercised it.
BTW, I tend to doubt TDD can be followed in semi-research environment
where you churn out piles of code to see what works vs. how the plant
behaves etc. Of course there is some lightweight informal T but none
documented. Once you found a decent solution, you got a fair amount of
code that needs, aside from bug finding/fixing, only error conditions
handling for productizing. At this point, T is already way behind D.
Or I must be missing something here...

--
Ark
Ark Khasin

2007-09-01, 6:20 am

Ian Collins wrote:

> The last sentence is important, so I'll repeat it - the unit tests are
> built and run each time the module is compiled.
>

Assuming that the test code itself must be reasonably dumb (so that
/its/ errors immediately stand out), that's not terribly realistic:
imagine a sweep over, say, "int24_t" range. One could only hope to run
automated tests overnight - on a long night :).
--
Ark
Ian Collins

2007-09-01, 6:20 am

Ark Khasin wrote:
> Ian Collins wrote:
>
> Assuming that the test code itself must be reasonably dumb (so that
> /its/ errors immediately stand out), that's not terribly realistic:
> imagine a sweep over, say, "int24_t" range. One could only hope to run
> automated tests overnight - on a long night :).


It may not appear that way, but it is the reality on any project I
manage. In all (C++) cases, the tests take less time to run than the
code takes to build (somewhere between 50 and 100 tests per second,
unoptimised).

--
Ian Collins.
Ian Collins

2007-09-01, 6:20 am

Ark Khasin wrote:
> Ian Collins wrote:
> The mantra is, if a branch is to never be executed, it shall not be in
> the production code. If it is executable, I must demonstrate how I
> exercised it.
> BTW, I tend to doubt TDD can be followed in semi-research environment
> where you churn out piles of code to see what works vs. how the plant
> behaves etc. Of course there is some lightweight informal T but none
> documented. Once you found a decent solution, you got a fair amount of
> code that needs, aside from bug finding/fixing, only error conditions
> handling for productizing. At this point, T is already way behind D.
> Or I must be missing something here...
>

Well there you would be wrong. I even use it for quick one offs,
because it helps me go faster. The time saved not having to debug more
than justifies the process.

--
Ian Collins.
Ian Collins

2007-09-01, 6:20 am

Ark Khasin wrote:
> Ian Collins wrote:
> The mantra is, if a branch is to never be executed, it shall not be in
> the production code. If it is executable, I must demonstrate how I
> exercised it.


With TDD, if the branch isn't required to pass a test, it won't be there
at all.

--
Ian Collins.
Keith Thompson

2007-09-01, 6:20 am

Ark Khasin <akhasin@macroexpressions.com> writes:
> Ian Collins wrote:
> Assuming that the test code itself must be reasonably dumb (so that
> /its/ errors immediately stand out), that's not terribly realistic:
> imagine a sweep over, say, "int24_t" range. One could only hope to run
> automated tests overnight - on a long night :).


If every unit test has to check every possible value over a large
range, then yes, things could take a while. I just wrote a program
that iterated over a range of 2**24 in under a second, but if a
function takes three 32-bit arguments an exhaustive test starts to be
impractical.

But presumably in that case you'd just test a carefully chosen subset
of the possible argument values.
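
A minimal sketch of such an exhaustive sweep, assuming a hypothetical
24-bit sign-extension routine `sext24()` as the unit under test and a
deliberately "dumb" reference `sext24_ref()` to compare against (neither
name comes from this thread):

```c
#include <assert.h>
#include <stdint.h>

/* Unit under test (hypothetical): sign-extend a 24-bit two's complement
 * value held in the low bits of raw. */
int32_t sext24(uint32_t raw)
{
    raw &= 0xFFFFFFu;
    return (int32_t)(raw ^ 0x800000u) - (int32_t)0x800000;
}

/* Reference written to be obviously correct rather than clever. */
int32_t sext24_ref(uint32_t raw)
{
    raw &= 0xFFFFFFu;
    if (raw >= 0x800000u)
        return (int32_t)raw - (int32_t)0x1000000;
    return (int32_t)raw;
}

/* Compare both implementations over all 2**24 inputs; return the
 * number of mismatches. */
uint32_t sweep_sext24(void)
{
    uint32_t mismatches = 0;
    for (uint32_t raw = 0; raw <= 0xFFFFFFu; ++raw)
        if (sext24(raw) != sext24_ref(raw))
            ++mismatches;
    return mismatches;
}

void test_sext24(void)
{
    assert(sweep_sext24() == 0);
}
```

The same loop over three 32-bit arguments would need 2**96 iterations,
which is why a chosen subset is the only option there.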

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Phlip

2007-09-01, 9:49 am

Ian Collins wrote:

> Ark Khasin wrote:



Sounds like TDD.
> Well there you would be wrong. I even use it for quick one offs,
> because it helps me go faster. The time saved not having to debug more
> than justifies the process.


Ian, you are responding to the straw-person argument, "Projects that use TDD
only ever write any tests before the tested code."

They don't. Now let's hear why these "research" environments are _allowed_
to write code without a failing test, first!

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Phlip

2007-09-01, 9:49 am

Ian Collins wrote:

> Well TDD done correctly will give you the required coverage, might be
> interesting proving it to the certifying agency though


Put the test into FIT, and let the certifying agency change the input
variables and watch the output responses change.

Oh, are you going to say there are "certifying agencies" out there which
_don't_ expect literate acceptance tests to cover their requirements??

Sure explains KBR, huh? (-;

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Phlip

2007-09-01, 9:49 am

Ark Khasin wrote:

> Ian Collins wrote:


> Assuming that the test code itself must be reasonably dumb (so that /its/
> errors immediately stand out), that's not terribly realistic: imagine a
> sweep over, say, "int24_t" range. One could only hope to run automated
> tests overnight - on a long night :).


Under TDD, you might only need enough tests to establish a linear function
along that int24_t (two tests), and enough to establish its extrema (two
more tests). These run fast.
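
A sketch of what those four tests could look like in C, assuming a
hypothetical linear conversion `scale24()` over a signed 24-bit input.
Standard C has no int24_t, so the range limits are spelled out as
constants; all names here are illustrative:

```c
#include <assert.h>
#include <stdint.h>

#define I24_MIN (-8388608L)
#define I24_MAX 8388607L

/* Hypothetical unit under test: y = 2*x + 100 for x in [I24_MIN, I24_MAX]. */
int32_t scale24(int32_t x)
{
    return 2 * x + 100;
}

void test_scale24(void)
{
    /* Two interior points pin down the offset and the slope. */
    assert(scale24(0) == 100);
    assert(scale24(1000) == 2100);

    /* Two extrema check that the ends of the range do not overflow. */
    assert(scale24(I24_MAX) == 16777314L);
    assert(scale24(I24_MIN) == -16777116L);
}
```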

If you then need more tests, you add them. There is no TDD "or" a formal
test campaign. You are allowed to use "unit test" techniques. For example,
you might calculate a sweep that covers a representative subset of all
possible integers.

If your tests run too long to help development, you push the slow ones out
into a slow suite, and run this on a test server. You can use one of many
Continuous Integration tools to trigger that batch run each time developers
commit. (And they should commit every 5 to 15 minutes.)

You leave every TDD test in the developers' suite.

If the long suites fail, you treat the failure the same as a bug reported by
users. You determine the fix, then write a trivial TDD test which fails
because the fix is not there. Note the test does not exhaustively prove the
fix is correct; it's just enough to pin the fix down. You leave this test
case with the developers' suite.

A program that uses a wide range of an int24_t's possible states is a
program with a high dimensional space. All programs have such a space! The
exhaustive test for such spaces would take forever to run. However, high
dimension spaces are inherently sparse. Tests only need to constrain
specific points and lines within that space.

One way to determine these points and lines is to analyze that space to
determine the minimum set of sweeps of inputs that will cover the whole
space. You test a carefully chosen subset of the possible input values.

Another way is to write the tests first. Then you have a test for two points
on every line, and each endpoint to each line, and so on.

If you write tests first, you also get to avoid a lot of debugging. If your
tests fail unexpectedly, you have the option to revert back to the last
state where all tests passed, and try again.

This leads to some curious effects. Firstly, you can make much more savage
changes between each test run. You can even change code that someone else
wrote! a long time ago!! that everything else uses!!!

Next, if your tests are cheap and sloppy, but the code can't exist without
them, you get the ultimate in Design for Testing. You get Design BY Testing.
That means your tests might fail even if your code had no bug.

Again: Your tests might fail even if your code had no bug.

That means your code occupies a high-dimension space that is easy for your
tests to cover. So instead of analyzing your code and wrapping your tests
around it, you used Adaptive Planning with both the code and tests to
simplify that high-dimension space.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Ark Khasin

2007-09-01, 10:08 pm

Phlip wrote:
> Ian Collins wrote:
>
>
> Put the test into FIT, and let the certifying agency change the input
> variables and watch the output responses change.
>
> Oh, are you going to say there are "certifying agencies" out there which
> _don't_ expect literate acceptance tests to cover their requirements??
>
> Sure explains KBR, huh? (-;
>

A certifying agency's stamp assures a Bhopal or Chernobyl plant manager
that it is safe to put your gadget in a safety application w.r.t.
acceptable safety risk according to a safety standard (ISO/IEC 61508 in
my case).
It verifies that
- you have an acceptable development process (from marketing
requirements to validation and everything in between, usually including
continuous improvement)
- you followed the process and can demonstrate it on every level.

It doesn't _run_ any tests for you; but it checks that you did so and
your tests were comprehensive and that you can show a documentation for it.
--
Ark
Ark Khasin

2007-09-01, 10:08 pm

Phlip wrote:
> Ian Collins wrote:
>
>
>
> Sounds like TDD.
>
>
> Ian, you are responding to the straw-person argument, "Projects that use TDD
> only ever write any tests before the tested code."
>
> They don't. Now let's hear why these "research" environments are _allowed_
> to write code without a failing test, first!
>

[If we agree that a test is a contraption to check if the code works as
expected:]
If we don't know what to expect ("research"), we cannot write a test.
[Or again I am missing something]

E.g. if I'm writing a version control system, I know exactly what _has_
to happen, and I can write the tests.
If e.g. I'm writing a monitoring code for the car's wheel speed sensors,
I may have a rock solid idea that e.g. whatever the speeds, the wheels
always remain in the vertices of a rectangle of the original size. Enter
sensor noise, wheel spinning, tire inflation and what not. I need lots
of code just to study what's going on before I can arrive at a sensible
algorithm.
--
Ark
Everett M. Greene

2007-09-01, 10:08 pm


Ark Khasin <akhasin@macroexpressions.com> writes:
> Ian Collins wrote:
>
> Assuming that the test code itself must be reasonably dumb (so that
> /its/ errors immediately stand out), that's not terribly realistic:
> imagine a sweep over, say, "int24_t" range. One could only hope to run
> automated tests overnight - on a long night :).


Is it really necessary to test all 2**24 values? It would seem that
testing the minimum, maximum, zero, and some representative values in
between would suffice. The "representative values" should be numerically
irrational (pi and e are good, for instance) so as to catch cases of
certain bits not being handled properly; 1 and 2 are not good choices
although they need to work properly as well.
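
[A minimal sketch of the boundary-value idea above, not code from the thread. The unit under test, sign_extend24(), and the names I24_MIN, I24_MAX and test_sign_extend24_boundaries() are hypothetical. The "pi and e" values are taken here as the decimal digits 3141592 and 2718281, on the assumption that the point is a mixed bit pattern.]

```c
#include <stddef.h>
#include <stdint.h>

#define I24_MIN (-8388608)  /* -2^23, smallest 24-bit two's-complement value */
#define I24_MAX 8388607     /*  2^23 - 1, largest 24-bit value */

/* Hypothetical unit under test: sign-extend a raw 24-bit field to int32_t. */
int32_t sign_extend24(uint32_t raw)
{
    raw &= 0xFFFFFFu;                     /* keep only the low 24 bits */
    if (raw & 0x800000u)                  /* bit 23 is the sign bit */
        return (int32_t)raw - 0x1000000;  /* subtract 2^24 for negatives */
    return (int32_t)raw;
}

/* Checks the minimum, maximum, zero, their neighbours, and a few values
   with irregular bit patterns. Returns the number of failed checks. */
int test_sign_extend24_boundaries(void)
{
    static const int32_t cases[] = {
        I24_MIN, I24_MIN + 1, -1, 0, 1, I24_MAX - 1, I24_MAX,
        3141592, -3141592, 2718281, -2718281
    };
    int failures = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; ++i) {
        /* Encode the expected value as a 24-bit field, then decode it. */
        uint32_t raw = (uint32_t)cases[i] & 0xFFFFFFu;
        if (sign_extend24(raw) != cases[i])
            ++failures;
    }
    return failures;
}
```

[Eleven cases stand in for the 2**24 sweep; whether that is enough depends on how the real function treats its input bits.]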

In the area of branch testing, one has to test loops for proper
termination. [I just found some bugs last evening that involved
some simple counting loops that didn't terminate due to doing a
check for <0 on an unsigned value -- oops.]
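
[A small illustration of the loop bug described above, under the assumption that it had the usual count-down shape. The function name copy_reversed() is hypothetical.]

```c
#include <stddef.h>

/* Buggy shape, shown only as a comment:
 *     for (size_t i = n - 1; i >= 0; --i) ...
 * i is unsigned, so "i >= 0" is always true and the loop never terminates.
 */

/* Copy src[0..n-1] into dst in reverse order using an unsigned index. */
void copy_reversed(const int *src, int *dst, size_t n)
{
    size_t out = 0;

    /* Compare first, then decrement: the body sees n-1 down to 0,
       and the loop exits when i is 0 before the comparison. */
    for (size_t i = n; i-- > 0; )
        dst[out++] = src[i];
}
```

[A termination test for this is simply a call with n = 0 and a call with a small n. Some compilers can also flag the always-true comparison, for example GCC with -Wtype-limits.]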
Everett M. Greene

2007-09-01, 10:08 pm

"Phlip" <phlipcpp@yahoo.com> writes:
>
>
> Ian, you are responding to the straw-person argument, "Projects that use
> TDD only ever write any tests before the tested code."
>
> They don't. Now let's hear why these "research" environments are _allowed_
> to write code without a failing test, first!


Have you ever worked in a product R&D environment? A lot of concepts
are taken for a test drive without ever seeing the light of day outside
the lab. If the product does make it out the door, the original
concept proving/testing work is probably a very small portion of the
final product. You want to spend a lot of time and effort producing
more formalized testing processes for something that has a very low
probability of ever being used in a production environment?
Ark Khasin

2007-09-01, 10:08 pm

Everett M. Greene wrote:
> Is it really necessary to test all 2**24 values? It would seem that
> testing the minimum, maximum, zero, and some representative values in
> between would suffice. The "representative values" should be numerically
> irrational (pi and e are good, for instance) so as to catch cases of
> certain bits not being handled properly; 1 and 2 are not good choices
> although they need to work properly as well.


The example of "gullibility measurement and conversion" in
http://www.macroexpressions.com/dl/... /> estring.pdf
may be reasonably convincing

>
> In the area of branch testing, one has to test loops for proper
> termination. [I just found some bugs last evening that involved
> some simple counting loops that didn't terminate due to doing a
> check for <0 on an unsigned value -- oops.]


IMHO, unsigned<0 condition doesn't rise to testing: Lint will find it
before you compile
Phlip

2007-09-01, 10:08 pm

Ark Khasin wrote:

> [If we agree that a test is a contraption to check if the code works as
> expected:]


The weakest possible such contraption - yes.

> If we don't know what to expect ("research"), we cannot write a test. [Or
> again I am missing something]


If you can think of the next line of code to write, you must perforce be
able to think of a test case that will fail because the line is not there.

Next, if you are talking about research to generate algorithms for some
situation, then you aren't talking about production code. Disposable code
doesn't need TDD. Once you have a good algorithm, it will have details that
lead to simple test cases.

> E.g. if I'm writing a version control system, I know exactly what _has_ to
> happen, and I can write the tests.
> If e.g. I'm writing a monitoring code for the car's wheel speed sensors, I
> may have a rock solid idea that e.g. whatever the speeds, the wheels
> always remain in the vertices of a rectangle of the original size. Enter
> sensor noise, wheel spinning, tire inflation and what not. I need lots of
> code just to study what's going on before I can arrive at a sensible
> algorithm.


That's an acceptance test. TDD tests don't give a crap if your code is
acceptable - if it targets wheels or wings. It's just a system to match
lines of code to trivial, nearly useless test cases.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Phlip

2007-09-01, 10:08 pm

Everett M. Greene wrote:

> Have you ever worked in a product R&D environment?


Yes. I helped teach TDD and Python to polymaths who were into declaring
multi-dimensional arrays as void **.

> A lot of concepts
> are taken for a test drive without ever seeing the light of day outside
> the lab. If the product does make it out the door, the original
> concept proving/testing work is probably a very small portion of the
> final product. You want to spend a lot of time and effort producing
> more formalized testing processes for something that has a very low
> probability of ever being used in a production environment?


TDD is faster and easier than debugger-oriented programming.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Ark Khasin

2007-09-01, 10:08 pm

Phlip wrote:

> Next, if you are talking about research to generate algorithms for some
> situation, then you aren't talking about production code. Disposable code
> doesn't need TDD. Once you have a good algorithm, it will have details that
> lead to simple test cases.
>

That's the whole point. I end up with some working prototype code for
which I need to create tests post factum.
Ian Collins

2007-09-01, 10:08 pm

Phlip wrote:
> Ark Khasin wrote:
>
>
> The weakest possible such contraption - yes.
>
>
> If you can think of the next line of code to write, you must perforce be
> able to think of a test case that will fail because the line is not there.
>
> Next, if you are talking about research to generate algorithms for some
> situation, then you aren't talking about production code. Disposable code
> doesn't need TDD. Once you have a good algorithm, it will have details that
> lead to simple test cases.
>

I have found TDD to be a good tool for pointing me at a new algorithm.
It might just be the way I think, but given something I'd forgotten or
was too lazy to look up such as polynomial fitting, I start with a
simple flat line test, then a slope with two points, and so on until I
have a working general solution. I've found a dozen or so tests are
required to pop out a working solution. Given the working tests, the
algorithm can then be optimised.
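
[A sketch of what the first two steps above might look like, assuming a straight-line least-squares fit as the starting point. fit_line(), nearly_equal() and test_fit_line_first_steps() are hypothetical names, and this is not the code the poster wrote.]

```c
#include <stddef.h>

/* Hypothetical unit under test: fit y = slope * x + intercept by least
   squares. Returns 0 on success, -1 if the line is undetermined
   (fewer than two points, or all x values equal). */
int fit_line(const double *x, const double *y, size_t n,
             double *slope, double *intercept)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;

    if (n < 2)
        return -1;
    for (size_t i = 0; i < n; ++i) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    double denom = (double)n * sxx - sx * sx;
    if (denom > -1e-12 && denom < 1e-12)
        return -1;                       /* vertical or single-x data */
    *slope = ((double)n * sxy - sx * sy) / denom;
    *intercept = (sy - *slope * sx) / (double)n;
    return 0;
}

/* Tolerance comparison so the tests do not depend on exact rounding. */
static int nearly_equal(double a, double b)
{
    double d = a - b;
    return d > -1e-9 && d < 1e-9;
}

/* Test 1: flat line. Test 2: slope through two points. Test 3: too few
   points must be rejected. Returns the number of failed checks. */
int test_fit_line_first_steps(void)
{
    int failures = 0;
    double m = 0.0, c = 0.0;

    const double fx[] = {0.0, 1.0, 2.0}, fy[] = {5.0, 5.0, 5.0};
    if (fit_line(fx, fy, 3, &m, &c) != 0 ||
        !nearly_equal(m, 0.0) || !nearly_equal(c, 5.0))
        ++failures;

    const double sx[] = {0.0, 2.0}, sy[] = {1.0, 5.0};
    if (fit_line(sx, sy, 2, &m, &c) != 0 ||
        !nearly_equal(m, 2.0) || !nearly_equal(c, 1.0))
        ++failures;

    if (fit_line(sx, sy, 1, &m, &c) != -1)
        ++failures;

    return failures;
}
```

[In a test-first workflow each of these checks would be written, and seen to fail, before the code that satisfies it. Further tests for higher-order terms would follow the same pattern.]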

--
Ian Collins.
Ian Collins

2007-09-01, 10:08 pm

Everett M. Greene wrote:
> "Phlip" <phlipcpp@yahoo.com> writes:
>
> Have you ever worked in a product R&D environment? A lot of concepts
> are taken for a test drive without ever seeing the light of day outside
> the lab.


We call them spikes, or a proof of concept. Once the concept has been
proven, the code is put to one side and re-written using TDD.

Even these spikes can often be produced faster with TDD; the time saved
by not debugging justifies the more formal approach.

It's unfortunate that us C and C++ programmers are spoiled rotten with
decent debuggers. Try developing something complex in an environment
without one and the benefits of TDD become clear. I do a lot of PHP and
I have never bothered looking for a PHP debugger.

--
Ian Collins.
Phlip

2007-09-01, 10:08 pm

Ark Khasin wrote:

> That's the whole point. I end up with some working prototype code for
> which I need to create tests post factum.


"Unit" tests post-factum. Not "developer tests" that support generating
production code.

Before you create this prototype code, do you _never_ debug it?

When researching, I frequently write disposable code test-free. When I
convert it to production code, I write the tests first. The result is much
cleaner for two reasons: It's a rewrite - that's always cleaner - and it's
super-easy to refactor. Without debugging.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Flash Gordon

2007-09-01, 10:08 pm

Ian Collins wrote, On 01/09/07 08:21:
> Ark Khasin wrote:
>
> It may not appear that way, but it is the reality on any project I
> manage. In all (C++) cases, the tests take less time to run than the
> code takes to build (somewhere between 50 and 100 tests per second,
> unoptimised).


This, however, is not always the case. I've written a function of about
20 lines that IIRC required something like 100-200 tests. The tests took a
similar amount of time to run as the code took to compile. It was doing
maths and there were a *lot* of cases to consider.

For an audit of the entire piece of SW (rather than just that one
function) the customer insisted that we print out all of the module test
specs. The stack of A4 paper produced was a couple of feet tall! Running
that set of tests would take rather more than overnight.

On another project, doing a build of our piece of the SW took 8 hours.
Doing a build of all of the SW for the processor took 48 hours. Add
testing to that for each build...

Some projects are a lot harder than yours.
--
Flash Gordon
Phlip

2007-09-01, 10:08 pm

Ian Collins wrote:

> I have found TDD to be a good tool for pointing me at a new algorithm.
> It might just be the way I think, but given something I'd forgotten or
> was too lazy to look up such as polynomial fitting, I start with a
> simple flat line test, then a slope with two points, and so on until I
> have a working general solution. I've found a dozen or so tests are
> required to pop out a working solution. Given the working tests, the
> algorithm can then be optimised.


If you follow the exact refactoring rules, you'll remove all duplication
before adding the next line of code. I once tried that while generating an
algorithm to draw Roman Numerals, and I discovered that the outcome was
sensitive to one of my early refactors. The design I got sucked; it was
harder to code over time, not easier. I had to roll the entire process back
to that refactor, try it the other way, and _this_ time the correct
algorithm popped out.

TDD is a very good way to force a clean design to emerge, following simple
and known algorithms. But it's not a general-purpose algorithm generator.
Whoever discovers _that_ gets to go to the top of the food chain.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Ian Collins

2007-09-01, 10:08 pm

Phlip wrote:
> Ian Collins wrote:
>
>
> If you follow the exact refactoring rules, you'll remove all duplication
> before adding the next line of code. I once tried that while generating an
> algorithm to draw Roman Numerals, and I discovered that the outcome was
> sensitive to one of my early refactors. The design I got sucked; it was
> harder to code over time, not easier. I had to roll the entire process back
> to that refactor, try it the other way, and _this_ time the correct
> algorithm popped out.
>

There you go, the solution is to hold off all but the most trivial
refactoring until the end!

--
Ian Collins.
Ian Collins

2007-09-01, 10:08 pm

Flash Gordon wrote:
> Ian Collins wrote, On 01/09/07 08:21:
>
> This, however, is not always the case. I've written a function of about
> 20 lines that IIRC required something like 100-200 tests. The tests took a
> similar amount of time to run as the code took to compile. It was doing
> maths and there were a *lot* of cases to consider.
>

Um, I've never seen one like that before, probably because TDD doesn't
yield that type of code.

> For an audit of the entire piece of SW (rather than just that one
> function) the customer insisted that we print out all of the module test
> specs. The stack of A4 paper produced was a couple of feet tall!


Sounds like the US DOD, I'm sure they just weigh or measure
documentation rather than read it!

>
> On another project, doing a build of our piece of the SW took 8 hours.
> Doing a build of all of the SW for the processor took 48 hours. Add
> testing to that for each build...
>

Those were the days. Thank goodness for fast CPUs and distributed building.

--
Ian Collins.
Phlip

2007-09-01, 10:08 pm

Ian Collins wrote:

> There you go, the solution is to hold off all but the most trivial
> refactoring until the end!


That's a joke, guys.

When creating production code, not when researching, after passing a test,
try to simplify, and go in order from easy to hard refactors. Never try a
hard refactor first if there's an easy one available in the neighborhood.

The only exception is renaming things. Name them after their roles
stabilize!

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax


Flash Gordon

2007-09-02, 4:37 am

Ian Collins wrote, On 02/09/07 01:35:
> Flash Gordon wrote:
>
> Um, I've never seen one like that before, probably because TDD doesn't
> yield that type of code.


That is only true if it produces more complex code. The 20 odd lines of
code resulted from a requirement to implement two simple looking
equations and one simple statement in English. The reason it was so many
test cases was that it was dealing with one angle in the range +/- 270
degrees, one in the range +/-170 degrees and two in the range +/1 6
degrees. The testing had to verify behaviour with every angle in each
quadrant, every angle at 0, 90 etc, every angle just either side etc. It
was the *maths* together with the chances of selecting the wrong
solution from the trig that meant a lot of test cases, not the
complexity of the code.
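
[A rough sketch of how the case count grows, not the actual test specification. boundary_angles() and push() are hypothetical names. The choice of points (range limits, each multiple of 90 degrees with a neighbour on either side, and one point inside each quadrant) is an assumption about what "every quadrant, every axis, just either side" means.]

```c
#include <stddef.h>

/* Append v to out[] if there is room; returns the updated count. */
static size_t push(double *out, size_t n, size_t cap, double v)
{
    if (n >= cap)
        return n;
    out[n] = v;
    return n + 1;
}

/* Fill out[] with test angles for the range [-limit, +limit] degrees.
   Assumes 0 < eps < 45 and limit > eps. Returns the number written. */
size_t boundary_angles(double limit, double eps, double *out, size_t cap)
{
    size_t n = 0;
    int kmax = (int)(limit / 45.0);

    /* The range limits and the points just inside them. */
    n = push(out, n, cap, -limit);
    n = push(out, n, cap, -limit + eps);
    n = push(out, n, cap, limit - eps);
    n = push(out, n, cap, limit);

    /* Walk multiples of 45 degrees strictly inside the range. */
    for (int k = -kmax; k <= kmax; ++k) {
        double v = 45.0 * k;
        if (!(v > -limit && v < limit))
            continue;
        if (k % 2 == 0) {                 /* on an axis: 0, +/-90, ... */
            n = push(out, n, cap, v - eps);
            n = push(out, n, cap, v);
            n = push(out, n, cap, v + eps);
        } else {                          /* quadrant midpoint */
            n = push(out, n, cap, v);
        }
    }
    return n;
}
```

[With eps = 0.5 this yields 25 angles for the +/-270 range and 17 for the +/-170 range, so those two inputs alone give 425 combinations before any pruning. Under these assumptions, a final count in the hundreds for a 20-line function is plausible.]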

>
> Sounds like the US DOD, I'm sure they just weigh or measure
> documentation rather than read it!


You guessed right, but the important thing is the quantity of tests and
therefore the time it would take to run them all.

> Those were the days. Thank goodness for fast CPUs and distributed building.


Also the larger projects which tie up the processors for just as long
because they are so much more complex.

I know there is still SW that takes hours to build because within the
last few years I have done builds that have taken hours.
--
Flash Gordon
Flash Gordon

2007-09-02, 4:37 am

Phlip wrote, On 02/09/07 01:36:
> Ian Collins wrote:
>
>
> That's a joke, guys.
>
> When creating production code, not when researching, after passing a test,
> try to simplify, and go in order from easy to hard refactors. Never try a
> hard refactor first if there's an easy one available in the neighborhood.
>
> The only exception is renaming things. Name them after their roles
> stabilize!


Sometimes when code has been "hacked together" over time the only way to
get a clean design is to start from scratch and design it based on what
you have learned.

There is no one set of rules that is always correct.
--
Flash Gordon
Ian Collins

2007-09-02, 4:37 am

Flash Gordon wrote:
>
> I know there is still SW that takes hours to build because within the
> last few years I have done builds that have taken hours.


Must be huge; the biggest thing I build regularly is the OpenSolaris
code base, which takes about 40 minutes on my box.

If a build takes too long, throw more cores at it. If the tools don't
support distributed building, change the tools.

--
Ian Collins.
Ark Khasin

2007-09-03, 7:17 pm

Ark Khasin wrote: