
Default Arbitrary instances for e.g. long return very small values? #9

Open

BardurArantsson opened this issue Apr 1, 2018 · 2 comments

BardurArantsson commented Apr 1, 2018

It seems that the default Arbitrary instances for integral types return very small values. Here's the small prop I've been using to test:

#include <iostream>

#include "cppqc.h"

template <typename T>
struct ReproducerProp: cppqc::Property<T>
{
	bool check(const T &expected) const override
	{
		std::cout << expected << std::endl;
		if (expected > 255) {
			return false;
		}
		if (expected < -255) {
			return false;
		}
		return true;
	}
};

I would certainly have expected this to fail at least some of the time, but I've done ~20 runs (each with 100 tests) and it hasn't failed so far.

It seems very odd that it's not using the full range of the type by default.

(Generally speaking, it might also be a good idea to always include the min/max/0 values of any given type in the set of integers tested, but that's another issue.)

EDIT: Oh, the runner code is

	quickCheckOutput(ReproducerProp<long>());

Tested current master.

BardurArantsson (author) commented

Checking the implementation, I see that it's using arbitrarySizedBoundedIntegral which seems very ad-hoc. Is there any particular reason that this is the default?

philipp-classen (owner) commented Apr 2, 2018

Good example. Yes, it will currently always pass, and I agree that it is not the behavior that you would expect.

The problem is that, by default, the numbers are constrained by a "size" parameter, which is increased during the search phase but is currently capped at 100. That is why it will not find a counter-example. Yes, that needs to be fixed.
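To make the failure mode concrete, here is a minimal sketch of the size-bounded strategy (the name generateSizedLong is made up for illustration; this is not the actual cppqc code): the value is drawn uniformly from [-size, size], and the driver never raises "size" beyond 100.

```cpp
#include <random>

// Hypothetical sketch of a size-bounded generator (illustrative only,
// not cppqc's implementation): the result is drawn uniformly from
// [-size, size], so with size capped at 100 the output can never
// leave [-100, 100].
long generateSizedLong(std::mt19937& rng, int size) {
  std::uniform_int_distribution<long> dist(-size, size);
  return dist(rng);
}
```

With size == 100, every value this produces already satisfies -255 <= v <= 255, so the reproducer property above can never fail.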

Longer answer:

Not being the original author of the library, I have to guess a bit about why it is so. The arbitrarySizedBoundedIntegral function was clearly inspired by the original Haskell implementation, which has this mysterious comment (source):

-- | Generates an integral number from a bounded domain. The number is
-- chosen from the entire range of the type, but small numbers are
-- generated more often than big numbers. Inspired by demands from
-- Phil Wadler.
arbitrarySizedBoundedIntegral :: (Bounded a, Integral a) => Gen a
arbitrarySizedBoundedIntegral =
  withBounds $ \mn mx ->
  sized $ \s ->
    do let bits n | n == 0 = 0
                  | otherwise = 1 + bits (n `quot` 2)
           k  = 2^(s*(bits mn `max` bits mx `max` 40) `div` 80)
       n <- choose (toInteger mn `max` (-k), toInteger mx `min` k)
       return (fromInteger n)

Recently, I modified the function a bit (310f21b#diff-f0468744ea9948df8c9d0dbdfb4343daR70) to remove more Boost dependencies. The old implementation returned values whose absolute values were near the "size" parameter. For instance, if you passed 1000 for "size", it returned values like 994, 1002, etc. Not sure if that was intended.

The new implementation returns values in the interval [-size, size], with values closer to 0 a bit more likely. From what I saw, it performs no worse than the old implementation. Initially I had an implementation that returned larger numbers (outside [-size, size]), but it performed worse in some examples I looked at, which is why I chose the current implementation; it seemed like the safer bet at the time.
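The "closer to 0 is more likely" behavior can be imitated by first drawing a bound uniformly from [0, size] and then drawing the value uniformly from [-bound, bound]: small magnitudes are reachable under every bound, while large magnitudes need a large bound, so they come up less often. This is only an illustration of the idea, not the code from 310f21b; the function name is invented.

```cpp
#include <random>

// Illustrative zero-biased generator (not cppqc's actual code).
// Values always lie in [-size, size], but magnitudes near 0 are
// more likely, because every bound b in [0, size] can produce
// them while only large bounds can produce large magnitudes.
long generateZeroBiasedLong(std::mt19937& rng, int size) {
  std::uniform_int_distribution<int> boundDist(0, size);
  const int bound = boundDist(rng);
  std::uniform_int_distribution<long> valueDist(-bound, bound);
  return valueDist(rng);
}
```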

Replacing it by some better function, which can return values outside [-size, size] and possibly switching the default is something that would make sense in my opinion, but it is also something that has to be checked carefully as it can easily lead to worse behavior in other examples.

Maybe a more robust way would be a hybrid approach: for x% of the tests, keep the [-size, size], close-to-0 strategy, and for the rest choose uniformly from the whole range of the type. I have also thought about explicitly trying the lowest and highest values, as you proposed.
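The hybrid idea could be sketched like this (the proportions and the function name are made up for illustration, not a cppqc proposal): mostly keep the size-bounded draw, occasionally draw uniformly from the full range of the type, and occasionally emit the min/max/0 edge cases.

```cpp
#include <limits>
#include <random>

// Hypothetical hybrid generator; percentages are arbitrary choices.
long generateHybridLong(std::mt19937_64& rng, int size) {
  std::uniform_int_distribution<int> pick(0, 9);
  const int p = pick(rng);
  if (p < 7) {
    // ~70%: the current strategy, confined to [-size, size].
    std::uniform_int_distribution<long> sized(-size, size);
    return sized(rng);
  }
  if (p < 9) {
    // ~20%: uniform over the whole range of the type.
    std::uniform_int_distribution<long> full(
        std::numeric_limits<long>::min(), std::numeric_limits<long>::max());
    return full(rng);
  }
  // ~10%: explicit edge cases, as proposed above.
  static const long edges[] = {std::numeric_limits<long>::min(), 0,
                               std::numeric_limits<long>::max()};
  std::uniform_int_distribution<int> e(0, 2);
  return edges[e(rng)];
}
```

A generator along these lines would find the counter-example from this issue quickly, since almost every full-range or edge-case draw violates |v| <= 255.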
