Super Contributor.

What is accepted as RANDOM function seed value?

The RANDOM function documentation page doesn't give sufficient information on the range of accepted values for argument-1.

Perhaps someone can shed light on my following two questions:

  1. What range of seed values is accepted?
  2. How are values that exceed the allowed range treated?
      »  Are they truncated?
      »  Are only the lower bits used?
      »  Or the upper bits?
      »  Are the leftmost digits used?
      »  Or the rightmost?
      »  How many of them?
      »  Is a `MOD` function applied to the seed value?

It is common to use the current date/time as a seed value for RANDOM to get a "truly pseudo-random" sequence. So I used the code below to create an appropriate seed value. But I have no clue how the seed value I'm providing to the RANDOM function is processed by the compiler.

If the compiler took only the first few digits (which is the date part), then the program would produce the same random sequence every time it is run on a given day. That's not what I'm looking for.

Your answers are appreciated.

------------------------------------

WORKING-STORAGE SECTION.
  01 dateTimeString PIC X(16).
  01 dateTime PIC 9(16) USAGE IS COMP VALUE ZEROS.
  01 vNumber PIC 9(4).

PROCEDURE DIVISION.
  *> Create random seed value
  MOVE FUNCTION CURRENT-DATE TO dateTimeString
  MOVE FUNCTION NUMVAL(dateTimeString) TO dateTime
  COMPUTE vNumber = FUNCTION RANDOM(dateTime) * 700
  .

Micro Focus Expert

Re: What is accepted as RANDOM function seed value?

The Micro Focus COBOL runtime implements the intrinsic function RANDOM per the COBOL specification. The specification restricts the seed parameter to non-negative integers that can be represented by numeric-class elementary data items.

The standard also makes certain guarantees about the semantics of seed values and about the distribution of the PRNG output.

Anything beyond that is an implementation detail which may change without notice, and cannot be relied on in portable code.

I took a quick look at the current implementation and it doesn't appear to discard any of the information entropy from the permitted range of seed values in updating the internal state. So all (settable) bits of the seed value should affect the sequence.

Your specific use case should be easy enough to test probabilistically, in Monte Carlo fashion, if you're still concerned. Personally I can't think of many applications where I'd want to use a timestamp as a seed - either I want reproducible results, or I want unpredictable (under some threat model) ones - but I don't know what your application is.
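A rough sketch of such a test, under assumptions: the names (SEEDBITS, seedLow, seedHigh, matchCount) are made up for illustration, COMP-2 is assumed to be available for the floating-point results, and nothing here describes how the runtime works internally. It seeds RANDOM with pairs of values that differ only above bit 31 and counts how often the first value returned is identical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEEDBITS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Hypothetical names; 18 digits forces a 64-bit binary item.
       01 seedLow     PIC 9(18) COMP.
       01 seedHigh    PIC 9(18) COMP.
       01 valueLow    COMP-2.
       01 valueHigh   COMP-2.
       01 matchCount  PIC 9(4) VALUE ZERO.
       PROCEDURE DIVISION.
      *> seedHigh = seedLow + 2**32, so the low 32 bits are equal.
           PERFORM VARYING seedLow FROM 1 BY 1 UNTIL seedLow > 1000
               COMPUTE seedHigh = seedLow + 4294967296
               COMPUTE valueLow  = FUNCTION RANDOM(seedLow)
               COMPUTE valueHigh = FUNCTION RANDOM(seedHigh)
               IF valueLow = valueHigh
                   ADD 1 TO matchCount
               END-IF
           END-PERFORM
           DISPLAY "Identical first values: " matchCount " of 1000"
           STOP RUN.
```

If nearly all pairs match, the high-order bits are probably being ignored on that target; if almost none match, they evidently influence the sequence. Changing the added constant (for example to a power of ten) would probe decimal digits instead of bits.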

Out of curiosity, what difference do you posit between "high and low" bits, on the one hand, and "left and right" bits on the other?

Super Contributor.

Re: What is accepted as RANDOM function seed value?

Please pardon me for replying so late.

OK, let's see if I correctly got your point ...

Let's first imagine I'm supposed to write a program that's generating cyphers to generate a certificate, or a blockchain. Or, better, some winning numbers. An official lottery program perhaps. A program that's in a way resembling something like this. The program to write has to pass the revision dept.'s judgement of generating new, unique, "true" random numbers whenever it's called, no matter how often or at which time of day.

 

You wrote that "all (settable) bits of the seed value should affect the sequence".

So, you suggest that if I declared the seed value to be, for instance, PIC 9(50) USAGE IS DISPLAY, which is an amazingly large number range, like this:

[image: large number block, dark.png]

 

... then each of these 50 digits is considered for calculating the seed value?

 

This COBOL tutorial document claims that, no matter how large the seed value is declared, only the rightmost 31 bits will be considered for calculating the seed value.

 

That's quite a huge difference in result, because if I used a seed value large enough to hold more than 31 bits, and if I incidentally provided values only varying in the MSBs "left to" these significant 31 LSBs, then I wouldn't get random sequences at all.

The same is true, of course, if the COBOL compiler took only the leftmost bits of a provided seed value:

[image: RANDOM Seed (short).gif]

See the difference?

 

That's why I believe it's important to know the range of digits (or values) accepted by the seed value. It's also important to know if a hash function is getting applied to the provided seed value to shorten its accepted range. If, for instance, a plain FUNCTION MOD would be getting applied, risk is high that attackers easily might predict future random sequences.

 

"Out of curiosity, what difference do you posit between "high and low" bits, on the one hand, and "left and right" bits on the other?"

The difference was that "high and low" dealt with bits (binary system) while "left and right" dealt with digits (decimal system). A decimal value like "PIC 9(7) USAGE IS COMP VALUE 8000000000" may internally be trimmed if assigned to a 32 bit register.  As I don't know yet how seed values are actually fed to the RANDOM function, I needed to distinguish between both numerical systems.

 

So, the question still is: What range of values has significance on the RANDOM seed?

Micro Focus Expert

Re: What is accepted as RANDOM function seed value?

Sigh.


@BlackKnight wrote:

"Let's first imagine I'm supposed to write a program that's generating cyphers to generate a certificate, or a blockchain."

Stop right there, and get a professional cryptographer. Or, better, get a professional cryptographic engineer, who can implement a standard cryptosystem with standard, vetted ciphers and other cryptographic primitives, with meaningful security guarantees under an explicit threat model that's relevant to the actual use case.

COBOL's RANDOM intrinsic function is not a cryptographic pseudorandom number generator, and using it for any cryptographic purpose is an extremely bad idea.

"Or, better, some winning numbers. An official lottery program perhaps."

An even worse idea.

"The program to write has to pass the revision dept.'s judgement of generating new, unique, 'true' random numbers whenever it's called, no matter how often or at which time of day."

You're Doing It Wrong.

"You wrote that 'all (settable) bits of the seed value should affect the sequence'."

"So, you suggest that if I declared the seed value to be, for instance, PIC 9(50) USAGE IS DISPLAY,"

That's not a valid "numeric-class elementary data item", as I also specified.

"which is an amazingly large number range, like this:"

No, it's an error. (As for what constitutes "amazingly large" ... a 50-decimal-digit number has less than 167 bits of entropy. In crypto we eat that sort of thing for breakfast.)

"This COBOL tutorial document claims that, no matter how large the seed value is declared, only the rightmost 31 bits will be considered for calculating the seed value."

That COBOL tutorial document does not determine the behavior of Micro Focus COBOL.

 

"That's quite a huge difference in result, because if I used a seed value large enough to hold more than 31 bits, and if I incidentally provided values only varying in the MSBs 'left to' these significant 31 LSBs, then I wouldn't get random sequences at all."

Well, it's a "huge difference" mostly because one of those cases is an error. But yes, obviously, if you could specify a seed larger than the internal state of the PRNG, then it's possible (though not necessary) that some bits of the seed would not affect the PRNG's internal state.

(Why not necessary? Because the PRNG could hash all the [significant] bits of the seed so that they all contributed some nonzero amount of information entropy to the state, or by updating the internal state with bits or blocks of an arbitrarily-long seed stream, which comes to the same thing. A PRNG is not forced to discard seed entropy beyond the size of its state, though of course the total entropy of the PRNG is restricted to the size of the state, so by the pigeonhole principle there are an infinite number of equivalent seeds for at least one state.)

"That's why I believe it's important to know the range of digits (or values) accepted by the seed value."

I understand why you think that. I've explained what's guaranteed by the system, and anything beyond that is an internal implementation detail.

"It's also important to know if a hash function is getting applied to the provided seed value to shorten its accepted range. If, for instance, a plain FUNCTION MOD would be getting applied, risk is high that attackers easily might predict future random sequences."

Your threat model isn't entirely clear to me, but it certainly sounds like you're using the RANDOM intrinsic for something inappropriate.

Nonetheless, the answer to this (implied) question is present in my previous post. The seed must be a non-negative numeric-class elementary data item. If it is, then all bits affect the internal state of the PRNG.

For Micro Focus COBOL, that means 64 bits; that's the maximum size of (the internal representation of) a numeric-class elementary data item.

"Out of curiosity, what difference do you posit between "high and low" bits, on the one hand, and "left and right" bits on the other?"

"The difference was that 'high and low' dealt with bits (binary system) while 'left and right' dealt with digits (decimal system)."

Ah, I see. While that's certainly an interesting nomenclature you have there, you might want to be a bit more explicit in the future. And, perhaps, don't immediately jump to the conclusion that you know more about the subject than your interlocutors do. Just a suggestion.

I might point out that there is a vast collection of PRNGs, suited for different use cases, available as open source, in libraries, and so on. If you need particular parameterizations, distributions, behavioral guarantees, security guarantees, or what have you, then you'd do well to pick one that explicitly provides those attributes. You can't swing a seed value without hitting an implementation of some Mersenne Twister variant. It wouldn't be hard to implement a CMWC generator even in COBOL. You could use an entropy pool and iterated HMAC or a PBKDF, depending on the details of your use case (or, though this is still cause for concern, your threat model). Rather than worry about what the RANDOM intrinsic provides beyond what the Standard specifies, use a generator that gives you what you think you want.

Super Contributor.

Re: What is accepted as RANDOM function seed value?

I'm baffled you're using so many words to circumvent an appropriate answer and sharing your personal bias with me. tl;dr

I tried to explain why I believe it is important to know the accepted range of values for the RANDOM seed. I'm not writing here to discuss your personal point of view. I'm not even interested. If you're not inclined to answer appropriately, you better leave this thread.

 

So, once again:

What is the valid range of values evaluated as a seed value by the RANDOM function?

Micro Focus Expert

Re: What is accepted as RANDOM function seed value?

I'm not too sure about the native case but, when compiling for .NET or JVM, the seed value is truncated to a 32-bit integer before being passed to a run-time method. In the case of .NET, this run-time method uses the framework class System.Random, and in the case of JVM, it uses java.util.Random.
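With a 32-bit truncation in mind, one possible way to keep the fast-changing part of a timestamp inside the bits that survive is to seed with hundredths of a second since midnight rather than with the full date/time number. The following is only a sketch: the names (SEED31, currentDate, seedValue and the cd... fields) are invented for illustration, and it assumes the usual 21-character layout of FUNCTION CURRENT-DATE.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEED31.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Layout of CURRENT-DATE: YYYYMMDD HHMMSScc +hhmm (21 chars).
       01 currentDate.
          05 FILLER        PIC X(8).
          05 cdHours       PIC 9(2).
          05 cdMinutes     PIC 9(2).
          05 cdSeconds     PIC 9(2).
          05 cdHundredths  PIC 9(2).
          05 FILLER        PIC X(5).
       01 seedValue        PIC 9(9) COMP.
       01 vNumber          PIC 9(4).
       PROCEDURE DIVISION.
           MOVE FUNCTION CURRENT-DATE TO currentDate
      *> Hundredths of a second since midnight, at most 8639999.
           COMPUTE seedValue =
               ((cdHours * 60 + cdMinutes) * 60 + cdSeconds) * 100
               + cdHundredths
           COMPUTE vNumber = FUNCTION RANDOM(seedValue) * 700
           DISPLAY "seed=" seedValue " value=" vNumber
           STOP RUN.
```

The largest seed this produces is 8,639,999, well below 2^31, so no significant bits would be lost to a 32-bit truncation. The trade-off is that a run at the same hundredth of a second on a different day gets the same seed.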

 

Super Contributor.

Re: What is accepted as RANDOM function seed value?

Thank you, Robert. I very much appreciate your valuable answer. 👍

So, when I create a "COBOL Console Application" project in Visual COBOL for Visual Studio 2019, is the .NET Random API used then? Or is the native COBOL Random generator being used?
Micro Focus Expert

Re: What is accepted as RANDOM function seed value?

If you choose Console Application (.NET framework) then the .NET Random class is used.  However if you choose Console Application - A project for creating a native command-line application..., then that will not be using any .NET functionality, and the Random function will be evaluated using code in the MF native COBOL runtime.  In this latter case, I'm not sure of exactly how that random function is evaluated.

 
