A while ago we had a similar query about the weighted list generator, where it behaved unexpectedly even in a simple case. For instance, if you generated 10 rows of the values x, y and z with weights of 20, 20 and 60, you'd expect to get 2 x, 2 y and 6 z, but it often wouldn't produce exactly that.
I queried it with the developers and apparently it's working as designed. In their words: "The values are generated at random using the weightings. Not generated in the weighted ratio then randomized."
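To make the difference concrete, here is a rough sketch in plain Python (not the Data Generator's own code, which I haven't seen). The function names are mine, and I'm assuming "generated at random using the weightings" means each row is an independent weighted draw. The strict version also assumes the row count divides evenly by the weights.

```python
import random
from collections import Counter

values = ["x", "y", "z"]
weights = [20, 20, 60]

def weighted_draws(n, rng):
    # Each row is an independent weighted draw, so the counts
    # only approach 20/20/60 on average and vary between runs.
    return rng.choices(values, weights=weights, k=n)

def strict_ratio(n, rng):
    # Build the exact number of each value first, then shuffle the order.
    # Assumes n * weight is divisible by the total weight for every value.
    total = sum(weights)
    rows = []
    for value, weight in zip(values, weights):
        rows.extend([value] * (n * weight // total))
    rng.shuffle(rows)
    return rows

rng = random.Random()
random_counts = Counter(weighted_draws(10, rng))  # may or may not be 2/2/6
strict_counts = Counter(strict_ratio(10, rng))    # always 2 x, 2 y, 6 z
```

With only 10 rows the first approach will frequently miss 2/2/6, which matches what we saw. Over many thousands of rows the proportions settle close to the weights.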
As for how it works, it seems both ratios and percentages should be feasible, as the popup help states:
For example, if you enter 2 for value Yes and 1 for value No, Yes will occur twice as many times as No in the selected column.
To specify as percentages, ensure all the weight ratios add up to 100.
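If I'm reading the help text right, only the relative size of the weights matters, so percentages are just ratios that happen to add up to 100. That is my interpretation rather than something the developers confirmed. A small sketch of the arithmetic, with `normalise` being a helper name I made up:

```python
from fractions import Fraction

def normalise(weights):
    # Turn raw weights into the share of rows each value should get.
    total = sum(weights.values())
    return {value: Fraction(w, total) for value, w in weights.items()}

yes_no = normalise({"Yes": 2, "No": 1})               # Yes gets 2/3, No gets 1/3
as_ratio = normalise({"x": 2, "y": 2, "z": 6})        # 1/5, 1/5, 3/5
as_percent = normalise({"x": 20, "y": 20, "z": 60})   # also 1/5, 1/5, 3/5
```

So entering 2, 2, 6 or 20, 20, 60 should describe the same distribution under that reading.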
The new version of Data Generator has an option to use a Python script as a generator, and the developers were kind enough to produce a sample that should lead to a more predictable result, which I've pasted below. Hopefully it's of some use, although I see you're actually working with a CSV file of values, so I'm not sure how easily you'll be able to convert it across.
Code:
# Python script to generate strings in a strict ratio
__randomize__ = True
# Each entry is a (string, weight) pair
weightedStrings = (('xxx', 2), ('yyy', 2), ('zzz', 6))
for i in range(n_rows):
    for item in weightedStrings:
        string = item[0]  # the value to output
        weight = item[1]  # how many times to repeat it per cycle
        for i in range(weight):