LLMs and UUIDs

Continuing the discussion from DrWatson Common Code as Module Issue:

This prompted me to ask ChatGPT for some UUIDs and it happily “generated” 6 for me. All of them were valid UUIDs and none of them turned up anything in a Google search. So apparently ChatGPT is able to generate UUIDs in principle. Of course the next question is: are these random enough? Is it a good idea to use ChatGPT in practice to generate UUIDs? :smiley: (Yes, I know it is a horribly inefficient way of generating them…)
Does anybody know about research in that direction? I.e. how random are the responses of LLMs when asked for some random output?

Funnily enough I also asked it to give me 5 UUIDs that are already in use and the first 2 of the 5 I got are actually widely used:

  • 550e8400-e29b-41d4-a716-446655440000 is apparently a default example of a UUID, used in the documentation of a few projects and, interestingly enough, also the name of some file in the Apache Accumulo project
  • 123e4567-e89b-12d3-a456-426614174000 is used as an example in the Wikipedia article on UUIDs
  • I could not find anything for the other 3 UUIDs (but they were valid UUIDs)
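
For anyone who wants to repeat the validity check, here is a minimal sketch using Python's standard `uuid` module. The helper name `describe` and the `"not-a-uuid"` test string are my own illustration, not part of the ChatGPT output. Note that it only checks the format and the version field a UUID claims; it says nothing about how random the remaining bits are.

```python
import uuid

# The two well-known UUIDs from the list above, plus one deliberately
# malformed string (hypothetical, added only for contrast).
candidates = [
    "550e8400-e29b-41d4-a716-446655440000",
    "123e4567-e89b-12d3-a456-426614174000",
    "not-a-uuid",
]


def describe(text):
    """Return (is_valid, version) for a candidate UUID string.

    uuid.UUID is lenient about formatting (it also accepts strings
    without hyphens), so "valid" here means "parses as 32 hex digits".
    """
    try:
        parsed = uuid.UUID(text)
    except ValueError:
        return (False, None)
    # .version is None unless the variant bits follow RFC 4122.
    return (True, parsed.version)


for text in candidates:
    print(text, describe(text))
```

The first UUID reports version 4 (the "random" variety) and the second version 1 (time-based), so a version-4 label alone is no evidence that the bits were actually drawn at random.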

As I would expect, it works poorly. This is the prompt I used:

generate 100 uniformly random integers from a range starting from 3128728 to 3128738 extreme included, without using Python, just generate them

We probably need more samples to be sure, but the sample it provided has the following occurrence counts:

  3128735 => 10
  3128736 => 9
  3128731 => 9
  3128738 => 9
  3128728 => 9
  3128729 => 8
  3128733 => 9
  3128734 => 9
  3128732 => 9
  3128730 => 10
  3128737 => 9

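which is very unlikely for a truly uniform sample: with 11 possible values and 100 draws the expected count is about 9.1 each, and here every value appears between 8 and 10 times, far more evenly than random draws would.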
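
To put a rough number on "very unlikely", here is a sketch of a chi-square goodness-of-fit check on the counts above, using only the standard library. The function names are my own, and the chi-square distribution is only an approximation to the exact multinomial probabilities, so treat the result as an order of magnitude. Unlike the usual test, it looks at the lower tail, i.e. the probability that a uniform sample comes out at least this even.

```python
import math

# Occurrence counts from the post above (11 values, 100 draws in total).
counts = {
    3128728: 9, 3128729: 8, 3128730: 10, 3128731: 9, 3128732: 9,
    3128733: 9, 3128734: 9, 3128735: 10, 3128736: 9, 3128737: 9,
    3128738: 9,
}


def chi_square_statistic(observed):
    """Pearson chi-square statistic against a uniform expectation."""
    n = sum(observed)
    expected = n / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_lower_tail(x, df):
    """P(X <= x) for a chi-square variable with an even number of df.

    Uses the identity P(X <= x) = P(Poisson(x / 2) >= df / 2), summed as
    a series. The fixed 200 terms are plenty for small x like the one
    here, but this is not a general-purpose implementation.
    """
    if df % 2 != 0:
        raise ValueError("this sketch only handles even degrees of freedom")
    k = df // 2
    y = x / 2
    term = y ** k / math.factorial(k)
    total = 0.0
    for i in range(k, k + 200):
        total += term
        term *= y / (i + 1)
    return math.exp(-y) * total


stat = chi_square_statistic(list(counts.values()))
df = len(counts) - 1  # 11 categories -> 10 degrees of freedom
p_too_even = chi_square_lower_tail(stat, df)
print(f"chi-square statistic: {stat:.2f}")
print(f"P(a uniform sample is at least this even): {p_too_even:.1e}")
```

Under these assumptions the statistic comes out at about 0.32 and the lower-tail probability is on the order of one in a million, so the sample looks "too uniform" rather than random.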


Hahaha, these are the numbers you’d get if you asked someone who had just started with math, like in elementary school or something. They are as random as 7 is a random number between 1 and 10 :smiley:
