How to improve a Generator to be more memory efficient when it is collected?

The ATG at the start matches the start codon. Then (?: )*? repeats whatever is inside it 0 or more times, but lazily (that’s the ? after the *), so it skips as few codons as possible and stops at the first place where the next group matches. The next group is three string options covering the stop codons. The [AGTC] repeated three times matches one codon, since each square-bracket class matches a single letter from inside it.

So actually, it could be even simpler:
r"ATG(?:[AGTC]{3})*?T(AG|AA|GA)"
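To see the lazy match in action, here is a minimal sketch that assumes Python and its standard re module (the thread does not say which language is in use; the constructs in this pattern behave the same way in PCRE2). The names ORF_PATTERN and find_orfs and the sample sequence are made up for illustration. It uses finditer, which hands back matches one at a time instead of building the whole list up front, so memory is only spent if you collect the results.

```python
import re

# Same pattern as in the post: start codon, lazily repeated codons,
# then one of the stop codons TAG, TAA or TGA.
ORF_PATTERN = re.compile(r"ATG(?:[AGTC]{3})*?T(AG|AA|GA)")


def find_orfs(sequence):
    """Lazily yield (start, end, text) for each non-overlapping match."""
    for match in ORF_PATTERN.finditer(sequence):
        yield match.start(), match.end(), match.group(0)


# Hypothetical sample sequence containing two short reading frames.
dna = "CCATGAAATAGGGATGCCCTGATT"

for start, end, orf in find_orfs(dna):
    print(start, end, orf)
# prints:
# 2 11 ATGAAATAG
# 13 22 ATGCCCTGA
```

Because the repetition is lazy, each match ends at the first in-frame stop codon after its ATG rather than at the last one in the sequence.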

This page lists the full PCRE2 regular expression syntax, both common and uncommon:
https://www.pcre.org/current/doc/html/pcre2syntax.html
