Scanning dicts

I have a function in my script that scans one file against another, which I have implemented in awk:
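awk 'FNR==NR {
    a[$2]=$1    # file1: map the text in $2 to its value in $1
    next
}
{
    for (i in a)
        if ($2 ~ i) { $3+=a[i] }    # add the value when pattern i matches $2 of file2
    print    # write the (possibly updated) line to fileout
}' file1 file2 > fileout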

Both file1 and file2 contain string and integer fields.

Can this be solved in Julia more efficiently? How?

Best regards,

alaraints

I’m afraid that is beyond my understanding of awk. Could you explain what your script does?

$ denotes a field in awk's tabular layout: $1 is the first field of a line, $2 the second, and so on.
The first half builds an awk associative array from file1. $2 is a short text such as ADFGT; $1 is the corresponding value, 324 for example. In Julia this would be called a Dict, I guess?
The second half takes the text keys i from the array a, one by one, and looks for a match in $2 of file2, which is a longer string. If a match is found, the corresponding $1 from file1 is added to $3 of the matching line.
file1 has 50k lines and file2 has 2M lines, so it takes ages.
I am currently reading the "Controlling the flow" chapter of Introducing Julia on Wikibooks, and I have got as far as creating arrays. I still have to find the match function and a couple of other things. I am just wondering whether it is worth the effort. Thank you for your interest.

You may find the manual useful, especially on strings and about using the language in general.

Depends on your payoff function. I can’t say whether it would be faster or slower than awk, especially if this is your first Julia program. But if you aim to learn Julia, programming a script like this is not a bad start.
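
For what it is worth, a minimal sketch of the same logic in Julia might look like the following. The function names read_patterns and scan_file are made up for illustration, and the sketch assumes whitespace-separated fields with an integer in column 1 of file1 and in column 3 of file2. Note also that occursin with a plain string does a literal substring test, whereas awk's ~ treats the key as a regular expression, so the two only agree if the keys contain no regex metacharacters. It still tests every key against every line, so I would not expect a dramatic speedup from this alone.

```julia
# Build the lookup table from file1: text in column 2 => integer value in column 1.
# (Assumption: fields are whitespace-separated and column 1 parses as an Int.)
function read_patterns(path)
    patterns = Dict{String,Int}()
    for line in eachline(path)
        fields = split(line)
        length(fields) >= 2 || continue   # skip blank or short lines
        patterns[String(fields[2])] = parse(Int, fields[1])
    end
    return patterns
end

# For each line of file2, add the value of every key found in column 2 to column 3.
# (Assumption: column 3 of file2 is an integer.)
function scan_file(patterns, inpath, outpath)
    open(outpath, "w") do out
        for line in eachline(inpath)
            fields = String.(split(line))
            if length(fields) >= 3
                total = parse(Int, fields[3])
                for (text, value) in patterns
                    # Literal substring test, not a regex match as in awk.
                    if occursin(text, fields[2])
                        total += value
                    end
                end
                fields[3] = string(total)
                println(out, join(fields, " "))
            else
                println(out, line)   # pass short lines through unchanged
            end
        end
    end
end
```

It would be called as scan_file(read_patterns("file1"), "file2", "fileout"). Unlike awk, this rejoins every output line with single spaces, whether or not a key matched.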