Going from Python to Julia

I really want to learn Julia to replace what I do in Python. I tried to write some basic things and kept getting errors I couldn’t resolve. This is an example of what I do in Python:

import sys
import time

# file, file2 and outfile are path strings defined elsewhere
cd={}  # chromosome -> list of positions seen in file2
opt=''
linecounter=0
oldtime=time.time()
with open(file2) as fh:
	for line in fh:
		if '#' not in line:
			ll=line.split('\t')
			chromosome=ll[0].replace('chr','')
			if chromosome not in cd:
				cd[chromosome]=[ll[1]]
			else:
				cd[chromosome].append(ll[1])
with open(file) as fh:
	for line in fh:
		if '#' not in line:
			linecounter+=1
			# report progress every 100k lines
			if linecounter % 100000 == 0:
				newtime=time.time()
				timeelapsed = newtime-oldtime
				minutes=timeelapsed / 60
				displayminutes=format(minutes, '.2f')
				sys.stdout.write(str(linecounter)+' total SNPs processed, last 100k in '+str(displayminutes)+' minutes\n')
				sys.stdout.flush()
				oldtime=newtime
			ll = line.split('\t')
			chromosome=ll[0].replace('chr','')
			# keep lines whose chromosome/position pair is not in file2
			if chromosome not in cd:
				opt+=line
			elif ll[1] not in cd[chromosome]:
				opt+=line
with open(outfile,'w') as fh:
	fh.write(opt)

Hello and welcome!

You’re more likely to get useful help here if you can be clear and specific about what exactly you were trying to do, what the error was, and what you’ve already tried to do to resolve it. Can you help us help you?


xref: https://stackoverflow.com/questions/60570911/learning-julia-to-replace-python


Sure, thanks. To get some of the basics down, I tried a simple iteration through a file:

open("t_filter.vcf") do f
	line=0
	for i in eachline(f)
		line+=1
	end
	global line
end

What I’m trying to learn is how to build a dictionary from one file, compare it against a second file, and write out the lines that don’t intersect. In other words: read through a file, split each line on tabs, build a dictionary, iterate over another file, and write a new file. The issues I’ve run into involve global variables and creating dictionaries and lists.
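
For reference, the steps described above might be sketched in Julia roughly as follows. This is only a sketch under assumptions: the names normalize_chrom, read_positions and write_unmatched are made up for illustration, the files are assumed to be tab-separated with the chromosome in column 1 and the position in column 2, and any line containing '#' is skipped, as in the Python version.

```julia
# Illustrative sketch; all function names here are hypothetical.

# Remove "chr" so that "chr1" and "1" compare equal (mirrors the Python replace).
normalize_chrom(name) = replace(name, "chr" => "")

# Build a Dict mapping chromosome => Set of positions from the first file.
function read_positions(path)
    positions = Dict{String,Set{String}}()
    for line in eachline(path)
        occursin('#', line) && continue        # skip header/comment lines
        fields = split(line, '\t')
        length(fields) < 2 && continue         # ignore malformed lines
        chrom = normalize_chrom(fields[1])
        seen = get!(() -> Set{String}(), positions, chrom)
        push!(seen, fields[2])
    end
    return positions
end

# Write every line of `path` whose chromosome/position pair is not in
# `positions` to `outpath`; returns the number of lines written.
function write_unmatched(path, outpath, positions)
    open(outpath, "w") do out
        kept = 0
        for line in eachline(path)
            occursin('#', line) && continue
            fields = split(line, '\t')
            length(fields) < 2 && continue
            seen = get(positions, normalize_chrom(fields[1]), nothing)
            if seen === nothing || !(fields[2] in seen)
                println(out, line)
                kept += 1
            end
        end
        kept    # value of the do-block, which `open` returns
    end
end
```

Because everything lives inside functions, no global declarations are needed. A Set is used for the positions instead of a list so that each membership test does not have to scan the whole collection.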

Have you seen readdlm in DelimitedFiles?
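
In case it helps, a minimal sketch of what that could look like, assuming a tab-separated file whose header lines start with '#'. The helper name read_tsv is made up for illustration; readdlm and its comments keyword come from the DelimitedFiles standard library.

```julia
using DelimitedFiles

# Hypothetical helper: read a tab-separated file into a matrix,
# ignoring any text after '#' (the default comment character).
read_tsv(path) = readdlm(path, '\t'; comments=true)
```

Calling read_tsv("t_filter.vcf") should then return a matrix with one row per data line, so table[:, 1] is the first column. Note that readdlm loads the whole file into memory, which may not suit very large files.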

Could you show an example of something you tried that didn’t do what you expected or returned an error?

Regarding the use of global variables, I’d recommend reading the Scope of Variables section of the Julia manual. The gist of it is that this doesn’t work in a script, or in the REPL on Julia versions before 1.5:

julia> x = 1
1

julia> for i in 1:10
           x = x + i
       end
ERROR: UndefVarError: x not defined
Stacktrace:
 [1] top-level scope at ./REPL[49]:2

but if you want that to do what you expect, you need to mark x as a global variable in the loop like so:

julia> for i in 1:10
           global x = x + i
       end

julia> x
56

Of course, even better would be to just wrap everything in a function. Global variables make many optimizations impossible, and a lot of Julia’s advertised speed assumes you’re using functions rather than writing big chunks of code in the global scope.

Compare this snippet:

julia> using BenchmarkTools

julia> x = 1;

julia> @btime for i in 1:10
           global x = x + 1
       end
  239.515 ns (10 allocations: 160 bytes)

with this:

julia> x = 1;

julia> function f(x)
           for i in 1:10
               x = x + 1
           end
           x
       end
f (generic function with 1 method)

julia> x = 1
1

julia> @btime f(x)
  18.128 ns (0 allocations: 0 bytes)
11

So not only do we consider it to be ‘good style’ to prefer functions, but there are compelling performance reasons to do so.
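
Applied to the line-counting loop from earlier in the thread, the same idea might look like the sketch below. The name count_lines is just an illustration; the counter is local to the do-block, so no global declaration is needed.

```julia
# Count the lines of a file without touching any global variable.
function count_lines(path)
    open(path) do f
        n = 0
        for _ in eachline(f)
            n += 1
        end
        n    # the do-block's value is what `open` returns
    end
end
```

Usage would be something like nlines = count_lines("t_filter.vcf"). Base also has a built-in countlines function for this particular job.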
