Your reasoning is sound, and makes sense in my use case. My documents will probably be at least 50% explanation, and commenting everything would be irritating, I think. Plus I’m familiar with Rmd, so I’ll probably go with Weave. Thanks!
Check out the Julia for Data Science YouTube series that’s on the official Julia channel. A link to the first video in the series is here.
Check out mybinder.org for making your notebooks fully executable in the browser, without the user having to download anything. I created a very basic intro-to-Julia notebook (specifically for colleagues of mine) that I have running on Binder, so you can see what that looks like here: https://mybinder.org/v2/gh/mthelm85/Intro-to-Julia/master
I’ve been very frustrated at the lack of good data analysis/data science content on the web that relies on Julia. There are loads of great courses on a variety of online learning sites that make use of R/Python but almost nothing for Julia. I would be happy to contribute to this project and would be interested in linking up with you to share thoughts/organize an outline for topics to cover.
Yes, it’s very frustrating for someone who knows a bunch about data analysis, in say R or Python, but wants to move to Julia and get up to speed at their former level of knowledge. So I’m hoping to alleviate that and also teach a bit about data analysis.
I am very happy to partner on this. I am really just getting started on the project though. How about I PM you on the forum here, and we can discuss some ideas there, and then feed the more fully formed ones back into this thread?
Feel free to loop me in on this too. I’m developing a course right now (starting next week), so I may not be super available. But I’m still coming up with assignments, so there may be some mutually beneficial work to be done. My stuff will largely be biology focused, but I was planning to work with some covid datasets, so there may be broader interest.
Nice. I have worked with biologists quite a bit over the years. What sort of topics are you working on?
I am in the process of writing the first of these tutorials. It basically downloads a public Census dataset, munges it, and makes a variety of plots to answer very basic questions about the data. Once that’s in a viable form I’ll put a git repo up on GitHub and mention it here, and we can discuss how to build on that foundation in different directions.
I think the “learn by doing” with not too much excess explaining is powerful. I do like to explain, so I’m thinking of having a companion to each tutorial that’s a discussion of why things were done, and why other things weren’t done etc.
I’m also interested in this process. I work for the state of California and I use Julia for some basic data analysis and manipulation. I have some scripts that access a pretty comprehensive database about pesticide use in the state, and I have been meaning to learn more about the process, but also to share some of the stuff I know. @mthelm85, for example, helped me in the past to do some mapping using VegaLite, and I did a scientific presentation with that.
Please do not hesitate to ping me or message me.
I currently study the human microbiome (in kids, looking at relationships with cognitive development). The course will include sequence analysis, using web APIs for biological datasets, phylogenetics, and a bunch of other stuff.
Do you do sequence analysis in Julia btw? What tools are there for this kind of thing?
BioSequences.jl and other stuff in BioJulia, mostly. I don’t do so much of this at the moment, and for this course I plan to do very basic things with strings mostly, or have them implement stuff themselves.
@dlakelan I’m also interested to see what you come up with. I’ve been curious about Julia for many years now, and this summer I may finally have some time to get into it.
Thanks! Right now I have a kind of quick intro tutorial fairly well done in first draft form… I’m now going through and writing the Discussion that goes along with it. The tutorial is supposed to be more of a step-by-step, very readable thing with all the decisions made for you… the Discussion is all about why we chose what we did, what we could have done differently, etc. It’s a little more involved. I’m hoping to get those two both in a decent draft form, then throw them up in my git repo and open up some commentary here. Maybe another few days.
OK, for those who are interested, see the very bare GitHub repo: https://github.com/dlakelan/JuliaDataTutorials
You should be able to get notebooks by just running the build.jl script, if you have Weave.jl installed.
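If you'd rather build a single document by hand, something along these lines should work. This is only a minimal sketch, not the contents of build.jl: the file name `BasicDataAndPlots.jmd` and the `build` output directory are assumptions for illustration, so substitute whatever is actually in the repo. `weave` and `convert_doc` are the Weave.jl functions for rendering a document and for converting between source formats.

```julia
using Weave  # assumes Weave.jl is installed (`] add Weave` in the REPL)

# Hypothetical paths: replace with a real Weave source file from the repo.
src = "BasicDataAndPlots.jmd"
outdir = "build"
mkpath(outdir)  # create the output directory if it does not exist yet

# Render to HTML; chunks are executed unless they set eval=false.
weave(src; doctype = "md2html", out_path = outdir)

# Convert the same source to a Jupyter notebook without running the code.
convert_doc(src, joinpath(outdir, "BasicDataAndPlots.ipynb"))
```

Rendering and conversion are kept as separate calls here so you can get a notebook without waiting for every chunk to run.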
The essential format is to split this into a series of Tutorials with paired Discussion. The first one is “BasicDataAndPlots”. Imagine the target audience is a 3rd year undergrad who has at most 1 semester of a computer programming course. The idea is to get them loading some data and producing some plots even if they don’t know how or why, just to see the syntax, maybe play around with it. It should have links to documentation so they can maybe modify plots by reading the discussion.
As things go along it should build to the point where we’re answering more meaningful questions and using more advanced ideas in data analysis, mostly from a Bayesian perspective. I’d like to tackle real world and interesting questions, the kind of thing where the answer isn’t clear, and someone who is interested in the topic could start from these tutorials, and then build a little undergrad or Masters level term-paper type project by further research. For the moment though, it’s just getting started.
Question for @kevbonham, how do I make sure Weave doesn’t try to execute a code block when building notebooks/pdf/html? I’m not clear on the syntax for that.
Having reread it, I can already see that there are some sections I should strip out and push into the Discussion. Also some things I need to add to the Discussion, like how the length units
Doesn’t the code chunk option `eval = false` do the job? (http://weavejl.mpastell.com/stable/chunk_options/#Chunk-Options-1)
`eval`, `results`, and `echo` are the ones I use most frequently. They affect whether the code is executed, whether the results are shown, and whether the code is shown, respectively. So

````
```julia; results="hidden"; echo=false
# this code won't show up in the document, but x is available
x = 2
```

```julia; echo=false
# a results block with `5` will show up in the document, but not the code
x + 3
```

```julia; eval=false
# this code block will show up, but won't be evaluated
x = 5
```

```julia
# both this block, and the results (`2`) will show up
x
```
````
Very cool! Looks like a good start. Let me know if you’d like help setting this up with Documenter to auto-generate pages (you can have HTML pages built and automatically make links to mybinder for running/downloading the notebooks). I’m in the middle of doing something similar for my course, so hopefully it won’t be too much additional effort.
Would love to have help using Documenter and getting things into Binder. I have never used either of those. Binder in particular seems extremely useful for this kind of purpose.
In the end, I’m not teaching courses, but I would be very happy to have others who ARE teaching courses to use these resources in their courses. So whatever seems most useful for that target audience we should do.
Thanks for sharing! I do teach courses in this area, and I would love to have some more resources to share with undergraduates.
+1 for rendered HTML pages of this content. Being able to browse without installing makes it more accessible.
Note that our tutorials are basically rendered scripts, downloadable as notebooks or scripts, built using Literate + Franklin, which might be relevant and more flexible than Weave (I’m biased). I intend to port the R bookdown template to Franklin over the next few months to help people writing series of tutorials present their content.
Franklin also allows your content to be published online very easily. That’s a big plus.