Building some Data analysis Tutorials

@dlakelan I’m also interested to see what you come up with. I’ve been curious about Julia for many years now, and this summer I may finally have some time to get into it.

Thanks! Right now I have a quick intro tutorial fairly well done in first-draft form… I’m now going through and writing the Discussion that goes along with it. The tutorial is supposed to be a step-by-step, very readable thing with all the decisions made for you… the Discussion is all about why we chose what we did, what we could have done differently, etc. It’s a little more involved. I’m hoping to get both into decent draft form, then throw them up in my git repo and open up some commentary here. Maybe another few days.

OK, for those who are interested, see the very bare GitHub repo: dlakelan/JuliaDataTutorials: Tutorials For Data Analysis in Julia

You should be able to get notebooks by just running the build.jl script, if you have Weave.jl installed.
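For anyone who has not used Weave before, a build step of this kind might look roughly like the sketch below. This is not the repo's actual build.jl: the file name BasicDataAndPlots.jmd is an assumption based on the tutorial name, and the choice of outputs is only illustrative.

```julia
using Weave

# Hypothetical source file; the real repo layout may differ.
src = "BasicDataAndPlots.jmd"

if isfile(src)
    # Convert the Weave source to a Jupyter notebook without executing it.
    convert_doc(src, "BasicDataAndPlots.ipynb")

    # Execute the code chunks and render the result as a standalone HTML page.
    weave(src; doctype = "md2html")
else
    @warn "Source file not found; adjust `src` to match your checkout" src
end
```

Both `convert_doc` and `weave` come from Weave.jl; the `isfile` guard is only there so the sketch fails gently if the path is wrong.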

The essential format is to split this into a series of Tutorials with paired Discussion. The first one is “BasicDataAndPlots”. Imagine the target audience is a 3rd year undergrad who has at most 1 semester of a computer programming course. The idea is to get them loading some data and producing some plots even if they don’t know how or why, just to see the syntax, maybe play around with it. It should have links to documentation so they can maybe modify plots by reading the discussion.

As things go along it should build to the point where we’re answering more meaningful questions and using more advanced ideas in data analysis, mostly from a Bayesian perspective. I’d like to tackle real world and interesting questions, the kind of thing where the answer isn’t clear, and someone who is interested in the topic could start from these tutorials, and then build a little undergrad or Masters level term-paper type project by further research. For the moment though, it’s just getting started.

Question for @kevbonham, how do I make sure Weave doesn’t try to execute a code block when building notebooks/pdf/html? I’m not clear on the syntax for that.

Having reread it, I can already see that there are some sections I should strip out and push into the Discussion. Also some things I need to add to the Discussion, like how the length units cm and inch work.

Doesn’t the code chunk option eval = false (Chunk Options · Weave.jl) do the job?

Exactly - eval, results, and echo are the ones I use most frequently. They control whether the code is executed, whether the results are shown, and whether the code is shown, respectively. For example:

```julia; results="hidden"; echo=false
# this code won't show up in the document, but x is available
x = 2
```
```julia; echo=false
# a results block with `5` will show up in the document, but not the code
x + 3
```
```julia; eval=false
# this code block will show up, but won't be evaluated
x = 5
```
```julia
# both this block, and the results (`2`) will show up
x
```

Very cool! Looks like a good start - let me know if you’d like help setting this up with Documenter to auto-generate pages (it can build HTML pages and automatically add links to mybinder for running/downloading the notebooks). I’m in the middle of doing something similar for my course, so hopefully it won’t be too much additional effort.

Would love to have help using Documenter and getting things into Binder. I have never used either of those. Binder in particular seems extremely useful for this kind of purpose.

In the end, I’m not teaching courses, but I would be very happy to have others who ARE teaching courses use these resources. So we should do whatever seems most useful for that target audience.

Thanks for sharing! I do teach courses in this area, and I would love to have some more resources to share with undergraduates.

+1 for rendered HTML pages of this content. Being able to browse without installing makes it more accessible.

Note that our tutorials are basically rendered scripts, downloadable as notebooks or scripts, built with Literate + Franklin, which might be relevant and more flexible than Weave (I’m biased). I intend to port the R bookdown template to Franklin over the next few months to help people writing series of tutorials present their content.

So, for those who are interested in following along: like many things these days, I’ve been derailed a bit by COVID. Specifically, I have a lot of friends and family who want to know the latest info on the COVID epidemic. I was making some PDFs by hand and putting them on my blog each week or so, but I figured, hey, why not give them Julia notebooks they can interact with… And I managed to get that into Binder, etc. It’s been educational, but it’s a work in progress and not very tutorial-like, really. In particular, I don’t have a Discussion document for the COVID stuff because it’s still in flux.

I’d like to do a tutorial in which I use Turing to build a Bayesian model of something interesting. Here’s your chance to influence that. What would you like to see modeled? Requirements are:

  1. Publicly available dataset, prefer something not too enormous. Must be an easily readable format (CSV for example). It could involve integrating data from two public sources.
  2. Model shouldn’t require tons of moving parts (so, for example, the COVID epidemic, while very interesting, is a very challenging field, so it’s out). It also shouldn’t be trivially simple (something you could do fine with GLM and a linear or logistic regression with a point estimate).
  3. Should be a topic I have some familiarity with: Economics, Biology, Healthcare, Mechanics/Physics, Civil and Environmental Engineering would be good candidates.

Thoughts?

Maybe @cpfiffer has some ideas from the econ world that could even end up as Turing tutorials in the docs so we kill two birds with one stone (if you’d be okay with that of course…)

I’m fine with tutorials ending up in docs. I should probably put up explicit licenses. Will do that today.

Check out this issue to see some economics ideas. Happy to put your tutorial up on the site if that’s what you’d like.

Oh, that gave me an idea: do a Bayesian model decomposing a timeseries into a fast component (seasonality) and a slow component (trend).

There are plenty of wiggly timeseries you can easily get from fred.stlouisfed.org.
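To make the fast/slow idea concrete, here is a minimal sketch on made-up data. It is deliberately not Bayesian and not anyone's actual tutorial code: it fits a linear trend plus one 12-month sine/cosine pair by ordinary least squares, which is roughly the structure a Turing model would put priors on. The series, its coefficients, and the fixed 12-month period are all assumptions for illustration, not FRED data.

```julia
using LinearAlgebra, Random

# Synthetic monthly series (made up for illustration):
# slow linear trend + 12-month seasonal cycle + noise.
Random.seed!(1)
t = collect(0.0:119.0)                      # 10 years of months
y = 2.0 .+ 0.05 .* t .+ 1.5 .* sin.(2π .* t ./ 12) .+ 0.1 .* randn(length(t))

# Design matrix: intercept, trend, and one sine/cosine pair at the seasonal period.
X = hcat(ones(length(t)), t, sin.(2π .* t ./ 12), cos.(2π .* t ./ 12))

# Ordinary least squares point estimate (a stand-in for a posterior).
β = X \ y

slow  = X[:, 1:2] * β[1:2]     # trend component
fast  = X[:, 3:4] * β[3:4]     # seasonal component
resid = y .- slow .- fast      # whatever the two components do not explain

println("intercept, slope   = ", round.(β[1:2]; digits = 3))
println("seasonal amplitude = ", round(hypot(β[3], β[4]); digits = 3))
```

A Bayesian version would replace the `X \ y` step with priors on the same coefficients and on the noise scale, and would report uncertainty on the trend and the seasonal amplitude instead of single numbers.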

I’d be interested in ‘beta-reading’ your tutorials: giving feedback before they are published, etc.

The repo I’m using is here… I’ve had a bunch of projects and had to put this on hold for the last several weeks. I’ve got some half-baked ones that I haven’t integrated yet, including one where I’m fitting a nonlinear function to seasonally adjust a timeseries.

https://github.com/dlakelan/JuliaDataTutorials