Welcome!

This is the first post written directly for the new Knowledge Based Practice: Cause, models and inference blog. As you can see, the new URL is simply:

Rather than posting these past few months, I have taken the time to migrate the blog from WordPress to a Jekyll site served on GitHub Pages. I made this decision after an extended period of trying to write WordPress posts that integrated with R Markdown. My goal was to use the power of R programming to bring data analysis directly to the blog, without the hassle of generating code, analysis and output separately and then embedding them in a post.

For example, when I wanted to post an analysis I had performed in R on NFL vertical jump data, my WordPress site kept getting stuck, so I had to post the results on RPubs (not a bad thing, just not what I was looking to do). Incidentally, that RPub is here, and I will migrate it to the blog once I finish some editing and a discussion of its relevance.

So now I can pull data in, analyze it, and create a graphic for the blog, all in one environment.

For example, here are the graphs created from an analysis of data from Astrand’s classic paper on aerobic capacity.

The R code generating these graphics has been suppressed.

Is there an inverse relationship between VO2max and resting heart rate?

Is there an inverse relationship between submaximal heart rate and VO2max?

Is there a relationship between maximal heart rate and VO2max?
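Because the R code behind these graphs is hidden, here is a minimal sketch of how a question like the first one might be explored in R. It is not the analysis used for this post: the values are simulated, they are not Astrand's data, and the names `fitness`, `vo2max` and `resting_hr` are hypothetical.

```r
# Simulated, made-up values for illustration only.
# These are NOT the data from Astrand's paper.
set.seed(1)
n <- 30
vo2max <- runif(n, min = 30, max = 65)               # mL/kg/min (made up)
resting_hr <- 90 - 0.5 * vo2max + rnorm(n, sd = 4)   # beats/min (made up)
fitness <- data.frame(vo2max, resting_hr)

# Pearson correlation: a negative value suggests an inverse relationship
r <- cor(fitness$vo2max, fitness$resting_hr)

# Simple linear regression of resting heart rate on VO2max
fit <- lm(resting_hr ~ vo2max, data = fitness)

# Scatterplot with the fitted regression line drawn on top
plot(resting_hr ~ vo2max, data = fitness,
     xlab = "VO2max (mL/kg/min)",
     ylab = "Resting heart rate (beats/min)")
abline(fit)
```

Printing `r` and `summary(fit)` would show the strength and direction of the relationship in the simulated data. The same pattern (correlation, model, plot) applies to the other two questions.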

From your perspective this may not be a big deal, but being able to weave the statistical analysis code and its results into the writing process makes writing the blog much easier :)

Additional FYIs

The original (legacy) blog will still be available on WordPress here. I was able to automate the migration thanks to the “exitwp” tool that Thomas Frössman made available on GitHub, but there were some issues with the links and the images. It is simply easier to keep the WordPress site up and running for those archival posts that refer back to the legacy site.

The layout you see here comes from Michael Rose and his “Minimal Mistakes” Jekyll theme, available on GitHub. It is clean and easy to integrate, and it allows a simple user such as myself to use the built-in Jekyll features of posts (standard) and collections (a newer feature). Collections have allowed me to pull some of the older posts that are valuable to the purpose of the blog out of the dense chronological archive and group them by theme. For example, there is a collection called “Foundations” that includes the early posts on the philosophical foundations of the knowledge based practice, cause, models and inference project.

For the past two years, one of my pastimes has been learning how to program for statistical analysis with the open source R project. The simple fact that it is open is inspiring on many levels and “opens” opportunities for research collaboration and teaching. I have also enjoyed being forced to code my analyses, which provides a reproducible archive of the data processing and analysis steps. This is becoming more important given the number of manipulations and assumptions in modern complex analysis (see Peng’s article in Biostatistics and rOpenSci).

For anyone interested in learning how to conduct statistical analyses in R, I highly recommend the Coursera Data Science Specialization offered by the Johns Hopkins Bloomberg School of Public Health, see here. If you come to these courses with a background in either statistics or coding, the workload is reasonable, because you can spend your time on the other area (I had a stronger background in stats, so I spent more time learning how to code; those with a background in coding likely spend more time learning the stats). If you come to the courses with a background in neither, they can be a bit overwhelming (I had some students try it a few years ago). If you fall into that category, my suggestion is to stretch out the courses and just keep “switching sessions”, as Coursera now allows. All of your progress from the session you are switching out of is moved ahead to the session you are switching into, which allows you to take more than 4 weeks to work through the assignments. The courses progress logically and really do require you to recall and use what you have learned in prior sessions.

What really makes this new approach to the blog possible is the R package knitr by Yihui Xie, along with his post that talked me into it and provided a sample here. I am also really happy to have such a great R programming IDE (integrated development environment) in RStudio.
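To give a feel for what knitr does, here is a minimal sketch, not the setup used for this blog. It writes a tiny, hypothetical R Markdown post to a temporary file and knits it to Markdown. The file names and post text are made up, and the `echo=FALSE` chunk option is what keeps the R source out of the output while still showing the results.

```r
library(knitr)

# Write a tiny, hypothetical R Markdown post to a temporary file
rmd <- tempfile(fileext = ".Rmd")
md  <- tempfile(fileext = ".md")
writeLines(c(
  "A sentence of ordinary blog text.",
  "",
  "```{r, echo=FALSE}",
  "summary(cars$speed)",
  "```"
), rmd)

# knit() runs the chunk and writes Markdown with the results woven in;
# echo=FALSE hides the R source itself in the rendered output
knit(input = rmd, output = md, quiet = TRUE)

# Show the Markdown that a static site generator could then build
cat(readLines(md), sep = "\n")
```

The resulting Markdown contains the prose and the printed summary, but not the `summary()` call that produced it.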

Finally, there are benefits beyond publishing from the RStudio IDE to a Jekyll site served on GitHub Pages, and beyond integrating data analysis and modeling into the posts (which only makes sense for a blog on knowledge, cause, models and inference). There is also the tremendous and growing world of graphical causal model and Bayesian network packages in R, all freely available. For example, see the task views on Bayesian Inference and on Graphical Models. While they take some time to learn, these packages are open in both senses: they are free, and you can modify them to suit your own needs.

I am still working on integrating my newly developing R skills with graphical and Bayesian modeling, so there is more to come in time. But at least now I have a blogging platform and workflow to support the work :)

