# Month: November 2015

## Collard Greens

## Cornbread Stuffing

## Statistical Inference with Multiple Regression

## Concept Recall from Vocabulary

## Prime Learn as Much as Possible

## Using Dynamically Named Dataset Columns in R

## Spark MLlib Machine Learning

## Getting Started in Scala

## Getting Started with Apache Spark

### First Steps with Spark – Screencast #1

### Spark Documentation Overview – Screencast #2

### Transformations and Caching – Spark Screencast #3

### A Standalone Job in Scala – Spark Screencast #4

### Notes

## R and Google Analytics

Collard Greens

Garlic

Carrots

Salt

**Ingredients:**

Cornbread

Onion

Celery

Apples

Mushrooms

Broth

Leek

Rosemary

Thyme

Garlic

Salt / Pepper

**Recipe:**

**Goodness of Fit**

R squared part 1 (Ben Lambert)

R squared part 2 (Ben Lambert)

The Coefficient of Correlation (ProfessorSerna)

Statistics 101: Simple Linear Regression (Part 4), Fit and the Coefficient of Determination (Brandon Foltz)

Adjusted R squared (Ben Lambert)

**ANOVA**

Multiple regression 2 – (F test and T test) (Jason Delaney)

Multiple regression 5 – F test for a subset of variables (Jason Delaney)

Multiple Regression Part 4. The Global Test (ProfessorSerna)

What Is the F-test of Overall Significance in Regression Analysis?

Statistics PL11 – Inferences about Population Variances (Brandon Foltz)

Inference for Variances (Jason Delaney)

The F statistic – an introduction (Ben Lambert)

**Regressor Tests and Stats**

Hypothesis testing in linear regression part 1 (Ben Lambert)

Hypothesis testing in linear regression part 2 (Ben Lambert)

Hypothesis testing in linear regression part 3 (Ben Lambert)

Hypothesis testing in linear regression part 4 (Ben Lambert)

Hypothesis testing in linear regression part 5 (Ben Lambert)

Inference on the Slope (The Formulas) – standard error and hypothesis test (jbstatistics)

What is the standard error of the coefficient? (Minitab Docs)

**Interpretation and Examples**

Multiple regression 4 – how to interpret regression models (Jason Delaney)

Multiple Regression – Estimated regression equation practice problem – 15.07 (Jason Delaney)

Multiple Regression – Dummy variables – 15.37 – RestaurantRatings (Jason Delaney)

Statistics 101: Multiple Regression (Part 3A), Evaluating Basic Models (Brandon Foltz)

Statistics 101: Multiple Regression (Part 3B), Evaluating Basic Models (Brandon Foltz)

Filling in regression tables (Jason Delaney)

A Mathematical Primer for Social Statistics (Quantitative Applications in the Social Sciences) 1st Edition by John Fox

– Chapter on Statistical Inference in Regression

Trying to recall an entire concept or dialogue is a great way to strengthen learned knowledge. It forces you to apply the material from your own understanding rather than being guided by existing text.

For example, when studying a language, you can review dialogue containing new vocabulary and grammar concepts. Later, when reviewing and revising (when the lesson is no longer fresh in your memory), you can look at each vocabulary word and then try to recall and explain the lesson’s dialogue.

*Teaching with the Brain in Mind by Eric Jensen

Beginning and ending a study session with priming is very helpful. Priming can be motivating because it gives you a clearer picture of an entire set of ideas and how they’re applied in the real world.

When in doubt, do some priming. Skim around and do brief overviews of anything related to, “ahead” of, or “behind” where you are now, with the goal of understanding the full context of what you’re studying. Priming is also good when you don’t have a lot of time or are just too tired to get into an in-depth study session.

Cognitive researchers say that by priming material you give your brain a chance to establish small footholds that you’ll later use to connect to more detailed networks of knowledge. The more footholds you have to connect your knowledge to, the stronger the synapses between the neurons that hold the new knowledge will be, which means you attain stronger expert knowledge, faster.

In addition, priming gives you extra time to rest and digest the information a little bit at a time, which in turn helps you make meaning of new ideas.

Priming several books at once, as opposed to learning the material sequentially in detail, saves a huge amount of time. Priming several sources gives you a “working knowledge” of the material that you can instantly apply to whatever portion you’re currently learning in detail, which helps cement your understanding with fewer repetitions.

*Teaching with the Brain in Mind by Eric Jensen

Machine Learning Library (MLlib) Programming Guide

The Spark MLlib docs also serve as a nice curriculum breakdown for studying all the common machine learning techniques.

- Download Spark.
- Extract the archive.
- View the archive README. Note that we’ll use sbt to build Spark, Scala must be pre-installed, and the SCALA_HOME environment variable needs to be set.

  **[UPDATED]** Looks like this video is outdated. The latest Spark version (1.5.2) uses Maven to build. Install Java and Maven.
- Start the Spark build.

      spark-archive-home$> sbt/sbt package

  **[UPDATED]** Run the Maven build command instead.

      spark-archive-home$> build/mvn -DskipTests clean package

  I ran into consistent errors during this step, which I think were caused by mismatched Java versions. I gave up after a while and just downloaded the pre-compiled Spark binaries instead.
- Download and extract the required Scala version (from the Spark source README).

  **[UPDATED]** Looks like Maven takes care of building the Scala source, and Scala already comes working with the pre-compiled binary distros.
- Set the SCALA_HOME environment variable by creating a conf/spark-env.sh file, using the Spark distro’s conf/spark-env.sh.template file as a template (as described in the Spark distro README), OR edit your user .profile file to export SCALA_HOME with the correct Scala executable path.

      $> cp conf/spark-env.sh.template conf/spark-env.sh
      $> vi conf/spark-env.sh
      $> export SCALA_HOME=/opt/spark-1.5.2-bin-hadoop2.4 ##add this line inside spark-env.sh

  **[UPDATED]** Skipped the SCALA_HOME step. Looks like it’s unnecessary.
- Set the Log4j logging level using Spark’s log4j template.

      $> cp conf/log4j.properties.template conf/log4j.properties
      $> vi conf/log4j.properties
      log4j.rootCategory=ERROR, console ##edit this line inside log4j.properties

- Start the Spark shell.

      $> bin/spark-shell

- Open the Spark Quick Start guide and walk through the Scala with Spark examples.
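
The logging step above can also be done without opening an editor. The following is only a sketch, assuming a pre-compiled Spark distro whose conf directory contains log4j.properties.template; the function name `quiet_spark_logging` is hypothetical and not part of Spark.

```shell
#!/bin/sh
# Sketch: create conf/log4j.properties from Spark's bundled template and
# lower the root log level to ERROR, as in the manual steps above.
# quiet_spark_logging is a hypothetical helper name, not a Spark command.
# Usage: quiet_spark_logging /path/to/spark-distro

quiet_spark_logging() {
    spark_home="$1"
    template="$spark_home/conf/log4j.properties.template"
    target="$spark_home/conf/log4j.properties"

    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 1
    fi

    # Copy the template only when no config exists yet, so manual edits survive.
    [ -f "$target" ] || cp "$template" "$target"

    # Rewrite the root logger line to ERROR, keeping the console appender.
    sed 's/^log4j\.rootCategory=.*/log4j.rootCategory=ERROR, console/' \
        "$target" > "$target.tmp" && mv "$target.tmp" "$target"
}
```

Calling it with the distro directory (for example the /opt/spark-1.5.2-bin-hadoop2.4 path used above) before running bin/spark-shell should leave only error-level messages on the console.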

- Spark docs are at the Spark project site. You can select specific versions of the documentation.
- Free Spark project curriculums at Berkeley AMP Camp. They cover the implementation of more complex apps and deployments.

- Walks through the Quick Start guide’s Scala transformations and caching examples.

- Walks through the Quick Start guide example of building and running a standalone application.
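
For a pre-compiled Spark 1.5.x distro, the run step might look like the sketch below. It follows the spark-submit flow from the Quick Start guide rather than the screencast itself. The helper name `run_standalone_app` is hypothetical, and the `SimpleApp` class and `local[4]` master are the guide’s example values used here as assumed defaults.

```shell
#!/bin/sh
# Sketch: submit an already-built application jar to a local Spark master.
# run_standalone_app is a hypothetical helper name, not a Spark command.
# Usage: run_standalone_app SPARK_HOME APP_JAR [MAIN_CLASS] [MASTER]

run_standalone_app() {
    spark_home="$1"
    app_jar="$2"
    main_class="${3:-SimpleApp}"   # assumed default: the Quick Start example class
    master="${4:-local[4]}"        # assumed default: run locally with 4 threads

    if [ ! -f "$app_jar" ]; then
        echo "jar not found: $app_jar (build it first, e.g. with 'sbt package')" >&2
        return 1
    fi

    # spark-submit ships in the bin directory of the Spark distro.
    "$spark_home/bin/spark-submit" \
        --class "$main_class" \
        --master "$master" \
        "$app_jar"
}
```

The jar is expected to exist already; with the Quick Start project layout it would be produced by running sbt package in the application directory.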

Quick Start (Apache Spark Docs)

Getting Started with Apache Spark and Cassandra

What is MapReduce?