This step is usually skipped, mainly because it looks difficult.
In fact it is easy.
The Outcome
We are going to produce an Excel file that has some pretend data in it. It will have exactly the same layout as the real data will, once we have it. So we can practice our data analysis on it and make sure nothing goes wrong.
Basically we want a spreadsheet with a simple structure that is easy to understand.
It is a good idea to use the first column for a Participant ID. Each subsequent column is then a variable.
The rows are filled as follows:
- Row 1: a header with variable names
- Row 2: (optional) variable types. We could also have an optional comments row.
- Row 3 onwards: data
This is done so that each row holds all the data from one participant:
“One row per participant”
Approach 1: MS Excel & nothing fancy
Example file here
It is easy enough to make an Excel file that does this. You want random numbers for the data. If you are very patient, you could just type in random numbers. It’s lazier, and therefore better, to use formulas.
Interval variables: use this formula to make a normally distributed random number (change the 100 and 15 to any mean and sd you wish):
=NORM.S.INV(RAND())*15+100
Categorical variables: use this formula (change the “6” to however many categories you are using):
=CEILING(RAND()*6,1)
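If you would rather script this than type formulas, the same idea can be sketched in Python. This is only an illustration, not part of the example file: the function names, the column names "Score" and "Group", and the choice of CSV output (which Excel opens directly) are all assumptions made for the sketch.

```python
import csv
import io
import random


def make_fake_rows(n_participants, mean=100.0, sd=15.0, n_categories=6, seed=None):
    """Return a header row followed by one row of pretend data per participant.

    The column names are hypothetical; use your own variable names.
    """
    rng = random.Random(seed)
    rows = [["ParticipantID", "Score", "Group"]]  # Row 1: variable names
    for participant_id in range(1, n_participants + 1):
        # Interval variable: plays the role of =NORM.S.INV(RAND())*15+100
        score = rng.gauss(mean, sd)
        # Categorical variable: plays the role of =CEILING(RAND()*6,1)
        group = rng.randint(1, n_categories)
        rows.append([participant_id, round(score, 2), group])
    return rows


def rows_to_csv(rows):
    """Format the rows as CSV text, one row per participant."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print a small pretend data set; save the text as a .csv file to open it in Excel.
    print(rows_to_csv(make_fake_rows(5, seed=1)))
```

Passing a seed makes the pretend data reproducible, which is handy when you want to rerun a practice analysis on exactly the same numbers.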
Approach 2: MS Excel with effect sizes
Data Generator file here
This approach allows you to create an Excel file where the variables you create can have built-in relationships between them.
Use the data generator file, sheet “Design Sheet”.
You can enter as many variables as you like, specifying:
- Name
- Type (Interval or Categorical)
- Values (mean and sd or number of categories)
You can also enter the effect sizes between variables (keep these quite small, less than 0.5).
As you do this, Excel will automatically create data for you in “Data Sheet”.
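To see what a built-in relationship between two interval variables can mean, here is a minimal Python sketch. It assumes the effect size is a correlation coefficient r between the two variables; that is an assumption made for illustration and not a description of how the Data Generator file works internally. The function names are hypothetical.

```python
import math
import random


def make_related_pair(n, r, mean_x=100.0, sd_x=15.0, mean_y=100.0, sd_y=15.0, seed=None):
    """Make n pairs (x, y) of normal data whose population correlation is r."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        zx = rng.gauss(0, 1)     # standardised x
        noise = rng.gauss(0, 1)  # independent standardised noise
        # Mix x and noise so that zy has variance 1 and correlates r with zx.
        zy = r * zx + math.sqrt(1 - r * r) * noise
        xs.append(mean_x + sd_x * zx)
        ys.append(mean_y + sd_y * zy)
    return xs, ys


def pearson_r(xs, ys):
    """Sample Pearson correlation, used to check the generated data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    xs, ys = make_related_pair(1000, r=0.3, seed=1)
    print(round(pearson_r(xs, ys), 2))  # close to 0.3, but not exactly, in any one sample
```

In a small sample the observed correlation can differ noticeably from the r you asked for, which is exactly the kind of thing practising on pretend data helps you get used to.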
Approach 3: using BrawStats (up to 3 variables)
Using BrawStats you can:
- set up a simple 2 or 3 variable hypothesis
- specify the effect size (or effect sizes)
- make a sample of data
At that point you can, if you wish, save the sample as an Excel file with this menu item:
Sample –> Save… –> as Excel file
Approach 4: using BrawStats (multiple variables)
This feature will be available from 1st October.
You can also, if you wish, use the Design Sheet of the Data Generator file. Fill this in with as many variables as you like, as described in Approach 2.
Then in BrawStats:
- import it exactly as if it is a sample of data
You can then examine a subset of 2 or 3 variables using the standard BrawStats interface:
- choose the variables in the pink logic dialog
- press “apply” to build the hypothesis
- press “new sample” to get a sample of data
You can also use the Linear Modelling tool in BrawStats. This will allow you to analyse more than 3 variables at once.