Problem D
Summary Statistics
The average and standard deviation are incredibly useful summary statistics. The average, which is also known as the mean, tells us what the “typical” value is for a set of data whereas the standard deviation tells us how close or far away from the average the other values tend to be.
The average $\mu$ of $N$ numbers is calculated as follows:
- Sum up all the numbers,
- Divide the sum by $N$.
Once the average has been calculated, the standard deviation $\sigma$ is found by:
- Squaring the difference between each value $x_i$ and the average $\mu$, that is $(x_i - \mu)^2$,
- Summing up all those squared differences,
- Dividing the sum by the number of values $N$,
- Taking the square root.
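The steps above can be sketched in Python as follows. This is only an illustration of the batch calculation, not the running version the problem asks for; the function name `mean_and_std` is made up for this example.

```python
import math

def mean_and_std(values):
    """Return (average, population standard deviation) of a non-empty list."""
    n = len(values)
    average = sum(values) / n                        # sum up, then divide by the count
    squared_diffs = [(x - average) ** 2 for x in values]
    variance = sum(squared_diffs) / n                # divide by n: population, not sample
    return average, math.sqrt(variance)

average, std = mean_and_std([2, 4, 4, 4, 5, 5, 7, 9])
print(f"{average:.2f} {std:.2f}")  # 5.00 2.00
```

Note that this version needs the whole list in memory and walks over it twice, which is exactly what a running calculation avoids.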
Notice how these steps resemble an algorithm ... but this is too easy.
To make this more interesting, let us implement an algorithm that computes the running standard deviation. A running standard deviation is calculated for every data point as it arrives. This can be useful for very large data sets or when values are arriving in real-time (e.g. sensor measurements).
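One way to see why a running calculation is possible is the standard identity below. The notation is an assumption made for this note: $x_i$ is the $i$-th value, and $\mu_n$ and $\sigma_n$ are the average and population standard deviation of the first $n$ values.

$$
\mu_n = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma_n^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu_n)^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \mu_n^2
$$

So it is enough to carry the count $n$, the running sum of the values and the running sum of their squares from one step to the next; none of the earlier values has to be stored.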
The program should repeatedly ask the user to type in an integer value and then print out the cumulative moving average and the standard deviation for each value that the user enters. When printed, the summary statistics should be rounded to two decimal places, as in the sample output.
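The sample output shows every value with exactly two digits after the decimal point. A small sketch of one way to produce that in Python, assuming an f-string format specifier is acceptable (the helper name `two_decimals` is hypothetical):

```python
def two_decimals(x):
    """Format a number with exactly two digits after the decimal point."""
    return f"{x:.2f}"

print(two_decimals(10 / 3))   # 3.33
print(two_decimals(2))        # 2.00
print(round(2.0, 2))          # 2.0
```

The last line shows why `round` alone is probably not enough here: it returns a number, and printing that number drops the trailing zero.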
The Wikipedia article on standard deviation lays out how these calculations can be carried out, albeit in mathematical notation.
We propose that you tackle this problem one step at a time:
- Read the Wikipedia article and think about how step-wise calculations that compute the value at step $n$ by using values from step $n-1$ could be implemented using a loop. (If you get stuck, don’t be afraid to ask a teacher or a fellow student for help.)
- Start by tackling the simpler problem: calculating the average. If you can solve that, then the rest becomes much easier.
- Extend your solution to also update the standard deviation in every iteration of the loop.
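For readers who want to compare against their own attempt, here is a minimal sketch of such a loop. It assumes the running-sums approach (a count, a sum and a sum of squares are updated for every value), which is one of several possible methods and not necessarily the intended one; the name `running_stats` is made up for this example.

```python
import math

def running_stats(values):
    """Yield (average, population std) after each value, using running sums."""
    count = 0        # how many values have been seen so far
    total = 0        # running sum of the values
    total_sq = 0     # running sum of the squared values
    for x in values:
        count += 1
        total += x
        total_sq += x * x
        average = total / count
        # count * total_sq - total**2 equals count**2 times the variance;
        # with integer input it is an exact, non-negative integer.
        std = math.sqrt(count * total_sq - total * total) / count
        yield average, std

for average, std in running_stats([2, 4, 4, 6, 4]):
    print(f"{average:.2f}")
    print(f"{std:.2f}")
```

Run on the data values of Sample Input 1, this prints the ten lines of Sample Output 1. Reading the values from the user and stopping at the right moment are left out of the sketch.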
NOTE: The program should calculate the population standard deviation, not the sample standard deviation.
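The difference between the two can be checked with Python's standard `statistics` module, which offers both variants. The data below is the second sample; the snippet is only meant for verifying results, not as a solution to the running calculation.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
population_std = statistics.pstdev(data)   # divides by n
sample_std = statistics.stdev(data)        # divides by n - 1
print(f"{population_std:.2f}")  # 2.00
print(f"{sample_std:.2f}")      # 2.14
```

The population value matches the last line of Sample Output 2; the sample value does not.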
Input
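Input starts with one line containing one integer. In the samples, further integers follow and the input ends with -1, which is not counted as a data value.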
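Both samples end with the value -1, and that value produces no output. Assuming -1 acts as a terminator and that each integer sits on its own line (both inferred from the samples, not stated above), reading the input could be sketched like this; `read_values` and its `sentinel` parameter are hypothetical names.

```python
def read_values(lines, sentinel=-1):
    """Yield integers from an iterable of text lines until the sentinel appears."""
    for line in lines:
        value = int(line)          # int() tolerates surrounding whitespace
        if value == sentinel:
            break                  # the sentinel itself is not a data value
        yield value

sample_lines = ["2", "4", "4", "6", "4", "-1"]
print(list(read_values(sample_lines)))  # [2, 4, 4, 6, 4]
```

In an interactive program, `sys.stdin` or repeated calls to `input()` could take the place of `sample_lines`.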
Output
Output consists of two lines for each number: the cumulative average, followed by the standard deviation.
Sample Input 1 | Sample Output 1 |
---|---|
2<br>4<br>4<br>6<br>4<br>-1 | 2.00<br>0.00<br>3.00<br>1.00<br>3.33<br>0.94<br>4.00<br>1.41<br>4.00<br>1.26 |
Sample Input 2 | Sample Output 2 |
---|---|
2<br>4<br>4<br>4<br>5<br>5<br>7<br>9<br>-1 | 2.00<br>0.00<br>3.00<br>1.00<br>3.33<br>0.94<br>3.50<br>0.87<br>3.80<br>0.98<br>4.00<br>1.00<br>4.43<br>1.40<br>5.00<br>2.00 |