cellMatrix.net
Spreadsheet Modeling and Related Topics

About

Formulas, Charts, and Models Created with Microsoft Excel.
More . . .

Statistics

  • Total Entries - 102
  • Current Viewers - 12

Categories

Recent Comments

Syndicate

Validate

Floating Bars with Min, Max, and Averages

A recent question to the Microsoft Excel Charting Newsgroup asks "How can I create a floating column chart that shows a minimum, maximum, and average value"? Although Excel 2003 does offer a floating column chart as a custom-type option, it is not as "clean" as desired because the bars are automatically set to a 3D option. The best alternative is to use a combination Stacked Bar / XY Scatter chart. The bars will show the minimum to maximum range and the XY series will show minimum, maximum, and average values for each item. An example of the finished product is below:

image

To build the chart, assume the following set of data:

image

Some additional data and calculations need to be added to prepare the chart source. In the screenshot below, the range A9:C13 will serve as the source of the bar chart. The original range A2:D6 will serve as the X axis points of the XY chart and the Y points are shown in the range F2:F6.

image

Start by creating the floating bar chart. To do so, refer to a process posted at the Peltier Technical Services site that illustrates how to make a series invisible so that a second series floats in place (titled: Floating Column Charts) as a guide.

In the example, note that the "Dummy1" series is exactly the same as the "Min" values. The "Dummy2" series is calculated by subtracting the "Min" value from the "Max"value. Select the range A9:C13 and create a stacked bar chart. The chart should look like the one below:

image

Double-click the "Dummy1" series. In the Format Data Series dialog box - Patterns tab, set the Border to None and the Area to None. The chart should now look like this:

image

Now that the floating bar chart is created, it's time to incorporate an XY Scatter chart which will show the minimum, maximum, and average values. Select the chart and in the Source Data - Series dialog box, add a third series called "Minimum" based on the range F3:F6. After doing so, the Source Data - Series dialog box should look like this:

image

Note the new third series has appeared as a set of yellow bars. These bars need to be converted to XY Scatter points. Click on the new third series (the yellow bars) and, in the Standard toolbar, select Chart -> Chart Type -> XY (Scatter) -> type = Scatter. After doing so, the chart looks like this:

image

In the Standard Toolbar, select Chart -> Source Data -> "Minimum" Series. Currently there are no X values selected for the XY points (see the screenshot below):

image

Select the Minimum values as the X Value range (see the screenshot below):

image

And the chart should look like this:

image

With the incorporation of the XY Scatter points, two new secondary axes have been added to the top and right sides of the chart. To properly align the new XY points with the bars, the maximum values for these new axes need to reflect the same maximum values of the primary category and value axes. Double-click on the Secondary Value (X) Axis (the axis values at the top of the chart) and in the Format Axis dialog box -> Scale tab - set the maximum to the same value as the maximum of the Value axis (in this example the maximum is 60).

image

At this point the minimum values are now aligned with the minimum values of the bars. The same process of adding XY scatter points has to be completed to set up the average and maximum values. After adding the average and maximum values as XY points, the chart looks like this:

image

All that's left is to do now is some clean-up. Double-click on the Secondary Value (X) Axis (the axis values at the top of the chart) and in the Format Axis dialog box -> Patterns tab set the Tick Mark Labels and Major Tick Mark Type options to None. Likewise, double-click on the Secondary Value (Y) Axis (the axis values on the right side of the chart) and in the Format Axis dialog box -> Patterns tab set the Tick Mark Labels and Major Tick Mark Type options to None. The chart now looks like this:

image

To eliminate the legend references to the "Dummy1" and "Dummy2" series, click first on the legend and second on the "Dummy1" legend entry. Handles should appear around the "Dummy1" legend entry if it has been selected correctly. An example of how the chart looks with the "Dummy1" legend entry selected is below:

image

To get rid of the "Dummy1" legend entry, hit the delete key. Follow the same process for the "Dummy2" reference.

Select the plot area. In the Standard Toolbar select Chart -> Chart Options -> Gridlines and deselect all of the gridline options. Select the Minimum, Average, and Maximum series and in the Format Data Series dialog box -> Data Labels tab select the X value.

image

Continue to format until finished.

Actual vs. Plan Crawl Chart

A chart that I have to show each month combines a trend of current results with a forecast. As an example, for the month of May 2008 the chart would look similar to this:

image

Fast-forwarding a few months, in August 2008 it would look similar to this:

image

By comparing the two examples, you can see that the "Actual" line (red line) expands each month reflecting current results. Conversely, the "Plan" line (blue line) contracts as the year moves forward. A marker and label indicating the Actual and Plan net operating income is always shown for the current month.

Although the chart appears to be Line chart, it really is a combination Line-XY (Scatter) chart. The chart can be automated so that it automatically updates as the most current data is entered. To do so, the screenshot below shows the setup:

image

The Actual and Plan data are entered into the range C4:D15. The Chart Source and Data Labels Source ranges contain formulas that allow the chart to automatically update. At the start of the year enter your Plan data into the range D4:D15. At the end of each month the results are entered into the range C4:C15.

There are six different formulas that drive the chart. The first two formulas drive the Actual and Plan lines.

The first formula, which drives the Actual series, is entered into cell F4 and then copied through the range F4:F15. The formula is:

=IF(C4,C4,NA())

The second formula, which drives the Plan series, is entered into cell G4 and then copied through the range G4:G15. The formula is:

=IF(COUNT($C$4:C4)=COUNT($C$4:$C$15),D4,IF(C4,NA(),D4))

The remaining four formulas drive the points for the current month and use an XY (Scatter) chart as their source. The following formulas appear in the following cells:

Actual Label Sources

Cell J5: =COUNT($C$4:$C$15)
Cell J6: =OFFSET($C$3,COUNT($C$4:$C$15),0)

Plan Label Sources

Cell J10: =COUNT($C$4:$C$15)
Cell J11: =OFFSET($D$3,COUNT($C$4:$C$15),0)

Finally, the chart source looks like this:

Line Chart Sources

Actual Line Source

image

Plan Line Source

image

XY Chart Sources

Actual Marker and Label

image

Plan Marker and Label

image

Tie Series Labels to Worksheet Cells

Depending on the version of Excel that you are using, there are two free addins available that allow you to tie the series labels in a chart to worksheet cells. Those addins are:

A project that I worked on recently required that I tie the series data labels to one of many range references based on an input by the user. To do so, it was required that I build the same type of functionality that the addins provide directly into my project.

image

In a post to the Microsoft Excel Programming Newsgroup dating back to July of 1999, John Green describes how to create labels in a chart from the text in a range using VBA. The code originates from the book titled "Excel 2000 VBA Programmer's Reference" written by John Green, Stephen Bullen, and Felipe Martins.

As taken from the post, the code to tie the series labels to a worksheet range looks like this:

  Sub AddDataLabels() 
  Dim SalesSeries As Series 
  Dim pts As Points 
  Dim pt As Point 
  Dim rng As Range 
  Dim i As Integer 

  Set rng = Range("B4:G4") 
  Set SalesSeries = ActiveSheet.ChartObjects(1).Chart.SeriesCollection(1) 
  SalesSeries.HasDataLabels = True 
  Set pts = SalesSeries.Points 
  For Each pt In pts 
    i = i + 1 
    pt.DataLabel.Text = "=" & rng.Cells(i).Address _ 
             (RowAbsolute:=True, _ 
              ColumnAbsolute:=True, _ 
              ReferenceStyle:=xlR1C1, _ 
              External:=True) 
   Next pt 
End Sub

After finding the example above, I wanted to see if the code could be streamlined in any way. I first looked up the "Address" property as applied to the range object in Excel's VBA help and found out that the default values of the row and column references are "true". As a result, the Address property could be shortened and rewritten as:

rng.Cells(i).Address(, , xlR1C1, True) 

I remembered that in a prior post titled Declaring Line Chart Variables, Jon Peltier suggested a best-practice for declaring variables. Following those suggestions, I renamed the series variable "SalesSeries" to "Srs". I also eliminated the point variable and used a reference to Srs.Points.Count in its place. With those changes the loop could be rewritten as follows:

For i = 1 To Srs.Points.Count
    Srs.Points(i).DataLabel.Text = "=" & Rng.Cells(i).Address(, , xlR1C1, True)
Next i

Finally, by making the changes above I was able to simplify the code to look like this:

Sub AddDataLabels()
    Dim Srs As Series
    Dim Rng As Range
    Dim i As Integer
    Set Rng = Range("B4:G4")
    Set Srs = ActiveSheet.ChartObjects(1).Chart.SeriesCollection(1)
    Srs.HasDataLabels = True
    For i = 1 To Srs.Points.Count
        Srs.Points(i).DataLabel.Text = "=" & Rng.Cells(i).Address(, , xlR1C1, True)
    Next i
End Sub

Trendline Fitting Errors

In a recent post at the PTS Blog titled Trendline Fitting Errors, Jon Peltier describes some of the problems that result from fitting data to trendlines created as a part of an Excel chart. Microsoft seems to acknowledge this problem via their Knowledge Base Article 211967, although the article is somewhat confusing.

Refering to the article and under the header "Symptoms", it states "The equation displayed for a trendline on an xy (scatter) chart is incorrect." Several sentences later, the article states "The trendline formula should only be used when your chart is an XY Scatter chart."

Finally, the article does state the reason why trendlines should only be used on XY Scatter charts: "Line, Column, and Bar charts plot only the Y axis as values. The X axis is plotted only as a linear series in these chart types, regardless of what the labels actually are. Therefore, the trendline will be inaccurate if displayed on these types of charts. This behavior is by design." I wonder why Microsoft continues to offer trendlines as a component of Line, Column, and Bar charts if they are incorrect?

Below are some useful resources for creating and interpretting trendlines:


XY-Bar Combination Chart

Recently I've had to create a series of Bar charts that incorporate points that allow additional data labels. I've created these points by building in an XY scatter chart into the Bar chart. The additional data labels are tied to the XY points using a tool such as John Walkenbach's J-Walk Chart Tools or Rob Bovey's XY Chart Labeler.

An example of one of these charts, less data labels, looks like this:

image

The person who requested the chart wanted each XY point to be visible as well as vertically centered within each bar. To do so, I started by setting the gap width to 100%. Next, a calculation creates the Y values of the XY points as illustrated below.

image

The calculation starts with the source data for the bar chart which is contained in the range B4:E8. There are four categories and three series within each category. I've entered those numbers into the yellow highlighted cells D19 and D20. The calculation of the lowest Y point on the XY chart vertical axis is contained in cell D23. The calculation is:

=D20/(D20+1)

Once this calculation is done, the results can be applied to all of the Y values of the XY chart. The XY chart source is contained in the range C11:H15. The results of the calculation above are linked to cell D12. Cell D13 adds the series per category value to the results in C12. The formula in D13 is:

=D12+$D$20

Cell F12 adds the series per category value to the results in C12. The formula in F12 is:

=D12+$D$23

Finally, the maximum value of the secondary axis is simply the number of categories multiplied by the series per category. The formula in cell D24 is:

=D19*D20

The VBA procedure below attempts to quickly build this chart based on the template above. Please note that it references the color palette that I currently use - your colors may be different. At this point the output produces the following:

image

Given time the procedure could be written to complete the formatting. It could also be modified to automatically accept a reasonable number of changing categories and series, use a custom color palette, and allow more efficient range references.

Option Explicit

Sub BuildChart()

    Application.ScreenUpdating = False

    Dim XVals As Range
    Dim Srs1 As String
    Dim Srs2 As String
    Dim Srs3 As String

    Set XVals = ActiveSheet.Range("B5:B8")
    Srs1 = ActiveSheet.Range("C4").Value
    Srs2 = ActiveSheet.Range("D4").Value
    Srs3 = ActiveSheet.Range("E4").Value

    Charts.Add
    ActiveChart.Location Where:=xlLocationAsObject, Name:="Sheet1"

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs1
        .ChartType = xlBarClustered
        .XValues = XVals
        .Values = Sheets("Sheet1").Range("C5:C8")
        .Interior.ColorIndex = 22
        .Border.LineStyle = xlNone
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs2
        .ChartType = xlBarClustered
        .XValues = XVals
        .Values = Sheets("Sheet1").Range("D5:D8")
        .Interior.ColorIndex = 35
        .Border.LineStyle = xlNone
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs3
        .ChartType = xlBarClustered
        .XValues = XVals
        .Values = Sheets("Sheet1").Range("E5:E8")
        .Interior.ColorIndex = 24
        .Border.LineStyle = xlNone
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs1 & "_XY"
        .ChartType = xlXYScatter
        .XValues = Sheets("Sheet1").Range("C12:C15")
        .Values = Sheets("Sheet1").Range("D12:D15")
        .MarkerSize = 3
        .MarkerStyle = xlDiamond
        .MarkerBackgroundColorIndex = 3
        .MarkerForegroundColorIndex = 3
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs2 & "_XY"
        .ChartType = xlXYScatter
        .XValues = Sheets("Sheet1").Range("E12:E15")
        .Values = Sheets("Sheet1").Range("F12:F15")
        .MarkerSize = 3
        .MarkerStyle = xlDiamond
        .MarkerBackgroundColorIndex = 10
        .MarkerForegroundColorIndex = 10
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs3 & "_XY"
        .ChartType = xlXYScatter
        .XValues = Sheets("Sheet1").Range("G12:G15")
        .Values = Sheets("Sheet1").Range("H12:H15")
        .MarkerSize = 3
        .MarkerStyle = xlDiamond
        .MarkerBackgroundColorIndex = 5
        .MarkerForegroundColorIndex = 5
    End With

    ActiveChart.ChartGroups(1).GapWidth = 100

    ActiveChart.Axes(xlCategory, xlSecondary).MaximumScale = ActiveSheet.Range("D25").Value
    ActiveChart.Axes(xlCategory, xlSecondary).MinimumScale = 0
    ActiveChart.Axes(xlCategory, xlSecondary).MajorUnit = 1

    ActiveChart.Axes(xlValue).MaximumScale = ActiveSheet.Range("D25").Value
    ActiveChart.Axes(xlValue).MinimumScale = 0
    ActiveChart.Axes(xlValue).MajorUnit = 1

    ActiveChart.Axes(xlValue, xlSecondary).MaximumScale = ActiveSheet.Range("D24").Value
    ActiveChart.Axes(xlValue, xlSecondary).TickLabels.NumberFormat = "#,##0"

    ActiveChart.Parent.Height = 189.75
    ActiveChart.Parent.Width = 319.5

End Sub

Secondary Axes in Charts Discussion

Recently the topic of primary and secondary axis scales was discussed at the PTS Blog. The discussion starts with a reference to Stephen Few's March 2008 Visual Business Intelligence Newsletter titled Dual-Scaled Axis in Graphs - Are They Ever the Best Solution . Stephen starts his article by concluding that He then goes on to cite a series of examples. Finally, at the end of his article he states that "I certainly cannot conclude, once and for all, that graphs with dual-scaled axes are never useful; only that I cannot think of a situation that warrants them in light of other, better solutions. I invite you to propose viable exceptions, which I will welcome with open arms."

From a healthcare finance perspecitive, it has been my experience that these charts are great tools for those who know how to build them and for those audiences that know how to interpret the underlying data. Below are several comments I have concerning Stephen's article:

Comment #1 - It's not necessarily the chart with two axes that's bad, it's the data behind it. In the newsletter, most of Stephen's graphs illustrate the relationship between revenue and units sold. I would argue that is comparison is flawed from the beginning and, as such, this type of chart should not be used. My reasoning is that, although it seems like the data is should be related, when you drill down into the data you might find out that the correlation is not as close as you might have thought. Depending on the data:

  • What defines a unit?
  • Are the units the same unit each quarter or are there multiple types of units being charted as one?
  • If there are multiple units, how are the changes in volume accounted for within the sales mix?
  • Won't price increases distort the data over time?
  • How does the chart account for decreases in purchasing power over time i.e inflation over time?
  • If the data is measured by quarter and one of the quarters includes a leap year, how do you account for the extra day?

In my opinion, I think an argument can be made that a dual axis chart showing revenue and units sold is probably not the the best use of a dual-axes chart. That being said I do have to admit that I do use this type comparison at a top-level where the components of variance might be considered immaterial. If a dual-axes chart is to be used the audience should be educated on how to interpret the underlying data and how to recognize and address possible flaws before decisions a made.

Comment #2 - If you are going to present a dual-axis chart, the axes need to be proportional. That means doing the math for each axes to make sure it is in fact proportional to the other.

Comment #3 - As Stephen observed, line graphs are the best presentation for this type of chart.

Comment #4 - "I can’t think of a single case when there isn’t a better solution than a graph with a dual-scaled axis." In my line of work, we've found that these graphs are very useful for auditing calculations. For example, you are tasked with building a projection of revenue and discounts for the next year. In a healthcare environment, revenue and discounts both contain the same components of variance i.e. work days, fee increases, volume changes, payer mix changes, and changes in service mix or acuity. The calculations the build all of these components can get very complex. Rather than attemping to proof each calculation that builds the projection, this chart can quickly pick up areas of possible error.

In the example below, it's obvious that there's problem with June calculations because the lines are not in sync for that month.

image

Comparing the growth in same-type volumes (for example Radiology vs. Laboratory volumes) is another example.

To conclude, there may in fact be better charts for displaying certain types of data. But if the audience understands the relationships between the data I think there is a place for these charts. And like the example above shows, there's most definitely a place for these charts when performing audits.