cellMatrix.net
Spreadsheet Modeling and Related Topics

About

Formulas, Charts, and Models Created with Microsoft Excel.
More . . .

Statistics

  • Total Entries - 102
  • Current Viewers - 15

Categories

Recent Comments

Syndicate

Validate

Tie Series Labels to Worksheet Cells

Depending on the version of Excel that you are using, there are two free addins available that allow you to tie the series labels in a chart to worksheet cells. Those addins are:

A project that I worked on recently required that I tie the series data labels to one of many range references based on an input by the user. To do so, it was required that I build the same type of functionality that the addins provide directly into my project.

image

In a post to the Microsoft Excel Programming Newsgroup dating back to July of 1999, John Green describes how to create labels in a chart from the text in a range using VBA. The code originates from the book titled "Excel 2000 VBA Programmer's Reference" written by John Green, Stephen Bullen, and Felipe Martins.

As taken from the post, the code to tie the series labels to a worksheet range looks like this:

  Sub AddDataLabels() 
  Dim SalesSeries As Series 
  Dim pts As Points 
  Dim pt As Point 
  Dim rng As Range 
  Dim i As Integer 

  Set rng = Range("B4:G4") 
  Set SalesSeries = ActiveSheet.ChartObjects(1).Chart.SeriesCollection(1) 
  SalesSeries.HasDataLabels = True 
  Set pts = SalesSeries.Points 
  For Each pt In pts 
    i = i + 1 
    pt.DataLabel.Text = "=" & rng.Cells(i).Address _ 
             (RowAbsolute:=True, _ 
              ColumnAbsolute:=True, _ 
              ReferenceStyle:=xlR1C1, _ 
              External:=True) 
   Next pt 
End Sub

After finding the example above, I wanted to see if the code could be streamlined in any way. I first looked up the "Address" property as applied to the range object in Excel's VBA help and found out that the default values of the row and column references are "true". As a result, the Address property could be shortened and rewritten as:

rng.Cells(i).Address(, , xlR1C1, True) 

I remembered that in a prior post titled Declaring Line Chart Variables, Jon Peltier suggested a best-practice for declaring variables. Following those suggestions, I renamed the series variable "SalesSeries" to "Srs". I also eliminated the point variable and used a reference to Srs.Points.Count in its place. With those changes the loop could be rewritten as follows:

For i = 1 To Srs.Points.Count
    Srs.Points(i).DataLabel.Text = "=" & Rng.Cells(i).Address(, , xlR1C1, True)
Next i

Finally, by making the changes above I was able to simplify the code to look like this:

Sub AddDataLabels()
    Dim Srs As Series
    Dim Rng As Range
    Dim i As Integer
    Set Rng = Range("B4:G4")
    Set Srs = ActiveSheet.ChartObjects(1).Chart.SeriesCollection(1)
    Srs.HasDataLabels = True
    For i = 1 To Srs.Points.Count
        Srs.Points(i).DataLabel.Text = "=" & Rng.Cells(i).Address(, , xlR1C1, True)
    Next i
End Sub

Trendline Fitting Errors

In a recent post at the PTS Blog titled Trendline Fitting Errors, Jon Peltier describes some of the problems that result from fitting data to trendlines created as a part of an Excel chart. Microsoft seems to acknowledge this problem via their Knowledge Base Article 211967, although the article is somewhat confusing.

Refering to the article and under the header "Symptoms", it states "The equation displayed for a trendline on an xy (scatter) chart is incorrect." Several sentences later, the article states "The trendline formula should only be used when your chart is an XY Scatter chart."

Finally, the article does state the reason why trendlines should only be used on XY Scatter charts: "Line, Column, and Bar charts plot only the Y axis as values. The X axis is plotted only as a linear series in these chart types, regardless of what the labels actually are. Therefore, the trendline will be inaccurate if displayed on these types of charts. This behavior is by design." I wonder why Microsoft continues to offer trendlines as a component of Line, Column, and Bar charts if they are incorrect?

Below are some useful resources for creating and interpretting trendlines:


XY-Bar Combination Chart

Recently I've had to create a series of Bar charts that incorporate points that allow additional data labels. I've created these points by building in an XY scatter chart into the Bar chart. The additional data labels are tied to the XY points using a tool such as John Walkenbach's J-Walk Chart Tools or Rob Bovey's XY Chart Labeler.

An example of one of these charts, less data labels, looks like this:

image

The person who requested the chart wanted each XY point to be visible as well as vertically centered within each bar. To do so, I started by setting the gap width to 100%. Next, a calculation creates the Y values of the XY points as illustrated below.

image

The calculation starts with the source data for the bar chart which is contained in the range B4:E8. There are four categories and three series within each category. I've entered those numbers into the yellow highlighted cells D19 and D20. The calculation of the lowest Y point on the XY chart vertical axis is contained in cell D23. The calculation is:

=D20/(D20+1)

Once this calculation is done, the results can be applied to all of the Y values of the XY chart. The XY chart source is contained in the range C11:H15. The results of the calculation above are linked to cell D12. Cell D13 adds the series per category value to the results in C12. The formula in D13 is:

=D12+$D$20

Cell F12 adds the series per category value to the results in C12. The formula in F12 is:

=D12+$D$23

Finally, the maximum value of the secondary axis is simply the number of categories multiplied by the series per category. The formula in cell D24 is:

=D19*D20

The VBA procedure below attempts to quickly build this chart based on the template above. Please note that it references the color palette that I currently use - your colors may be different. At this point the output produces the following:

image

Given time the procedure could be written to complete the formatting. It could also be modified to automatically accept a reasonable number of changing categories and series, use a custom color palette, and allow more efficient range references.

Option Explicit

Sub BuildChart()

    Application.ScreenUpdating = False

    Dim XVals As Range
    Dim Srs1 As String
    Dim Srs2 As String
    Dim Srs3 As String

    Set XVals = ActiveSheet.Range("B5:B8")
    Srs1 = ActiveSheet.Range("C4").Value
    Srs2 = ActiveSheet.Range("D4").Value
    Srs3 = ActiveSheet.Range("E4").Value

    Charts.Add
    ActiveChart.Location Where:=xlLocationAsObject, Name:="Sheet1"

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs1
        .ChartType = xlBarClustered
        .XValues = XVals
        .Values = Sheets("Sheet1").Range("C5:C8")
        .Interior.ColorIndex = 22
        .Border.LineStyle = xlNone
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs2
        .ChartType = xlBarClustered
        .XValues = XVals
        .Values = Sheets("Sheet1").Range("D5:D8")
        .Interior.ColorIndex = 35
        .Border.LineStyle = xlNone
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs3
        .ChartType = xlBarClustered
        .XValues = XVals
        .Values = Sheets("Sheet1").Range("E5:E8")
        .Interior.ColorIndex = 24
        .Border.LineStyle = xlNone
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs1 & "_XY"
        .ChartType = xlXYScatter
        .XValues = Sheets("Sheet1").Range("C12:C15")
        .Values = Sheets("Sheet1").Range("D12:D15")
        .MarkerSize = 3
        .MarkerStyle = xlDiamond
        .MarkerBackgroundColorIndex = 3
        .MarkerForegroundColorIndex = 3
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs2 & "_XY"
        .ChartType = xlXYScatter
        .XValues = Sheets("Sheet1").Range("E12:E15")
        .Values = Sheets("Sheet1").Range("F12:F15")
        .MarkerSize = 3
        .MarkerStyle = xlDiamond
        .MarkerBackgroundColorIndex = 10
        .MarkerForegroundColorIndex = 10
    End With

    With ActiveChart.SeriesCollection.NewSeries
        .Name = Srs3 & "_XY"
        .ChartType = xlXYScatter
        .XValues = Sheets("Sheet1").Range("G12:G15")
        .Values = Sheets("Sheet1").Range("H12:H15")
        .MarkerSize = 3
        .MarkerStyle = xlDiamond
        .MarkerBackgroundColorIndex = 5
        .MarkerForegroundColorIndex = 5
    End With

    ActiveChart.ChartGroups(1).GapWidth = 100

    ActiveChart.Axes(xlCategory, xlSecondary).MaximumScale = ActiveSheet.Range("D25").Value
    ActiveChart.Axes(xlCategory, xlSecondary).MinimumScale = 0
    ActiveChart.Axes(xlCategory, xlSecondary).MajorUnit = 1

    ActiveChart.Axes(xlValue).MaximumScale = ActiveSheet.Range("D25").Value
    ActiveChart.Axes(xlValue).MinimumScale = 0
    ActiveChart.Axes(xlValue).MajorUnit = 1

    ActiveChart.Axes(xlValue, xlSecondary).MaximumScale = ActiveSheet.Range("D24").Value
    ActiveChart.Axes(xlValue, xlSecondary).TickLabels.NumberFormat = "#,##0"

    ActiveChart.Parent.Height = 189.75
    ActiveChart.Parent.Width = 319.5

End Sub

Secondary Axes in Charts Discussion

Recently the topic of primary and secondary axis scales was discussed at the PTS Blog. The discussion starts with a reference to Stephen Few's March 2008 Visual Business Intelligence Newsletter titled Dual-Scaled Axis in Graphs - Are They Ever the Best Solution . Stephen starts his article by concluding that He then goes on to cite a series of examples. Finally, at the end of his article he states that "I certainly cannot conclude, once and for all, that graphs with dual-scaled axes are never useful; only that I cannot think of a situation that warrants them in light of other, better solutions. I invite you to propose viable exceptions, which I will welcome with open arms."

From a healthcare finance perspecitive, it has been my experience that these charts are great tools for those who know how to build them and for those audiences that know how to interpret the underlying data. Below are several comments I have concerning Stephen's article:

Comment #1 - It's not necessarily the chart with two axes that's bad, it's the data behind it. In the newsletter, most of Stephen's graphs illustrate the relationship between revenue and units sold. I would argue that is comparison is flawed from the beginning and, as such, this type of chart should not be used. My reasoning is that, although it seems like the data is should be related, when you drill down into the data you might find out that the correlation is not as close as you might have thought. Depending on the data:

  • What defines a unit?
  • Are the units the same unit each quarter or are there multiple types of units being charted as one?
  • If there are multiple units, how are the changes in volume accounted for within the sales mix?
  • Won't price increases distort the data over time?
  • How does the chart account for decreases in purchasing power over time i.e inflation over time?
  • If the data is measured by quarter and one of the quarters includes a leap year, how do you account for the extra day?

In my opinion, I think an argument can be made that a dual axis chart showing revenue and units sold is probably not the the best use of a dual-axes chart. That being said I do have to admit that I do use this type comparison at a top-level where the components of variance might be considered immaterial. If a dual-axes chart is to be used the audience should be educated on how to interpret the underlying data and how to recognize and address possible flaws before decisions a made.

Comment #2 - If you are going to present a dual-axis chart, the axes need to be proportional. That means doing the math for each axes to make sure it is in fact proportional to the other.

Comment #3 - As Stephen observed, line graphs are the best presentation for this type of chart.

Comment #4 - "I can’t think of a single case when there isn’t a better solution than a graph with a dual-scaled axis." In my line of work, we've found that these graphs are very useful for auditing calculations. For example, you are tasked with building a projection of revenue and discounts for the next year. In a healthcare environment, revenue and discounts both contain the same components of variance i.e. work days, fee increases, volume changes, payer mix changes, and changes in service mix or acuity. The calculations the build all of these components can get very complex. Rather than attemping to proof each calculation that builds the projection, this chart can quickly pick up areas of possible error.

In the example below, it's obvious that there's problem with June calculations because the lines are not in sync for that month.

image

Comparing the growth in same-type volumes (for example Radiology vs. Laboratory volumes) is another example.

To conclude, there may in fact be better charts for displaying certain types of data. But if the audience understands the relationships between the data I think there is a place for these charts. And like the example above shows, there's most definitely a place for these charts when performing audits.

Comparing Revenue Growth - Gross vs. Net

Gross revenue can be simply defined as charges billed. Net revenue, on the other hand, is more reflective of the revenue you expect to actually receive. In a healthcare environment, the spread between gross and net revenue can be fairly significant depending on who pays the bill.

A bar chart provides a simple visual comparison of gross vs. net revenue. Although the stacked bar chart initially seemed like the best choice to me, over time I've found that a regular bar chart with the bar overlap set to 100% works well too.

In this example, the goal is to chart the change in gross and net revenue in the Commercial / Contract and Government areas for the annual periods 2006 and 2007. Below is how I've set up the data:

image

The Chart Wizard produces the following:

image

The next task is to overlay the bars by setting the overlap to 100%. To do so, go to the Format Data Series dialog box - Options tab and change the overlap to 100%.

image

Finally, with a series of minor formatting changes the chart looks like this:

image

The chart shows the following:

  • The Government area is above the blue line and the Commercial / Contract area is below the blue line.
  • The Commercial / Contract revenue grew while the Government gross revenue showed no increase and reimbursement decreased.
  • The Commerical / Contract reimbursement is significantly higher than the Government reimbursement. As a result, the Commercial / Contract area looks much more attractive from a financial perspective.

A Study of Pie Chart Rounding

Pie charts, as well as many other charts, can be built to show percentages of a whole. Unfortunately, many times the pie chart percentages do not tie in total to the total of the source document percentages. Microsoft acknowledges the problem via this knowledge base article titled Pie Chart Shows Incorrect Percentage Value in Excel.

As a potential work-around, I attempted to create a source document formula that works similar to the logic described in the knowledge base article. The purpose of the formula is to force the rounded percentages in the source report to always tie to the pie chart percentages.

Refer to the screenshot below:

image

A source range, named "Data", covers the range C4:C10. Column D reflects the rounded source document percentages. When added together, the percentages equal 101%.

Column E contains the alternative percentages that tie to the pie chart. The formula, entered as a multicell-array into the range E4:E10, is as follows:

{=IF((Data-ROW()/10^10)=LARGE((Data-ROW()/10^10),1),
1-SUMPRODUCT(((Data-ROW()/10^10)<>LARGE((Data-ROW()/10^10),1))
*(ROUND(Data/$C$11,2))),ROUND(Data/$C$11,2))}

The formula works essentially as the knowledge base article describes. It:

  • Uses the ROW function to first break any ties that may occur when the same number appears two or more times.
  • Using the numbers calculated by the ROW function, it then uses the LARGE function to determain which percentage is the maximum.
  • If the percentage is the maximum, it subtracts the total of all of the the percentages excluding the maximum from 100%. All other percentages are calculated and rounded as normal.

In the example, the formula recognizes the rounded value for Gamma as the maximum. Because the data does not add to 100%, the formula subtracts one from the Gamma value to force the percentages to round to 100%. Likewise, the percentages in the pie chart show the same values.

To test the formula, I used the RANDBETWEEN function and a macro to quickly test 10,000 random scenarios. The formula appeared to work well with values that round in total to 99% or 101%. However, when values round to 98% or 102%, I did find occurances where the formula returns values that differ from the pie chart. In those cases Excel does not seem to assign the difference to one single value. Rather, Excel appears to pick two of the same values and distribute the difference evenly. Given the logic as described in the knowledge base article, I'm at a loss to explain why.

Below is an example where the pie chart percentages differ from the formula:

image

The formula calculates Beta to be the maximum and subtracts two from it to force the total to equal 100%. However, Excel subtracts one from Beta and Delta to get to 100%.