[ Team LiB ] |
Recipe 6.4 Find the Median Value for a Field6.4.1 ProblemYou need to calculate the median for a numeric field. Access provides the DAvg function to calculate the mean value for a numeric field, but you can't find the equivalent function for calculating medians. 6.4.2 SolutionAccess doesn't provide a built-in DMedian function, but you can make one using VBA code. This solution demonstrates a median function that you can use in your own applications. Load the frmMedian form from 06-04.MDB. Choose the name of a table and a field in that table using the combo boxes on the form. After you choose a field, the median value will be calculated and displayed in a text box using the acbDMedian function found in basMedian (see Figure 6-7). An error message will be displayed if you have chosen a field with a nonnumeric data type; the string "(Null)" will be displayed if the median value happens to be Null. Figure 6-7. The frmMedian formFollow these steps to use acbDMedian in your own applications:
6.4.3 DiscussionThe acbDMedian function in basMedian in 06-04.MDB is patterned to look and act like the built-in DAvg domain function. The algorithm used to calculate the median, however, is more complicated than what you would use to calculate the mean. The median of a field is calculated using the following algorithm:
After declaring a few variables, the acbDMedian function creates a recordset based on the three parameters passed to the function (strField, strDomain, and varCriteria), as shown in the following source code: Public Function acbDMedian( _ ByVal strField As String, ByVal strDomain As String, _ Optional ByVal strCriteria As String) As Variant ' Purpose: ' To calculate the median value ' for a field in a table or query. ' In: ' strField: The field ' strDomain: The table or query ' strCriteria: An optional WHERE clause to ' apply to the table or query ' Out: ' Return value: The median, if successful; ' otherwise, an error value Dim db As DAO.Database Dim rstDomain As DAO.Recordset Dim strSQL As String Dim varMedian As Variant Dim intFieldType As Integer Dim intRecords As Integer Const acbcErrAppTypeError = 3169 On Error GoTo HandleErr Set db = CurrentDb( ) ' Initialize the return value. varMedian = Null ' Build a SQL string for the recordset. strSQL = "SELECT " & strField strSQL = strSQL & " FROM " & strDomain ' Use a WHERE clause only if one is passed in. If Len(strCriteria) > 0 Then strSQL = strSQL & " WHERE " & strCriteria End If strSQL = strSQL & " ORDER BY " & strField Set rstDomain = db.OpenRecordset(strSQL, dbOpenSnapshot) ' Check the data type of the median field. intFieldType = rstDomain.Fields(strField).Type Select Case intFieldType Case dbByte, dbInteger, dbLong, dbCurrency, dbSingle, dbDouble, dbDate ' Numeric field. If Not rstDomain.EOF Then rstDomain.MoveLast intRecords = rstDomain.RecordCount ' Start from the first record. rstDomain.MoveFirst If (intRecords Mod 2) = 0 Then ' Even number of records. No middle record, so move ' to the record right before the middle. rstDomain.Move ((intRecords \ 2) - 1) varMedian = rstDomain.Fields(strField) ' Now move to the next record, the one right after ' the middle. rstDomain.MoveNext ' Average the two values. varMedian = (varMedian + rstDomain.Fields(strField)) / 2 ' Make sure you return a date, even when averaging ' two dates. If intFieldType = dbDate And Not IsNull(varMedian) Then varMedian = CDate(varMedian) End If Else ' Odd number of records. Move to the middle record ' and return its value. rstDomain.Move ((intRecords \ 2)) varMedian = rstDomain.Fields(strField) End If Else ' No records; return Null. varMedian = Null End If Case Else ' Nonnumeric field; raise an app error. Err.Raise acbcErrAppTypeError End Select acbDMedian = varMedian ExitHere: On Error Resume Next rstDomain.Close Set rstDomain = Nothing Exit Function HandleErr: ' Return an error value. acbDMedian = CVErr(Err) Resume ExitHere End Function The process of building the SQL string that defines the recordset is straightforward, except for the construction of the optional WHERE clause. Because strCriteria was defined as an optional parameter (using the Optional keyword), acbDMedian checks if a value was passed by checking that the string has a length greater than zero. Once acbDMedian builds the SQL string, it creates a recordset based on that SQL string. Next, acbDMedian checks the data type of the field: it will calculate the median only for numeric and date/time fields. If any other data type has been passed to acbDMedian, the function forces an error by using the Raise method of the Err object and then uses the special CVErr function in its error handler to send the error state back to the calling procedure: ' Check the data type of the median field. intFieldType = rstDomain.Fields(strField).Type Select Case intFieldType Case dbByte, dbInteger, dbLong, dbCurrency, dbSingle, dbDouble, dbDate ' ... more code follows ... Case Else ' Nonnumeric field; raise an app error. Err.Raise acbcErrAppTypeError End Select ' ... more code follows ... ExitHere: On Error Resume Next rstDomain.Close Set rstDomain = Nothing Exit Function HandleErr: ' Return an error value. acbDMedian = CVErr(Err) Resume ExitHere End Function If the field is numeric, the acbDMedian function checks to see if there are any rows in the recordset using the following If...Then statement, returning Null if there are no rows: ' Numeric field. If Not rstDomain.EOF Then ' ... more code follows ... Else ' No records; return Null. varMedian = Null End If If there are rows, the function moves to the end of the recordset to get a count of the total number of records. This is necessary because the RecordCount property returns only the number of rows that have been visited. The code is: rstDomain.MoveLast intRecords = rstDomain.RecordCount If the number of records is even, acbDMedian moves to the record just before the middle using the Move method, which allows you to move an arbitrary number of records from the current record. The number of records to move forward is calculated using the following formula: intRecords \ 2 - 1 This tells Access to divide the total number of records by 2 and then subtract 1 from the result (because you are starting from the first record). For example, if you are on the first of 500 records, you would move (500 \ 2 - 1) = (250 - 1) = 249 records forward, which would bring you to the 250th record. Once the function has moved that many records, it's a simple matter to grab the value of the 250th and 251st records and divide the result by 2. This part of the function is shown here: ' Start from the first record. rstDomain.MoveFirst If (intRecords Mod 2) = 0 Then ' Even number of records. No middle record, so move ' to the record right before the middle. rstDomain.Move ((intRecords \ 2) - 1) varMedian = rstDomain.Fields(strField) ' Now move to the next record, the one right after ' the middle. rstDomain.MoveNext ' Average the two values. varMedian = (varMedian + rstDomain.Fields(strField)) / 2 Because acbDMedian supports dates, the function needs to make sure that a date value is returned when taking the average of two dates. The following code handles this: ' Make sure you return a date, even when ' averaging two dates. If intFieldType = dbDate And Not IsNull(varMedian) Then varMedian = CDate(varMedian) End If The code for an even number of rows is much simpler: Else ' Odd number of records. Move to the middle record ' and return its value. rstDomain.Move ((intRecords \ 2)) varMedian = rstDomain.Fields(strField) End If Finally, acbDMedian returns the median value to the calling procedure: acbDMedian = varMedian The median, like the average (or arithmetic mean), is known statistically as a measure of central tendency. In other words, both measures estimate the middle of a set of data. The mean represents the mathematical average value; the median represents the middle-most value. For many datasets, these two measures are the same or very close to each other. Sometimes, however, depending on how the data is distributed, the mean and median will report widely varying values. In these cases, many people favor the median as a better "average" than the mean. Calculating the median requires sorting the dataset, so it can be rather slow on large datasets. Calculating the mean, however, doesn't require a sort, so it will always be faster to calculate the mean.
|
[ Team LiB ] |