Patient-reported Outcome Measures and Measurement Tools in Rhinology
Summary
Patients now rightly expect their doctors to record outcomes of clinical care. Clinical measures are becoming more sophisticated, and the equipment required to make such assessments is now widely available.
Routine collection of patient-reported outcome measures (PROMs) is likely to become mandatory for health care providers. We should embrace this opportunity and use this patient-rated information to enhance the doctor–patient relationship and focus communication. Careful use in cohort studies or within randomized trials may identify important differences in outcomes, as well as between treatments and providers, although it is still important to recognize the limitations of the outcome tools.
Introduction
Although surgery is an ancient art, outcome measurement remains very much in its infancy. For centuries, assessment has involved simple dichotomous outcomes: dead/alive, cured/residual disease, sometimes venturing as far as better/worse, and usually decided by the surgeons themselves. The first author’s initial experience with outcome measurement involved a surgeon ramming the end of a mallet through a patient’s nostrils to demonstrate a “successful” outcome following septal surgery. After several high-profile failings of medical care worldwide, there has been a growing demand for greater transparency and publication of outcome data following surgical intervention. This, coupled with the explosive growth of evidence-based medicine, has led to a significant refinement in the measurement of surgical outcomes. The move toward greater patient empowerment and patients’ increasingly active role in health management has brought into the spotlight patients’ own evaluation of their health-related quality of life (HRQOL) before and after medical or surgical interventions.
Why Measure Outcome?
It seems intuitive that doctors would wish to measure whether they are successful in achieving their treatment aims following surgical or medical intervention. There are many additional benefits to measuring surgical outcome:
Allows individual surgeons to judge and improve their practice
Allows refinement of surgical techniques by comparing procedures
Helps patients to make informed choices about their care, allowing comparison between medical and surgical treatments or between different procedures
Provides greater public transparency and accountability
Enables quality assurance of operations
Facilitates comparison between health care providers
Provides data for health care commissioners when making funding decisions, as it may allow comparison of the impact of treatments across different specialties
Has become, or will become, an essential component of revalidation in the United Kingdom, the United States, the Netherlands, and other countries
Paradoxically, the use of patient outcomes for revalidation may contribute to resistance from doctors to outcome measurement, particularly when there is lack of reassurance that assessment will be fairly performed and properly risk adjusted. The outcome chosen must be appropriate, measured at a suitable and equitable period of time following treatment, and take into account the disease severity and comorbidities of the patient population.
When a condition has a high mortality rate, the success of a medical treatment or surgery may be measured by its impact on the survival rate. However, most rhinologic conditions are not life threatening, but instead have an impact on quality of life or produce functional disability. Therefore, outcome assessment must detect more subtle changes and will often include both a patient-rated outcome measure and an assessment by the clinician.
There is now growing acceptance that patients’ views are essential in the delivery of high-quality care. PROMs are measures of HRQOL that are self-rated and reported directly by patients.1 They usually refer to a single point in time or clearly defined preceding period; thus, the term outcome measure in this setting is a misnomer. The impact of medical care can be determined by comparing repeated measures of a patient’s self-reported health status before and after the intervention. We are more familiar with clinical measures, either physiologic parameters (e.g., blood tests) that require interpretation within the clinical context by a physician or clinician-rated measures of disease severity (e.g., grading of computed tomography [CT] findings). Clinical measures often correlate poorly with PROMs and fail to predict changes in PROMs following treatment.2 In addition, individual patients may report very different levels of HRQOL despite having a similar disease burden, as their response to disease may be modified by both their own individual characteristics and their environment.
Clinical versus Patient-reported Outcomes
During the time of Florence Nightingale, patient outcome was recorded as “dead, relieved, or unrelieved.”3 Until the last decade little had changed, with most reported hospital outcome data consisting of only mortality rates. Comparing hospital mortality rates actually provides little useful information on the quality of care, as these rates are greatly influenced by patient mix, geographical and cultural preferences on the place of death, and availability of palliative care beds and hospices. Even most elective cardiac procedures have an operative mortality rate of <1%. Fortunately, deaths in rhinologic surgery are exceptionally rare, and 5-year survival rates apply only in sinonasal malignancy. We therefore need more sensitive measures of outcome for our patients. In other cases, surgery may be deemed a technical success, but the patient may fail to experience improvement in symptoms (Fig. 9.1).
The choice of outcome measure should therefore reflect both the aim of treatment and the reason for analysis. For example, in treating malignant conditions, both disease-free and overall survival rates are useful, although the impact of treatment itself on the quality of life of the patient should also be measured. When undertaking rhinoplasty, the patient’s satisfaction with his or her postoperative appearance is the most important aim; however, a surgeon may learn more about technical aspects of the procedure from photographic analysis.
In the past, clinician-rated outcomes were often described as “objective” measures and patient-rated as “subjective.” However, a clinician assessing the endoscopic appearance of a nose or the cosmetic appearance after rhinoplasty may be no less subjective than a patient rating his or her own symptoms. There is also often quite high interobserver variability using clinician-rated scoring systems. Even clinical outcome measures produced by technical measurement are prone to sampling errors. The terms “objective” and “subjective” should therefore no longer be applied in this way.
Note
When choosing an outcome measure, think carefully about the aim of surgery. If it is to improve quality of life, then a PROM is the ideal choice.
Clinician-reported Outcome Measures in Rhinology
Clinician-rated outcomes in rhinology are complementary to PROMs, as they can help clinicians to benchmark their results, facilitate research, and provide information about the disease process, its severity, and the potential effects of treatment.4 They are discussed in more detail in the relevant chapters and include endoscopic, radiologic, functional, and photographic evaluation of the results of an intervention.
Complication Rates
Measuring the incidence of complications is important before obtaining informed consent for any treatment, either medical or surgical, as well as for comparing different treatment options and allowing assessment of quality of care.
Some complication rates are more prone to bias. For example, readmission rates for epistaxis after rhinologic surgery will depend on local policies and the structure of health care systems: some centers will admit all patients, whereas others may discharge patients with epistaxis packs in situ. In contrast, the indications to return a patient for surgical control of postoperative epistaxis are likely to be more consistent between units. Caution is therefore required when selecting complication rates to compare different health care providers.
Caution
Case-mix adjustment is essential when using complication rates and mortality figures to compare health care providers.
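One common approach to case-mix adjustment is indirect standardization, in which the complications a provider actually observed are compared with the number that would be expected if reference rates applied to that provider's own mix of patients. The following is a minimal sketch only: the two risk strata, the reference rates, the patient counts, and the function name standardized_ratio are all hypothetical and are not drawn from any audit or study cited here.

```python
def standardized_ratio(observed_events, patients_per_stratum, reference_rates):
    """Observed-to-expected complication ratio for one provider.

    patients_per_stratum: dict mapping risk stratum -> number of patients treated
    reference_rates:      dict mapping risk stratum -> complication rate in a
                          reference population (hypothetical values here)
    """
    expected = sum(n * reference_rates[stratum]
                   for stratum, n in patients_per_stratum.items())
    if expected == 0:
        raise ValueError("expected number of events is zero")
    return observed_events / expected


# Hypothetical reference complication rates by risk stratum.
reference = {"low": 0.01, "high": 0.05}

# Unit A treats mostly low-risk patients; unit B treats mostly high-risk patients.
unit_a_mix = {"low": 300, "high": 20}
unit_b_mix = {"low": 100, "high": 140}

crude_a = 4 / sum(unit_a_mix.values())   # crude complication rate, unit A
crude_b = 8 / sum(unit_b_mix.values())   # crude complication rate, unit B

ratio_a = standardized_ratio(4, unit_a_mix, reference)
ratio_b = standardized_ratio(8, unit_b_mix, reference)
```

In this invented example unit B has a crude complication rate more than twice that of unit A, yet both units have an observed-to-expected ratio of about 1, because unit B treats a much higher-risk population. A real analysis would normally use a fitted risk model and report confidence intervals rather than a two-stratum table.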
Patient-reported Outcome Measures in Rhinology
Quality of life is measured using one of a growing number of instruments; typically, these are questionnaires, but in some cases visual scales or grading systems can be used. These allow quantitative assessment of otherwise subjective results. So why not simply ask the patients if they are satisfied with their treatment? Although this is easy to do, patient satisfaction is influenced by many variables,5 such as the availability and convenience of health care, the “bedside manner” of the doctor, the affability of the extended team, and the perceived cleanliness of the hospital. Although these are all important, they complicate the evaluation of clinical outcome. To avoid this, questionnaires require patients to rate the impact of their disease across several specified “domains,” or areas of interest. Individual questions are scored according to the severity or impact of disease, and scores are then combined to produce an overall score. Scores can be used to follow patients with chronic disease or compared before and after an intervention, at an individual patient level or across different groups of patients, thus quantifying the amount of change.
Tips and Tricks
The term outcome measure is a misnomer; it is the change in score that informs us of the outcome. Therefore, it is essential to record PROM scores both before and after intervention.
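As a simple illustration of why both scores are needed, the sketch below computes each patient's change score (postintervention minus preintervention) and the mean change for a group. The patient identifiers and scores are invented, and the sketch assumes an instrument on which a higher score means worse symptoms, so that a negative change represents improvement.

```python
from statistics import mean


def change_scores(pre, post):
    """Return post-minus-pre change for every patient who has both scores.

    pre, post: dicts mapping a patient identifier to a total PROM score.
    Patients missing either score are left out, because no change can be
    calculated for them.
    """
    return {pid: post[pid] - pre[pid] for pid in pre if pid in post}


# Invented example data: patient p3 did not return a postoperative form.
pre_scores = {"p1": 40, "p2": 55, "p3": 30}
post_scores = {"p1": 22, "p2": 50}

changes = change_scores(pre_scores, post_scores)
mean_change = mean(changes.values())
```

Here patient p3 contributes nothing to the result, which shows how an incomplete baseline or follow-up record removes a patient from the outcome analysis altogether.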
Some PROMs have been developed for particular conditions or treatments (disease-specific measures), whereas others are designed for use in all patient groups or healthy individuals and measure the patients’ perception of their general health (generic measures).
Generic Outcome Measures
Generic PROMs allow comparison between conditions or treatments and therefore can be used to determine the impact of different diseases on patient groups, to evaluate the relative cost utility of different interventions, and to inform commissioning decisions.
The Short Form 36 Health Survey (SF-36) is a multipurpose, 36-item survey that measures eight domains of health: physical functioning, role limitations due to physical health, bodily pain, general health perceptions, vitality, social functioning, role limitations due to emotional problems, and mental health. It has been widely used in many medical conditions and has been reported on in over 5000 publications, with normative values available for the general population. Recently, norm-based scoring (NBS) algorithms were introduced for all eight scales, producing a score transformation with a mean of 50 and standard deviation (SD) of 10, making it easier to compare scores across the different scales, as well as with normative data. Using the SF-36, researchers have found that chronic rhinosinusitis (CRS) has a negative impact on several aspects of quality of life and has a greater impact on social functioning than congestive heart failure, angina, or back pain.6
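A transformation to a mean of 50 and an SD of 10 is a standard linear rescaling of a score against a reference population. The sketch below shows only the arithmetic; the normative mean and SD used in the example are hypothetical placeholders, not the published SF-36 norms, and the developers' scoring materials should be used for real data.

```python
def norm_based_score(scale_score, norm_mean, norm_sd):
    """Rescale a scale score so the reference population has mean 50 and SD 10.

    norm_mean and norm_sd describe the reference population for that scale
    (hypothetical values in the example below).
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (scale_score - norm_mean) / norm_sd   # distance from the norm in SD units
    return 50 + 10 * z


# Hypothetical scale with a population mean of 80 and an SD of 20.
at_norm = norm_based_score(80, norm_mean=80, norm_sd=20)       # equals the norm
one_sd_below = norm_based_score(60, norm_mean=80, norm_sd=20)  # one SD below the norm
```

With this scaling, a score at the population mean always becomes 50 and each SD of difference becomes 10 points, whichever scale is being reported, which is what makes scores on different scales directly comparable.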
The Health Utilities Index7 (HUI) includes eight domains (vision, hearing, speech, ambulation, dexterity, emotion, cognition, and pain). Although this makes it more useful for otological conditions, it lacks sensitivity for rhinologic conditions. The Scottish ENT Outcomes Study (SENTOS) found only a small benefit from nasal surgery (of all types grouped together) using the HUI.8
EQ-5D, a generic measure of HRQOL,9 has been recommended for future use by a working group of the UK Department of Health (DoH).1 The EQ-5D measures HRQOL across five domains: walking and mobility, ability to self-care, ability to perform usual activities, pain, and anxiety or depression. Though suitable for common surgical procedures such as hip and knee arthroplasty, for which it is currently being used by the DoH, global measures such as the EQ-5D may lack the sensitivity to assess changes in health status in many conditions. For example, when applied to cataract surgery, the DoH pilot study found that although the majority of patients (93.1% of 566 included in the study) reported that their vision was better following cataract surgery, there was no change in the EQ-5D (preop mean 0.81, postop mean 0.78).1 Similar results have been shown in patients with conductive hearing loss.10 One might also expect that the EQ-5D will fail to capture the impact of rhinologic disease or the effectiveness of treatment. This is of concern if such measures are used for demand management to ration health care. Furthermore, application of the SF-36, HUI, and EQ-5D in the same patient group can yield significantly different results.11
Caution
Although generic tools facilitate comparison between different disease groups, they are likely to lack sensitivity if used in rhinologic conditions.
The Glasgow Benefit Inventory (GBI) is a validated generic quality of life instrument that has been widely used in otolaryngology.12 It measures change in health status following interventions, allowing comparison between different types of treatment. It is a postintervention questionnaire that is administered once only and contains 18 five-point Likert-type questions that assess the effect of the intervention (“much worse,” “a little or somewhat worse,” “no change,” “a little or somewhat better,” and “much better”) on health status (Fig. 9.2). The questionnaire can be filled in by the patient in ~5 minutes or (preferably) completed by an interviewer in ~10 minutes. The GBI yields a total score and three subscale scores: a general subscale (12 questions), a social support subscale (3 questions), and a physical health subscale (3 questions). All of these scores range from +100 (maximum positive change) to -100 (maximum negative change). Both the questionnaire itself and the scoring guide have been translated into seven languages and are available for download (with permission) via the UK Medical Research Council–Institute of Hearing Research at www.ihr.mrc.ac.uk/downloads/products/questionnaires/GHSQManual.doc. The GBI has been used to show benefit from functional and cosmetic septorhinoplasty (+58.3),13 endoscopic sinus surgery (+23),14 endoscopic dacryocystorhinostomy (DCR) (+16.8),15 and septoplasty (+11.3).16 Although the once-only administration of the instrument is likely to increase compliance, it means that baseline data are not collected and therefore precludes its use in comparative (pre- and posttreatment) studies. It also fails to add to our clinical understanding of the severity of patients’ symptoms prior to treatment, which may otherwise help guide treatment.
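The range described above is what a linear rescaling of the mean item response produces. The sketch below assumes the five response options are coded 1 (“much worse”) to 5 (“much better”), so that 3 represents “no change”; this coding and the function name are assumptions made for illustration, and the official scoring guide should be followed in practice. Which questions belong to which subscale is not reproduced here, so the function simply scores whatever set of responses it is given.

```python
def benefit_score(responses):
    """Rescale five-point responses (coded 1-5) to a -100..+100 benefit score.

    The mean response is centered on the neutral value 3 and multiplied by 50,
    so all 1s give -100, all 3s give 0, and all 5s give +100.
    """
    if not responses:
        raise ValueError("no responses supplied")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    average = sum(responses) / len(responses)
    return (average - 3) * 50


# Invented example: 18 answers, every one "a little or somewhat better" (coded 4).
total = benefit_score([4] * 18)
```

The same function can be applied to the responses belonging to a single subscale to obtain that subscale's score on the same -100 to +100 range.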
Disease-specific PROMs in Rhinology
These generic instruments are often insensitive to small changes, which remain nonetheless important to the individual patient. There are a rapidly growing number of instruments designed to measure patients’ perception of HRQOL in relation to a specific disease. The time needed to collect PROMs is commonly cited as a barrier to their routine clinical use. However, disease-specific instruments readily identify the most important symptoms to patients, quantify the severity of all commonly associated symptoms, focus the consultation, and provide a useful clinical record; they thus may help facilitate a patient’s visit. If patients are asked to complete questionnaires in the waiting room prior to their appointment, scores can be quickly calculated and used to instantly demonstrate the severity and changes in symptoms, and subsequently may reduce consultation time. They can help define the aims of treatment and are likely to be more sensitive to small but clinically relevant changes in outcome than are global measures.
No textbook can keep pace with the development of new PROMs, but some of the key areas in rhinology are briefly reviewed here.
Rhinosinusitis
A recent literature review identified 15 disease-specific instruments designed for use in patients with rhinosinusitis (either acute or chronic).17 These were evaluated in terms of their reliability, validity, responsiveness, and ease of use. Key features are summarized in Table 9.1. RSOM-31 (Rhinosinusitis Outcome Measure), SNOT-16 (Sinonasal Outcome Test), and SNOT-20 are available for download (with permission) from http://oto2.wustl.edu/clinepi/downloads.html.
Although the choice will always depend on the clinical setting, the recent EPOS 2012 document40 has made recommendations for the use of specific instruments in rhinosinusitis.
Note
Recommended outcome tools based on current literature40:
Adult CRS: SNOT-22 or RSOM-31
Adult ARS: SNOT-16
Pediatric CRS: SN-5
Pediatric ARS: S-5
Morley and Sharp,18 based on their appraisal of the available measures, concluded that SNOT-22 was the most suitable tool in terms of reliability, validity, responsiveness, and ease of use. An earlier systematic review,41 conducted before SNOT-22 was validated, recommended either the RhinoQOL or the RSOM-31 (a longer version of SNOT-22) for rhinosinusitis, and the RQLQ(S) and mini-RQLQ for rhinitis. This review also presents an excellent assessment of the psychometric properties of each tool (Tables 9.2 and 9.3).
SNOT-22 was used to collect prospectively the outcomes of 3128 patients undergoing a range of surgical procedures for CRS who were recruited by the National Comparative Audit of Surgery for Chronic Rhinosinusitis and Nasal Polyposis.42 This is the largest published outcomes study to date in CRS and therefore provides useful benchmarking data against which future studies may be compared. Significant reductions in SNOT-22 scores were achieved by surgery and maintained across a 5-year period (Fig. 9.3). Psychometric validation has been completed,39 suggesting excellent internal consistency, test–retest reliability, and discriminant ability and confirming that the minimally important change for an individual with CRS was 9 points. Normative data using SNOT-22 were collected, suggesting a median of 7 for the normal population. Several other outcome measures (CSS,28 RSI,38 Cologne Questionnaire,35 and Chronic Sinusitis Type Questionnaire27), although quite extensive in evaluating various symptoms of rhinosinusitis, do not aim to provide a comprehensive physical, functional, and psychosocial quality of life assessment.
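The minimally important change can be used to interpret an individual patient's result. The sketch below labels a change in total SNOT-22 score using the 9-point figure quoted above; it assumes that lower totals mean fewer symptoms (consistent with the reductions seen after surgery) and that a change of exactly 9 points counts as important. The category labels and function name are illustrative only.

```python
MINIMALLY_IMPORTANT_CHANGE = 9  # points, as quoted in the text for SNOT-22 in CRS


def classify_change(pre_total, post_total, mic=MINIMALLY_IMPORTANT_CHANGE):
    """Label a patient's change in total score relative to the threshold.

    Assumes a lower total means fewer symptoms, so a fall of at least `mic`
    points is an improvement and a rise of at least `mic` points is a worsening.
    """
    change = post_total - pre_total
    if change <= -mic:
        return "improved"
    if change >= mic:
        return "worsened"
    return "no important change"


# Invented example scores.
first = classify_change(42, 28)    # fall of 14 points
second = classify_change(30, 25)   # fall of 5 points
third = classify_change(20, 29)    # rise of 9 points
```

A patient whose score falls by 5 points has improved numerically but not by an amount shown to be meaningful to patients, which is the distinction the minimally important change is intended to capture.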
Rhinitis and Rhinoconjunctivitis
The Rhinoconjunctivitis Quality of Life Questionnaire (RQLQ) was developed in 1991 and has since been adapted in various forms: the standardized RQLQ, or RQLQ(S); the Nocturnal RQLQ (NRQLQ), for measurement of nocturnal rhinitis; and the mini-RQLQ, with 14 items instead of 28.18–22 It has excellent measurement characteristics, while its timeframe (over 1 week) makes it more sensitive to recent changes. It has been used in more than 90 studies and is considered the gold standard in the assessment of quality of life in patients with rhinitis.
Nasal Obstruction and Septal Surgery
The Nasal Obstruction Septoplasty Effectiveness (NOSE) questionnaire is a validated 5-item instrument for use with patients with nasal obstruction. It measures improvements in quality of life following septoplasty, functional septorhinoplasty, and nasal valve surgery.43 Questions refer to nasal congestion, nasal blockage, difficulty breathing through the nose, problems with sleep, and difficulty with exercising; answers are recorded on a five-point Likert scale (0–4). SNOT-22 has also been used in the evaluation of septoplasty outcomes, although it has not been validated for use in this patient group.44 In a study of 67 patients,45 the Fairley Nasal Questionnaire showed patients had limited improvement following septoplasty (from 13.2 to 9.1), while GBI confirmed moderate improvement (6.2 in nonresponders and 23.8 in responders).
Tips and Tricks
When selecting a PROM, there is a trade-off between sensitivity and respondent burden. Longer instruments provide more detailed outcome information but at the risk of lower response rates, as they become too onerous to complete.