Patient-reported Outcome Measures and Measurement Tools in Rhinology

10.1055/b-0034-77985

Claire Hopkins and Christos Georgalas

Summary

Patients now rightly expect their doctors to record outcomes of clinical care. Clinical measures are becoming more sophisticated, and the equipment required to make such assessments is now widely available.

Routine collection of patient-reported outcome measures (PROMs) is likely to become mandatory for health care providers. We should embrace this opportunity and use this patient-rated information to enhance the doctor–patient relationship and focus communication. Careful use in cohort studies or within randomized trials may identify important differences in outcomes, as well as between treatments and providers, although it is still important to recognize the limitations of the outcome tools.

Introduction

Although surgery is an ancient art, outcome measurement remains very much in its infancy. For centuries, assessment has involved simple dichotomous outcomes: dead/alive, cured/residual disease, sometimes venturing as far as better/worse, and usually decided by the surgeons themselves. The first author’s initial experience with outcome measurement involved a surgeon ramming the end of a mallet through a patient’s nostrils to demonstrate a “successful” outcome following septal surgery. After several high-profile failings of medical care worldwide, there has been a growing demand for greater transparency and publication of outcome data following surgical intervention. This, coupled with the explosive growth of evidence-based medicine, has led to a significant refinement in the measurement of surgical outcomes. The move toward greater patient empowerment and patients’ increasingly active role in health management has brought into the spotlight patients’ own evaluation of their health-related quality of life (HRQOL) before and after medical or surgical interventions.

Why Measure Outcome?

It seems intuitive that doctors would wish to measure whether they are successful in achieving their treatment aims following surgical or medical intervention. There are many additional benefits to measuring surgical outcome:

Allows individual surgeons to judge and improve their practice
Allows refinement of surgical techniques by comparing procedures
Helps patients to make informed choices about their care, allowing comparison between medical and surgical treatments or between different procedures
Provides greater public transparency and accountability
Enables quality assurance of operations
Facilitates comparison between health care providers
Provides data for health care commissioners when making funding decisions, as it may allow comparison of the impact of treatments across different specialties
Has become, or will become, an essential component of revalidation in the United Kingdom, the United States, the Netherlands, and other countries

Paradoxically, the use of patient outcomes for revalidation may contribute to resistance from doctors to outcome measurement, particularly when there is lack of reassurance that assessment will be fairly performed and properly risk adjusted. The outcome chosen must be appropriate, measured at a suitable and equitable period of time following treatment, and take into account the disease severity and comorbidities of the patient population.

When a condition has a high mortality rate, the success of a medical treatment or surgery may be measured by its impact on the survival rate. However, most rhinologic conditions are not life threatening, but instead have an impact on quality of life or produce functional disability. Therefore, outcome assessment must detect more subtle changes and will often include both a patient-rated outcome measure and an assessment by the clinician.

There is now growing acceptance that patients’ views are essential in the delivery of high-quality care. PROMs are measures of HRQOL that are self-rated and reported directly by patients.1 They usually refer to a single point in time or a clearly defined preceding period; thus, the term outcome measure in this setting is a misnomer. The impact of medical care can be determined by comparing repeated measures of a patient’s self-reported health status before and after the intervention. We are more familiar with clinical measures, either physiologic parameters (e.g., blood tests) that require interpretation within the clinical context by a physician or clinician-rated measures of disease severity (e.g., grading of computed tomography [CT] findings). Clinical measures often correlate poorly with PROMs and fail to predict changes in PROMs following treatment.2 In addition, individual patients may report very different levels of HRQOL despite having a similar disease burden, as their response to disease may be modified by both their own individual characteristics and their environment.

Clinical versus Patient-reported Outcomes

During the time of Florence Nightingale, patient outcome was recorded as “dead, relieved, or unrelieved.”3 Until the last decade little had changed, with most reported hospital outcome data consisting of only mortality rates. Comparing hospital mortality rates actually provides little useful information on the quality of care, as they are greatly influenced by patient mix, geographical and cultural preferences on the place of death, and availability of palliative care beds and hospices. Even most elective cardiac procedures have an operative mortality rate of <1%. Fortunately, deaths in rhinologic surgery are exceptionally rare, and 5-year survival rates apply only in sinonasal malignancy. We therefore need more sensitive measures of outcome for our patients. In other cases, surgery may be deemed a technical success, but the patient may fail to experience improvement in symptoms ( Fig. 9.1 ).

The choice of outcome measure should therefore reflect both the aim of treatment and the reason for analysis. For example, in treating malignant conditions, both disease-free and overall survival rates are useful, although the impact of treatment itself on the quality of life of the patient should also be measured. When undertaking rhinoplasty, the patient’s satisfaction with his or her postoperative appearance is the most important aim; however, a surgeon may learn more about technical aspects of the procedure from photographic analysis.

In the past, clinician-rated outcomes were often described as “objective” measures and patient-rated outcomes as “subjective.” However, a clinician assessing the endoscopic appearance of a nose or the cosmetic appearance after rhinoplasty may be no less subjective than a patient rating his or her own symptoms. Interobserver variability is also often quite high with clinician-rated scoring systems. Although some clinical outcome measures are derived from technical measurements, these too are prone to sampling error; the terms “objective” and “subjective” should therefore no longer be applied.

Note

When choosing an outcome measure, think carefully about the aim of surgery. If it is to improve quality of life, then a PROM is the ideal choice.

Clinician-reported Outcome Measures in Rhinology

Clinician-rated outcomes in rhinology are complementary to PROMs, as they can help clinicians to benchmark their results, facilitate research, and provide information about the disease process, its severity, and the potential effects of treatment.4 They are discussed in more detail in the relevant chapters and include endoscopic, radiologic, functional, and photographic evaluation of the results of an intervention.

Complication Rates

Measuring the incidence of complications is important before obtaining informed consent for any treatment, either medical or surgical, as well as for comparing different treatment options and allowing assessment of quality of care.

Some complication rates are more prone to bias. For example, readmission rates for epistaxis after rhinologic surgery will depend on local policies and the structure of health care systems: some centers will admit all patients, whereas others may discharge patients with epistaxis packs in situ. In contrast, the indications to return a patient for surgical control of postoperative epistaxis are likely to be more consistent between units. Caution is therefore required when selecting complication rates to compare different health care providers.

Caution

Case-mix adjustment is essential when using complication rates and mortality figures to compare health care providers.

Patient-reported Outcome Measures in Rhinology

Quality of life is measured using one of a growing number of instruments; typically, these are questionnaires, but in some cases visual scales or grading systems can be used. These allow quantitative assessment of otherwise subjective results. So why not simply ask the patients if they are satisfied with their treatment? Although this is easy to do, patient satisfaction is influenced by many variables,5 such as the availability and convenience of health care, the “bedside manner” of the doctor, the affability of the extended team, and the perceived cleanliness of the hospital. Although these are all important, they complicate the evaluation of clinical outcome. To avoid this, questionnaires require patients to rate the impact of their disease across several specified “domains,” or areas of interest. Individual questions are scored according to the severity or impact of disease, and scores are then combined to produce an overall score. Scores can be used to follow patients with chronic disease or compared before and after an intervention at an individual patient level or across different groups of patients, thus quantifying the amount of change.

Tips and Tricks

The term outcome measure is a misnomer; it is the change in score that informs us of the outcome. Therefore, it is essential to record PROM scores both before and after intervention.
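
As a minimal sketch of this point, assume a hypothetical questionnaire whose item scores are simply summed (real instruments may weight or rescale items). The function names and the example item scores below are illustrative only and do not come from any published instrument.

```python
def total_score(item_scores):
    """Combine individual item scores into an overall score (simple sum)."""
    return sum(item_scores)


def change_in_score(pre_items, post_items):
    """Return the post-intervention total minus the pre-intervention total.

    Where a higher score means worse symptoms, a negative change
    indicates improvement.
    """
    if len(pre_items) != len(post_items):
        raise ValueError("pre and post questionnaires must have the same items")
    return total_score(post_items) - total_score(pre_items)


# Hypothetical 5-item questionnaire, each item rated 0 (no problem) to 5 (severe).
pre = [4, 3, 5, 2, 4]
post = [2, 1, 3, 1, 2]
print(change_in_score(pre, post))  # -9
```

Without the preoperative record, only the postoperative total of 9 would be known, and the size of the change could not be recovered.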

Some PROMs have been developed for particular conditions or treatments (disease-specific measures), whereas others are designed for use in all patient groups or healthy individuals and measure the patients’ perception of their general health (generic measures).

Generic Outcome Measures

Generic PROMs allow comparison between conditions or treatments and therefore can be used to determine the impact of different diseases on patient groups, to evaluate the relative cost utility of different interventions, and to inform commissioning decisions.

The Short Form 36 Health Survey (SF-36) is a multipurpose, 36-item survey that measures eight domains of health: physical functioning, role limitations due to physical health, bodily pain, general health perceptions, vitality, social functioning, role limitations due to emotional problems, and mental health. It has been widely used in many medical conditions and has been reported on in over 5000 publications, with normative values available for the general population. Recently, norm-based scoring (NBS) algorithms were introduced for all eight scales, producing a score transformation with a mean of 50 and standard deviation (SD) of 10, making it easier to compare scores across the different scales, as well as with normative data. Using the SF-36, researchers have found that chronic rhinosinusitis (CRS) has a negative impact on several aspects of quality of life and has a greater impact on social functioning than congestive heart failure, angina, or back pain.6
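
The idea behind a transformation to a mean of 50 and an SD of 10 can be sketched as a simple linear standardization. This is not the published SF-36 scoring algorithm, which relies on specific population norms and scale-specific steps that are not reproduced here; the function name and the reference values in the example are assumptions made purely for illustration.

```python
def norm_based_score(raw_score, population_mean, population_sd):
    """Rescale a scale score so the reference population has mean 50 and SD 10."""
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    # Distance from the population mean, in standard deviations.
    z = (raw_score - population_mean) / population_sd
    return 50 + 10 * z


# Hypothetical reference values, for illustration only.
print(norm_based_score(raw_score=60.0, population_mean=75.0, population_sd=20.0))  # 42.5
```

On this scale, any score below 50 lies below the reference population average, whichever of the eight scales it comes from.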

The Health Utilities Index7 (HUI) includes eight domains (vision, hearing, speech, ambulation, dexterity, emotion, cognition, and pain). Although this makes it more useful for otological conditions, it lacks sensitivity for rhinologic conditions. The Scottish ENT Outcomes Study (SENTOS) found only a small benefit from nasal surgery (of all types grouped together) using the HUI.8

EQ-5D, a generic measure of HRQOL,9 has been recommended for future use by a working group of the UK Department of Health (DoH).1 The EQ-5D measures HRQOL across five domains: walking and mobility, ability to self-care, ability to perform usual activities, pain, and anxiety or depression. Though suitable for common surgical procedures such as hip and knee arthroplasty, for which it is currently being used by the DoH, global measures such as the EQ-5D may lack the sensitivity to assess changes in health status in many conditions. For example, when applied to cataract surgery, the DoH pilot study found that although the majority of patients (93.1% of 566 included in the study) reported that their vision was better following cataract surgery, there was no change in the EQ-5D (preop mean 0.81, postop mean 0.78).1 Similar results have been shown in patients with conductive hearing loss.10 One might also expect that the EQ-5D will fail to capture the impact of rhinologic disease or the effectiveness of treatment. This is of concern if such measures are used for demand management to ration health care. Furthermore, application of the SF-36, HUI, and EQ-5D in the same patient group can yield significantly different results.11

Caution

Although generic tools facilitate comparison between different disease groups, they are likely to lack sensitivity if used in rhinologic conditions.

The Glasgow Benefit Inventory (GBI) is a validated generic quality of life instrument that has been widely used in otolaryngology.12 It measures change in health status following interventions, allowing comparison between different types of treatment. It is a postintervention questionnaire that is administered once only and contains 18 five-point Likert-type questions that assess the effect of the intervention on health status (“much worse,” “a little or somewhat worse,” “no change,” “a little or somewhat better,” and “much better”) ( Fig. 9.2 ). The questionnaire can be filled in by the patient in ~5 minutes or (preferably) completed by an interviewer in ~10 minutes. The GBI yields a total score and three subscale scores: a general subscale (12 questions), a social support subscale (3 questions), and a physical health subscale (3 questions). All of these scores range from −100 (maximum negative change) to +100 (maximum positive change). Both the questionnaire itself and the scoring guide have been translated into seven languages and are available for download (with permission) via the UK Medical Research Council–Institute of Hearing Research at www.ihr.mrc.ac.uk/downloads/products/questionnaires/GHSQManual.doc. The GBI has been used to show benefit from functional and cosmetic septorhinoplasty (+58.3),13 endoscopic sinus surgery (+23),14 endoscopic dacryocystorhinostomy (DCR) (+16.8),15 and septoplasty (+11.3).16 Although the once-only administration of the instrument is likely to increase compliance, it also means that baseline data are not collected, which precludes its use in comparative (pre- and posttreatment) studies. It also fails to add to our clinical understanding of the severity of patients’ symptoms prior to treatment, which may otherwise help guide treatment.
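
One way to see how five-point responses can produce scores between −100 and +100 is a linear rescaling of the mean response. The sketch below assumes each response has already been coded from 1 (“much worse”) to 5 (“much better”); the function name, the example responses, and the position of the subscale items are hypothetical, and the scoring guide cited above remains the authority on which questions belong to which subscale.

```python
def gbi_score(responses):
    """Rescale 1-5 Likert responses to the -100..+100 range.

    A mean response of 3 ("no change") maps to 0, 5 to +100, and 1 to -100.
    """
    if not responses:
        raise ValueError("at least one response is required")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    mean_response = sum(responses) / len(responses)
    return (mean_response - 3) * 50


# Hypothetical answers to all 18 questions.
answers = [4] * 12 + [3] * 3 + [5] * 3
print(gbi_score(answers))       # total score: 50.0
# Item positions are an assumption; take the true subscale items from the manual.
print(gbi_score(answers[12:15]))  # a 3-question subscale answered "no change": 0.0
```

The same function serves the total score and each subscale, because only the set of questions passed in changes.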

Disease-specific PROMs in Rhinology

These generic instruments are often insensitive to small changes, which remain nonetheless important to the individual patient. There is a rapidly growing number of instruments designed to measure patients’ perception of HRQOL in relation to a specific disease. The time needed to collect PROMs is commonly cited as a barrier to their routine clinical use. However, disease-specific instruments readily identify the most important symptoms to patients, quantify the severity of all commonly associated symptoms, focus the consultation, and provide a useful clinical record; they thus may help facilitate a patient’s visit. If patients are asked to complete questionnaires in the waiting room prior to their appointment, scores can be quickly calculated and used to instantly demonstrate the severity and changes in symptoms, and subsequently may reduce consultation time. They can help define the aims of treatment and are likely to be more sensitive to small but clinically relevant changes in outcome than are global measures.

Fig. 9.2 Glasgow Benefit Inventory (GBI) questionnaire. (Courtesy of Professor George Browning.)

No textbook can keep up with the development of new PROMs, but some of the key areas in rhinology are briefly reviewed here.

Rhinosinusitis

A recent literature review identified 15 disease-specific instruments designed for use in patients with rhinosinusitis (either acute or chronic).17 These were evaluated in terms of their reliability, validity, responsiveness, and ease of use. Key features are summarized in Table 9.1. RSOM-31 (Rhinosinusitis Outcome Measure), SNOT-16 (Sinonasal Outcome Test), and SNOT-20 are available for download (with permission) from http://oto2.wustl.edu/clinepi/downloads.html.

Although the choice will always depend on the clinical setting, the recent EPOS 2012 document40 has made recommendations for the use of specific instruments in rhinosinusitis.

Note

Recommended outcome tools based on the current literature40:

Adult CRS: SNOT-22 or RSOM 31
Adult ARS: SNOT-16
Pediatric CRS: SN-5
Pediatric ARS: S-5

Morley and Sharp,18 based on their appraisal of the available measures, concluded that SNOT-22 was the most suitable tool in terms of reliability, validity, responsiveness, and ease of use. An earlier systematic review,41 conducted before SNOT-22 was validated, recommended either the RhinoQOL or the RSOM-31 (a longer precursor of SNOT-22) for rhinosinusitis, and the RQLQ(S) and mini-RQLQ for rhinitis. This review also presents an excellent assessment of the psychometric properties of each tool ( Tables 9.2 and 9.3 ).

Fig. 9.3 SNOT-22 (Sinonasal Outcome Test) scores for a cohort of 3128 patients in the United Kingdom taking part in the National Comparative Audit of Surgery for Chronic Rhinosinusitis and Nasal Polyposis. Note the improved postoperative outcomes for patients with nasal polyposis and the maintained reduction in SNOT-22 scores after 5 years.42

SNOT-22 was used to collect prospectively the outcomes of 3128 patients undergoing a range of surgical procedures for CRS who were recruited by the National Comparative Audit of Surgery for Chronic Rhinosinusitis and Nasal Polyposis.42 This is the largest published outcomes study to date in CRS and therefore provides useful benchmarking data against which future studies may be compared. Significant reductions in SNOT-22 scores were achieved by surgery and maintained across a 5-year period ( Fig. 9.3 ). Psychometric validation has been completed,39 suggesting excellent internal consistency, test–retest reliability, and discriminant ability and confirming that the minimally important change for an individual with CRS was 9 points. Normative data using SNOT-22 were collected, suggesting a median of 7 for the normal population. Several other outcome measures (CSS,28 RSI,38 Cologne Questionnaire,35 and Chronic Sinusitis Type Questionnaire27), although quite extensive in evaluating various symptoms of rhinosinusitis, do not aim to provide a comprehensive physical, functional, and psychosocial quality of life assessment.
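
The minimally important change quoted above can be applied to an individual patient as a simple threshold on the change in total score. This is a sketch only: the function name, the category labels, and the example totals are assumptions, and it takes lower SNOT-22 totals to mean fewer symptoms, as implied by the reductions achieved after surgery.

```python
MINIMALLY_IMPORTANT_CHANGE = 9  # points, as quoted in the text for SNOT-22


def classify_change(pre_total, post_total, mic=MINIMALLY_IMPORTANT_CHANGE):
    """Classify an individual's change in SNOT-22 total score.

    A fall of at least `mic` points counts as improvement and a rise of
    at least `mic` points as worsening; anything smaller is treated as
    no important change.
    """
    change = post_total - pre_total
    if change <= -mic:
        return "improved"
    if change >= mic:
        return "worse"
    return "no important change"


# Hypothetical pre- and postoperative totals.
print(classify_change(42, 28))  # improved
print(classify_change(42, 38))  # no important change
```

A threshold of this kind applies to individuals; group comparisons would still rely on mean changes and their confidence intervals.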

Rhinitis and Rhinoconjunctivitis

The Rhinoconjunctivitis Quality of Life Questionnaire (RQLQ) was developed in 1991 and has since been adapted in various forms: the standardized RQLQ; the Nocturnal RQLQ (NRQLQ), for measurement of nocturnal rhinitis; and the mini-RQLQ, with 14 items instead of 28.18–22 It has excellent measurement characteristics, and its 1-week timeframe makes it sensitive to recent changes. It has been used in more than 90 studies and is considered the gold standard in the assessment of quality of life in patients with rhinitis.

Table 9.1 Key features of studies on disease-specific instruments designed for use in patients with either rhinitis or rhinosinusitis
Instrument	Year of Introduction	Description	Comment
Rhinitis
Rhinoconjunctivitis Quality of Life Questionnaire (RQLQ)19	1991	28 items, 7 domains, validated for use in allergic rhinitis	Well validated but poor ease of use. Exists in several derivatives (short forms, etc.)
Rhinitis Quality of Life Questionnaire20	1993	24 items, 6 domains
Standardized RQLQ21	1999	As above
Mini-RQLQ22	2000	14 items over 5 domains, greater ease of use
Nocturnal RQLQ23	2003	16 items over 4 domains, specific to sleep disturbance and nocturnal symptoms
Rhinitis Outcome Questionnaire24	2001	26 items, 4 domains; nose, eye, chest, systemic
Rhinosinusitis
Fairley′s Symptom Questionnaire25	1993	12 item, fully validated tool
Chronic Sinusitis Type-Specific Questionnaire27	1993	Extensive tool with 3 forms	High respondent burden. Complexity limits usefulness26
Sinusitis Survey27	1994	5 items	Lacks psychometric validation
Chronic Sinusitis Survey28	1995	6 items in two domains, focused on duration of symptoms and medication usage	Norwegian and Chinese validated translation
Rhinosinusitis Outcome Measure29 (RSOM-31)	1995	31 items, 7 domains with added severity and importance scales
Rhinosinusitis Disability Index (RSDI)30	1997	30 items, 1 domain, linked to impact on daily functioning	Turkish validated translation
Rhinosinusitis Utility Index31	1998	10 items	Designed for cost-effectiveness analysis
Sinonasal Outcome Test-1632	1999	16 item modification of the RSOM 31
Sinonasal Outcome Test-20 (SNOT-20)33	2002	20 items, 1 domain; modification of the RSOM-31 by item reduction (deleted items included nasal obstruction and loss of sense of smell)	Japanese, Chinese, Portuguese validated translations
Sinonasal Assessment Questionnaire34 (SNAQ)	2002	11 item modification of SNOT-20
Cologne Questionnaire35	2002	7 items focusing on symptom severity, lacks validation
SN-536	2003	5 item survey for use in pediatric population only
Rhinosinusitis Symptom Inventory37	2004	12 items, focused on demonstrating change in symptoms and medication use
Rhinoqol38	2005	17 items, includes measure of frequency and impact of symptoms	French validated translation
Sinonasal Outcome Test-2239	2006	Further modification of SNOT-20, returning 2 deleted symptoms (sense of smell and nasal obstruction)	Danish, Czech, Chinese, Swedish, and Portuguese validated translations
Recommended tools are in boldface.

Table 9.2 Characteristics and criteria for quality assessment
Property	Part	Criterion	Points
Construction
Measurement goals	Targeted patient population	If provided	1
	Purpose: discrimination and/or evaluation	If provided	1
	For use in: (clinical) trial or clinical practice	Used for level of reliability	–
Item generation	Sources: literature (incl. questionnaires), clinician, patients	If all three sources are used	1
Item reduction	Approach: conceptual; patient feedback; statistical analysis	If all three methods are used	1
	Scale construction: conceptual; patient feedback; statistical analysis	If all three methods are used	1
Description	Items, domains, response, score	If all four are provided	1
	Timeframe	If provided	1
Feasibility	Feedback of patients	If obtained	1
	Completion time	If provided	1
Validation study	Kind of patients	If representative of target patient population	1
	Number of patients	If ≥ 100	1
Psychometric properties
Reliability	Internal reliability	At group level: Cronbach’s α ≥ 0.7, or at individual level: Cronbach’s α ≥ 0.9	1
	Test-retest	(Significant t-test and Pearson/Spearman) or (ICC): at group level: correlation ≥ 0.7, or at individual level: correlation ≥ 0.9	1
Validity	Content validity	If confirmed (qualitative)	1
	Convergent validity	If correlation is between 0.4 and 0.8	1
	Discriminant validity	If the purpose is evaluation: this item is NA	NA
		If the purpose is discrimination: P-value <0.05	1
Responsiveness		If the purpose is evaluation: P-value <0.05 or responsiveness statistic is ≥ 0.5	1
		If the purpose is discrimination: this item is NA	NA
Clinically significant change		If the purpose is evaluation: used method and outcome provided	1
		If the purpose is discrimination: this item is NA	NA
From Van Oene CM, van Reij E, Sprangers M, Fokkens WJ. Quality assessment of disease-specific quality of life questionnaires for rhinitis and rhinosinusitis: a systematic review. Allergy. 2007;62:1359–1371, with permission.
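
The internal reliability criterion in the table can be illustrated with the standard formula for Cronbach's alpha, computed from the item variances and the variance of respondents' total scores. The sketch below is not taken from the cited review; the function names and the example responses are hypothetical, and only the 0.7 and 0.9 thresholds come from the table.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item-score lists.

    `responses[i][j]` is respondent i's score on item j.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[j] for row in responses]) for j in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def reliability_level(alpha):
    """Apply the group-level (0.7) and individual-level (0.9) thresholds."""
    if alpha >= 0.9:
        return "adequate at individual level"
    if alpha >= 0.7:
        return "adequate at group level"
    return "below threshold"


# Hypothetical responses: 5 respondents, 3 items.
data = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4], [5, 4, 5]]
print(reliability_level(cronbach_alpha(data)))
```

The stricter threshold for individual-level use reflects the fact that a single patient's score carries more measurement error than a group mean.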

Nasal Obstruction and Septal Surgery

The Nasal Obstruction Septoplasty Effectiveness (NOSE) questionnaire is a validated 5-item instrument for use with patients with nasal obstruction. It measures improvements in quality of life following septoplasty, functional septorhinoplasty, and nasal valve surgery.43 Questions refer to nasal congestion, nasal blockage, difficulty breathing through the nose, problems with sleep, and difficulty with exercising; answers are given on a 5-point Likert scale (0–4). SNOT-22 has also been used in the evaluation of septoplasty outcomes, although it has not been validated for use in this patient group.44 In a study of 67 patients,45 the Fairley Nasal Questionnaire showed patients had limited improvement following septoplasty (from 13.2 to 9.1), while GBI confirmed moderate improvement (6.2 in nonresponders and 23.8 in responders).

Table 9.3 Choosing a questionnaire
Questionnaire	Patient population (measurement goal)	Purpose (measurement goal)	Discriminant validity	Responsiveness	Points* (x + y = z/m)
Questionnaires for rhinitis
RQLQ	Rhinoconjunctivitis	Evaluation	Not applicable	Yes	7 + 4 = 11/17
Rhinitis QOL	Perennial rhinitis	Evaluation	Not applicable	Yes	4 + 3 = 7/17
RQLQ(S)	Seasonal and perennial rhinitis	Evaluation	Not applicable	Yes	6 + 8 = 14/17
MiniRQLQ	Rhinoconjunctivitis	Evaluation; discrimination (of impairment)	No	Yes	7 + 8 = 15/18
ROQ	Allergic rhinitis	Evaluation	Not applicable	Yes	6 + 4 = 10/17
NRQLQ	Nocturnal allergic rhinitis	Evaluation; discrimination (of disease severity)	No	Yes	6 + 5 = 11/18
Questionnaires for rhinosinusitis
RSOM-31	Rhinosinusitis	Evaluation; discrimination (disease vs no disease)	Yes	Yes	7 + 8 = 15/18
SNOT-16	Rhinosinusitis	Evaluation; discrimination (disease vs no disease)	Yes	Yes	3 + 4 = 7/17
SNOT-20	Rhinosinusitis	Evaluation; discrimination (disease vs no disease)	Yes	No	6 + 7 = 13/17
RSDI	Rhinosinusitis	Evaluation; discrimination (disease vs no disease)	Yes	No	4 + 3 = 7/18
RhinoQol	Sinusitis	Evaluation; discrimination (sinusitis and rhinitis)	Yes	Yes	8 + 6 = 14/18
Questionnaire linked to rhinitis and rhinosinusitis
GNPI	Rhinology patient	Evaluation	N/A	No	3 + 3 = 6/16
Rhinasthma	Rhinoconjunctivitis and asthma	Evaluation; discrimination (rhinitis vs rhinitis and asthma)	No	No	3 + 3 = 6/18
* x is the number of points obtained for the construction, description, and feasibility; y is the number of points obtained for the validation study and psychometric properties; z is the total number of points obtained; m is the maximum number of points the questionnaire could have obtained. (Note: If the purpose is only evaluation, the maximum number is 17; if the purpose is discrimination and evaluation, the maximum number is 18; if the number of domains is 1, the maximum score decreases by 1.)
GNPI, general nasal patient inventory; N/A, not applicable; NRQLQ, Nocturnal Rhinoconjunctivitis Quality of Life Questionnaire; QOL, quality of life; RSDI, Rhinosinusitis Disability Index; ROQ, rhinitis outcomes questionnaire; RSOM, Rhinosinusitis Outcome Measure; SNOT, Sinonasal Outcome Test.
From Van Oene CM, van Reij E, Sprangers M, Fokkens WJ. Quality assessment of disease-specific quality of life questionnaires for rhinitis and rhinosinusitis: a systematic review. Allergy. 2007;62:1359–1371, with permission.
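
The rule in the table footnote for the maximum number of points (m) can be written out as a short sketch. The function names and the two boolean parameters are illustrative assumptions; the numbers 17 and 18 and the one-point deduction for single-domain instruments come from the footnote.

```python
def max_points(discrimination, single_domain):
    """Maximum attainable points (m) under the footnote's rules.

    Evaluation only: 17; evaluation and discrimination: 18;
    one point fewer if the questionnaire has a single domain.
    """
    m = 18 if discrimination else 17
    if single_domain:
        m -= 1
    return m


def points_summary(x, y, discrimination, single_domain):
    """Format a row's points in the table's 'x + y = z/m' notation."""
    z = x + y
    m = max_points(discrimination, single_domain)
    return f"{x} + {y} = {z}/{m}"


# SNOT-20: evaluation and discrimination, one domain.
print(points_summary(6, 7, discrimination=True, single_domain=True))    # 6 + 7 = 13/17
# RQLQ: evaluation only, seven domains.
print(points_summary(7, 4, discrimination=False, single_domain=False))  # 7 + 4 = 11/17
```

Because m differs between instruments, the fraction z/m is a fairer basis for comparison than the raw total z.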

Tips and Tricks

When selecting a PROM, there is a trade-off between sensitivity and respondent burden. Longer instruments provide more detailed outcome information but at the risk of lower response rates, as they become too onerous to complete.
