1999 NRP
Progress Report
Section 5: Methodology
Introduction
The importance of the issues
under consideration by the Panel cannot be overstated.
For decades educators have been studying how children
learn to read, often producing conflicting results.
More recently, science has opened windows that allow
researchers to observe how the brain functions as reading
skills develop. Although these advances have afforded
a clearer understanding of how the brain processes information
transmitted through the written word, the issues remain
complex; the debates continue.
Many believe the debates have
gone on long enough. Congress has recognized the urgency
of sorting through the research and, based on trustworthy
evidence, developing recommendations and strategies
that can be used directly by educators in the classroom.
That is the Panel's task.
The Panel believes that it would
not have been possible to accomplish the mandate of
Congress without first hearing directly from consumers
of this information -- teachers, parents, and students
-- about their needs and their understanding of the
research. Although the regional hearings were not intended
as a substitute for scientific research, the hearings
gave the Panel an opportunity to listen to the voices
of those who will need to implement any determination(s)
the Panel develops. The hearings gave members a clearer
understanding of the issues important to the public.
As a result of these hearings,
the Panel altered and broadened its own agenda. It decided,
for example, that it would be important to examine issues
related to teaching standards and practices, since it
was clear that the public was very concerned about these
matters. The Panel also decided that the issue of research
evaluation methodology itself was so important that
it should spend time defining a methodology that would
constitute a rigorous and replicable scientific exploration.
Meanwhile, the Panel understood
that criteria had to be developed as it considered which
research studies would be eligible for assessment. There
are two reasons for determining such guidelines or rules
from the beginning. First, the use of common search
and selection, analysis, and reporting procedures will
allow this effort to proceed, not as a diverse collection
of independent, and possibly uneven, synthesis
papers, but as parts of a greater whole. The use of
common procedures will permit a more unified presentation
of the combined methods and findings. Second, the amount
of synthesis needed is great, and, consequently, the
Panel must work in diverse subgroups to complete the
reports. However, in the end the Panel will need to
arrive at findings that all members of the NRP will
be able to endorse. Common procedures should increase
the Panel's ability to reach final agreements.
Conceptualization
of Research Questions and Problem Identification Procedures
Congress mandated that the NRP
conduct a series of research reviews on the teaching
of reading. The Panel, through an examination of various
public databases, determined that there is a universe
of approximately 100,000 studies on reading published
since 1966, and perhaps another 15,000 completed before
that time. It was apparent that the Panel could not
review all of this material adequately in the time
allotted.
To ensure success, several actions
were taken. First, a request was made to extend the
Panel's timeline by one year. This request was
granted. Second, support for hiring research assistants
and consultants was sought from the National Institute
of Child Health and Human Development, and this was provided.
Third, decisions were made to narrow the search by limiting
the reviews to only those studies that focus directly
on children's reading development (preschool through
grade 12) and are published in English in a refereed
journal. The Panel was asked to defer issues of second
language learning and bilingual education, as these
were to be the focus of future panels and new research
efforts.
Following its
Charge, the Panel's reviews will seek research-based
answers to seven questions that the Panel carefully
determined to be of great importance in children's
reading development and essential to its Charge:
- Does instruction in phonemic awareness improve reading?
If so, how is this instruction best provided?
- Does phonics instruction improve reading achievement?
If so, how is this instruction best provided?
- Does guided oral reading instruction improve fluency
and reading comprehension? If so, how is this instruction
best provided?
- Does vocabulary instruction improve reading achievement?
If so, how is this instruction best provided?
- Does comprehension strategy instruction improve
reading? If so, how is this instruction best provided?
- Do programs that increase the amount of children's
independent reading improve reading achievement and
motivation? If so, how is this instruction best provided?
- Does teacher education influence how effective teachers
are at teaching children to read? If so, how is this
instruction best provided?
These questions represent topics of
widespread interest in the field of reading education.
They have been articulated in a wide range of theories,
research studies, instructional programs, curricula,
assessments, and policies as being central issues in
reading achievement. It is likely that clarifying
the evidence bearing on these questions
will lead to improved instruction and to greater learning.
Each subgroup will generate a list of additional subordinate
questions that they will attempt to pursue within each
of these major questions.
It must be remembered, however, that
these are not the only issues of importance in learning
to read. The Panel's silence on other issues should
not be interpreted as indicating that other issues have
no importance or that improvements in those areas would
not lead to greater achievement. The review of other
areas of potential value must be left to the later work
of this or future panels or independent scholars.
Search Procedures
Each subgroup will conduct a search
of the literature using common procedures, describing
in detail the basis and rationale for its topical term
selection, the strategies employed for combining terms
or delimiting searches, and the search procedures used
for each topical area.
Each subgroup will limit the period
of time covered by its searches on the basis of relative
recentness and how much literature the search will generate.
For example, it may be wise to limit the years searched
to the number of most recent years that will identify
between 300 and 400 potential sources. This scope can be
expanded in later iterations if it appears that the
nature of the research has changed qualitatively over
time, or, if the proportion of useable research identified
is small (e.g., less than 25 percent), or if the search
simply represents too limited a proportion of the total
set of identifiable studies. Although the number of
years searched may vary between subgroup topics, decisions
regarding the number of years to be searched will be
made in accord with shared criteria.
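The year-limiting rule described above can be sketched in code. This is only an illustration under stated assumptions: the function name `choose_search_window`, the `hits_by_year` input (year mapped to number of records returned), and the sample counts are hypothetical and do not come from the Panel's procedures. The sketch walks backward from the most recent year until the cumulative count reaches the lower bound, then reports whether the total falls inside the 300 to 400 target.

```python
def choose_search_window(hits_by_year, latest_year, low=300, high=400):
    """Pick the shortest run of most recent years yielding at least `low` hits.

    hits_by_year: dict mapping publication year -> number of candidate
    records returned by a search (hypothetical input, must be non-empty).
    Returns (start_year, total_hits, within_target).
    """
    earliest = min(hits_by_year)
    total = 0
    start_year = latest_year
    # Walk backward one year at a time, accumulating candidate records.
    for year in range(latest_year, earliest - 1, -1):
        start_year = year
        total += hits_by_year.get(year, 0)
        if total >= low:
            break
    # within_target is False if the literature is too thin (total < low)
    # or if a single year pushed the total past the upper bound.
    return start_year, total, low <= total <= high


# Hypothetical yearly counts for one topic.
example_hits = {1999: 120, 1998: 110, 1997: 90, 1996: 80}
print(choose_search_window(example_hits, latest_year=1999))
```

If the flag comes back false, the subgroup would widen or narrow the window by judgment, as the text allows in later iterations.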
Applying the restriction that any study
selected must focus directly on children's reading development
(preschool through grade 12) and be published in English
in a refereed journal, each subgroup will search both
PsycINFO and ERIC databases. Subgroups may use additional
databases when appropriate. Although the use of a minimum
of two databases will identify much duplicate literature,
it will also afford the opportunity to expand perspective
and locate articles that would not be identifiable through
a single database.
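Because two databases will return many of the same articles, the combined results need to be de-duplicated before screening. The following is a minimal sketch, assuming each record is a dictionary with `title`, `year`, and `source` keys; that record layout, the matching rule (normalized title plus year), and the sample titles are assumptions made for illustration, not part of the Panel's documented procedure.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_search_results(*result_sets):
    """Merge record lists from several databases.

    Keeps one entry per (normalized title, year) and remembers every
    database in which the study was found.
    """
    merged = {}
    for records in result_sets:
        for rec in records:
            key = (normalize_title(rec["title"]), rec["year"])
            if key not in merged:
                merged[key] = {"title": rec["title"], "year": rec["year"], "sources": []}
            if rec["source"] not in merged[key]["sources"]:
                merged[key]["sources"].append(rec["source"])
    return list(merged.values())


# Hypothetical records; the titles are invented for the example.
psycinfo = [{"title": "Phonemic Awareness Training.", "year": 1995, "source": "PsycINFO"}]
eric = [
    {"title": "Phonemic awareness training", "year": 1995, "source": "ERIC"},
    {"title": "Guided Oral Reading", "year": 1996, "source": "ERIC"},
]
for study in merge_search_results(psycinfo, eric):
    print(study["title"], study["year"], study["sources"])
```

Recording the source databases for each study also supports the documentation requirement that follows, since it shows how each paper was found.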
Identification of each study selected
will be documented for the record and each will be assigned
to one or more members of the subgroup who will examine
the title and abstract. Based upon this examination
the subgroup member(s) will, if possible at this stage
of review, determine whether the study addresses issues
within the purview of the research questions being investigated.
If it does not, the study will be excluded and the reason(s)
for its exclusion will be detailed and documented for
the record. If it does, the study will undergo further
examination.
After this initial examination, the
study, if not excluded in accord with the preceding
criteria, will be located and examined further to determine
whether the following criteria for inclusion in the
subgroup's analysis are met:
- Study participants must be carefully
described (age, demographic, cognitive, academic,
and behavioral characteristics);
- Study interventions must be described
in sufficient detail to allow for replicability,
including how long the interventions lasted and how
long the effects lasted;
- Study methods must allow judgments
about how instruction fidelity was ensured; and
- Studies must include a full description
of outcome measures.
These criteria for assessing research
literature are widely accepted by scientists in every
discipline, and using them assures that all studies
included in the final analysis meet rigorous standards
that enhance the validity of any conclusions drawn.
If the study does not meet these criteria
or cannot be located, the study will be excluded from
subgroup analysis and the reason(s) for its exclusion
will be detailed and documented for the record. If the
study is located and meets the criteria, the study will
become one of the subgroup's core working set of studies.
The core working sets of studies gathered by the subgroups
will be coded as described below and then analyzed in
search of answers to the questions posed in this chapter
and in the charge to the Panel.
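The inclusion step above, with its requirement that every exclusion be documented, can be sketched as a small record-keeping routine. This is an illustrative sketch only: the class `ScreeningDecision`, the function `screen_study`, and the criterion labels are hypothetical names chosen to mirror the four criteria listed in the text.

```python
from dataclasses import dataclass, field

# Labels paraphrasing the four inclusion criteria (names are assumptions).
INCLUSION_CRITERIA = (
    "participants_described",
    "interventions_replicable",
    "fidelity_assessable",
    "outcome_measures_described",
)


@dataclass
class ScreeningDecision:
    study_id: str
    included: bool
    reasons: list = field(default_factory=list)  # documented for the record


def screen_study(study_id, located, checks):
    """Decide whether a study joins the core working set.

    checks: dict mapping each criterion label to True/False as judged
    by the reviewing subgroup member.
    """
    if not located:
        return ScreeningDecision(study_id, False, ["study could not be located"])
    failed = [c for c in INCLUSION_CRITERIA if not checks.get(c, False)]
    reasons = [f"criterion not met: {c}" for c in failed]
    return ScreeningDecision(study_id, not failed, reasons)


decision = screen_study(
    "study-002",
    located=True,
    checks={
        "participants_described": True,
        "interventions_replicable": True,
        "fidelity_assessable": False,
        "outcome_measures_described": True,
    },
)
print(decision)
```

Keeping the reasons alongside each decision means the exclusion log is produced as a by-product of screening rather than reconstructed afterward.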
If the core set of studies is insufficient
to answer these questions, less recent studies may be
screened for eligibility for, and inclusion in, the
core working sets of studies. This second search may
employ such resources as the reference lists of all
core-working studies and known literature reviews to
identify cited studies that may meet the Panel's criteria
for inclusion in the subgroups' core working sets of
studies. Any second search will be described in detail
and will apply precisely the same search, selection,
exclusion, and inclusion criteria and documentation
requirements as were applied in the subgroups' initial
search.
Manual searches, again applying precisely
the same search, selection, and exclusion criteria and
documentation requirements as were applied in the subgroups'
electronic searches, may be conducted as a supplement
to electronic domains. Manual searching of recent journals
that publish research on specific topics of the subgroups'
analyses will compensate for the delay in appearance
of these journal articles in the electronic databases.
Other manual searching will be done in relevant journals
to include eligible articles that should have been selected,
but were missed in electronic searches.
Source of
Publications: The Issue of Refereed and Non-Refereed
Articles
In preparation for the Panel's final
report, the subgroup searches will focus exclusively
on research that has been published or has been scheduled
for publication in refereed journals. Determinations
and findings for claims and assumptions that guide instructional
practice will depend on such studies. Any search or
review of studies that have not been published through
the peer review process may be identified and published
only as separate and distinct from evidence drawn from
peer-reviewed sources (i.e., in an appendix) and will
not be referenced in the Panel's report. These
non-peer-reviewed data may be treated as preliminary/pilot
data that illuminate potential trends and areas for
future research. Information derived in whole or in
part from such studies may not be represented at the
same level of certainty as findings derived from the
analysis of refereed articles.
Orders of
Evidence and Breadth of Research Methods Considered
Each type of research (descriptive-interpretive,
correlational, experimental) lays claim to particular
warrants, and these warrants differ markedly. It is
important that we use a wide range of research, but
that we use such research in accordance with the purposes
and limitations of the various research types.
To make a determination that any instructional
practice could be or should be adopted widely to improve
reading achievement indicates a belief, an assumption,
or a claim that the practice is causally linked to a
particular outcome. The highest standard of evidence
for such a claim is the experimental study, in which
it is shown that a treatment can make such changes and
effect such outcomes. Sometimes when it is not feasible
to do a genuine experiment, a quasi-experimental study
is done. This type of study provides a standard of evidence
that, while not as high, is acceptable to many investigators.
To sustain a claim it is necessary that there be experimental
or quasi-experimental studies of sufficient size or
number, and scope (in terms of population served), and
that these studies be of moderate to high quality. When
there are either too few studies of this type, or they
are too narrowly cast, or they are of marginally acceptable
quality, then it would be essential to have substantial
correlational or descriptive studies that concur with
the findings if a claim is to be sustained. No claim
can be determined on the basis of descriptive or correlational
research alone. The use of these procedures should increase
the possibility of reporting findings with a high degree
of internal validity.
Coding of
Data
Characteristics and outcomes of each
study that has met the screening criteria described
earlier will be coded and analyzed, unless otherwise
authorized by the Panel. The data gathered in these
coding forms will be the information used in the final
analyses and so it is important that the coding be done
systematically and reliably.
The various subgroups will rely on
a common coding form developed by a working group of
the Panel's scientist members and modified and endorsed
by the Panel. However, some changes may be made to the
common form by the various subgroups for addressing
different research issues. As coding forms are developed,
any changes to the common coding form will be shared
with and approved by the Panel to ensure consistency
across various subgroups.
Unless specifically identified and
substantiated as unnecessary or inappropriate by a subgroup
and agreed to by the Panel, each form for analyzing
studies will be coded for the following categories:
- Reference
  - Citation (standard APA format)
  - How this paper was found (e.g., search of named
    database, listed as reference in another empirical
    paper or review paper, hand search of recent issues
    of journals)
  - Narrative summary that includes distinguishing features
    of this study
- Research Question: the general umbrella
  question that this study addresses
- Sample of Student Participants
  - States or countries represented in sample
  - Number of different schools represented in sample
  - Number of different classrooms represented in sample
  - Number of participants (total, per group)
  - Age
  - Grade
  - Reading levels of participants
    (prereading, beginning, intermediate, advanced)
  - Whether participants were drawn
    from urban, suburban, or rural setting
  - List any pretests that were administered
    prior to treatment
  - List any special characteristics
    of participants including the following if relevant:
  - Explain any selection restrictions
    that were applied to limit the sample of participants
    (e.g., only those low in phonemic awareness were
    included)
  - Contextual information: concurrent
    reading instruction that participants received in
    their classrooms during the study
  - Schools or classrooms or students
    were selected from the population of those available
  - Convenience or purposive sample
  - Not reported
  - Sample was obtained from another
    study (specify study)
- Setting of the Study
- Design of Study
- Independent Variables
  a. Treatment Variables
    - Describe all treatments and control
      conditions; be sure to describe nature and components
      of reading instruction provided to control group
    - For each treatment, indicate whether
      instruction was explicitly or implicitly delivered
      and, if explicit instruction, specify the unit of
      analysis (sound-symbol; onset/rime; whole word)
      or specific responses taught. [NOTE: If this category
      is omitted in the coding of data, justification
      must be provided.]
    - If text is involved in treatments,
      indicate difficulty level and nature of texts used
    - Duration of treatments (given to students)
      - Minutes per session
      - Sessions per week
      - Number of weeks
    - Number of trainers who administered treatments
    - Teacher/student ratio: Number
      of participants to number of trainers
    - Type of trainer (classroom teacher,
      student teacher, researcher, clinician, special
      education teacher, parent, peer, other)
    - List any special qualifications of trainers
    - Length of training given to trainers
    - Source of training
    - Assignment of trainers to groups:
      - Random
      - Choice/preference of trainer
      - All trainers taught all conditions
    - Cost factors: List any features
      of the training such as special materials or
      staff development or outside consultants that
      represent potential costs
  b. Moderator Variables:
    List and describe other non-treatment independent
    variables included in the analyses of effects (e.g.,
    attributes of participants, properties or types of
    text)
- Dependent (Outcome) Variables
  - Code each as standardized or
    investigator-constructed measure
  - Code each as quantitative or
    qualitative measure
  - For each, is there any reason
    to suspect low reliability? (yes / no)
  - List time points when dependent
    measures were assessed
- Non-equivalence of groups
- Result (for each measure)
  - Record the name of the measure
  - Record whether the difference (treatment
    mean minus control mean) is positive or negative
  - Record the value of the effect
    size including its sign (+ or -)
  - Record the type of summary statistics
    from which the effect size was derived
  - Record number of people providing
    the effect size information
- Coding Information
If text is a variable, the coding will
indicate what is known about the difficulty level and
nature of the texts being used. Any use of special personnel
to deliver an intervention, use of special materials,
staff development, or other features of the intervention
that represent potential cost will be noted. Finally,
various threats to reliability and internal or external
validity (group assignment, teacher assignment, fidelity
of treatment, and confounding variables including equivalency
of subjects prior to treatment and differential attrition)
will be coded. Each subgroup may code additional items
that they deem to be appropriate or valuable to the
specific question being studied.
A study may be excluded at the coding
stage only if it is found to have so serious a flaw
that its use would be misleading. The reason(s) for
exclusion of any such study will be detailed and documented
for the record. When quasi-experimental studies are
selected, it is essential that each include both pre-treatment
and post-treatment evaluations of performance, and that
there be a comparison group or condition.
Each subgroup will conduct an independent
re-analysis of a randomly designated 10 percent sample
of studies. Absolute rating agreement should be calculated
for each category (not for forms). If absolute agreement
falls below 0.90 for any category for occurrence or
non-occurrence agreement, the subgroup must take some
action to improve agreement (e.g., multiple readings
with resolution, improvements in coding sheet).
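The reliability check described above can be sketched as follows. The sketch assumes each coder's work is stored as a dictionary mapping a study identifier to that study's category codes; the function names, the data layout, and the sample codes are hypothetical. Agreement is computed per category as the proportion of re-coded studies on which the two codings match exactly, and categories below 0.90 are flagged.

```python
import random


def draw_recode_sample(study_ids, fraction=0.10, seed=None):
    """Randomly designate a fraction of studies for independent re-analysis."""
    rng = random.Random(seed)
    ids = list(study_ids)
    k = max(1, round(len(ids) * fraction))  # always re-code at least one study
    return rng.sample(ids, k)


def agreement_by_category(first_coding, second_coding):
    """Return {category: proportion of studies coded identically}.

    Both arguments map study_id -> {category: code}. Agreement is
    computed per category, not per form.
    """
    counts = {}
    for study_id, codes_a in first_coding.items():
        codes_b = second_coding[study_id]
        for category, value in codes_a.items():
            agree, total = counts.get(category, (0, 0))
            counts[category] = (agree + (value == codes_b.get(category)), total + 1)
    return {category: agree / total for category, (agree, total) in counts.items()}


def categories_needing_action(agreement, threshold=0.90):
    """List categories whose agreement falls below the threshold."""
    return sorted(c for c, p in agreement.items() if p < threshold)


# Hypothetical codes from two independent coders for two studies.
coder_1 = {"a": {"design": "experimental", "grade": 1}, "b": {"design": "quasi", "grade": 2}}
coder_2 = {"a": {"design": "experimental", "grade": 1}, "b": {"design": "experimental", "grade": 2}}
rates = agreement_by_category(coder_1, coder_2)
print(rates, categories_needing_action(rates))
```

A flagged category would then be handled by the remedies the text names, such as multiple readings with resolution or a revised coding sheet.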
Upon completion of the coding for each
study published between 1993 and 1995, a letter will
be sent to the first author of the study requesting
any missing information. Any information that is provided
by authors will be added to the database.
After its search, screening, and coding,
a subgroup shall determine whether for a particular
question or issue a meaningful meta-analysis can be
completed, or whether it is more appropriate to conduct
a literature analysis of that issue or question without
meta-analysis, incorporating all of the information
gained. The full panel will review and approve or modify
each such decision.
Data Analysis
When appropriate and feasible, effect
sizes will be calculated for each intervention or condition
in experimental and quasi-experimental studies. The
subgroups will use the standardized mean difference
formula as the measure of treatment effect. The formula
will be:
$$\frac{M_t - M_c}{0.5\,(sd_t + sd_c)}$$
where
$M_t$ is the mean of the treated group,
$M_c$ is the mean of the control group,
$sd_t$ is the standard deviation of the treated group, and
$sd_c$ is the standard deviation of the control group.
When means and standard deviations
are not available, the subgroups will follow the guidelines
for the calculation of effect sizes as specified in
Cooper and Hedges (1994).
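As a worked illustration of the standardized mean difference defined above (treated mean minus control mean, divided by the average of the two standard deviations), the following sketch computes the value for one comparison. The function name and the input numbers are hypothetical; only the formula itself comes from the text.

```python
def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c):
    """Effect size: (treated mean - control mean) / average of the two SDs."""
    denominator = 0.5 * (sd_t + sd_c)
    if denominator <= 0:
        raise ValueError("standard deviations must be positive")
    return (mean_t - mean_c) / denominator


# Hypothetical comparison: treated mean 52, control mean 47, SDs 9 and 11.
# Average SD is 10, so the effect size is 5 / 10.
print(standardized_mean_difference(52.0, 47.0, 9.0, 11.0))  # 0.5
```

The sign is preserved, so a treatment group that scores below the control group yields a negative effect size, which matches the coding instruction to record the sign.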
The subgroups will weight effect sizes
by numbers of subjects in the study or comparison to
prevent small studies from overwhelming the effects
evident in large studies.
Each subgroup will use median and/or
average effect size when a study has multiple comparisons,
and will only employ the comparisons that are specifically
relevant to the questions under review by the subgroup.
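The two aggregation rules above can be sketched together: first collapse a study's multiple relevant comparisons to a single value (median or mean), then average across studies with each study weighted by its number of subjects. The function names and sample values are assumptions for illustration; the text does not prescribe a specific implementation, and Cooper and Hedges (1994) describe other weighting schemes that are not shown here.

```python
from statistics import mean, median


def study_effect_size(comparisons, use_median=True):
    """Collapse a study's multiple relevant comparisons to one effect size."""
    return median(comparisons) if use_median else mean(comparisons)


def weighted_mean_effect_size(studies):
    """Average effect sizes weighted by number of subjects.

    studies: list of (effect_size, n_subjects) pairs.
    """
    total_n = sum(n for _, n in studies)
    if total_n == 0:
        raise ValueError("at least one study must have subjects")
    return sum(es * n for es, n in studies) / total_n


# Hypothetical data: a small study with a large effect and a large
# study with a small effect.
studies = [(0.8, 20), (0.2, 180)]
print(weighted_mean_effect_size(studies))        # close to 0.26
print(mean(es for es, _ in studies))             # unweighted mean is 0.5
```

In this example the unweighted mean is pulled toward the 20-subject study, while the weighted mean stays close to the estimate from the 180-subject study, which is the behavior the weighting rule is meant to produce.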
Expected
Outcomes
Analyses of effect sizes will be undertaken
with several goals in mind. First, overall effect sizes
of related studies will be calculated across subgroups
to determine the best estimate of a treatment's
impact on reading. These overall effects will be examined
with regard to their difference from zero (Does the
treatment have an effect on reading?), strength
(If the treatment has an effect, how large is that
effect?), and consistency (Did the effect of
the treatment vary significantly from study to study?).
Second, the Panel will compare the magnitude of a treatment's
effect under different methodological conditions, program
contexts, program features, outcome measures, and for
students with different characteristics. The appropriate
moderators of a treatment's impact will be drawn
from the distinctions in studies recorded on the coding
sheets. In each case, a statistical comparison will
be made to examine the impact of each moderator variable
on average effect sizes for each relevant outcome variable.
These analyses will enable the Panel to determine the
conditions that alter a program's effects and the
types of individuals for whom the program is most and
least effective. Within-group average effect sizes will
be examined as were overall effect sizes, for differences
from zero and strength. The analytic procedures will
be carried out using the techniques described in Cooper
and Hedges (1994).
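The moderator comparison described above starts from within-group average effect sizes. The sketch below shows only that first step, grouping coded studies by one moderator and computing a subject-weighted average per group. The record layout, the function name, the moderator `grade_band`, and the values are hypothetical. The statistical tests themselves (difference from zero, consistency across studies, and between-group comparison) would follow Cooper and Hedges (1994) and are not reproduced here.

```python
from collections import defaultdict


def effect_sizes_by_moderator(records, moderator):
    """Subject-weighted average effect size for each level of a moderator.

    records: list of dicts with 'effect_size', 'n' (positive number of
    subjects), and a key named by `moderator`.
    """
    sums = defaultdict(lambda: [0.0, 0])  # level -> [sum of es * n, sum of n]
    for rec in records:
        level = rec[moderator]
        sums[level][0] += rec["effect_size"] * rec["n"]
        sums[level][1] += rec["n"]
    return {level: weighted / n for level, (weighted, n) in sums.items()}


# Hypothetical coded studies with an invented moderator.
coded = [
    {"effect_size": 0.6, "n": 50, "grade_band": "K-1"},
    {"effect_size": 0.4, "n": 150, "grade_band": "K-1"},
    {"effect_size": 0.1, "n": 100, "grade_band": "2-6"},
]
for level, avg in effect_sizes_by_moderator(coded, "grade_band").items():
    print(level, round(avg, 2))
```

Because the moderator is passed by name, the same routine can be reused for any distinction recorded on the coding sheets, such as type of trainer or outcome measure.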