
Isabel Pizarro and Leonor Pérez Ruiz
miscelánea 72 (2025): pp. 69-91 ISSN: 1137-6368 e-ISSN: 2386-4834
between 2015 and 2020. Its texts cover various fields related to English studies,
namely applied linguistics and grammar, literature, cultural studies and the study
of English as a foreign language (EFL). The texts are evenly balanced, with 50%
drawn from literature and cultural studies and 50% from applied linguistics,
grammar and EFL, thereby reducing the risk of bias in the
rhetorical patterns observed. All the UDs are rhetorically structured, containing
at least an introduction, a body and a conclusion. They were uploaded to open
repositories and belong to the aforementioned disciplines; as such, they were
written under similar conditions and are therefore comparable (Moreno 2008: 35).
Based on the stated criteria, 100 texts were randomly selected from US and
Spanish universities that offer free access to full-text dissertations. The inclusion
of dissertations from US universities in the corpus was primarily motivated by the
predominance and global influence of American English in academic publishing
and higher education. The selected US universities include the University of
Arizona, the Ohio State University, the University of Michigan, the College of
William & Mary, Texas A&M University, the University of Utah, the University
of Florida, Brown University, the University of Vermont and Brandeis University;
and the Spanish universities are Universidad de Valladolid, Universidad de
Salamanca, Universidad de Oviedo, Universidad de Granada, Universidad de
Zaragoza, Universidad de La Rioja, Universidad de La Coruña, Universidad de La
Laguna, Universidad de Málaga and Universidad de Alicante.
To avoid source bias, a maximum of three UDs were downloaded from each
university repository, specifically the first three that met the inclusion criteria
specified above. Finally, due to the nature of the genre studied, this is a multi-
author corpus. Since we were unable to contact the authors, we used the
students’ names and affiliations to infer their first language, thereby maximising
the likelihood of including native speakers in both groups. Thus, following
Luzón (2018), we included in the English L1 subcorpus only UDs authored by
students with Anglophone names, and in the English L2 subcorpus only those
authored by students with Spanish names. We assumed that the US students were native English
speakers and that the Spanish students were proficient in EFL, as expected
upon completing a degree in English studies. It was not considered necessary
to anonymise the corpus, as all the texts included in the two subcorpora are
publicly available through institutional repositories and written for academic
assessment. No personal or sensitive data were included, and the analysis focused
on rhetorical patterns, not on individual authors. This follows standard practice
in corpus-based discourse studies.
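
The per-repository cap described above (keeping the first three UDs that meet the inclusion criteria) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the procedure actually used to compile the corpus: the record fields (`id`, `repository`, `year`, `open_access`, `sections`), the helper `make_record` and the toy data are all hypothetical, and the criteria encoded in `meets_criteria` are a simplified reading of those listed in the text (written between 2015 and 2020, openly accessible, and containing at least an introduction, a body and a conclusion).

```python
from collections import defaultdict

MAX_PER_REPOSITORY = 3  # cap stated in the text


def meets_criteria(record):
    """Hypothetical, simplified encoding of the inclusion criteria."""
    required = {"introduction", "body", "conclusion"}
    return (
        2015 <= record["year"] <= 2020
        and record["open_access"]
        and required <= set(record["sections"])
    )


def select_dissertations(candidates, is_eligible, cap=MAX_PER_REPOSITORY):
    """Keep the first `cap` eligible records per repository, in listing order."""
    kept_per_repo = defaultdict(int)
    selected = []
    for record in candidates:
        repo = record["repository"]
        if kept_per_repo[repo] >= cap:
            continue  # this repository has already reached its cap
        if is_eligible(record):
            selected.append(record)
            kept_per_repo[repo] += 1
    return selected


def make_record(rec_id, repository, year):
    """Build a toy record (hypothetical fields, for illustration only)."""
    return {
        "id": rec_id,
        "repository": repository,
        "year": year,
        "open_access": True,
        "sections": ["introduction", "body", "conclusion"],
    }


# Toy listing: repository A has four eligible texts, repository B has one.
sample = [
    make_record("A1", "Repo A", 2016),
    make_record("A2", "Repo A", 2014),  # outside the 2015-2020 window
    make_record("A3", "Repo A", 2017),
    make_record("A4", "Repo A", 2018),
    make_record("A5", "Repo A", 2019),  # eligible, but the cap is reached
    make_record("B1", "Repo B", 2020),
    make_record("B2", "Repo B", 2021),  # outside the 2015-2020 window
]

selected = select_dissertations(sample, meets_criteria)
print([r["id"] for r in selected])  # ['A1', 'A3', 'A4', 'B1']
```

Capping the count per repository, rather than sampling freely from the pooled listing, prevents a single institution with a large open collection from dominating either subcorpus.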
Once the texts were downloaded, we manually deleted the sections that would
introduce noise into our analysis, i.e. reference lists, annexes, acknowledgements