An Evaluation Paradox: The Issues of Test Validity in the Realm of Writing Test as the Final School Examination in the Indonesian Senior High School Milieu

  • David Imamyartha, Universitas Negeri Jember
  • Gunadi Harry Sulistyo, Universitas Negeri Malang


Even though Indonesia’s national curriculum for upper secondary schools covers four English language skills, these skills receive unequal emphasis because only reading and listening are formally tested in the national examination. Writing competence nonetheless carries particular stakes as a determinant of students’ achievement after three years of upper secondary education, yet a preliminary study indicated that the existing writing tests are low in test validity. A further study was carried out to probe these issues by applying a test validity framework, comprising theory-based validity, context validity, scoring validity, criterion-related validity, and consequential validity, to the existing writing tests. The study revealed that the current writing tests are fraught with validity problems in all of these facets, a finding corroborated by interview data from the preliminary study and by analysis of the tests themselves. These issues point to a tension between the exalted educational objectives of the national curricula and the edifice of English assessment. Based on the findings, several implications and directions arise for the future praxis of writing assessment.


How to Cite
Imamyartha, D., & Sulistyo, G. (2017). An Evaluation Paradox: The Issues of Test Validity in the Realm of Writing Test as the Final School Examination in the Indonesian Senior High School Milieu. Dinamika Ilmu, 17(1), 1-21.