Identifying Latent Data Structures: Structural Equation Modelling II


Hierarchically clustered (multilevel or nested) data are common in most scientific fields, including the medical, biological and social sciences. For example, individuals may be nested within geographical areas, institutions, or companies, the canonical example being students nested within schools. Multilevel data also arise in longitudinal studies where one or several outcomes are measured on several occasions. Another feature of multilevel data is that variables can be measured at any level. For example, we may have collected measures of student outcomes and student characteristics, but we may also have collected variables at the school level.

This course is part of a larger course series in Data Analysis consisting of 19 individual modules. Find more information and enroll for this module via

This course starts with a refresher on multilevel modeling (MLM). We will discuss key concepts of MLM, introduce the linear mixed model, and provide several examples of univariate multilevel regression analysis. All analyses will be done in R, using a variety of packages (nlme, lme4, lavaan). Next, we will discuss the relationship between classic (single-level) regression, multilevel regression, and structural equation modeling (SEM). We will do this from both a theoretical and a software point of view. We will show how, and under which conditions, classic (non-multilevel) SEM software can produce the same results as dedicated multilevel (or mixed modeling) software.

On the second day, we will introduce the multilevel SEM framework. We will start from a regression perspective, and gradually proceed from a simple regression analysis, through a two-level regression analysis, to more complicated (regression) models, exploiting the full power of the multilevel SEM framework. Special attention will be given to multilevel mediation models, and to the difference between the latent and manifest covariate approaches to representing observed exogenous covariates at the between level. Next, we will take a latent-variable (CFA) perspective, and discuss various examples of multilevel CFA, and eventually multilevel SEM involving latent variables and regressions among latent variables. Here, special attention will be given to the interpretation of the latent variables at both the within and between level, together with a typology of possible approaches. Along the way, we will discuss many practical issues, including the role of centering, the treatment of missing and/or non-normal data, and how to deal with categorical data. Finally, we will discuss some alternative approaches to handling clustering in the data in an SEM framework, including the design-based (survey) approach and the 'wide format' approach.

The main software used in this course is the open-source R package `lavaan' (see

Enrol here for classes from this course


  • Type of course: This is an on-campus course.
  • Dates & times: May 23 & 24, 2022, from 9 am to 12 pm and from 1 pm to 4 pm
  • Venue: Faculty of Psychology and Educational Sciences, Campus Dunant, PC-lokaal 1.2 - Dunant 2, 9000 Gent
  • Target audience: This course targets everyone who has had some exposure to multilevel modeling and/or structural equation modeling, and who wants to deepen their understanding of both the theoretical and the practical connections between the two frameworks. The course also targets everyone who wants to better understand the new multilevel SEM framework available in lavaan.
  • Exam/certificate: Participants who attend all classes receive a certificate of attendance via e-mail at the end of the course. Additionally, participants who follow both Part I and Part II of this course can, if they wish, take part in an exam. Participants who pass this exam receive a certificate from Ghent University. The exam consists of a take-home project assignment: students are required to write a report by a set deadline.
  • Course prerequisites: Participants should have a solid understanding of regression analysis and basic statistics (hypothesis testing, p-values, etc.) at a level equivalent to Module 2 'Drawing Conclusions from Data: an Introduction' and Module 12 'Explaining and Predicting Outcomes with Linear Regression' of this year's program. At least some minimal knowledge of multilevel modeling and/or structural equation modeling, equivalent to Part I of this course (Module 7 'Identifying Latent Data Structures: Structural Equation Modelling'), is recommended. Because lavaan is an R package, some experience with R consistent with the course content of Module 1 'Getting Started with R Software for Data Analysis' (reading in a dataset, fitting a regression model) is recommended, but not required.
  • Funding: => Our academy is recognised as a service provider for the 'KMO-portefeuille'. In this way, small and medium-sized businesses located in the Flanders region can save up to 30% on the registration fee for our courses. You can request this subsidy up until 14 calendar days after the course has started. => UGent PhD students can apply for a full refund from their Doctoral School.
  • Reduction: => If two or more employees from the same company enrol simultaneously for this course, a reduction of 20% on the module price applies from the second enrolment onwards. => Reduced prices apply to coworkers in governmental institutions, non-profit organisations and higher education, as well as to students and the unemployed.
  • Enrolling for this course is possible via the IPVW-ICES website.