The Teacher Demand and Supply Model (TDSM) helps Jordan's Ministry of Education (MoE) and Ministry of Higher Education and Scientific Research (MoHESR) make data-driven decisions and prepare multi-year supply and demand projections for K-12 teachers of Jordanian students in the Kingdom's public schools. The model uses data entered into the Open Education Management Information System (OpenEMIS) to calculate teacher surpluses and shortages in each school from 2016 to five years in the future. This allows policymakers to recognize trends and shifts and to factor them into policies and incentives affecting teacher recruitment and retention. TDSM supports stakeholders across the education sector in planning strategically: it informs the MoE and university partners about where teachers will be needed and for which subjects. The TDSM workflow is depicted in the illustration below.
Browse to the Teacher Demand and Supply Model at https://tdsm.moe.gov.jo/
Choose from the filter options. Click the 'x' to remove an individual filter, or click 'Clear Filters' to remove all of them.
Choose Group By preference:
Choose Years to forecast and compare. Years before the current year are ACTUAL data. Years after the current year are FORECASTS.
Choose visualization preference:
Scroll down to the Data Table to see more details.
The map is filtered by Specialization, with 'Arabic' selected by default. Choose filters to display schools on the map. Click the numbers on the map to drill down to a geographic area. Continue clicking the numbers until you reach the blue school points. Click a blue school point to see the school-level data on Teachers Needed.
Users can download the TDSM data as an Excel spreadsheet. To do so, scroll down the TDSM page until you see the “Download data as Excel” button at the top of the data table. Click on this button and save the file to your computer.
The TDSM data download contains the following columns:
The following columns are repeated for each year from 2016 to five years in the future, with the year indicated in parentheses after the column name. For example: “Teachers(2016)”, “Teachers(2017)” … “Teachers(2026)”, “Teachers(2027)”.
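The year-suffixed naming pattern can be reproduced with a few lines of Python, which may help when selecting columns from the downloaded spreadsheet programmatically. This is a minimal sketch: the function name `year_columns` is hypothetical, and the last year (2027) is taken from the example above rather than from the live data.

```python
def year_columns(base_name: str, first_year: int, last_year: int) -> list[str]:
    """Return one column label per year, e.g. 'Teachers(2016)'."""
    return [f"{base_name}({year})" for year in range(first_year, last_year + 1)]


# Labels for the "Teachers" column, following the example in the text.
columns = year_columns("Teachers", 2016, 2027)
print(columns[0], columns[-1])
```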
QUESTION: Why can't I select ALL or MULTISELECT for SPECIALTIES?
ANSWER: If specialties are grouped together (multiselect or All), surpluses and shortages are netted against each other and obscured, which defeats the purpose of TDSM. For example, if all specialties were grouped together, a school with an excess of 2 teachers each for Arabic and English but a need for 2 teachers each for Math and Science would show a prediction of 0 teachers needed, when in reality the school needs 4 teachers. At a larger scale, if Jordan needed 100 Arabic teachers and had 100 too many English teachers, the two would cancel each other out if grouped under 'All'.
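The arithmetic behind this answer can be checked with a short sketch. The numbers come from the school example above; the sign convention (positive for teachers needed, negative for excess) and the variable names are assumptions made for illustration, not TDSM's internal representation.

```python
# Positive = teachers needed (shortage); negative = excess teachers (surplus).
needs_by_specialty = {"Arabic": -2, "English": -2, "Math": 2, "Science": 2}

# Grouping all specialties nets the shortages against the surpluses.
net_need = sum(needs_by_specialty.values())

# Looking at each specialty separately keeps the shortages visible.
true_need = sum(n for n in needs_by_specialty.values() if n > 0)

print(net_need)   # 0
print(true_need)  # 4
```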
QUESTION: Why does the prediction show a pattern of exponential or ascending needs/overage for future years?
ANSWER: The model uses the previous years' data to calculate the predictions going forward.
Follow these steps to run the script:
1. Navigate to the data-model directory:
cd path/to/data-model
2. Create and Activate Virtual Environment (if not already present):
If you haven't created a virtual environment, you can create one using the following command:
python -m venv venv
Replace the second venv with the desired name for your virtual environment.
Then, activate the virtual environment:
source venv/bin/activate
3. Install Dependencies from requirements.txt: If you haven't already installed the required packages, use pip to install them from the requirements.txt file:
pip install -r requirements.txt
4. Run the Script:
Execute the Python script data-module/tdsm.py to fetch the data. This takes 30-40 minutes, depending on your internet connection speed; allow at least 1 hour to complete this task.
python data-module/tdsm.py
You will then be prompted with the following questions, after which TDSM will fetch all new data from OpenEMIS:
Enter password or leave blank to use the default password:
Enter username or leave blank to use the default username:
Enter API key or leave blank to use the default API key:
You will then be prompted with the following:
Enter the beginning year of the academic period for which projections should start or leave blank to use [current year].
For example, if you want TDSM to use all data through the 2023/2024 year to make projections from 2025 onward, you would enter 2025.
The TDSM program will then generate all data and forecasts.
5. Moving Data: After running the script and generating the data, you may need to move the data files from one directory to another. This step is not specific to the script and can be performed using standard file management techniques. The data files should be moved from data-model/content to under the src/js folder. The list of data files can be found in the data files section of this wiki.
Make sure you have the necessary permissions and use commands like mv (on Unix-like systems) or move (on Windows) to relocate the files as needed.
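As an alternative to mv or move, the same step can be scripted in Python so that it behaves the same way on Unix-like systems and Windows. This is a minimal sketch: the function name `move_data_files` is hypothetical, and the paths in the usage comment are assumptions that should be adjusted to your checkout.

```python
import shutil
from pathlib import Path


def move_data_files(source_dir: Path, target_dir: Path) -> list[Path]:
    """Move every regular file from source_dir into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(source_dir.iterdir()):
        if path.is_file():
            destination = target_dir / path.name
            shutil.move(str(path), str(destination))
            moved.append(destination)
    return moved


# Example call (paths are assumptions, not confirmed project layout):
# move_data_files(Path("data-model/content"), Path("src/js"))
```

Subdirectories are left in place; only files directly inside the source directory are moved.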
The TDSM back end will then fetch the latest data from OpenEMIS if you told it to, recalibrate the models, and generate new forecasts. Wait until you see “UPDATE COMPLETE” in the output window.
Navigate to the TDSM front end to view the updated forecasts.
Inside the project folder is a file called “archive.sh”. This script archives the project's old data and creates a record for the archive in the archive list. Run archive.sh from your terminal. The script asks for a name for the archive, then moves all old data to a new folder under Archives.
Whenever TDSM receives a new subject during an OpenEMIS download, it automatically associates it with an existing teacher specialization. If TDSM's mapping rules do not apply to the new subject, “Other” is used as the default specialization. The list of all subject-to-specialization mappings is stored in the subjects.csv file. To change the mapping of a given subject, modify the corresponding value in the specialization column of subjects.csv. The specialization can be one that already exists in the file or a new specialization. TDSM does not overwrite existing subject-to-specialization mappings and will apply this new mapping to all data the next time the data are updated.
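The lookup-with-fallback behavior can be illustrated as follows. This sketch covers only the final lookup and the “Other” default, not TDSM's mapping rules for new subjects, which are not described here. The `subject` column name, the helper functions, and the sample rows are assumptions; only the `specialization` column is named in this wiki.

```python
import csv
import io

DEFAULT_SPECIALIZATION = "Other"


def load_subject_map(csv_file) -> dict[str, str]:
    """Read subject -> specialization pairs from a subjects.csv-style file."""
    reader = csv.DictReader(csv_file)
    return {row["subject"]: row["specialization"] for row in reader}


def specialization_for(subject: str, subject_map: dict[str, str]) -> str:
    """Return the mapped specialization, or the default if unmapped."""
    return subject_map.get(subject, DEFAULT_SPECIALIZATION)


# Illustrative rows only; the real subjects.csv may have other columns.
sample = io.StringIO(
    "subject,specialization\n"
    "Arabic Language,Arabic\n"
    "Algebra,Math\n"
)
subject_map = load_subject_map(sample)
print(specialization_for("Algebra", subject_map))
print(specialization_for("Robotics", subject_map))
```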
To add a new student track, e.g., eleventh-grade engineering, to TDSM, first check that the OpenEMIS grade label is already contained in the grade_mappings lookup table. If not, add the new grade mapping to this file. For example, you would add “Eleventh grade” in the name column and “11” in the grade column. Next, add the new OpenEMIS track labels, separated by commas, and the corresponding suffix to the track_mappings lookup table. For example, for engineering, you might add “engineering,هندسة” to the names column and “eng” to the track column. Finally, create a new row in the weekly_classes_per_specialization lookup table, with “11_eng” in the grade column and the number of hours required under each specialization column. For specializations where instruction is not required, leave the column blank.
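The way the grade and track lookups combine into a key such as “11_eng” can be sketched in Python. The in-memory rows below stand in for the lookup tables and use the example values from the paragraph above. The function name and the exact way TDSM parses the tables are assumptions.

```python
# Hypothetical in-memory copies of the two lookup tables.
grade_rows = [{"name": "Eleventh grade", "grade": "11"}]
track_rows = [{"names": "engineering,هندسة", "track": "eng"}]

# name -> grade
grade_lookup = {row["name"]: row["grade"] for row in grade_rows}

# Each comma-separated label in "names" maps to the same track suffix.
track_lookup = {
    name.strip(): row["track"]
    for row in track_rows
    for name in row["names"].split(",")
}


def grade_track_key(grade_label: str, track_label: str) -> str:
    """Build the grade-column value used in weekly_classes_per_specialization."""
    return f"{grade_lookup[grade_label]}_{track_lookup[track_label]}"


print(grade_track_key("Eleventh grade", "engineering"))  # 11_eng
```

Because both the English and Arabic labels map to the same suffix, either label produces the same key.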
The weekly hours of required instruction for each student track are contained in the weekly_classes_per_specialization lookup table. To adjust the required hours of instruction for a given student track, modify the values in this file. Leave the cells blank for specializations where no instruction is required.
Each directorate is uniquely identified in OpenEMIS by an area_id. A list of all OpenEMIS area_ids can be found in the areas.csv table, which is updated each time the TDSM data are refreshed. Additionally, each school in OpenEMIS is given a more granular administrative_area_id. Administrative_area_ids are nested within area_ids. Because OpenEMIS does not have unique identifiers for liwas, each known administrative_area_id+area_id pair has been mapped to a liwa in the administrative_area_crosswalk.csv file. If an administrative_area_id is not mapped in this table, any school with that administrative_area_id will not be included in TDSM. All unmapped schools are listed in the unmapped_schools.csv file. To add a new administrative_area_id+area_id pair to TDSM, add a new row to the administrative_area_crosswalk.csv file, filling in a value for each column. The next time the TDSM data are refreshed, the newly mapped administrative_area_ids will be included.
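The effect of the crosswalk on which schools are included can be sketched as a lookup on the administrative_area_id+area_id pair. All IDs, school names, and liwa names below are made up for illustration, and the data structures are assumptions rather than TDSM's actual implementation.

```python
# Hypothetical crosswalk rows: (administrative_area_id, area_id) -> liwa.
crosswalk = {(101, 10): "Liwa A", (102, 10): "Liwa B"}

schools = [
    {"school": "School 1", "administrative_area_id": 101, "area_id": 10},
    {"school": "School 2", "administrative_area_id": 999, "area_id": 10},
]

mapped, unmapped = [], []
for school in schools:
    key = (school["administrative_area_id"], school["area_id"])
    if key in crosswalk:
        # Mapped schools carry their liwa forward into the model.
        mapped.append({**school, "liwa": crosswalk[key]})
    else:
        # Unmapped schools are excluded and reported separately.
        unmapped.append(school)

print([s["school"] for s in mapped])
print([s["school"] for s in unmapped])
```

Adding the missing pair to the crosswalk and rerunning the loop would move School 2 into the mapped list, which mirrors what happens on the next data refresh.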
The TDSM models automatically recalibrate whenever new data are downloaded from OpenEMIS. However, the models can be further refined by changing the model parameters or the algorithm itself. This can be done by modifying the TDSM back-end code (data-module/tdsm.py) which is written in Python version 3.10.12. See the forecasting models section for details on the models.
TDSM version 1.0 was deployed in February 2022. Version 2.0 is scheduled to deploy in 2024.
Feature | TDSM 1.0 | TDSM 2.0 |
---|---|---|
Data feed | Files manually generated from OpenEMIS and sent to TDSM | Automatically pulls from OpenEMIS via its API |
Granularity | Directorate-level | School-level |
Teachers needed calculation | Based on user-defined class size | Based on number of sections in each grade, track, and school |
Forecasting algorithm | Ridge regression | Extreme gradient boosting |
Visualizations | Bar Chart, Pie Chart, Line Chart, Surplus/Shortage Chart | Teacher Demand/Supply Chart, Line Chart, Teacher Excess/Teacher Needed Chart, Schools Map |
TDSM predicts the count of Jordanian students and civil service teacher FTEs in public schools for the next five years. On execution, TDSM retrieves all student and teacher records from 2016 onward from OpenEMIS, recalibrates its models, and generates fresh forecasts. Forecasts include only schools administered by the Ministry of Education and exclude Syrian students and contract teachers.
The student and teacher forecasting models use a machine learning algorithm called extreme gradient boosting. Both models are generated by the Python implementation of XGBoost version 1.7.6. Function call: XGBRegressor(objective='reg:squarederror', colsample_bytree=0.3, learning_rate=0.5, max_depth=5, alpha=100, reg_lambda=100, n_estimators=500). Below are the model predictors (features) and their relative importance in each model as of April 2024. As the models are recalibrated on additional data over time, the importance of each predictor may change.
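For readers who want to experiment with the same settings, the function call quoted above can be wrapped as follows. This is a sketch, not the code in data-module/tdsm.py: the names `XGB_PARAMS` and `build_regressor` are hypothetical, and the xgboost package must be installed for `build_regressor` to run.

```python
# Hyperparameters copied from the function call quoted in this section.
XGB_PARAMS = {
    "objective": "reg:squarederror",
    "colsample_bytree": 0.3,
    "learning_rate": 0.5,
    "max_depth": 5,
    "alpha": 100,        # L1 regularization term
    "reg_lambda": 100,   # L2 regularization term
    "n_estimators": 500,
}


def build_regressor():
    """Create an untrained XGBRegressor with the parameters above."""
    # Imported here so the parameters can be inspected without xgboost.
    from xgboost import XGBRegressor

    return XGBRegressor(**XGB_PARAMS)


# Typical use, with X and y standing in for your own feature matrix and target:
# model = build_regressor()
# model.fit(X, y)
```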
The following tables are used by TDSM to compile data and generate forecasts.
The following tables are generated by TDSM each time it loads new data from OpenEMIS Jordan. These tables are persisted so that TDSM can be rerun without having to reload data from OpenEMIS.
The following files are generated by TDSM for use by the user interface.