ML - analytics cron service

This is an automated, scheduled cron module that triggers specific processes according to business requirements. The main data generation sequence runs for each onboarded district whose dataframe tables have been created. It consists of three major processes:

1. Daily pull
2. Most used software
3. Recommendations

The daily pull updates the dataframes with the latest software usage data from the student analytics reporting table. The dataframe tables are labelled per district, and each district table consists of app columns containing data for each student. ~ Read more on the function:

Most used software calculates the most used software for each individual student, as well as the software used on assessments or in the latest usage, such as forecast data. ~ Read more on the function:

Recommendations provides suggestions on the best software to use, based on the usage track record and overall district performance. It produces recommendations only for students with assessments. ~ Read more on the function:
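
As an illustration of the most used software step, here is a minimal sketch. It assumes a district table can be viewed as a mapping from student ID to per-app usage counts. The `DistrictTable` type, the function name, and the sample data are hypothetical and are not taken from this repository.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for one district's dataframe table:
# student id -> {app name -> usage count}. The real table layout may differ.
DistrictTable = Dict[str, Dict[str, int]]


def most_used_software(table: DistrictTable) -> Dict[str, Optional[str]]:
    """Return the app with the highest usage per student (None if unused)."""
    result: Dict[str, Optional[str]] = {}
    for student_id, usage in table.items():
        # Ignore apps the student never used.
        used = {app: count for app, count in usage.items() if count > 0}
        if not used:
            result[student_id] = None
            continue
        # Highest count wins; ties are broken alphabetically so the
        # result is deterministic.
        result[student_id] = min(used, key=lambda app: (-used[app], app))
    return result


example_table: DistrictTable = {
    "student_a": {"app_one": 12, "app_two": 30},
    "student_b": {"app_one": 0, "app_two": 0},
}
print(most_used_software(example_table))
```

A student with no recorded usage maps to `None` rather than an arbitrary app, which keeps later steps from treating an empty row as a real preference.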

Note: The recommendations process is resource heavy, so it is allocated to a single instance only, unlike the daily pull and most used software processes.

Main columns for sorting during operations

------------------------------------------+
- Col:      Type
------------------------------------------+

- Value:    1 -> student with assessment data

            2 -> student with forecast data

            [ ** Year to date ** ]

            3 -> student with latest pre-aggregated score average

            4 -> student with old/previous pre-aggregated score average

----------------------------------------------------------------------------------------------+
- Col:      Performance: This defines the score bands according to district score definitions.
            It allows for grouping and sorting as per system requirements.
----------------------------------------------------------------------------------------------+

- Value:    excellent -> scores range 88 => 100

            satisfactory -> scores range 69 => 88

            needs improvement -> scores range 55 => 68

            unsatisfactory -> scores range 0 => 54
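
The two columns above could be represented in code roughly as follows. This is a sketch only: the `StudentType` enum and the `performance_band` function are hypothetical names, and because a score of 88 is listed in both the excellent and satisfactory ranges, the sketch assumes 88 counts as excellent.

```python
from enum import IntEnum


class StudentType(IntEnum):
    """Values of the Type column (member names are illustrative)."""
    ASSESSMENT = 1
    FORECAST = 2
    LATEST_AVERAGE = 3    # latest pre-aggregated score average
    PREVIOUS_AVERAGE = 4  # old/previous pre-aggregated score average


def performance_band(score: float) -> str:
    """Map a score in the 0-100 range to a Performance band."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # Assumption: 88 appears in two bands above; it is treated as excellent.
    if score >= 88:
        return "excellent"
    if score >= 69:
        return "satisfactory"
    if score >= 55:
        return "needs improvement"
    return "unsatisfactory"


# Sort hypothetical (student, type, score) rows by type, then by score.
rows = [("s1", StudentType.FORECAST, 91), ("s2", StudentType.ASSESSMENT, 60)]
for student, kind, score in sorted(rows, key=lambda r: (r[1], -r[2])):
    print(student, kind.name, performance_band(score))
```

District score definitions may differ from the ranges shown here, so the thresholds would likely need to be configurable per district.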

The main sequence of operations runs in the following order:

1. The daily pull runs first and acts as the source of truth. After it sorts and updates student data, the dataframes contain the appropriate score data.

2. After the daily pull completes, the most used software process starts. It generates software data per student data type, whether assessment data or forecast data.

3. Finally, the recommendations process runs and generates recommendations from assessment data only.
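
The ordering above can be sketched as a simple sequential runner. The three step functions below are hypothetical placeholders, not the functions defined in this repository, and they only return a status string.

```python
from typing import Callable, List


# Hypothetical placeholders for the three processes; the real functions
# may have different names, signatures, and side effects.
def run_daily_pull(district_id: str) -> str:
    return f"daily pull: {district_id}"


def run_most_used_software(district_id: str) -> str:
    return f"most used software: {district_id}"


def run_recommendations(district_id: str) -> str:
    return f"recommendations: {district_id}"


# The order of this list encodes the required order of operations.
SEQUENCE: List[Callable[[str], str]] = [
    run_daily_pull,
    run_most_used_software,
    run_recommendations,
]


def run_sequence(district_id: str) -> List[str]:
    """Run each step in order; an exception in one step stops the rest."""
    log: List[str] = []
    for step in SEQUENCE:
        log.append(step(district_id))
    return log


print(run_sequence("district_demo"))
```

Because each step depends on the output of the previous one, the sketch lets an exception propagate instead of continuing with stale data.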

Onboarding a district

Onboarding creates the per-district data that is visualized on the dashboard, which involves running the three main data generation processes above. Here's how we onboard a new district:

1. Create the dataframe table for the district ID and the selected applications.
2. Add the district ID to the global list in `./trigger.py`.
3. Run the data generation sequence.
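
The onboarding steps above might look like this in outline. Everything here is an assumption made for illustration: `DISTRICT_IDS` stands in for the global list in `./trigger.py` (its real name is not shown in this README), and `TABLES` is an in-memory stand-in for the per-district dataframe tables.

```python
from typing import Dict, List

# Assumption: stands in for the global district list kept in ./trigger.py.
DISTRICT_IDS: List[str] = []

# Hypothetical stand-in for the per-district dataframe tables:
# district id -> {app column name -> list of per-student values}.
TABLES: Dict[str, Dict[str, list]] = {}


def onboard_district(district_id: str, applications: List[str]) -> None:
    """Create the district table and register the district ID."""
    if district_id in TABLES:
        raise ValueError(f"district already onboarded: {district_id}")
    # Step 1: one empty column per selected application.
    TABLES[district_id] = {app: [] for app in applications}
    # Step 2: add the district to the list the cron trigger iterates over.
    if district_id not in DISTRICT_IDS:
        DISTRICT_IDS.append(district_id)
    # Step 3: the data generation sequence (daily pull, most used software,
    # recommendations) would be run for this district at this point.


onboard_district("district_demo", ["app_one", "app_two"])
print(DISTRICT_IDS, TABLES)
```

Rejecting a district that already has a table avoids silently overwriting existing data when the same ID is onboarded twice.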
