Effective online education should start with careful planning and a focused understanding of subject requirements and student needs. Compared to on-campus students, online students usually have fewer opportunities to interact face to face with their peers and lecturers.

The online discussion board provides students with a channel to interact with others by submitting, replying to and reading posts. Analyzing posts in the discussion boards is therefore important for better understanding our online students and the characteristics of their posted messages.

In this work, we answer two major questions, what messages students post and how they post them, by analyzing the discussion board posts of two subjects.


We collected the students’ posts on the discussion boards of two subjects: MGT510 and ITC114. The statistics of the posts are as follows.

Table 1: Statistics of the posts in MGT510 and ITC114

Subject  Subject name                      Session    # / # students posting messages
MGT510   Strategic Management              201830 BD  4642
ITC114   Introduction to Database Systems  201560 AD  40858
                                           201590 AD  13125
                                           201590 SI  119
                                           201630 SI  11
                                           201660 AD  14144
                                           201690 AD  1210
                                           201730 SI  21
                                           201760 AD  5617


For the data cleaning and pre-processing, we performed the following steps on each discussion board post:

  • Erase punctuation
  • Convert the text data to lowercase
  • Remove stop words
  • Remove words with 2 or fewer characters, and words with 15 or more characters.
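
The four steps above can be sketched in a few lines of Python. This is only an illustration, not the pipeline actually used for the analysis: the function name `preprocess`, the tiny `STOP_WORDS` set, and the `min_len`/`max_len` parameters are all assumptions made for the example, and a real run would use a fuller stop-word list.

```python
import string

# Hypothetical, deliberately small stop-word list; the original analysis
# does not say which list was used.
STOP_WORDS = {"the", "and", "are", "for", "with", "this", "that", "have", "from", "was"}


def preprocess(post, stop_words=STOP_WORDS, min_len=3, max_len=14):
    """Return the cleaned tokens of one discussion board post."""
    # 1. Erase punctuation (ASCII punctuation only in this sketch).
    text = post.translate(str.maketrans("", "", string.punctuation))
    # 2. Convert the text to lowercase.
    text = text.lower()
    tokens = text.split()
    # 3. Remove stop words.
    tokens = [t for t in tokens if t not in stop_words]
    # 4. Drop words with 2 or fewer characters, or 15 or more characters.
    return [t for t in tokens if min_len <= len(t) <= max_len]


if __name__ == "__main__":
    print(preprocess("Hi, I am a Manager from Sydney with the internationalization team!"))
```

Because punctuation is deleted rather than replaced by a space, a hyphenated word such as "face-to-face" collapses into a single token in this sketch.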

After pre-processing the data, we analyzed it from several angles, such as word counts, and then visualized the results.


1. Know your students from their introduction posts

In MGT510, the students were asked to provide a self-introduction to kick off the subject. We built a word cloud from these introduction posts.

From this word cloud, we may conclude:

  • The students who posted messages are mostly from Sydney, Brisbane, and Canberra;
  • It seems that the students have several years of experience in management positions; and
  • The students are interested in management, strategies, business and services, which are the main topics of this subject, so we can expect the subject content to meet the students’ expectations.

2. Know your students from the frequencies of words in their posts

We counted word pairs (bigrams) in messages, and found the top 10 bigrams as follows:

Table 2: Top 10 word pairs


From this table, we can see that the most important and difficult concepts in the subject content appear in the student posts. In addition, the students seem to care most about the online quiz and meetings.
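
The bigram counting behind Table 2 can be sketched as follows. This is a minimal illustration that assumes each post has already been cleaned into a list of tokens; the function name `top_bigrams` and the sample posts are invented for the example and are not the data from the two subjects.

```python
from collections import Counter


def top_bigrams(token_lists, n=10):
    """Return the n most frequent adjacent word pairs across all posts."""
    counts = Counter()
    for tokens in token_lists:
        # Pair each token with its successor, within the same post only,
        # so that no bigram spans two different posts.
        counts.update(zip(tokens, tokens[1:]))
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up example posts, already tokenized.
    posts = [
        ["online", "quiz", "due"],
        ["online", "quiz", "online", "meeting"],
    ]
    for pair, count in top_bigrams(posts, n=3):
        print(" ".join(pair), count)
```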

3. The semantic similarities between the pairs of the first 50 posts in ITC114
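
How the similarity between posts was computed is not stated here. As a simple stand-in, the sketch below scores every pair of posts with cosine similarity over word-count vectors. This measures lexical overlap rather than true semantic similarity, so it should be read as an assumption-laden illustration; an embedding-based measure could be swapped in for `cosine_similarity`. The function names and sample posts are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the word-count vectors of two posts."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    # Only words present in both posts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    # An empty post has no direction, so define its similarity as 0.
    return dot / norm if norm else 0.0


def pairwise_similarities(token_lists):
    """Map each index pair (i, j), i < j, to the similarity of posts i and j."""
    return {
        (i, j): cosine_similarity(token_lists[i], token_lists[j])
        for i, j in combinations(range(len(token_lists)), 2)
    }


if __name__ == "__main__":
    # Made-up example posts, already tokenized.
    posts = [["sql", "query", "join"], ["sql", "query", "join"], ["exam", "date"]]
    for pair, score in pairwise_similarities(posts).items():
        print(pair, round(score, 3))
```

For the first 50 posts this yields 1,225 pair scores, which is the kind of matrix a similarity heat map would display.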

4. The number of posts over different sessions

5. The number of posts on different days

The percentages of posts on different days are 22.05% for Monday, 17.32% for Sunday, 14.96% for Tuesday, 13.39% for Wednesday, 13.12% for Saturday, 12.20% for Thursday, and 6.96% for Friday.
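
A breakdown like this can be derived from post timestamps along the following lines. The sketch assumes each post carries a Python `datetime`; the function name `weekday_percentages` and the sample timestamps are invented for illustration and are not the actual post data.

```python
from collections import Counter
from datetime import datetime

# Index matches datetime.weekday(): Monday is 0, Sunday is 6.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_percentages(timestamps):
    """Map weekday name to the percentage of posts made on that day."""
    counts = Counter(DAY_NAMES[ts.weekday()] for ts in timestamps)
    total = sum(counts.values())
    # most_common() orders the days from busiest to quietest.
    return {day: round(100 * n / total, 2) for day, n in counts.most_common()}


if __name__ == "__main__":
    # Made-up timestamps for four posts.
    sample = [
        datetime(2018, 3, 5, 9, 0),    # Monday
        datetime(2018, 3, 5, 20, 30),  # Monday
        datetime(2018, 3, 6, 10, 15),  # Tuesday
        datetime(2018, 3, 11, 8, 45),  # Sunday
    ]
    print(weekday_percentages(sample))
```

Grouping by `ts.hour` instead of `ts.weekday()` would give the corresponding breakdown over hours of the day.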

6. The number of posts over hours

7. The numbers of words in the posts

8. The distribution of posts by a student

Turning data into actionable insight

After briefly investigating what students post and how they post, we need to consider the next step: actionable insight. Actionable insight is the result of data-driven analysis of the patterns that occur in students’ posts.

By analyzing the posts, which are important data about online students, lecturers can develop an understanding of students’ needs and expectations. More importantly, we can make data-informed decisions. For example, because posting activity peaks on Mondays, we may post our questions and read students’ posts on the discussion boards on Mondays and Tuesdays.

This kind of data analysis can shed light on better ways of creating meaningful learning experiences for our distance students. This work is just a showcase.