Article

A Block-Based Interactive Programming Environment for Large-Scale Machine Learning Education

1 Department of Computer Education, Chuncheon National University of Education, Chuncheon 24328, Republic of Korea
2 Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea
* Author to whom correspondence should be addressed.
Submission received: 8 November 2022 / Revised: 12 December 2022 / Accepted: 13 December 2022 / Published: 18 December 2022

Abstract

The existing block-based machine learning educational environments have a drawback in that they do not support model training based on large-scale data. This makes it difficult for young students to learn the importance of large amounts of data when creating machine learning models. In this paper, we present a novel programming environment in which students can easily train machine learning models on large-scale data using a block-based programming language. We redefine the interfaces of existing machine learning blocks and develop an effective model training algorithm suited to block-based programming languages, enabling both “instant training” and “large-scale training”. As example educational applications based on this environment, we present a “Question-Answering Chatbot” program trained on 11,822 text data instances with 7784 classes as well as a “Celebrity Look-Alike” program trained on 4431 image data instances with 7 classes. The experimental results show that teachers and pre-service teachers gave high scores on all four evaluation measures for this environment.

1. Introduction

Block-based programming languages and their environments, such as Scratch [1,2], are widely used for primary and secondary school programming education. These programming languages are often used for the purpose of fostering students’ computational thinking skills [3,4,5,6]. Recently, many approaches that teach machine learning to young students using block-based programming languages have been devised. Two of the most popular machine learning educational environments are Machine Learning for Kids [7] and Cognimates [8]. Using these environments, students can create various types of programs, including machine learning applications that classify text or images.
However, it is difficult for students to train models on large-scale data using the existing environments for machine learning education. For example, when using Machine Learning for Kids, students cannot train a machine learning model on more than 100 images. As another example, when using Cognimates, students cannot train the model on more than 20 images for each class. Considering that the MNIST [9] and CIFAR-100 [10] datasets have 60,000 and 50,000 training samples, respectively, the existing block-based programming environments allow students to use relatively scant data when training models. Although students can recognize the importance of “more data” even with a small number of samples [11,12,13], they will understand this importance more clearly if they can experience training on a large number of samples.
In this paper, we present a novel block-based programming environment in which primary and secondary school students can train machine learning models on their own large-scale data. First, we redefine the Tooee [14] block interfaces, after which we implement effective blocks that (1) make it easy for students to train machine learning models based on large-scale data and that (2) produce immediate results whenever students click on our machine learning blocks.
The rest of this paper is organized as follows: In Section 2, we introduce existing machine learning educational environments for primary and secondary education that use block-based programming languages. In Section 3, we propose a novel environment for machine learning education that allows young students to train on large-scale data and present example applications for training text and image data in our environment. Section 4 summarizes the survey results obtained through four crash courses conducted for teachers and pre-service teachers. Finally, we conclude the paper in Section 5.
Our research questions are as follows:
  • Research Question 1 (easiness): Can primary and secondary school students (K-12 students) easily train a machine learning model on a large amount of data (more than 10,000 training examples) with the blocks we propose in this paper?
  • Research Question 2 (interestingness): Will primary and secondary school students (K-12 students) be interested in training a machine learning model on a large amount of data (more than 10,000 training examples) with the blocks we propose in this paper?
  • Research Question 3 (importance): Is it important for primary and secondary school students (K-12 students) to gain experience in implementing machine learning applications trained on large amounts of data?

2. Related Work

Machine Learning for Kids [7] and Cognimates [8] are two of the most popular block-based machine learning educational environments. Both environments are based on Scratch and provide features that allow students to train machine learning models using text data or image data. They provide separate web pages for training machine learning models. For example, through these web pages, we can easily add images from the web or images taken with a webcam to the training data. When we click the training button, the machine learning model is trained; we can then switch to the Scratch screen and use the trained model through the provided Scratch extension blocks.
Teachable Machine [15] is one of the most popular environments for machine learning education, allowing students to train machine learning models using images, sounds or poses without programming. On the Teachable Machine website, they can simply use the mouse to upload or create their training data and click the “Train Model” button to train their machine learning model. In the second version of Teachable Machine, students can download the trained model as a TensorFlow model, so they can use it from a programming language such as Python or JavaScript.
Entry [16] is one of the most widely used machine learning educational environments in South Korea. The blocks provided by Entry are similar to those provided by Scratch but have been modified to reflect South Korea’s primary and secondary school curriculum. Similar to Machine Learning for Kids and Cognimates, Entry can train a machine learning model and use it to perform inference. Its interfaces for training a model and making inferences are also similar to those of Machine Learning for Kids and Cognimates: we can input training data through the model training website and exploit the trained model through the Entry machine learning blocks.
Scratch Text Classifier [17] also provides machine learning blocks to train text data, similar to those provided by other programming environments. It differs from other programming environments in that it exploits Google’s Universal Sentence Encoder [18] and the K-Nearest Neighbors library provided by TensorFlow.js [19] for text classification. Moreover, it provides a machine learning menu within Scratch. Both Scratch Text Classifier and Cognimates were created at the MIT Media Lab.
Tooee [14,20] provides Scratch extension blocks to train machine learning models and perform inference. One feature that sets it apart from Machine Learning for Kids, Cognimates, Teachable Machine, Scratch Text Classifier and Entry is that it trains a machine learning model using only programming blocks. In other words, students do not need to access a separate model training webpage when using this environment. The block interfaces are designed as conversations with a virtual friend named Tooee.
Unfortunately, as of October 2022, none of the block-based machine learning educational environments presented above is suitable for training large amounts of data. For example, we cannot train more than 100 images using Machine Learning for Kids, and we cannot train more than 20 images for each class using Cognimates. As another example, we cannot use Entry to train images with more than ten classes. Moreover, the current version of Tooee requires an inordinate amount of time to train a machine learning model on a large amount of data, and Scratch Text Classifier and Teachable Machine do not support training machine learning models through programming blocks. In addition, although many other programming environments for machine learning education [21,22,23,24,25,26,27] have been proposed, it remains difficult for students to easily train machine learning models on large amounts of data through programming blocks. To address this problem, we present a novel block-based approach for training machine learning models in Section 3.

3. An Interactive Environment for Large-Scale Machine Learning Education

3.1. Requirements of Our Programming Environment

In this paper, we present a novel programming environment that meets the following two requirements for effective primary/secondary machine learning education:
  • Instant Training: Students should be able to easily train a machine learning model on a single sample immediately (e.g., within 0.1 s).
  • Large-Scale Training: Students should be able to easily train a machine learning model on a large number of samples (e.g., 10,000 samples or more).
For the sake of simplicity, we assume that this programming environment only supports the supervised learning of image and text data.
In order to create this type of programming environment, we reuse the basic architecture of Tooee [14], which we developed earlier. Tooee provides Scratch blocks that allow users to send and receive messages to and from a local or remote web server. Note that Scratch itself does not have blocks that can train a machine learning model. With Tooee blocks, the training process can instead be performed on the web server, which has an effect similar to training the machine learning model in Scratch. However, the current version of Tooee does not meet the above two requirements. To address this problem, we redefine Tooee’s interface (Section 3.2) and present a novel web server implementation (Section 3.3).
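To make this message-passing architecture concrete, the following Python sketch shows a web server that receives block messages and dispatches them to training or inference routines. This is an illustration rather than the authors’ implementation; the endpoint path, message format and training/inference stubs are assumptions:

    # A minimal sketch, NOT the authors' implementation: a web server that
    # receives the strings sent by Tooee blocks and returns a reply string.
    from flask import Flask, request

    app = Flask(__name__)

    def train(sample: str, class_name: str) -> str:
        return "OK"   # placeholder; a fuller version appears in the Section 3.3 sketch

    def infer(sample: str) -> str:
        return "cat"  # placeholder; a fuller version appears in the Section 3.3 sketch

    @app.route("/message", methods=["POST"])
    def handle_message():
        text = request.get_data(as_text=True)   # string sent by a B2/B3 block
        if " is " in text:                      # assumed B3-style message: train
            sample, class_name = text.split(" is ", 1)
            return train(sample, class_name)
        return infer(text)                      # assumed B2-style message: infer;
                                                # the reply is stored in the B4 block

    if __name__ == "__main__":
        app.run(port=9998)  # the port used by the example in Section 3.4.2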

3.2. Machine Learning Blocks

3.2.1. Basic Block Interfaces

The following shows the five basic Tooee blocks [14]:
B1. Set Tooee’s address to ( ).
B2. Tooee, what is ( ).
B3. Tooee, ( ) is ( ).
B4. Answer from Tooee.
B5. Screen.
Here, each block can be used by dragging and dropping with the mouse, and the contents in parentheses can be filled in by typing with the keyboard.
The first block (B1) specifies the IP address and port number of the web server with which to communicate. The second and third blocks (B2 and B3) send the strings in parentheses to the web server. Whenever the web server receives a string, it sends its reply, which is stored in the fourth block (B4). The fifth block (B5) can be used with the second or third blocks (B2 or B3). Below, we describe the method used to train images and text with the five blocks above (the B1 to B5 blocks).

3.2.2. Machine Learning Blocks for Image Classification

We can train image data using the B3 block introduced in Section 3.2.1. The first argument of the B3 block is the image (or images) to be trained on, and the second argument is the name of the class. Here, we allow five types of images to be input as the first argument as follows: (1) the screen block (the B5 block), (2) an image file path, (3) a URL starting with “http,” (4) a URL starting with “https,” and (5) a directory containing image files.
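For illustration, the following five training blocks, corresponding to the five input types above, each train the “cat” class (the file names, URLs and directory name are hypothetical examples):
  • Tooee, (Screen) is (cat).
  • Tooee, (cat1.jpg) is (cat).
  • Tooee, (http://example.com/cat2.jpg) is (cat).
  • Tooee, (https://example.com/cat3.jpg) is (cat).
  • Tooee, (cat/) is (cat).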
Here, when the first block is executed, the “cat” class is trained using the current screen. When the second, third and fourth blocks are executed, the “cat” class is trained using the given image files. When the fifth block is executed, the “cat” class is trained using all images in the “cat/” directory.
When performing inference on an image, either a screen block or an image file can be used. However, we do not allow directories containing image files to be used for inference. Several examples of blocks that perform inference using images are shown below.
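For example (the file name is again hypothetical):
  • Tooee, what is (Screen).
  • Tooee, what is (cat1.jpg).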
If we execute one of the blocks above, the B4 block will contain the class name “cat”.

3.2.3. Machine Learning Blocks for Text Classification

Similar to how we train images, we also use the B3 blocks to train text. The text to be trained is given as the first argument of the B3 block, and the class name is given as the second argument of this block. A block that trains the “cat” class using the string “meow” is shown below.
  • Tooee, (meow) is (cat).
Similarly, we can use the B2 block when performing inference. For example, if the following block is executed, the “Answer from Tooee” block (the B4 block) will contain the “cat” class name.
  • Tooee, what is (meow).
In addition, we can easily collect data using the B2 block. If we input a csv file as an argument of the B2 block, then (1) the file is divided into several tokens using “,” and newline characters as delimiters, and (2) the kth token among the divided tokens is returned. Here, k is 1 initially, and it increases by 1 every time the same block is executed. If the last token is returned, k then becomes 1 again. For example, assuming that the “cat.csv” file contains “meow,meow!!”, if we execute the “Tooee, what is (cat.csv)” block, the string “meow” is returned. If we run this block again, the string “meow!!” is returned. Similarly, if we pass a “.txt” file as an argument to this block, it works almost the same as executing a csv file block, except that tabs and newline characters are used as delimiters. We present examples of these blocks used for text data collection below:
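For example, assuming the “cat.csv” file from the description above and a corresponding “cat.txt” file exist:
  • Tooee, what is (cat.csv).
  • Tooee, what is (cat.txt).
On the server side, this token-cycling behavior can be sketched in a few lines of Python (a hypothetical illustration, not the authors’ implementation):

    # Hypothetical sketch of the token cycling described above: the file is
    # split into tokens, and each call returns the next token, wrapping around.
    import itertools
    import re

    class TokenCycler:
        def __init__(self, path: str):
            # "," and newlines delimit .csv files; tabs and newlines delimit .txt files
            delimiters = ",|\n" if path.endswith(".csv") else "\t|\n"
            with open(path, encoding="utf-8") as f:
                tokens = [t for t in re.split(delimiters, f.read()) if t]
            self._cycle = itertools.cycle(tokens)  # k wraps back to 1 at the end

        def next_token(self) -> str:
            return next(self._cycle)

    cat = TokenCycler("cat.csv")  # assume the file contains "meow,meow!!"
    print(cat.next_token())       # "meow"
    print(cat.next_token())       # "meow!!"
    print(cat.next_token())       # "meow" again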

3.3. Implementation of the Machine Learning Blocks

Recall that we use the B3 block introduced in Section 3.2.1 when training our machine learning model. If we trained a large artificial neural network every time this block was executed, “instant training” would be impossible. To alleviate this problem, the size of the neural network can be reduced or fine-tuning can be conducted. However, even then it is difficult to train a machine learning model immediately on the slow desktop computers used by young students.
In this paper, we use an approach that does not train the neural network when executing the above blocks. Instead, we prepare a Sentence-BERT-based [28] machine learning model that was trained on a large amount of data in advance. When the B3 block is executed, an embedding vector for its first argument is created using this model. When the B2 block is executed, (1) an embedding vector for the argument of the block is generated, (2) cosine similarities between this vector and the embedding vectors generated through the B3 block are calculated and (3) the most similar vector is found and the corresponding class name is returned. This simple algorithm effectively enables the “instant training” and “large-scale training” processes described in Section 3.1.
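A minimal Python sketch of this algorithm follows, assuming the sentence-transformers library; the checkpoint name is illustrative, as the paper specifies only that the model is Sentence-BERT-based [28]:

    # Sketch of the algorithm above: "training" only stores embedding vectors,
    # and inference returns the class of the most cosine-similar stored vector.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint

    embeddings = []  # embedding vectors created by B3 blocks
    labels = []      # corresponding class names

    def train(sample: str, class_name: str) -> None:
        vec = model.encode(sample)
        embeddings.append(vec / np.linalg.norm(vec))  # normalize once, up front
        labels.append(class_name)

    def infer(query: str) -> str:
        q = model.encode(query)
        q = q / np.linalg.norm(q)
        sims = np.stack(embeddings) @ q  # cosine similarity of normalized vectors
        return labels[int(np.argmax(sims))]

    train("meow", "cat")
    train("woof", "dog")
    print(infer("meow!!"))  # expected output: cat

Since no network weights are updated, each B3 execution costs only one forward pass through the encoder, which is what makes “instant training” feasible; an analogous approach with an image encoder would cover the image blocks.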

3.4. Example Applications Using the Machine Learning Blocks

3.4.1. Example Datasets

Table 1 summarizes the example datasets used in our example applications. The first dataset is the chatbot dataset [29], which has 11,822 question-answering pairs. The second dataset consists of a total of 4431 photos of seven celebrities.

3.4.2. Question-Answering Chatbot

Figure 1 shows example screenshots of the question-answering chatbot program. First, the robot says “Ask Tooee anything”. If the user inputs “I have a headache”, the robot uses this sentence to perform inference with the trained model and outputs a result such as “Let’s take a break”.
Figure 2 shows a source code example of the type used to implement this application. When the program starts, it connects to port 9998 of the local computer. Then, the text data described in “http://tooee.org/chat.txt” is fetched incrementally, and the machine learning model is trained on these data. For example, if we assume that this file contains the contents “I have a headache.\tLet’s take a break.\nHi\tHello\n”, then the “Tooee, (I have a headache.) is (Let’s take a break.)” block is executed within the first loop and the “Tooee, (Hi) is (Hello)” block is executed within the second loop. If the space key is pressed, inference is performed using the string input by the user, and the result is displayed on the screen.
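In block notation, the script described above might look as follows. This is a reconstruction from the description rather than the exact code in Figure 2; the loop count and the variable name are assumptions:

    when the green flag is clicked:
        Set Tooee’s address to (127.0.0.1:9998).
        repeat (11822) times:
            Tooee, what is (http://tooee.org/chat.txt).   [returns the next question token]
            set [question] to (Answer from Tooee).
            Tooee, what is (http://tooee.org/chat.txt).   [returns the next answer token]
            Tooee, (question) is (Answer from Tooee).     [trains one Q&A pair]

    when the space key is pressed:
        ask “Ask Tooee anything” and wait.
        Tooee, what is (answer).                          [inference on the user input]
        say (Answer from Tooee).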

3.4.3. Celebrity Look-Alike Program

Figure 3 shows screenshots of the “Celebrity Look-Alike” example program. Here, due to copyright issues, we replaced the celebrity photos with character photos. First, this program takes a picture of a user’s face. Then, it performs inference on the trained model and displays the picture of the celebrity with the highest probability value.
Figure 4 shows the source code example for this program. In this code, a total of four celebrities are trained. For example, in order to train the “Ben” class, this program uses all images in the “Ben/” directory.
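In block notation, this program might be sketched as follows. Only the “Ben” class is named in the text; the other three class names here are hypothetical:

    when the green flag is clicked:
        Set Tooee’s address to (127.0.0.1:9998).
        Tooee, (Ben/) is (Ben).
        Tooee, (Amy/) is (Amy).
        Tooee, (Joon/) is (Joon).
        Tooee, (Mina/) is (Mina).

    when the space key is pressed:
        Tooee, what is (Screen).     [classifies the face captured on the screen]
        say (Answer from Tooee).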

4. Experiments

In this section, we conduct a survey of teachers (or pre-service teachers) to verify the effectiveness of our approach and to answer the research questions presented in Section 1. We conduct four crash courses for the teachers and conduct a two-tailed paired t-test using pre- and post-test results. We describe our experimental setup in detail in Section 4.1 and summarize the experimental results in Section 4.2.
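For reference, the significance test used in this section can be sketched in a few lines of Python; the score vectors below are placeholders, not the study’s data:

    # Two-tailed paired t-test over per-participant pre/post Likert scores.
    # The scores below are illustrative placeholders only.
    from scipy.stats import ttest_rel

    pre_scores  = [4, 3, 5, 4, 4, 3]   # one pre-test score per participant
    post_scores = [5, 4, 5, 4, 5, 4]   # the same participants, post-test

    t_stat, p_value = ttest_rel(pre_scores, post_scores)  # two-sided by default
    print(f"t = {t_stat:.3f}, p = {p_value:.5f}")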

4.1. Experimental Setup

We conducted four crash courses for a total of 36 teachers and pre-service teachers. Through these courses, the survey participants learned how to use the blocks presented in Section 3.2 and implemented the example applications presented in Section 3.4. Table 2 shows the target audience and the number of survey participants for each course. All the participants were teachers or pre-service teachers who have graduated from or are attending Chuncheon National University of Education in South Korea. Two courses were conducted online (Course A and Course D) and the other two were conducted offline (Course B and Course C). Course A participants were all primary or secondary school teachers. Participants in Courses B, C and D were all pre-service primary school teachers. The contents of the lessons used in each course were the same.
Figure 5 summarizes the years of prior experience in block-based programming and text-based programming of the teachers and pre-service teachers who participated in the survey. The survey participants had about two years of block-based programming experience and about one year of text-based programming experience. Note that 24 out of 36 survey respondents had previous experience in training machine learning models, and the most commonly used machine learning environments were Scratch, Entry, Teachable Machine and Machine Learning for Kids.
Teachers (or pre-service teachers) answered the following questions on a five-point Likert scale before and after taking the course. A score closer to 1 indicates disagreement, and a score closer to 5 indicates agreement. In addition, we allowed them to express their opinions freely through open-ended questions. The experimental results are summarized in the following subsection.
Q1. (interestingness) Do you think primary and secondary school students will be interested in implementing machine learning applications trained on 10,000+ data instances?
Q2. (easiness) Do you think sixth-grade students (primary school students) with more than one year of programming experience will be able to easily implement machine learning applications trained on 10,000+ data instances?
Q3. (easiness) Do you think ninth-grade students (secondary school students) with more than two years of programming experience will be able to easily implement machine learning applications trained on 10,000+ data instances?
Q4. (importance) Do you think it is important for primary and secondary school students to gain experience in implementing machine learning applications trained on large amounts of data?

4.2. Experimental Results

Table 3 summarizes the experimental results. In the pre-test and post-test results, the values outside the parentheses represent the means, and the values inside the parentheses represent the standard deviations. For all four questions, the teachers and pre-service teachers gave higher scores after taking one of the courses than before. Note that after taking the course, they assigned an average score of 4 or higher for every question. In other words, after taking the course, they contended that primary and secondary school students would be interested in implementing machine learning models based on a lot of data (Q1) and would be able to create them easily (Q2 and Q3). They also thought that machine learning education based on a lot of data was important (Q4).
In particular, the differences were statistically significant for the first, second and third questions: two-tailed paired t-tests on the pre/post-test results of each question yielded p-values lower than 0.05 for these three questions. In other words, after taking this course, the teachers and pre-service teachers came to believe that not only secondary school students but also primary school students could implement machine learning models based on relatively large amounts of data, and would find doing so easy and interesting.
Among the comments submitted by teachers and pre-service teachers, the most common was that our machine learning programming environment was easy and fun. One pre-service teacher noted that this programming environment will increase students’ understanding of data and machine learning because they can see more accurate results as the amount of training data increases. Another teacher was impressed that the proposed environment is the first set of materials for primary and secondary education that trains machine learning models on a large amount of data.
Table 4 summarizes the participants’ responses for each course. Here, the numbers before the slashes represent the pre-test values, and the numbers after the slashes represent the post-test values. If a post-test value differs from the corresponding pre-test value by 0.5 or more, both values are marked in bold. The table shows that most of the post-test values were higher than the pre-test values, regardless of which course the participants took. In particular, the difference between the pre- and post-test results for the second question was very large for each course. This suggests that through our programming environment, even primary school students can now train large-scale machine learning models, which was difficult in the past.

5. Conclusions

In this paper, we present a novel programming environment in which students can create machine learning models trained on large-scale data using a block-based programming language. One main feature of our machine learning educational environment that sets it apart from existing environments is that it supports not only “large-scale training” but also “instant training”. We present examples of a text data training application (the question-answering chatbot) and an image data training application (the celebrity look-alike program) using our machine learning blocks. These example applications require only about 10 lines of code for large-scale training and inference.
The experimental results show that the teachers and pre-service teachers who responded to our surveys believe that even primary school students would easily be able to train on large-scale data using our educational environment (Research Question 1). The results also show that primary and secondary school students would be interested in creating large-scale machine learning applications using our blocks (Research Question 2). In particular, in terms of easiness and interestingness, the differences between the pre- and post-test results were large, which suggests that our programming environment effectively supports large-scale training. Finally, most teachers and pre-service teachers expressed the opinion that it is important for young students to experience large-scale training (Research Question 3). To the best of our knowledge, this is the first environment to enable large-scale machine learning education using only the programming blocks of a block-based programming language.
One limitation of this study is that there are still not enough educational materials based on block-based programming languages for teaching large-scale machine learning. In addition to the example programs presented in Section 3.4, we plan to provide more educational materials on our “http://tooee.org” website, which will be released soon. We also plan to conduct further research to verify the effectiveness of our educational materials from a pedagogical point of view.

Author Contributions

Methodology, Y.P. and Y.S.; Data curation, Y.P.; Writing—original draft, Y.P. and Y.S.; Writing—review & editing, Y.P. and Y.S.; Project administration, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by an Incheon National University Research Grant in 2020. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2020R1I1A3068836).

Institutional Review Board Statement

This study was approved by the Institutional Review Board of Chuncheon National University of Education (Approval No. 2022-16).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Resnick, M.; Maloney, J.; Monroy-Hernández, A.; Rusk, N.; Eastmond, E.; Brennan, K.; Milner, A.; Rosenbaum, E.; Silver, J.; Silverman, B.; et al. Scratch: Programming for all. Commun. ACM 2009, 52, 60–67. [Google Scholar] [CrossRef] [Green Version]
  2. Maloney, J.; Resnick, M.; Rusk, N.; Silverman, B.; Eastmond, E. The Scratch programming language and environment. ACM Trans. Comput. Educ. 2010, 10, 1–15. [Google Scholar] [CrossRef] [Green Version]
  3. Park, Y.; Shin, Y. Comparing the effectiveness of scratch and app inventor with regard to learning computational thinking concepts. Electronics 2019, 8, 1269. [Google Scholar] [CrossRef] [Green Version]
  4. Wing, J.M. Computational thinking. Commun. ACM 2006, 49, 33–35. [Google Scholar] [CrossRef]
  5. Wing, J.M. Research notebook: Computational thinking—What and why. Link Mag. 2011, 6, 20–23. [Google Scholar]
  6. Wing, J.M. Computational thinking’s influence on research and education for all. Ital. J. Educ. Technol. 2017, 25, 7–14. [Google Scholar]
  7. Lane, D. Machine Learning for Kids: An Interactive Introduction to Artificial Intelligence; No Starch Press: San Francisco, CA, USA, 2021. [Google Scholar]
  8. Druga, S. Growing up with AI: Cognimates: From Coding to Teaching Machines. Ph.D. Dissertation, Program in Media Arts and Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA, 2018. [Google Scholar]
  9. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
  10. Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images. 2009. Available online: www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf (accessed on 12 December 2022).
  11. AI for Oceans. Available online: https://curriculum.code.org/hoc/plugged.pdf (accessed on 12 December 2022).
  12. What Students Learn about AI/ML. Available online: https://machinelearningforkids.co.uk/#!/stories/correlation-of-quantity-with-accuracy (accessed on 28 November 2022).
  13. Vartiainen, H.; Toivonen, T.; Jormanainen, I.; Kahila, J.; Tedre, M.; Valtonen, T. Machine learning for middle schoolers: Learning through data-driven design. Int. J. Child-Comput. Interact. 2021, 29, 100281. [Google Scholar] [CrossRef]
  14. Park, Y.; Shin, Y. Tooee: A Novel Scratch Extension for K-12 Big Data and Artificial Intelligence Education Using Text-Based Programming Blocks. IEEE Access 2021, 9, 149630–149646. [Google Scholar] [CrossRef]
  15. Carney, M.; Webster, B.; Alvarado, K.; Phillips, K.; Howell, N.; Griffith, J.; Jongejan, J.; Pitaru, A.; Chen, A. Teachable machine: Approachable web-based tool for exploring machine learning classification. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1–8. [Google Scholar]
  16. The Entry Programming Environment. Available online: https://playentry.org (accessed on 1 November 2022).
  17. Williams, R.; Kaputsos, S.P.; Breazeal, C. Teacher Perspectives on How To Train Your Robot: A Middle School AI and Ethics Curriculum. Proc. AAAI Conf. Artif. Intell. 2021, 35, 15678–15686. [Google Scholar] [CrossRef]
  18. Cer, D.; Yang, Y.; Kong, S.; Hua, N.; Limtiaco, N.; St. John, R.; Constant, N.; Guajardo-Cespedes, M.; Yuan, S.; Tar, C.; et al. Universal sentence encoder. arXiv 2018, arXiv:1803.11175. [Google Scholar]
  19. Smilkov, D.; Thorat, N.; Assogba, Y.; Yuan, A.; Kreeger, N.; Yu, P.; Zhang, K.; Cai, S.; Nielsen, E.; Soergel, D.; et al. Tensorflow.js: Machine learning for the web and beyond. Proc. Mach. Learn. Syst. 2019, 1, 309–321. [Google Scholar]
  20. Park, Y.; Shin, Y. Novel Scratch Programming Blocks for Web Scraping. Electronics 2022, 11, 2584. [Google Scholar] [CrossRef]
  21. Alturayeif, N.; Alturaief, N.; Alhathloul, Z. DeepScratch: Scratch programming language extension for deep learning education. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 642–650. [Google Scholar] [CrossRef]
  22. Kahn, K.; Megasari, R.; Piantari, E.; Junaeti, E. AI programming by children using snap! Block programming in a developing country. In Proceedings of the 13th European Conference on Technology Enhanced Learning 2018, Leeds, UK, 3–5 September 2018; pp. 1–14. [Google Scholar]
  23. Williams, R.; Park, H.W.; Oh, L.; Breazeal, C. Popbots: Designing an artificial intelligence curriculum for early childhood education. Proc. AAAI Conf. Artif. Intell. 2019, 33, 9729–9736. [Google Scholar] [CrossRef]
  24. García, J.D.R.; Moreno-León, J.; Román-González, M.; Robles, G. LearningML: A tool to foster computational thinking skills through practical artificial intelligence projects. Rev. Educ. Distancia (RED) 2020, 20, 1–37. [Google Scholar]
  25. Agassi, A.; Erel, H.; Wald, I.Y.; Zuckerman, O. Scratch nodes ML: A playful system for children to create gesture recognition classifiers. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 1–6. [Google Scholar]
  26. Rao, A.; Bihani, A.; Nair, M. Milo: A visual programming environment for data science education. In Proceedings of the 2018 IEEE Symposium on Visual Languages and Human-Centric Computing, Lisbon, Portugal, 1–4 October 2018; pp. 211–215. [Google Scholar]
  27. Artificial Intelligence with MIT App Inventor. Available online: https://appinventor.mit.edu/explore/ai-with-mit-app-inventor (accessed on 28 November 2022).
  28. Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Hong Kong, China, 3–7 November 2019; pp. 3982–3992. [Google Scholar]
  29. The Chatbot Dataset. Available online: https://github.com/songys/Chatbot_data (accessed on 1 November 2022).
Figure 1. A chatbot program trained on 10,000 question-and-answer pairs. In this example, the trained chatbot replies “Let’s take a break” to the query “I have a headache”.
Figure 2. Source code example that trains 10,000 Q&A pairs when the green flag is clicked and that performs inference when the space key is pressed.
Figure 3. The “celebrity look-alike” program. In this program, based on the user’s face photographed through the camera, a celebrity resembling the user is found and displayed on the screen. Here, celebrity images are replaced with character images.
Figure 4. Example source code that trains 4431 celebrity images when the green flag is clicked and performs inference when the space key is pressed. Here, a forward slash (/) indicates all images in the directory.
Figure 5. Summary of the programming experiences of the survey participants.
Table 1. Example datasets to be used in the programs introduced in Section 3.4.2 and Section 3.4.3.

Dataset Name         | Data Type | # Training Samples | # Classes
Chatbot [29]         | Text      | 11,822             | 7784
Celebrity Look-Alike | Image     | 4431               | 7
Table 2. Number of survey participants for each of the four crash courses.

Crash Course ID | Target Audience      | Online/Offline | # of Participants
Course A        | Teachers             | Online         | 11
Course B        | Pre-Service Teachers | Offline        | 5
Course C        | Pre-Service Teachers | Offline        | 16
Course D        | Pre-Service Teachers | Online         | 4
Table 3. Pre/post-test results for each question. Here, the values outside the parentheses represent the means and those inside the parentheses represent the standard deviations.

Question | Pre-Test    | Post-Test   | p-Value
Q1       | 4.00 (0.89) | 4.39 (0.80) | 0.02088 (*)
Q2       | 3.31 (1.24) | 4.19 (1.01) | 0.00016 (***)
Q3       | 4.19 (0.71) | 4.56 (0.81) | 0.01025 (*)
Q4       | 4.47 (0.56) | 4.56 (0.61) | 0.32417
Table 4. Pre/post-test results of participants who took each course.

Question | Course A  | Course B  | Course C  | Course D
Q1       | 3.88/4.19 | 4.40/4.60 | 4.09/4.45 | 3.75/4.75
Q2       | 3.50/4.00 | 2.60/4.60 | 3.18/4.27 | 3.75/4.25
Q3       | 4.38/4.75 | 4.00/4.80 | 4.00/4.27 | 4.25/4.25
Q4       | 4.44/4.31 | 4.80/4.80 | 4.36/4.73 | 4.50/4.75
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

