Code problem similarity detection using code clones and pretrained models
There are many websites hosting code contests, such as Leetcode, Codeforces and Codechef. On average, these contests attract 20k technology enthusiasts, as ranking well in such contests can improve their problem-solving skills and strengthen their resume during a job search. Thes...
Main Author: | Yeo, Geremie Yun Siang |
---|---|
Other Authors: | Anwitaman Datta |
Format: | Final Year Project (FYP) |
Language: | English |
Published: | Nanyang Technological University 2023 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/165850 |
_version_ | 1811681705307668480 |
---|---|
author | Yeo, Geremie Yun Siang |
author2 | Anwitaman Datta |
author_facet | Anwitaman Datta Yeo, Geremie Yun Siang |
author_sort | Yeo, Geremie Yun Siang |
collection | NTU |
description | There are many websites hosting code contests, such as Leetcode, Codeforces and Codechef. On average, these contests attract 20k technology enthusiasts, as ranking well in such contests can improve their problem-solving skills and strengthen their resume during a job search. These contests typically support solving code problems in multiple programming languages, such as Python, C++ and Java. However, given the vast number of code problems on these sites, it is inevitable that some will be duplicates of, or very similar to, one another. Duplicated code problems in a contest are not ideal, as contestants may copy solution source code from an older problem published before the contest, gaining undeserved points and making the standings unfair. This paper proposes a solution to detect similar code problems on Codeforces, the world’s most popular competitive programming website, with over 100k active users. Similarity between problems is determined from their accepted solution source code, not from the problem text. |
first_indexed | 2024-10-01T03:45:11Z |
format | Final Year Project (FYP) |
id | ntu-10356/165850 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T03:45:11Z |
publishDate | 2023 |
publisher | Nanyang Technological University |
record_format | dspace |
spelling | ntu-10356/165850 2024-09-24T07:30:17Z Code problem similarity detection using code clones and pretrained models Yeo, Geremie Yun Siang Anwitaman Datta Patrick Pun Chi Seng School of Computer Science and Engineering Anwitaman@ntu.edu.sg, cspun@ntu.edu.sg Engineering::Computer science and engineering There are many websites hosting code contests, such as Leetcode, Codeforces and Codechef. On average, these contests attract 20k technology enthusiasts, as ranking well in such contests can improve their problem-solving skills and strengthen their resume during a job search. These contests typically support solving code problems in multiple programming languages, such as Python, C++ and Java. However, given the vast number of code problems on these sites, it is inevitable that some will be duplicates of, or very similar to, one another. Duplicated code problems in a contest are not ideal, as contestants may copy solution source code from an older problem published before the contest, gaining undeserved points and making the standings unfair. This paper proposes a solution to detect similar code problems on Codeforces, the world’s most popular competitive programming website, with over 100k active users. Similarity between problems is determined from their accepted solution source code, not from the problem text. Bachelor of Science in Mathematical and Computer Sciences 2023-04-14T01:13:14Z 2023-04-14T01:13:14Z 2023 Final Year Project (FYP) Yeo, G. Y. S. (2023). Code problem similarity detection using code clones and pretrained models. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165850 https://hdl.handle.net/10356/165850 en SCSE22-0384 10.21979/N9/VPCR7H application/pdf Nanyang Technological University |
spellingShingle | Engineering::Computer science and engineering Yeo, Geremie Yun Siang Code problem similarity detection using code clones and pretrained models |
title | Code problem similarity detection using code clones and pretrained models |
title_full | Code problem similarity detection using code clones and pretrained models |
title_fullStr | Code problem similarity detection using code clones and pretrained models |
title_full_unstemmed | Code problem similarity detection using code clones and pretrained models |
title_short | Code problem similarity detection using code clones and pretrained models |
title_sort | code problem similarity detection using code clones and pretrained models |
topic | Engineering::Computer science and engineering |
url | https://hdl.handle.net/10356/165850 |
work_keys_str_mv | AT yeogeremieyunsiang codeproblemsimilaritydetectionusingcodeclonesandpretrainedmodels |