Brute-force sentence pattern extortion from harmful messages for cyberbullying detection
Authors:
- Michał Ptaszyński
- Paweł Lempa
- Fumito Masui
- Yasutomo Kimura
- Rafał Rzepka
- Kenji Araki
- Michał Wroczyński
- Gniewosz Leliwa
Abstract
Cyberbullying, or humiliating people using the Internet, has existed almost since the beginning of Internet communication. The relatively recent introduction of smartphones and tablet computers has caused cyberbullying to evolve into a serious social problem. In Japan, members of a parent-teacher association (PTA) attempted to address the problem by scanning the Internet for cyberbullying entries. To help these PTA members and other interested parties confront this difficult task we propose a novel method for automatic detection of malicious Internet content. This method is based on a combinatorial approach resembling brute-force search algorithms, but applied in language classification. The method extracts sophisticated patterns from sentences and uses them in classification. The experiments performed on actual cyberbullying data reveal an advantage of our method vis-à-vis previous methods. Next, we implemented the method into an application for Android smartphones to automatically detect possible harmful content in messages. The method performed well in the Android environment, but still needs to be optimized for time efficiency in order to be used in practice. Surprisingly, the developed method can also be effectively used in the engineering documentation environment, giving powerful search tools in a semantic context, not just a simple lexical one. Thus, in the field of Mechanical Engineering, it can significantly support the work of both a machine builder and a machine production technologist, as it allows searching by descriptions, not just keywords that require careful tagging.
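The abstract describes the method only as a combinatorial, brute-force-like extraction of patterns from sentences. As a rough illustration of that idea, the minimal sketch below enumerates every ordered combination of tokens in a sentence. Several details are assumptions made for this example and are not taken from the record: whitespace tokenization, the `*` wildcard marking a gap between non-adjacent tokens, the function name `extract_patterns`, the `max_len` parameter, and the sample sentence. It is not the authors' implementation.

```python
from itertools import combinations


def extract_patterns(sentence, max_len=None):
    """Enumerate every ordered combination of tokens in a sentence.

    Hypothetical helper for illustration only. Tokens that were not
    adjacent in the sentence are joined with a "*" wildcard.
    """
    tokens = sentence.split()  # assumption: simple whitespace tokenization
    limit = len(tokens) if max_len is None else min(max_len, len(tokens))
    patterns = set()
    for k in range(1, limit + 1):
        # combinations() yields index tuples in increasing order,
        # so the original word order is always preserved.
        for idxs in combinations(range(len(tokens)), k):
            parts = [tokens[idxs[0]]]
            for prev, cur in zip(idxs, idxs[1:]):
                if cur - prev > 1:
                    parts.append("*")  # assumption: "*" marks skipped tokens
                parts.append(tokens[cur])
            patterns.add(" ".join(parts))
    return patterns


# Hypothetical sample sentence, not drawn from the paper's data.
patterns = extract_patterns("you are so stupid")
short_patterns = extract_patterns("you are so stupid", max_len=2)
```

A sentence of n distinct tokens yields 2^n - 1 such combinations, so the cost of exhaustive enumeration grows exponentially with sentence length. A cap such as the hypothetical `max_len` above is one simple way to bound it.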
- Record ID
- CUT17b71782d89f4c3c971cdd05f05b2a0d
- Journal series
- Journal of the Association for Information Systems, ISSN 1536-9323, e-ISSN 1558-3457
- Issue year
- 2019
- Vol
- 20
- No
- 8
- Pages
- 1075-1127
- Other elements of collation
- fig.; tab.; charts; Bibliography (on pp.) 1105-1107; Abstract designation: Abstr.; Journal numbering: Vol. 20, Iss. 8
- Keywords in English
- automatic cyberbullying detection, natural language processing, language combinatorics, technical documentation, semantic search, detection, machine designing
- DOI
- DOI:10.17705/1jais.00562
- URL
- https://aisel.aisnet.org/jais/vol20/iss8/4/
- Language
- English (en)
- Score (nominal)
- 140
- Additional fields
- Indexed in: Web of Science, Scopus
- Uniform Resource Identifier
- https://cris.pk.edu.pl/info/article/CUT17b71782d89f4c3c971cdd05f05b2a0d/
- URN
- urn:pkr-prod:CUT17b71782d89f4c3c971cdd05f05b2a0d