Natural Language Processing and Artificial Intelligence Methods in Software Engineering
Software development is a complex activity that requires, in addition to professional knowledge and technical skills, an analytical approach to the phenomena of the world around us. For software to perform its task at the appropriate level, the phenomenon it models must be thoroughly understood and analyzed, based on the functional and non-functional requirements formulated by the experts and stakeholders of the problem and detailed in the business analysis. The requirements are predominantly available in natural language and are formulated in the particular language of the business area for which the software is being developed. Moreover, a significant part of the requirements is conveyed not only in verified business documents and policies but also as ideas and concepts expressed verbally during interviews. The latter are often ambiguous, contradictory, or incomplete. The use of business terminology also implies tacit knowledge that is self-evident to the stakeholders but often unknown to the development team.
The use of natural language processing methods and artificial intelligence tools can help the development team automate the processing of requirements given in natural language. The present research focused on classifying requirements and on developing tools for reconciling and managing the semantic obstacles that arise from the different language usage of stakeholders and developers. The results presented in this thesis indicate that natural language processing methods, together with artificial intelligence tools, machine learning, and semantic networks, are suitable for supporting the requirements engineering tasks of software developers, and that this support can be extended to other interactions between the development and business domains, such as end-user support processes.
Software development has undergone a significant transformation in recent decades. Nowadays, no one is expected to have universal knowledge of the vast array of frameworks, programming languages, and tools. At the same time, development increasingly relies on reusable elements that have been previously built and made available to the development communities. Because of this high degree of specialization, it is increasingly important for developers to share their knowledge efficiently and to ask specialists from different professional areas for help in solving the problems they encounter in their daily work. To address this need, a number of Q&A sites have been created on the Internet where developers share their experiences and help one another solve such problems. The best known of these sites is Stack Overflow, which is dedicated to supporting developers’ work by building a knowledge repository of their solved problems.
Maintaining the quality of a Q&A site while its traffic is growing is a significant challenge for moderators; at the same time, formulating questions that meet the expectations of the community of that particular site is not always straightforward for users. Our research focused on predicting the quality of questions asked on Stack Overflow, and the likelihood of their subsequent closure, using natural language processing tools and deep learning. The research also investigated the possible reasons for question closures. The results have led to models that can be used in practice to check whether a question being composed by a user complies with the requirements of Stack Overflow, thus reducing both the moderator workload and the likelihood of subsequent closure, and thereby indirectly preserving the professional quality of the portal.
https://doktori.bibl.u-szeged.hu/id/eprint/11239/1/dissertation.pdf